Add upstream patch "UnitTestFrameworkPkg: Use TianoCore mirror of
subhook submodule" to edk2, so the submodule can be cloned again.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Newer AMD CPUs support ERAPS (Enhanced Return Address Prediction Security)
feature that enables the auto-clear of RSB entries on a TLB flush, context
switches and VMEXITs. The number of default RSP entries is reflected in
RapSize.
Add the feature bit and feature word to support these features.
CPUID_Fn80000021_EAX
Bits Feature Description
24 ERAPS:
Indicates support for enhanced return address predictor security.
CPUID_Fn80000021_EBX
Bits Feature Description
31-24 Reserved
23:16 RapSize:
Return Address Predictor size. RapSize x 8 is the minimum number
of CALL instructions software needs to execute to flush the RAP.
15-00 MicrocodePatchSize. Read-only.
Reports the size of the Microcode patch in 16-byte multiples.
If 0, the size of the patch is at most 5568 (15C0h) bytes.
Link: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/57238.zip
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/7c62371fe60af1e9bbd853f5f8e949bf2d908bd0.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9c07a7af5d)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
CPUID leaf 0x80000022, i.e. ExtPerfMonAndDbg, advertises new performance
monitoring features for AMD processors. Bit 0 of EAX indicates support
for Performance Monitoring Version 2 (PerfMonV2) features. If found to
be set during PMU initialization, the EBX bits can be used to determine
the number of available counters for different PMUs. It also denotes the
availability of global control and status registers.
Add the required CPUID feature word and feature bit to allow guests to
make use of the PerfMonV2 features.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/a96f00ee2637674c63c61e9fc4dee343ea818053.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 209b0ac120)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
According to AMD's Speculative Return Stack Overflow whitepaper (link
below), the hypervisor should synthesize the value of IBPB_BRTYPE and
SBPB CPUID bits to the guest.
Support for this is already present in the kernel with commit
e47d86083c66 ("KVM: x86: Add SBPB support") and commit 6f0f23ef76be
("KVM: x86: Add IBPB_BRTYPE support").
Add support in QEMU to expose the bits to the guest OS.
host:
# cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
before (guest):
$ cpuid -l 0x80000021 -1 -r
0x80000021 0x00: eax=0x00000045 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
^
$ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Vulnerable: Safe RET, no microcode
after (guest):
$ cpuid -l 0x80000021 -1 -r
0x80000021 0x00: eax=0x18000045 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
^
$ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
Reported-by: Fabian Vogt <fvogt@suse.de>
Link: https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240805202041.5936-1-farosas@suse.de
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0701abbf98)
References: bsc#1228079
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
qemu-ga on a NetBSD -current VM terminates with a SIGSEGV upon receiving
'guest-set-time' command...
Core was generated by `qemu-ga'.
Program terminated with signal SIGSEGV, Segmentation fault.
at ../qga/commands-posix.c:88
88 *str[len] = '\0';
[Current thread is 1 (process 1112)]
(gdb) bt
at ../qga/commands-posix.c:88
action=action@entry=0xcda34b8 "set hardware clock to system time", errp=errp@entry=0xffffff922a70, in_str=0x0)
at ../qga/commands-posix.c:164
errp=errp@entry=0xffffff922ad0) at ../qga/commands-posix.c:304
at qga/qga-qapi-commands.c:193
allow_oob=allow_oob@entry=false, cur_mon=cur_mon@entry=0x0) at ../qapi/qmp-dispatch.c:220
type=type@entry=JSON_RCURLY, x=28, y=1) at ../qobject/json-streamer.c:99
at ../qobject/json-lexer.c:313
buffer=buffer@entry=0xffffff922d10 "{\"execute\":\"guest-set-time\"}\n", size=<optimized out>)
at ../qobject/json-lexer.c:350
buffer=buffer@entry=0xffffff922d10 "{\"execute\":\"guest-set-time\"}\n", size=<optimized out>)
at ../qobject/json-streamer.c:121
at ../qga/channel-posix.c:94
(gdb)
The commandline options used on the host machine...
qemu-system-aarch64 \
-machine type=virt,pflash0=rom \
-m 8G \
-cpu host \
-smp 8 \
-accel hvf \
-device virtio-net-pci,netdev=unet \
-device virtio-blk-pci,drive=hd \
-drive file=netbsd.qcow2,if=none,id=hd \
-netdev user,id=unet,hostfwd=tcp::2223-:22 \
-object rng-random,filename=/dev/urandom,id=viornd0 \
-device virtio-rng-pci,rng=viornd0 \
-serial mon:stdio \
-display none \
-blockdev node-name=rom,driver=file,filename=/opt/homebrew/Cellar/qemu/9.0.2/share/qemu/edk2-aarch64-code.fd,read-only=true \
-chardev socket,path=/tmp/qga_netbsd.sock,server=on,wait=off,id=qga0 \
-device virtio-serial \
-device virtconsole,chardev=qga0,name=org.qemu.guest_agent.0
This patch rectifies the operator precedence while assigning the NUL
terminator.
Fixes: c3f32c13a3
Signed-off-by: Sunil Nimmagadda <sunil@nimmagadda.net>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/m15xppk9qg.fsf@nimmagadda.net
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
(cherry picked from commit 9cfe110d9f)
References: bsc#1232617
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest stable release (9.1.1).
Full list of backports here:
https://lore.kernel.org/qemu-devel/7f0561ec-3564-4860-bacf-a98071a5ce52@tls.msk.ru/
A selection of them is listed here too:
ui/dbus: fix filtering all update messages
ui/win32: fix potential use-after-free with dbus shared memory
ui/dbus: fix leak on message filtering
hw/audio/hda: fix memory leak on audio setup
hw/audio/hda: free timer on exit
hw/char/pl011: Use correct masks for IBRD and FBRD
hw/intc/arm_gicv3_cpuif: Add cast to match the documentation
hw/intc/arm_gicv3: Add cast to match the documentation
hw/intc/arm_gicv3: Add cast to match the documentation
meson: ensure -mcx16 is passed when detecting ATOMIC128
meson: define qemu_isa_flags
meson: fix machine option for x86_version
target/m68k: Always return a temporary from gen_lea_mode
tcg/ppc: Use TCG_REG_TMP2 for scratch index in prepare_host_addr
tcg/ppc: Use TCG_REG_TMP2 for scratch tcg_out_qemu_st
linux-user: Fix parse_elf_properties GNU0_MAGIC check
linux-user/flatload: Take mmap_lock in load_flt_binary()
vnc: fix crash when no console attached
testing: bump mips64el cross to bookworm and fix package list
hw/sd/sdcard: Fix handling of disabled boot partitions
target/arm: Avoid target_ulong for physical address lookups
block/reqlist: allow adding overlapping requests
util/timer: avoid deadlock when shutting down
hw/mips/jazz: fix typo in in-built NIC alias
target/ppc: Fix lxvx/stxvx facility check
tcg: Fix iteration step in 32-bit gvec operation
hw/loongarch/virt: Add description for virt machine type
migration/multifd: Fix p->iov leak in multifd-uadk.c
target/ppc: Fix migration of CPUs with TLB_EMB TLB type
target/hppa: Fix random 32-bit linux-user crashes
target/arm: Correct ID_AA64ISAR1_EL1 value for neoverse-v1
hw/char/stm32l4x5_usart.c: Enable USART ACK bit response
migration/multifd: Fix rb->receivedmap cleanup race
mac_dbdma: Remove leftover `dma_memory_unmap` calls
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
When running configure, first of all we disable everything, and then we
enable only the feature that we know we want (and, of course, system
and user emulation use different sets of such features).
Consolidate the first part in a macro, that can be share between the two
spec files, making everything simpler and prettier.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Convert conditional build of features to the %bcond_without, so they
can actually be disabled, e.g., at the project level.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Upstream provides services for qemu-pr-helper. So far, we've not needed
them, so let's continue not to ship them for now.
However, in case at some point we want to start offering them, stash the
commented out runes for that in the spec file.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Package qemu-vmsr-helper for letting VMs access the RAPL MSR.
I'll live in its own package and only makes sense on x86_64.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
The fstat call can take a long time to finish when running over
NFS. Add a version of it that runs in the thread pool.
Adapt one of its users, raw_co_get_allocated_file size to use the new
version. That function is called via QMP under the qemu_global_mutex
so it has a large chance of blocking VCPU threads in case it takes too
long to finish.
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: João Silva <jsilva@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Convert the remaining functions to make the QMP commands query-block
and query-named-block-nodes run in their entirety in a coroutine. With
this, any yield from those commands will return all the way back to
the main loop. This releases the BQL and the main loop and avoids
having the QMP command block another more important task from running.
Both commands need to be converted at once because hmp_info_block
calls both and it needs to be moved to a coroutine as well.
Now the wrapper for bdrv_co_get_allocated_file_size() can be made not
mixed and the wrapper for bdrv_co_block_device_info() can be removed.
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
Signed-off-by: Lin Ma <lma@suse.com>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We're currently doing a full query-block just to enumerate the devices
for qmp_nbd_server_add and then discarding the BlockInfoList
afterwards. Alter hmp_nbd_server_start to instead iterate explicitly
over the block_backends list.
This allows the removal of the dependency on qmp_query_block from
hmp_nbd_server_start. This is desirable because we're about to move
qmp_query_block into a coroutine and don't need to change the NBD code
at the same time.
Add the GRAPH_RDLOCK_GUARD_MAINLOOP macro because
bdrv_skip_implicit_filters() needs the graph lock.
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We're converting callers of bdrv_co_get_allocated_file_size() to run
in coroutines because that function will be made asynchronous when
called (indirectly) from the QMP dispatcher.
This function is a candidate because it calls bdrv_query_image_info()
-> bdrv_co_do_query_node_info() -> bdrv_co_get_allocated_file_size().
It is safe to turn this is a coroutine because the code it calls is
made up of either simple accessors and string manipulation functions
[1] or it has already been determined to be safe [2].
1) bdrv_refresh_filename(), bdrv_is_read_only(),
blk_enable_write_cache(), bdrv_cow_bs(), blk_get_public(),
throttle_group_get_name(), bdrv_write_threshold_get(),
bdrv_query_dirty_bitmaps(), throttle_group_get_config(),
bdrv_filter_or_cow_bs(), bdrv_skip_implicit_filters()
2) bdrv_co_do_query_node_info() (see previous commits);
This was the only caller of bdrv_query_image_info(), so we can remove
the wrapper for that function now.
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
This function is a caller of bdrv_do_query_node_info(), which have
been converted to a coroutine. Convert this function as well so we're
closer from having the whole qmp_query_block as a single coroutine.
Also remove the wrapper for bdrv_co_do_query_node_info() now that all
its callers are converted.
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We're converting callers of bdrv_co_get_allocated_file_size() to run
in coroutines because that function will be made asynchronous when
called (indirectly) from the QMP dispatcher.
This function is a candidate because it calls bdrv_do_query_node_info(),
which in turn calls bdrv_co_get_allocated_file_size().
All the functions called from bdrv_do_query_node_info() onwards are
coroutine-safe, either have a coroutine version themselves[1] or are
mostly simple code/string manipulation[2].
1) bdrv_co_getlength(), bdrv_co_get_allocated_file_size(),
bdrv_co_get_info();
2) bdrv_refresh_filename(), bdrv_get_format_name(),
bdrv_get_full_backing_filename(), bdrv_query_snapshot_info_list(),
bdrv_get_specific_info();
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Move this function into a coroutine so we can convert the whole
qmp_query_block command into a coroutine in the next patches.
Placing the entire command in a coroutine allow us to yield all the
way back to the main loop, releasing the BQL and unblocking the main
loop.
When the whole conversion is completed, we'll be able to avoid a
priority inversion that happens when a QMP command calls a slow
(buggy) system call and blocks the vcpu thread from doing mmio due to
contention on the BQL.
About coroutine safety:
Most callees have coroutine versions themselves and thus are safe to
call in a coroutine. The remaining ones:
- bdrv_refresh_filename, bdrv_get_full_backing_filename: String
manipulation, nothing that would be unsafe for use in coroutines;
- bdrv_get_format_name: Just accesses a field;
- bdrv_get_specific_info, bdrv_query_snapshot_info_list: No locks or
anything that would poll or block.
(using a mixed wrapper for now, but after all callers are converted,
this can become a coroutine exclusively)
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
There is a small window at the end of block device migration when
devices are being re-activated. This includes a resetting of some
fields of BDRVQcow2State at qcow2_co_invalidate_cache(). A concurrent
QMP query-block command can call qcow2_get_specific_info() during this
window and see the cleared values, which leads to an assert:
qcow2_get_specific_info: Assertion `false' failed
This is the same issue as Gitlab #1933, which has already been
resolved[1], but there the fix applied only to non-coroutine
commands. Once we move query-block to a coroutine the problem will
manifest again.
Add an operation blocker to the invalidation function to block the
query info path during this window.
Instead of failing query-block, which would be disruptive to users,
use the blocker to know when to reschedule the coroutine back into the
iohandler so it doesn't run while the BDRVQcow2State is inconsistent.
To avoid failing query-block when all block operations are blocked,
unblock the INFO operation at various places. This preserves the prior
situations where query-block used to work.
1 - https://gitlab.com/qemu-project/qemu/-/issues/1933
Link: https://lore.kernel.org/all/87bk6trl9i.fsf@suse.de/
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
References: bsc#1221812
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Some callers of this function are about to be converted to run in
coroutines, so allow it to be executed both inside and outside a
coroutine while we convert all the callers.
This will be reverted once all callers of bdrv_do_query_node_info run
in a coroutine.
Link: https://lore.kernel.org/r/20240409145917.6780-1-farosas@suse.de
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
The nios2 emulation target has been removed upstream by commit
6c3014858c (target/nios2: Remove the deprecated Nios II target,
2024-03-27).
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Upstream commit 7c08eefcaf (tests/data/acpi: Move x86 ACPI tables
under x86/${machine} path, 2024-06-25) has moved some files under
tests/data. Update the spec file to match.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
The avx512f, live-block-migration and pvrdma options no longer exist
in upstream configure because those features were removed. Make the
corresponding changes in the spec files.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest upstream major release, 9.1.0:
https://lore.kernel.org/qemu-devel/172549088090.3334224.10887376086844748499@amd.com/
Full changelog available here:
https://wiki.qemu.org/ChangeLog/9.1
Some of the most notable features/fixes:
* migration: compression offload support via Intel In-Memory Analytics
Accelerator (IAA) or User Space Accelerator Development Kit (UADK),
along with enhanced support for postcopy failure recovery
* virtio: support for VIRTIO_F_NOTIFICATION_DATA, allowing guest
drivers to provide additional data as part of sending device notifications
for performance/debug purposes
* guest-agent: support for guest-network-get-route command on linux,
guest-ssh-* commands on Windows, and enhanced CLI support for
configuring allowed/blocked commands
* block: security fixes for QEMU NBD server and NBD TLS encryption
* ARM: emulation support for FEAT_NMI, FEAT_CSV2_3, FEAT_ETS2,
FEAT_Spec_FPACC, FEAT_WFxT, FEAT_Debugv8p8 architecture features
* ARM: nested/two-stage page table support for emulated SMMUv3
* ARM: xilinx_zynq board support for cache controller and multiple
CPUs, and B-L475E-IOT01A board support for a DM163 display
* LoongArch: support for directly booting an ELF kernel and for running
up to 256 vCPUs via extioi virt extension
* LoongArch: enhanced debug/GDB support
* RISC-V: support for version 1.13 of privileged architecture specification
* RISC-V: support for Zve32x, Zve64x, Zimop, Zcmop, Zama16b, Zabha,
Zawrs, and Smcntrpmf extensions
* RISC-V: enhanced debug/GDB support and general fixes
* SPARC: emulation support for FMAF, IMA, VIS3, and VIS4 architecture
features
* x86: KVM support for running AMD SEV-SNP guests
* x86: CPU emulation support for Icelake-Server-v7, SapphireRapids-v3,
and SierraForest
The following bugs/CVEs were solved (in 9.0.x) with backports that are
now included in 9.1 upstream:
- CVE-2024-4467 (bsc#1227322)
- CVE-2024-7409 (bsc#1229007)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Remove spurious initialization with PC_MACHINE_CLASS().
Signed-off-by: Fabiano Rosas <farosas@suse.de>
[DF: added some context in the changelog]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
This should allow qemu to be built with GCC14. [1] I believe that the
package actually intends to use -Wno-error already (which makes sense
for package building) because it puts it to EXTRA_CFLAGS, but at least
the ipxe slap -Werror after EXTRA_CFLAGS, unless NO_WERROR is defined
to one.
[1] https://github.com/ipxe/ipxe/issues/1219
References: bsc#1227960
Signed-off-by: Martin Jambor <mjambor@suse.com>
[set NO_WERROR=1 only for ipxe]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest stable release (9.0.2).
Full list of backports here:
https://lore.kernel.org/qemu-devel/1721203819.679622.831479.nullmailer@tls.msk.ru/
A selection of them is listed here too:
hw/nvme: fix number of PIDs for FDP RUH update
sphinx/qapidoc: Fix to generate doc for explicit, unboxed arguments
char-stdio: Restore blocking mode of stdout on exit
virtio: remove virtio_tswap16s() call in vring_packed_event_read()
virtio-pci: Fix the failure process in kvm_virtio_pci_vector_use_one()
tcg/optimize: Fix TCG_COND_TST* simplification of setcond2
block: Parse filenames only when explicitly requested
iotests/270: Don't store data-file with json: prefix in image
iotests/244: Don't store data-file with protocol in image
qcow2: Don't open data_file with BDRV_O_NO_IO
tests: add testing of parameter=3D1 for SMP topology (bsc#1228169)
hw/core: allow parameter=3D1 for SMP topology on any machine
target/arm: Fix FJCVTZS vs flush-to-zero
target/arm: Fix VCMLA Dd, Dn, Dm[idx]
i386/cpu: fixup number of addressable IDs for processor cores in the physical package
tests: Update our CI to use CentOS Stream 9 instead of 8
migration: Fix file migration with fdset
tcg/loongarch64: Fix tcg_out_movi vs some pcrel pointers
target/sparc: use signed denominator in sdiv helper
linux-user: Make TARGET_NR_setgroups affect only the current thread
accel/tcg: Fix typo causing tb->page_addr[1] to not be recorded
stdvga: fix screen blanking
hw/audio/virtio-snd: Always use little endian audio format
Revert "monitor: use aio_co_reschedule_self()"
ui/gtk: Draw guest frame at refresh cycle
virtio-net: drop too short packets early
target/i386: fix size of EBP writeback in gen_enter()
References: bsc#1228169
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest stable release (9.0.1).
Full list of backports here:
https://lore.kernel.org/qemu-devel/1718081053.366429.1238758.nullmailer@tls.msk.ru/
A selection of them is reported here too:
Update version for 9.0.1 release
target/loongarch: fix a wrong print in cpu dump
ui/sdl2: Allow host to power down screen
virtio-gpu: fix v2 migration
target/i386: fix SSE and SSE2 feature check
target/i386: fix xsave.flat from kvm-unit-tests
disas/riscv: Decode all of the pmpcfg and pmpaddr CSRs
riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()
target/riscv/kvm.c: Fix the hart bit setting of AIA
target/riscv: rvzicbo: Fixup CBO extension register calculation
target/riscv: do not set mtval2 for non guest-page faults
target/riscv: prioritize pmp errors in raise_mmu_exception()
target/riscv: rvv: Remove redudant SEW checking for vector fp narrow/widen instructions
target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
target/riscv: rvv: Check single width operator for vector fp widen instructions
target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
target/riscv/cpu.c: fix Zvkb extension config
target/riscv: Fix the element agnostic function problem
target/riscv/kvm: tolerate KVM disable ext errors
target/riscv/kvm: Fix exposure of Zkr
hw/intc/riscv_aplic: APLICs should add child earlier than realize
iotests: test NBD+TLS+iothread
qio: Inherit follow_coroutine_ctx across TLS
target/arm: Disable SVE extensions when SVE is disabled
hw/intc/arm_gic: Fix handling of NS view of GICC_APR<n>
hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
gitlab: use 'setarch -R' to workaround tsan bug
gitlab: use $MAKE instead of 'make'
dockerfiles: add 'MAKE' env variable to remaining containers
gitlab: Update msys2-64bit runner tags
target/i386: no single-step exception after MOV or POP SS
target/i386: disable jmp_opt if EFLAGS.RF is 1
hw/loongarch/virt: Fix FDT memory node address width
hw/loongarch: Fix fdt memory node wrong 'reg'
target/loongarch/kvm: fpu save the vreg registers high 192bit
hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
target-i386: hyper-v: Correct kvm_hv_handle_exit return value
hw/pflash: fix block write start
tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
ui/gtk: Check if fence_fd is equal to or greater than 0
ui/gtk: Fix mouse/motion event scaling issue with GTK display backend
configure: Fix error message when C compiler is not working
configure: quote -D options that are passed through to meson
target/i386: fix feature dependency for WAITPKG
target/i386: rdpkru/wrpkru are no-prefix instructions
target/i386: fix operand size for DATA16 REX.W POPCNT
hw/remote/vfio-user: Fix config space access byte order
hw/loongarch/virt: Fix memory leak
target/sh4: Update DisasContextBase.insn_start
target/sparc: Fix FPMERGE
target/sparc: Fix FMULD8*X16
target/sparc: Fix FMUL8x16A{U,L}
target/sparc: Fix FMUL8x16
target/sparc: Fix FEXPAND
target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK
plugins: Update stale comment
target/sh4: Fix SUBV opcode
target/sh4: Fix ADDV opcode
hw/arm/npcm7xx: Store derivative OTP fuse key in little endian
hw/dmax/xlnx_dpdma: fix handling of address_extension descriptor fields
hw/ufs: Fix buffer overflow bug
.gitlab-ci.d/cirrus.yml: Shorten the runtime of the macOS and FreeBSD jobs
tests/avocado: update sunxi kernel from armbian to 6.6.16
target/arm: Restrict translation disabled alignment check to VMSA
target/riscv/kvm: remove sneaky strerrorname_np() instance
target/loongarch/cpu.c: typo fix: expection
backends/cryptodev-builtin: Fix local_error leaks
nbd/server: Mark negotiation functions as coroutine_fn
nbd/server: do not poll within a coroutine context
docs: i386: pc: Update maximum CPU numbers for PC Q35
linux-user: do_setsockopt: fix SOL_ALG.ALG_SET_KEY
migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Change the order of audio driver list in SLE to prefer pulseaudio
over pipewire (related to bsc#1222218).
Signed-off-by: Antonio Larrosa <alarrosa@suse.com>
References: bsc#1222218
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
In commit "[openSUSE][RPM] Normalize hostname, for reproducible builds"
(dec5f6c8a7acd23222a14c6600d6967219fda65c) the USER and HOSTNAME
variables were defined in the different RPM section. Fix that.
Fixes: dec5f6c8a7acd23222a14c6600d6967219fda65c
References: boo#1084909
Suggested-by: Bernhard M. Wiedemann <githubbmwprimary@lsmod.de>
Signed-offf-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest upstream release 9.0.0.
Full changelog at:
https://wiki.qemu.org/ChangeLog/9.0
Highlights include:
* block: virtio-blk now supports multiqueue where different queues of a
single disk can be processed by different I/O threads
* gdbstub: various improvements such as catching syscalls in user-mode,
support for fork-follow modes, and support for siginfo:read
* memory: preallocation of memory backends can now be handled
concurrently using multiple threads in some cases
* migration: support for "mapped-ram" capability allowing for more
efficient VM snapshots, improved support for zero-page detection, and
checkpoint-restart support for VFIO
* ARM: architectural feature support for ECV (Enhanced Counter Virtualization),
NV (Nested Virtualization), and NV2 (Enhanced Nested
Virtualization)
* ARM: board support for B-L475E-IOT01A IoT node, mp3-an536 (MPS3 dev board
+ AN536 firmware), and raspi4b (Raspberry Pi 4 Model B)
* ARM: additional IO/disk/USB/SPI/ethernet controller and timer support for
Freescale i.MX6, Allwinner R40, Banana Pi, npcm7xxx, and virt boards
* HPPA: numerous bug fixes and SeaBIOS-hppa firmware updated to version 16
* LoongArch: KVM acceleration support, including LSX/LASX vector
extensions
* RISC-V: ISA/extension support for Zacas, amocas, RVA22 profiles,
Zaamo, Zalrsc, Ztso, and more
* RISC-V: SMBIOS support for RISC-V virt machine, ACPI support for
SRAT, SLIT, AIA, PLIC and updated RHCT table support, and numerous fixes
* s390x: Emulation support for CVDG, CVB, CVBY and CVBG instructions,
and fixes for LAE (Load Address Extended) emulation
* and lots more...
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest stable release (8.2.3).
Full changelog/backports here:
https://lore.kernel.org/qemu-devel/1713980341.971368.1218343.nullmailer@tls.msk.ru/
Some of the upstream backports are:
Update version for 8.2.3 release
ppc/spapr: Initialize max_cpus limit to SPAPR_IRQ_NR_IPIS.
ppc/spapr: Introduce SPAPR_IRQ_NR_IPIS to refer IRQ range for CPU IPIs.
hw/pci-host/ppc440_pcix: Do not expose a bridge device on PCI bus
hw/isa/vt82c686: Keep track of PIRQ/PINT pins separately
virtio-pci: fix use of a released vector
linux-user/x86_64: Handle the vsyscall page in open_self_maps_{2,4}
hw/audio/virtio-snd: Remove unused assignment
hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()
hw/sd/sdhci: Do not update TRNMOD when Command Inhibit (DAT) is set
hw/net/lan9118: Fix overflow in MIL TX FIFO
hw/net/lan9118: Replace magic '2048' value by MIL_TXFIFO_SIZE definition
backends/cryptodev: Do not abort for invalid session ID
hw/misc/applesmc: Fix memory leak in reset() handler
hw/block/nand: Fix out-of-bound access in NAND block buffer
hw/block/nand: Have blk_load() take unsigned offset and return boolean
hw/block/nand: Factor nand_load_iolen() method out
qemu-options: Fix CXL Fixed Memory Window interleave-granularity typo
hw/virtio/virtio-crypto: Protect from DMA re-entrancy bugs
hw/char/virtio-serial-bus: Protect from DMA re-entrancy bugs
hw/display/virtio-gpu: Protect from DMA re-entrancy bugs
mirror: Don't call job_pause_point() under graph lock (bsc#1224179)
...and many more...
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest stable release (8.2.2).
Full changelog here:
https://lore.kernel.org/qemu-devel/1709577077.783602.1474596.nullmailer@tls.msk.ru/
Upstream backports:
chardev/char-socket: Fix TLS io channels sending too much data to the backend
tests/unit/test-util-sockets: Remove temporary file after test
hw/usb/bus.c: PCAP adding 0xA in Windows version
hw/intc/Kconfig: Fix GIC settings when using "--without-default-devices"
gitlab: force allow use of pip in Cirrus jobs
tests/vm: avoid re-building the VM images all the time
tests/vm: update openbsd image to 7.4
target/i386: leave the A20 bit set in the final NPT walk
target/i386: remove unnecessary/wrong application of the A20 mask
target/i386: Fix physical address truncation
target/i386: check validity of VMCB addresses
target/i386: mask high bits of CR3 in 32-bit mode
pl031: Update last RTCLR value on write in case it's read back
hw/nvme: fix invalid endian conversion
update edk2 binaries to edk2-stable202402
update edk2 submodule to edk2-stable202402
target/ppc: Fix crash on machine check caused by ifetch
target/ppc: Fix lxv/stxv MSR facility check
.gitlab-ci.d/windows.yml: Drop msys2-32bit job
system/vl: Update description for input grab key
docs/system: Update description for input grab key
hw/hppa/Kconfig: Fix building with "configure --without-default-devices"
tests/qtest: Depend on dbus_display1_dep
meson: Explicitly specify dbus-display1.h dependency
audio: Depend on dbus_display1_dep
ui/console: Fix console resize with placeholder surface
ui/clipboard: add asserts for update and request
ui/clipboard: mark type as not available when there is no data
ui: reject extended clipboard message if not activated
target/i386: Generate an illegal opcode exception on cmp instructions with lock prefix
i386/cpuid: Move leaf 7 to correct group
i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F
i386/cpu: Mask with XCR0/XSS mask for FEAT_XSAVE_XCR0_HI and FEAT_XSAVE_XSS_HI leafs
i386/cpu: Clear FEAT_XSAVE_XSS_LO/HI leafs when CPUID_EXT_XSAVE is not available
.gitlab-ci/windows.yml: Don't install libusb or spice packages on 32-bit
iotests: Make 144 deterministic again
target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
target/arm: Fix SVE/SME gross MTE suppression checks
target/arm: Handle mte in do_ldrq, do_ldro
target/arm: Split out make_svemte_desc
target/arm: Adjust and validate mtedesc sizem1
target/arm: Fix nregs computation in do_{ld,st}_zpa
linux-user/aarch64: Choose SYNC as the preferred MTE mode
tests/acpi: Update DSDT.cxl to reflect change _STA return value.
hw/i386: Fix _STA return value for ACPI0017
tests/acpi: Allow update of DSDT.cxl
smmu: Clear SMMUPciBus pointer cache when system reset
virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
virtio-gpu: Correct virgl_renderer_resource_get_info() error check
hw/cxl: Pass CXLComponentState to cache_mem_ops
hw/cxl/device: read from register values in mdev_reg_read()
cxl/cdat: Fix header sum value in CDAT checksum
cxl/cdat: Handle cdat table build errors
vhost-user.rst: Fix vring address description
tcg/arm: Fix goto_tb for large translation blocks
tcg: Increase width of temp_subindex
hw/net/tulip: add chip status register values
hw/smbios: Fix port connector option validation
hw/smbios: Fix OEM strings table option validation
configure: run plugin TCG tests again
tests/docker: Add sqlite3 module to openSUSE Leap container
hw/riscv/virt-acpi-build.c: fix leak in build_rhct()
migration: Fix logic of channels and transport compatibility check
virtio-blk: avoid using ioeventfd state in irqfd conditional
virtio: Re-enable notifications after drain
virtio-scsi: Attach event vq notifier with no_poll
iotests: give tempdir an identifying name
iotests: fix leak of tmpdir in dry-run mode
hw/scsi/lsi53c895a: add missing decrement of reentrancy counter
linux-user/aarch64: Add padding before __kernel_rt_sigreturn
tcg/loongarch64: Set vector registers call clobbered
pci-host: designware: Limit value range of iATU viewport register
target/arm: Reinstate "vfp" property on AArch32 CPUs
qemu-options.hx: Improve -serial option documentation
system/vl.c: Fix handling of '-serial none -serial something'
target/arm: fix exception syndrome for AArch32 bkpt insn
block/blkio: Make s->mem_region_alignment be 64 bits
qemu-docs: Update options for graphical frontends
Make 'uri' optional for migrate QAPI
vfio/pci: Clear MSI-X IRQ index always
migration: Fix use-after-free of migration state object
migration: Plug memory leak on HMP migrate error path
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We wanted QEMU to support larger VMs (in therm of RAM size) by default
and we therefore introduced patch "[openSUSE] increase x86_64 physical
bits to 42". This, however, means that we create VMs with 42 bits of
physical address space even on hosts that only has, say, 40. And that
can't work.
In fact, it has been a problem since a long time (e.g., bsc#1205978) and
it's also the actual root cause of bsc#1219977.
Get rid of that old patch, in favor of a new one that still raise the
default number of address bits to 42, but only on hosts that supports
that.
This means that we can also use the proper SeaBIOS version, without
reverting commits that were only a problem due to our broken downstream
patch.
We probably aslo don't need to ship some of the custom ACPI tables (for
passing tests), but we'll actually remove them later, after double
checking properly that all the tests do work.
References: bsc#1205978
References: bsc#1219977
References: bsc#1220799
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update the copyright year to 2024, sort dependencies etc.
This way, 'osc' does not have to do these changes all the times (they're
automatic, so no big deal, but it's annoying to see them in the diffs of
all the requests).
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Backported commits:
* Update version for 8.2.1 release
* target/arm: Fix incorrect aa64_tidcp1 feature check
* target/arm: Fix A64 scalar SQSHRN and SQRSHRN
* target/xtensa: fix OOB TLB entry access
* qtest: bump aspeed_smc-test timeout to 6 minutes
* monitor: only run coroutine commands in qemu_aio_context
* iotests: port 141 to Python for reliable QMP testing
* iotests: add filter_qmp_generated_node_ids()
* block/blklogwrites: Fix a bug when logging "write zeroes" operations.
* virtio-net: correctly copy vnet header when flushing TX (bsc#1218484, CVE-2023-6693)
* tcg/arm: Fix SIGILL in tcg_out_qemu_st_direct
* linux-user/riscv: Adjust vdso signal frame cfa offsets
* linux-user: Fixed cpu restore with pc 0 on SIGBUS
* block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_status
* coroutine-ucontext: Save fake stack for pooled coroutine
* tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
* accel/tcg: Revert mapping of PCREL translation block to multiple virtual addresses
* acpi/tests/avocado/bits: wait for 200 seconds for SHUTDOWN event from bits VM
* s390x/pci: drive ISM reset from subsystem reset
* s390x/pci: refresh fh before disabling aif
* s390x/pci: avoid double enable/disable of aif
* hw/scsi/esp-pci: set DMA_STAT_BCMBLT when BLAST command issued
* hw/scsi/esp-pci: synchronise setting of DMA_STAT_DONE with ESP completion interrupt
* hw/scsi/esp-pci: generate PCI interrupt from separate ESP and PCI sources
* hw/scsi/esp-pci: use correct address register for PCI DMA transfers
* migration/rdma: define htonll/ntohll only if not predefined
* hw/pflash: implement update buffer for block writes
* hw/pflash: use ldn_{be,le}_p and stn_{be,le}_p
* hw/pflash: refactor pflash_data_write()
* backends/cryptodev: Do not ignore throttle/backends Errors
* target/i386: pcrel: store low bits of physical address in data[0]
* target/i386: fix incorrect EIP in PC-relative translation blocks
* target/i386: Do not re-compute new pc with CF_PCREL
* load_elf: fix iterator's type for elf file processing
* target/hppa: Update SeaBIOS-hppa to version 15
* target/hppa: Fix IOR and ISR on error in probe
* target/hppa: Fix IOR and ISR on unaligned access trap
* target/hppa: Export function hppa_set_ior_and_isr()
* target/hppa: Avoid accessing %gr0 when raising exception
* hw/hppa: Move software power button address back into PDC
* target/hppa: Fix PDC address translation on PA2.0 with PSW.W=0
* hw/pci-host/astro: Add missing astro & elroy registers for NetBSD
* hw/hppa/machine: Disable default devices with --nodefaults option
* hw/hppa/machine: Allow up to 3840 MB total memory
* readthodocs: fully specify a build environment
* .gitlab-ci.d/buildtest.yml: Work around htags bug when environment is large
* target/s390x: Fix LAE setting a wrong access register
* tests/qtest/virtio-ccw: Fix device presence checking
* tests/acpi: disallow tests/data/acpi/virt/SSDT.memhp changes
* tests/acpi: update expected data files
* edk2: update binaries to git snapshot
* edk2: update build config, set PcdUninstallMemAttrProtocol = TRUE.
* edk2: update to git snapshot
* tests/acpi: allow tests/data/acpi/virt/SSDT.memhp changes
* util: fix build with musl libc on ppc64le
* tcg/ppc: Use new registers for LQ destination
* hw/intc/arm_gicv3_cpuif: handle LPIs in in the list registers
* hw/vfio: fix iteration over global VFIODevice list
* vfio/container: Replace basename with g_path_get_basename
* edu: fix DMA range upper bound check
* hw/net: cadence_gem: Fix MDIO_OP_xxx values
* audio/audio.c: remove trailing newline in error_setg
* chardev/char.c: fix "abstract device type" error message
* target/riscv: Fix mcycle/minstret increment behavior
* hw/net/can/sja1000: fix bug for single acceptance filter and standard frame
* target/i386: the sgx_epc_get_section stub is reachable
* configure: use a native non-cross compiler for linux-user
* include/ui/rect.h: fix qemu_rect_init() mis-assignment
* target/riscv/kvm: do not use non-portable strerrorname_np()
* iotests: Basic tests for internal snapshots
* vl: Improve error message for conflicting -incoming and -loadvm
* block: Fix crash when loading snapshot on inactive node
References: bsc#1218484 (CVE-2023-6693)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Depending on the VM configuration (both at the VM definition level and
on the guest itself) a VGA console might be necessary, or weird lockup
will occur. Since the VGA module package is smalle enough, add a
dependency for it, from other display modules, to act as a workaround.
While there, make more explicit and precise the dependencies between all
the various modules, by specifying that they should all have the same
version and release.
References: bsc#1219164
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Historically, KVM was available only for x86 and s390, and was invoked
via a binary called 'kvm' or 'qemu-kvm'. For a while, we've shipped a
package that was making it possible to invoke QEMU like that, but only
for these two arches. This, however, created a lot of confusion and
dependencies issues.
Fix them by creating a symlink from 'qemu-kvm' to the proper binary on
all arches and by making the main QEMU package Providing and Obsoleting
(also on all arches) the old qemu-kvm one.
Note that, for RISCV, the qemu-system-riscv64 binary, to which the symlink
should point, is in the qemu-extra package. However, if we are on RISCV,
qemu-extra is an hard dependency of qemu. Therefore, it's fine to ship
the link and also set the Provides: and Obsoletes: tag in the qemu
package itself. It'd be more correct to do that in the qemu-extra
package, of course, but this would complicate the spec file and it's not
worth it, considering this is all legacy and should very well go away
soon.
References: bsc#1218684
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Add to the ipxe submodule the commit (and all its dependencies) for
fixing building with binutils 2.42
References: bsc#1219733
References: bsc#1219722
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Point the submodules to the repositories that host our downstream
patches:
* roms/seabios
- [openSUSE] switch to python3 as needed
- [openSUSE] build: enable cross compilation on ARM
- [openSUSE] build: be explicit about -mx86-used-note=no
* roms/SLOF
- Allow to override build date with SOURCE_DATE_EPOCH
* roms/ipxe
- [ath5k] Add missing AR5K_EEPROM_READ in ath5k_eeprom_read_turbo_modes
- [openSUSE] [build] Makefile: fix issues of build reproducibility
- [openSUSE] [test] help compiler out by initializing array[openSUSE]
- [openSUSE] [build] Silence GCC 12 spurious warnings
- [librm] Use explicit operand size when pushing a label address
* roms/skiboot
- [openSUSE] Makefile: define endianess for cross-building on aarch64
- [openSUSE] Make Sphinx build reproducible (boo#1102408)
* roms/qboot
- [openSUSE] add cross.ini file to handle aarch64 based build
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest upstream release.
The full list of changes are available at:
https://wiki.qemu.org/ChangeLog/8.2
Highlights include:
* New virtio-sound device emulation
* New virtio-gpu rutabaga device emulation used by Android emulator
* New hv-balloon for dynamic memory protocol device for Hyper-V guests
* New Universal Flash Storage device emulation
* Network Block Device (NBD) 64-bit offsets for improved performance
* dump-guest-memory now supports the standard kdump format
* ARM: Xilinx Versal board now models the CFU/CFI, and the TRNG device
* ARM: CPU emulation support for cortex-a710 and neoverse-n2
* ARM: architectural feature support for PACQARMA3, EPAC, Pauth2, FPAC,
FPACCOMBINE, TIDCP1, MOPS, HBC, and HPMN0
* HPPA: CPU emulation support for 64-bit PA-RISC 2.0
* HPPA: machine emulation support for C3700, including Astro memory
controller and four Elroy PCI bridges
* LoongArch: ISA support for LASX extension and PRELDX instruction
* LoongArch: CPU emulation support for la132
* RISC-V: ISA/extension support for AIA virtualization support via KVM,
and vector cryptographic instructions
* RISC-V: Numerous extension/instruction cleanups, fixes, and reworks
* s390x: support for vfio-ap passthrough of crypto adapter for
protected
virtualization guests
* Tricore: support for TC37x CPU which implements ISA v1.6.2
* Tricore: support for CRCN, FTOU, FTOHP, and HPTOF instructions
* x86: Zen support for PV console and network devices
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Add some block drivers and virtiofsd as hard dependencies of the
qemu-headless package, to make sure it's really useful for headless
server environments (even when recommended packages are not installed).
Singed-off-by: Dario Faggioli <dfaggioli@suse.com>
Use a fixed USER value (in case someone builds outside of OBS/osc).
References: boo#1084909
Signed-off-by: Bernhard M. Wiedemann <githubbmwprimary@lsmod.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Define a new sub-(meta-)package that can be installed for having
all the other modules and packages necessary for SPICE to work.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Align to upstream stable release. It includes many of the patches we had
backported ourself, to fix bugs and issues, plus more.
See here for details:
- https://lore.kernel.org/qemu-devel/1700589639.257680.3420728.nullmailer@tls.msk.ru/
- https://gitlab.com/qemu-project/qemu/-/commits/stable-8.1?ref_type=heads
An (incomplete!) list of such backports is:
* Update version for 8.1.3 release
* hw/mips: LOONGSON3V depends on UNIMP device
* target/arm: HVC at EL3 should go to EL3, not EL2
* s390x/pci: only limit DMA aperture if vfio DMA limit reported
* target/riscv/kvm: support KVM_GET_REG_LIST
* target/riscv/kvm: improve 'init_multiext_cfg' error msg
* tracetool: avoid invalid escape in Python string
* tests/tcg/s390x: Test LAALG with negative cc_src
* target/s390x: Fix LAALG not updating cc_src
* tests/tcg/s390x: Test CLC with inaccessible second operand
* target/s390x: Fix CLC corrupting cc_src
* tests/qtest: ahci-test: add test exposing reset issue with pending callback
* hw/ide: reset: cancel async DMA operation before resetting state
* target/mips: Fix TX79 LQ/SQ opcodes
* target/mips: Fix MSA BZ/BNZ opcodes displacement
* ui/gtk-egl: apply scale factor when calculating window's dimension
* ui/gtk: force realization of drawing area
* ati-vga: Implement fallback for pixman routines
* ...
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Avoid parallel processing in sphinx because that causes variations in
generated files
This is addressed here, with a downstream patch, until a proper solution
is found upstream.
Signed-off-by: Bernhard Wiedemann <bwiedemann@suse.com>
References: boo#1102408
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
The supportconfig 'scplugin.rc' file is deprecated in favor of
supportconfig.rc'. Adapt the qemu plugin to the new scheme.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Our workflow does not include patches in the spec files. Still, it could
be useful to add some there, during development and/or debugging issues.
Make sure that they are applied properly, by adding -p1 to the
%autosetup directive (it's a nop if there are no patches, so both cases
are ok).
Suggested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
This fixes the following upstream issues:
* https://gitlab.com/qemu-project/qemu/-/issues/1826
* https://gitlab.com/qemu-project/qemu/-/issues/1834
* https://gitlab.com/qemu-project/qemu/-/issues/1846
It also contains a fix for:
* CVE-2023-42467 (bsc#1215192)
As well as several upstream backports:
* target/riscv: Fix vfwmaccbf16.vf
* disas/riscv: Fix the typo of inverted order of pmpaddr13 and pmpaddr14
* roms: use PYTHON to invoke python
* hw/audio/es1370: reset current sample counter
* migration/qmp: Fix crash on setting tls-authz with null
* util/log: re-allow switching away from stderr log file
* vfio/display: Fix missing update to set backing fields
* amd_iommu: Fix APIC address check
* vdpa net: follow VirtIO initialization properly at cvq isolation probing
* vdpa net: stop probing if cannot set features
* vdpa net: fix error message setting virtio status
* vdpa net: zero vhost_vdpa iova_tree pointer at cleanup
* linux-user/hppa: Fix struct target_sigcontext layout
* chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
* hw/display/ramfb: plug slight guest-triggerable leak on mode setting
* win32: avoid discarding the exception handler
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: generalize operand size "ph" for use in CVTPS2PD
* subprojects/berkeley-testfloat-3: Update to fix a problem with compiler warnings
* scsi-disk: ensure that FORMAT UNIT commands are terminated
* esp: restrict non-DMA transfer length to that of available data
* esp: use correct type for esp_dma_enable() in sysbus_esp_gpio_demux()
* optionrom: Remove build-id section
* target/tricore: Fix RCPW/RRPW_INSERT insns for width = 0
* accel/tcg: Always require can_do_io
* accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
* accel/tcg: Improve setting of can_do_io at start of TB
* accel/tcg: Track current value of can_do_io in the TB
* accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
* accel/tcg: Avoid load of icount_decr if unused
* softmmu: Use async_run_on_cpu in tcg_commit
* migration: Move return path cleanup to main migration thread
* migration: Replace the return path retry logic
* migration: Consolidate return path closing code
* migration: Remove redundant cleanup of postcopy_qemufile_src
* migration: Fix possible race when shutting down to_dst_file
* migration: Fix possible races when shutting down the return path
* migration: Fix possible race when setting rp_state.error
* migration: Fix race that dest preempt thread close too early
* ui/vnc: fix handling of VNC_FEATURE_XVP
* ui/vnc: fix debug output for invalid audio message
* hw/scsi/scsi-disk: Disallow block sizes smaller than 512 [CVE-2023-42467]
* accel/tcg: mttcg remove false-negative halted assertion
* meson.build: Make keyutils independent from keyring
* target/arm: Don't skip MTE checks for LDRT/STRT at EL0
* hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel
* include/exec: Widen tlb_hit/tlb_hit_page()
* tests/file-io-error: New test
* file-posix: Simplify raw_co_prw's 'out' zone code
* file-posix: Fix zone update in I/O error path
* file-posix: Check bs->bl.zoned for zone info
* file-posix: Clear bs->bl.zoned on error
* hw/cxl: Fix out of bound array access
* hw/cxl: Fix CFMW config memory leak
* linux-user/hppa: lock both words of function descriptor
* linux-user/hppa: clear the PSW 'N' bit when delivering signals
* hw/ppc: Read time only once to perform decrementer write
* hw/ppc: Reset timebase facilities on machine reset
* hw/ppc: Always store the decrementer value
* target/ppc: Sign-extend large decrementer to 64-bits
* hw/ppc: Avoid decrementer rounding errors
* hw/ppc: Round up the decrementer interval when converting to ns
* host-utils: Add muldiv64_round_up
Signed-of-by: Dario Faggioli <dfaggioli@suse.com>
perl-Text-Markdown is not always available (e.g., in SLE/Leap).
Use discount instead, as the provider of the 'markdown' binary.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
OBS SCM bridge can handle git submodule, while it can't handle (yet?)
meson subprojects. The (ugly, I know!) solution, for now, is to turn
the latter into the former, with commands like the followings:
git submodule add -f https://gitlab.com/qemu-project/berkeley-testfloat-3 subprojects/berkeley-testfloat-3
git -C subprojects/berkeley-testfloat-3 reset --hard 40619cbb3bf32872df8c53cc457039229428a263
(the hash used comes from the subprojects/berkeley-testfloat-3.wrap file)
It's also necessary to manually apply the layering of the packagefiles,
and that is done in the specfile.
Longer term and better solutions could be:
- Make SCM support meson subprojects
- Create standalone packages for the subprojects (and instruct
QEMU to pick stuff from there)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Full list of changes are available at:
https://wiki.qemu.org/ChangeLog/8.1
Highlights:
* VFIO: improved live migration support, no longer an experimental feature
* GTK GUI now supports multi-touch events
* ARM, PowerPC, and RISC-V can now use AES acceleration on host processor
* PCIe: new QMP commands to inject CXL General Media events, DRAM
events and Memory Module events
* ARM: KVM VMs on a host which supports MTE (the Memory Tagging Extension)
can now use MTE in the guest
* ARM: emulation support for bpim2u (Banana Pi BPI-M2 Ultra) board and
neoverse-v1 (Cortex Neoverse-V1) CPU
* ARM: new architectural feature support for: FEAT_PAN3 (SCTLR_ELx.EPAN),
FEAT_LSE2 (Large System Extensions v2), and experimental support for
FEAT_RME (Realm Management Extensions)
* Hexagon: new instruction support for v68/v73 scalar, and v68/v69 HVX
* Hexagon: gdbstub support for HVX
* MIPS: emulation support for Ingenic XBurstR1/XBurstR2 CPUs, and MXU
instructions
* PowerPC: TCG SMT support, allowing pseries and powernv to run with up
to 8 threads per core
* PowerPC: emulation support for Power9 DD2.2 CPU model, and perf
sampling support for POWER CPUs
* RISC-V: ISA extension support for BF16/Zfa, and disassembly support
for Zcm*/Z*inx/XVentanaCondOps/Xthead
* RISC-V: CPU emulation support for Veyron V1
* RISC-V: numerous KVM/emulation fixes and enhancements
* s390: instruction emulation fixes for LDER, LCBB, LOCFHR, MXDB, MXDBR,
EPSW, MDEB, MDEBR, MVCRL, LRA, CKSM, CLM, ICM, MC, STIDP, EXECUTE, and
CLGEBR(A)
* SPARC: updated target/sparc to use tcg_gen_lookup_and_goto_ptr() for
improved performance
* Tricore: emulation support for TC37x CPU that supports ISA v1.6.2
instructions
* Tricore: instruction emulation of POPCNT.W, LHA, CRC32L.W, CRC32.B,
SHUFFLE, SYSCALL, and DISABLE
* x86: CPU model support for GraniteRapids
* and lots more...
This also (automatically) fixes:
- bsc#1212850 (CVE-2023-3354)
- bsc#1213001 (CVE-2023-3255)
- bsc#1213925 (CVE-2023-3180)
- bsc#1213414 (CVE-2023-3301)
- bsc#1207205 (CVE-2023-0330)
- bsc#1212968 (CVE-2023-2861)
- bsc#1179993, bsc#1181740, bsc#1211697
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
By default try to preserve argv[0].
Original report is boo#1197298, which also became relevant recently again in bsc#1212768.
Signed-off-by: Fabian Vogt <fabian@ritter-vogt.de>
References: boo#1197298
References: bsc#1212768
Signed-off-by: Fabian Vogt <fabian@ritter-vogt.de>
Create separate packages for qemu-img and qemu-pr-helper.
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de>
Since version 8.0.0, virtiofsd is not part of QEMU sources any longer.
We therefore have also moved it to a separate package. To retain
compatibility and consistency of behavior, require such a package as an
hard dependency.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
For example, let's try to avoid recommending GUI UI stuff, unless GTK is
already installed. This way we avoid things like bringing in an entire
graphic stack on servers.
References: bsc#1205680
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
- The qemu-headless subpackage was defined but never build, because it
had no files. Fix that by putting there just a simple README.
- Move the docs in a dedicated subpackage
Resolves: bsc#1209629
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
As part of the effort to close the gap with Leap I think we are fine
removing the $pkgversion component to creating a unique CONFIG_STAMP.
This stamp is only used in creating a unique symbol used in ensuring the
dynamically loaded modules correspond correctly to the loading qemu.
The default inputs to producing this unique symbol are somewhat reasonable
as a generic mechanism, but specific packaging and maintenance practices
might require the default to be modified for best use. This is an example
of that.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We are disabling the following tests:
qemu-system-ppc64 / display-vga-test
They are failing due to some memory corruption errors. We believe that
this might be due to the combination of the compiler version and of LTO,
and will take up the investigation within the upstream community.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Executing tests in obs is very fickle, since you aren't guaranteed
reliable cpu time. Triple the timeout for each test to help ensure
we don't fail a test because the stars align against us.
Signed-off-by: Bruce Rogers <brogers@suse.com>
[DF: Small tweaks necessary for rebasing on top of 6.2.0]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Since we have a quite restricted execution environment, as far as
networking is concerned, we need to change the error message we expect
in test 162. There is actually no routing set up so the error we get is
"Network is unreachable". Change the expected output accordingly.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Revert commit "tests/qtest: enable more vhost-user tests by default"
(8dcb404bff), as it causes prooblem when building with GCC 12 and LTO
enabled.
This should be considered temporary, until the actual reason why the
code of the tests that are added in that commit breaks.
It has been reported upstream, and will be (hopefully) solved there:
https://lore.kernel.org/qemu-devel/1d3bbff9e92e7c8a24db9e140dcf3f428c2df103.camel@suse.com/
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
SG_IO may return additional status in the 'status', 'driver_status',
and 'host_status' fields. When either of these fields are set the
command has not been executed normally, so we should not continue
processing this command but rather return an error.
scsi_read_complete() already checks for these errors,
scsi_write_complete() does not.
References: bsc#1178049
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
While using SCSI passthrough, Following scenario makes qemu doesn't
realized the capacity change of remote scsi target:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'rescan-scsi-bus.sh -s ...' in vm.
In above scenario I used to experienced errors while accessing the
additional disk space in vm. I think the reasonable operations should
be:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'block_resize' via qmp to notify qemu.
4. issue 'rescan-scsi-bus.sh -s ...' in vm.
The errors disappear once I notify qemu by block_resize via qmp.
So this patch replaces the number of logical blocks of READ CAPACITY
response from scsi target by qemu's bs->total_sectors. If the user in
vm wants to access the additional disk space, The administrator of
host must notify qemu once resizeing the scsi target.
Bonus is that domblkinfo of libvirt can reflect the consistent capacity
information between host and vm in case of missing block_resize in qemu.
E.g:
...
<disk type='block' device='lun'>
<driver name='qemu' type='raw'/>
<source dev='/dev/sdc' index='1'/>
<backingStore/>
<target dev='sda' bus='scsi'/>
<alias name='scsi0-0-0-0'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
...
Before:
1. online resize the scsi target.
2. host:~ # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4 host:~ # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity: 4.000 GiB
Allocation: 0.000 B
Physical: 8.000 GiB
5. guest:~ # lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 8G 0 disk
└─sda1 8:1 0 2G 0 part
After:
1. online resize the scsi target.
2. host:~ # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4 host:~ # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity: 4.000 GiB
Allocation: 0.000 B
Physical: 8.000 GiB
5. guest:~ # lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 4G 0 disk
└─sda1 8:1 0 2G 0 part
References: [SUSE-JIRA] (SLE-20965)
Signed-off-by: Lin Ma <lma@suse.com>
The final step of xl migrate|save for an HVM domU is saving the state of
qemu. This also involves releasing all block devices. While releasing
backends ought to be a separate step, such functionality is not
implemented.
Unfortunately, releasing the block devices depends on the optional
'live' option. This breaks offline migration with 'virsh migrate domU
dom0' because the sending side does not release the disks, as a result
the receiving side can not properly claim write access to the disks.
As a minimal fix, remove the dependency on the 'live' option. Upstream
may fix this in a different way, like removing the newly added 'live'
parameter entirely.
Fixes: 5d6c599fe1 ("migration, xen: Fix block image lock issue on live migration")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
References: bsc#1079730, bsc#1101982, bsc#1063993
Signed-off-by: Bruce Rogers <brogers@suse.com>
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.
[BR: minor edits to pass qemu's checkpatch script]
[BR: significant rework needed due to upstream xen disk qdevification]
[BR: At this point, monitor_add_blk call is all we need to add!]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.
Patch taken from Xen's patch queue, Olaf Hering being the original author.
[bsc#879425]
[BR: minor edits to pass qemu's checkpatch script]
[BR: With qdevification of xen-block, code has changed significantly]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Change from using glib alloc and free routines to those
from libc. Also perform safety measure of dropping privs
to user if configured no-caps.
References: boo#988279
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strange "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
[AF: Rebased for v2.7.0-rc2]
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.
So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
Signed-off-by: Bruce Rogers <brogers@suse.com>
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.
When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.
Signed-off-by: Alexander Graf <agraf@suse.de>
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.
Pass syscall arguments as ulong always.
Signed-off-by: Alexander Graf <agraf@suse.de>
[JRZ: changes from linux-user/qemu.h wass moved to linux-user/user-internals.h]
Signed-off-by: Jose R Ziviani <jziviani@suse.de>
[DF: Forward port, i.e., use ulong for do_prctl too]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We add a --cross-file reference so that we can do cross compilation
of qboot from an aarch64 build.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Certain rom subpackages build from qemu git-submodules call the date
program to include date information in the packaged binaries. This
causes repeated builds of the package to be different, wkere the only
real difference is due to the fact that time build timestamp has
changed. To promote reproducible builds and avoid customers being
prompted to update packages needlessly, we'll use the timestamp of the
VERSION file as the packaging timestamp for all packages that build in a
timestamp for whatever reason.
References: bsc#1011213
Signed-off-by: Bruce Rogers <brogers@suse.com>
The sgabios submodule is no longer there, so let's get rid of any
reference to it from our spec files.
Remove no longer supported './configure' options.
We're also not set yet for using the set_version service, so we need to
update the following manually:
- the Version: tags in the spec files
- the rpm/seabios_version and rpm/skiboot_version files (see qemu.spec
for instructions on how to do that)
- the %{sbver} variable in rpm/common.inc
A better solution for handling this aspect is being worked on.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
In an upstream tarball there are some special files, generated by a
script that is run when the archive is prepared. Let's make our
repository look a little more like that, so we can build it properly.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Stash the "packaging files" in the QEMU repository, in the rpm/
directory. During package build, they will be pulled out from there
and used as appropriate.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
DisplaySurface may be free before the pixman image is freed, since the
image is refcounted and used by different objects, including pending
dbus messages.
Furthermore, setting the destroy function in
create_displaysurface_from() isn't appropriate, as it may not be used,
and may be overriden as in ramfb.
Set the destroy function when the shared handle is set, use the HANDLE
directly for destroy data, using a single common helper
qemu_pixman_win32_image_destroy().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20241008125028.1177932-5-marcandre.lureau@redhat.com>
(cherry picked from commit 330ef31deb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When SET_STREAM_FORMAT is called, we should clear the existing setup.
Factor out common function to close a stream.
Direct leak of 144 byte(s) in 3 object(s) allocated from:
#0 0x7f91d38f7350 in calloc (/lib64/libasan.so.8+0xf7350) (BuildId: a4ad7eb954b390cf00f07fa10952988a41d9fc7a)
#1 0x7f91d2ab7871 in g_malloc0 (/lib64/libglib-2.0.so.0+0x64871) (BuildId: 36b60dbd02e796145a982d0151ce37202ec05649)
#2 0x562fa2f447ee in timer_new_full /home/elmarco/src/qemu/include/qemu/timer.h:538
#3 0x562fa2f4486f in timer_new /home/elmarco/src/qemu/include/qemu/timer.h:559
#4 0x562fa2f448a9 in timer_new_ns /home/elmarco/src/qemu/include/qemu/timer.h:577
#5 0x562fa2f47955 in hda_audio_setup ../hw/audio/hda-codec.c:490
#6 0x562fa2f4897e in hda_audio_command ../hw/audio/hda-codec.c:605
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20241008125028.1177932-3-marcandre.lureau@redhat.com>
(cherry picked from commit 6d6e23361f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The result of 1 << regbit with regbit==31 has a 1 in the 32nd bit.
When cast to uint64_t (for further bitwise OR), the 32 most
significant bits will be filled with 1s. However, the documentation
states that the upper 32 bits of ICH_AP[0/1]R<n>_EL2 are reserved.
Add an explicit cast to match the documentation.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: c3f21b065a ("hw/intc/arm_gicv3_cpuif: Support vLPIs")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3db74afec3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The result of 1 << regbit with regbit==31 has a 1 in the 32nd bit.
When cast to uint64_t (for further bitwise OR), the 32 most
significant bits will be filled with 1s. However, the documentation
states that the upper 32 bits of ICC_AP[0/1]R<n>_EL2 are reserved.
Add an explicit cast to match the documentation.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: 28cca59c46 ("hw/intc/arm_gicv3: Add NMI handling CPU interface registers")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 12dc8f6eca)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The result of 1 << regbit with regbit==31 has a 1 in the 32nd bit.
When cast to uint64_t (for further bitwise OR), the 32 most
significant bits will be filled with 1s. However, the documentation
states that the upper 32 bits of ICH_AP[0/1]R<n>_EL2 are reserved.
Add an explicit cast to match the documentation.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: d2c0c6aab6 ("hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit e0c0ea6eca)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Moving -mcx16 out of CPU_CFLAGS caused the detection of ATOMIC128 to
fail, because flags have to be specified by hand in cc.compiles and
cc.links invocations (why oh why??).
Ensure that these tests enable all the instruction set extensions that
will be used to build the emulators.
Fixes: c2bf2ccb26 ("configure: move -mcx16 flag out of CPU_CFLAGS", 2024-05-24)
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8db4e0f92e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Create a separate variable for compiler flags that enable
specific instruction set extensions, so that they can be used with
cc.compiles/cc.links.
Note that -mfpmath=sse is a code generation option but it does not
enable new instructions, therefore I did not make it part of
qemu_isa_flags.
Suggested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6ae8c5382b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the fallback when STDBRX is not available, avoid clobbering
TCG_REG_TMP1, which might be h.base, which is still in use.
Use TCG_REG_TMP2 instead.
Cc: qemu-stable@nongnu.org
Fixes: 01a112e2e9 ("tcg/ppc: Reorg tcg_out_tlb_read")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-By: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 4cabcb89b1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Comparing a string of 4 bytes only works in little-endian.
Adjust bulk bswap to only apply to the note payload.
Perform swapping of the note header manually; the magic
is defined so that it does not need a runtime swap.
Fixes: 83f990eb5a ("linux-user/elfload: Parse NT_GNU_PROPERTY_TYPE_0 notes")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2596
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 2884596f5f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since commit e99441a379 ("ui/curses: Do not use console_select()")
qemu_text_console_put_keysym() no longer checks for NULL console
argument, which leads to a later crash:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x00005555559ee186 in qemu_text_console_handle_keysym (s=0x0, keysym=31) at ../ui/console-vc.c:332
332 } else if (s->echo && (keysym == '\r' || keysym == '\n')) {
(gdb) bt
#0 0x00005555559ee186 in qemu_text_console_handle_keysym (s=0x0, keysym=31) at ../ui/console-vc.c:332
#1 0x00005555559e18e5 in qemu_text_console_put_keysym (s=<optimized out>, keysym=<optimized out>) at ../ui/console.c:303
#2 0x00005555559f2e88 in do_key_event (vs=vs@entry=0x5555579045c0, down=down@entry=1, keycode=keycode@entry=60, sym=sym@entry=65471) at ../ui/vnc.c:2034
#3 0x00005555559f845c in ext_key_event (vs=0x5555579045c0, down=1, sym=65471, keycode=<optimized out>) at ../ui/vnc.c:2070
#4 protocol_client_msg (vs=0x5555579045c0, data=<optimized out>, len=<optimized out>) at ../ui/vnc.c:2514
#5 0x00005555559f515c in vnc_client_read (vs=0x5555579045c0) at ../ui/vnc.c:1607
Fixes: e99441a379 ("ui/curses: Do not use console_select()")
Fixes: https://issues.redhat.com/browse/RHEL-50529
Cc: qemu-stable@nongnu.org
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 0e60fc8093)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The mips64el cross setup is very broken for bullseye which has now
entered LTS support so is unlikely to be fixed. While we still can't
build the container with all packages for bookworm due to a single
missing dependency that will hopefully get fixed in due course. For
the sake of keeping the CI green we disable the problematic packages
via the lcitool's mappings.yml file.
See also: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1081535
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[thuth: Disable the problematic packages via lcitool's mappings.yml]
Message-ID: <20241002080333.127172-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c60473d292)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The enable bits in the EXT_CSD_PART_CONFIG ext_csd register do *not*
specify whether the boot partitions exist, but whether they are enabled
for booting. Existence of the boot partitions is specified by a
EXT_CSD_BOOT_MULT != 0.
Currently, in the case of boot-partition-size=1M and boot-config=0,
Linux detects boot partitions of 1M. But as sd_bootpart_offset always
returns 0, all reads/writes are mapped to the same offset in the backing
file.
Fix this bug by calculating the offset independent of which partition is
enabled for booting.
This bug is unlikely to affect many users with QEMU's current set of
boards, because only aspeed sets boot-partition-size, and it also
sets boot-config to 8. So to run into this a user would have to
manually mark the boot partition non-booting from within the guest.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Message-id: 20240906164834.130257-1-jlu@pengutronix.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added note to commit message about effects of bug]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9601076b3b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target_ulong is typedef'ed as a 32-bit integer when building the
qemu-system-arm target, and this is smaller than the size of an
intermediate physical address when LPAE is being used.
Given that Linux may place leaf level user page tables in high memory
when built for LPAE, the kernel will crash with an external abort as
soon as it enters user space when running with more than ~3 GiB of
system RAM.
So replace target_ulong with vaddr in places where it may carry an
address value that is not representable in 32 bits.
Fixes: f3639a64f6 ("target/arm: Use softmmu tlbs for page table walking")
Cc: qemu-stable@nongnu.org
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Message-id: 20240927071051.1444768-1-ardb+git@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 67d762e716)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Allow overlapping request by removing the assert that made it
impossible. There are only two callers:
1. block_copy_task_create()
It already asserts the very same condition before calling
reqlist_init_req().
2. cbw_snapshot_read_lock()
There is no need to have read requests be non-overlapping in
copy-before-write when used for snapshot-access. In fact, there was no
protection against two callers of cbw_snapshot_read_lock() calling
reqlist_init_req() with overlapping ranges and this could lead to an
assertion failure [1].
In particular, with the reproducer script below [0], two
cbw_co_snapshot_block_status() callers could race, with the second
calling reqlist_init_req() before the first one finishes and removes
its conflicting request.
[0]:
> #!/bin/bash -e
> dd if=/dev/urandom of=/tmp/disk.raw bs=1M count=1024
> ./qemu-img create /tmp/fleecing.raw -f raw 1G
> (
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev raw,node-name=node0,file.driver=file,file.filename=/tmp/disk.raw \
> --blockdev raw,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.raw \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> {"execute": "nbd-server-start", "arguments": {"addr": { "type": "unix", "data": { "path": "/tmp/nbd.socket" } } } }
> {"execute": "block-export-add", "arguments": {"id": "exp0", "node-name": "snap0", "type": "nbd", "name": "exp0"}}
> EOF
> ) &
> sleep 5
> while true; do
> ./qemu-nbd -d /dev/nbd0
> ./qemu-nbd -c /dev/nbd0 nbd:unix:/tmp/nbd.socket:exportname=exp0 -f raw -r
> nbdinfo --map 'nbd+unix:///exp0?socket=/tmp/nbd.socket'
> done
[1]:
> #5 0x000071e5f0088eb2 in __GI___assert_fail (...) at ./assert/assert.c:101
> #6 0x0000615285438017 in reqlist_init_req (...) at ../block/reqlist.c:23
> #7 0x00006152853e2d98 in cbw_snapshot_read_lock (...) at ../block/copy-before-write.c:237
> #8 0x00006152853e3068 in cbw_co_snapshot_block_status (...) at ../block/copy-before-write.c:304
> #9 0x00006152853f4d22 in bdrv_co_snapshot_block_status (...) at ../block/io.c:3726
> #10 0x000061528543a63e in snapshot_access_co_block_status (...) at ../block/snapshot-access.c:48
> #11 0x00006152853f1a0a in bdrv_co_do_block_status (...) at ../block/io.c:2474
> #12 0x00006152853f2016 in bdrv_co_common_block_status_above (...) at ../block/io.c:2652
> #13 0x00006152853f22cf in bdrv_co_block_status_above (...) at ../block/io.c:2732
> #14 0x00006152853d9a86 in blk_co_block_status_above (...) at ../block/block-backend.c:1473
> #15 0x000061528538da6c in blockstatus_to_extents (...) at ../nbd/server.c:2374
> #16 0x000061528538deb1 in nbd_co_send_block_status (...) at ../nbd/server.c:2481
> #17 0x000061528538f424 in nbd_handle_request (...) at ../nbd/server.c:2978
> #18 0x000061528538f906 in nbd_trip (...) at ../nbd/server.c:3121
> #19 0x00006152855a7caf in coroutine_trampoline (...) at ../util/coroutine-ucontext.c:175
Cc: qemu-stable@nongnu.org
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240712140716.517911-1-f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
(cherry picked from commit 6475155d51)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When we shut down a guest we disable the timers. However this can
cause deadlock if the guest has queued some async work that is trying
to advance system time and spins forever trying to wind time forward.
Pay attention to the return code and bail early if we can't wind time
forward.
Reported-by: Elisha Hollander <just4now666666@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240916085400.1046925-15-alex.bennee@linaro.org>
(cherry picked from commit bc02be4508)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit e104edbb9d ("hw/mips/jazz: use qemu_find_nic_info()") contained a typo
in the NIC alias which caused initialisation of the in-built dp83932 NIC to fail
when using the normal -nic user,model=dp83932 command line.
Fixes: e104edbb9d ("hw/mips/jazz: use qemu_find_nic_info()")
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 2e4fdf5660)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The XT check for the lxvx/stxvx instructions is currently
inverted. This was introduced during the move to decodetree.
>From the ISA:
Chapter 7. Vector-Scalar Extension Facility
Load VSX Vector Indexed X-form
lxvx XT,RA,RB
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
...
Let XT be the value 32×TX + T.
The code currently does the opposite:
if (paired || a->rt >= 32) {
REQUIRE_VSX(ctx);
} else {
REQUIRE_VECTOR(ctx);
}
This was already fixed for lxv/stxv at commit "2cc0e449d1 (target/ppc:
Fix lxv/stxv MSR facility check)", but the indexed forms were missed.
Cc: qemu-stable@nongnu.org
Fixes: 70426b5bb7 ("target/ppc: moved stxvx and lxvx from legacy to decodtree")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20240911141651.6914-1-farosas@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 8bded2e73e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The description about virt machine type is removed by mistake, add
new description here. Here is output result with command
"./qemu-system-loongarch64 -M help"
Supported machines are:
none empty machine
virt QEMU LoongArch Virtual Machine (default)
x-remote Experimental remote machine
Without the patch, it shows as follows:
Supported machines are:
none empty machine
virt (null) (default)
x-remote Experimental remote machine
Fixes: ef2f11454c(hw/loongarch/virt: Replace Loongson IPI with LoongArch IPI)
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 4265b4f358)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The send_cleanup() hook should free the p->iov that was allocated at
send_setup(). This was missed because the UADK code is conditional on
the presence of the accelerator, so it's not tested by default.
Fixes: 819dd20636 ("migration/multifd: Add UADK initialization")
Reported-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 405e352d28)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In vmstate_tlbemb a cut-and-paste error meant we gave
this vmstate subsection the same "cpu/tlb6xx" name as
the vmstate_tlb6xx subsection. This breaks migration load
for any CPU using the TLB_EMB CPU type, because when we
see the "tlb6xx" name in the incoming data we try to
interpret it as a vmstate_tlb6xx subsection, which it
isn't the right format for:
$ qemu-system-ppc -drive
if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2
-monitor stdio -M bamboo
QEMU 9.0.92 monitor - type 'help' for more information
(qemu) savevm foo
(qemu) loadvm foo
Missing section footer for cpu
Error: Error -22 while loading VM state
Correct the incorrect vmstate section name. Since migration
for these CPU types was completely broken before, we don't
need to care that this is a migration compatibility break.
This affects the PPC 405, 440, 460 and e200 CPU families.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Arman Nabiev <nabiev.arman13@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 203beb6f04)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The linux-user hppa target crashes randomly for me since commit
081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry").
That commit dropped the masking of the IAOQ addresses while copying them
from other registers and instead keeps them with all 64 bits up until
the full gva is formed with the help of hppa_form_gva_psw().
So, when running in linux-user mode on an emulated 64-bit CPU, we need
to mask to a 32-bit address space at the very end in hppa_form_gva_psw()
if the PSW-W flag isn't set (which is the case for linux-user on hppa).
Fixes: 081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry")
Cc: qemu-stable@nongnu.org # v9.1+
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit d33d3adb57)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fix a segmentation fault in multifd when rb->receivedmap is cleared
too early.
After commit 5ef7e26bdb ("migration/multifd: solve zero page causing
multiple page faults"), multifd started using the rb->receivedmap
bitmap, which belongs to ram.c and is initialized and *freed* from the
ram SaveVMHandlers.
Multifd threads are live until migration_incoming_state_destroy(),
which is called after qemu_loadvm_state_cleanup(), leading to a crash
when accessing rb->receivedmap.
process_incoming_migration_co() ...
qemu_loadvm_state() multifd_nocomp_recv()
qemu_loadvm_state_cleanup() ramblock_recv_bitmap_set_offset()
rb->receivedmap = NULL set_bit_atomic(..., rb->receivedmap)
...
migration_incoming_state_destroy()
multifd_recv_cleanup()
multifd_recv_terminate_threads(NULL)
Move the loadvm cleanup into migration_incoming_state_destroy(), after
multifd_recv_cleanup() to ensure multifd threads have already exited
when rb->receivedmap is cleared.
Adjust the postcopy listen thread comment to indicate that we still
want to skip the cpu synchronization.
CC: qemu-stable@nongnu.org
Fixes: 5ef7e26bdb ("migration/multifd: solve zero page causing multiple page faults")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917185802.15619-3-farosas@suse.de
[peterx: added comment in migration_incoming_state_destroy()]
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 4ce5622908)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These were passing a NULL buffer pointer unconditionally, which happens
to behave in a mostly benign way (except for the chance of an excess
memory region unref and a bounce buffer leak). Per the function comment,
this was never meant to be accepted though, and triggers an assertion
with the "softmmu: Support concurrent bounce buffers" change.
Given that the code in question never sets up any mappings, just remove
the unnecessary dma_memory_unmap calls along with the DBDMA_io struct
fields that are now entirely unused.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-Id: <20240916175708.1829059-1-mnissler@rivosinc.com>
Fixes: be1e343995 ("macio: switch over to new byte-aligned DMA helpers")
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit 2d0a071e62)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When DMA memory can't be directly accessed, as is the case when
running the device model in a separate process without shareable DMA
file descriptors, bounce buffering is used.
It is not uncommon for device models to request mapping of several DMA
regions at the same time. Examples include:
* net devices, e.g. when transmitting a packet that is split across
several TX descriptors (observed with igb)
* USB host controllers, when handling a packet with multiple data TRBs
(observed with xhci)
Previously, qemu only provided a single bounce buffer per AddressSpace
and would fail DMA map requests while the buffer was already in use. In
turn, this would cause DMA failures that ultimately manifest as hardware
errors from the guest perspective.
This change allocates DMA bounce buffers dynamically instead of
supporting only a single buffer. Thus, multiple DMA mappings work
correctly also when RAM can't be mmap()-ed.
The total bounce buffer allocation size is limited individually for each
AddressSpace. The default limit is 4096 bytes, matching the previous
maximum buffer size. A new x-max-bounce-buffer-size parameter is
provided to configure the limit for PCI devices.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240819135455.2957406-1-mnissler@rivosinc.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 637b0aa139)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This fixes:
commit e28112d007
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Thu Jun 8 17:40:16 2023 +0100
gitlab: stable staging branches publish containers in a separate tag
Due to a copy+paste mistake, that commit included "QEMU_JOB_SKIPPED"
in the final rule that was meant to be a 'catch all' for staging
branches.
As a result stable branches are still splattering dockers from the
primary development branch.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Message-ID: <20240906140958.84755-1-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8d5ab746b1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
On GICv2 and later, level triggered interrupts are pending when either
the interrupt line is asserted or the interrupt was made pending by a
GICD_ISPENDRn write. Making a level triggered interrupt pending by
software persists until either the interrupt is acknowledged or cleared
by writing GICD_ICPENDRn. As long as the interrupt line is asserted,
the interrupt is pending in any case.
This logic is transparently implemented in gic_test_pending() for
GICv1 and GICv2. The function combines the "pending" irq_state flag
(used for edge triggered interrupts and software requests) and the
line status (tracked in the "level" field). However, we also
incorrectly set the pending flag on a guest write to GICD_ISENABLERn
if the line of a level triggered interrupt was asserted. This keeps
the interrupt pending even if the line is de-asserted after some
time.
This incorrect logic is a leftover of the initial 11MPCore GIC
implementation. That handles things slightly differently to the
architected GICv1 and GICv2. The 11MPCore TRM does not give a lot of
detail on the corner cases of its GIC's behaviour, and historically
we have not wanted to investigate exactly what it does in reality, so
QEMU's GIC model takes the approach of "retain our existing behaviour
for 11MPCore, and implement the architectural standard for later GIC
revisions".
On that basis, commit 8d999995e4 in 2013 is where we added the
"level-triggered interrupt with the line asserted" handling to
gic_test_pending(), and we deliberately kept the old behaviour of
gic_test_pending() for REV_11MPCORE. That commit should have added
the "only if 11MPCore" condition to the setting of the pending bit on
writes to GICD_ISENABLERn, but forgot it.
Add the missing "if REV_11MPCORE" condition, so that our behaviour
on GICv1 and GICv2 matches the GIC architecture requirements.
Cc: qemu-stable@nongnu.org
Fixes: 8d999995e4 ("arm_gic: Fix GIC pending behavior")
Signed-off-by: Jan Klötzke <jan.kloetzke@kernkonzept.com>
Message-id: 20240911114826.3558302-1-jan.kloetzke@kernkonzept.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: expanded comment a little and converted to coding-style form;
expanded commit message with the historical backstory]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 110684c9a6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently, the guest may write to the device configuration space,
whereas the virtio sound device specification in chapter 5.14.4
clearly states that the fields in the device configuration space
are driver-read-only.
Remove the set_config function from the virtio_snd class.
This also prevents a heap buffer overflow. See QEMU issue #2296.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2296
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240901130112.8242-1-vr_qemu@t-online.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7fc6611cad)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Running "make distclean" in the build tree currently fails since this
tries to run the "distclean" target in the contrib/plugins/ folder, too,
but the Makefile there is missing this target. Thus add 'distclean' there
to fix this issue.
And to avoid regressions with "make distclean", add this command to one
of the build jobs, too.
Message-ID: <20240902154749.73876-1-thuth@redhat.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 1231bc7d12)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As debian-11 transitions to LTS we are starting to have problems
building the image. While we could update to a later Debian building a
32 bit QEMU without modern floating point is niche host amongst the
few remaining 32 bit hosts we regularly build for. For now we still
have armhf-debian-cross-container which is currently built from the
more recent debian-12.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240910173900.4154726-2-alex.bennee@linaro.org>
(cherry picked from commit d0068b746a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Both gnutls and gcrypt can be configured to exclude support for certain
algorithms via a runtime check against system crypto policies. Thus it
is not sufficient to have a compile time test for hash support in their
pbkdf implementations.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit e6c09ea4f9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Error reporting from gnutls was improved by:
commit 57941c9c86
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Fri Mar 15 14:07:58 2024 +0000
crypto: push error reporting into TLS session I/O APIs
This has the effect of changing the output from one of the NBD
tests.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 48b8583698)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
While adding hppa64 support, the psw_v variable got extended from 32 to 64
bits. So, when packaging the PSW-V bit from the psw_v variable for interrupt
processing, check bit 31 instead the 63th (sign) bit.
This fixes a hard to find Linux kernel boot issue where the loss of the PSW-V
bit due to an ITLB interruption in the middle of a series of ds/addc
instructions (from the divU milicode library) generated the wrong division
result and thus triggered a Linux kernel crash.
Link: https://lore.kernel.org/lkml/718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 931adff314 ("target/hppa: Update cpu_hppa_get/put_psw for hppa64")
Cc: qemu-stable@nongnu.org # v8.2+
(cherry picked from commit ead5078cf1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Freeform sections with titles are currently generating a TOC entry for
the first paragraph in the section after the header, which is not what
we want.
(Easiest to observe directly in the QMP reference manual's
"Introduction" section.)
When freeform sections are parsed, we create both a section header *and*
an empty, title-less section. This causes some problems with sphinx's
post-parse tree transforms, see also 2664f317 - this is a similar issue:
Sphinx doesn't like section-less titles and it also doesn't like
title-less sections.
Modify qapidoc.py to parse text directly into the preceding section
title as child nodes, eliminating the section duplication. This removes
the extra text from the TOC.
Only very, very lightly tested: "it looks right at a glance" ™️. I am
still in the process of rewriting qapidoc, so I didn't give it much
deeper thought.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-ID: <20240822204803.1649762-1-jsnow@redhat.com>
Commit 3e7ef738 plugged the use-after-free of the global nbd_server
object, but overlooked a use-after-free of nbd_server->listener.
Although this race is harder to hit, notice that our shutdown path
first drops the reference count of nbd_server->listener, then triggers
actions that can result in a pending client reaching the
nbd_blockdev_client_closed() callback, which in turn calls
qio_net_listener_set_client_func on a potentially stale object.
If we know we don't want any more clients to connect, and have already
told the listener socket to shut down, then we should not be trying to
update the listener socket's associated function.
Reproducer:
> #!/usr/bin/python3
>
> import os
> from threading import Thread
>
> def start_stop():
> while 1:
> os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-start",
+"arguments":{"addr":{"type":"unix","data":{"path":"/tmp/nbd-sock"}}}}\'')
> os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-stop"}\'')
>
> def nbd_list():
> while 1:
> os.system('/path/to/build/qemu-nbd -L -k /tmp/nbd-sock')
>
> def test():
> sst = Thread(target=start_stop)
> sst.start()
> nlt = Thread(target=nbd_list)
> nlt.start()
>
> sst.join()
> nlt.join()
>
> test()
Fixes: CVE-2024-7409
Fixes: 3e7ef738c8 ("nbd/server: CVE-2024-7409: Close stray clients at server-stop")
CC: qemu-stable@nongnu.org
Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240822143617.800419-2-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The qtests are broken since a while in the MSYS2 job in the gitlab-CI,
likely due to some changes in the MSYS2 environment. So far nobody has
neither a clue what's going wrong here, nor an idea how to fix this
(in fact most QEMU developers even don't have a Windows environment
available for properly analyzing this problem), so we should disable the
qtests here for the time being to get at least test coverage again
for the remaining tests that are run here.
Since we already get compile-test coverage for the system emulation
in the cross-win64-system job, and since the MSYS2 job is one of the
longest running jobs in our CI (it takes more than 1 hour to complete),
let's seize the opportunity and also cut the run time by disabling
the system emulation completely here, including the libraries that
are only useful for system emulation. In case somebody ever figures
out the failure of the qtests on MSYS2, we can revert this patch
to get everything back.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240820170142.55324-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In commit 412d294ffd we tried to improve the error message printed when
the machine type is unknown, but we used the wrong variable, resulting in:
$ ./build/x86/qemu-system-aarch64 -M bang
qemu-system-aarch64: unsupported machine type: "(null)"
Use -machine help to list supported machines
Use the right variable, so we produce more helpful output:
$ ./build/x86/qemu-system-aarch64 -M bang
qemu-system-aarch64: unsupported machine type: "bang"
Use -machine help to list supported machines
Note that we must move the qdict_del() to below the error_setg(),
because machine_type points into the value of that qdict entry,
and deleting it will make the pointer invalid.
Cc: qemu-stable@nongnu.org
Fixes: 412d294ffd ("vl.c: select_machine(): add selected machine type to error message")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fix for 9.1
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZsVYjgAKCRBAov/yOSY+
# 306ZA/9/DFdJB5WbVtv8ZNaRKT2jj6N9o5YlLbO1HsdMGpJbDWNJAIrOIdfBCYzF
# oEvjuYItBI9DXcSUE748ucBkct/x4WkBwfL5mxfTRXOhvx3iKFeC2ZKyKPtsciRO
# QE4UDmrFbQ9IrW33Vw0+CRMlN/U8xBO7lPDfbk2MA7fM74ns8A==
# =EbRt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 21 Aug 2024 01:01:34 PM AEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240821' of https://gitlab.com/gaosong/qemu:
hw/loongarch: Fix length for lowram in ACPI SRAT
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In multifd_recv_setup() we allocate (among other things)
* a MultiFDRecvData struct to multifd_recv_state::data
* a MultiFDRecvData struct to each multfd_recv_state->params[i].data
(Then during execution we might swap these pointers around.)
But in multifd_recv_cleanup() we free multifd_recv_state->data
in multifd_recv_cleanup_state() but we don't ever free the
multifd_recv_state->params[i].data. This results in a memory
leak reported by LeakSanitizer:
(cd build/asan && \
ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../" \
QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/file/mapped-ram )
[...]
Direct leak of 72 byte(s) in 3 object(s) allocated from:
#0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
#1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x561cc1e9c83c in multifd_recv_setup migration/multifd.c:1606:19
#3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9
#4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9
#5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5
#6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12
#7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9
#10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5
#11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11
#12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9
#13 0x561cc3796c1c in qemu_default_main system/main.c:37:14
#14 0x561cc3796c67 in main system/main.c:48:12
#15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
#1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x561cc1e9bed9 in multifd_recv_setup migration/multifd.c:1588:32
#3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9
#4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9
#5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5
#6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12
#7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9
#10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5
#11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11
#12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9
#13 0x561cc3796c1c in qemu_default_main system/main.c:37:14
#14 0x561cc3796c67 in main system/main.c:48:12
#15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
SUMMARY: AddressSanitizer: 96 byte(s) leaked in 4 allocation(s).
Free the params[i].data too.
Cc: qemu-stable@nongnu.org
Fixes: d117ed0699 ("migration/multifd: Allow receiving pages without packets")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
virtio: regression fixes
3 small patches to make sure we don't ship regressions.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbEdw8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp0dsIAKTzhmBR3IviFQVo223RgcDfthxoKejTB5tv
# EhGVUi4ddrViIIHsKFZ0pTHXnRcwHpPRokg6GrbqNhrAM6K7ptP8pkEK1DDkbGtq
# HaeceK55nNZ/wM1O5xHpRLVc2WtxmBrliDTFHGB2HjURO/kpjoHqWbE6Sn4GILc1
# EYU2T3Wn1UFgj+H4L7yF4SzmQSmyzq+7Tml6Z2GzpsatdwCoFQz2nA28piCnRMCq
# lusMo2YdE6js9JS/h+zMqgKValuCyuU7S7ZbSO2dvYQwt/hgk07BegBrdsAENNh6
# 0IWRHrojwAg+4U6ULzbrBG6/hW2A8Q5065D8Nf9Bjy4eAU7QSbU=
# =K6xx
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 20 Aug 2024 08:59:27 PM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
virtio-pci: Fix the use of an uninitialized irqfd
hw/audio/virtio-snd: fix invalid param check
vhost: Add VIRTIO_NET_F_RSC_EXT to vhost feature bits
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The crash was reported in MAC OS and NixOS, here is the link for this bug
https://gitlab.com/qemu-project/qemu/-/issues/2334https://gitlab.com/qemu-project/qemu/-/issues/2321
In this bug, they are using the virtio_input device. The guest notifier was
not supported for this device, The function virtio_pci_set_guest_notifiers()
was not called, and the vector_irqfd was not initialized.
So the fix is adding the check for vector_irqfd in virtio_pci_get_notifier()
The function virtio_pci_get_notifier() can be used in various devices.
It could also be called when VIRTIO_CONFIG_S_DRIVER_OK is not set. In this situation,
the vector_irqfd being NULL is acceptable. We can allow the device continue to boot
If the vector_irqfd still hasn't been initialized after VIRTIO_CONFIG_S_DRIVER_OK
is set, it means that the function set_guest_notifiers was not called before the
driver started. This indicates that the device is not using the notifier.
At this point, we will let the check fail.
This fix is verified in vyatta,MacOS,NixOS,fedora system.
The bt tree for this bug is:
Thread 6 "CPU 0/KVM" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7c817be006c0 (LWP 1269146)]
kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817
817 if (irqfd->users == 0) {
(gdb) thread apply all bt
...
Thread 6 (Thread 0x7c817be006c0 (LWP 1269146) "CPU 0/KVM"):
0 kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817
1 kvm_virtio_pci_vector_use_one () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:893
2 0x00005983657045e2 in memory_region_write_accessor () at ../qemu-9.0.0/system/memory.c:497
3 0x0000598365704ba6 in access_with_adjusted_size () at ../qemu-9.0.0/system/memory.c:573
4 0x0000598365705059 in memory_region_dispatch_write () at ../qemu-9.0.0/system/memory.c:1528
5 0x00005983659b8e1f in flatview_write_continue_step.isra.0 () at ../qemu-9.0.0/system/physmem.c:2713
6 0x000059836570ba7d in flatview_write_continue () at ../qemu-9.0.0/system/physmem.c:2743
7 flatview_write () at ../qemu-9.0.0/system/physmem.c:2774
8 0x000059836570bb76 in address_space_write () at ../qemu-9.0.0/system/physmem.c:2894
9 0x0000598365763afe in address_space_rw () at ../qemu-9.0.0/system/physmem.c:2904
10 kvm_cpu_exec () at ../qemu-9.0.0/accel/kvm/kvm-all.c:2917
11 0x000059836576656e in kvm_vcpu_thread_fn () at ../qemu-9.0.0/accel/kvm/kvm-accel-ops.c:50
12 0x0000598365926ca8 in qemu_thread_start () at ../qemu-9.0.0/util/qemu-thread-posix.c:541
13 0x00007c8185bcd1cf in ??? () at /usr/lib/libc.so.6
14 0x00007c8185c4e504 in clone () at /usr/lib/libc.so.6
Fixes: 2ce6cff94d ("virtio-pci: fix use of a released vector")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20240806093715.65105-1-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 9b6083465f ("virtio-snd: check for invalid param shift
operands") tries to prevent invalid parameters specified by the
guest. However, the code is not correct.
Change the code so that the parameters format and rate, which are
a bit numbers, are compared with the bit size of the data type.
Fixes: 9b6083465f ("virtio-snd: check for invalid param shift operands")
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240802071805.7123-1-vr_qemu@t-online.de>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VIRTIO_NET_F_RSC_EXT is implemented in the rx data path, which vhost
implements, so vhost needs to support the feature if it is ever to be
enabled with vhost. The feature must be disabled otherwise.
Fixes: 2974e916df ("virtio-net: support RSC v4/v6 tcp traffic for Windows HCK")
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240802-rsc-v1-1-2b607bd2f555@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yutaro Shimizu from the Cyber Defense Institute discovered a bug in the
NVMe emulation that leaks contents of an uninitialized heap buffer if
subsystem and FDP emulation are enabled.
Cc: qemu-stable@nongnu.org
Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Various fixes
- Null pointer dereference in IPI IOCSR (Jiaxun)
- Correct '-smbios type=4' in man page (Heinrich)
- Use correct MMU index in MIPS get_pte (Phil)
- Reset MPQEMU remote message using device_cold_reset (Peter)
- Update linux-user MIPS CPU list (Phil)
- Do not let exec_command read console if no pattern to wait for (Nick)
- Remove shadowed declaration warning (Pierrick)
- Restrict STQF opcode to SPARC V9 (Richard)
- Add missing Kconfig dependency for POWERNV ISA serial port (Bernhard)
- Do not allow vmport device without i8042 PS/2 controller (Kamil)
- Fix QCryptoTLSCredsPSK leak (Peter)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmbDzAsACgkQ4+MsLN6t
# wN7SvBAAwM0Frtg4ZKDZQu8XgMjLq1xVoSWjC3YJZKTpyGap5gO+7StvHg0sf9iB
# YyGqocCO+qdj9a7pTSasfGDyufpwoIZkOqkwGUWKBos76cOcHWt4e/gkl9O65Lf1
# VVKX4/xdY+a5w2eVAAdWWrYdaPWkKLm0ZZXKoeSIvN4R9A41j7J4kANhE2SweczF
# NnTt2gBnSlpRzghlVWPJKhnq+aYbvLeR7ApdNGUJDpSI1ZTh9gH1GtZFwBN7aeDo
# PvDucoui0EmuyHTVdOYOH3zihTfzKlNZECcT3Y6/6i8y5p7jLHyINHHexsKw6T56
# i5RidJMPTfM0EO6LU1GvUN5FzZy24zXOf298Fe/GMYczQsOznQd4+aFHYPb3d4hZ
# 8Vc1wB1s8XF5WGj+7bchBAUdynUnbwUqfMOb2pMXLIm21pSDnOTVgmYMnp1Kt4AA
# 9WbHiS6tUJf/HjQsep8BBNGUiVSsUPDNNhL8QN43u2C0NgNRPgtRuIV+ytgVXS1G
# 2t1QiRX0lX4ACHmw88agUCU3OhorumuDOpoitQK5jn2VutT7TqbGgibkQMFSgn9E
# Xwrmtlf7nYU9MVgXYJjH2bBh7wbOmQCqbHniEj0targkxccAMJoswG4vtKsP9zkd
# tBs6qMiZ8qSj5eoq8JBRF8bF4tONmboPZjRlboACJ0kTD5wCElA=
# =lPMG
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 20 Aug 2024 08:49:47 AM AEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240820' of https://github.com/philmd/qemu:
crypto/tlscredspsk: Free username on finalize
hw/i386/pc: Ensure vmport prerequisites are fulfilled
hw/i386/pc: Unify vmport=auto handling
hw/ppc/Kconfig: Add missing SERIAL_ISA dependency to POWERNV machine
target/sparc: Restrict STQF to sparcv9
contrib/plugins/execlog: Fix shadowed declaration warning
tests/avocado: Mark ppc_hv_tests.py as non-flaky after fixed console interaction
tests/avocado: exec_command should not consume console output
linux-user/mips: Select Loongson CPU for Loongson binaries
linux-user/mips: Select MIPS64R2-generic for Rel2 binaries
linux-user/mips: Select Octeon68XX CPU for Octeon binaries
linux-user/mips: Do not try to use removed R5900 CPU
hw/remote/message.c: Don't directly invoke DeviceClass:reset
hw/dma/xilinx_axidma: Use semicolon at end of statement, not comma
target/mips: Load PTE as DATA
target/mips: Use correct MMU index in get_pte()
target/mips: Pass page table entry size as MemOp to get_pte()
qemu-options.hx: correct formatting -smbios type=4
hw/mips/loongson3_virt: Fix condition of IPI IOCSR connection
hw/mips/loongson3_virt: Store core_iocsr into LoongsonMachineState
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When the creds->username property is set we allocate memory
for it in qcrypto_tls_creds_psk_prop_set_username(), but
we never free this when the QCryptoTLSCredsPSK is destroyed.
Free the memory in finalize.
This fixes a LeakSanitizer complaint in migration-test:
$ (cd build/asan; ASAN_OPTIONS="fast_unwind_on_malloc=0" QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k -p /x86_64/migration/precopy/unix/tls/psk)
=================================================================
==3867512==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x5624e5c99dee in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218edee) (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
#1 0x7fb199ae9738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fb199afe583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x5624e82ea919 in qcrypto_tls_creds_psk_prop_set_username /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../crypto/tlscredspsk.c:255:23
#4 0x5624e812c6b5 in property_set_str /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:2277:5
#5 0x5624e8125ce5 in object_property_set /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:1463:5
#6 0x5624e8136e7c in object_set_properties_from_qdict /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:55:14
#7 0x5624e81372d2 in user_creatable_add_type /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:112:5
#8 0x5624e8137964 in user_creatable_add_qapi /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:157:11
#9 0x5624e891ba3c in qmp_object_add /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/qom-qmp-cmds.c:227:5
#10 0x5624e8af9118 in qmp_marshal_object_add /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
#11 0x5624e8bd1d49 in do_qmp_dispatch_bh /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qapi/qmp-dispatch.c:128:5
#12 0x5624e8cb2531 in aio_bh_call /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:171:5
#13 0x5624e8cb340c in aio_bh_poll /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:218:13
#14 0x5624e8c0be98 in aio_dispatch /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/aio-posix.c:423:5
#15 0x5624e8cba3ce in aio_ctx_dispatch /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:360:5
#16 0x7fb199ae0d3a in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#17 0x7fb199ae0d3a in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#18 0x5624e8cbe1d9 in glib_pollfds_poll /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:287:9
#19 0x5624e8cbcb13 in os_host_main_loop_wait /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:310:5
#20 0x5624e8cbc6dc in main_loop_wait /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:589:11
#21 0x5624e6f3f917 in qemu_main_loop /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/runstate.c:801:9
#22 0x5624e893379c in qemu_default_main /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:37:14
#23 0x5624e89337e7 in main /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:48:12
#24 0x7fb197972d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#25 0x7fb197972e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#26 0x5624e5c16fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240819145021.38524-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Found on debian stable.
../contrib/plugins/execlog.c: In function ‘vcpu_tb_trans’:
../contrib/plugins/execlog.c:236:22: error: declaration of ‘n’ shadows a previous local [-Werror=shadow=local]
236 | for (int n = 0; n < all_reg_names->len; n++) {
| ^
../contrib/plugins/execlog.c:184:12: note: shadowed declaration is here
184 | size_t n = qemu_plugin_tb_n_insns(tb);
|
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240814233645.944327-2-pierrick.bouvier@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Now that exec_command doesn't incorrectly consume console output,
and guest time is set correctly, ppc_hv_tests.py is working more
reliably. Try marking it non-flaky.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20240805232814.267843-3-npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
_console_interaction reads data from the console even when there is only
an input string to send, and no output data to wait on. This can cause
lines to be missed by wait_for_console_pattern calls that follows an
exec_command. Fix this by not reading the console if there is no pattern
to wait for.
This solves occasional hangs in ppc_hv_tests.py, usually when run on KVM
hosts that are fast enough to output important lines quickly enough to be
consumed by exec_command, so they get missed by subsequent wait for
pattern calls.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240805232814.267843-2-npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Directly invoking the DeviceClass::reset method is a bad idea,
because if the device is using three-phase reset then it relies on
transitional reset machinery which is likely to disappear at some
point.
Reset the device in the standard way, by calling device_cold_reset().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240813165250.2717650-7-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In axidma_class_init() we accidentally used a comma at the end of
a statement rather than a semicolon. This has no ill effects, but
it's obviously not intended and it means that Coccinelle scripts
for instance will fail to match on the two statements. Use a
semicolon instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240813165250.2717650-6-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When refactoring page_table_walk_refill() in commit 4e999bf419
we missed the indirect call to cpu_mmu_index() in get_pte():
page_table_walk_refill()
-> get_pte()
-> cpu_ld[lq]_code()
-> cpu_mmu_index()
Since we don't mask anymore the modes in hflags, cpu_mmu_index()
can return UM or SM, while we only expect KM or ERL.
Fix by propagating ptw_mmu_idx to get_pte(), and use the
cpu_ld/st_code_mmu() API with the correct MemOpIdx.
Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reported-by: Waldemar Brodkorb <wbx@uclibc-ng.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2470
Fixes: 4e999bf419 ("target/mips: Pass ptw_mmu_idx down from mips_cpu_tlb_fill")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240814090452.2591-3-philmd@linaro.org>
RISC-V PR for 9.1
This reverts a commit adding `#msi-cells=<0>` to the virt machine
as that commit results in PCI devices unable to us MSIs. Even though
it's a kernel bug, we don't want to break existing users.
* Revert adding #msi-cells to virt machine
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmbCzDEACgkQr3yVEwxT
# gBP2Jw/+Phcb9tw8vv3kHyjXaH5JuqMvRvE0DZi3Zub9cdwIygXEC8/o0q4Szh+4
# FGZbxSsQ6XdfOW87qY66kTlM8yxVJf2RoQcQ27QTs0kCM3TR/1nzRbc2wWPMYRmH
# FvOL926Nr+ysxtVd84HZc82GwQpEIG1qdWpy5VECMZXW8mtOTQjgltKuiH9Jl+ZX
# N0uqWc4/lp+x+UIZqS9b76AiZ8l1G5nRFdXgmKKU7J8iVeWLRRzV1NRu+cZP4WEv
# kjpMODdedScEcvqb122SVTTJcpdvhuB+bWH6mITajbt2G4YxsNYJ9594nef/sKBH
# hf3oSfXUnwDqTldnrkFonO9OhdO3ZCdtqw5Lzi1E/D2zny2CnMMIAcs8hbenVGkW
# NW0J/z84J+X1qf5gmt07l2BlUhBooCS8TJsbO8PX/lR2iCL/BxuKHEjxCnCZ6f5z
# 3FxhqO3Shk9FnfAsTxtY00RLmRo4t+ESTsBsZPiSXB3EmCo/BmgR/0Grm7UKZbbL
# /9lzUHyUYj09Mvk7IJc4KGjihfQ9TwjNdlmq2MlRHWdVT09+Bu7DRhHvNzuVYMb9
# 1iktWv4Fnit6Xe6rPOvNXF5ilmUu2fm3p6z2ogG8cRbPHPPQ7NLx8BQSqPvBHdfx
# KIV6f1xBJSSQcTdIq/ySnN1SF1h2YVPLIlv1Aap3kN/J71kkpLY=
# =C6id
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 19 Aug 2024 02:38:09 PM AEST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240819-1' of https://github.com/alistair23/qemu:
Revert "hw/riscv/virt.c: imsics DT: add '#msi-cells'"
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Some fixes for 9.1-rc3 (build, replay, docs, plugins)
- re-enable gdbsim-r5f562n8 test
- ensure updates to python deps re-trigger configure
- tweak configure detection of GDB MTE support
- make checkpatch emit more warnings on updating headers
- allow i386 access_ptr to force slow path for plugins
- fixe some replay regressions
- update the replay-dump tool
- better handle muxed chardev during replay
- clean up TCG plugins docs to mention scoreboards
- fix plugin scoreboard race condition
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAma/UJcACgkQ+9DbCVqe
# KkT51gf/buOo0leJnBkYDTPWOOsDupW/nUUqOlTStvpKGEVNZgmxH0V4ffdCNO8E
# P4xQpD8WrpFKZHu2zE7EmXJ6/wkSp2BeSPcZ8lhld8jKNY3ksBlsCwb26/D9WsWK
# /JaqAegdg3fwCgbcQ057dRlKJV2ojjWD/JqPWa5G9AIlSqiHEfvcTj9t33BpJKXC
# xV7Yt1TZExkfkCAny54Sx4O6oiDhvSgJmWCUGIVE2W39+g3jUKf2tvbggR5MEIH3
# fJ/F2vmcnllmK21awiRa9/WVZ55+Cbgj6PlLf/Qh6rhzooTMy+x0G+5BkNtZwNCs
# 8qFu8vFkuJM9YwDw9btaz3b+nG8Mzg==
# =HUN1
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 16 Aug 2024 11:13:59 PM AEST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-9.1-rc3-160824-1' of https://gitlab.com/stsquad/qemu: (21 commits)
plugins: fix race condition with scoreboards
docs/devel: update tcg-plugins page
docs: Fix some typos (found by typos) and grammar issues
savevm: Fix load_snapshot error path crash
virtio-net: Use virtual time for RSC timers
virtio-net: Use replay_schedule_bh_event for bhs that affect machine state
chardev: set record/replay on the base device of a muxed device
tests/avocado: replay_kernel.py add x86-64 q35 machine test
Revert "replay: stop us hanging in rr_wait_io_event"
replay: allow runstate shutdown->running when replaying trace
tests/avocado: excercise scripts/replay-dump.py in replay tests
scripts/replay-dump.py: rejig decoders in event number order
scripts/replay-dump.py: Update to current rr record format
buildsys: Fix building without plugins on Darwin
target/i386: allow access_ptr to force slow path on failed probe
scripts/checkpatch: more checks on files imported from Linux
configure: Fix GDB version detection for GDB_HAS_MTE
configure: Avoid use of param. expansion when using gdb_version
configure: Fix arch detection for GDB_HAS_MTE
Makefile: trigger re-configure on updated pythondeps
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
A deadlock can be created if a new vcpu (a) triggers a scoreboard
reallocation, and another vcpu (b) wants to create a new scoreboard at
the same time.
In this case, (a) holds the plugin lock, and starts an exclusive
section, waiting for (b). But at the same time, (b) is waiting for
plugin lock.
The solution is to drop the lock before entering the exclusive section.
This bug can be easily reproduced by creating a callback for any tb
exec, that allocates a new scoreboard. In this case, as soon as we reach
more than 16 vcpus, the deadlock occurs.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2344
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240812220748.95167-2-pierrick.bouvier@linaro.org>
[AJB: tweak var position to meet coding style]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240813202329.1237572-22-alex.bennee@linaro.org>
chardev events to a muxed device don't get recorded because e.g.,
qemu_chr_be_write() checks whether the base device has the record flag
set.
This can be seen when replaying a trace that has characters typed into
the console, an examination of the log shows they are not recorded.
Setting QEMU_CHAR_FEATURE_REPLAY on the base chardev fixes the problem.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-8-npiggin@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-16-alex.bennee@linaro.org>
This reverts commit 1f881ea4a4.
That commit causes reverse_debugging.py test failures, and does
not seem to solve the root cause of the problem x86-64 still
hangs in record/replay tests.
The problem with short-cutting the iowait that was taken during
record phase is that related events will not get consumed at the
same points (e.g., reading the clock).
A hang with zero icount always seems to be a symptom of an earlier
problem that has caused the recording to become out of synch with
the execution and consumption of events by replay.
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-6-npiggin@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-14-alex.bennee@linaro.org>
When replaying a trace, it is possible to go from shutdown to running
with a reverse-debugging step. This can be useful if the problem being
debugged triggers a reset or shutdown.
This can be tested by making a recording of a machine that shuts down,
then using -action shutdown=pause when replaying it. Continuing to the
end of the trace then reverse-stepping in gdb crashes due to invalid
runstate transition.
Just permitting the transition seems to be all that's necessary for
reverse-debugging to work well in such a state.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-5-npiggin@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-13-alex.bennee@linaro.org>
Since commit 0082475e26 the plugin symbol list is unconditionally
added to the linker flags, leading to a build failure:
Undefined symbols for architecture arm64:
"_qemu_plugin_entry_code", referenced from:
<initial-undefines>
...
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
Fix by restricting the whole meson file to the --enable-plugins
configure argument.
Fixes: 0082475e26 ("meson: merge plugin_ldflags into emulator_link_args")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2476
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240813112457.92560-1-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-9-alex.bennee@linaro.org>
When we are using TCG plugin memory callbacks probe_access_internal
will return TLB_MMIO to force the slow path for memory access. This
results in probe_access returning NULL but the x86 access_ptr function
happily accepts an empty haddr resulting in segfault hilarity.
Check for an empty haddr to prevent the segfault and enable plugins to
track all the memory operations for the x86 save/restore helpers. As
we also want to run the slow path when instrumenting *-user we should
also not have the short cutting test_ptr macro.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2489
Fixes: 6d03226b42 (plugins: force slow path when plugins instrument memory ops)
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-8-alex.bennee@linaro.org>
The gtk-vnc package is used by the vnc-display-test qtest
program. Technically only gvnc is needed, but since we
already pull in the gtk3 dep, it is harmless to depend
on gtk-vnc.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240718094159.902024-2-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since quite a while MSYS2 now supports Clang as a compiler, too.
Unfortunately, this compiler is lacking the __attribute__((gcc_struct))
that we need for compiling on Windows. But since the compiler is
available now, some people started to use it to compile QEMU on MSYS2,
apparently ignoring the compiler warnings (see for example the ticket at
https://gitlab.com/qemu-project/qemu/-/issues/2476 ). These builds are
likely broken in a couple of spots, so let's make sure that we rather
bail out early in the configuration phase instead of allowing the build
to succeed with warnings.
Message-ID: <20240815122719.727639-1-thuth@redhat.com>
Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* fix --static compilation of hexagon
* fix incorrect application of REX to MMX operands
* fix crash on module load
* update Italian translation
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAma7kZ4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOy7QgAriuxfgw3Yvu9UPPfEZT5V9p5XfDf
# LceO3C6OABIkFoGSO8WK5dWfQy3oYbrwEXX/l/PW1lUc2DFrSUo9YtIfjelRkxoC
# 0EAAbV5A+xCLYmujFqBSe/6usRj82uKjSET1KK1aCam7ONZLNZf2yb4OwdShvLSN
# MPgtBOrwznR1qh3KJtLB6YSRC0Rie1hOxbXFpx1AklXYnIiqUdMjXOHSjs+Amva0
# VczuqwjtVdNDTPqbZlCXatPtZ8nwYeEOD2jOqgjAoEwwabZ1fFGDCNXlqEDLSdTm
# Cc+IZPYU5a8+tVfH0DYEMgMSkRhDUqVZ/076L+pRi+Q8ClxWV8fKsf5qKw==
# =jJtu
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 14 Aug 2024 03:02:22 AM AEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
po: update Italian translation
module: Prevent crash by resetting local_err in module_load_qom_all()
target/i386: Assert MMX and XMM registers in range
target/i386: Use unit not type in decode_modrm
target/i386: Do not apply REX to MMX operands
target/hexagon: don't look for static glib
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Our current usage of MMU indexes when EL3 is AArch32 is confused.
Architecturally, when EL3 is AArch32, all Secure code runs under the
Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the
other privileged modes (PL1)
* code at EL0 (Secure PL0)
This is different from when EL3 is AArch64, in which case EL3 is its
own translation regime, and EL1 and EL0 (whether AArch32 or AArch64)
have their own regime.
We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't
do anything special about Secure PL0, which meant it used the same
ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug
where arm_sctlr() incorrectly picked the NonSecure SCTLR as the
controlling register when in Secure PL0, which meant we were
spuriously generating alignment faults because we were looking at the
wrong SCTLR control bits.
The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that
we wouldn't honour the PAN bit for Secure PL1, because there's no
equivalent _PAN mmu index for it.
We could fix this in one of two ways:
* The most straightforward is to add new MMU indexes EL30_0,
EL30_3, EL30_3_PAN to correspond to "Secure PL1&0 at PL0",
"Secure PL1&0 at PL1", and "Secure PL1&0 at PL1 with PAN".
This matches how we use indexes for the AArch64 regimes, and
preserves propirties like being able to determine the privilege
level from an MMU index without any other information. However
it would add two MMU indexes (we can share one with ARMMMUIdx_EL3),
and we are already using 14 of the 16 the core TLB code permits.
* The more complicated approach is the one we take here. We use
the same MMU indexes (E10_0, E10_1, E10_1_PAN) for Secure PL1&0
than we do for NonSecure PL1&0. This saves on MMU indexes, but
means we need to check in some places whether we're in the
Secure PL1&0 regime or not before we interpret an MMU index.
The changes in this commit were created by auditing all the places
where we use specific ARMMMUIdx_ values, and checking whether they
needed to be changed to handle the new index value usage.
Note for potential stable backports: taking also the previous
(comment-change-only) commit might make the backport easier.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-3-peter.maydell@linaro.org
We have a long comment describing the Arm architectural translation
regimes and how we map them to QEMU MMU indexes. This comment has
got a bit out of date:
* FEAT_SEL2 allows Secure EL2 and corresponding new regimes
* FEAT_RME introduces Realm state and its translation regimes
* We now model the Cortex-R52 so that is no longer a hypothetical
* We separated Secure Stage 2 and NonSecure Stage 2 MMU indexes
* We have an MMU index per physical address spacea
Add the missing pieces so that the list of architectural translation
regimes matches the Arm ARM, and the list and count of QEMU MMU
indexes in the comment matches the enum.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-2-peter.maydell@linaro.org
This commit adds validation checks for the MCOPRE and MCOSEL values in
the rcc_update_cfgr_register function. If the MCOPRE value exceeds
0b100 or the MCOSEL value exceeds 0b111, an error is logged and the
corresponding clock mux is disabled. This helps in identifying and
handling invalid configurations in the RCC registers.
Reproducer:
cat << EOF | qemu-system-aarch64 -display \
none -machine accel=qtest, -m 512M -machine b-l475e-iot01a -qtest \
stdio
writeq 0x40021008 0xffffffff
EOF
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2356
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When cross compiling QEMU configured with --static, I've been getting
configure errors like the following:
Build-time dependency glib-2.0 found: NO
../target/hexagon/meson.build:303:15: ERROR: Dependency lookup for glib-2.0 with method 'pkgconfig' failed: Could not generate libs for glib-2.0:
Package libpcre2-8 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libpcre2-8.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libpcre2-8', required by 'glib-2.0', not found
This happens because --static sets the prefer_static Meson option, but
my build machine doesn't have a static libpcre2. I don't think it
makes sense to insist that native dependencies are static, just
because I want the non-native QEMU binaries to be static.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Link: https://lore.kernel.org/r/20240805104921.4035256-1-hi@alyssa.is
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fix BTI versus CF_PCREL
* include: Fix typo in name of MAKE_IDENTFIER macro
* docs: Various txt-to-rST conversions
* hw/core/ptimer: fix timer zero period condition for freq > 1GHz
* arm/virt: place power button pin number on a define
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAma5+4wZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3pX3D/9UVutdg5TsB9N8y5mPaVSn
# Yx0awBgxK5SHWeVgQJBkSdqh6LiGhhukR3VHfNanDELq24s0uLqLW86thgj+iB0H
# 51rnVHJtWtT9mIt0Qq9BlXX8+j0th6hELy/z+/aYdrWI1pmKsGYgF1gRh1vXrg+I
# 0s/S7kZY5CNDBbTXoBNtJfbZRe8fzyy5gUqc/tnw6Qonp8XM1OeG6sg/qF0KwzbB
# 8R7IvnY7gaBWm3daXqrFoxYuR+9i6F8uaFflOm+CarKQc9foH6KEzmfLAYLfGkFZ
# 2ZVHg3uC4k4OicyrpYcWsgumNTzOj8RTI4kV7M8NAj5TXCr+0pO6lnhlAKVGTWiL
# nJrW62dN56w8NVOzcy0tB0xqTHnKIxioGZyU4RDVKHjD/Fy0x7LX7KVmaBEZgyxJ
# oA4zY4KOrCNFsXQlqZgx38v/1hshnIYFN7V5AmfGEfbbKpBznKBQKmuyJ9VwSfGT
# jLwlwU4VMJPsj2Rs70seEl6obgyZicAXIAbqPgtMsvt3H2kKI2jtsNPFka3WaY62
# 0jOEbbFrsKV1//ZExBZdFhqBH/CoiZMvM4jsq1Y/oxAxIWtGv5dmJJsAA3w33YE4
# kNWXfHKAAhydZKeQloMgeOdLliP5UiCfF1FltwAWkLo59GV3TkjwagDU8+pWs9OF
# plOKWaKDUzkHq6G197uaBA==
# =ftoZ
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 12 Aug 2024 10:09:48 PM AEST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20240812' of https://git.linaro.org/people/pmaydell/qemu-arm:
arm/virt: place power button pin number on a define
hw/core/ptimer: fix timer zero period condition for freq > 1GHz
docs: Typo fix in live disk backup
docs/interop/prl-xml.rst: Fix minor grammar nits
docs/interop/prl-xml.txt: Convert to rST
docs/interop/parallels.txt: Convert to rST
docs/interop/nbd.txt: Convert to rST
docs/specs/rocker.txt: Convert to rST
include: Fix typo in name of MAKE_IDENTFIER macro
target/arm: Fix BTI versus CF_PCREL
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAma5uNkACgkQ7wSWWzmN
# YhFpLwf+J9+cBWKUze7FZkxNHU78GJ/b+oVQfLYPnrCRrVKoyTr9yiKfMDS8qf5/
# tPd+xFABwcHb8UL3EeAe9w5aB0QCqqdmZMFRkWuaZ7HEbZkYNt9cJck5iMdNaPBm
# cKiFRLb8FDVA3aegCcsBqnwCxgFW+3P3rrnHQz1C+GQAOm7FER+HiFnYucjrrLSM
# SaXZYIH/LPqL01gbZcbixQkhgL5XFWUToFXQEYECGS07uZZ1WSJkxIP6WZDchJ4+
# vYO8/fWXVdrjvDirraZQRYnurWQGpTUk0Ocn2R8MaJsF8TK031MrMRJ3YP9zXp4n
# wMe0BZO/YG5oi2gFrJpYL2AZqh2MgQ==
# =DhS+
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 12 Aug 2024 05:25:13 PM AEST
# gpg: using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
net: Fix '-net nic,model=' for non-help arguments
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add in the missing space in the section header.
Fixes: 1084159b31 ("qapi: deprecate drive-backup", v6.2.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Convert the rocker.txt specification document to rST format. We make
extensive use of the :: marker to introduce a literal block for all
the tables and ASCII art, rather than trying to convert the tables to
rST table syntax. This produces a valid rST document without needing
a huge diff.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240801170131.3977807-2-peter.maydell@linaro.org
In commit bb71846325 we added some macro magic to avoid
variable-shadowing when using some of our more complicated
macros. One of the internal components of this is a macro
named MAKE_IDENTFIER. Fix the typo in its name: it should
be MAKE_IDENTIFIER.
Commit created with
sed -i -e 's/MAKE_IDENTFIER/MAKE_IDENTIFIER/g' include/qemu/*.h include/qapi/qmp/qobject.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240801102516.3843780-1-peter.maydell@linaro.org
With pcrel, we cannot check the guarded page bit at translation
time, as different mappings of the same physical page may or may
not have the GP bit set.
Instead, add a couple of helpers to check the page at runtime,
after all other filters that might obviate the need for the check.
The set_btype_for_br call must be moved after the gen_a64_set_pc
call to ensure the current pc can still be computed.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240802003028.795476-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A malicious client can attempt to connect to an NBD server, and then
intentionally delay progress in the handshake, including if it does
not know the TLS secrets. Although the previous two patches reduce
this behavior by capping the default max-connections parameter and
killing slow clients, they did not eliminate the possibility of a
client waiting to close the socket until after the QMP nbd-server-stop
command is executed, at which point qemu would SEGV when trying to
dereference the NULL nbd_server global which is no longer present.
This amounts to a denial of service attack. Worse, if another NBD
server is started before the malicious client disconnects, I cannot
rule out additional adverse effects when the old client interferes
with the connection count of the new server (although the most likely
is a crash due to an assertion failure when checking
nbd_server->connections > 0).
For environments without this patch, the CVE can be mitigated by
ensuring (such as via a firewall) that only trusted clients can
connect to an NBD server. Note that using frameworks like libvirt
that ensure that TLS is used and that nbd-server-stop is not executed
while any trusted clients are still connected will only help if there
is also no possibility for an untrusted client to open a connection
but then stall on the NBD handshake.
Given the previous patches, it would be possible to guarantee that no
clients remain connected by having nbd-server-stop sleep for longer
than the default handshake deadline before finally freeing the global
nbd_server object, but that could make QMP non-responsive for a long
time. So intead, this patch fixes the problem by tracking all client
sockets opened while the server is running, and forcefully closing any
such sockets remaining without a completed handshake at the time of
nbd-server-stop, then waiting until the coroutines servicing those
sockets notice the state change. nbd-server-stop now has a second
AIO_WAIT_WHILE_UNLOCKED (the first is indirectly through the
blk_exp_close_all_type() that disconnects all clients that completed
handshakes), but forced socket shutdown is enough to progress the
coroutines and quickly tear down all clients before the server is
freed, thus finally fixing the CVE.
This patch relies heavily on the fact that nbd/server.c guarantees
that it only calls nbd_blockdev_client_closed() from the main loop
(see the assertion in nbd_client_put() and the hoops used in
nbd_client_put_nonzero() to achieve that); if we did not have that
guarantee, we would also need a mutex protecting our accesses of the
list of connections to survive re-entrancy from independent iothreads.
Although I did not actually try to test old builds, it looks like this
problem has existed since at least commit 862172f45c (v2.12.0, 2017) -
even back when that patch started using a QIONetListener to handle
listening on multiple sockets, nbd_server_free() was already unaware
that the nbd_blockdev_client_closed callback can be reached later by a
client thread that has not completed handshakes (and therefore the
client's socket never got added to the list closed in
nbd_export_close_all), despite that patch intentionally tearing down
the QIONetListener to prevent new clients.
Reported-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Fixes: CVE-2024-7409
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-14-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
A client that opens a socket but does not negotiate is merely hogging
qemu's resources (an open fd and a small amount of memory); and a
malicious client that can access the port where NBD is listening can
attempt a denial of service attack by intentionally opening and
abandoning lots of unfinished connections. The previous patch put a
default bound on the number of such ongoing connections, but once that
limit is hit, no more clients can connect (including legitimate ones).
The solution is to insist that clients complete handshake within a
reasonable time limit, defaulting to 10 seconds. A client that has
not successfully completed NBD_OPT_GO by then (including the case of
where the client didn't know TLS credentials to even reach the point
of NBD_OPT_GO) is wasting our time and does not deserve to stay
connected. Later patches will allow fine-tuning the limit away from
the default value (including disabling it for doing integration
testing of the handshake process itself).
Note that this patch in isolation actually makes it more likely to see
qemu SEGV after nbd-server-stop, as any client socket still connected
when the server shuts down will now be closed after 10 seconds rather
than at the client's whims. That will be addressed in the next patch.
For a demo of this patch in action:
$ qemu-nbd -f raw -r -t -e 10 file &
$ nbdsh --opt-mode -c '
H = list()
for i in range(20):
print(i)
H.insert(i, nbd.NBD())
H[i].set_opt_mode(True)
H[i].connect_uri("nbd://localhost")
'
$ kill $!
where later connections get to start progressing once earlier ones are
forcefully dropped for taking too long, rather than hanging.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-13-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[eblake: rebase to changes earlier in series, reduce scope of timer]
Signed-off-by: Eric Blake <eblake@redhat.com>
Allowing an unlimited number of clients to any web service is a recipe
for a rudimentary denial of service attack: the client merely needs to
open lots of sockets without closing them, until qemu no longer has
any more fds available to allocate.
For qemu-nbd, we default to allowing only 1 connection unless more are
explicitly asked for (-e or --shared); this was historically picked as
a nice default (without an explicit -t, a non-persistent qemu-nbd goes
away after a client disconnects, without needing any additional
follow-up commands), and we are not going to change that interface now
(besides, someday we want to point people towards qemu-storage-daemon
instead of qemu-nbd).
But for qemu proper, and the newer qemu-storage-daemon, the QMP
nbd-server-start command has historically had a default of unlimited
number of connections, in part because unlike qemu-nbd it is
inherently persistent until nbd-server-stop. Allowing multiple client
sockets is particularly useful for clients that can take advantage of
MULTI_CONN (creating parallel sockets to increase throughput),
although known clients that do so (such as libnbd's nbdcopy) typically
use only 8 or 16 connections (the benefits of scaling diminish once
more sockets are competing for kernel attention). Picking a number
large enough for typical use cases, but not unlimited, makes it
slightly harder for a malicious client to perform a denial of service
merely by opening lots of connections withot progressing through the
handshake.
This change does not eliminate CVE-2024-7409 on its own, but reduces
the chance for fd exhaustion or unlimited memory usage as an attack
surface. On the other hand, by itself, it makes it more obvious that
with a finite limit, we have the problem of an unauthenticated client
holding 100 fds opened as a way to block out a legitimate client from
being able to connect; thus, later patches will further add timeouts
to reject clients that are not making progress.
This is an INTENTIONAL change in behavior, and will break any client
of nbd-server-start that was not passing an explicit max-connections
parameter, yet expects more than 100 simultaneous connections. We are
not aware of any such client (as stated above, most clients aware of
MULTI_CONN get by just fine on 8 or 16 connections, and probably cope
with later connections failing by relying on the earlier connections;
libvirt has not yet been passing max-connections, but generally
creates NBD servers with the intent for a single client for the sake
of live storage migration; meanwhile, the KubeSAN project anticipates
a large cluster sharing multiple clients [up to 8 per node, and up to
100 nodes in a cluster], but it currently uses qemu-nbd with an
explicit --shared=0 rather than qemu-storage-daemon with
nbd-server-start).
We considered using a deprecation period (declare that omitting
max-parameters is deprecated, and make it mandatory in 3 releases -
then we don't need to pick an arbitrary default); that has zero risk
of breaking any apps that accidentally depended on more than 100
connections, and where such breakage might not be noticed under unit
testing but only under the larger loads of production usage. But it
does not close the denial-of-service hole until far into the future,
and requires all apps to change to add the parameter even if 100 was
good enough. It also has a drawback that any app (like libvirt) that
is accidentally relying on an unlimited default should seriously
consider their own CVE now, at which point they are going to change to
pass explicit max-connections sooner than waiting for 3 qemu releases.
Finally, if our changed default breaks an app, that app can always
pass in an explicit max-parameters with a larger value.
It is also intentional that the HMP interface to nbd-server-start is
not changed to expose max-connections (any client needing to fine-tune
things should be using QMP).
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-12-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[ericb: Expand commit message to summarize Dan's argument for why we
break corner-case back-compat behavior without a deprecation period]
Signed-off-by: Eric Blake <eblake@redhat.com>
Upcoming patches to fix a CVE need to track an opaque pointer passed
in by the owner of a client object, as well as request for a time
limit on how fast negotiation must complete. Prepare for that by
changing the signature of nbd_client_new() and adding an accessor to
get at the opaque pointer, although for now the two servers
(qemu-nbd.c and blockdev-nbd.c) do not change behavior even though
they pass in a new default timeout value.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-11-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[eblake: s/LIMIT/MAX_SECS/ as suggested by Dan]
Signed-off-by: Eric Blake <eblake@redhat.com>
Define a hexagon_cpu_properties list to match the idiom used
by other targets.
Signed-off-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Add my git tree for hexagon. Note that the branch is "hex-next" and not
"hex.next" as had been used previously. But I'll keep the "hex.next" branch
in sync with "hex-next" until this commit lands to avoid confusion.
Signed-off-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The implementation for these instructions handles -0 as an invalid float
point value, whereas the Hexagon hardware considers it the same as +0
(which is valid). Let's fix that and add a regression test.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
Apparently 'qemu-img info' doesn't report the backing file format field
for qed (as it does for qcow2):
$ qemu-img create -f qed base.qed 1M && qemu-img create -f qed -b base.qed -F qed top.qed 1M
$ qemu-img create -f qcow2 base.qcow2 1M && qemu-img create -f qcow2 -b base.qcow2 -F qcow2 top.qcow2 1M
$ qemu-img info top.qed | grep 'backing file format'
$ qemu-img info top.qcow2 | grep 'backing file format'
backing file format: qcow2
This leads to the 024 test failure with -qed. Let's just filter the
field out and exclude it from the output.
This is a fixup for the commit f93e65ee51 ("iotests/{024, 271}: add
testcases for qemu-img rebase").
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20240730094701.790624-1-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When reading with `read_cluster` we get the `mapping` with
`find_mapping_for_cluster` and then we call `open_file` for this
mapping.
The issue appear when its the same file, but a second cluster that is
not immediately after it, imagine clusters `500 -> 503`, this will give
us 2 mappings one has the range `500..501` and another `503..504`, both
point to the same file, but different offsets.
When we don't open the file since the path is the same, we won't assign
`s->current_mapping` and thus accessing way out of bound of the file.
From our example above, after `open_file` (that didn't open anything) we
will get the offset into the file with
`s->cluster_size*(cluster_num-s->current_mapping->begin)`, which will
give us `0x2000 * (504-500)`, which is out of bound for this mapping and
will produce some issues.
Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
Message-ID: <1f3ea115779abab62ba32c788073cdc99f9ad5dd.1721470238.git.amjadsharafi10@gmail.com>
[kwolf: Simplified the patch based on Amjad's analysis and input]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
How this `abort` was intended to check for was:
- if the `mapping->first_mapping_index` is not the same as
`first_mapping_index`, which **should** happen only in one case,
when we are handling the first mapping, in that case
`mapping->first_mapping_index == -1`, in all other cases, the other
mappings after the first should have the condition `true`.
- From above, we know that this is the first mapping, so if the offset
is not `0`, then abort, since this is an invalid state.
The issue was that `first_mapping_index` is not set if we are
checking from the middle, the variable `first_mapping_index` is
only set if we passed through the check `cluster_was_modified` with the
first mapping, and in the same function call we checked the other
mappings.
One approach is to go into the loop even if `cluster_was_modified`
is not true so that we will be able to set `first_mapping_index` for the
first mapping, but since `first_mapping_index` is only used here,
another approach is to just check manually for the
`mapping->first_mapping_index != -1` since we know that this is the
value for the only entry where `offset == 0` (i.e. first mapping).
Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <b0fbca3ee208c565885838f6a7deeaeb23f4f9c2.1721470238.git.amjadsharafi10@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Before this commit, the behavior when calling `commit_one_file` for
example with `offset=0x2000` (second cluster), what will happen is that
we won't fetch the next cluster from the fat, and instead use the first
cluster for the read operation.
This is due to off-by-one error here, where `i=0x2000 !< offset=0x2000`,
thus not fetching the next cluster.
Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <b97c1e1f1bc2f776061ae914f95d799d124fcd73.1721470238.git.amjadsharafi10@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the case of scsi-block, RESERVATION_CONFLICT is not a backend error,
but indicates that the guest tried to make a request that it isn't
allowed to execute. Pass the error to the guest so that it can decide
what to do with it.
Without this, if we stop the VM in response to a RESERVATION_CONFLICT
(as is the default policy in management software such as oVirt or
KubeVirt), it can happen that the VM cannot be resumed any more because
every attempt to resume it immediately runs into the same error and
stops the VM again.
One case that expects RESERVATION_CONFLICT errors to be visible in the
guest is running the validation tests in Windows 2019's Failover Cluster
Manager, which intentionally tries to execute invalid requests to see if
they are properly rejected.
Buglink: https://issues.redhat.com/browse/RHEL-50000
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240731123207.27636-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
scsi_block_sgio_complete() has surprising behaviour in that there are
error cases in which it directly completes the request and never calls
the passed callback. In the current state of the code, this doesn't seem
to result in bugs, but with future code changes, we must be careful to
never rely on the callback doing some cleanup until this code smell is
fixed. For now, just add warnings to make people aware of the trap.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240731123207.27636-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of calling into scsi_handle_rw_error() directly from
scsi_block_sgio_complete() and skipping the normal callback, go through
the normal cleanup path by calling the callback with a positive error
value.
The important difference here is not only that the code path is cleaner,
but that the callbacks set r->req.aiocb = NULL. If we skip setting this
and the error action is BLOCK_ERROR_ACTION_STOP, resuming the VM runs
into an assertion failure in scsi_read_data() or scsi_write_data()
because the dangling aiocb pointer is unexpected.
Fixes: a108557bbf ("scsi: inline sg_io_sense_from_errno() into the callers.")
Buglink: https://issues.redhat.com/browse/RHEL-50000
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240731123207.27636-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In some error cases, scsi_block_sgio_complete() never calls the passed
callback, but directly completes the request. This leads to bugs because
its error paths are not exact copies of what the callback would normally
do.
In preparation to fix this, allow passing positive return values to the
callbacks that represent the status code that should be used to complete
the request.
scsi_handle_rw_error() already handles positive values for its ret
parameter because scsi_block_sgio_complete() calls directly into it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240731123207.27636-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Upstream clang 18 (and backports to clang 17 in Fedora and RHEL)
implemented support for __attribute__((cleanup())) in its Thread Safety
Analysis, so we can now actually have a proper implementation of
WITH_GRAPH_RDLOCK_GUARD() that understands when we acquire and when we
release the lock.
-Wthread-safety is now only enabled if the compiler is new enough to
understand this pattern. In theory, we could have used some #ifdefs to
keep the existing basic checks on old compilers, but as long as someone
runs a newer compiler (and our CI does), we will catch locking problems,
so it's probably not worth keeping multiple implementations for this.
The implementation can't use g_autoptr any more because the glib macros
define wrapper functions that don't have the right TSA attributes, so
the compiler would complain about them. Just use the cleanup attribute
directly instead.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240627181245.281403-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The graph lock needs to be held when calling bdrv_co_pdiscard(). Fix
block_copy_task_entry() to take it for the call.
WITH_GRAPH_RDLOCK_GUARD() was implemented in a weak way because of
limitations in clang's Thread Safety Analysis at the time, so that it
only asserts that the lock is held (which allows calling functions that
require the lock), but we never deal with the unlocking (so even after
the scope of the guard, the compiler assumes that the lock is still
held). This is why the compiler didn't catch this locking error.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240627181245.281403-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockdevSnapshotInternal is the arguments type of command
blockdev-snapshot-internal-sync. Its doc comment contains this note:
# .. note:: In a transaction, if @name is empty or any snapshot matching
# @name exists, the operation will fail. Only some image formats
# support it; for example, qcow2, and rbd.
"In a transaction" is misleading, and "if @name is empty or any
snapshot matching @name exists, the operation will fail" is redundant
with the command's Errors documentation. Drop.
The remainder is fine. Move it to the command's doc comment, where it
is more prominently visible, with a slight rephrasing for clarity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240718123609.3063055-1-armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Misc HW & UI patches
- Replace Loongson IPI with LoongArch IPI on LoongArch Virt machine (Bibo)
- SD card: Do not abort when reading DAT lines on invalid cmd state (Phil)
- SDHCI: Reset @data_count index on invalid ADMA transfers (Phil)
- Don't decrement PFlash counter below 0 (Peter)
- Explicit a 8bit truncate on IDE ATAPI (Peter)
- Silent Coverity warning in ISA FDC (Peter)
- Remove dead code in PCI IDE bmdma_prepare_buf (Peter)
- Improve OpenGL and related display error messages (Peter)
- Set PCI base address register write mask on GC64120 host bridge (Phil)
- List PCIe Root Port and PCIe-to-PCI bridge in QEMU PCI IDs list (George)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmayMloACgkQ4+MsLN6t
# wN6SFQ//S0WvrFNsCeHphsbPETNwHL72j2XdX9xnt9UJZoBhFitOTCzo/EpNQHJe
# dFxCAfef9Nc9WDumyWsb7hE6IGjn/wPpVUnOnoWZZAilA6LK01J0mxgDXNRUf8ES
# iRo5x1Zd3oNBcKA9oqCuALkapXYypKCwSlRgvc42ekdYXHG95pFbJv9MmWIYy6Vn
# 0+hBWv3+Xegv7oFH4UsbjY844vsFcjupvrEm10bcH/zeYhEWVvXRylyfAQS8ww+U
# TYWj9g1i+Cfz+QxKyXovlS21ogieckiTYlr4yM7Ze7fD3Tyj5Q3KRfjC9tD0HoNb
# hjTSojfzk9m93/c5nASL7ChbjisJWqewH5J0eVLSMkqDRUsbFbsryJ4bDXIQNSYD
# HTko32P5obrDQO6l8rr6zuk1Y8lKBd0cY4fGlynXzsitp7duAqWJeMbD0s0duASW
# pqGITK/F/hKHJC6RVDaiFoyGHEa+wm4K6YqfwSFy0EOb5qYq0/d0MAEzTXPB1K1S
# mFMF6+Yk7ZfOnYwSDTDGf5hnmSvSLLdY+Ne94g9gLvuIRWCvc5rrjfBzAbnOfeif
# EMpFbofkMys5p7kxGUZhkJpRQiRjB11fZl9bplyhjGpPgQrq+E/j0G3Uc7jtkOUO
# sjB/4iA7RFvCe47EWqN3WR+rf462EGk2MD+Ebxd9FLsiciFvk1Y=
# =jOxG
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 07 Aug 2024 12:25:30 AM AEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240806' of https://github.com/philmd/qemu: (28 commits)
docs/specs/pci-ids: Fix markup
docs/specs/pci-ids: Add missing devices
hw/pci-host/gt64120: Reset config registers during RESET phase
hw/pci-host/gt64120: Set PCI base address register write mask
ui/console: Note in '-display help' that some backends support suboptions
system/vl.c: Expand OpenGL related errors
hw/display/virtio-gpu: Improve "opengl is not available" error message
hw/ide/pci: Remove dead code from bmdma_prepare_buf()
hw/block/fdc-isa: Assert that isa_fdc_get_drive_max_chs() found something
hw/ide/atapi: Be explicit that assigning to s->lcyl truncates
hw/block/pflash_cfi01: Don't decrement pfl->counter below 0
hw/sd/sdhci: Reset @data_count index on invalid ADMA transfers
hw/sd/sdcard: Do not abort when reading DAT lines on invalid cmd state
hw/sd/sdcard: Explicit dummy byte value
hw/intc/loongson_ipi: Restrict to MIPS
hw/loongarch/virt: Replace Loongson IPI with LoongArch IPI
hw/intc/loongarch_ipi: Add loongarch IPI support
hw/intc/loongson_ipi: Move common code to loongson_ipi_common.c
hw/intc/loongson_ipi: Expose loongson_ipi_core_read/write helpers
hw/intc/loongson_ipi: Add LoongsonIPICommonClass::cpu_by_arch_id handler
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reset config values in the device RESET phase, not only once
when the device is realized, because otherwise the device can
use unknown values at reset.
Since we are adding a new reset method, use the preferred
Resettable API (for a simple leaf device reset, a
DeviceClass::reset method and a ResettableClass::reset_hold
method are essentially identical).
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240802213122.86852-3-philmd@linaro.org>
When booting Linux we see:
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x10000000-0x17ffffff]
pci_bus 0000:00: root bus resource [io 0x1000-0x1fffff]
pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
pci 0000:00:00.0: [11ab:4620] type 00 class 0x060000
pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)
pci 0000:00:00.0: [Firmware Bug]: reg 0x1c: invalid BAR (can't size)
pci 0000:00:00.0: [Firmware Bug]: reg 0x20: invalid BAR (can't size)
pci 0000:00:00.0: [Firmware Bug]: reg 0x24: invalid BAR (can't size)
This is due to missing base address register write mask.
Add it to get:
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x10000000-0x17ffffff]
pci_bus 0000:00: root bus resource [io 0x1000-0x1fffff]
pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
pci 0000:00:00.0: [11ab:4620] type 00 class 0x060000
pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x00000fff pref]
pci 0000:00:00.0: reg 0x14: [mem 0x01000000-0x01000fff pref]
pci 0000:00:00.0: reg 0x18: [mem 0x1c000000-0x1c000fff]
pci 0000:00:00.0: reg 0x1c: [mem 0x1f000000-0x1f000fff]
pci 0000:00:00.0: reg 0x20: [mem 0x1be00000-0x1be00fff]
pci 0000:00:00.0: reg 0x24: [io 0x14000000-0x14000fff]
Since this device is only used by MIPS machines which aren't
versioned, we don't need to update migration compat machinery.
Mention the datasheet referenced. Remove the "Malta assumptions
ahead" comment since the reset values from the datasheet are used.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20240802213122.86852-2-philmd@linaro.org>
Currently '-display help' only prints the available backends. Some
of those backends support suboptions (e.g. '-display gtk,gl=on').
Mention that in the help output, and point the user to where they
might be able to find more information about the suboptions.
The new output looks like this:
$ qemu-system-aarch64 -display help
Available display backend types:
none
gtk
sdl
egl-headless
curses
spice-app
dbus
Some display backends support suboptions, which can be set with
-display backend,option=value,option=value...
For a short list of the suboptions for each display, see the top-level -help output; more detail is in the documentation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240731154136.3494621-4-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Expand the OpenGL related error messages we produce for various
"OpenGL not present/not supported" cases, to hopefully guide the
user towards how to fix things.
Now if the user tries to enable GL on a backend that doesn't
support it the error message is a bit more precise:
$ qemu-system-aarch64 -M virt -device virtio-gpu-gl -display curses,gl=on
qemu-system-aarch64: OpenGL is not supported by display backend 'curses'
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[AJB: Improved error report message]
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240731154136.3494621-3-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
If the user tries to use the virtio-gpu-gl device but the display
backend doesn't have OpenGL support enabled, we currently print a
rather uninformative error message:
$ qemu-system-aarch64 -M virt -device virtio-gpu-gl
qemu-system-aarch64: -device virtio-gpu-gl: opengl is not available
Since OpenGL is not enabled on display frontends by default, users
are quite likely to run into this. Improve the error message to
be more specific and to suggest to the user a path forward.
Note that the case of "user tried to enable OpenGL but the display
backend doesn't handle it" is caught elsewhere first, so we can
assume that isn't the problem:
$ qemu-system-aarch64 -M virt -device virtio-gpu-gl -display curses,gl=on
qemu-system-aarch64: OpenGL is not supported by the display
(Use of error_append_hint() requires us to add an ERRP_GUARD() to
the function, as noted in include/qapi/error.h.)
With this commit we now produce the hopefully more helpful error:
$ ./build/x86/qemu-system-aarch64 -M virt -device virtio-gpu-gl
qemu-system-aarch64: -device virtio-gpu-gl: The display backend does not have OpenGL support enabled
It can be enabled with '-display BACKEND,gl=on' where BACKEND is the name of the display backend to use.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2443
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240731154136.3494621-2-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity notes that the code at the end of the loop in
bmdma_prepare_buf() is unreachable. This is because in commit
9fbf0fa81f ("ide: remove hardcoded 2GiB transactional limit")
we removed the only codepath in the loop which could "break" out of
it, but didn't notice that this meant we should also remove the code
at the end of the loop.
Remove the dead code.
Resolves: Coverity CID 1547772
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMD: Break and return once at EOF]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240805182419.22239-1-philmd@linaro.org>
Coverity complains about an overflow in isa_fdc_get_drive_max_chs()
that can happen if the loop over fd_formats never finds a match,
because we initialize *maxc to 0 and then at the end of the
function decrement it.
This can't ever actually happen because fd_formats has at least
one entry for each FloppyDriveType, so we must at least once
find a match and update *maxc, *maxh and *maxs. Assert that we
did find a match, which should keep Coverity happy and will also
detect possible bugs in the data in fd_formats.
Resolves: Coverity CID 1547663
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240731143617.3391947-6-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In ide_atapi_cmd_reply_end() we calculate a 16-bit size, and then
assign its two halves to s->lcyl and s->hcyl like this:
s->lcyl = size;
s->hcyl = size >> 8;
Coverity warns that the first line here can overflow the
8-bit s->lcyl variable. This is true, and in this case we're
deliberately only after the low 8 bits of the value. The
code is clearer to both humans and Coverity if we're explicit
that we only wanted the low 8 bits, though.
Resolves: Coverity CID 1547621
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240731143617.3391947-5-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In pflash_write() Coverity points out that we can decrement the
unsigned pfl->counter below zero, which makes it wrap around. In
fact this is harmless, because if pfl->counter is 0 at this point we
also increment pfl->wcycle to 3, and the wcycle == 3 handling doesn't
look at counter; the only way back into code which looks at the
counter value is via wcycle == 1, which will reinitialize the counter.
But it's arguably a little clearer to break early in the "counter ==
0" if(), to avoid the decrement-below-zero.
Resolves: Coverity CID 1547611
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240731143617.3391947-4-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
A new minor version of OpenSBI was just released after our bump to
OpenSBI 1.5. It contains significant bug fixes that it's worth doing
a new update for QEMU 9.1.
Submodule roms/opensbi 455de672dd..43cace6c36:
> lib: sbi: check result of pmp_get() in is_pmp_entry_mapped()
> lib: sbi: fwft: fix incorrect size passed to sbi_zalloc()
> lib: sbi: dbtr: fix potential NULL pointer dereferences
> include: Adjust Sscofpmf mhpmevent mask for upper 8 bits
> lib: sbi_hsm: Save/restore menvcfg only when it exists
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240805120259.1705016-2-dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Coverity complained about the possible out-of-bounds access with
counter_virt/counter_virt_prev because these two arrays are
accessed with privilege mode. However, these two arrays are accessed
only when virt is enabled. Thus, the privilege mode can't be M mode.
Add the asserts anyways to detect any wrong usage of these arrays
in the future.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Fixes: Coverity CID 1558459
Fixes: Coverity CID 1558462
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240724-fixes-v1-1-4a64596b0d64@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Sweep the entire documentation again. Last done in commit
209e64d9ed (qapi: Refill doc comments to conform to current
conventions).
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the reflown
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240729065220.860163-1-armbru@redhat.com>
[Straightforward conflict with commit 442110bc6f resolved]
Analyzing qemu-produced core dumps of multi-threaded apps runs into:
(gdb) info threads
[...]
21 Thread 0x3ff83cc0740 (LWP 9295) warning: Couldn't find general-purpose registers in core file.
<unavailable> in ?? ()
The reason is that all pr_pid values are the same, because the same
TaskState is used for all CPUs when generating NT_PRSTATUS notes.
Fix by using TaskStates associated with individual CPUs.
Cc: qemu-stable@nongnu.org
Fixes: 243c470662 ("linux-user/elfload: Write corefile elf header in one block")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240801202340.21845-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When a channel fails to create, the code currently just returns. This
is wrong for two reasons:
1) Channel n+1 will not get to initialize it's semaphores, leading to
an assert when terminate_threads tries to post to it:
qemu-system-x86_64: ../util/qemu-thread-posix.c:92:
qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
2) (theoretical) If channel n-1 already started creation it will
defeat the purpose of the channels_created logic which is in place
to avoid migrate_fd_cleanup() to run while channels are still being
created.
This cannot really happen today because the current failure cases
for multifd_new_send_channel_create() are all synchronous,
resulting from qio_channel_file_new_path() getting a bad
filename. This would hit all channels equally.
But I don't want to set a trap for future people, so have all
channels try to create (even if failing), and only fail after the
channels_created semaphore has been posted.
While here, remove the error_report_err call. There's one already at
migrate_fd_cleanup later on.
Cc: qemu-stable@nongnu.org
Reported-by: Jim Fehlig <jfehlig@suse.com>
Fixes: b7b03eb614 ("migration/multifd: Add outgoing QIOChannelFile support")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The QIOChannelFile object already has its reference decremented by
g_autoptr. Trying to unref an extra time causes:
ERROR:../qom/object.c:1241:object_unref: assertion failed: (obj->ref > 0)
Fixes: a701c03dec ("migration: Drop reference to QIOChannel if file seeking fails")
Fixes: 6d3279655a ("migration: Fix file migration with fdset")
Reported-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The vcek-disabled property of the sev-snp-guest object is misspelled
vcek-required (which I suppose would use the opposite polarity) in
the call to object_class_property_add_bool(). Fix it.
Reported-by: Zixi Chen <zixchen@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmasTgwACgkQ7wSWWzmN
# YhFUtAgAq45v7fQJ7cKKwRam/VrIkxT5cM59ODwzLSL9kPWfL6f/bJ7xM/zvLyvn
# LNBXFWWu+eNKA73f95cckZwaqZ4U6giGbiesCACn1IpgVtieLS+Lq78jsifKIAsR
# yxFvbT9oLhU0dZ1Up3+isc6V+jeAE4ZYu4KOiIt7PscTEzkJl+vSUjN4X9rRVtUD
# PzONUacL6MoTJtX8UZJZXNzLN9JTsN39Gx+LSDGQ27MDmDvE3R9BW+T0ZgF9JQZ7
# wnrL5sharqF3gxa7X55fPBI1qwY5gWcH0yyJpRdM8guA13vhtvlrhNSypip9eKWi
# HtPHUTKEB5YOvF236WRiuQPIm/GNpA==
# =7HGN
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 02 Aug 2024 01:10:04 PM AEST
# gpg: using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
net: Reinstate '-net nic, model=help' output as documented in man page
net: update netdev stream man page with the reconnect parameter
net: update netdev dgram man page with unix socket
net: update netdev stream man page with unix socket
net: update netdev stream/dgram man page
virtio-net: Fix network stall at the host side waiting for kick
virtio-net: Ensure queue index fits with RSS
rtl8139: Fix behaviour for old kernels.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While refactoring the NIC initialization code, I broke '-net nic,model=help'
which no longer outputs a list of available NIC models.
Fixes: 2cdeca04ad ("net: report list of available models according to platform")
Cc: qemu-stable@nongnu.org
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
"-netdev stream" supports a reconnect parameter that attempts to
reconnect automatically the socket if it is disconnected. The code
has been added but the man page has not been updated.
Fixes: 148fbf0d58 ("net: stream: add a new option to automatically reconnect"
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add the description of "-netdev dgram" with a unix domain socket.
The code has been added but the man page has not been updated.
Fixes: 784e7a2531 ("net: dgram: add unix socket")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add the description of "-netdev stream" with a unix domain socket.
The code has been added but the man page has not been updated.
Include an example how to use "-netdev stream" and "passt" in place
of "-netdev user".
("passt" is a non privileged translation proxy between layer-2, like
"-netdev stream", and layer-4 on host, like TCP, UDP, ICMP/ICMPv6 echo)
Fixes: 13c6be9661 ("net: stream: add unix socket")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add the description of "-netdev stream" and "-netdev dgram" in the QEMU
manpage.
Add some examples on how to use them.
Fixes: 5166fe0ae4 ("qapi: net: add stream and dgram netdevs")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Patch 06b1297017 ("virtio-net: fix network stall under load")
added double-check to test whether the available buffer size
can satisfy the request or not, in case the guest has added
some buffers to the avail ring simultaneously after the first
check. It will be lucky if the available buffer size becomes
okay after the double-check, then the host can send the packet
to the guest. If the buffer size still can't satisfy the request,
even if the guest has added some buffers, viritio-net would
stall at the host side forever.
The patch enables notification and checks whether the guest has
added some buffers since last check of available buffers when
the available buffers are insufficient. If no buffer is added,
return false, else recheck the available buffers in the loop.
If the available buffers are sufficient, disable notification
and return true.
Changes:
1. Change the return type of virtqueue_get_avail_bytes() from void
to int, it returns an opaque that represents the shadow_avail_idx
of the virtqueue on success, else -1 on error.
2. Add a new API: virtio_queue_enable_notification_and_check(),
it takes an opaque as input arg which is returned from
virtqueue_get_avail_bytes(). It enables notification firstly,
then checks whether the guest has added some buffers since
last check of available buffers or not by virtio_queue_poll(),
return ture if yes.
The patch also reverts patch "06b12970174".
The case below can reproduce the stall.
Guest 0
+--------+
| iperf |
---------------> | server |
Host | +--------+
+--------+ | ...
| iperf |----
| client |---- Guest n
+--------+ | +--------+
| | iperf |
---------------> | server |
+--------+
Boot many guests from qemu with virtio network:
qemu ... -netdev tap,id=net_x \
-device virtio-net-pci-non-transitional,\
iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x
Each guest acts as iperf server with commands below:
iperf3 -s -D -i 10 -p 8001
iperf3 -s -D -i 10 -p 8002
The host as iperf client:
iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 40000
iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 40000
After some time, the host loses connection to the guest,
the guest can send packet to the host, but can't receive
packet from the host.
It's more likely to happen if SWIOTLB is enabled in the guest,
allocating and freeing bounce buffer takes some CPU ticks,
copying from/to bounce buffer takes more CPU ticks, compared
with that there is no bounce buffer in the guest.
Once the rate of producing packets from the host approximates
the rate of receiveing packets in the guest, the guest would
loop in NAPI.
receive packets ---
| |
v |
free buf virtnet_poll
| |
v |
add buf to avail ring ---
|
| need kick the host?
| NAPI continues
v
receive packets ---
| |
v |
free buf virtnet_poll
| |
v |
add buf to avail ring ---
|
v
... ...
On the other hand, the host fetches free buf from avail
ring, if the buf in the avail ring is not enough, the
host notifies the guest the event by writing the avail
idx read from avail ring to the event idx of used ring,
then the host goes to sleep, waiting for the kick signal
from the guest.
Once the guest finds the host is waiting for kick singal
(in virtqueue_kick_prepare_split()), it kicks the host.
The host may stall forever at the sequences below:
Host Guest
------------ -----------
fetch buf, send packet receive packet ---
... ... |
fetch buf, send packet add buf |
... add buf virtnet_poll
buf not enough avail idx-> add buf |
read avail idx add buf |
add buf ---
receive packet ---
write event idx ... |
wait for kick add buf virtnet_poll
... |
---
no more packet, exit NAPI
In the first loop of NAPI above, indicated in the range of
virtnet_poll above, the host is sending packets while the
guest is receiving packets and adding buffers.
step 1: The buf is not enough, for example, a big packet
needs 5 buf, but the available buf count is 3.
The host read current avail idx.
step 2: The guest adds some buf, then checks whether the
host is waiting for kick signal, not at this time.
The used ring is not empty, the guest continues
the second loop of NAPI.
step 3: The host writes the avail idx read from avail
ring to used ring as event idx via
virtio_queue_set_notification(q->rx_vq, 1).
step 4: At the end of the second loop of NAPI, recheck
whether kick is needed, as the event idx in the
used ring written by the host is beyound the
range of kick condition, the guest will not
send kick signal to the host.
Fixes: 06b1297017 ("virtio-net: fix network stall under load")
Cc: qemu-stable@nongnu.org
Signed-off-by: Wencheng Yang <east.moutain.yang@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Ensure the queue index points to a valid queue when software RSS
enabled. The new calculation matches with the behavior of Linux's TAP
device with the RSS eBPF program.
Fixes: 4474e37a5b ("virtio-net: implement RX RSS processing")
Reported-by: Zhibin Hu <huzhibin5@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Old linux kernel rtl8139 drivers (ex. debian 2.1) uses outb to set the rx
mode for RxConfig. Unfortunatelly qemu does not support outb for RxConfig.
Signed-off-by: Hans <sungdgdhtryrt@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
virtio,pci,pc: fixes
revert virtio pci/SR-IOV emulation at author's request
a couple of fixes in virtio,vtd
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmarSFUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp7fwH/3wNCGhgHhF5dhKRKRn8hqhxYl2rXnv0LKYI
# Rgsoxh3kw6oKBXxLG/B4V2GkqDSU8q8NuHnvGmmAUQ/uHmwTWbBbrZ+HwMMmaRhT
# Ox8kIXiVYAtw24yLKDvyoKbMLjLKb9/QqTT4rbsQ9yl5PLxwoGGJEu/ifM1MbZZY
# f5CDtj3hRArIZEjMt0Q3h+G7///BRVZxQ/0de57whGXcr349qgMpiIThvlCOj7Yf
# rQ68AGS4yk1Jk0oxiYyWjo43o8JbB5bMnCrkzDy4ZdY5Sw9zGb48CmcrBUl4J9lv
# NVDYK63dsvRS0ew7PxaEwu32MIQLJcn5s521m81/ZAhbdyzLnlI=
# =/2+K
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Aug 2024 06:33:25 PM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
intel_iommu: Fix for IQA reg read dropped DW field
hw/i386/amd_iommu: Don't leak memory in amdvi_update_iotlb()
Revert "hw/pci: Rename has_power to enabled"
Revert "hw/ppc/spapr_pci: Do not create DT for disabled PCI device"
Revert "hw/ppc/spapr_pci: Do not reject VFs created after a PF"
Revert "pcie_sriov: Do not manually unrealize"
Revert "pcie_sriov: Ensure VF function number does not overflow"
Revert "pcie_sriov: Reuse SR-IOV VF device instances"
Revert "pcie_sriov: Release VFs failed to realize"
Revert "pcie_sriov: Remove num_vfs from PCIESriovPF"
Revert "pcie_sriov: Register VFs after migration"
Revert "hw/pci: Fix SR-IOV VF number calculation"
Revert "pcie_sriov: Ensure PF and VF are mutually exclusive"
Revert "pcie_sriov: Check PCI Express for SR-IOV PF"
Revert "pcie_sriov: Allow user to create SR-IOV device"
Revert "virtio-pci: Implement SR-IOV PF"
Revert "virtio-net: Implement SR-IOV VF"
Revert "docs: Document composable SR-IOV device"
virtio-rng: block max-bytes=0
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In commit ad18376b90 we added an assert that the level value was
in-bounds for the array we're about to index into. However, the
assert condition is wrong -- env->config->interrupt_vector is an
array of uint32_t, so we should bounds check the index against
ARRAY_SIZE(...), not against sizeof().
Resolves: Coverity CID 1507131
Fixes: ad18376b90 ("target/xtensa: Assert that interrupt level is within bounds")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240731172246.3682311-1-peter.maydell@linaro.org
The FMOPA (widening) SME instruction takes pairs of half-precision
floating point values, widens them to single-precision, does a
two-way dot product and accumulates the results into a
single-precision destination. We don't quite correctly handle the
FPCR bits FZ and FZ16 which control flushing of denormal inputs and
outputs. This is because at the moment we pass a single float_status
value to the helper function, which then uses that configuration for
all the fp operations it does. However, because the inputs to this
operation are float16 and the outputs are float32 we need to use the
fp_status_f16 for the float16 input widening but the normal fp_status
for everything else. Otherwise we will apply the flushing control
FPCR.FZ16 to the 32-bit output rather than the FPCR.FZ control, and
incorrectly flush a denormal output to zero when we should not (or
vice-versa).
(In commit 207d30b5fd we tried to fix the FZ handling but
didn't get it right, switching from "use FPCR.FZ for everything" to
"use FPCR.FZ16 for everything".)
Pass the CPU env to the sme_fmopa_h helper instead of an fp_status
pointer, and have the helper pass an extra fp_status into the
f16_dotadd() function so that we can use the right status for the
right parts of this operation.
Cc: qemu-stable@nongnu.org
Fixes: 207d30b5fd ("target/arm: Use FPST_F16 for SME FMOPA (widening)")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2373
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
If VT-D hardware supports scalable mode, Linux will set the IQA DW field
(bit11). In qemu, the vtd_mem_write and vtd_update_iq_dw set DW field well.
However, vtd_mem_read the DW field wrong because "& VTD_IQA_QS" dropped the
value of DW.
Replace "&VTD_IQA_QS" with "& (VTD_IQA_QS | VTD_IQA_DW_MASK)" could save
the DW field.
Test patch as below:
config the "x-scalable-mode" option:
"-device intel-iommu,caching-mode=on,x-scalable-mode=on,aw-bits=48"
After Linux OS boot, check the IQA_REG DW Field by usage 1 or 2:
1. IOMMU_DEBUGFS:
Before fix:
cat /sys/kernel/debug/iommu/intel/iommu_regset |grep IQA
IQA 0x90 0x00000001001da001
After fix:
cat /sys/kernel/debug/iommu/intel/iommu_regset |grep IQA
IQA 0x90 0x00000001001da801
Check DW field(bit11) is 1.
2. devmem2 read the IQA_REG (offset 0x90):
Before fix:
devmem2 0xfed90090
/dev/mem opened.
Memory mapped at address 0x7f72c795b000.
Value at address 0xFED90090 (0x7f72c795b090): 0x1DA001
After fix:
devmem2 0xfed90090
/dev/mem opened.
Memory mapped at address 0x7fc95281c000.
Value at address 0xFED90090 (0x7fc95281c090): 0x1DA801
Check DW field(bit11) is 1.
Signed-off-by: yeeli <seven.yi.lee@gmail.com>
Message-Id: <20240725031858.1529902-1-seven.yi.lee@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In amdvi_update_iotlb() we will only put a new entry in the hash
table if to_cache.perm is not IOMMU_NONE. However we allocate the
memory for the new AMDVIIOTLBEntry and for the hash table key
regardless. This means that in the IOMMU_NONE case we will leak the
memory we alloacted.
Move the allocations into the if() to the point where we know we're
going to add the item to the hash table.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2452
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240731170019.3590563-1-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similar to qemu-pr-helper, do not print errors from the socket handling loop
unless a --verbose or -v option is provided explicitly on the command line.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Between v5 and v6 of the series, the socket loop of qemu-vmsr-helper was changed to
allow sending multiple requests on the same socket. Unfortunately, the condition
of the while loop is botched and the loop will never be entered. Clean it up, and
also unify the handling of error reporting.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As SDM stated, CPUID 0x12 leaves depend on CPUID_7_0_EBX_SGX (SGX
feature word).
Since FEAT_SGX_12_0_EAX, FEAT_SGX_12_0_EBX and FEAT_SGX_12_1_EAX define
multiple feature words, add the dependencies of those registers to
report the warning to user if SGX is absent.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At present, cpu_x86_cpuid() silently masks off SGX_LC if SGX is absent.
This is not proper because the user is not told about the dependency
between the two.
So explicitly define the dependency between SGX_LC and SGX feature
words, so that user could get a warning when SGX_LC is enabled but
SGX is absent.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID.0x7.0.ebx and CPUID.0x7.0.ecx leaves have been expressed as the
feature word lists, and the Host capability support has been checked
in x86_cpu_filter_features().
Therefore, such checks on SGX feature "words" are redundant, and
the follow-up adjustments to those feature "words" will not actually
take effect.
Remove unnecessary SGX feature words related checks.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Minor bug fixes and documentation cleanups:
- display packages in CI builds to catch changes
- stop compiler complaining about exec stacks in test cases
- stop loongarch compiler complaining about rwx in test cases
- improve docs on running TCG tests
- remove old unneeded avocado test for memory callback testing
- move test plugins into tcg testing dir
- clean-up and move plugin documentation to emulation section
- remove dead code from cache modelling plugin
- add compatibility workaround for lockstep plugin
- make some noise when building contrib plugins
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmaoxmwACgkQ+9DbCVqe
# KkTOGwgAhAqwEQIIwridsih//+3NYB2QQ9SmbCW7ss/idVN0DfWEQLcEfiBz9sWl
# Vh0CeptupLvtQlbbcnTdIG7sF6Aj9+XbTbYy4dS0nl4TGPmUoWqJy4QdZtpMlSBp
# s3FyC2g6UxXKkbI64RPSkdGaMEdb8ACvlGrQqb2LvrH+6tmlEfSQ05jLFrm1L0Db
# LjsxeFq50aVVIP2y91Cvc7FZmFgv0dqjTVlIMi9JGiW5cDKAwLDHv5AvQqT6oiv8
# yyknMlnf8pvNiJpsJYXHIbl/029C87n6NeStHjfrMA9yUC4hWqYb4qzXFu4k7fuo
# 2s5WdRFK+QAXEgi9MhS2p7eXVVlMpw==
# =8gOM
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Jul 2024 08:54:36 PM AEST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-9.1-rc1-300724-1' of https://gitlab.com/stsquad/qemu:
plugin/loader: handle basic help query
contrib/plugins: add compat for g_memdup2
contrib/plugins: be more vocal building
contrib/plugins/cache.c: Remove redundant check of l2_access
docs: split TCG plugin usage from devel section
tests/tcg: move test plugins into tcg subdir
tests/avocado: remove tcg_plugins virt_mem_icount test
docs/devel: document how to run individual TCG tests
docs/devel: update the testing introduction
tests/tcg: update README
tests/tcg/loongarch64: Use --no-warn-rwx-segments to link system tests
tests/tcg: Use --noexecstack with assembler files
gitlab: display /packages.txt in build jobs
gitlab: record installed packages in /packages.txt in containers
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Our official support policy only supports the most recent two
versions of macOS (currently macOS 13 Ventura and macOS 14 Sonoma),
and we already have code that assumes at least macOS 12 Monterey or
better. In commit 2d27c91e2b we dropped some of the back-compat
code for older macOS versions, but missed the guard in osdep.h that
is providing a fallback for macOS 10 and earlier.
Simplify the ifdef to the "ifdef __APPLE__" that we use elsewhere for
"is this macOS?".
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240730095939.2781172-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The asset used in the mentioned test gets truncated before it's used
in the test. This means that the file gets modified, and thus the
asset's expected hash doesn't match anymore. This causes cache misses
and re-downloads every time the test is re-run.
Let's make a copy of the asset so that the one in the cache is
preserved and the cache sees a hit on re-runs.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240726134438.14720-9-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Avocado's fetchasset plugin runs before the actual Avocado job (and
any test). It analyses the test's code looking for occurrences of
"self.fetch_asset()" in the either the actual test or setUp() method.
It's not able to fully analyze all code, though.
The way these tests are written, make the fetchasset plugin blind to
the assets. This adds some more code duplication, true, but it will
aid the fetchasset plugin to download or verify the existence of these
assets in advance.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240726134438.14720-3-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The SSL certificate installed at mipsdistros.mips.com has expired:
0 s:CN = mipsdistros.mips.com
i:C = US, O = Amazon, OU = Server CA 1B, CN = Amazon
a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
v:NotBefore: Dec 23 00:00:00 2019 GMT; NotAfter: Jan 23 12:00:00 2021 GMT
Because this project has no control over that certificate and host,
this falls back to plain HTTP instead. The integrity of the
downloaded files can be guaranteed by the existing hashes for those
files (which are not modified here).
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240726134438.14720-2-crosa@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In newer versions of Sphinx the env.doc2path() API is going to change
to return a Path object rather than a str. This was originally visible
in Sphinx 8.0.0rc1, but has been rolled back for the final 8.0.0
release. However it will probably emit a deprecation warning and is
likely to change for good in 9.0:
https://github.com/sphinx-doc/sphinx/issues/12686
Our use in depfile.py assumes a str, and if it is passed a Path
it will fall over:
Handler <function write_depfile at 0x77a1775ff560> for event 'build-finished' threw an exception (exception: unsupported operand type(s) for +: 'PosixPath' and 'str')
Wrapping the env.doc2path() call in str() will coerce a Path object
to the str we expect, and have no effect in older Sphinx versions
that do return a str.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2458
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240729120533.2486427-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
target-arm queue:
* hw/char/bcm2835_aux: Fix assert when receive FIFO fills up
* hw/arm/smmuv3: Assert input to oas2bits() is valid
* target/arm/kvm: Set PMU for host only when available
* target/arm/kvm: Do not silently remove PMU
* hvf: arm: Properly disable PMU
* hvf: arm: Do not advance PC when raising an exception
* hw/misc/bcm2835_property: several minor bugfixes
* target/arm: Don't assert for 128-bit tile accesses when SVL is 128
* target/arm: Fix UMOPA/UMOPS of 16-bit values
* target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
* system/physmem: Where we assume we have a RAM MR, assert it
* sh4, i386, m68k, xtensa, tricore, arm: fix minor Coverity issues
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmaotMAZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3rsAEACIzQDAMKWy8DlB8o4W+a/l
# yqGijQ5e0JdAifEA2rsDbnaIs/kqDzVxBc0dgIXDxETe5LVZHB742q4vMbaSpSb2
# P8xuL0Q7NRpcIN4THPwLxW0wED+asaJc2TeyImPQRTRhLgk6yn+/4hpqQRkT0mxe
# oxxN8bnx9RssqHZ6pQCv5HYNLex3a7dljXlbjWr4KpRRFSMls1cxPSphsK1aZ1xV
# 3NXh/vgHcM0LquwxdF0uaPdPGQ1SyZb5KZ9khd0o4cpDivkns/hXQpyJ45nHsypK
# kG/TbFQsXPorprWCqBDOXY9rCM6eBDuK89mClKA34EzukIFlSMfIgxfezCzNIXaU
# o/cJCGpSzZnCdvZagEWDzkdryE3QFmmpBFRs8mcS3sb+/gm0O8YyMoCrdV87O3c5
# Y/NY1adOKTVf8FLlT3jR93k4pT6wiqIQND13fN3EbnUqfrGpocSyMD0VsYBj/gij
# PHPBFHAwCEDKVZSq6SViXdkS15arqL2V2mnOogeY1v0jTj2YRG3FyjrPOatg6tF5
# 3MoUBjTAp9ENtYHAY6mCr2vAYw6l1xZTKUwkXiO/i8rc4XQQ+A3AHhQLtWdu2K5+
# dv1E7QKur5O6FDmJxB5s/vGppfnkSUD6EEvViNSCj+hX0U9SyT80e/KClMehgJqQ
# +oME+fRoBHj1DUw4qasWsg==
# =NNxN
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Jul 2024 07:39:12 PM AEST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20240730' of https://git.linaro.org/people/pmaydell/qemu-arm: (21 commits)
system/physmem: Where we assume we have a RAM MR, assert it
target/sh4: Avoid shift into sign bit in update_itlb_use()
target/i386: Remove dead assignment to ss in do_interrupt64()
target/m68k: avoid shift into sign bit in dump_address_map()
target/xtensa: Make use of 'segment' in pptlb helper less confusing
target/tricore: Use unsigned types for bitops in helper_eq_b()
target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
target/arm: Fix UMOPA/UMOPS of 16-bit values
target/arm: Don't assert for 128-bit tile accesses when SVL is 128
hw/misc/bcm2835_property: Reduce scope of variables in mbox push function
hw/misc/bcm2835_property: Restrict scope of start_num, number, otp_row
hw/misc/bcm2835_property: Avoid overflow in OTP access properties
hw/misc/bcm2835_property: Fix handling of FRAMEBUFFER_SET_PALETTE
hvf: arm: Do not advance PC when raising an exception
hvf: arm: Properly disable PMU
hvf: arm: Raise an exception for sysreg by default
target/arm/kvm: Do not silently remove PMU
target/arm/kvm: Set PMU for host only when available
hw/arm/smmuv3: Assert input to oas2bits() is valid
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As the list of options isn't fixed we do all the parsing by hand.
Without any named arguments we automatically fill the "file" option
with the value give so check if it is requesting help and dump some
basic usage text.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240729144414.830369-15-alex.bennee@linaro.org>
The devel section is getting quite messy with the breakdown of the
example plugins which should be usable by users. As we mention plugins
in the emulation section along with semihosting move the overview
there leaving the development section about the details of writing
plugins.
While we are at make the headings nicer and convert the option lists
into nicely formatted tables.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240729144414.830369-11-alex.bennee@linaro.org>
You cannot use plugins without TCG enabled so it doesn't make sense to
have them separated off in the test directory structure. While we are
at it rename the directory to plugins to reflect the plural nature of
the directory and match up with contrib/plugins.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240729144414.830369-10-alex.bennee@linaro.org>
Since 4f8d886085 (tests/plugin/mem: migrate to new per_vcpu API) this
test was skipping due to not being able to run callback and inline
memory instrumentation at the same time.
However b480f7a621 (tests/plugin: add test plugin for inline
operations) tests for all this matching up so we don't need the
additional complexity in avocado.
Remove the test.
Fixes: 4f8d886085
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240729144414.830369-9-alex.bennee@linaro.org>
The lcitool created containers save the full distro package list
details into /packages.txt. The idea is that build jobs will 'cat'
this file, so that the build log has a record of what packages
were used. This is important info, because when it comes to debug
failures, the original container is often lost.
This extends the manually written dockerfiles to also create the
/packages.txt file.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240724095505.33544-2-berrange@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240729144414.830369-2-alex.bennee@linaro.org>
util/getauxval: Ensure setting errno if not found
util/getauxval: Use elf_aux_info on OpenBSD
linux-user: open_self_stat: Implement num_threads
target/rx: Use target_ulong for address in LI
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmaoPYUdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/QoQgAhqVcFGTLW9ozw8cR
# 7DMloHfDbcZTmjQIUvq2WPWCGpUj6mXZXQCM7QAjfGVSa45zOsmRyTRM/If0aZxq
# r0/rQmNVchJ2bjnzz83tu1A+a2+yXLwzzfUdBZ6Jg91vSOrJ0io8CyHSIdtLrFlK
# mV/LQ5viFdhlqk5GO0o/vdAgBgz6rVk4Uwuc/wl88JR5AHk7tRB21XC2ZzhfupBR
# 7QnIru6K1Ltm1sJYxW7qX7DC720iqLeS/LFH67Q2f9eVgejUevoOPmCyOvVmt1kr
# VPwmxKUs46M3qs6zQ2DuPVIgXZof3Xs1C7jcPR6wvXzVcsof3X1Ma70zdVHWXkCN
# XKrTHQ==
# =WadL
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Jul 2024 11:10:29 AM AEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-misc-20240730' of https://gitlab.com/rth7680/qemu:
linux-user: open_self_stat: Implement num_threads
util/cpuinfo: Make use of elf_aux_info(3) on OpenBSD
linux-user/main: Check errno when getting AT_EXECFD
util/getauxval: Ensure setting errno if not found
target/rx: Use target_ulong for address in LI
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sometimes zero is a valid value for getauxval (e.g. AT_EXECFD). Make
sure that we can distinguish between a valid zero value and a not found
entry by setting errno.
Assumes that getauxval from sys/auxv.h sets errno correctly.
Signed-off-by: Vivian Wang <uwu@dram.page>
Message-ID: <20240723100545.405476-2-uwu@dram.page>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
CpuModelInfo is used both as command argument and in command
returns.
Its @deprecated-props array does not make any sense in arguments,
and is silently ignored. We actually want it only as return value
of query-cpu-model-expansion.
Move it from CpuModelInfo to CpuModelExpansionType, and document
its dependence on expansion type property.
This was identified late during review [1] and we have to fix it up
while it's not part of an official QEMU release yet.
[1] https://lore.kernel.org/qemu-devel/20240719181741.35146-1-walling@linux.ibm.com/
Message-ID: <20240726203646.20279-1-walling@linux.ibm.com>
Fixes: eed0e8ffa3 ("target/s390x: filter deprecated properties based on model expansion type")
Signed-off-by: Collin Walling <walling@linux.ibm.com>
[ david: - add "Fixes", adjust description, reference v3 instead
- make property s390x-only and non-optional
- fixup "populate" vs. "populated" ]
Signed-off-by: David Hildenbrand <david@redhat.com>
In the functions invalidate_and_set_dirty() and
cpu_physical_memory_snapshot_and_clear_dirty(), we assume that we
are dealing with RAM memory regions. In this case we know that
memory_region_get_ram_addr() will succeed. Assert this before we
use the returned ram_addr_t in arithmetic.
This makes Coverity happier about these functions: it otherwise
complains that we might have an arithmetic overflow that stems
from the possible -1 return from memory_region_get_ram_addr().
Resolves: Coverity CID 1547629, 1547715
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-id: 20240723170513.1676453-1-peter.maydell@linaro.org
In update_itlb_use() the variables or_mask and and_mask are uint8_t,
which means that in expressions like "and_mask << 24" the usual C
arithmetic conversions will result in the shift being done as a
signed int type, and so we will shift into the sign bit. For QEMU
this isn't undefined behaviour because we use -fwrapv; but we can
avoid it anyway by using uint32_t types for or_mask and and_mask.
Resolves: Coverity CID 1547628
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Message-id: 20240723172431.1757296-1-peter.maydell@linaro.org
Coverity points out that in do_interrupt64() in the "to inner
privilege" codepath we set "ss = 0", but because we also set
"new_stack = 1" there, later in the function we will always override
that value of ss with "ss = 0 | dpl".
Remove the unnecessary initialization of ss, which allows us to
reduce the scope of the variable to only where it is used. Borrow a
comment from helper_lcall_protected() that explains what "0 | dpl"
means here.
Resolves: Coverity CID 1527395
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723162525.1585743-1-peter.maydell@linaro.org
Coverity complains (CID 1547592) that in dump_address_map() we take a
value stored in a signed integer variable 'i' and shift it by enough
to shift into the sign bit when we construct the value 'logical'.
This isn't a bug for QEMU because we use -fwrapv semantics, but
we can make Coverity happy by using an unsigned type for the loop
variables i, j, k in this function.
While we're changing the declaration of the variables, put them
in the for() loops so their scope is the minimum required (a style
now permitted by our coding style guide).
Resolves: Coverity CID 1547592
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723154207.1483665-1-peter.maydell@linaro.org
Coverity gets confused about the use of the 'segment' variable in the
pptlb helper function: it thinks that we can take a code path where
we first initialize it:
unsigned segment = XTENSA_MPU_PROBE_B; // 0x40000000
and then use that value as a shift count:
} else if (nhits == 1 && (env->sregs[MPUENB] & (1u << segment))) {
In fact this isn't possible, beacuse xtensa_mpu_lookup() is passed
'&segment', and it uses that as an output value, which it will always
set if it returns nonzero. But the way the code is currently written
is confusing to a human reader as well as to Coverity.
Instead of initializing 'segment' at the top of the function with a
value that's only used in the "nhits == 0" code path, use the
constant value directly in that code path, and don't initialize
segment. This matches the way we use xtensa_mpu_lookup() in its
other callsites in get_physical_addr_mpu().
Resolves: Coverity CID 1547589
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 20240723151454.1396826-1-peter.maydell@linaro.org
Coverity points out that in helper_eq_b() we have an int32_t 'msk'
and we end up shifting into its sign bit. This is OK for QEMU because
we use -fwrapv to give this well defined semantics, but when you look
at what this function is doing it's doing bit operations, so we
should be using an unsigned variable anyway. This also matches the
return type of the function.
Make 'ret' and 'msk' uint32_t.
Resolves: Coverity CID 1547758
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240723151042.1396610-1-peter.maydell@linaro.org
When determining the current vector length, the SMCR_EL2.LEN and
SVCR_EL2.LEN settings should only be considered if EL2 is enabled
(compare the pseudocode CurrentSVL and CurrentNSVL which call
EL2Enabled()).
We were checking against ARM_FEATURE_EL2 rather than calling
arm_is_el2_enabled(), which meant that we would look at
SMCR_EL2/SVCR_EL2 when in Secure EL1 or Secure EL0 even if Secure EL2
was not enabled.
Use the correct check in sve_vqm1_for_el_sm().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-5-peter.maydell@linaro.org
The function tszimm_esz() returns a shift amount, or possibly -1 in
certain cases that correspond to unallocated encodings in the
instruction set. We catch these later in the trans_ functions
(generally with an "a-esz < 0" check), but before we do the
decodetree-generated code will also call tszimm_shr() or tszimm_sl(),
which will use the tszimm_esz() return value as a shift count without
checking that it is not negative, which is undefined behaviour.
Avoid the UB by checking the return value in tszimm_shr() and
tszimm_shl().
Cc: qemu-stable@nongnu.org
Resolves: Coverity CID 1547617, 1547694
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-4-peter.maydell@linaro.org
The UMOPA/UMOPS instructions are supposed to multiply unsigned 8 or
16 bit elements and accumulate the products into a 64-bit element.
In the Arm ARM pseudocode, this is done with the usual
infinite-precision signed arithmetic. However our implementation
doesn't quite get it right, because in the DEF_IMOP_64() macro we do:
sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0);
where NTYPE and MTYPE are uint16_t or int16_t. In the uint16_t case,
the C usual arithmetic conversions mean the values are converted to
"int" type and the multiply is done as a 32-bit multiply. This means
that if the inputs are, for example, 0xffff and 0xffff then the
result is 0xFFFE0001 as an int, which is then promoted to uint64_t
for the accumulation into sum; this promotion incorrectly sign
extends the multiply.
Avoid the incorrect sign extension by casting to int64_t before
the multiply, so we do the multiply as 64-bit signed arithmetic,
which is a type large enough that the multiply can never
overflow into the sign bit.
(The equivalent 8-bit operations in DEF_IMOP_32() are fine, because
the 8-bit multiplies can never overflow into the sign bit of a
32-bit integer.)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2372
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-3-peter.maydell@linaro.org
For an instruction which accesses a 128-bit element tile when
the SVL is also 128 (for example MOV z0.Q, p0/M, ZA0H.Q[w0,0]),
we will assert in get_tile_rowcol():
qemu-system-aarch64: ../../tcg/tcg-op.c:926: tcg_gen_deposit_z_i32: Assertion `len > 0' failed.
This happens because we calculate
len = ctz32(streaming_vec_reg_size(s)) - esz;$
but if the SVL and the element size are the same len is 0, and
the deposit operation asserts.
In this case the ZA storage contains exactly one 128 bit
element ZA tile, and the horizontal or vertical slice is just
that tile. This means that regardless of the index value in
the Ws register, we always access that tile. (In pseudocode terms,
we calculate (index + offset) MOD 1, which is 0.)
Special case the len == 0 case to avoid hitting the assertion
in tcg_gen_deposit_z_i32().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-2-peter.maydell@linaro.org
In bcm2835_property_mbox_push(), some variables are defined at function scope
but used only in a smaller scope of the function:
* tag, bufsize, resplen are used only in the body of the while() loop
* tmp is used only for RPI_FWREQ_SET_POWER_STATE (and is badly named)
Declare these variables in the scope where they're needed, so the code
is easier to read.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723131029.1159908-5-peter.maydell@linaro.org
In the long function bcm2835_property_mbox_push(), the variables
start_num, number and otp_row are used only in the four cases which
access OTP data, and their uses don't overlap with each other.
Make these variables have scope restricted to the cases where they're
used, so it's easier to read each individual case without having to
cross-refer up to the variable declaration at the top of the function
and check whether the variable is also used later in the loop.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723131029.1159908-4-peter.maydell@linaro.org
Coverity points out that in our handling of the property
RPI_FWREQ_SET_CUSTOMER_OTP we have a potential overflow. This
happens because we read start_num and number from the guest as
unsigned 32 bit integers, but then the variable 'n' we use as a loop
counter as we iterate from start_num to start_num + number is only an
"int". That means that if the guest passes us a very large start_num
we will interpret it as negative. This will result in an assertion
failure inside bcm2835_otp_set_row(), which checks that we didn't
pass it an invalid row number.
A similar issue applies to all the properties for accessing OTP rows
where we are iterating through with a start and length read from the
guest.
Use uint32_t for the loop counter to avoid this problem. Because in
all cases 'n' is only used as a loop counter, we can do this as
part of the for(), restricting its scope to exactly where we need it.
Resolves: Coverity CID 1549401
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723131029.1159908-3-peter.maydell@linaro.org
The documentation of the "Set palette" mailbox property at
https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface#set-palette
says it has the form:
Length: 24..1032
Value:
u32: offset: first palette index to set (0-255)
u32: length: number of palette entries to set (1-256)
u32...: RGBA palette values (offset to offset+length-1)
We get this wrong in a couple of ways:
* we aren't checking the offset and length are in range, so the guest
can make us spin for a long time by providing a large length
* the bounds check on our loop is wrong: we should iterate through
'length' palette entries, not 'length - offset' entries
Fix the loop to implement the bounds checks and get the loop
condition right. In the process, make the variables local to
this switch case, rather than function-global, so it's clearer
what type they are when reading the code.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723131029.1159908-2-peter.maydell@linaro.org
kvm_arch_init_vcpu() used to remove PMU when it is not available even
if the CPU model needs one. It is semantically incorrect, and may
continue execution on a misbehaving host that advertises a CPU model
while lacking its PMU. Keep the PMU when the CPU model needs one, and
let kvm_arm_vcpu_init() fail if the KVM implementation mismatches with
our expectation.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target/arm/kvm.c checked PMU availability but unconditionally set the
PMU feature flag for the host CPU model, which is confusing. Set the
feature flag only when available.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When a bare-metal application on the raspi3 board reads the
AUX_MU_STAT_REG MMIO register while the device's buffer is
at full receive FIFO capacity
(i.e. `s->read_count == BCM2835_AUX_RX_FIFO_LEN`) the
assertion `assert(s->read_count < BCM2835_AUX_RX_FIFO_LEN)`
fails.
Reported-by: Cryptjar <cryptjar@junk.studio>
Suggested-by: Cryptjar <cryptjar@junk.studio>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/459
Signed-off-by: Frederik van Hövell <frederik@fvhovell.nl>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMM: commit message tweaks]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some QOM properties are associated with ObjectTypes that already
depend on CONFIG_* switches. So to avoid generating dead code,
let's also make the definition of those properties dependent on
the corresponding CONFIG_*.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20240604135931.311709-1-sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Make SecretKeyringProperties conditional, too]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmai5TsACgkQZ7MCdqhi
# HK4rgA//eh0ax3JnBGma1rVEDL5n5cdEYV+ATFYGc529CUZFUar3IMqSw3in8bJy
# uvQ6Cr/7IuusNEtoiYtdN1yNasqsm3fZB/hZ/Ekz32TsbpBRdkJW3ucavAu2rGM/
# EKRo7Y8gciy/Mj9y2JlIZqsDqYe+gribfGQvIg27DX+caAW/lKQdAdt4oJMTSdmr
# XR8JjtMdhUazKrI+bc/4EG6tIQyUdp+S1/z1q6Wthqt58dNRElTjkD9op4AsUWMu
# CE4a8ALCZoj3P3m+xf7xi7fT2JC2xgmNRCi3KbbhVEHdbFB6ViNYNuEYRS6GmpdC
# C6J/ZR6QXs6KB1KO7EyB+vsuxLX4Eb8aeCFxwMlzJ9Fo4g8JudABXOFzYTKX1xBn
# DUIGX91YACV43M2MvP/KuEU4zWpREO+U8MbQs/6s6fYsnCO2eKVJt/0Aaf1hmk37
# gY5Ak2DRx5TBvxlFy87zgHxHWTh/dGZodpN3IvCIDzVLnHGFlfluJbFRaoZSOecb
# 1vxDHORjIruLcAxNVEGkJ/6MxOrnjjoUzSPUQcbgJ5BpFZOdeGLiMAULu/HBLBd9
# 7dvVw+PeNEPJttYumljOD6nYc/jENhLQsvkc3++bwGNc/rpi4YngtB4jhT1HV2Cl
# oLool2ooKZgV4qx6IzeYo9feElvWVNK5XPzqDpSDlt9MaI+yTYM=
# =FxPm
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 26 Jul 2024 09:52:27 AM AEST
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.1-2-20240726-1' of https://gitlab.com/npiggin/qemu: (96 commits)
target/ppc: Remove includes from mmu-book3s-v3.h
target/ppc/mmu-radix64: Remove externally unused parts from header
target/ppc: Unexport some functions from mmu-book3s-v3.h
target/ppc/mmu-hash32.c: Move get_pteg_offset32() to the header
target/ppc/mmu-hash32.c: Inline and remove ppc_hash32_pte_raddr()
target/ppc/mmu_common.c: Remove mmu_ctx_t
target/ppc/mmu_common.c: Stop using ctx in get_bat_6xx_tlb()
target/ppc: Remove bat_size_prot()
target/ppc/mmu_common.c: Use defines instead of numeric constants
target/ppc/mmu_common.c: Rename function parameter
target/ppc/mmu_common.c: Stop using ctx in ppc6xx_tlb_check()
target/ppc/mmu_common.c: Remove key field from mmu_ctx_t
target/ppc/mmu_common.c: Init variable in function that relies on it
target/ppc/mmu-hash32.c: Inline and remove ppc_hash32_pte_prot()
target/ppc: Add function to get protection key for hash32 MMU
target/ppc/mmu_common.c: Remove ptem field from mmu_ctx_t
target/ppc/mmu_common.c: Inline and remove ppc6xx_tlb_pte_check()
target/ppc/mmu_common.c: Simplify a switch statement
target/ppc/mmu_common.c: Remove single use local variable
target/ppc/mmu_common.c: Convert local variable to bool
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Drop includes from header that is not needed by the header itself and
only include them from C files that really need it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Move the parts not needed outside of mmu-radix64.c from the header to
the C file to leave only parts in the header that need to be exported.
Also drop unneded include of this header.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ppc_hash64_hpt_base() and ppc_hash64_hpt_mask() functions are
mostly used by mmu-hash64.c only but there is one call to
ppc_hash64_hpt_mask() in hw/ppc/spapr_vhyp_mmu.c.in a helper function
that can be moved to mmu-hash64.c which allows these functions to be
removed from the header.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This function is a simple shared function, move it to other similar
static inline functions in the header.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This function is used only once and does not add more clarity than
doing it inline.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Completely get rid of mmu_ctx_t after converting the remaining
functions to pass raddr and prot without the context struct.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
There is already a hash32_bat_prot() function that does most if this
and the rest can be inlined. Export hash32_bat_prot() and rename it to
ppc_hash32_bat_prot() to match other functions and use it in
get_bat_6xx_tlb().
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Replace some BAT related constants with defines from mmu-hash32.h
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Rename parameter of get_bat_6xx_tlb() from virtual to eaddr to match
other functions.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Pass it as a function parameter and remove it from mmu_ctx_t.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ppc6xx_tlb_check() relies on the caller to initialise raddr field
in ctx. Move this init from the only caller into the function.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add a function to get key bit from SR and use it instead of open coded
version.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Instead of passing around ptem in context use it once in the same
function so it can be removed from mmu_ctx_t.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This function is only called once and we can make the caller simpler
by inlining it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmu6xx_get_physical_address() the switch handles all cases so the
default is never reached and can be dropped. Also group together cases
which just return -4.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmu6xx_get_physical_address() tagtet_page_bits local is declared
only to use TARGET_PAGE_BITS once. Drop the unneeded variable.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmu6xx_get_physical_address() ds is used as bool, declare it as
such. Also use named constant instead of hex value.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Pass it as a parameter instead. Also use named constants instead of
hex values when extracting bits from SR.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This function is used only once, its return value is ignored and one
of its parameter is a return value from a previous call. It is better
to inline it in the caller and remove it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Return hash value via a parameter and remove it from mmu_ctx.t.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The eaddr field of mmu_ctx_t is set once but never used so can be
removed.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Invert conditions to avoid deep nested ifs and return early instead.
Remove some obvious comments that don't add more clarity.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Instead of using a local ret variable return directly and remove the
local.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In ppc6xx_tlb_pte_check() the pp variable is used only once to pass it
to a function parameter with the same name. Remove the local and
inline the value. Also use named constant for the hex value to make it
clearer.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In ppc6xx_tlb_pte_check() the pteh variable is used only once to
compare to the h parameter of the function. Inline its value and use
pteh name for the function parameter which is more descriptive.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ptev variable in ppc6xx_tlb_pte_check() is used only once and just
obfuscates an otherwise clear value. Get rid of it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ptem variable in ppc6xx_tlb_pte_check() is used only once,
simplify by removing it as the value is already clear itself without
adding a local name for it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The mmask local variable is a less descriptive local name for a
constant. Drop it and use the constant directly in the two places it
is needed.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reorganise ppc_hash32_pp_prot() swapping the if legs so it does not
test for negative first and clean up to make it shorter. Also rename
it to ppc_hash32_prot().
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Updated many VSX instructions to use tcg_gen_qemu_ld/st_i128, instead of using
tcg_gen_qemu_ld/st_i64 consecutively.
Introduced functions {get,set}_vsr_full to facilitate the above & for future use.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Updated instructions {l, st}vx to use tcg_gen_qemu_ld/st_i128,
instead of using 64 bits loads/stores in succession.
Introduced functions {get, set}_avr_full in vmx-impl.c.inc to
facilitate the above, and potential future usage.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Those functions are used to ld/st data to and from Altivec registers,
in 64 bits chunks, and are only used in vmx-impl.c.inc file,
hence the clean-up movement.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification:
xvcmp{eq, gt, ge, ne}{s, d}p : XX3-form
The changes were verified by validating that the tcg-ops generated for those
instructions remain the same which were captured using the '-d in_asm,op' flag.
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification:
lxv{b16, d2, h8, w4, ds, ws}x : X-form
stxv{b16, d2, h8, w4}x : X-form
The changes were verified by validating that the tcg-ops generated for those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
{l, st}xvl(l) : X-form
The changes were verified by validating that the tcg-ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also added a new function do_ea_calc_ra to calculate the effective address :
EA <- (RA == 0) ? 0 : GPR[RA], which is now used by the above-said insns,
and shall be used later by (p){lx, stx}vp insns.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: Fix 32-bit build]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
lxs{d, iwa, ibz, ihz, iwz, sp}x : X-form
stxs{d, ib, ih, iw, sp}x : X-form
The changes were verified by validating that the tcg-ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
xxl{and, andc, or, orc, nor, xor, nand, eqv} : XX3-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification:
x{s, v}{add, sub, mul, div}{s, d}p : XX3-form
xs{max, min}dp, xv{max, min}{s, d}p : XX3-form
The changes were verfied by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving PPC2_ISA300 flag check out of do_helper_XX3 method in vmx-impl.c.inc
so that the helper can be used with other instructions as well.
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
v{add,sub}{u,s}{b,h,w}s : VX-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Additional END state 'info pic' information as added. The 'ignore',
'crowd' and 'precluded escalation control' bits of an Event Notification
Descriptor are all used when delivering an interrupt targeting a VP-group
or crowd.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Fail VST entry address computation if firmware doesn't define a descriptor
for one of the Virtualization Structure Tables (VST), there's no point in
trying to compute the address of its entry. Abort the operation and log
an error.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Set Translation Table for the NVC port space is missing. The xive model
doesn't take into account the remapping of IO operations via the Set
Translation Table but firmware is allowed to define it for the Notify
Virtual Crowd (NVC), like it's already done for the other VST tables.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Enable NVG and NVC VST tables for index compression which indicates the number
of bits the address is shifted to the right for the table accesses.
The compression values are defined as:
0000 - No compression
0001 - 1 bit shift
0010 - 2 bit shift
....
1000 - 8 bit shift
1001-1111 - No compression
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Both the virtualization layer (VC) and presentation layer (PC) need to
be configured to access the VSTs. Since the information is redundant,
the xive model combines both into one set of tables and only the
definitions going through the VC are kept. The definitions through the
PC are ignored. That works well as long as firmware calls the VC for
all the tables.
For the NVG and NVC tables, it can make sense to only configure them
with the PC, since they are only used by the presenter. So this patch
allows firmware to configure the VST tables through the PC as well.
The definitions are still shared, since the VST tables can be set
through both the VC and/or PC, they are dynamically re-mapped in
memory by first deleting the memory subregion.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The cache watch facility uses the same register interface to handle
entries in the NVP, NVG and NVC tables. A bit-field in the 'watchX
specification' register tells the table type. So far, that bit-field
was not read and the code assumed a read/write to the NVP table.
This patch allows to read/write entries in the NVG and NVC table as
well.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Adds support for writing a completion notification byte in memory
whenever a cache flush or queue sync inject operation is requested by
software. QEMU does not cache any of the XIVE data that is in memory and
therefore it simply writes the completion notification byte at the time
that the operation is requested.
Co-authored-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
With -fsanitize=undefined, which implies -fsanitize=function,
clang will add a "type signature" before functions.
It accesses funcptr-8 and funcptr-4 to do so.
The generated TCG prologue is directly on a page boundary,
so these accesses segfault.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240723232543.18093-1-richard.henderson@linaro.org>
Made changes to some structure and define elements to ease review in
next patchset.
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
XIVE offers a 'cache watch facility', which allows software to read/update
a potentially cached table entry with no software lock. There's one such
facility in the Virtualization Controller (VC) to update the ESB and END
entries and one in the Presentation Controller (PC) to update the
NVP/NVG/NVC entries.
Each facility has 4 cache watch engines to control the updates and
firmware can request an available engine by querying the hardware
'watch_assign' register of the VC or PC. The engine is then reserved and
is released after the data is updated by reading the 'watch_spec' register
(which also allows to check for a conflict during the update).
If no engine is available, the special value 0xFF is returned and
firmware is expected to repeat the request until an engine becomes
available.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In this commit Write a qtest pnv-spi-seeprom-test to check the
SPI transactions between spi controller and seeprom device.
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Acked-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Caleb Schlossin <calebs@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add Microchip's 25CSM04 Serial EEPROM to m25p80. 25CSM04 provides 4 Mbits
of Serial EEPROM utilizing the Serial Peripheral Interface (SPI) compatible
bus. The device is organized as 524288 bytes of 8 bits each (512Kbyte) and
is optimized for use in consumer and industrial applications where reliable
and dependable nonvolatile memory storage is essential.
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In this commit SPI shift engine and sequencer logic is implemented.
Shift engine performs serialization and de-serialization according to the
control by the sequencer and according to the setup defined in the
configuration registers. Sequencer implements the main control logic and
FSM to handle data transmit and data receive control of the shift engine.
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Reviewed-by: Caleb Schlossin <calebs@linux.vnet.ibm.com>
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
SPI controller device model supports a connection to a single SPI responder.
This provide access to SPI seeproms, TPM, flash device and an ADC controller.
All SPI function control is mapped into the SPI register space to enable full
control by firmware. In this commit SPI configuration component is modelled
which contains all SPI configuration and status registers as well as the hold
registers for data to be sent or having been received.
An existing QEMU SSI framework is used and SSI_BUS is created.
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Reviewed-by: Caleb Schlossin <calebs@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
[np: Fix FDT macro compile for qtest]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In this commit target specific dependency from include/hw/ppc/pnv_xscom.h
has been removed so that pnv_xscom.h can be included outside hw/ppc.
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Caleb Schlossin <calebs@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Recent POWER CPUs can operate in "LPAR per core" or "LPAR per thread"
modes. In per-core mode, some SPRs and IPI doorbells are shared between
threads in a core. In per-thread mode, supervisor and user state is
not shared between threads.
OpenPOWER systems after POWER8 use LPAR per thread mode, and it is
required for KVM. Enterprise systems use LPAR per core mode, as they
partition the machine by core.
Implement a lpar-per-core machine option for powernv machines. This
is fixed true for POWER8 machines, and defaults off for P9 and P10.
With this change, powernv8 SMT now works sufficiently to run Linux,
with a single socket. Multi-threaded KVM guests still have problems,
as does multi-socket Linux boot.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The PC unit in the processor core contains xscom registers that provide
low level status and control of the CPU.
This implements "direct controls", sufficient for skiboot firmware,
which uses it to send NMI IPIs between CPUs.
POWER10 is sufficiently different from POWER9 (particularly with respect
to QME and special wakeup) that it is not trivial to implement POWER9
support by reusing the code.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Power CPUs have an execution control facility that can pause, resume,
and cause NMIs, among other things. Add a function that will nmi a CPU
and resume it if it was paused, in preparation for implementing the
control facility.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Big-core implementation is complete, so expose it as a machine
property that may be set with big-core=on option on powernv9 and
powernv10 machines.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER10 has a quirk in its ChipTOD addressing that requires the even
small-core to be selected even when programming the odd small-core.
This allows skiboot chiptod init to run in big-core mode.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Power9 CPUs have a core thread state register accessible via SPRC/SPRD
indirect registers. This register includes a bit for big-core mode,
which skiboot requires.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Power9/10 CPUs have PVR[51] set in small-core mode and clear in big-core
mode. This is used by skiboot firmware.
PVR is not hypervisor-privileged but it is not so important that spapr
to implement this because it's generally masked out of PVR matching code
in kernels, and only used by firmware.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
device-tree building needs to account for big-core mode, because it is
driven by qemu cores (small cores). Every second core should be skipped,
and every core should describe threads for both small-cores that make
up the big core.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER9 and POWER10 machines come in two variants, big-core and
small-core. Big-core machines are SMT8 from software's point of view,
but the low level platform topology ("xscom registers and pervasive
addressing"), these look more like a pair of small cores ganged
together.
Presently the way this is modelled is to create one SMT8 PnvCore and add
special cases to xscom and pervasive for big-core mode that tries to
split this into two small cores, but this is becoming too complicated to
manage.
A better approach is to create 2 core structures and ganging them
together to look like an SMT8 core in TCG. Then the xscom and pervasive
models mostly do not need to differentiate big and small core modes.
This change adds initial mode bits and QEMU topology handling to
split SMT8 cores into 2xSMT4 cores.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The decision to branch out to a slower SMT path in instruction
emulation will become a bit more complicated with the way that
"big-core" topology that will be implemented in subsequent changes.
Hide these details from the wider CPU emulation code with a bool
has_smt_siblings flag that can be set by machine initialisation.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add helpers for TCG code to determine if there are SMT siblings
sharing per-core and per-lpar registers. This simplifies the
callers and makes SMT register topology simpler to modify with
later changes.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The way SMT thread siblings are matched is clunky, using hard-coded
logic that checks the PIR SPR.
Change that to use a new core_index variable in the CPUPPCState,
where all siblings have the same core_index. CPU realize routines have
flexibility in setting core/sibling topology.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The chip_pir chip class method allows the platform to set the PIR
processor identification register. Extend this to a more general
ID function which also allows the TIR to be set. This is in
preparation for "big core", which is a more complicated topology
of cores and threads.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use a class attribute to specify the number of SMT threads per core
permitted for different machines, 8 for powernv8 and 4 for powernv9/10.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
SPRC/SPRD were recently added to all BookS CPUs supported, but
they are only tested on POWER9 and POWER10, so restrict them to
those CPUs.
SPR indirect scratch registers presently replicated per-CPU like
SMT SPRs, but the PnvCore is a better place for them since they
are restricted to P9/P10.
Also add SPR indirect read access to core thread state for POWER9
since skiboot accesses that when booting to check for big-core
mode.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The timebase state machine is per per-core state and can be driven
by any thread in the core. It is currently implemented as a hack
where the state is in a CPU structure and only thread 0's state is
accessed by the chiptod, which limits programming the timebase
side of the state machine to thread 0 of a core.
Move the state out into PnvCore and share it among all threads.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This helps move core state from CPU to core structures.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER8 (ISA v2.07S) introduced the doorbell facility, the msgsnd
instruction behaved mostly like msgsndp, it was addressed by TIR
and could only send interrupts between threads on the core.
ISA v3.0 changed msgsnd to be addressed by PIR and can interrupt
any thread in the system.
msgsnd only implements the v3.0 semantics, which can make
multi-threaded POWER8 hang when booting Linux (due to IPIs
failing). This change adds v2.07 semantics.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
One of the functions of the ADU is indirect memory access engines that
send and receive data via ADU registers.
This implements the ADU LPC memory access functionality sufficiently
for IBM proprietary firmware to access the UART and print characters
to the serial port as it does on real hardware.
This requires a linkage between adu and lpc, which allows adu to
perform memory access in the lpc space.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This implements a framework for an ADU unit model.
The ADU unit actually implements XSCOM, which is the bridge between MMIO
and PIB. However it also includes control and status registers and other
functions that are exposed as PIB (xscom) registers.
To keep things simple, pnv_xscom.c remains the XSCOM bridge
implementation, and pnv_adu.c implements the ADU registers and other
functions.
So far, just the ADU no-op registers in the pnv_xscom.c default handler
are moved over to the adu model.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The POWER8 LPC ISA device irqs all get combined and reported to the line
connected the PSI LPCHC irq. POWER9 changed this so only internal LPC
host controller irqs use that line, and the device irqs get routed to
4 new lines connected to PSI SERIRQ0-3.
POWER9 also introduced a new feature that automatically clears the irq
status in the LPC host controller when EOI'ed, so software does not have
to.
The powernv OPAL (skiboot) firmware managed to work because the LPCHC
irq handler scanned all LPC irqs and handled those including clearing
status even on POWER9 systems. So LPC irqs worked despite OPAL thinking
it was running in POWER9 mode. After this change, UART interrupts show
up on serirq1 which is where OPAL routes them to:
cat /proc/interrupts
...
20: 0 XIVE-IRQ 1048563 Level opal-psi#0:lpchc
...
25: 34 XIVE-IRQ 1048568 Level opal-psi#0:lpc_serirq_mux1
Whereas they previously turn up on lpchc.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The LPC HC irq status register bits are set when an LPC IRQSER input is
asserted. These irq status bits drive the PSI irq to the CPU interrupt
controller. The LPC HC irq status bits are cleared by software writing
to the register with 1's for the bits to clear.
Existing register write was clearing the irq status bits even when the
input was asserted, this results in interrupts being lost.
This fix changes the behavior to keep track of the device IRQ status
in internal state that is separate from the irq status register, and
only allowing the irq status bits to be cleared if the associated
input is not asserted.
Signed-off-by: Glenn Miles <milesg@linux.ibm.com>
[np: rebased before P9 PSI SERIRQ patch, adjust changelog/comments]
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Power10 DD1.0 was dropped in:
commit 8f054d9ee8 ("ppc: Drop support for POWER9 and POWER10 DD1 chips")
Use the newer Power10 DD2 chips cfam id.
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The patch enables HASHPKEYR migration by hooking with the
"KVM one reg" ID KVM_REG_PPC_HASHPKEYR.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The patch enables HASHKEYR migration by hooking with the
"KVM one reg" ID KVM_REG_PPC_HASHKEYR.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The patch enables DEXCR migration by hooking with the
"KVM one reg" ID KVM_REG_PPC_DEXCR.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This is a placeholder change for these SPRs until the full linux
header update.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Every other architecture does this, and debuggers need it to be able to
identify which prstatus note corresponds to which CPU.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
On ppc64, the PowerVM hypervisor runs with limited memory and a VCPU
creation during hotplug may fail during kvm_ioctl for KVM_CREATE_VCPU,
leading to termination of guest since errp is set to &error_fatal while
calling kvm_init_vcpu. This unexpected behaviour can be avoided by
pre-creating and parking vcpu on success or return error otherwise.
This enables graceful error delivery for any vcpu hotplug failures while
the guest can keep running.
Also introducing KVM AccelCPUClass to init cpu_target_realize for kvm.
Tested OK by repeatedly doing a hotplug/unplug of vcpus as below:
#virsh setvcpus hotplug 40
#virsh setvcpus hotplug 70
error: internal error: unable to execute QEMU command 'device_add':
kvmppc_cpu_realize: vcpu hotplug failed with -12
Signed-off by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Suggested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Tested-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This helper provides an easy way to identify the next available free cpu
index which can be used for vcpu creation. Until now, this is being
called at a very later stage and there is a need to be able to call it
earlier (for now, with ppc64) hence the need to export.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
There are distinct helpers for creating and parking a KVM vCPU.
However, there can be cases where a platform needs to create and
immediately park the vCPU during early stages of vcpu init which
can later be reused when vcpu thread gets initialized. This would
help detect failures with kvm_create_vcpu at an early stage.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This cap did not add the migration code when it was introduced. This
results in migration failure when changing the default using the
command line.
Cc: qemu-stable@nongnu.org
Fixes: ccc5a4c5e1 ("spapr: Add SPAPR_CAP_AIL_MODE_3 for AIL mode 3 support for H_SET_MODE hcall")
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
clang-user job with an assertion failure in glibc that seems to
indicate corruption:
signals: allocatestack.c:223: allocate_stack:
Assertion `powerof2 (pagesize_m1 + 1)' failed.
Disable these tests for now.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
aio_context_set_thread_pool_params() takes two int64_t arguments to
set the minimum and maximum number of threads in the pool. We do
some bounds checking on these, but we don't catch the case where the
inputs are negative. This means that later in the function when we
assign these inputs to the AioContext::thread_pool_min and
::thread_pool_max fields, which are of type int, the values might
overflow the smaller type.
A negative number of threads is meaningless, so make
aio_context_set_thread_pool_params() return an error if either min or
max are negative.
Resolves: Coverity CID 1547605
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723150927.1396456-1-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bsd-user: Misc changes for 9.1 (I hope)
V2: Add missing bsd-user/aarch64/target.h
This patch series includes two main sets of patches. To make it simple to
review, I've included the changes from my student which the later changes depend
on. I've included a change from Jessica and Doug as well. I've reviewed them,
but more eyes never hurt.
I've also included a number of 'touch up' patches needed either to get the
aarch64 building, or to implmement suggestions from prior review cycles. The
main one is what's charitably described as a kludge: force aarch64 to use 4k
pages. The qemu-project (and blitz branch) hasn't had the necessary changes to
bsd-user needed to support variable page size.
Sorry this is so late... Live has conspired to delay me.
# -----BEGIN PGP SIGNATURE-----
# Comment: GPGTools - https://gpgtools.org
#
# iQIzBAABCgAdFiEEIDX4lLAKo898zeG3bBzRKH2wEQAFAmahejwACgkQbBzRKH2w
# EQCXuQ/+Pj1Izmox/y9X1trn1T8KC7JdMtimdLiGMaS4C6+gcThXJkIB4l9ZStbV
# 7rI540mpqVf0KSRLYwc2/ATyhYU7Ffsz02WPn7Xn/NvmmITp4kjw9Z0gd7C7mPVq
# fS8DJbTyFQDy5dO8FUKLaTfnlYQe+NCnL421t9wFkIrlEepFygRaBaJN5yWVoC+0
# 1Ob6dG+JEV5BmNguMufvvI3S7nEFEnSBGpNqW3ljrRHAZjdNhv8d9GBYbj1laR1r
# HQ6r5+u4ZmKCuUbchS0jxGkug0DjuQC7iq+rQ/7fhLYLChkPZ4P2RxNv8ibzKjEV
# wlTy5LaM+WZNzKWdcHfDFMomeSnnUkOOfAMipMney2jedEjTIwCFDnP4zCAuG83V
# RbdXWfleP1rDto3AQ765pFneqm3+su2Dh4TKaTSnq6gd1eORJ2IL8dubCfcVwZCy
# TofemXPWh0HX3kwlD9IB9rqplQZFL78TkQ47btftxinHCLCQOOHRDPVG0IahQPjo
# pgK4yVH7WA7pWV2Xbo4ngG3sX5U1TyBCbfkkAwhq+P3gjnU8zxonx8Tk/qLeEDdH
# KEypi/pkGFQKZY0wc/y4XM+XQh6E1l8gMaQ4gJWK1qlyVtUKM1BiNQ2lweohYzC8
# p6WAfBQLPpzY4mDWfJMF6DsgObLwWmYbgKzuOtHgST1D/Ebk3Zo=
# =RPuN
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Jul 2024 08:03:40 AM AEST
# gpg: using RSA key 2035F894B00AA3CF7CCDE1B76C1CD1287DB01100
# gpg: Good signature from "Warner Losh <wlosh@netflix.com>" [unknown]
# gpg: aka "Warner Losh <imp@bsdimp.com>" [unknown]
# gpg: aka "Warner Losh <imp@freebsd.org>" [unknown]
# gpg: aka "Warner Losh <imp@village.org>" [unknown]
# gpg: aka "Warner Losh <wlosh@bsdimp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2035 F894 B00A A3CF 7CCD E1B7 6C1C D128 7DB0 1100
* tag 'bsd-user-for-9.1-pull-request' of gitlab.com:bsdimp/qemu:
bsd-user: Add target.h for aarch64.
bsd-user: Add aarch64 build to tree
bsd-user: Make compile for non-linux user-mode stuff
bsd-user: Define TARGET_SIGSTACK_ALIGN and use it to round stack
bsd-user: Sync fork_start/fork_end with linux-user
bsd-user: Hard wire aarch64 to be 4k pages only
bsd-user: Simplify the implementation of execve
bsd-user:Add AArch64 improvements and signal handling functions
bsd-user:Add set_mcontext function for ARM AArch64
bsd-user:Add setup_sigframe_arch function for ARM AArch64
bsd-user:Add get_mcontext function for ARM AArch64
bsd-user:Add ARM AArch64 signal handling support
bsd-user:Add ARM AArch64 support and capabilities
bsd-user:Add AArch64 register handling and related functions
bsd-user:Add CPU initialization and management functions
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Crypto patches
* Drop unused 'detached-header' QAPI field from LUKS create options
* Improve tracing of TLS sockets and TLS chardevs
* Improve error messages from TLS I/O failures
* Add docs about use of LUKS detached header options
* Allow building without libtasn1, but with GNUTLS
* Fix detection of libgcrypt when libgcrypt-config is absent
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmagzXUACgkQvobrtBUQ
# T9++chAAhCFgo5A/UjQGdl9UAOW/sdgOoHGE3E8Y6sSTQyv+EfHf1DO89JtAh4ft
# d8Hz7Taul4k1wRm6Dxv2aCqH5iS1tgDE2ghGDNwn/zDtHNnjFx3+HcxBaAEcpt3O
# FqvGeG6KdFO1t2UR2DMh1XbhfwygrHiIcSB2y8jrgi46ncS6JvLrFavjLTe7JBn9
# J3y/iYgQiVPN6UlIwUs1EquGdoTI/0SpHVirqHN/2yyrdRsGBsXZq5WI6Oli8zFL
# VqJNmc5Dzo7ushoYG5Rpk83mmC26VuXO/JmXyJ/c7FeADLWUfc/SPPyAMxPGuwFr
# DKg84ovRtq3yZIw8LPoUJOtbcu4Y7BSGwlolQjWegvsVTU6Bdk+teZVR9X64QbM2
# YBXzMkRHUKzR3rb0LewAKehP3n93aBypLln9ZMgg7wj92Rj8Dl/sylaBhDEkH/HQ
# 2pMdSdAWqMnGHfnKPxyjflNO2PIsOenZUkDZwf9i7Ow6fU5n3fqvudVDTWjXpWPn
# V7v9JGNPHocScJFRUqHSVqd2ZWaZX4F1TsvG6SGOmzDGR0IjBRlqos7OEdbAAH1x
# IglizbTxD6M9ZWJrGt1sl6LSAwEp3oXgsWNdejq2+7I6H4BeUm4ACDbdrEjqG9aG
# Ya/HpNT0PEzbGXm6qsuHY5z0agGtaPwdXLcSGnsv+a0rP/9nthY=
# =ccYf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 07:46:29 PM AEST
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
* tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu:
crypto: propagate errors from TLS session I/O callbacks
crypto: push error reporting into TLS session I/O APIs
crypto: drop gnutls debug logging support
chardev: add tracing of socket error conditions
meson: build chardev trace files when have_block
qapi: drop unused QCryptoBlockCreateOptionsLUKS.detached-header
meson.build: fix libgcrypt detection on system without libgcrypt-config
docs/devel: Add introduction to LUKS volume with detached header
crypto: Allow building with GnuTLS but without Libtasn1
crypto: Restrict pkix_asn1_tab[] to crypto-tls-x509-helpers.c
crypto: Remove 'crypto-tls-x509-helpers.h' from crypto-tls-psk-helpers.c
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
GNUTLS doesn't know how to perform I/O on anything other than plain
FDs, so the TLS session provides it with some I/O callbacks. The
GNUTLS API design requires these callbacks to return a unix errno
value, which means we're currently loosing the useful QEMU "Error"
object.
This changes the I/O callbacks in QEMU to stash the "Error" object
in the QCryptoTLSSession class, and fetch it when seeing an I/O
error returned from GNUTLS, thus preserving useful error messages.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The current TLS session I/O APIs just return a synthetic errno
value on error, which has been translated from a gnutls error
value. This looses a large amount of valuable information that
distinguishes different scenarios.
Pushing population of the "Error *errp" object into the TLS
session I/O APIs gives more detailed error information.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
GNUTLS already supports dynamically enabling its logging at runtime by
setting the env var 'GNUTLS_DEBUG_LEVEL=10', so there is no need to
re-invent this logic in QEMU in a way that requires a re-compile.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This adds trace points to every error scenario in the chardev socket
backend that can lead to termination of the connection.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The QSD depends on chardev code, and is built when have_tools is
true. This means conditionalizing chardev trace on have_system
is wrong, we need have_block which is set have_system || have_tools.
This latent bug was historically harmless because only the spice
chardev included tracing, which wasn't built in a !have_system
scenario.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The 'detached-header' field in QCryptoBlockCreateOptionsLUKS
was left over from earlier patch iterations.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
libgcrypt starts providing correct pkg-config configuration since 1.9,
in parallel with libgcrypt-config. Since 1.11 it may also stop
installing libgcrypt-config in some scenarios. Use the auto method for
detection of libgcrypt, in which meson will try both pkg-config and
libgcrypt-config.
Auto method for libgcrypt is supported by meson since 0.49.0, which is
higher than the version qemu requires.
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We only use Libtasn1 in unit tests. As noted in commit d47b83b118
("tests: add migration tests of TLS with x509 credentials"), having
GnuTLS without Libtasn1 is a valid configuration, so do not require
Libtasn1, to avoid:
Dependency gnutls found: YES 3.7.1 (cached)
Run-time dependency libtasn1 found: NO (tried pkgconfig)
../meson.build:1914:10: ERROR: Dependency "libtasn1" not found, tried pkgconfig
Fixes: ba7ed407e6 ("configure, meson: convert libtasn1 detection to meson")
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
pkix_asn1_tab[] is only accessed by crypto-tls-x509-helpers.c,
rename pkix_asn1_tab.c as pkix_asn1_tab.c.inc and include it once.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[berrange: updated MAINTAINERS for changed filename]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
crypto-tls-psk-helpers.c doesn't access the declarations
of "crypto-tls-x509-helpers.h", remove the include line
to avoid when building with GNUTLS but without Libtasn1:
In file included from tests/unit/crypto-tls-psk-helpers.c:23:
tests/unit/crypto-tls-x509-helpers.h:26:10: fatal error:
libtasn1.h: No such file or directory
26 | #include <libtasn1.h>
| ^~~~~~~~~~~~
compilation terminated.
Fixes: e1a6dc91dd ("crypto: Implement TLS Pre-Shared Keys (PSK).")
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Fix for 9.1
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZqDBEQAKCRBAov/yOSY+
# 3/3nA/9xWl3qkj95WmBICIi/K9doeA54k1h4d7g/K8+UHV+hDIlEDoidgesJSveH
# RmnE6wmJTb6QhT4IQrO5ERwE5U5DqINDhxAcX5GLfBjEtTFbLCYqEaKlN8pNh6MR
# Qki+SGKSa50kxAMgcB6vkld9uTIWczXzMTb3IxGCWs4VKPhJYw==
# =y//7
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 06:53:37 PM AEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240724' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix helper_lddir() a CID INTEGER_OVERFLOW issue
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Misc HW patch queue
- Restrict probe_access*() functions to TCG (Phil)
- Extract do_invalidate_device_tlb from vtd_process_device_iotlb_desc (Clément)
- Fixes in Loongson IPI model (Bibo & Phil)
- Make docs/interop/firmware.json compatible with qapi-gen.py script (Thomas)
- Correct MPC I2C MMIO region size (Zoltan)
- Remove useless cast in Loongson3 Virt machine (Yao)
- Various uses of range overlap API (Yao)
- Use ERRP_GUARD macro in nubus_virtio_mmio_realize (Zhao)
- Use DMA memory API in Goldfish UART model (Phil)
- Expose fifo8_pop_buf and introduce fifo8_drop (Phil)
- MAINTAINERS updates (Zhao, Phil)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmagFF8ACgkQ4+MsLN6t
# wN5bKg//f5TwUhsy2ff0FJpHheDOj/9Gc2nZ1U/Fp0E5N3sz3A7MGp91wye6Xwi3
# XG34YN9LK1AVzuCdrEEs5Uaxs1ZS1R2mV+fZaGHwYYxPDdnXxGyp/2Q0eyRxzbcN
# zxE2hWscYSZbPVEru4HvZJKfp4XnE1cqA78fJKMAdtq0IPq38tmQNRlJ+gWD9dC6
# ZUHXPFf3DnucvVuwqb0JYO/E+uJpcTtgR6pc09Xtv/HFgMiS0vKZ1I/6LChqAUw9
# eLMpD/5V2naemVadJe98/dL7gIUnhB8GTjsb4ioblG59AO/uojutwjBSQvFxBUUw
# U5lX9OSn20ouwcGiqimsz+5ziwhCG0R6r1zeQJFqUxrpZSscq7NQp9ygbvirm+wS
# edLc8yTPf4MtYOihzPP9jLPcXPZjEV64gSnJISDDFYWANCrysX3suaFEOuVYPl+s
# ZgQYRVSSYOYHgNqBSRkPKKVUxskSQiqLY3SfGJG4EA9Ktt5lD1cLCXQxhdsqphFm
# Ws3zkrVVL0EKl4v/4MtCgITIIctN1ZJE9u3oPJjASqSvK6EebFqAJkc2SidzKHz0
# F3iYX2AheWNHCQ3HFu023EvFryjlxYk95fs2f6Uj2a9yVbi813qsvd3gcZ8t0kTT
# +dmQwpu1MxjzZnA6838R6OCMnC+UpMPqQh3dPkU/5AF2fc3NnN8=
# =J/I2
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 06:36:47 AM AEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240723' of https://github.com/philmd/qemu: (28 commits)
MAINTAINERS: Add myself as a reviewer of machine core
MAINTAINERS: Cover guest-agent in QAPI schema
util/fifo8: Introduce fifo8_drop()
util/fifo8: Expose fifo8_pop_buf()
util/fifo8: Rename fifo8_pop_buf() -> fifo8_pop_bufptr()
util/fifo8: Rename fifo8_peek_buf() -> fifo8_peek_bufptr()
util/fifo8: Use fifo8_reset() in fifo8_create()
util/fifo8: Fix style
chardev/char-fe: Document returned value on error
hw/char/goldfish: Use DMA memory API
hw/nubus/virtio-mmio: Fix missing ERRP_GUARD() in realize handler
dump: make range overlap check more readable
crypto/block-luks: make range overlap check more readable
system/memory_mapping: make range overlap check more readable
sparc/ldst_helper: make range overlap check more readable
cxl/mailbox: make range overlap check more readable
util/range: Make ranges_overlap() return bool
hw/mips/loongson3_virt: remove useless type cast
hw/i2c/mpc_i2c: Fix mmio region size
docs/interop/firmware.json: convert "Example" section
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
vfio queue:
* IOMMUFD Dirty Tracking support
* Fix for a possible SEGV in IOMMU type1 container
* Dropped initialization of host IOMMU device with mdev devices
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmafyVUACgkQUaNDx8/7
# 7KGebRAAzEYxvstDxSPNF+1xx937TKbRpiKYtspTfEgu4Ht50MwO2ZqnVWzTBSwa
# qcjhDf2avMBpBvkp4O9fR7nXR0HRN2KvYrBSThZ3Qpqu4KjxCAGcHI5uYmgfizYh
# BBLrw3eWME5Ry220TinQF5KFl50vGq7Z/mku5N5Tgj2qfTfCXYK1Kc19SyAga49n
# LSokTIjZAGJa4vxrE7THawaEUjFRjfCJey64JUs/TPJaGr4R1snJcWgETww6juUE
# 9OSw/xl0AoQhaN/ZTRC1qCsBLUI2MVPsC+x+vqVK62HlTjCx+uDRVQ8KzfDzjCeH
# gaLkMjxJSuJZMpm4UU7DBzDGEGcEBCGeNyFt37BSqqPPpX55CcFhj++d8vqTiwpF
# YzmTNd/znxcZTw6OJN9sQZohh+NeS86CVZ3x31HD3dXifhRf17jbh7NoIyi+0ZCb
# N+mytOH5BXsD+ddwbk+yMaxXV43Fgz7ThG5tB1tjhhNtLZHDA5ezFvGZ5F/FJrqE
# xAbjOhz5MC+RcOVNSzQJCULNqFpfE6Gqeys6btEDm/ltf4LpAe6W1HYuv8BJc19T
# UsqGK2yKAuQX8GErYxJ1zqZCttVrgpsmXFYTC5iGbxC84mvsF0Iti96IdXz9gfzN
# Vlb2OxoefcOwVqIhbkvTZW0ZwYGGDDPAYhLMfr5lSuRqj123OOo=
# =cViP
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 01:16:37 AM AEST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20240723-1' of https://github.com/legoater/qemu:
vfio/common: Allow disabling device dirty page tracking
vfio/migration: Don't block migration device dirty tracking is unsupported
vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
vfio/iommufd: Probe and request hwpt dirty tracking capability
vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
vfio/{iommufd,container}: Remove caps::aw_bits
vfio/iommufd: Introduce auto domain creation
vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
vfio/pci: Extract mdev check into an helper
hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target/i386/kvm: support for reading RAPL MSRs using a helper program
* hpet: emulation improvements
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
# qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
# Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
# jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
# jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
# 3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
# =00mB
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 03:19:58 AM AEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
hpet: avoid timer storms on periodic timers
hpet: store full 64-bit target value of the counter
hpet: accept 64-bit reads and writes
hpet: place read-only bits directly in "new_val"
hpet: remove unnecessary variable "index"
hpet: ignore high bits of comparator in 32-bit mode
hpet: fix and cleanup persistence of interrupt status
Add support for RAPL MSRs in KVM/Qemu
tools: build qemu-vmsr-helper
qio: add support for SO_PEERCRED for socket channel
target/i386: do not crash if microvm guest uses SGX CPUID leaves
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
virtio,pci,pc: features,fixes
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
hw/nvme: Add SPDM over DOE support
backends: Initial support for SPDM socket support
hw/pci: Add all Data Object Types defined in PCIe r6.0
tests/acpi: Add expected ACPI AML files for RISC-V
tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
tests/acpi: Add empty ACPI data files for RISC-V
tests/qtest/bios-tables-test.c: Remove the fall back path
tests/acpi: update expected DSDT blob for aarch64 and microvm
acpi/gpex: Create PCI link devices outside PCI root bridge
tests/acpi: Allow DSDT acpi table changes for aarch64
hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
hw/vfio/common: Add vfio_listener_region_del_iommu trace event
virtio-iommu: Remove the end point on detach
virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
virtio-iommu: Remove probe_done
Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
gdbstub: Add helper function to unregister GDB register space
physmem: Add helper function to destroy CPU AddressSpace
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since fifo8_pop_buf() return a const buffer (which points
directly into the FIFO backing store). Rename it using the
'bufptr' suffix to better reflect that it is a pointer to
the internal buffer that is being returned. This will help
differentiate with methods *copying* the FIFO data.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20240722160745.67904-6-philmd@linaro.org>
Since fifo8_peek_buf() return a const buffer (which points
directly into the FIFO backing store). Rename it using the
'bufptr' suffix to better reflect that it is a pointer to
the internal buffer that is being returned. This will help
differentiate with methods *copying* the FIFO data.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20240722160745.67904-5-philmd@linaro.org>
Rather than using address_space_rw(..., 0 or 1),
use the simpler DMA memory API which expand to
the same code. This allows removing a cast on
the 'buf' variable which is really const. Since
'buf' is only used in the CMD_READ_BUFFER case,
we can reduce its scope.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240723181850.46000-1-philmd@linaro.org>
According to the comment in qapi/error.h, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
In nubus_virtio_mmio_realize(), @errp is dereferenced without
ERRP_GUARD().
Although nubus_virtio_mmio_realize() - as a DeviceClass.realize()
method - is never passed a null @errp argument, it should follow the
rules on @errp usage. Add the ERRP_GUARD() there.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240723161802.1377985-1-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The last register of this device is at offset 0x14 occupying 8 bits so
to cover it the mmio region needs to be 0x15 bytes long. Also correct
the name of the field storing this register value to match the
register name.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Fixes: 7abb479c7a ("PPC: E500: Add FSL I2C controller")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240721225506.B32704E6039@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit 3c5f6114d9 ("qapi: remove "Example" doc section")
the "Example" section is not valid anymore.
It has been replaced by the "qmp-example" directive.
This was not detected earlier as firmware.json was not validated.
As this validation is about to be added, adapt firmware.json.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Message-ID: <20240719-qapi-firmware-json-v6-3-c2e3de390b58@linutronix.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Only a small subset of all architectures supported by qemu make use of
firmware files. Introduce and use a new enum to represent this.
This also removes the dependency to machine.json from the global qapi
definitions.
Claim "Since: 3.0" for the new enum, because that's correct for most of
its members, and the members are what matters in the interface.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240719-qapi-firmware-json-v6-2-c2e3de390b58@linutronix.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Only a small subset of all blockdev drivers make sense for firmware
images. Introduce and use a new enum to represent this.
This also reduces the dependency on firmware.json from the global qapi
definitions.
Claim "Since: 3.0" for the new enum, because that's correct for its
members, and the members are what matters in the interface.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240719-qapi-firmware-json-v6-1-c2e3de390b58@linutronix.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Once initialised, QOM objects can be realized and
unrealized multiple times before being finalized.
Resources allocated in REALIZE must be deallocated
in an equivalent UNREALIZE handler.
Free the CPU array in loongson_ipi_unrealize()
instead of loongson_ipi_finalize().
Cc: qemu-stable@nongnu.org
Fixes: 5e90b8db38 ("hw/loongarch: Set iocsr address space per-board rather than percpu")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240723111405.14208-3-philmd@linaro.org>
Add the aarch64 bsd-user fragments needed to build the new aarch64 code.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We include the files that define PR_MTE_TCF_SHIFT only on Linux, but use
them unconditionally. Restrict its use to Linux-only.
"It's ugly, but it's not actually wrong."
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Most (all?) targets require stacks to be properly aligned. Rather than a
series of ifdefs in bsd-user/signal.h, instead use a manditory #define
for all architectures.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Only support 4k pages for aarch64 binaries. The variable page size stuff
isn't working just yet, so put in this lessor-of-evils kludge until that
is complete.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This removes the logic which prepends the emulator to each call to
execve and fexecve. This is not necessary with the existing
imgact_binmisc support and it avoids the need to install the emulator
binary into jail environments when using 'binmiscctl --pre-open'.
Signed-off-by: Doug Rabson <dfr@rabson.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Added get_ucontext_sigreturn function to check processor state ensuring current execution mode is EL0 and no flags
indicating interrupts or exceptions are set.
Updated AArch64 code to use CF directly without reading/writing the entire processor state, improving efficiency.
Changed FP data structures to use Int128 instead of __uint128_t, leveraging QEMU's generic mechanism for referencing this type.
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-9-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Added sigcode setup function for signal trampoline which initializes a sequence of instructions
to handle signal returns and exits, copying this code to the target offset.
Defined ARM AArch64 specific signal definitions including register indices and sizes,
and introduced structures to represent general purpose registers, floating point registers, and machine context.
Added function to set up signal handler arguments, populating register values in `CPUARMState`
based on the provided signal, signal frame, signal action, and frame address.
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-5-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Added function to access rval2 by accessing the x1 register.
Defined ARM AArch64 ELF parameters including mmap and dynamic load addresses.
Introduced extensive hardware capability definitions and macros for retrieving hardware capability (hwcap) flags.
Implemented function to retrieve ARM AArch64 hardware capabilities using the `GET_FEATURE_ID` macro.
Added function to retrieve extended ARM AArch64 hardware capability flags.
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Co-authored-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-4-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Added header file for managing CPU register states in FreeBSD user mode.
Introduced prototypes for setting and getting thread-local storage (TLS).
Implemented AArch64 sysarch() system call emulation and a printing function.
Added function for setting up thread upcall to add thread support to BSD-USER.
Initialized thread's register state during thread setup.
Updated ARM AArch64 VM parameter definitions for bsd-user, including address spaces for FreeBSD/arm64 and
a function for getting the stack pointer from CPU and setting a return value.
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Co-authored-by: Sean Bruno <sbruno@freebsd.org>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-3-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Added function to initialize ARM CPU and check if it supports 64-bit mode.
Implemented CPU loop function to handle exceptions and emulate execution of instructions.
Added function to clone CPU state to create a new thread.
Included AArch64 specific CPU functions for bsd-user to set and receive thread-local-storage
value from the tpidr_el0 register.
Introduced structure for storing CPU register states for BSD-USER.
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Kyle Evans <kevans@freebsd.org>
Co-authored-by: Sean Bruno <sbruno@freebsd.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-2-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
The property 'x-pre-copy-dirty-page-tracking' allows disabling the whole
tracking of VF pre-copy phase of dirty page tracking, though it means
that it will only be used at the start of the switchover phase.
Add an option that disables the VF dirty page tracking, and fall
back into container-based dirty page tracking. This also allows to
use IOMMU dirty tracking even on VFs with their own dirty
tracker scheme.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
By default VFIO migration is set to auto, which will support live
migration if the migration capability is set *and* also dirty page
tracking is supported.
For testing purposes one can force enable without dirty page tracking
via enable-migration=on, but that option is generally left for testing
purposes.
So starting with IOMMU dirty tracking it can use to accommodate the lack of
VF dirty page tracking allowing us to minimize the VF requirements for
migration and thus enabling migration by default for those too.
While at it change the error messages to mention IOMMU dirty tracking as
well.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[ clg: - spelling in commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
ioctl(iommufd, IOMMU_HWPT_GET_DIRTY_BITMAP, arg) is the UAPI
that fetches the bitmap that tells what was dirty in an IOVA
range.
A single bitmap is allocated and used across all the hwpts
sharing an IOAS which is then used in log_sync() to set Qemu
global bitmaps.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
ioctl(iommufd, IOMMU_HWPT_SET_DIRTY_TRACKING, arg) is the UAPI that
enables or disables dirty page tracking. The ioctl is used if the hwpt
has been created with dirty tracking supported domain (stored in
hwpt::flags) and it is called on the whole list of iommu domains.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
In preparation to using the dirty tracking UAPI, probe whether the IOMMU
supports dirty tracking. This is done via the data stored in
hiod::caps::hw_caps initialized from GET_HW_INFO.
Qemu doesn't know if VF dirty tracking is supported when allocating
hardware pagetable in iommufd_cdev_autodomains_get(). This is because
VFIODevice migration state hasn't been initialized *yet* hence it can't pick
between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports
dirty tracking it always creates HWPTs with IOMMU_HWPT_ALLOC_DIRTY_TRACKING
even if later on VFIOMigration decides to use VF dirty tracking instead.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[ clg: - Fixed vbasedev->iommu_dirty_tracking assignment in
iommufd_cdev_autodomains_get()
- Added warning for heterogeneous dirty page tracking support
in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Move the HostIOMMUDevice::realize() to be invoked during the attach of the device
before we allocate IOMMUFD hardware pagetable objects (HWPT). This allows the use
of the hw_caps obtained by IOMMU_GET_HW_INFO that essentially tell if the IOMMU
behind the device supports dirty tracking.
Note: The HostIOMMUDevice data from legacy backend is static and doesn't
need any information from the (type1-iommu) backend to be initialized.
In contrast however, the IOMMUFD HostIOMMUDevice data requires the
iommufd FD to be connected and having a devid to be able to successfully
GET_HW_INFO. This means vfio_device_hiod_realize() is called in
different places within the backend .attach_device() implementation.
Suggested-by: Cédric Le Goater <clg@redhat.cm>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[ clg: Fixed error handling in iommufd_cdev_attach() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Store the value of @caps returned by iommufd_backend_get_device_info()
in a new field HostIOMMUDeviceCaps::hw_caps. Right now the only value is
whether device IOMMU supports dirty tracking (IOMMU_HW_CAP_DIRTY_TRACKING).
This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Remove caps::aw_bits which requires the bcontainer::iova_ranges being
initialized after device is actually attached. Instead defer that to
.get_cap() and call vfio_device_get_aw_bits() directly.
This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
There's generally two modes of operation for IOMMUFD:
1) The simple user API which intends to perform relatively simple things
with IOMMUs e.g. DPDK. The process generally creates an IOAS and attaches
to VFIO and mainly performs IOAS_MAP and UNMAP.
2) The native IOMMUFD API where you have fine grained control of the
IOMMU domain and model it accordingly. This is where most new feature
are being steered to.
For dirty tracking 2) is required, as it needs to ensure that
the stage-2/parent IOMMU domain will only attach devices
that support dirty tracking (so far it is all homogeneous in x86, likely
not the case for smmuv3). Such invariant on dirty tracking provides a
useful guarantee to VMMs that will refuse incompatible device
attachments for IOMMU domains.
Dirty tracking insurance is enforced via HWPT_ALLOC, which is
responsible for creating an IOMMU domain. This is contrast to the
'simple API' where the IOMMU domain is created by IOMMUFD automatically
when it attaches to VFIO (usually referred as autodomains) but it has
the needed handling for mdevs.
To support dirty tracking with the advanced IOMMUFD API, it needs
similar logic, where IOMMU domains are created and devices attached to
compatible domains. Essentially mimicking kernel
iommufd_device_auto_get_domain(). With mdevs given there's no IOMMU domain
it falls back to IOAS attach.
The auto domain logic allows different IOMMU domains to be created when
DMA dirty tracking is not desired (and VF can provide it), and others where
it is. Here it is not used in this way given how VFIODevice migration
state is initialized after the device attachment. But such mixed mode of
IOMMU dirty tracking + device dirty tracking is an improvement that can
be added on. Keep the 'all of nothing' of type1 approach that we have
been using so far between container vs device dirty tracking.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
[ clg: Added ERRP_GUARD() in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.
Fixes: 9305895201 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.
Fixes: 9305895201 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
In preparation to implement auto domains have the attach function
return the errno it got during domain attach instead of a bool.
-EINVAL is tracked to track domain incompatibilities, and decide whether
to create a new IOMMU domain.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
The helper will be able to fetch vendor agnostic IOMMU capabilities
supported both by hardware and software. Right now it is only iommu dirty
tracking.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
mdevs aren't "physical" devices and when asking for backing IOMMU info, it
fails the entire provisioning of the guest. Fix that by skipping
HostIOMMUDevice initialization in the presence of mdevs, and skip setting
an iommu device when it is known to be an mdev.
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: 9305895201 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
In preparation to skip initialization of the HostIOMMUDevice for mdev,
extract the checks that validate if a device is an mdev into helpers.
A vfio_device_is_mdev() is created, and subsystems consult VFIODevice::mdev
to check if it's mdev or not.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
In vfio_connect_container's error path, the base container is
removed twice form the VFIOAddressSpace QLIST: first on the
listener_release_exit label and second, on free_container_exit
label, through object_unref(container), which calls
vfio_container_instance_finalize().
Let's remove the first instance.
Fixes: 938026053f ("vfio/container: Switch to QOM")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
qga-pull-2024-07-23
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmafUs0ACgkQ711egWG6
# hOffwQ/+PMFMOq3jwV11Na0GnrFHT0SLlcxNWYGQjE0Q/nwuYWMTKdo2iB9rVC7T
# qxaT6PLtTZPgRsJudJ5kkvLFw88Nr6BuWl31tCVeALUO7C0oTg/oRDfYVeH4/jfG
# PS5TiM6ie27SvI5lhGZhd9sRAy8N6NGgT6Fh+pS2tVVfftcfVYKVmnzgtvk314A+
# MpeW8ukVruSW+9G+suXaE750g/drZJAoepC5pW1HXdHE+IuzXNdMWZqwMqBZSM5T
# X8VcLvMjFrFrfLOP2el6mloriw67aJyKe9Uwsp548HdXfZKrLCmaR7cZK5zKVQDK
# Rzolyuw19wNNi0TZAwmP+MBioDiIHcM4nNhVDCHIVCbXzQHa4BhAr/cr8uucyfM5
# hdCWmaTl4Tksk4q4ooHurDWshV26QNRbLRD1Vx1Rhrwz42MmU2VG13PsSWqLj00I
# fj1LzhQOmr26cewgayIL7ODwHDXiwKi+6lKS1OyTjXXubucScgxSyTNC785T6Rvk
# T58KAnBRD3vDhE7Dn/4KdRClRFY+7R2/jcHdFnA4vfvOVV8ZXp/m0O0wfLEikH6/
# dGDDVBLNG5gqV477++0wdqkYFq6MmON3PH/EA6rgZYc4At5kS+HFNASBvnFRYMGf
# dgtyj8jV5uoffqYOqyXxClP6eTgV1EZ0/wKZ8uJipivB7azjnkE=
# =xzjT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 04:50:53 PM AEST
# gpg: using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423 EB84 EF5D 5E81 61BA 84E7
* tag 'qga-pull-2024-07-23' of https://github.com/kostyanf14/qemu: (25 commits)
qga/linux: Add new api 'guest-network-get-route'
guest-agent: document allow-rpcs in config file section
qga/commands-posix: Make ga_wait_child() return boolean
qga: centralize logic for disabling/enabling commands
qga: allow configuration file path via the cli
qga: remove pointless 'blockrpcs_key' variable
qga: move declare of QGAConfig struct to top of file
qga: don't disable fsfreeze commands if vss_init fails
qga: conditionalize schema for commands not supported on other UNIX
qga: conditionalize schema for commands requiring utmpx
qga: conditionalize schema for commands requiring libudev
qga: conditionalize schema for commands requiring fstrim
qga: conditionalize schema for commands requiring fsfreeze
qga: conditionalize schema for commands only supported on Windows
qga: conditionalize schema for commands requiring linux/win32
qga: conditionalize schema for commands requiring getifaddrs
qga: conditionalize schema for commands unsupported on non-Linux POSIX
qga: conditionalize schema for commands unsupported on Windows
qga: move CONFIG_FSFREEZE/TRIM to be meson defined options
qga: move linux memory block command impls to commands-linux.c
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
accel/tcg: Export set/clear_helper_retaddr
target/arm: Use set_helper_retaddr for dc_zva, sve and sme
target/ppc: Tidy dcbz helpers
target/ppc: Use set_helper_retaddr for dcbz
target/s390x: Use set_helper_retaddr in mem_helper.c
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmafJKIdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+FBAf7Bup+karxeGHZx2rN
# cPeF248bcCWTxBWHK7dsYze4KqzsrlNIJlPeOKErU2bbbRDZGhOp1/N95WVz+P8V
# 6Ny63WTsAYkaFWKxE6Jf0FWJlGw92btk75pTV2x/TNZixg7jg0vzVaYkk0lTYc5T
# m5e4WycYEbzYm0uodxI09i+wFvpd+7WCnl6xWtlJPWZENukvJ36Ss43egFMDtuMk
# vTJuBkS9wpwZ9MSi6EY6M+Raieg8bfaotInZeDvE/yRPNi7CwrA7Dgyc1y626uBA
# joGkYRLzhRgvT19kB3bvFZi1AXa0Pxr+j0xJqwspP239Gq5qezlS5Bv/DrHdmGHA
# jaqSwg==
# =XgUE
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 01:33:54 PM AEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240723' of https://gitlab.com/rth7680/qemu:
target/riscv: Simplify probing in vext_ldff
target/s390x: Use set/clear_helper_retaddr in mem_helper.c
target/s390x: Use user_or_likely in access_memmove
target/s390x: Use user_or_likely in do_access_memset
target/ppc: Improve helper_dcbz for user-only
target/ppc: Merge helper_{dcbz,dcbzep}
target/ppc: Split out helper_dbczl for 970
target/ppc: Hoist dcbz_size out of dcbz_common
target/ppc/mem_helper.c: Remove a conditional from dcbz_common()
target/arm: Use set/clear_helper_retaddr in SVE and SME helpers
target/arm: Use set/clear_helper_retaddr in helper-a64.c
accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.h
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The current pairing of tlb_vaddr_to_host with extra is either
inefficient (user-only, with page_check_range) or incorrect
(system, with probe_pages).
For proper non-fault behaviour, use probe_access_flags with
its nonfault parameter set to true.
Reviewed-by: Max Chou <max.chou@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Avoid a race condition with munmap in another thread.
For access_memset and access_memmove, manage the value
within the helper. For uses of access_{get,set}_byte,
manage the value across the for loops.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Invert the conditional, indent the block, and use the macro
that expands to true for user-only.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mark the reserve_addr check unlikely. Use tlb_vaddr_to_host
instead of probe_write, relying on the memset itself to test
for page writability. Use set/clear_helper_retaddr so that
we can properly unwind on segfault.
With this, a trivial loop around guest memset will no longer
spend nearly 25% of runtime within page_get_flags.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge the two and pass the mmu_idx directly from translation.
Swap the argument order in dcbz_common to avoid extra swaps.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can determine at translation time whether the insn is or
is not dbczl. We must retain a runtime check against the
HID5 register, but we can move that to a separate function
that never affects other ppc models.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of passing a bool and select a value within dcbz_common() let
the callers pass in the right value to avoid this conditional
statement. On PPC dcbz is often used to zero memory and some code uses
it a lot. This change improves the run time of a test case that copies
memory with a dcbz call in every iteration from 6.23 to 5.83 seconds.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20240622204833.5F7C74E6000@zero.eik.bme.hu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Avoid a race condition with munmap in another thread.
Use around blocks that exclusively use "host_fn".
Keep the blocks as small as possible, but without setting
and clearing for every operation on one page.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use these in helper_dc_dva and the FEAT_MOPS routines to
avoid a race condition with munmap in another thread.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use of these in helpers goes hand-in-hand with tlb_vaddr_to_host
and other probing functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
SPDM enables authentication, attestation and key exchange to assist in
providing infrastructure security enablement. It's a standard published
by the DMTF [1].
SPDM supports multiple transports, including PCIe DOE and MCTP.
This patch adds support to QEMU to connect to an external SPDM
instance.
SPDM support can be added to any QEMU device by exposing a
TCP socket to a SPDM server. The server can then implement the SPDM
decoding/encoding support, generally using libspdm [2].
This is similar to how the current TPM implementation works and means
that the heavy lifting of setting up certificate chains, capabilities,
measurements and complex crypto can be done outside QEMU by a well
supported and tested library.
1: https://www.dmtf.org/standards/SPDM
2: https://github.com/DMTF/libspdm
Signed-off-by: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Signed-off-by: Chris Browy <cbrowy@avery-design.com>
Co-developed-by: Jonathan Cameron <Jonathan.cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[ Changes by WM
- Bug fixes from testing
]
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
[ Changes by AF:
- Convert to be more QEMU-ified
- Move to backends as it isn't PCIe specific
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20240703092027.644758-3-alistair.francis@wdc.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As per the step 5 in the process documented in bios-tables-test.c,
generate the expected ACPI AML data files for RISC-V using the
rebuild-expected-aml.sh script and update the
bios-tables-test-allowed-diff.h.
These are all new files being added for the first time. Hence, iASL diff
output is not added.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716144306.2432257-10-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
After PCI link devices are moved out of the scope of PCI root complex,
the DSDT files of machines which use GPEX, will change. So, update the
expected AML files with these changes for these machines.
Mainly, there are 2 changes.
1) Since the link devices are created now directly under _SB for all PCI
root bridges in the system, they should have unique names. So, instead
of GSIx, named those devices as LXXY where L means link, XX will have
PCI bus number and Y will have the INTx number (ex: L000 or L001). The
_PRT entries will also be updated to reflect this name change.
2) PCI link devices are moved from the scope of each PCI root bridge to
directly under _SB.
Below is the sample iASL difference for one such link device.
Scope (\_SB)
{
Name (_HID, "LNRO0005") // _HID: Hardware ID
Name (_UID, 0x1F) // _UID: Unique ID
Name (_CCA, One) // _CCA: Cache Coherency Attribute
Name (_CRS, ResourceTemplate () // _CRS: Current Resource Settings
{
Memory32Fixed (ReadWrite,
0x0A003E00, // Address Base
0x00000200, // Address Length
)
Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
{
0x0000004F,
}
})
+ Device (L000)
+ {
+ Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)
+ Name (_UID, Zero) // _UID: Unique ID
+ Name (_PRS, ResourceTemplate ()
+ {
+ Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
+ {
+ 0x00000023,
+ }
+ })
+ Name (_CRS, ResourceTemplate ()
+ {
+ Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
+ {
+ 0x00000023,
+ }
+ })
+ Method (_SRS, 1, NotSerialized) // _SRS: Set Resource Settings
+ {
+ }
+ }
+
Device (PCI0)
{
Name (_HID, "PNP0A08" /* PCI Express Bus */) // _HID: Hardware ID
Name (_CID, "PNP0A03" /* PCI Bus */) // _CID: Compatible ID
Name (_SEG, Zero) // _SEG: PCI Segment
Name (_BBN, Zero) // _BBN: BIOS Bus Number
Name (_UID, Zero) // _UID: Unique ID
Name (_STR, Unicode ("PCIe 0 Device")) // _STR: Description String
Name (_CCA, One) // _CCA: Cache Coherency Attribute
Name (_PRT, Package (0x80) // _PRT: PCI Routing Table
{
Package (0x04)
{
0xFFFF,
Zero,
- GSI0,
+ L000,
Zero
},
.....
})
Device (GSI0)
{
Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)
Name (_UID, Zero) // _UID: Unique ID
Name (_PRS, ResourceTemplate ()
{
Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
{
0x00000023,
}
})
Name (_CRS, ResourceTemplate ()
{
Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
{
0x00000023,
}
})
Method (_SRS, 1, NotSerialized) // _SRS: Set Resource Settings
{
}
}
}
}
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Message-Id: <20240716144306.2432257-6-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, PCI link devices (PNP0C0F) are always created within the
scope of the PCI root bridge. However, RISC-V needs these link devices
to be created outside to ensure the probing order in the OS. This
matches the example given in the ACPI specification [1] as well. Hence,
create these link devices directly under _SB instead of under the PCI
root bridge.
To keep these link device names unique for multiple PCI bridges, change
the device name from GSIx to LXXY format where XX is the PCI bus number
and Y is the INTx.
GPEX is currently used by riscv, aarch64/virt and x86/microvm machines.
So, this change will alter the DSDT for those systems.
[1] - ACPI 5.1: 6.2.13.1 Example: Using _PRT to Describe PCI IRQ Routing
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716144306.2432257-5-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We are currently missing the deallocation of the [host_]resv_regions
in case of hot unplug. Also to make things more simple let's rule
out the case where multiple HostIOMMUDevices would be aliased and
attached to the same IOMMUDevice. This allows to remove the handling
of conflicting Host reserved regions. Anyway this is not properly
supported at guest kernel level. On hotunplug the reserved regions
are reset to the ones set by virtio-iommu property.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240716094619.1713905-4-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now we have switched to PCIIOMMUOps to convey host IOMMU information,
the host reserved regions are transmitted when the PCIe topology is
built. This happens way before the virtio-iommu driver calls the probe
request. So let's remove the probe_done flag that allowed to check
the probe was not done before the IOMMU MR got enabled. Besides this
probe_done flag had a flaw wrt migration since it was not saved/restored.
The only case at risk is if 2 devices were plugged to a
PCIe to PCI bridge and thus aliased. First of all we
discovered in the past this case was not properly supported for
neither SMMU nor virtio-iommu on guest kernel side: see
[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr()
https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/
If this were supported by the guest kernel, it is unclear what the call
sequence would be from a virtio-iommu driver point of view.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240716094619.1713905-3-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 1b889d6e39.
There are different problems with that tentative fix:
- Some resources are left dangling (resv_regions,
host_resv_ranges) and memory subregions are left attached to
the root MR although freed as embedded in the sdev IOMMUDevice.
Finally the sdev->as is not destroyed and associated listeners
are left.
- Even when fixing the above we observe a memory corruption
associated with the deallocation of the IOMMUDevice. This can
be observed when a VFIO device is hotplugged, hot-unplugged
and a system reset is issued. At this stage we have not been
able to identify the root cause (IOMMU MR or as structs beeing
overwritten and used later on?).
- Another issue is HostIOMMUDevice are indexed by non aliased
BDF whereas the IOMMUDevice is indexed by aliased BDF - yes the
current naming is really misleading -. Given the state of the
code I don't think the virtio-iommu device works in non
singleton group case though.
So let's revert the patch for now. This means the IOMMU MR/as survive
the hotunplug. This is what is done in the intel_iommu for instance.
It does not sound very logical to keep those but currently there is
no symetric function to pci_device_iommu_address_space().
probe_done issue will be handled in a subsequent patch. Also
resv_regions and host_resv_regions will be deallocated separately.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240716094619.1713905-2-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ACPI GED (as described in the ACPI 6.4 spec) uses an interrupt listed in the
_CRS object of GED to intimate OSPM about an event. Later then demultiplexes the
notified event by evaluating ACPI _EVT method to know the type of event. Use
ACPI GED to also notify the guest kernel about any CPU hot(un)plug events.
Note, GED interface is used by many hotplug events like memory hotplug, NVDIMM
hotplug and non-hotplug events like system power down event. Each of these can
be selected using a bit in the 32 bit GED IO interface. A bit has been reserved
for the CPU hotplug event.
ACPI CPU hotplug related initialization should only happen if ACPI_CPU_HOTPLUG
support has been enabled for particular architecture. Add cpu_hotplug_hw_init()
stub to avoid compilation break.
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240716111502.202344-4-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread
is spawned. This is common to all the architectures as of now.
Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't
support vCPU removal. Therefore, its representative KVM vCPU object/context in
Qemu is parked.
Refactor architecture common logic so that some APIs could be reused by vCPU
Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs
with trace events. New APIs qemu_{create,park,unpark}_vcpu() can be externally
called. No functional change is intended here.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716111502.202344-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16Gb max) using type 17 structure entries.
Which is fine for the most cases. However when starting guest
with terabytes of RAM this leads to too many memory device
structures, which eventually upsets linux kernel as it reserves
only 64K for these entries and when that border is crossed out
it runs out of reserved memory.
Instead of partitioning initial RAM on 16Gb DIMMs, use maximum
possible chunk size that SMBIOS spec allows[2]. Which lets
encode RAM in lower 31 bits of 32bit field (which amounts upto
2047Tb per DIMM).
As result initial RAM will generate only one type 17 structure
until host/guest reach ability to use more RAM in the future.
Compat changes:
We can't unconditionally change chunk size as it will break
QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that would let older versioned machines to use
legacy 16Gb chunks, while new(er) machine type[s] use maximum
possible chunk size.
PS:
While it might seem to be risky to rise max entry size this large
(much beyond of what current physical RAM modules support),
I'd not expect it causing much issues, modulo uncovering bugs
in software running within guest. And those should be fixed
on guest side to handle SMBIOS spec properly, especially if
guest is expected to support so huge RAM configs.
In worst case, QEMU can reduce chunk size later if we would
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or
giving users a CLI option to customize it.
1) Initial RAM - is RAM configured with help '-m SIZE' CLI option/
implicitly defined by machine. It doesn't include memory
configured with help of '-device' option[s] (pcdimm,nvdimm,...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* tested on 8Tb host with RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240715122417.4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A user can create a SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.
A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.
A PF that user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240715-sriov-v5-5-3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_config_get_bar_addr() had a division by vf_stride. vf_stride needs
to be non-zero when there are multiple VFs, but the specification does
not prohibit to make it zero when there is only one VF.
Do not perform the division for the first VF to avoid division by zero.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240715-sriov-v5-2-3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Minor clean-ups and fixes for the qtests and Avocado tests
* Fix crash that happens when introspecting scsi-block on older machine types
* s390x: filter deprecated properties based on model expansion type
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmaeSUMRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVdQw/8DvGymXKwpS0F2aSHg3AZvjSCpkv3Y+fK
# myQrzh30cv9Vhe/Y9do47HpfJ6Ug9SK6xG64K2o+BIW+G3+ZUwSHk24PoiALsrJf
# 9qqya1upBJkEC5B4PhqRPS3GlbvBnKKEk8W6BMpUa2BToFV9MsG256cBVhUrRpGc
# 6u80DgTNxCI1czsNkWVGJAt1oVLYYJIjz7UZ4VbZCH48o6r0iSUV6C01wccOFmNy
# IXbspyyUftWFh9lO0i8PiYlXG2YEAmFry3gqD5vc+6BsFT4lMeoRFFxbVCddGKFc
# iNwlH4ayjeISlEJeClImIdbHyZ+sDhPyy5x4cpQqmZudEPn+GVnZ0arm7OvXW/k8
# Yog4n7/cUz7GHnWbqYIFZMS1g1wmqm/9VPsVTzXAlTva4dTTs2p0tKAADHIAtPCI
# jxSPpbuCuukDzUZGsNZyRGbex6g4B0tP4TMHRFxo5LVy9dKn2BLOHBWuzPevD9OO
# FphZHUuGngcPi4GSFmlv7aCS0pqyWsCO+5EqoYUgO8yadyfiXN9pwjB6OnBZux0U
# kbJOkkBJwEalhsiHmPFMnS8rkWa4Ye4ZJjj8XHRiecxSZOcNOcxyE+l2x8CV2aFB
# UBR83nm86vXXpu86Yod3E+txDEUzKN5+B8X0q7Se0YvsWbB+1Dq/Co0Bdh/Wp70E
# EPk5eqaSp8k=
# =zB5F
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 22 Jul 2024 09:57:55 PM AEST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-07-22' of https://gitlab.com/thuth/qemu:
target/s390x: filter deprecated properties based on model expansion type
tests: increase timeout per instance of bios-tables-test
qtest/fuzz: make range overlap check more readable
hw: Fix crash that happens when introspecting scsi-block on older machine types
tests/avocado/machine_aspeed.py: Increase timeout for TPM test
tests/avocado: Remove the remainders of the virtiofs_submounts test
tests/avocado/mem-addr-space-check: Remove unused "import signal"
tests/avocado: Move LinuxTest related code into a separate file
tests/avocado: Allow overwriting AVOCADO_SHOW env variable
tests/avocado/boot_xen.py: use class attribute
tests/avocado/boot_xen.py: unify tags
tests/avocado/boot_xen.py: merge base classes
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If I use `-serial stdio` on Windows, after QEMU exits, the terminal
could not handle arrow keys and tab any more. Because stdio backend
on Windows sets console mode to virtual terminal input when starts,
but does not restore the old mode when finalize.
This small patch saves the old console mode and set it back.
Signed-off-by: Ziming Song <s.ziming@hotmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <ME3P282MB25488BE7C39BF0C35CD0DA5D8CA82@ME3P282MB2548.AUSP282.PROD.OUTLOOK.COM>
If the period is set to a value that is too low, there could be no
time left to run the rest of QEMU. Do not trigger interrupts faster
than 1 MHz.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Store the full 64-bit value at which the timer should fire.
This makes it possible to skip the imprecise hpet_calculate_diff()
step, and to remove the clamping of the period to 31 or 63 bits.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Declare the MemoryRegionOps so that 64-bit reads and writes to the HPET
are received directly. This makes it possible to unify the code to
process low and high parts: for 32-bit reads, extract the desired word;
for 32-bit writes, just merge the desired part into the old value and
proceed as with a 64-bit write.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The variable "val" is used for two different purposes. As an intermediate
value when writing configuration registers, and to store the cleared bits
when writing ISR.
Use "new_val" for the former, and rename the variable so that it is clearer
for the latter case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are several bugs in the handling of the ISR register:
- switching level->edge was not lowering the interrupt and
clearing ISR
- switching on the enable bit was not raising a level-triggered
interrupt if the timer had fired
- the timer must be kept running even if not enabled, in
order to set the ISR flag, so writes to HPET_TN_CFG must
not call hpet_del_timer()
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).
The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.
For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.
To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.
Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.
All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.
To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.
This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock
Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
adresses.
- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
the moment.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Abort was not implemented previously, but we can implement it for AERs
and asynchrnously for I/O.
Signed-off-by: Ayush Mishra <ayush.m55@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Extend copy command to copy user data across different namespaces via
support for specifying a namespace for each source range
Signed-off-by: Arun Kumar <arun.kka@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Currently, there is no way to execute the query-cpu-model-expansion
command to retrieve a comprehenisve list of deprecated properties, as
the result is dependent per-model. To enable this, the expansion output
is modified as such:
When reporting a "full" CPU model, show the *entire* list of deprecated
properties regardless if they are supported on the model. A full
expansion outputs all known CPU model properties anyway, so it makes
sense to report all deprecated properties here too.
This allows management apps to query a single model (e.g. host) to
acquire the full list of deprecated properties.
Additionally, when reporting a "static" CPU model, the command will
only show deprecated properties that are a subset of the model's
*enabled* properties. This is more accurate than how the query was
handled before, which blindly reported deprecated properties that
were never otherwise introduced for certain models.
Acked-by: David Hildenbrand <david@redhat.com>
Suggested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Message-ID: <20240719181741.35146-1-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
CI often fails 'cross-i686-tci' job due to runner slowness
Log shows that test almost complete, with a few remaining
when bios-tables-test timeout hits:
19/270 qemu:qtest+qtest-aarch64 / qtest-aarch64/bios-tables-test
TIMEOUT 610.02s killed by signal 15 SIGTERM
...
stderr:
TAP parsing error: Too few tests run (expected 8, got 7)
At the same time overall job running time is only ~30 out of 1hr allowed.
Increase bios-tables-test instance timeout on 5min as a fix
for slow CI runners.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <20240716125930.620861-1-imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
"make check SPEED=slow" is currently failing the device-introspect-test on
older machine types since introspecting "scsi-block" is causing an abort:
$ ./qemu-system-x86_64 -M pc-q35-8.0 -monitor stdio
QEMU 9.0.50 monitor - type 'help' for more information
(qemu) device_add scsi-block,help
Unexpected error in object_property_find_err() at
../../devel/qemu/qom/object.c:1357:
can't apply global scsi-disk-base.migrate-emulated-scsi-request=false:
Property 'scsi-block.migrate-emulated-scsi-request' not found
Aborted (core dumped)
The problem is that the compat code tries to change the
"migrate-emulated-scsi-request" property for all devices that are
derived from "scsi-block", but the property has only been added
to "scsi-hd" and "scsi-cd" via the DEFINE_SCSI_DISK_PROPERTIES macro.
Thus let's fix the problem by only changing the property on the devices
that really have this property.
Fixes: b4912afa5f ("scsi-disk: Fix crash for VM configured with USB CDROM after live migration")
Message-ID: <20240703090904.909720-1-thuth@redhat.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The 'app' level logging is useful, but sometimes we want
more, for example QEMU leverages the 'console' logging.
Allow overwriting AVOCADO_SHOW from environment, i.e.:
$ make check-avocado AVOCADO_SHOW='app,console'
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240719180211.48073-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Introduce a privileged helper to access RAPL MSR.
The privileged helper tool, qemu-vmsr-helper, is designed to provide
virtual machines with the ability to read specific RAPL (Running Average
Power Limit) MSRs without requiring CAP_SYS_RAWIO privileges or relying
on external, out-of-tree patches.
The helper tool leverages Unix permissions and SO_PEERCRED socket
options to enforce access control, ensuring that only processes
explicitly requesting read access via readmsr() from a valid Thread ID
can access these MSRs.
The list of RAPL MSRs that are allowed to be read by the helper tool is
defined in rapl-msr-index.h. This list corresponds to the RAPL MSRs that
will be supported in the next commit titled "Add support for RAPL MSRs
in KVM/QEMU."
The tool is intentionally designed to run on the Linux x86 platform.
This initial implementation is tailored for Intel CPUs but can be
extended to support AMD CPUs in the future.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-3-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function qio_channel_get_peercred() returns a pointer to the
credentials of the peer process connected to this socket.
This credentials structure is defined in <sys/socket.h> as follows:
struct ucred {
pid_t pid; /* Process ID of the sending process */
uid_t uid; /* User ID of the sending process */
gid_t gid; /* Group ID of the sending process */
};
The use of this function is possible only for connected AF_UNIX stream
sockets and for AF_UNIX stream and datagram socket pairs.
On platform other than Linux, the function return 0.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sgx_epc_get_section assumes a PC platform is in use:
bool sgx_epc_get_section(int section_nr, uint64_t *addr, uint64_t *size)
{
PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
However, sgx_epc_get_section is called by CPUID regardless of whether
SGX state has been initialized or which platform is in use. Check
whether the machine has the right QOM class and if not behave as if
there are no EPC sections.
Fixes: 1dec2e1f19 ("i386: Update SGX CPUID info according to hardware/KVM/user input", 2021-09-30)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2142
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The allocated memory to hold LBA ranges leaks in the nvme_dsm function. This
happens because the allocated memory for iocb->range is not freed in all
error handling paths.
Fix this by adding a free to ensure that the allocated memory is properly freed.
ASAN log:
==3075137==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 480 byte(s) in 6 object(s) allocated from:
#0 0x55f1f8a0eddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x7f531e0f6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x55f1faf1f091 in blk_aio_get block/block-backend.c:2583:12
#3 0x55f1f945c74b in nvme_dsm hw/nvme/ctrl.c:2609:30
#4 0x55f1f945831b in nvme_io_cmd hw/nvme/ctrl.c:4470:16
#5 0x55f1f94561b7 in nvme_process_sq hw/nvme/ctrl.c:7039:29
Cc: qemu-stable@nongnu.org
Fixes: d7d1474fd8 ("hw/nvme: reimplement dsm to allow cancellation")
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The spice-vdagentd doesn't send capabilities again on host/client
disconnect (but when the session agent connects and sends a
GUEST_XORG_RESOLUTION message)
When the dbus client disconnects, vdagent_disconnect() is called to
reset the agent state. Capabilities must be negotiated again on
reconnection.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240717171541.201525-5-marcandre.lureau@redhat.com>
Mouse cursors with 8 bit alpha were downsampled to 1-bit opacity maps by
turning alpha values of 255 into 1 and everything else into 0. This
means that mostly-opaque pixels ended up completely invisible.
This patch changes the behaviour so that only pixels with less than 50%
alpha (0-127) are treated as transparent when converted to 1-bit alpha.
This greatly improves the subjective appearance of anti-aliased mouse
cursors, such as those used by macOS, when using a front-end UI without
support for alpha-blended cursors, such as some VNC clients.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240624101040.82726-1-phil@philjordan.eu>
Using bare printf's in plugins is perfectly acceptable but they do
rather mess up the output of "make check-tcg". Convert the printfs to
use g_string and then output with the plugin output helper which will
already be captured to .pout files by the test harness.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240718094523.1198645-7-alex.bennee@linaro.org>
This new plugin allows to stop emulation using conditions on the
emulation state. By setting this plugin arguments, it is possible
to set an instruction count limit and/or trigger address(es) to stop at.
The code returned at emulation exit can be customized.
This plugin demonstrates how someone could stop QEMU execution.
It could be used for research purposes to launch some code and
deterministically stop it and understand where its execution flow went.
Co-authored-by: Alexandre Iooss <erdnaxe@crans.org>
Signed-off-by: Simon Hamelin <simon.hamelin@grenoble-inp.org>
Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240715081521.19122-2-simon.hamelin@grenoble-inp.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240718094523.1198645-5-alex.bennee@linaro.org>
Coverity reported a memory leak (CID 1549757) in this code and its
admittedly rather clumsy handling of extending the command table.
Instead of handing over a full array of the commands lets use the
lighter weight GPtrArray and simply test for the presence of each
entry as we go. This avoids complications of transferring ownership of
arrays and keeps the final command entries as static entries in the
target code.
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Gustavo Bueno Romero <gustavo.romero@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Gustavo Romero <gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240718094523.1198645-4-alex.bennee@linaro.org>
While it's a good practice to have reusable base classes, in this
specific case there's no other user of the BootXenBase class.
By unifying the class used in this test, we can improve readability
and have the opportunity to add some future improvements in a clearer
fashion.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Message-ID: <20231208190911.102879-9-crosa@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
aspeed queue:
* SMC model fix (Coverity)
* AST2600 boot for eMMC support and test
* AST2700 ADC model
* I2C model changes preparing AST2700 I2C support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmacwdQACgkQUaNDx8/7
# 7KFJGxAAyGLeAW8OJQgRMh0LygKyY6n4p+8LnImKwH19DkJy9KXsFmi2iCyg2Ufh
# FvNU1NUNjJopYZv+9sMtNXDlFbv53FkxotpmRnPQZxncH7VNUqZ/FyfVBItU7fdB
# pX4pU1x49InQDSL+ZwOYEDLirc8aTp/ZfyeayeFxmJvhtpVtAOGwH+R/Xx5o+Tfd
# fHTkAkJ69LVxK37fk6Bz6X4s3RnOCUpC7g8MuwN4FOSs1IorCq37tH72npPQ+lR+
# rFAaTY8/EDvn+mhCk61rTDo7fNB+/Oaks336cqKVWX8cg+qc0qOfqnG9f8H77b/P
# PLmCoXS+L83Ko6p8PMh2hzehYMW/NXJLHQm3YOFx20LicommM3Mg9wXd2FV4AcVi
# VbsL4+gNi4fPb4z6qCKUV/ir9IoL3x4OLfazKvj9wo88AvOkw06cyhZCfIBIy1Pe
# BQyI9Bg8ExjCsDX5MXhPOzHbqHSQDmGPpN7B4DkcCRSp61QoO4GR8XwsUMPOWt2H
# jwa0qEicdetu4Rop6HIQMdGCvpQEB4RW9l9hoePlg5FSv66M+wQoO5DTmUmTP/Go
# 5NNEdFK1oaf2xgvgiWsexFyeinKoyC12OwzhHWxeZp7OORo44M1eYosFQ8L7o+Pk
# XKL+t9Om17/BKKEA4JQjjip8E4p7m9wNJ7HQNcb63lqh2sYH/rQ=
# =r9I0
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 21 Jul 2024 06:07:48 PM AEST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20240721' of https://github.com/legoater/qemu:
aspeed: fix coding style
hw/i2c/aspeed: rename the I2C class pool attribute to share_pool
hw/i2c/aspeed: support to set the different memory size
aspeed/soc: support ADC for AST2700
aspeed/adc: Add AST2700 support
tests/avocado/machine_aspeed.py: Add eMMC boot tests
aspeed: Introduce a 'boot-emmc' machine option
aspeed: Introduce a 'hw_strap1' machine attribute
aspeed: Add boot-from-eMMC HW strapping bit to rainier-bmc machine
aspeed: Tune eMMC device properties to reflect HW strapping
aspeed: Introduce a AspeedSoCClass 'boot_from_emmc' handler
aspeed/scu: Add boot-from-eMMC HW strapping bit for AST2600 SoC
aspeed: Load eMMC first boot area as a boot rom
aspeed: Change type of eMMC device
aspeed/smc: Fix possible integer overflow
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Coverity reported:
>>> CID 1549454: Integer handling issues (OVERFLOW_BEFORE_WIDEN)
>>> Potentially overflowing expression
"le32_to_cpu(desc->num_sectors) << 9" with type "uint32_t"
(32 bits, unsigned) is evaluated using 32-bit arithmetic, and
then used in a context that expects an expression of type
"uint64_t" (64 bits, unsigned).
199 le32_to_cpu(desc->num_sectors) << 9 };
Coverity noticed this issue after commit ab04420c3 ("contrib/vhost-user-*:
use QEMU bswap helper functions"), but it was pre-existing and introduced
from the beginning by commit caa1ee4313 ("vhost-user-blk: add
discard/write zeroes features support").
Explicitly cast the 32-bit value before the shift to fix this issue.
Fixes: Coverity CID 1549454
Fixes: 5ab04420c3 ("contrib/vhost-user-*: use QEMU bswap helper functions")
Fixes: caa1ee4313 ("vhost-user-blk: add discard/write zeroes features support")
Cc: changpeng.liu@intel.com
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240712153857.207440-1-sgarzare@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support for the VIRTIO_F_IN_ORDER feature across a variety of vhost
devices.
The inclusion of VIRTIO_F_IN_ORDER in the feature bits arrays for these
devices ensures that the backend is capable of offering and providing
support for this feature, and that it can be disabled if the backend
does not support it.
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240710125522.4168043-6-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add VIRTIO_F_IN_ORDER feature support for the virtqueue_flush operation.
The goal of the virtqueue_ordered_flush operation when the
VIRTIO_F_IN_ORDER feature has been negotiated is to write elements to
the used/descriptor ring in-order and then update used_idx.
The function iterates through the VirtQueueElement used_elems array
in-order starting at vq->used_idx. If the element is valid (filled), the
element is written to the used/descriptor ring. This process continues
until we find an invalid (not filled) element.
For packed VQs, the first entry (at vq->used_idx) is written to the
descriptor ring last so the guest doesn't see any invalid descriptors.
If any elements were written, the used_idx is updated.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240710125522.4168043-5-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Add VIRTIO_F_IN_ORDER feature support for the virtqueue_fill operation.
The goal of the virtqueue_ordered_fill operation when the
VIRTIO_F_IN_ORDER feature has been negotiated is to search for this
now-used element, set its length, and mark the element as filled in
the VirtQueue's used_elems array.
By marking the element as filled, it will indicate that this element has
been processed and is ready to be flushed, so long as the element is
in-order.
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240710125522.4168043-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add VIRTIO_F_IN_ORDER feature support in virtqueue_split_pop and
virtqueue_packed_pop.
VirtQueueElements popped from the available/descritpor ring are added to
the VirtQueue's used_elems array in-order and in the same fashion as
they would be added the used and descriptor rings, respectively.
This will allow us to keep track of the current order, what elements
have been written, as well as an element's essential data after being
processed.
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240710125522.4168043-3-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the boolean 'in_order_filled' member to the VirtQueueElement structure.
The use of this boolean will signify whether the element has been processed
and is ready to be flushed (so long as the element is in-order). This
boolean is used to support the VIRTIO_F_IN_ORDER feature.
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240710125522.4168043-2-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When setting the parameters of a PCM stream, we compute the bit flag
with the format and rate values as shift operand to check if they are
set in supported_formats and supported_rates.
If the guest provides a format/rate value which when shifting 1 results
in a value bigger than the number of bits in
supported_formats/supported_rates, we must report an error.
Previously, this ended up triggering the not reached assertions later
when converting to internal QEMU values.
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2416
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <virtio-snd-fuzz-2416-fix-v1-manos.pitsidianakis@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When reading input audio in the virtio-snd input callback,
virtio_snd_pcm_in_cb(), we do not check whether the iov can actually fit
the data buffer. This is because we use the buffer->size field as a
total-so-far accumulator instead of byte-size-left like in TX buffers.
This triggers an out of bounds write if the size of the virtio queue
element is equal to virtio_snd_pcm_status, which makes the available
space for audio data zero. This commit adds a check for reaching the
maximum buffer size before attempting any writes.
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2427
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <virtio-snd-fuzz-2427-fix-v1-manos.pitsidianakis@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement transfer and activate functionality per 3.1 spec for
supporting update metadata (no actual buffers). Transfer times
are arbitrarily set to ten and two seconds for full and part
transfers, respectively.
cxl update-firmware mem0 -F fw.img
<on-going fw update>
cxl update-firmware mem0
"memdev":"mem0",
"pmem_size":"1024.00 MiB (1073.74 MB)",
"serial":"0",
"host":"0000:0d:00.0",
"firmware":{
"num_slots":2,
"active_slot":1,
"online_activate_capable":true,
"slot_1_version":"BWFW VERSION 0",
"fw_update_in_progress":true,
"remaining_size":22400
}
}
<completed fw update>
cxl update-firmware mem0
{
"memdev":"mem0",
"pmem_size":"1024.00 MiB (1073.74 MB)",
"serial":"0",
"host":"0000:0d:00.0",
"firmware":{
"num_slots":2,
"active_slot":1,
"staged_slot":2,
"online_activate_capable":true,
"slot_1_version":"BWFW VERSION 0",
"slot_2_version":"BWFW VERSION 1",
"fw_update_in_progress":false
}
}
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20240627164912.25630-1-dave@stgolabs.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240705125915.991672-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 Error Check Scrub (ECS)
control feature.
The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
Specification (JESD79-5) and allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts. The ECS control feature
allows the request to configure ECS input configurations during system
boot or at run-time.
The ECS control allows the requester to change the log entry type, the ECS
threshold count provided that the request is within the definition
specified in DDR5 mode registers, change mode between codeword mode and
row count mode, and reset the ECS counter.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://lore.kernel.org/r/20240223085902.1549-4-shiju.jose@huawei.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240705123039.963781-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control
feature. The device patrol scrub proactively locates and makes corrections
to errors in regular cycle. The patrol scrub control allows the request to
configure patrol scrub input configurations.
The patrol scrub control allows the requester to specify the number of
hours for which the patrol scrub cycles must be completed, provided that
the requested number is not less than the minimum number of hours for the
patrol scrub cycle that the device is capable of. In addition, the patrol
scrub controls allow the host to disable and enable the feature in case
disabling of the feature is needed for other purposes such as
performance-aware operations which require the background operations to be
turned off.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://lore.kernel.org/r/20240223085902.1549-3-shiju.jose@huawei.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240705123039.963781-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The spec states that reads/writes should have no effect and a part of
commands should be ignored when the media is disabled, not when the
sanitize command is running.
Introduce cxl_dev_media_disabled() to check if the media is disabled and
replace sanitize_running() with it.
Make sure that the media has been correctly disabled during sanitation
by adding an assert to __toggle_media(). Now, enabling when already
enabled or vice versa results in an assert() failure.
Suggested-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Link: https://lore.kernel.org/r/20231222090051.3265307-4-42.hyeyoo@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240705120643.959422-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similar protection to that provided for -numa memdev=x
to make sure that memory used to back a type3 device is not also mapped
as normal RAM, or for multiple type3 devices.
This is an easy footgun to remove and seems multiple people have
run into it.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240705113956.941732-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, if the function fails during the key_len check, the op_code
does not have a proper value, causing virtio_crypto_free_create_session_req
not to free the memory correctly, leading to a memory leak.
By setting the op_code before performing any checks, we ensure that
virtio_crypto_free_create_session_req has the correct context to
perform cleanup operations properly, thus preventing memory leaks.
ASAN log:
==3055068==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 512 byte(s) in 1 object(s) allocated from:
#0 0x5586a75e6ddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x7fb6b63b6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x5586a864bbde in virtio_crypto_handle_ctrl hw/virtio/virtio-crypto.c:407:19
#3 0x5586a94fc84c in virtio_queue_notify_vq hw/virtio/virtio.c:2277:9
#4 0x5586a94fc0a2 in virtio_queue_host_notifier_read hw/virtio/virtio.c:3641:9
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Message-Id: <20240702211835.3064505-1-zheyuma97@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to the datasheet of ASPEED SOCs,
each I2C bus has their own pool buffer since AST2500.
Only AST2400 utilized a pool buffer share to all I2C bus.
And firmware required to set the offset of pool buffer
by writing "Function Control Register(I2CD 00)"
To make this model more readable, will change to introduce
a new bus pool buffer attribute in AspeedI2Cbus.
So, it does not need to calculate the pool buffer offset
for different I2C bus.
This patch rename the I2C class pool attribute to share_pool.
It make user more understand share pool and bus pool
are different.
Incrementing the version of aspeed_i2c_vmstate to 3.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
According to the datasheet of ASPEED SOCs,
an I2C controller owns 8KB of register space for AST2700,
owns 4KB of register space for AST2600, AST2500 and AST2400,
and owns 64KB of register space for AST1030.
It set the memory region size 4KB by default and it does not compatible
register space for AST2700.
Introduce a new class attribute to set the I2C controller memory size
for different ASPEED SOCs.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Add ADC model for AST2700 ADC support.
The ADC controller registers base address is start at
0x14C0_0000 and its address space is 0x1000.
The ADC controller interrupt is connected to
GICINT130_INTC group at bit 16. The GIC IRQ is 130.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
AST2700 and AST2600 ADC controllers are identical.
Introduce ast2700 class and set 2 engines.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
The default behavior of some Aspeed machines is to boot from the eMMC
device, like the rainier-bmc. Others like ast2600-evb could also boot
from eMMC if the HW strapping boot-from-eMMC bit was set. Add a
property to set or unset this bit. This is useful to test boot images.
For now, only activate this property on the ast2600-evb and rainier-bmc
machines for which eMMC images are available or can be built.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
To change default behavior of a machine and boot from eMMC, future
changes will add a machine option to let the user configure the
boot-from-eMMC HW strapping bit. Add a new machine attribute first.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When the boot-from-eMMC HW strapping bit is set, use the 'boot-config'
property to set the boot config register to boot from the first boot
area partition of the eMMC device. Also set the boot partition size
of the device.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Report support on the AST2600 SoC if the boot-from-eMMC HW strapping
bit is set at the board level. AST2700 also has support but it is not
yet ready in QEMU and others SoCs do not have support, so return false
always for these.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Bit SCU500[2] of the AST2600 controls the boot device of the SoC.
Future changes will configure this bit to boot from eMMC disk images
specially built for this purpose.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The first boot area partition (64K) of the eMMC device should contain
an initial boot loader (u-boot SPL). Load it as a ROM only if an eMMC
device is available to boot from but no flash device is.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The QEMU device model representing the eMMC device of the machine is
currently created with type SD_CARD. Change the type to EMMC now that
it is available.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity reports a possible integer overflow because routine
aspeeed_smc_hclk_divisor() has a codepath returning 0, which could
lead to an integer overflow when computing variable 'hclk_shift' in
the caller aspeed_smc_dma_calibration().
The value passed to aspeed_smc_hclk_divisor() is always between 0 and
15 and, in this case, there is always a matching hclk divisor. Remove
the return 0 and use g_assert_not_reached() instead.
Fixes: Coverity CID 1547822
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
While the `allow-rpcs` option is documented in the CLI options
section, it was missing in the section about the configuration file
syntax.
And while it's mentioned that "the list of keys follows the command line
options", having `block-rpcs` there but not `allow-rpcs` seems like
being a potential source of confusion; and as it's cheap to add let's
just do so.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <20240718140407.444160-1-t.lamprecht@proxmox.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
It is confusing having many different pieces of code enabling and
disabling commands, and it is not clear that they all have the same
semantics, especially wrt prioritization of the block/allow lists.
The code attempted to prevent the user from setting both the block
and allow lists concurrently, however, the logic was flawed as it
checked settings in the configuration file separately from the
command line arguments. Thus it was possible to set a block list
in the config file and an allow list via a command line argument.
The --dump-conf option also creates a configuration file with both
keys present, even if unset, which means it is creating a config
that cannot actually be loaded again.
Centralizing the code in a single method "ga_apply_command_filters"
will provide a strong guarantee of consistency and clarify the
intended behaviour. With this there is no compelling technical
reason to prevent concurrent setting of both the allow and block
lists, so this flawed restriction is removed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <20240712132459.3974109-23-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Allowing the user to set the QGA_CONF environment variable to change
the default configuration file path is very unusual practice, made
more obscure since this ability is not documented.
This introduces the more normal '-c PATH' / '--config=PATH' command
line argument approach. This requires that we parse the comamnd line
twice, since we want the command line arguments to take priority over
the configuration file settings in general.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <20240712132459.3974109-22-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
The fsfreeze commands are already written to report an error if
vss_init() fails. Reporting a more specific error message is more
helpful than a generic "command is disabled" message, which cannot
between an admin config decision and lack of platform support.
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240712132459.3974109-19-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the commands on other UNIX.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-18-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the get-users command on POSIX
platforms lacking required APIs.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-17-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the schema to fully
exclude generation of the filesystem trimming commands on POSIX
platforms lacking required APIs.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-16-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the filesystem trimming commands
on POSIX platforms lacking required APIs.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-15-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the schema to fully
exclude generation of the filesystem freezing commands on POSIX
platforms lacking the required APIs.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-14-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the commands on non-Windows.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-13-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Some commands were blocked based on CONFIG_FSFREEZE, but their
impl had nothing todo with CONFIG_FSFREEZE, and were instead
either Linux-only, or Win+Linux-only.
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the stats and fsinfo commands on
platforms that can't support them.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-12-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every comamnd that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the network interface command on
POSIX platforms lacking getifaddrs().
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-11-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the commands on non-Linux POSIX
platforms
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This has the additional benefit that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240712132459.3974109-10-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Rather than creating stubs for every command that just return
QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
fully exclude generation of the commands on Windows.
The command will be rejected at QMP dispatch time instead,
avoiding reimplementing rejection by blocking the stub commands.
This changes the error message for affected commands from
{"class": "CommandNotFound", "desc": "Command FOO has been disabled"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
This also fixes an accidental inconsistency where some commands
(guest-get-diskstats & guest-get-cpustats) are implemented as
stubs, yet not added to the blockedrpc list. Those change their
error message from
{"class": "GenericError, "desc": "this feature or command is not currently supported"}
to
{"class": "CommandNotFound", "desc": "The command FOO has not been found"}
The final additional benefit is that the QGA protocol reference
now documents what conditions enable use of the command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240712132459.3974109-9-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
The qmp_guest_{set,get}_{memory_blocks,block_info} command impls in
commands-posix.c are surrounded by '#ifdef __linux__' so should
instead live in commands-linux.c
This also removes a "#ifdef CONFIG_LINUX" that was nested inside
a "#ifdef __linux__".
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-ID: <20240712132459.3974109-7-berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
target-arm queue:
* Fix handling of LDAPR/STLR with negative offset
* LDAPR should honour SCTLR_ELx.nAA
* Use float_status copy in sme_fmopa_s
* hw/display/bcm2835_fb: fix fb_use_offsets condition
* hw/arm/smmuv3: Support and advertise nesting
* Use FPST_F16 for SME FMOPA (widening)
* tests/arm-cpu-features: Do not assume PMU availability
* hvf: arm: Do not advance PC when raising an exception
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmaZFlUZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3iJuEACtVh1Wp93XMsL3llAZkQlx
# DUCnDCvAM2qiiTIMOqPQzeKTIkRV9aFh1YWzOtMFKai6UkBU6p1b4bPqb5SIr99G
# Ayps4+WzAHsjTqBGEpIIDWL6GqMwv9azBnRAYNb+Cg9O3SzEnCdGOKCfGYTXXPRz
# zQ1NIgqZSUC5jg3XgkU22J3VMsOUWijbzxnGXhOyemSIEhREl+t6Ns3ca3n47/jk
# JIw1g6o0mpefPPkaLq6ftVwpn1L63iYQugn4VCrIhtIoOM8vmnShbI9/GwzL4AYk
# n28nwPl948Xby13kCYmu6Slt8Rmm7M33pBDJzsVtbaeBSd44XHrov8Y1+e1FhAco
# lxrWY/2rG9HiWKGLdAeCKwVxB186DKiTmuK7lcN+eBu3VbOLjDiVE0d1bK4HqGyc
# nzA/Aq81Y9p5Z7wzX40sVFlq0j1pQDQWk6GgPfMA4ueHKEEobxC3C+k1q9m02gjQ
# qesOFzViiGe0j7JER84qqcatIaTk09xfbXL/uMZx8oP/iKa1pyMUx2blChXOXVTx
# oGkO2h3/QCpRIos8d8WM/bso16EkpraInM4748iumSLuxDxTwiIikK/hpsCLDwUN
# dLsH/hAMz+yQOFubFoRt4IlsGVnk5asmTDMb4S8RojdF2KzHuzbJMgdEOe62631g
# IOAc7Tn3TIm5MpAxXOXgJA==
# =/aEm
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 Jul 2024 11:19:17 PM AEST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20240718' of https://git.linaro.org/people/pmaydell/qemu-arm: (26 commits)
hvf: arm: Do not advance PC when raising an exception
tests/arm-cpu-features: Do not assume PMU availability
tests/tcg/aarch64: Add test cases for SME FMOPA (widening)
target/arm: Use FPST_F16 for SME FMOPA (widening)
target/arm: Use float_status copy in sme_fmopa_s
hw/arm/smmu: Refactor SMMU OAS
hw/arm/smmuv3: Support and advertise nesting
hw/arm/smmuv3: Handle translation faults according to SMMUPTWEventInfo
hw/arm/smmuv3: Support nested SMMUs in smmuv3_notify_iova()
hw/arm/smmu: Support nesting in the rest of commands
hw/arm/smmu: Introduce smmu_iotlb_inv_asid_vmid
hw/arm/smmu: Support nesting in smmuv3_range_inval()
hw/arm/smmu-common: Support nested translation
hw/arm/smmu-common: Add support for nested TLB
hw/arm/smmu-common: Rework TLB lookup for nesting
hw/arm/smmuv3: Translate CD and TT using stage-2 table
hw/arm/smmu: Introduce CACHED_ENTRY_TO_ADDR
hw/arm/smmu: Consolidate ASID and VMID types
hw/arm/smmu: Split smmuv3_translate()
hw/arm/smmu: Use enum for SMMU stage
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
SMMUv3 OAS is currently hardcoded in the code to 44 bits, for nested
configurations that can be a problem, as stage-2 might be shared with
the CPU which might have different PARANGE, and according to SMMU manual
ARM IHI 0070F.b:
6.3.6 SMMU_IDR5, OAS must match the system physical address size.
This patch doesn't change the SMMU OAS, but refactors the code to
make it easier to do that:
- Rely everywhere on IDR5 for reading OAS instead of using the
SMMU_IDR5_OAS macro, so, it is easier just to change IDR5 and
it propagages correctly.
- Add additional checks when OAS is greater than 48bits.
- Remove unused functions/macros: pa_range/MAX_PA.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-19-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Previously, to check if faults are enabled, it was sufficient to check
the current stage of translation and check the corresponding
record_faults flag.
However, with nesting, it is possible for stage-1 (nested) translation
to trigger a stage-2 fault, so we check SMMUPTWEventInfo as it would
have the correct stage set from the page table walk.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-17-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some commands need rework for nesting, as they used to assume S1
and S2 are mutually exclusive:
- CMD_TLBI_NH_ASID: Consider VMID if stage-2 is supported
- CMD_TLBI_NH_ALL: Consider VMID if stage-2 is supported, otherwise
invalidate everything, this required a new vmid invalidation
function for stage-1 only (ASID >= 0)
Also, rework trace events to reflect the new implementation.
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-15-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Soon, Instead of doing TLB invalidation by ASID only, VMID will be
also required.
Add smmu_iotlb_inv_asid_vmid() which invalidates by both ASID and VMID.
However, at the moment this function is only used in SMMU_CMD_TLBI_NH_ASID
which is a stage-1 command, so passing VMID = -1 keeps the original
behaviour.
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-14-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
With nesting, we would need to invalidate IPAs without
over-invalidating stage-1 IOVAs. This can be done by
distinguishing IPAs in the TLBs by having ASID=-1.
To achieve that, rework the invalidation for IPAs to have a
separate function, while for IOVA invalidation ASID=-1 means
invalidate for all ASIDs.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-13-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When nested translation is requested, do the following:
- Translate stage-1 table address IPA into PA through stage-2.
- Translate stage-1 table walk output (IPA) through stage-2.
- Create a single TLB entry from stage-1 and stage-2 translations
using logic introduced before.
smmu_ptw() has a new argument SMMUState which include the TLB as
stage-1 table address can be cached in there.
Also in smmu_ptw(), a separate path used for nesting to simplify the
code, although some logic can be combined.
With nested translation class of translation fault can be different,
from the class of the translation, as faults from translating stage-1
tables are considered as CLASS_TT and not CLASS_IN, a new member
"is_ipa_descriptor" added to "SMMUPTWEventInfo" to differ faults
from walking stage 1 translation table and faults from translating
an IPA for a transaction.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-12-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds support for nested (combined) TLB entries.
The main function combine_tlb() is not used here but in the next
patches, but to simplify the patches it is introduced first.
Main changes:
1) New field added in the SMMUTLBEntry struct: parent_perm, for
nested TLB, holds the stage-2 permission, this can be used to know
the origin of a permission fault from a cached entry as caching
the “and” of the permissions loses this information.
SMMUPTWEventInfo is used to hold information about PTW faults so
the event can be populated, the value of stage used to be set
based on the current stage for TLB permission faults, however
with the parent_perm, it is now set based on which perm has
the missing permission
When nesting is not enabled it has the same value as perm which
doesn't change the logic.
2) As combined TLB implementation is used, the combination logic
chooses:
- tg and level from the entry which has the smallest addr_mask.
- Based on that the iova that would be cached is recalculated.
- Translated_addr is chosen from stage-2.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-11-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the next patch, combine_tlb() will be added which combines 2 TLB
entries into one for nested translations, which chooses the granule
and level from the smallest entry.
This means that with nested translation, an entry can be cached with
the granule of stage-2 and not stage-1.
However, currently, the lookup for an IOVA is done with input stage
granule, which is stage-1 for nested configuration, which will not
work with the above logic.
This patch reworks lookup in that case, so it falls back to stage-2
granule if no entry is found using stage-1 granule.
Also, drop aligning the iova to avoid over-aligning in case the iova
is cached with a smaller granule, the TLB lookup will align the iova
anyway for each granule and level, and the page table walker doesn't
consider the page offset bits.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-10-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to ARM SMMU architecture specification (ARM IHI 0070 F.b),
In "5.2 Stream Table Entry":
[51:6] S1ContextPtr
If Config[1] == 1 (stage 2 enabled), this pointer is an IPA translated by
stage 2 and the programmed value must be within the range of the IAS.
In "5.4.1 CD notes":
The translation table walks performed from TTB0 or TTB1 are always performed
in IPA space if stage 2 translations are enabled.
This patch implements translation of the S1 context descriptor pointer and
TTBx base addresses through the S2 stage (IPA -> PA)
smmuv3_do_translate() is updated to have one arg which is translation
class, this is useful to:
- Decide wether a translation is stage-2 only or use the STE config.
- Populate the class in case of faults, WALK_EABT is left unchanged
for stage-1 as it is always IN, while stage-2 would match the
used class (TT, IN, CD), this will change slightly when the ptw
supports nested translation as it can also issue TT event with
class IN.
In case for stage-2 only translation, used in the context of nested
translation, the stage and asid are saved and restored before and
after calling smmu_translate().
Translating CD or TTBx can fail for the following reasons:
1) Large address size: This is described in
(3.4.3 Address sizes of SMMU-originated accesses)
- For CD ptr larger than IAS, for SMMUv3.1, it can trigger either
C_BAD_STE or Translation fault, we implement the latter as it
requires no extra code.
- For TTBx, if larger than the effective stage 1 output address size, it
triggers C_BAD_CD.
2) Faults from PTWs (7.3 Event records)
- F_ADDR_SIZE: large address size after first level causes stage 2 Address
Size fault (Also in 3.4.3 Address sizes of SMMU-originated accesses)
- F_PERMISSION: Same as an address translation. However, when
CLASS == CD, the access is implicitly Data and a read.
- F_ACCESS: Same as an address translation.
- F_TRANSLATION: Same as an address translation.
- F_WALK_EABT: Same as an address translation.
These are already implemented in the PTW logic, so no extra handling
required.
As in CD and TTBx translation context, the iova is not known, setting
the InputAddr was removed from "smmuv3_do_translate" and set after
from "smmuv3_translate" with the new function "smmuv3_fixup_event"
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-9-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ASID and VMID used to be uint16_t in the translation config, however,
in other contexts they can be int as -1 in case of TLB invalidation,
to represent all (don’t care).
When stage-2 was added asid was set to -1 in stage-2 and vmid to -1
in stage-1 configs. However, that meant they were set as (65536),
this was not an issue as nesting was not supported and no
commands/lookup uses both.
With nesting, it’s critical to get this right as translation must be
tagged correctly with ASID/VMID, and with ASID=-1 meaning stage-2.
Represent ASID/VMID everywhere as int.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-7-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
smmuv3_translate() does everything from STE/CD parsing to TLB lookup
and PTW.
Soon, when nesting is supported, stage-1 data (tt, CD) needs to be
translated using stage-2.
Split smmuv3_translate() to 3 functions:
- smmu_translate(): in smmu-common.c, which does the TLB lookup, PTW,
TLB insertion, all the functions are already there, this just puts
them together.
This also simplifies the code as it consolidates event generation
in case of TLB lookup permission failure or in TT selection.
- smmuv3_do_translate(): in smmuv3.c, Calls smmu_translate() and does
the event population in case of errors.
- smmuv3_translate(), now calls smmuv3_do_translate() for
translation while the rest is the same.
Also, add stage in trace_smmuv3_translate_success()
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240715084519.1189624-6-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, translation stage is represented as an int, where 1 is stage-1 and
2 is stage-2, when nested is added, 3 would be confusing to represent nesting,
so we use an enum instead.
While keeping the same values, this is useful for:
- Doing tricks with bit masks, where BIT(0) is stage-1 and BIT(1) is
stage-2 and both is nested.
- Tracing, as stage is printed as int.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20240715084519.1189624-5-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SMMUv3 spec (ARM IHI 0070 F.b - 7.3 Event records) defines the
class of events faults as:
CLASS: The class of the operation that caused the fault:
- 0b00: CD, CD fetch.
- 0b01: TTD, Stage 1 translation table fetch.
- 0b10: IN, Input address
However, this value was not set and left as 0 which means CD and not
IN (0b10).
Another problem was that stage-2 class is considered IN not TT for
EABT, according to the spec:
Translation of an IPA after successful stage 1 translation (or,
in stage 2-only configuration, an input IPA)
- S2 == 1 (stage 2), CLASS == IN (Input to stage)
This would change soon when nested translations are supported.
While at it, add an enum for class as it would be used for nesting.
However, at the moment stage-1 and stage-2 use the same class values,
except for EABT.
Fixes: 9bde7f0674 “hw/arm/smmuv3: Implement translate callback”
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20240715084519.1189624-4-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For the following events (ARM IHI 0070 F.b - 7.3 Event records):
- F_TRANSLATION
- F_ACCESS
- F_PERMISSION
- F_ADDR_SIZE
If fault occurs at stage 2, S2 == 1 and:
- If translating an IPA for a transaction (whether by input to
stage 2-only configuration, or after successful stage 1 translation),
CLASS == IN, and IPA is provided.
At the moment only CLASS == IN is used which indicates input
translation.
However, this was not implemented correctly, as for stage 2, the code
only sets the S2 bit but not the IPA.
This field has the same bits as FetchAddr in F_WALK_EABT which is
populated correctly, so we don’t change that.
The setting of this field should be done from the walker as the IPA address
wouldn't be known in case of nesting.
For stage 1, the spec says:
If fault occurs at stage 1, S2 == 0 and:
CLASS == IN, IPA is UNKNOWN.
So, no need to set it to for stage 1, as ptw_info is initialised by zero in
smmuv3_translate().
Fixes: e703f7076a “hw/arm/smmuv3: Add page table walk for stage-2”
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Message-id: 20240715084519.1189624-3-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to the SMMU architecture specification (ARM IHI 0070 F.b),
in “3.4 Address sizes”
The address output from the translation causes a stage 1 Address Size
fault if it exceeds the range of the effective IPA size for the given CD.
However, this check was missing.
There is already a similar check for stage-2 against effective PA.
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Message-id: 20240715084519.1189624-2-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It is common practice when implementing double-buffering on VideoCore
to do so by multiplying the height of the virtual buffer by the
number of virtual screens desired (i.e., two - in the case of
double-bufferring).
At present, this won't work in QEMU because the logic in
fb_use_offsets require that both the virtual width and height exceed
their physical counterparts.
This appears to be unintentional/a typo and indeed the comment
states; "Experimentally, the hardware seems to do this only if the
viewport size is larger than the physical screen". The
viewport/virtual size would be larger than the physical size if
either virtual dimension were larger than their physical counterparts
and not necessarily both.
Signed-off-by: SamJakob <me@samjakob.com>
Message-id: 20240713160353.62410-1-me@samjakob.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit c1a1f80518 when we added the FEAT_LSE2 relaxations to
the alignment requirements for atomic and ordered loads and stores,
we didn't quite get it right for LDAPR/LDAPRH/LDAPRB with no
immediate offset. These instructions were handled in the old decoder
as part of disas_ldst_atomic(), but unlike all the other insns that
function decoded (LDADD, LDCLR, etc) these insns are "ordered", not
"atomic", so they should be using check_ordered_align() rather than
check_atomic_align(). Commit c1a1f80518 used
check_atomic_align() regardless for everything in
disas_ldst_atomic(). We then carried that incorrect check over in
the decodetree conversion, where LDAPR/LDAPRH/LDAPRB are now handled
by trans_LDAPR().
The effect is that when FEAT_LSE2 is implemented, these instructions
don't honour the SCTLR_ELx.nAA bit and will generate alignment
faults when they should not.
(The LDAPR insns with an immediate offset were in disas_ldst_ldapr_stlr()
and then in trans_LDAPR_i() and trans_STLR_i(), and have always used
the correct check_ordered_align().)
Use check_ordered_align() in trans_LDAPR().
Cc: qemu-stable@nongnu.org
Fixes: c1a1f80518 ("target/arm: Relax ordered/atomic alignment checks for LSE2")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240709134504.3500007-3-peter.maydell@linaro.org
When we converted the LDAPR/STLR instructions to decodetree we
accidentally introduced a regression where the offset is negative.
The 9-bit immediate field is signed, and the old hand decoder
correctly used sextract32() to get it out of the insn word,
but the ldapr_stlr_i pattern in the decode file used "imm:9"
instead of "imm:s9", so it treated the field as unsigned.
Fix the pattern to treat the field as a signed immediate.
Cc: qemu-stable@nongnu.org
Fixes: 2521b6073b ("target/arm: Convert LDAPR/STLR (imm) to decodetree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2419
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240709134504.3500007-2-peter.maydell@linaro.org
RISC-V PR for 9.1
* Support the zimop, zcmop, zama16b and zabha extensions
* Validate the mode when setting vstvec CSR
* Add decode support for Zawrs extension
* Update the KVM regs to Linux 6.10-rc5
* Add smcntrpmf extension support
* Raise an exception when CSRRS/CSRRC writes a read-only CSR
* Re-insert and deprecate 'riscv,delegate' in virt machine device tree
* roms/opensbi: Update to v1.5
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmaYeUcACgkQr3yVEwxT
# gBMtdw//U2NbmnmECa0uXuE7fdFul0tUkl2oHb9Cr8g5Se5g/HVFqexAKOFZ8Lcm
# DvTl94zJ2dms4RntcmJHwTIusa+oU6qqOekediotjgpeH4BHZNCOHe0E9hIAHn9F
# uoJ1P186L7VeVr7OFAAgSCE7F6egCk7iC0h8L8/vuL4xcuyfbZ2r7ybiTl1+45N2
# YBBv5/00wsYnyMeqRYYtyqgX9QR017JRqNSfTJSbKxhQM/L1GA1xxisUvIGeyDqc
# Pn8E3dMN6sscR6bPs4RP+SBi0JIlRCgth/jteSUkbYf42osw3/5sl4oK/e6Xiogo
# SjELOF7QJNxE8H6EUIScDaCVB5ZhvELZcuOL2NRdUuVDkjhWXM633HwfEcXkZdFK
# W/H9wOvNxPAJIOGXOpv10+MLmhdyIOZwE0uk6evHvdcTn3FP9DurdUCc1se0zKOA
# Qg/H6usTbLGNQ7KKTNQ6GpQ6u89iE1CIyZqYVvB1YuF5t7vtAmxvNk3SVZ6aq3VL
# lPJW2Zd1eO09Q+kRnBVDV7MV4OJrRNsU+ryd91NrSVo9aLADtyiNC28dCSkjU3Gn
# 6YQZt65zHuhH5IBB/PGIPo7dLRT8KNWOiYVoy3c6p6DC6oXsKIibh0ue1nrVnnVQ
# NRqyxPYaj6P8zzqwTk+iJj36UXZZVtqPIhtRu9MrO6Opl2AbsXI=
# =pM6B
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 Jul 2024 12:09:11 PM AEST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240718-1' of https://github.com/alistair23/qemu: (30 commits)
roms/opensbi: Update to v1.5
hw/riscv/virt.c: re-insert and deprecate 'riscv,delegate'
target/riscv: raise an exception when CSRRS/CSRRC writes a read-only CSR
target/riscv: Expose the Smcntrpmf config
target/riscv: Do not setup pmu timer if OF is disabled
target/riscv: More accurately model priv mode filtering.
target/riscv: Start counters from both mhpmcounter and mcountinhibit
target/riscv: Enforce WARL behavior for scounteren/hcounteren
target/riscv: Save counter values during countinhibit update
target/riscv: Implement privilege mode filtering for cycle/instret
target/riscv: Only set INH fields if priv mode is available
target/riscv: Add cycle & instret privilege mode filtering support
target/riscv: Add cycle & instret privilege mode filtering definitions
target/riscv: Add cycle & instret privilege mode filtering properties
target/riscv: Fix the predicate functions for mhpmeventhX CSRs
target/riscv: Combine set_mode and set_virt functions.
target/riscv/kvm: update KVM regs to Linux 6.10-rc5
disas/riscv: Add decode for Zawrs extension
target/riscv: Validate the mode in write_vstvec
disas/riscv: Support zabha disassemble
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit b1f1e9dcfa renamed 'riscv,delegate' to 'riscv,delegation' since
it is the correct name as per dt-bindings, and the absence of the
correct name will result in validation fails when dumping the dtb and
using dt-validate.
But this change has a side-effect: every other firmware available that
is AIA capable is using 'riscv,delegate', and it will fault/misbehave if
this property isn't present. The property was added back in QEMU 7.0,
meaning we have 2 years of firmware development using the wrong
property.
Re-introducing 'riscv,delegate' while keeping 'riscv,delegation' allows
older firmwares to keep booting with the 'virt' machine.
'riscv,delegate' is then marked for future deprecation with its use
being discouraged from now on.
Cc: Conor Dooley <conor@kernel.org>
Cc: Anup Patel <apatel@ventanamicro.com>
Fixes: b1f1e9dcfa ("hw/riscv/virt.c: aplic DT: rename prop to 'riscv, delegation'")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240715090455.145888-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Both CSRRS and CSRRC always read the addressed CSR and cause any read side
effects regardless of rs1 and rd fields. Note that if rs1 specifies a register
holding a zero value other than x0, the instruction will still attempt to write
the unmodified value back to the CSR and will cause any attendant side effects.
So if CSRRS or CSRRC tries to write a read-only CSR with rs1 which specifies
a register holding a zero value, an illegal instruction exception should be
raised.
Signed-off-by: Yu-Ming Chang <yumin686@andestech.com>
Signed-off-by: Alvin Chang <alvinga@andestech.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <172100444279.18077.6893072378718059541-0@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The timer is setup function is invoked in both hpmcounter
write and mcountinhibit write path. If the OF bit set, the
LCOFI interrupt is disabled. There is no benefitting in
setting up the qemu timer until LCOFI is cleared to indicate
that interrupts can be fired again.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-12-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
In case of programmable counters configured to count inst/cycles
we often end-up with counter not incrementing at all from kernel's
perspective.
For example:
- Kernel configures hpm3 to count instructions and sets hpmcounter
to -10000 and all modes except U mode are inhibited.
- In QEMU we configure a timer to expire after ~10000 instructions.
- Problem is, it's often the case that kernel might not even schedule
Umode task and we hit the timer callback in QEMU.
- In the timer callback we inject the interrupt into kernel, kernel
runs the handler and reads hpmcounter3 value.
- Given QEMU maintains individual counters to count for each privilege
mode, and given umode never ran, the umode counter didn't increment
and QEMU returns same value as was programmed by the kernel when
starting the counter.
- Kernel checks for overflow using previous and current value of the
counter and reprograms the counter given there wasn't an overflow
as per the counter value. (Which itself is a problem. We have QEMU
telling kernel that counter3 overflowed but the counter value
returned by QEMU doesn't seem to reflect that.).
This change makes sure that timer is reprogrammed from the handler
if the counter didn't overflow based on the counter value.
Second, this change makes sure that whenever the counter is read,
it's value is updated to reflect the latest count.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240711-smcntrpmf_v7-v8-11-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently we start timer counter from write_mhpmcounter path only
without checking for mcountinhibit bit. This changes adds mcountinhibit
check and also programs the counter from write_mcountinhibit as well.
When a counter is stopped using mcountinhibit we simply update
the value of the counter based on current host ticks and save
it for future reads.
We don't need to disable running timer as pmu_timer_trigger_irq
will discard the interrupt if the counter has been inhibited.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240711-smcntrpmf_v7-v8-10-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently, if a counter monitoring cycle/instret is stopped via
mcountinhibit we just update the state while the value is saved
during the next read. This is not accurate as the read may happen
many cycles after the counter is stopped. Ideally, the read should
return the value saved when the counter is stopped.
Thus, save the value of the counter during the inhibit update
operation and return that value during the read if corresponding bit
in mcountihibit is set.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-8-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Privilege mode filtering can also be emulated for cycle/instret by
tracking host_ticks/icount during each privilege mode switch. This
patch implements that for both cycle/instret and mhpmcounters. The
first one requires Smcntrpmf while the other one requires Sscofpmf
to be enabled.
The cycle/instret are still computed using host ticks when icount
is not enabled. Otherwise, they are computed using raw icount which
is more accurate in icount mode.
Co-Developed-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-7-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Combining riscv_cpu_set_virt_enabled() and riscv_cpu_set_mode()
functions. This is to make complete mode change information
available through a single function.
This allows to easily differentiate between HS->VS, VS->HS
and VS->VS transitions when executing state update codes.
For example: One use-case which inspired this change is
to update mode-specific instruction and cycle counters
which requires information of both prev mode and current
mode.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240711-smcntrpmf_v7-v8-1-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Zama16b is the property that misaligned load/stores/atomics within
a naturally aligned 16-byte region are atomic.
According to the specification, Zama16b applies only to AMOs, loads
and stores defined in the base ISAs, and loads and stores of no more
than XLEN bits defined in the F, D, and Q extensions. Thus it should
not apply to zacas or RVC instructions.
For an instruction in that set, if all accessed bytes lie within 16B granule,
the instruction will not raise an exception for reasons of address alignment,
and the instruction will give rise to only one memory operation for the
purposes of RVWMO—i.e., it will execute atomically.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-6-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Zcmop defines eight 16-bit MOP instructions named C.MOP.n, where n is
an odd integer between 1 and 15, inclusive. C.MOP.n is encoded in
the reserved encoding space corresponding to C.LUI xn, 0.
Unlike the MOPs defined in the Zimop extension, the C.MOP.n instructions
are defined to not write any register.
In current implementation, C.MOP.n only has an check function, without any
other more behavior.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Message-ID: <20240709113652.1239-4-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Zimop extension defines an encoding space for 40 MOPs.The Zimop
extension defines 32 MOP instructions named MOP.R.n, where n is
an integer between 0 and 31, inclusive. The Zimop extension
additionally defines 8 MOP instructions named MOP.RR.n, where n
is an integer between 0 and 7.
These 40 MOPs initially are defined to simply write zero to x[rd],
but are designed to be redefined by later extensions to perform some
other action.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Message-ID: <20240709113652.1239-2-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open(). And considering
rng_random_opened() will lose its obvious error handling case after
removing error_setg_file_open(), add comment to remind here.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Amit Shah <amit@kernel.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
(mjt: drop superfluous commit as suggested by philmd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open().
Cc: David Hildenbrand <david@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open().
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: "Cédric Le Goater" <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open(). And considering the SGX
enablement description is useful, convert it into a error message hint.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The short-form boolean options has been deprecated since v6.0 (refer
to docs/about/deprecated.rst).
Update the description and example of boolean fields in l2tpv3 option to
avoid deprecation warning.
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Existing code was long, unclear and twisty.
This also relaxes the rules a tiny bit: allows to have
whitespace before header name and colon and makes the
header value match to be case-insensitive.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Fully eliminate the "Example" sections in QAPI doc blocks now that they
have all been converted to arbitrary rST syntax using the
".. qmp-example::" directive. Update tests to match.
Migrating to the new syntax
---------------------------
The old "Example:" or "Examples:" section syntax is now caught as an
error, but "Example::" is stil permitted as explicit rST syntax for an
un-lexed, generic preformatted text block.
('Example' is not special in this case, any sentence that ends with "::"
will start an indented code block in rST.)
Arbitrary rST for Examples is now possible, but it's strongly
recommended that documentation authors use the ".. qmp-example::"
directive for consistent visual formatting in rendered HTML docs. The
":title:" directive option may be used to add extra information into the
title bar for the example. The ":annotated:" option can be used to write
arbitrary rST instead, with nested "::" blocks applying QMP formatting
where desired.
Other choices available are ".. code-block:: QMP" which will not create
an "Example:" box, or the short-form "::" code-block syntax which will
not apply QMP highlighting when used outside of the qmp-example
directive.
Why?
----
This patch has several benefits:
1. Example sections can now be written more arbitrarily, mixing
explanatory paragraphs and code blocks however desired.
2. Example sections can now use fully arbitrary rST.
3. All code blocks are now lexed and validated as QMP; increasing
usability of the docs and ensuring validity of example snippets.
(To some extent - This patch only gaurantees it lexes correctly, not
that it's valid under the JSON or QMP grammars. It will catch most
small mistakes, however.)
4. Each qmp-example can be titled or annotated independently without
bypassing the QMP lexer/validator.
(i.e. code blocks are now for *code* only, so we don't have to
sacrifice exposition for having lexically valid examples.)
NOTE: As with the "Notes" conversion (d461c27973), this patch (and the
three preceding) may change the rendering order for Examples in
the current generator. The forthcoming qapidoc rewrite will fix
this by always generating documentation in source order.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-ID: <20240717021312.606116-10-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
These examples require longer explanations or have explanations that
require markup to look reasonable when rendered and so use the longer
form of the ".. qmp-example::" directive.
By using the :annotated: option, the content in the example block is
assumed *not* to be a code block literal and is instead parsed as normal
rST - with the exception that any code literal blocks after `::` will
assumed to be a QMP code literal block.
Note: There's one title-less conversion in this patch that comes along
for the ride because it's part of a larger "Examples" block that was
better to convert all at once.
See commit-5: "docs/qapidoc: create qmp-example directive", for a
detailed explanation of this custom directive syntax.
See commit+1: "qapi: remove "Example" doc section" for a detailed
explanation of why.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240717021312.606116-9-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When an Example section has a brief explanation, convert it to a
qmp-example:: section using the :title: option.
Rule of thumb: If the title can fit on a single line and requires no rST
markup, it's a good candidate for using the :title: option of
qmp-example.
In this patch, trailing punctuation is removed from the title section
for consistent headline aesthetics. In just one case, specifics of the
example are removed to make the title read better.
See commit-4: "docs/qapidoc: create qmp-example directive", for a
detailed explanation of this custom directive syntax.
See commit+2: "qapi: remove "Example" doc section" for a detailed
explanation of why.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240717021312.606116-8-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Use the no-option form of ".. qmp-example::" to convert any Examples
that do not have any form of caption or explanation whatsoever. Note
that in a few cases, example sections are split into two or more
separate example blocks. This is only done stylistically to create a
delineation between two or more logically independent examples.
See commit-3: "docs/qapidoc: create qmp-example directive", for a
detailed explanation of this custom directive syntax.
See commit+3: "qapi: remove "Example" doc section" for a detailed
explanation of why.
Note: an empty "TODO" line was added to announce-self to keep the
example from floating up into the body; this will be addressed more
rigorously in the new qapidoc generator.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-ID: <20240717021312.606116-7-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Markup fixed in one place]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
For any code literal blocks inside of a qmp-example directive, apply and
enforce the QMP lexer/highlighter to those blocks.
This way, you won't need to write:
```
.. qmp-example::
:annotated:
Blah blah
.. code-block:: QMP
-> { "lorem": "ipsum" }
```
But instead, simply:
```
.. qmp-example::
:annotated:
Blah blah::
-> { "lorem": "ipsum" }
```
Once the directive block is exited, whatever the previous default
highlight language was will be restored; localizing the forced QMP
lexing to exclusively this directive.
Note, if the default language is *already* QMP, this directive will not
generate and restore redundant highlight configuration nodes. We may
well decide that the default language ought to be QMP for any QAPI
reference pages, but this way the directive behaves consistently no
matter where it is used.
Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240717021312.606116-5-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This is a directive that creates a syntactic sugar for creating
"Example" boxes very similar to the ones already used in the bitmaps.rst
document, please see e.g.
https://www.qemu.org/docs/master/interop/bitmaps.html#creation-block-dirty-bitmap-add
In its simplest form, when a custom title is not needed or wanted, and
the example body is *solely* a QMP example:
```
.. qmp-example::
{body}
```
is syntactic sugar for:
```
.. admonition:: Example:
.. code-block:: QMP
{body}
```
When a custom, plaintext title that describes the example is desired,
this form:
```
.. qmp-example::
:title: Defrobnification
{body}
```
Is syntactic sugar for:
```
.. admonition:: Example: Defrobnification
.. code-block:: QMP
{body}
```
Lastly, when Examples are multi-step processes that require non-QMP
exposition, have lengthy titles, or otherwise involve prose with rST
markup (lists, cross-references, etc), the most complex form:
```
.. qmp-example::
:annotated:
This example shows how to use `foo-command`::
{body}
For more information, please see `frobnozz`.
```
Is desugared to:
```
.. admonition:: Example:
This example shows how to use `foo-command`::
{body}
For more information, please see `frobnozz`.
```
Note that :annotated: and :title: options can be combined together, if
desired.
The primary benefit here being documentation source consistently using
the same directive for all forms of examples to ensure consistent visual
styling, and ensuring all relevant prose is visually grouped alongside
the code literal block.
Note that as of this commit, the code-block rST syntax "::" does not
apply QMP highlighting; you would need to use ".. code-block:: QMP". The
very next commit changes this behavior to assume all "::" code blocks
within this directive are QMP blocks.
Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240717021312.606116-4-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Doc comments are reference documentation for users of QMP.
SpiceQueryMouseMode's doc comment contains a note explaining why it's
not named SpiceMouseMode: spice/enums.h has it already. Irrelevant
for users of QMP; delete the note.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240711112228.2140606-6-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Doc comments are reference documentation for users of QMP.
SocketAddress's doc comment contains a deprecation note advising
developers to use SocketAddress for new code. Irrelevant for users of
QMP. Move the note out of the doc comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240711112228.2140606-5-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When no UUID has been specified, query-uuid returns
{"UUID": "00000000-0000-0000-0000-000000000000"}
The doc comment calls this "a null UUID", which I find less than
clear. RFC 9562 calls it "the nil UUID (all zeroes)", so use that
instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240711112228.2140606-4-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[Wording improved, commit message adjusted]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
CpuInstanceProperties' doc comment describes its members as properties
to be passed to device_add when hot-plugging a CPU.
This was in fact the initial use of this type, with
query-hotpluggable-cpus: letting management applications find out what
properties need to be passed with device_add to hot-plug a CPU.
We've since added other uses: set-numa-node (commit 419fcdec3c and
f3be67812c), and query-cpus-fast (commit ce74ee3dea). These are not
about device-add.
query-hotpluggable-cpus uses CpuInstanceProperties within
HotpluggableCPU. Lift the documentation related to device-add from
CpuInstanceProperties to HotpluggableCPU.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240711112228.2140606-3-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
PciDeviceInfo's doc comment has a note on PciDeviceClass member @desc.
Since the note applies always, not just within PciDeviceInfo, merge it
into PciDeviceClass's description of member @desc.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240711112228.2140606-2-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
* target/i386/tcg: fixes for seg_helper.c
* SEV: Don't allow automatic fallback to legacy KVM_SEV_INIT,
but also don't use it by default
* scsi: honor bootindex again for legacy drives
* hpet, utils, scsi, build, cpu: miscellaneous bugfixes
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaWoP0UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOqfggAg3jxUp6B8dFTEid5aV6qvT4M6nwD
# TAYcAl5kRqTOklEmXiPCoA5PeS0rbr+5xzWLAKgkumjCVXbxMoYSr0xJHVuDwQWv
# XunUm4kpxJBLKK3uTGAIW9A21thOaA5eAoLIcqu2smBMU953TBevMqA7T67h22rp
# y8NnZWWdyQRH0RAaWsCBaHVkkf+DuHSG5LHMYhkdyxzno+UWkTADFppVhaDO78Ba
# Egk49oMO+G6of4+dY//p1OtAkAf4bEHePKgxnbZePInJrkgHzr0TJWf9gERWFzdK
# JiM0q6DeqopZm+vENxS+WOx7AyDzdN0qOrf6t9bziXMg0Rr2Z8bu01yBCQ==
# =cZhV
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 17 Jul 2024 02:34:05 AM AEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386/tcg: save current task state before loading new one
target/i386/tcg: use X86Access for TSS access
target/i386/tcg: check for correct busy state before switching to a new task
target/i386/tcg: Compute MMU index once
target/i386/tcg: Introduce x86_mmu_index_{kernel_,}pl
target/i386/tcg: Reorg push/pop within seg_helper.c
target/i386/tcg: use PUSHL/PUSHW for error code
target/i386/tcg: Allow IRET from user mode to user mode with SMAP
target/i386/tcg: Remove SEG_ADDL
target/i386/tcg: fix POP to memory in long mode
hpet: fix HPET_TN_SETVAL for high 32-bits of the comparator
hpet: fix clamping of period
docs: Update description of 'user=username' for '-run-with'
qemu/timer: Add host ticks function for LoongArch
scsi: fix regression and honor bootindex again for legacy drives
hw/scsi/lsi53c895a: bump instruction limit in scripts processing to fix regression
disas: Fix build against Capstone v6
cpu: Free queued CPU work
Revert "qemu-char: do not operate on sources from finalize callbacks"
i386/sev: Don't allow automatic fallback to legacy KVM_SEV*_INIT
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Misc HW & UI patches queue
- Allow loading safely ROMs larger than 4GiB (Gregor)
- Convert vt82c686 IRQ as named 'intr' (Bernhard)
- Clarify QDev GPIO API (Peter)
- Drop unused load_image_gzipped function (Ani)
- Make TCGCPUOps::cpu_exec_interrupt handler mandatory (Peter)
- Factor cpu_pause() out (Nicholas)
- Remove transfer size check from ESP DMA DATA IN / OUT transfers (Mark)
- Add accelerated cursor composition to Cocoa UI (Akihiko)
- Fix '-vga help' CLI (Marc-André)
- Fix displayed errno in ram_block_add (Zhenzhong)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmaWto0ACgkQ4+MsLN6t
# wN54fBAAwfhSQ9PKTYNlnsmJteXAsPCUg8KZwRblkAZs1z/xJX/sFKJF3PZ8fn4r
# Ty+Fiu4Sylfv19mTc/8Bc8pKfHn9zwY7Kb/H5kHjEuFwEZolODHXO8znRV621iZq
# PAeI64dVo5yIgqlAnf6xPSITwe2f75IS0ivIIKYwFsPqeGMUl6dvh/5xqoxis/hQ
# j/1hFLe+jX4whIcOFcqbR3oV3CZy+nMBLJH1/OtvKJ5aC8vFxt5xsKM0xkG94Pmx
# iYhVx4yjULRSSLMaRowqHqEtPB0pmYyuxz0CwjlcI8PU+gUa+dsZLOomD8YenmJR
# FQubQJOKkqlvQ8j7+2okwQs3NDW1TzwsYnvJKB3+EE+DD3Wq/ny5D0eMcnn5NW1Z
# 7rO624XhkvLsJlTJzVvuzpulmC+UFb/6S8CyStGPDxWCGrU3WqdZeoqbbhmXzacU
# ck17Cs2Ma4k0OIRYgAVdnwq96cuQCFNNzNq/iakcJs5Lsaa6Cai/YByKf1tBaGRm
# d/mJgN7WAJrOSpiRhNuNlay4O+hX0rn+wLwecbKW9sbKuoo9eHjzi8YAQuw/TVYr
# oMF/McqtWFCUyVt0eHtA3C+1dSW4+qQTDQSvabbXx54otRSEnMSEubgYFsdu3hF4
# P0mZyxPg4nPxy3uoz9hVQ63F45quaXX/B2fwvoYSBl58xuyxY6M=
# =rOg6
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 17 Jul 2024 04:06:05 AM AEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240716' of https://github.com/philmd/qemu:
system/physmem: use return value of ram_block_discard_require() as errno
vl: fix "type is NULL" in -vga help
ui/console: Remove dpy_cursor_define_supported()
ui/cocoa: Add cursor composition
ui/console: Convert mouse visibility parameter into bool
ui/cocoa: Release CGColorSpace
esp: remove transfer size check from DMA DATA IN and DATA OUT transfers
system/cpus: Add cpu_pause() function
accel/tcg: Make cpu_exec_interrupt hook mandatory
loader: remove load_image_gzipped function as its not used anywhere
include/hw/qdev-core.h: Correct and clarify gpio doc comments
hw/isa/vt82c686: Turn "intr" irq into a named gpio
hw/core/loader: allow loading larger ROMs
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
switch operation in mmc cards, updated the ext_csd register to
request changes in card operations. Here we implement similar
sequence but requests are mostly dummy and make no change.
Implement SWITCH_ERROR if the write operation offset goes beyond
length of ext_csd.
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMD: Convert to SDProto handlers, add trace events]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240712162719.88165-11-philmd@linaro.org>
Avoid hardcoding 1MiB boot size in EXT_CSD_BOOT_MULT,
expose it as 'boot-partition-size' QOM property.
By default, do not use any size. The board is responsible
to set the boot partition size property.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240712162719.88165-10-philmd@linaro.org>
The parameters mimick a real 4GB eMMC, but it can be set to various
sizes.
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
EXT_CSD values from Vincent's patch simplivied for Spec v4.3:
- Remove deprecated keys:
. EXT_CSD_SEC_ERASE_MULT
. EXT_CSD_SEC_TRIM_MULT
- Set some keys to not defined / implemented:
. EXT_CSD_HPI_FEATURES
. EXT_CSD_BKOPS_SUPPORT
. EXT_CSD_SEC_FEATURE_SUPPORT
. EXT_CSD_ERASE_TIMEOUT_MULT
. EXT_CSD_PART_SWITCH_TIME
. EXT_CSD_OUT_OF_INTERRUPT_TIME
- Simplify:
. EXT_CSD_ACC_SIZE (6 -> 1)
16KB of super_page_size -> 512B (BDRV_SECTOR_SIZE)
. EXT_CSD_HC_ERASE_GRP_SIZE (4 -> 1)
. EXT_CSD_HC_WP_GRP_SIZE (4 -> 1)
. EXT_CSD_S_C_VCC[Q] (8 -> 1)
. EXT_CSD_S_A_TIMEOUT (17 -> 1)
. EXT_CSD_CARD_TYPE (7 -> 3)
Dual data rate -> High-Speed mode
- Update:
. EXT_CSD_CARD_TYPE (7 -> 3)
High-Speed MultiMediaCard @ 26MHz & 52MHz
. Performances (0xa -> 0x46)
Class B at 3MB/s. -> Class J at 21MB/s
. EXT_CSD_REV (5 -> 3)
Rev 1.5 (spec v4.41) -> Rev 1.3 (spec v4.3)
- Use load/store API to set EXT_CSD_SEC_CNT
- Remove R/W keys, normally zeroed at reset
. EXT_CSD_BOOT_INFO
Migrate the Modes segment (192 lower bytes) but not the
full EXT_CSD register, see Spec v4.3, chapter 8.4
"Extended CSD register":
The Extended CSD register defines the card properties
and selected modes. It is 512 bytes long. The most
significant 320 bytes are the Properties segment, which
defines the card capabilities and cannot be modified by
the host. The lower 192 bytes are the Modes segment,
which defines the configuration the card is working in.
These modes can be changed by the host by means of the
SWITCH command.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240712162719.88165-9-philmd@linaro.org>
Since eMMC are soldered on boards, it is not user-creatable.
RCA register is initialized to 0x0001, per spec v4.3,
chapter 8.5 "RCA register":
The default value of the RCA register is 0x0001.
The value 0x0000 is reserved to set all cards into
the Stand-by State with CMD7.
The CSD register is very similar to SD one, except
the version announced is v4.3.
eMMC CID register is slightly different from SD:
- One extra PNM (5 -> 6)
- MDT is only 1 byte (2 -> 1).
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240712162719.88165-2-philmd@linaro.org>
When ram_block_discard_require() fails, errno is passed to error_setg_errno().
It's a stale value or 0 which is unrelated to ram_block_discard_require().
As ram_block_discard_require() already returns -EBUSY in failure case,
use it as errno for error_setg_errno().
Fixes: 852f0048f3 ("make guest_memfd require uncoordinated discard")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240716064213.290696-1-zhenzhong.duan@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Remove dpy_cursor_define_supported() as it brings no benefit today and
it has a few inherent problems.
All graphical displays except egl-headless support cursor composition
without DMA-BUF, and egl-headless is meant to be used in conjunction
with another graphical display, so dpy_cursor_define_supported()
always returns true and meaningless.
Even if we add a new display without cursor composition in the future,
dpy_cursor_define_supported() will be problematic as a cursor display
fix for it because some display devices like virtio-gpu cannot tell the
lack of cursor composition capability to the guest and are unable to
utilize the value the function returns. Therefore, all non-headless
graphical displays must actually implement cursor composition for
correct cursor display.
Another problem with dpy_cursor_define_supported() is that it returns
true even if only some of the display listeners support cursor
composition, which is wrong unless all display listeners that lack
cursor composition is headless.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240715-cursor-v3-4-afa5b9492dbf@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add accelerated cursor composition to ui/cocoa. This does not only
improve performance for display devices that exposes the capability to
the guest according to dpy_cursor_define_supported(), but fixes the
cursor display for devices that unconditionally expects the availability
of the capability (e.g., virtio-gpu).
The common pattern to implement accelerated cursor composition is to
replace the cursor and warp it so that the replaced cursor is shown at
the correct position on the guest display for relative pointer devices.
Unfortunately, ui/cocoa cannot do the same because warping the cursor
position interfers with the mouse input so it uses CALayer instead;
although it is not specialized for cursor composition, it still can
compose images with hardware acceleration.
Co-authored-by: Phil Dennis-Jordan <phil@philjordan.eu>
Tested-by: Phil Dennis-Jordan <phil@philjordan.eu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240715-cursor-v3-3-afa5b9492dbf@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The transfer size check was originally added to prevent consecutive DMA TI
commands from causing an assert() due to an existing SCSI request being in
progress, but since the last set of updates [*] this is no longer required.
Remove the transfer size check from DMA DATA IN and DATA OUT transfers so
that issuing a DMA TI command when there is no data left to transfer does
not cause an assert() due to an existing SCSI request being in progress.
[*] See commits f3ace75be8..78d68f312a
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2415
Message-ID: <20240713224249.468084-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The TCGCPUOps::cpu_exec_interrupt hook is currently not mandatory; if
it is left NULL then we treat it as if it had returned false. However
since pretty much every architecture needs to handle interrupts,
almost every target we have provides the hook. The one exception is
Tricore, which doesn't currently implement the architectural
interrupt handling.
Add a "do nothing" implementation of cpu_exec_hook for Tricore,
assert on startup that the CPU does provide the hook, and remove
the runtime NULL check before calling it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240712113949.4146855-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The doc comments for the functions for named GPIO inputs and
outputs had a couple of problems:
* some copy-and-paste errors meant the qdev_connect_gpio_out_named()
doc comment had references to input GPIOs that should be to
output GPIOs
* it wasn't very clear that named GPIOs are arrays and so the
connect functions specify a single GPIO line by giving both
the name of the array and the index within that array
Fix the copy-and-paste errors and slightly expand the text
to say that functions are connecting one line in a named GPIO
array, not a single named GPIO line.
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240708153312.3109380-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The read() syscall is not guaranteed to return all data from a file. The
default ROM loader implementation currently does not take this into account,
instead failing if all bytes are not read at once. This change loads the ROM
using g_file_get_contents() instead, which correctly reads all data using
multiple calls to read() while also returning the loaded ROM size.
Signed-off-by: Gregor Haas <gregorhaas1997@gmail.com>
Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240628182706.99525-1-gregorhaas1997@gmail.com>
[PMD: Use gsize with g_file_get_contents()]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This is how the steps are ordered in the manual. EFLAGS.NT is
overwritten after the fact in the saved image.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This takes care of probing the vaddr range in advance, and is also faster
because it avoids repeated TLB lookups. It also matches the Intel manual
better, as it says "Checks that the current (old) TSS, new TSS, and all
segment descriptors used in the task switch are paged into system memory";
note however that it's not clear how the processor checks for segment
descriptors, and this check is not included in the AMD manual.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This step is listed in the Intel manual: "Checks that the new task is available
(call, jump, exception, or interrupt) or busy (IRET return)".
The AMD manual lists the same operation under the "Preventing recursion"
paragraph of "12.3.4 Nesting Tasks", though it is not clear if the processor
checks the busy bit in the IRET case.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the MMU index to the StackAccess struct, so that it can be cached
or (in the next patch) computed from information that is not in
CPUX86State.
Co-developed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Interrupts and call gates should use accesses with the DPL as
the privilege level. While computing the applicable MMU index
is easy, the harder thing is how to plumb it in the code.
One possibility could be to add a single argument to the PUSH* macros
for the privilege level, but this is repetitive and risks confusion
between the involved privilege levels.
Another possibility is to pass both CPL and DPL, and adjusting both
PUSH* and POP* to use specific privilege levels (instead of using
cpu_{ld,st}*_data). This makes the code more symmetric.
However, a more complicated but much nicer approach is to use a structure
to contain the stack parameters, env, unwind return address, and rewrite
the macros into functions. The struct provides an easy home for the MMU
index as well.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not pre-decrement esp, let the macros subtract the appropriate
operand size.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes a bug wherein i386/tcg assumed an interrupt return using
the IRET instruction was always returning from kernel mode to either
kernel mode or user mode. This assumption is violated when IRET is used
as a clever way to restore thread state, as for example in the dotnet
runtime. There, IRET returns from user mode to user mode.
This bug is that stack accesses from IRET and RETF, as well as accesses
to the parameters in a call gate, are normal data accesses using the
current CPL. This manifested itself as a page fault in the guest Linux
kernel due to SMAP preventing the access.
This bug appears to have been in QEMU since the beginning.
Analyzed-by: Robert R. Henry <rrh.henry@gmail.com>
Co-developed-by: Robert R. Henry <rrh.henry@gmail.com>
Signed-off-by: Robert R. Henry <rrh.henry@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In long mode, POP to memory will write a full 64-bit value. However,
the call to gen_writeback() in gen_POP will use MO_32 because the
decoding table is incorrect.
The bug was latent until commit aea49fbb01 ("target/i386: use gen_writeback()
within gen_POP()", 2024-06-08), and then became visible because gen_op_st_v
now receives op->ot instead of the "ot" returned by gen_pop_T0.
Analyzed-by: Clément Chigot <chigot@adacore.com>
Fixes: 5e9e21bcc4 ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07)
Tested-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 3787324101 ("hpet: Fix emulation of HPET_TN_SETVAL (Jan Kiszka)",
2009-04-17) applied the fix only to the low 32-bits of the comparator, but
it should be done for the high bits as well. Otherwise, the high 32-bits
of the comparator cannot be written and they remain fixed to 0xffffffff.
Co-developed-by: TaiseiIto <taisei1212@outlook.jp>
Signed-off-by: TaiseiIto <taisei1212@outlook.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When writing a new period, the clamping should use a maximum value
rather tyhan a bit mask. Also, when writing the high bits new_val
is shifted right by 32, so the maximum allowed period should also
be shifted right.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 3089637461 ("scsi: Don't ignore most usb-storage properties")
removed the call to object_property_set_int() and thus the 'set'
method for the bootindex property was also not called anymore. Here
that method is device_set_bootindex() (as configured by
scsi_dev_instance_init() -> device_add_bootindex_property()) which as
a side effect registers the device via add_boot_device_path().
As reported by a downstream user [0], the bootindex property did not
have the desired effect anymore for legacy drives. Fix the regression
by explicitly calling the add_boot_device_path() function after
checking that the bootindex is not yet used (to avoid
add_boot_device_path() calling exit()).
[0]: https://forum.proxmox.com/threads/149772/post-679433
Cc: qemu-stable@nongnu.org
Fixes: 3089637461 ("scsi: Don't ignore most usb-storage properties")
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.kernel.org/r/20240710152529.1737407-1-f.ebner@proxmox.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Capstone v6 made major changes, such as renaming for AArch64, which
broke programs using the old headers, like QEMU. However, Capstone v6
provides the CAPSTONE_AARCH64_COMPAT_HEADER compatibility definition
allowing to build against v6 with the old definitions, so fix the QEMU
build using it.
We can lift that definition and switch to the new naming once our
supported distros have Capstone v6 in place.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240715213943.1210355-1-gustavo.romero@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 2b316774f6.
After 038b421788 ("Revert "chardev: use a child source for qio input
source"") we've been observing the "iwp->src == NULL" assertion
triggering periodically during the initial capabilities querying by
libvirtd. One of possible backtraces:
Thread 1 (Thread 0x7f16cd4f0700 (LWP 43858)):
0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
1 0x00007f16c6c21e65 in __GI_abort () at abort.c:79
2 0x00007f16c6c21d39 in __assert_fail_base at assert.c:92
3 0x00007f16c6c46e86 in __GI___assert_fail (assertion=assertion@entry=0x562e9bcdaadd "iwp->src == NULL", file=file@entry=0x562e9bcdaac8 "../chardev/char-io.c", line=line@entry=99, function=function@entry=0x562e9bcdab10 <__PRETTY_FUNCTION__.20549> "io_watch_poll_finalize") at assert.c:101
4 0x0000562e9ba20c2c in io_watch_poll_finalize (source=<optimized out>) at ../chardev/char-io.c:99
5 io_watch_poll_finalize (source=<optimized out>) at ../chardev/char-io.c:88
6 0x00007f16c904aae0 in g_source_unref_internal () from /lib64/libglib-2.0.so.0
7 0x00007f16c904baf9 in g_source_destroy_internal () from /lib64/libglib-2.0.so.0
8 0x0000562e9ba20db0 in io_remove_watch_poll (source=0x562e9d6720b0) at ../chardev/char-io.c:147
9 remove_fd_in_watch (chr=chr@entry=0x562e9d5f3800) at ../chardev/char-io.c:153
10 0x0000562e9ba23ffb in update_ioc_handlers (s=0x562e9d5f3800) at ../chardev/char-socket.c:592
11 0x0000562e9ba2072f in qemu_chr_fe_set_handlers_full at ../chardev/char-fe.c:279
12 0x0000562e9ba207a9 in qemu_chr_fe_set_handlers at ../chardev/char-fe.c:304
13 0x0000562e9ba2ca75 in monitor_qmp_setup_handlers_bh (opaque=0x562e9d4c2c60) at ../monitor/qmp.c:509
14 0x0000562e9bb6222e in aio_bh_poll (ctx=ctx@entry=0x562e9d4c2f20) at ../util/async.c:216
15 0x0000562e9bb4de0a in aio_poll (ctx=0x562e9d4c2f20, blocking=blocking@entry=true) at ../util/aio-posix.c:722
16 0x0000562e9b99dfaa in iothread_run (opaque=0x562e9d4c26f0) at ../iothread.c:63
17 0x0000562e9bb505a4 in qemu_thread_start (args=0x562e9d4c7ea0) at ../util/qemu-thread-posix.c:543
18 0x00007f16c70081ca in start_thread (arg=<optimized out>) at pthread_create.c:479
19 0x00007f16c6c398d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
io_remove_watch_poll(), which makes sure that iwp->src is NULL, calls
g_source_destroy() which finds that iwp->src is not NULL in the finalize
callback. This can only happen if another thread has managed to trigger
io_watch_poll_prepare() callback in the meantime.
Move iwp->src destruction back to the finalize callback to prevent the
described race, and also remove the stale comment. The deadlock glib bug
was fixed back in 2010 by b35820285668 ("gmain: move finalization of
GSource outside of context lock").
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergey Dyasli <sergey.dyasli@nutanix.com>
Link: https://lore.kernel.org/r/20240712092659.216206-1-sergey.dyasli@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently if the 'legacy-vm-type' property of the sev-guest object is
'on', QEMU will attempt to use the newer KVM_SEV_INIT2 kernel
interface in conjunction with the newer KVM_X86_SEV_VM and
KVM_X86_SEV_ES_VM KVM VM types.
This can lead to measurement changes if, for instance, an SEV guest was
created on a host that originally had an older kernel that didn't
support KVM_SEV_INIT2, but is booted on the same host later on after the
host kernel was upgraded.
Instead, if legacy-vm-type is 'off', QEMU should fail if the
KVM_SEV_INIT2 interface is not provided by the current host kernel.
Modify the fallback handling accordingly.
In the future, VMSA features and other flags might be added to QEMU
which will require legacy-vm-type to be 'off' because they will rely
on the newer KVM_SEV_INIT2 interface. It may be difficult to convey to
users what values of legacy-vm-type are compatible with which
features/options, so as part of this rework, switch legacy-vm-type to a
tri-state OnOffAuto option. 'auto' in this case will automatically
switch to using the newer KVM_SEV_INIT2, but only if it is required to
make use of new VMSA features or other options only available via
KVM_SEV_INIT2.
Defining 'auto' in this way would avoid inadvertantly breaking
compatibility with older kernels since it would only be used in cases
where users opt into newer features that are only available via
KVM_SEV_INIT2 and newer kernels, and provide better default behavior
than the legacy-vm-type=off behavior that was previously in place, so
make it the default for 9.1+ machine types.
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
cc: kvm@vger.kernel.org
Signed-off-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/20240710041005.83720-1-michael.roth@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function ufs_is_mcq_reg() and ufs_is_mcq_op_reg() only evaluated
the range of the mcq_reg and mcq_op_reg offset, which is defined as
a constant. Therefore, it was possible for them to return true
even though the ufs device is configured to not support the mcq.
This could cause ufs_mmio_read()/ufs_mmio_write() to result in
Null-pointer-dereference.
So fix it.
Resolves: #2428
Fixes: 5c079578d2 ("hw/ufs: Add support MCQ of UFSHCI 4.0")
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
With RHEL 8 support retired (It's been two years since RHEL9 released),
our very oldest build platform version of Sphinx is now 3.4.3; and
keeping backwards compatibility for versions as old as v1.6 when using
domain extensions is a lot of work we don't need to do.
This patch is motivated by my work creating a new QAPI domain, which
unlike the dbus documentation, cannot be allowed to regress by creating
a "dummy" doc when operating under older sphinx versions. Easier is to
raise our minimum version as far as we can push it forwards, reducing my
burden in creating cross-compatibility hacks and patches.
A sampling of sphinx versions from various distributions, courtesy
https://repology.org/project/python:sphinx/versions
Alpine 3.16: v4.3.0 (QEMU support ended 2024-05-23)
Alpine 3.17: v5.3.0
Alpine 3.18: v6.1.3
Alpine 3.19: v6.2.1
Ubuntu 20.04 LTS: EOL
Ubuntu 22.04 LTS: v4.3.2
Ubuntu 22.10: EOL
Ubuntu 23.04: EOL
Ubuntu 23.10: v5.3.0
Ubuntu 24.04 LTS: v7.2.6
Debian 11: v3.4.3 (QEMU support ends 2024-07-xx)
Debian 12: v5.3.0
Fedora 38: EOL
Fedora 39: v6.2.1
Fedora 40: v7.2.6
CentOS Stream 8: v1.7.6 (QEMU support ended 2024-05-17)
CentOS Stream 9: v3.4.3
OpenSUSE Leap 15.4: EOL
OpenSUSE Leap 15.5: 2.3.1, 4.2.0 and 7.2.6
RHEL9 / CentOS Stream 9 becomes the new defining factor in staying at
Sphinx 3.4.3 due to downstream offline build requirements that force us
to use platform Sphinx instead of newer packages from PyPI.
Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240703175235.239004-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Python 3.13 is in beta and Fedora 41 is preparing to make it the default
system interpreter; enable testing for it.
(In the event problems develop prior to release, it should only impact
the check-python-tox job, which is not run by default and is allowed to
fail.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240626232230.408004-5-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Python 3.13 isn't out yet, but it's in beta and Fedora is ramping up to
make it the default system interpreter for Fedora 41.
They moved our cheese for where ContextManager lives; add a conditional
to locate it while we support both pre-3.9 and 3.13+.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20240626232230.408004-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
New bleeding edge versions, new nits to iron out. This addresses the
'check-python-tox' optional GitLab test, while 'check-python-minreqs'
saw no regressions, since it's frozen on an older version of pylint.
Fixes:
qemu/machine/machine.py:345:52: E0606: Possibly using variable 'sock' before assignment (possibly-used-before-assignment)
qemu/utils/qemu_ga_client.py:168:4: R1711: Useless return at end of function or method (useless-return)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240626232230.408004-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
pull-loongarch-20240712
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZpCKgwAKCRBAov/yOSY+
# 3yuEBADmzjhomzzTnTHvOTPcK8Ugrru1QY9gT+5m7+I3cdbSRsYxEZLOdnjDAPBJ
# aVO+ZOkNFHspOOAo5A55QRC0PA4YGDGMg+ZcB7AVhzbdmra7SKdzMzrrVfYJYpk5
# CtcrI+4OPt+U6mh/eTKuaXaWgjuoZ+TOjZqhL+rrpIFjcN78Rw==
# =vhZy
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 11 Jul 2024 06:44:35 PM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240712' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix cpu_reset set wrong CSR_CRMD
target/loongarch: Set CSR_PRCFG1 and CSR_PRCFG2 values
target/loongarch: Remove avail_64 in trans_srai_w() and simplify it
target/loongarch/kvm: Add software breakpoint support
MAINTAINERS: Add myself as a reviewer of LoongArch virt machine
hw/loongarch/virt: Remove unused assignment
hw/loongarch: Change the tpm support by default
hw/loongarch/boot.c: fix out-of-bound reading
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With KVM virtualization, debug exception is injected to guest kernel
rather than host for normal break intruction. Here hypercall
instruction with special code is used for sw breakpoint usage,
and detailed instruction comes from kvm kernel with user API
KVM_REG_LOONGARCH_DEBUG_INST.
Now only software breakpoint is supported, and it is allowed to
insert/remove software breakpoint. We can debug guest kernel with gdb
method after kernel is loaded, hardware breakpoint will be added in later.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Tested-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240607035016.2975799-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
I would like to be informed on changes made to the LoongArch virt machine.
I'm fairly familiar with Loongson-3 series platform hardware and doing
firmwre (U-Boot) development as hobbyist on LoongArch virt platform,
so I believe I can give positive review input to changes on that machine.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240627-ipi-fixes-v1-2-9b061dc28a3a@flygoat.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>
memcpy() is trying to READ 512 bytes from memory,
pointed by info->kernel_cmdline,
which was (presumable) allocated by g_strdup("");
Found with ASAN, making check with enabled sanitizers.
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240628123910.577740-1-frolov@swemel.ru>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Bail out in qemu_ram_block_from_host() when
xen_ram_addr_from_mapcache() does not find an existing
mapping.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Add Edgar as Xen subsystem maintainer in QEMU. Edgar has been a QEMU
maintainer for years, and has already made key changes to one of the
most difficult areas of the Xen subsystem (the mapcache).
Edgar volunteered helping us maintain the Xen subsystem in QEMU and we
are very happy to welcome him to the team. His knowledge and expertise
with QEMU internals will be of great help.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Anthony PERARD <anthony@xenproject.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
hw/nvme patches
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmaQHpQACgkQTeGvMW1P
# DemukQf+Pqcq75cflBqIyVN84/0eThJxmpoTP0ynGNMKJp+K+oecb5pdgTeDI3Kh
# esDOjL8m849r5LFjrjmySrTX8znHPFXdBdqCaOp/MZlgz3NML1guB5EYsizZJ+L6
# K4IRLE/8gzfZHY4yWGmUBuL1VBs8XZV0bXYYlA0xKlO638O0KgVQ/2YpC/44l93J
# rEnefSeXIi+/tCYEaX7t2dA+Qfm/qUrcEZBgvhCREi8t8hTzKGHsl2LVKrsFdA5I
# QZtTFcqeoJThtzWmxGKqbfFb/qeirBlCfhvTEmUWXlS1z9VNzy0ZuqA2l0Sy05ls
# eARbl+JnvV6ic6PikZd8dMSrILjNkQ==
# =dLKH
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 11 Jul 2024 11:04:04 AM PDT
# gpg: using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg: aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468 4272 63D5 6FC5 E55D A838
# Subkey fingerprint: 5228 33AA 75E2 DCE6 A247 66C0 4DE1 AF31 6D4F 0DE9
* tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu:
hw/nvme: Expand VI/VQ resource to uint32
hw/nvme: Allocate sec-ctrl-list as a dynamic array
hw/nvme: separate identify data for sec. ctrl list
hw/nvme: add Identify Endurance Group List
hw/nvme: fix BAR size mismatch of SR-IOV VF
hw/nvme: fix number of PIDs for FDP RUH update
hw/nvme: Add support for setting the MQES for the NVMe emulation
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
VI and VQ resources cover queue resources in each VFs in SR-IOV.
Current maximum I/O queue pair size is 0xffff, we can expand them to
cover the full number of I/O queue pairs.
This patch also fixed Identify Secondary Controller List overflow due to
expand of number of secondary controllers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
To prevent further bumping up the number of maximum VF te support, this
patch allocates a dynamic array (NvmeCtrl *)->sec_ctrl_list based on
number of VF supported by sriov_max_vfs property.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Secondary controller list for virtualization has been managed by
Identify Secondary Controller List data structure with NvmeSecCtrlList
where up to 127 secondary controller entries can be managed. The
problem hasn't arisen so far because NVME_MAX_VFS has been 127.
This patch separated identify data itself from the actual secondary
controller list managed by controller to support more than 127 secondary
controllers with the following patch. This patch reused
NvmeSecCtrlEntry structure to manage all the possible secondary
controllers, and copy entries to identify data structure when the
command comes in.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Commit 73064edfb8 ("hw/nvme: flexible data placement emulation")
intorudced NVMe FDP feature to nvme-subsys and nvme-ctrl with a
single endurance group #1 supported. This means that controller should
return proper identify data to host with Identify Endurance Group List
(CNS 19h). But, yes, only just for the endurance group #1. This patch
allows host applications to ask for which endurance group is available
and utilize FDP through that endurance group.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
PF initializes SR-IOV VF BAR0 region in nvme_init_sriov() with bar_size
calcaulted by Primary Controller Capability such as VQFRSM and VIFRSM
rather than `max_ioqpairs` and `msix_qsize` which is for PF only.
In this case, the bar size reported in nvme_init_sriov() by PF and
nvme_init_pci() by VF might differ especially with large number of
sriov_max_vfs (e.g., 127 which is curret maximum number of VFs). And
this reports invalid BAR0 address of VFs to the host operating system
so that MMIO access will not be caught properly and, of course, NVMe
driver initialization is failed.
For example, if we give the following options, BAR size will be
initialized by PF with 4K, but VF will try to allocate 8K BAR0 size in
nvme_init_pci().
#!/bin/bash
nr_vf=$((127))
nr_vq=$(($nr_vf * 2 + 2))
nr_vi=$(($nr_vq / 2 + 1))
nr_ioq=$(($nr_vq + 2))
...
-device nvme,serial=foo,id=nvme0,bus=rp2,subsys=subsys0,mdts=9,msix_qsize=$nr_ioq,max_ioqpairs=$nr_ioq,sriov_max_vfs=$nr_vf,sriov_vq_flexible=$nr_vq,sriov_vi_flexible=$nr_vi \
To fix this issue, this patch modifies the calculation of BAR size in
the PF and VF initialization by using different elements:
PF: `max_ioqpairs + 1` with `msix_qsize`
VF: VQFRSM with VIFRSM
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The MQES field in the CAP register describes the Maximum Queue Entries
Supported for the IO queues of an NVMe controller. Adding a +1 to the
value in this field results in the total queue size. A full queue is
when a queue of size N contains N - 1 entries, and the minimum queue
size is 2. Thus the lowest MQES value is 1.
This patch adds the new mqes property to the NVMe emulation which allows
a user to specify the maximum queue size by setting this property. This
is useful as it enables testing of NVMe controller where the MQES is
relatively small. The smallest NVMe queue size supported in NVMe is 2
submission and completion entries, which means that the smallest legal
mqes value is 1.
The following example shows how the mqes can be set for a the NVMe
emulation:
-drive id=nvme0,if=none,file=nvme.img,format=raw
-device nvme,drive=nvme0,serial=foo,mqes=1
If the mqes property is not provided then the default mqes will still be
0x7ff (the queue size is 2048 entries).
Signed-off-by: John Berg <jhnberg@amazon.co.uk>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The USART devices were previously connecting their outbound IRQs
directly to the CPU because the EXTI wasn't handling direct lines
interrupts.
Now the USART connects to the EXTI inbound GPIOs, and the EXTI connects
its IRQs to the CPU.
The existing QTest for the USART (tests/qtest/stm32l4x5_usart-test.c)
checks that USART1_IRQ in the CPU is pending when expected so it
confirms that the connection through the EXTI still works.
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240707085927.122867-4-ines.varhol@telecom-paris.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The previous implementation for EXTI interrupts only handled
"configurable" interrupts, like those originating from STM32L4x5 SYSCFG
(the only device currently connected to the EXTI up until now).
In order to connect STM32L4x5 USART to the EXTI, this commit adds
handling for direct interrupts (interrupts without configurable edge).
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240707085927.122867-3-ines.varhol@telecom-paris.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Up until now, the EXTI implementation had 16 inbound GPIOs connected to
the 16 outbound GPIOs of STM32L4x5 SYSCFG.
The EXTI actually handles 40 lines (namely 5 from STM32L4x5 USART
devices which are already implemented in QEMU).
In order to connect USART devices to EXTI, this commit consolidates
constants `EXTI_NUM_INTERRUPT_OUT_LINES` (40) and
`EXTI_NUM_GPIO_EVENT_IN_LINES` (16) into `EXTI_NUM_LINES` (40).
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240707085927.122867-2-ines.varhol@telecom-paris.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now that all targets set TCGCPUOps::cpu_exec_halt, we can make it
mandatory and remove the fallback handling that calls cpu_has_work.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Currently the TCGCPUOps::cpu_exec_halt method is optional, and if it
is not set then the default is to call the CPUClass::has_work
method (which has an identical function signature).
We would like to make the cpu_exec_halt method mandatory so we can
remove the runtime check and fallback handling. In preparation for
that, make all the targets which don't need special handling in their
cpu_exec_halt set it to their cpu_has_work implementation instead of
leaving it unset. (This is every target except for arm and i386.)
In the riscv case this requires us to make the function not
be local to the source file it's defined in.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In commit a96edb687e we set the cpu_exec_halt field of the
TCGCPUOps arm_tcg_ops to arm_cpu_exec_halt(), but we left the
arm_v7m_tcg_ops struct unchanged. That isn't wrong, because for
M-profile FEAT_WFxT doesn't exist and the default handling for "no
cpu_exec_halt method" is correct, but it's perhaps a little
confusing. We would also like to make setting the cpu_exec_halt
method mandatory.
Initialize arm_v7m_tcg_ops cpu_exec_halt to the same function we use
for A-profile. (On M-profile we never set up the wfxt timer so there
is no change in behaviour here.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The current implementation of bcm2835_thermal_ops sets
impl.max_access_size and valid.min_access_size to 4, but leaves
impl.min_access_size and valid.max_access_size unset, defaulting to 1.
This causes issues when the memory system is presented with an access
of size 2 at an offset of 3, leading to an attempt to synthesize it as
a pair of byte accesses at offsets 3 and 4, which trips an assert.
Additionally, the lack of valid.max_access_size setting causes another
issue: the memory system tries to synthesize a read using a 4-byte
access at offset 3 even though the device doesn't allow unaligned
accesses.
This patch addresses these issues by explicitly setting both
impl.min_access_size and valid.max_access_size to 4, ensuring proper
handling of access sizes.
Error log:
ERROR:hw/misc/bcm2835_thermal.c:55:bcm2835_thermal_read: code should not be reached
Bail out! ERROR:hw/misc/bcm2835_thermal.c:55:bcm2835_thermal_read: code should not be reached
Aborted
Reproducer:
cat << EOF | qemu-system-aarch64 -display \
none -machine accel=qtest, -m 512M -machine raspi3b -m 1G -qtest stdio
readw 0x3f212003
EOF
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Message-id: 20240702154042.3018932-1-zheyuma97@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In pl011_get_baudrate(), when we calculate the baudrate we can
accidentally divide by zero. This happens because although (as the
specification requires) we treat UARTIBRD = 0 as invalid, we aren't
correctly limiting UARTIBRD and UARTFBRD values to the 16-bit and 6-bit
ranges the hardware allows, and so some non-zero values of UARTIBRD can
result in a zero divisor.
Enforce the correct register field widths on guest writes and on inbound
migration to avoid the division by zero.
ASAN log:
==2973125==ERROR: AddressSanitizer: FPE on unknown address 0x55f72629b348
(pc 0x55f72629b348 bp 0x7fffa24d0e00 sp 0x7fffa24d0d60 T0)
#0 0x55f72629b348 in pl011_get_baudrate hw/char/pl011.c:255:17
#1 0x55f726298d94 in pl011_trace_baudrate_change hw/char/pl011.c:260:33
#2 0x55f726296fc8 in pl011_write hw/char/pl011.c:378:9
Reproducer:
cat << EOF | qemu-system-aarch64 -display \
none -machine accel=qtest, -m 512M -machine realview-pb-a8 -qtest stdio
writeq 0x1000b024 0xf8000000
EOF
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240702155752.3022007-1-zheyuma97@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In order to allow FPCR bits that aren't in the FPSCR (like the new
bits that are defined for FEAT_AFP), we need to make sure that writes
to the FPSCR only write to the bits of FPCR that are architecturally
mapped, and not the others.
Implement this with a new function vfp_set_fpcr_masked() which
takes a mask of which bits to update.
(We could do the same for FPSR, but we leave that until we actually
are likely to need it.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-10-peter.maydell@linaro.org
Now that we store FPSR and FPCR separately, the FPSR_MASK and
FPCR_MASK macros are slightly confusingly named and the comment
describing them is out of date. Rename them to FPSCR_FPSR_MASK and
FPSCR_FPCR_MASK, document that they are the mask of which FPSCR bits
are architecturally mapped to which AArch64 register, and define them
symbolically rather than as hex values. (This latter requires
defining some extra macros for bits which we haven't previously
defined.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-9-peter.maydell@linaro.org
Now that we have refactored the set/get functions so that the FPSCR
format is no longer the authoritative one, we can keep FPSR and FPCR
in separate CPU state fields.
As well as the get and set functions, we also have a scattering of
places in the code which directly access vfp.xregs[ARM_VFP_FPSCR] to
extract single fields which are stored there. These all change to
directly access either vfp.fpsr or vfp.fpcr, depending on the
location of the field. (Most commonly, this is the NZCV flags.)
We make the field in the CPU state struct 64 bits, because
architecturally FPSR and FPCR are 64 bits. However we leave the
types of the arguments and return values of the get/set functions as
32 bits, since we don't need to make that change with the current
architecture and various callsites would be unable to handle
set bits in the high half (for instance the gdbstub protocol
assumes they're only 32 bit registers).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-7-peter.maydell@linaro.org
To support FPSR and FPCR bits that don't exist in the AArch32 FPSCR
view of floating point control and status (such as the FEAT_AFP ones),
we need to make sure those bits can be migrated. This commit allows
that, whilst maintaining backwards and forwards migration compatibility
for CPUs where there are no such bits:
On sending:
* If either the FPCR or the FPSR include set bits that are not
visible in the AArch32 FPSCR view of floating point control/status
then we send the FPCR and FPSR as two separate fields in a new
cpu/vfp/fpcr_fpsr subsection, and we send a 0 for the old
FPSCR field in cpu/vfp
* Otherwise, we don't send the fpcr_fpsr subsection, and we send
an FPSCR-format value in cpu/vfp as we did previously
On receiving:
* if we see a non-zero FPSCR field, that is the right information
* if we see a fpcr_fpsr subsection then that has the information
* if we see neither, then FPSCR/FPCR/FPSR are all zero on the source;
cpu_pre_load() ensures the CPU state defaults to that
* if we see both, then the migration source is buggy or malicious;
either the fpcr_fpsr or the FPSCR will "win" depending which
is first in the migration stream; we don't care which that is
We make the new FPCR and FPSR on-the-wire data be 64 bits, because
architecturally these registers are that wide, and this avoids the
need to engage in further migration-compatibility contortions in
future if some new architecture revision defines bits in the high
half of either register.
(We won't ever send the new migration subsection until we add support
for a CPU feature which enables setting overlapping FPCR bits, like
FEAT_AFP.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-5-peter.maydell@linaro.org
Make vfp_set_fpscr() call vfp_set_fpsr() and vfp_set_fpcr()
instead of the other way around.
The masking we do when getting and setting vfp.xregs[ARM_VFP_FPSCR]
is a little awkward, but we are going to change where we store the
underlying FPSR and FPCR information in a later commit, so it will
go away then.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-4-peter.maydell@linaro.org
In AArch32, the floating point control and status bits are all in a
single register, FPSCR. In AArch64, these were split into separate
FPCR and FPSR registers, but the bit layouts remained the same, with
no overlaps, so that you could construct an FPSCR value by ORing FPCR
and FPSR, or equivalently could produce FPSR and FPCR by masking an
FPSCR value. For QEMU's implementation, we opted to use masking to
produce FPSR and FPCR, because we started with an AArch32
implementation of FPSCR.
The addition of the (AArch64-only) FEAT_AFP adds new bits to the FPCR
which overlap with some bits in the FPSR. This means we'll no longer
be able to consider the FPSCR-encoded value as the primary one, but
instead need to treat FPSR/FPCR as the primary encoding and construct
the FPSCR from those. (This remains possible because the FEAT_AFP
bits in FPCR don't appear in the FPSCR.)
As the first step in this refactoring, make vfp_get_fpscr() call
vfp_get_fpcr() and vfp_get_fpsr(), instead of the other way around.
Note that vfp_get_fpcsr_from_host() returns only bits in the FPSR
(for the cumulative fp exception bits), so we can simply rename
it without needing to add a new function for getting FPCR bits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-3-peter.maydell@linaro.org
The M-profile FPSCR LTPSIZE is bits [18:16]; this is the same
field as A-profile FPSCR Len, not Stride. Correct the comment
in vfp_get_fpscr().
We also implemented M-profile FPSCR.QC, but forgot to delete
a TODO comment from vfp_set_fpscr(); remove it now.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-2-peter.maydell@linaro.org
When opening an image with discard=off, we punch hole in the image when
writing zeroes, making the image sparse. This breaks users that want to
ensure that writes cannot fail with ENOSPACE by using fully allocated
images[1].
bdrv_co_pwrite_zeroes() correctly disables BDRV_REQ_MAY_UNMAP if we
opened the child without discard=unmap or discard=on. But we don't go
through this function when accessing the top node. Move the check down
to bdrv_co_do_pwrite_zeroes() which seems to be used in all code paths.
This change implements the documented behavior, punching holes only when
opening the image with discard=on or discard=unmap. This may not be the
best default but can improve it later.
The test depends on a file system supporting discard, deallocating the
entire file when punching hole with the length of the entire file.
Tested with xfs, ext4, and tmpfs.
[1] https://lists.nongnu.org/archive/html/qemu-discuss/2024-06/msg00003.html
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-id: 20240628202058.1964986-3-nsoffer@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The test works since we punch holes by default even when opening the
image without discard=on or discard=unmap. Fix the test to enable
discard.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The error message is actually expressive, considering QEMU only. But
when called from Libvirt, talking about "size" can be confusing, because
in Libvirt "size" translates to the memory backend size in QEMU (maximum
size) and "current" translates to the QEMU "size" property.
Let's simply avoid talking about the "size" property and spell out that
some device memory is still plugged.
Message-ID: <20240416141426.588544-1-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Cc: Liang Cong <lcong@redhat.com>
Cc: Mario Casquero <mcasquer@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
aspeed queue:
* support AST2700 network
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmaNJCcACgkQUaNDx8/7
# 7KF7pw//So48XdPJhdQukO/PDLGSYL8rRjDfZbQFLLw10MozcZZ/Nz/BCzrNxJRg
# rHP/shyO3XL1YZ6U1LNXk6E845giVriSpRRjGX9CuK4fypM9xom6qAIOtOLeH7hG
# iTMW++IxN/JgVmVOKYn3C+2+odiq6NzZxFrblVtGPUDtNkkC9BaYGHnccMsl5zQh
# LOSPJxqLiiuDjZPqdwa4fMbtEeNTU3A0WLlWxX7yPfJt2T20a4wE6bdWVGcI6fiV
# QbCmLLrMXhuZFx+uT4B2hbHi+hGS5H+F3QBOefum6z+i9NEbfAZSyusd8/qTEify
# fSBqxL4LD6K4WKL1Hg9959cBcm5zWgPXk7znus4E/TZuUTdSHaPC7clESIcYqWPS
# veEAppmHneO4cdmK1m+Gv4gpWD/adS4ZfV7O+C3z149ms0gL4JrK6QndPdE5QuIW
# u47PhIT3oIM0WznnMusoCndFxs6Gl/GBkzdxW0gdoJKBRfymbsroWeZamAWTznbV
# mL8Td8bEP/NcV40cm1PtpZyl7j0MzxcKDUHKv9ioQTXLUpkl5LSsIGmd1m78WRlE
# J6bUJ3jqQT6/s5i3TVqTGe7xuqMkg+9Er8rn5nAWgSronsf4nprAfOU8Lj+b06BM
# YRroGgU2lAQrv17liQExrG3Tj1SH+oEp1q0qEq7qo824HlGjBkI=
# =UygB
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 09 Jul 2024 04:51:03 AM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20240709' of https://github.com/legoater/qemu:
machine_aspeed.py: update to test network for AST2700
machine_aspeed.py: update to test ASPEED OpenBMC SDK v09.02 for AST2700
hw/block: m25p80: support quad mode for w25q01jvq
aspeed/soc: set dma64 property for AST2700 ftgmac100
hw/net:ftgmac100: update TX and RX packet buffers address to 64 bits
hw/net:ftgmac100: introduce TX and RX ring base address high registers to support 64 bits
hw/net:ftgmac100: update ring base address to 64 bits
hw/net:ftgmac100: update memory region size to 64KB
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
vfio_display_edid_init() can fail for many reasons and return silently.
It would be good to report the error.
Old mdev driver may not support vfio edid region and we allow to go
through in this case.
vfio_display_edid_update() isn't changed because it can be called at
runtime when UI changes (i.e. window resize).
Fixes: 08479114b0 ("vfio/display: add edid support.")
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
EDID related device region info is leaked in vfio_display_edid_init()
error path and VFIODisplay destroying path.
Fixes: 08479114b0 ("vfio/display: add edid support.")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
In 94df5b2180 ("virtio-iommu: Fix 64kB host page size VFIO device
assignment"), in case of bypass mode, we transiently enabled the
IOMMU MR to allow the set_page_size_mask() to be called and pass
information about the page size mask constraint of cold plugged
VFIO devices. Now we do not use the IOMMU MR callback anymore, we
can just get rid of this hack.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Everything is now in place to use the Host IOMMU Device callbacks
to retrieve the page size mask usable with a given assigned device.
This new method brings the advantage to pass the info much earlier
to the virtual IOMMU and before the IOMMU MR gets enabled. So let's
remove the call to memory_region_iommu_set_page_size_mask in
vfio common.c and remove the single implementation of the IOMMU MR
callback in the virtio-iommu.c
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Retrieve the Host IOMMU Device page size mask when this latter is set.
This allows to get the information much sooner than when relying on
IOMMU MR set_page_size_mask() call, whcih happens when the IOMMU MR
gets enabled. We introduce check_page_size_mask() helper whose code
is inherited from current virtio_iommu_set_page_size_mask()
implementation. This callback will be removed in a subsequent patch.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
This callback will be used to retrieve the page size mask supported
along a given Host IOMMU device.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Introduce vfio_container_get_iova_ranges() to retrieve the usable
IOVA regions of the base container and use it in the Host IOMMU
device implementations of get_iova_ranges() callback.
We also fix a UAF bug as the list was shallow copied while
g_list_free_full() was used both on the single call site, in
virtio_iommu_set_iommu_device() but also in
vfio_container_instance_finalize(). Instead use g_list_copy_deep.
Fixes: cf2647a76e ("virtio-iommu: Compute host reserved regions")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
In case no IOMMUPciBus/IOMMUDevice are found we need to properly
set the error handle and return.
Fixes : Coverity CID 1549006
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Fixes: cf2647a76e ("virtio-iommu: Compute host reserved regions")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Update test case to test network connection via SSH.
Test command:
```
cd build
pyvenv/bin/avocado run ../qemu/tests/avocado/machine_aspeed.py:AST2x00MachineSDK.test_aarch64_ast2700_evb_sdk_v09_02
```
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Update test case to test ASPEED OpenBMC SDK v09.02 for AST2700.
ASPEED fixed TX mask issue from linux/drivers/ftgmac100.c.
It is required to use ASPEED OpenBMC SDK since v09.02
for AST2700 QEMU network testing.
A test image is downloaded from the ASPEED Forked OpenBMC GitHub
release repository :
https://github.com/AspeedTech-BMC/openbmc/releases/
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
According to the w25q01jv datasheet at page 16,
it is required to set QE bit in "Status Register 2".
Besides, users are able to utilize "Write Status Register 1(0x01)"
command to set QE bit in "Status Register 2" and
utilize "Read Status Register 2(0x35)" command to get the QE bit status.
To support quad mode for w25q01jvq, update collecting data needed
2 bytes for WRSR command in decode_new_cmd function and
verify QE bit at the second byte of collecting data bit 2
in complete_collecting_data.
Update RDCR_EQIO command to set bit 2 of return data
if quad mode enable in decode_new_cmd.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
ASPEED AST2700 SOC is a 64 bits quad core CPUs (Cortex-a35)
And the base address of dram is "0x4 00000000" which
is 64bits address.
Set dma64 property for ftgmac100 model to support
64bits dram address DMA.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
ASPEED AST2700 SOC is a 64 bits quad core CPUs (Cortex-a35)
And the base address of dram is "0x4 00000000" which
is 64bits address.
It have "TXDES 2" and "RXDES 2" to save the high part
physical address of packet buffer.
Ex: TX packet buffer address [34:0]
The "TXDES 2" bits [18:16] which corresponds the bits [34:32]
of the 64 bits address of the TX packet buffer address
and "TXDES 3" bits [31:0] which corresponds the bits [31:0]
of the 64 bits address of the TX packet buffer address.
Update TX and RX packet buffers address type to
64 bits for dram 64 bits address DMA support.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
ASPEED AST2700 SOC is a 64 bits quad core CPUs (Cortex-a35)
And the base address of dram is "0x4 00000000" which
is 64bits address.
It have "Normal Priority Transmit Ring Base Address Register High(0x17C)",
"High Priority Transmit Ring Base Address Register High(0x184)" and
"Receive Ring Base Address Register High(0x18C)" to save the high part physical
address of descriptor manager.
Ex: TX descriptor manager address [34:0]
The "Normal Priority Transmit Ring Base Address Register High(0x17C)"
bits [2:0] which corresponds the bits [34:32] of the 64 bits address of
the TX ring buffer address.
The "Normal Priority Transmit Ring Base Address Register(0x20)" bits [31:0]
which corresponds the bits [31:0] of the 64 bits address
of the TX ring buffer address.
Introduce a new sub region which size is 0x100 for the set of new registers
and map it at 0x100 in the container region.
This sub region range is from 0x100 to 0x1ff.
Introduce a new property and object attribute to activate the region for new registers.
Introduce a new memop handlers for the new register read and write.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Update TX and RX ring base address data type to uint64_t for
64 bits dram address DMA support.
Both "Normal Priority Transmit Ring Base Address Register(0x20)" and
"Receive Ring Base Address Register (0x24)" are used for saving the
low part physical address of descriptor manager.
Therefore, changes to set TX and RX descriptor manager address bits [31:0]
in ftgmac100_read and ftgmac100_write functions.
Incrementing the version of vmstate to 2.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
According to the datasheet of ASPEED SOCs,
one MAC controller owns 128KB of register space for AST2500.
However, one MAC controller only owns 64KB of register space for AST2600
and AST2700. It set the memory region size 128KB and it occupied another
controllers Address Spaces.
Update one MAC controller memory region size to 0x1000
because AST2500 did not use register spaces over than 64KB.
Introduce a new container region size to 0x1000 and its range
is from 0 to 0xfff. This container is mapped a sub region
for the current set of register.
This sub region range is from 0 to 0xff.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
SD/MMC patches queue
- Use published card address (RCA) in qtest/npcm7xx_sdhci
- Have cards use random RCA
- Use SD spec v3.01 by default
- Convert GEN_CMD to sd_generic_read/write_byte style
- Extract SDMMC_COMMON abstract QDev parent from SD_CARD
- Few housekeeping
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmaIbbcACgkQ4+MsLN6t
# wN6A2RAAvTqk05r+R8ayyGLtxi6RBLb36WfIZy1iaiS3S5i93KrIwqM3LPqWMRRf
# 1h2dmflec3q3ebY/iHl6bdasdUlqfZDaw8BKBPETbDt9xCVmEC9/n7Vi7EMPmzP6
# A2ci7ZCDup4gLwp8AuB9OcMJnlVLGCQjW5yOTjN0V1MaG15iv6N7d6Th/aLEPEUr
# Ji/kk8adRGJhGRHcbkL7BGK+TxyAOUjjyt0k5e5hSS1W0T4dLgIljxq/L0wOxlZe
# Ot11GO/0EykkMIm7uASYXQws8wJFMgfhTYn77ibbzVFCBtSKvsq6ziuX3WopPoGK
# 0IfMkiK1vRpKey54Yn3+28ZY0v86c3NXybNlLbdrkvcZJgMrFTb4bpWyhQyx4Xbu
# uHfFxfu+rZC8/jfVqHd/RFw5sUliokc9a+KbaG9Yzx5MzXufOnu3iVOpx1vA6ZXX
# lX87qA1tZ78kTn/CtAAPx3CBWE9ojgH7wz/ABBTifUkIfDz5kFYT3g+kfygQQ+xh
# +bvdfQWeJ51Z3tPrUWm5fSGyB//XmgCfww7CZ1d63QaebAwml0YYvR3kivgnZ9A1
# abLr+uN7o4q3bqaY2FUvtglBPttA58wt7n02utWef8ZHl72hCsbvPtfwp2idUMY7
# ZRqdnHOB+opDbH9Xy9tj3Cqq1UPiEv3U3qXhZtd1Us7LSHXC/bk=
# =iKnd
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 05 Jul 2024 03:03:35 PM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'sdmmc-20240706' of https://github.com/philmd/qemu:
hw/sd/sdcard: Extract TYPE_SDMMC_COMMON from TYPE_SD_CARD
hw/sd/sdcard: Introduce set_csd/set_cid handlers
hw/sd/sdcard: Cover more SDCardStates
hw/sd/sdcard: Trace length of data read on DAT lines
hw/sd/sdcard: Remove default case in read/write on DAT lines
hw/sd/sdcard: Remove noise from sd_cmd_name()
hw/sd/sdcard: Remove noise from sd_acmd_name()
hw/sd/sdcard: Remove sd_none enum from sd_cmd_type_t
hw/sd/sdcard: Add sd_cmd_GEN_CMD handler (CMD56)
hw/sd/sdcard: Rename sd_cmd_SEND_OP_COND handler
hw/sd/sdcard: Use spec v3.01 by default
hw/sd/sdcard: Remove leftover comment about removed 'spi' Property
hw/sd/sdcard: Generate random RCA value
tests/qtest/npcm7xx_sdhci: Access the card using its published address
hw/sd/npcm7xx_sdhci: Use TYPE_SYSBUS_SDHCI definition
hw/sd/sdhci: Log non-sequencial access as GUEST_ERROR
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When a command's arguments are specified as an explicit type T,
generated documentation points to the members of T.
Example:
##
# @announce-self:
#
# Trigger generation of broadcast RARP frames to update network
[...]
##
{ 'command': 'announce-self', 'boxed': true,
'data' : 'AnnounceParameters'}
generates
"announce-self" (Command)
-------------------------
Trigger generation of broadcast RARP frames to update network
[...]
Arguments
~~~~~~~~~
The members of "AnnounceParameters"
Except when the command takes its arguments unboxed , i.e. it doesn't
have 'boxed': true, we generate *nothing*. A few commands have a
reference in their doc comment to compensate, but most don't.
Example:
##
# @blockdev-snapshot-sync:
#
# Takes a synchronous snapshot of a block device.
#
# For the arguments, see the documentation of BlockdevSnapshotSync.
[...]
##
{ 'command': 'blockdev-snapshot-sync',
'data': 'BlockdevSnapshotSync',
'allow-preconfig': true }
generates
"blockdev-snapshot-sync" (Command)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Takes a synchronous snapshot of a block device.
For the arguments, see the documentation of BlockdevSnapshotSync.
[...]
Same for event data.
Fix qapidoc.py to generate the reference regardless of boxing. Delete
now redundant references in the doc comments.
Fixes: 4078ee5469 (docs/sphinx: Add new qapi-doc Sphinx extension)
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240628112756.794237-1-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
The double-colon synax is rST formatting that precedes a literal code
block. We do not want to capture these as QAPI-specific sections.
Coerce blocks that start with e.g. "Example::" to be parsed as untagged
paragraphs instead of special tagged sections.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-ID: <20240626222128.406106-14-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Indentation tweaked for consistency]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Generally, surround command-line options with ``literal`` markup to help
it stand out from prose in rendered HTML, and add cross-references to
replace "see also" messages.
References to types, values, and other QAPI definitions are not yet
adjusted here; they will be converted en masse in a subsequent patch
after the new QAPI doc generator is merged.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-13-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We do not need a dedicated section for notes. By eliminating a specially
parsed section, these notes can be treated as normal rST paragraphs in
the new QMP reference manual, and can be placed and styled much more
flexibly.
Convert all existing "Note" and "Notes" sections to pure rST. As part of
the conversion, capitalize the first letter of each sentence and add
trailing punctuation where appropriate to ensure notes look sensible and
consistent in rendered HTML documentation. Markup is also re-aligned to
the de-facto standard of 3 spaces for directives.
Update docs/devel/qapi-code-gen.rst to reflect the new paradigm, and
update the QAPI parser to prohibit "Note" sections while suggesting a
new syntax. The exact formatting to use is a matter of taste, but a good
candidate is simply:
.. note:: lorem ipsum ...
... dolor sit amet ...
... consectetur adipiscing elit ...
... but there are other choices, too. The Sphinx readthedocs theme
offers theming for the following forms (capitalization unimportant); all
are adorned with a (!) symbol () in the title bar for rendered HTML
docs.
See
https://sphinx-rtd-theme.readthedocs.io/en/stable/demo/demo.html#admonitions
for examples of each directive/admonition in use.
These are rendered in orange:
.. Attention:: ...
.. Caution:: ...
.. WARNING:: ...
These are rendered in red:
.. DANGER:: ...
.. Error:: ...
These are rendered in green:
.. Hint:: ...
.. Important:: ...
.. Tip:: ...
These are rendered in blue:
.. Note:: ...
.. admonition:: custom title
admonition body text
This patch uses ".. note::" almost everywhere, with just two "caution"
directives. Several instances of "Notes:" have been converted to
merely ".. note::", or multiple ".. note::" where appropriate.
".. admonition:: notes" is used in a few places where we had an
ordered list of multiple notes that would not make sense as
standalone/separate admonitions. Two "Note:" following "Example:"
have been turned into ordinary paragraphs within the example.
NOTE: Because qapidoc.py does not attempt to preserve source ordering of
sections, the conversion of Notes from a "tagged section" to an
"untagged section" means that rendering order for some notes *may
change* as a result of this patch. The forthcoming qapidoc.py rewrite
strictly preserves source ordering in the rendered documentation, so
this issue will be rectified in the new generator.
Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com> [for block*.json]
Message-ID: <20240626222128.406106-11-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message clarified slightly, period added to one more note]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The new QMP documentation generator wants to parse all examples as
"QMP". We have an existing QMP lexer in docs/sphinx/qmp_lexer.py (Seen
in-use here: https://qemu-project.gitlab.io/qemu/interop/bitmaps.html)
that allows the use of "->", "<-" and "..." tokens to denote QMP
protocol flow with elisions, but otherwise defers to the JSON lexer.
To utilize this lexer for the existing QAPI documentation, we need them
to conform to a standard so that they lex and render correctly. Once the
QMP lexer is active for examples, errant QMP/JSON will produce warning
messages and fail the build.
Fix any invalid JSON found in QAPI documentation (identified by
attempting to lex all examples as QMP; see subsequent
commits). Additionally, elisions must be standardized for the QMP lexer;
they must be represented as the value "...", so three examples have been
adjusted to support that format here.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-9-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Sphinx does not like sections without titles, because it wants to
convert every section into a reference. When there is no title, it
struggles to do this and transforms the tree inproperly.
Depending on the rST used, this may result in an assertion error deep in
the docutils HTMLWriter.
(Observed when using ".. admonition:: Notes" under such a section - When
this is transformed with its own <title> element, Sphinx is fooled into
believing this title belongs to the section and incorrect mutates the
docutils tree, leading to errors during rendering time.)
When parsing an untagged section (free paragraphs), skip making a hollow
section and instead append the parse results to the prior section.
Many Bothans died to bring us this information.
The resulting output changes are basically invisible.
Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-8-jsnow@redhat.com>
[Mention output changes in commit message]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
If a comment immediately follows a doc block, the parser doesn't ignore
that token appropriately. Fix that.
e.g.
> ##
> # = Hello World!
> ##
>
> # I'm a comment!
will break the parser, because it does not properly ignore the comment
token if it immediately follows a doc block.
Fixes: 3d035cd2cc (qapi: Rewrite doc comment parser)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-7-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Change get_doc_indented() to preserve indentation on all subsequent text
lines, and create a compatibility dedent() function for qapidoc.py that
removes indentation the same way get_doc_indented() did.
This is being done for the benefit of a new qapidoc generator which
requires that indentation in argument and features sections are
preserved.
Prior to this patch, a section like this:
```
@name: lorem ipsum
dolor sit amet
consectetur adipiscing elit
```
would have its body text be parsed into:
```
lorem ipsum
dolor sit amet
consectetur adipiscing elit
```
We want to preserve the indentation for even the first body line so that
the entire block can be parsed directly as rST. This patch would now
parse that segment into:
```
lorem ipsum
dolor sit amet
consectetur adipiscing elit
```
This is helpful for formatting arguments and features as field lists in
rST, where the new generator will format this information as:
```
:arg type name: lorem ipsum
dolor sit amet
consectetur apidiscing elit
```
...and can be formed by the simple concatenation of the field list
construct and the body text. The indents help preserve the continuation
of a block-level element, and further allow the use of additional rST
block-level constructs such as code blocks, lists, and other such
markup.
This understandably breaks the existing qapidoc.py; so a new function is
added there to dedent the text for compatibility. Once the new generator
is merged, this function will not be needed any longer and can be
dropped.
I verified this patch changes absolutely nothing by comparing the
md5sums of the QMP ref html pages both before and after the change, so
it's certified inert. QAPI test output has been updated to reflect the
new strategy of preserving indents for rST.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-6-jsnow@redhat.com>
[Lost commit message paragraph restored]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
In a forthcoming series that adds a new QMP documentation generator, it
will be helpful to have a linting baseline. However, there's no need to
shuffle around the deck chairs too much, because most of this code will
be removed once that new qapidoc generator (the "transmogrifier") is in
place.
To ease my pain: just turn off the black auto-formatter for most, but
not all, of qapidoc.py. This will help ensure that *new* code follows a
coding standard without bothering too much with cleaning up the existing
code.
Code that I intend to keep is still subject to the delinting beam.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-5-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix minor irritants to pylint/flake8 et al.
(Yes, these need to be guarded by the Python tests. That's a work in
progress, a series that's quite likely to follow once I finish this
Sphinx project. Please pardon the temporary irritation.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240626222128.406106-3-jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
In order to keep eMMC model simpler to maintain,
extract common properties and the common code from
class_init to the (internal) TYPE_SDMMC_COMMON.
Update the corresponding QOM cast macros.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240703134356.85972-6-philmd@linaro.org>
"General command" (GEN_CMD, CMD56) is described as:
GEN_CMD is the same as the single block read or write
commands (CMD24 or CMD17). The difference is that [...]
the data block is not a memory payload data but has a
vendor specific format and meaning.
Thus this block must not be stored overwriting data block
on underlying storage drive. Handle as RAZ/WI.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240703134356.85972-3-philmd@linaro.org>
Recent SDHCI expect cards to support the v3.01 spec
to negociate lower I/O voltage. Select it by default.
Versioned machine types with a version of 9.0 or
earlier retain the old default (spec v2.00).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240703134356.85972-2-philmd@linaro.org>
Currently setup_sd_card() asks the card its address,
but discard the response and use hardcoded 0x4567.
Set the SDHC_CMD_RESPONSE bit to have the controller
record the bus response, and read the response from
the RSPREG0 register. Then we can select the card with
its real address.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240702140842.54242-4-philmd@linaro.org>
Updates for testing, plugins, gdbstub
- restore some 32 bit host builds and testing
- move some physmem tracepoint definitions
- use --userns keep-id for podman builds
- cleanup check-tcg compiler flag checking for Arm
- fix some casting in fcvt test
- tweak check-tcg inline asm for clang
- suppress some invalid clang warnings
- disable KVM for the TCI builds
- improve the insn tracking plugin
- cleanups to the lockstep plugin
- free plugin data on cpu finalise
- assert cpu->index assigned
- move qemu_plugin_vcpu_init__async into plugin code
- add support for dynamic gdb command tables
- allow targets to extend gdb capabilities
- enable user-mode MTE support
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmaH3bEACgkQ+9DbCVqe
# KkTnvwf9HS68sTICEJqBfY663hjcfdFGsSV/h3q7SN3fhKm/3JHGNK+kumgqdnaC
# ykd7tx0AtBGgKm83B7G6MPywsVMIosMeV3mFeJTVHhKsFwGNjSiGkr3j4R2qxjFt
# nYQ977FqBKyhvhSplR2wwhwi+JpuGWFGlnQTvdF2Z7ni4YCDFcbl4eiMyGwsjbWm
# 0VBP+wCSSMIIbS9Qb7DrhZlfu0+wKZK/q0FLzVVofcLSXGou+Mse/qhtG+yAU/FI
# qqqV+7J4PU9E4BqFaklxyRtBrpXNDgpo77pu6ZR7oDXD7HNMuIAuEIlkxMJjarNM
# xN64WOOzw15R2RMVyXdYx6ccxWft2Q==
# =9Gmk
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 05 Jul 2024 04:49:05 AM PDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-july24-050724-1' of https://gitlab.com/stsquad/qemu: (40 commits)
tests/tcg/aarch64: Add MTE gdbstub tests
gdbstub: Add support for MTE in user mode
gdbstub: Use true to set cmd_startswith
gdbstub: Pass CPU context to command handler
gdbstub: Make hex conversion function non-internal
target/arm: Factor out code for setting MTE TCF0 field
target/arm: Make some MTE helpers widely available
target/arm: Fix exception case in allocation_tag_mem_probe
gdbstub: Add support for target-specific stubs
gdbstub: Move GdbCmdParseEntry into a new header file
gdbstub: Clean up process_string_cmd
accel/tcg: Move qemu_plugin_vcpu_init__async() to plugins/
plugins: Free CPUPluginState before destroying vCPU state
plugins: Ensure vCPU index is assigned in init/exit hooks
plugins/lockstep: clean-up output
plugins/lockstep: mention the one-insn-per-tb option
plugins/lockstep: make mixed-mode safe
plugins/lockstep: preserve sock_path
test/plugins: preserve the instruction record over translations
test/plugin: make insn plugin less noisy by default
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently, it's not possible to have stubs specific to a given target,
even though there are GDB features which are target-specific, like, for
instance, memory tagging.
This commit introduces gdb_extend_qsupported_features,
gdb_extend_query_table, and gdb_extend_set_table functions as interfaces
to extend the qSupported string, the query handler table, and the set
handler table, allowing target-specific stub implementations.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240628050850.536447-4-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-33-alex.bennee@linaro.org>
Move GdbCmdParseEntry and its associated types into a separate header
file to allow the use of GdbCmdParseEntry and other gdbstub command
functions outside of gdbstub.c.
Since GdbCmdParseEntry and get_param are now public, kdoc
GdbCmdParseEntry and rename get_param to gdb_get_cmd_param.
This commit also makes gdb_put_packet public since is used in gdbstub
command handling.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240628050850.536447-3-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-32-alex.bennee@linaro.org>
We were repeating information which wasn't super clear. As we already
will have dumped the last failing PC just note the divergence and dump
the previous instruction log.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-27-alex.bennee@linaro.org>
For arm32 host and arm64 guest we get
.../main.c:851:32: error: result of comparison of constant 70368744177664 with expression of type 'unsigned long' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (TASK_UNMAPPED_BASE < reserved_va) {
~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
We already disable -Wtype-limits here, for this exact comparison, but
that is not enough for clang. Disable -Wtautological-compare as well,
which is a superset. GCC ignores the unknown warning flag.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240630190050.160642-15-richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-20-alex.bennee@linaro.org>
Previously we are always specifying -u $(UID) to match the UID in the
container with one outside. This causes a problem with rootless Podman.
Rootless Podman remaps user IDs in the container to ones controllable
for the current user outside. The -u option instructs Podman to use
a specified UID in the container but does not affect the UID remapping.
Therefore, the UID in the container can be remapped to some other UID
outside the container. This can make the access to bind-mounted volumes
fail because the remapped UID mismatches with the owner of the
directories.
Replace -u $(UID) with --userns keep-id, which fixes the UID remapping.
This change is limited to Podman because Docker does not support
--userns keep-id.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240626-podman-v1-1-f8c8daf2bb0a@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-6-alex.bennee@linaro.org>
Really the problem here is the return values of fit_load_[kernel|fdt]() are a
little all over the place. However we don't want to somehow get
through not having set kernel_end and having it just be random unused
data.
The compiler complained on an --enable-gcov build:
In file included from ../../hw/core/loader-fit.c:20:
/home/alex/lsrc/qemu.git/include/qemu/osdep.h: In function ‘load_fit’:
/home/alex/lsrc/qemu.git/include/qemu/osdep.h:486:45: error: ‘kernel_end’ may be used uninitialized [-Werror=maybe-uninitialized]
486 | #define ROUND_UP(n, d) ROUND_DOWN((n) + (d) - 1, (d))
| ^
../../hw/core/loader-fit.c:270:12: note: ‘kernel_end’ was declared here
270 | hwaddr kernel_end;
| ^~~~~~~~~~
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Aleksandar Rikalo <arikalo@gmail.com>
Message-Id: <20240705084047.857176-5-alex.bennee@linaro.org>
The commit 4f9a8315e6 (gitlab-ci.d/crossbuilds: Drop the i386 system
emulation job) was a little too aggressive dropping testing for 32 bit
system builds. Partially revert but using the debian-i686 cross build
images this time as fedora has deprecated the 32 bit stuff.
As the SEV breakage gets in the way and its TCG issues we want to
catch I've added --disable-kvm to the build.
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-3-alex.bennee@linaro.org>
* meson: Pass objects and dependencies to declare_dependency(), not static_library()
* meson: Drop the .fa library suffix
* target/i386: drop AMD machine check bits from Intel CPUID
* target/i386: add avx-vnni-int16 feature
* target/i386: SEV bugfixes
* target/i386: SEV-SNP -cpu host support
* char: fix exit issues
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaGceoUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcpgf/XziKojGOTvYsE7xMijOUswYjCG5m
# ZVLqxTug8Q0zO/9mGvluKBTWmh8KhRWOovX5iZL8+F0gPoYPG4ONpNhh3wpA9+S7
# H7ph4V6sDJBX4l3OrOK6htD8dO5D9kns1iKGnE0lY60PkcHl+pU8BNWfK1zYp5US
# geiyzuRFRRtDmoNx5+o+w+D+W5msPZsnlj5BnPWM+O/ykeFfSrk2ztfdwHKXUhCB
# 5FJcu2sWVx+wsdVzdjgT8USi5+VTK4vabq3SfccmNRxBRnJOCU5MrR63stMDceo4
# TswSB88I0WRV1848AudcGZRkjvKaXLyHJ+QTjg2dp7itEARJ3MGsvOpS5A==
# =3kv7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 04 Jul 2024 02:56:58 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386/SEV: implement mask_cpuid_features
target/i386: add support for masking CPUID features in confidential guests
char-stdio: Restore blocking mode of stdout on exit
target/i386: add avx-vnni-int16 feature
i386/sev: Fallback to the default SEV device if none provided in sev_get_capabilities()
i386/sev: Fix error message in sev_get_capabilities()
target/i386: do not include undefined bits in the AMD topoext leaf
target/i386: SEV: fix formatting of CPUID mismatch message
target/i386: drop AMD machine check bits from Intel CPUID
target/i386: pass X86CPU to x86_cpu_get_supported_feature_word
meson: Drop the .fa library suffix
Revert "meson: Propagate gnutls dependency"
meson: Pass objects and dependencies to declare_dependency()
meson: merge plugin_ldflags into emulator_link_args
meson: move block.syms dependency out of libblock
meson: move shared_module() calls where modules are already walked
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Drop features that are listed as "BitMask" in the PPR and currently
not supported by AMD processors. The only ones that may become useful
in the future are TSC deadline timer and x2APIC, everything else is
not needed for SEV-SNP guests (e.g. VIRT_SSBD) or would require
processor support (e.g. TSC_ADJUST).
This allows running SEV-SNP guests with "-cpu host".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some CPUID features may be provided by KVM for some guests, independent of
processor support, for example TSC deadline or TSC adjust. If these are
not supported by the confidential computing firmware, however, the guest
will fail to start. Add support for removing unsupported features from
"-cpu host".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio: features,fixes
A bunch of improvements:
- vhost dirty log is now only scanned once, not once per device
- virtio and vhost now support VIRTIO_F_NOTIFICATION_DATA
- cxl gained DCD emulation support
- pvpanic gained shutdown support
- beginning of patchset for Generic Port Affinity Structure
- s3 support
- friendlier error messages when boot fails on some illegal configs
- for vhost-user, VHOST_USER_SET_LOG_BASE is now only sent once
- part of vhost-user support for any POSIX system -
not yet enabled due to qtest failures
- sr-iov VF setup code has been reworked significantly
- new tests, particularly for risc-v ACPI
- bugfixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmaF068PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp+DMIAMC//mBXIZlPprfhb5cuZklxYi31Acgu5TUr
# njqjCkN+mFhXXZuc3B67xmrQ066IEPtsbzCjSnzuU41YK4tjvO1g+LgYJBv41G16
# va2k8vFM5pdvRA+UC9li1CCIPxiEcszxOdzZemj3szWLVLLUmwsc5OZLWWeFA5m8
# vXrrT9miODUz3z8/Xn/TVpxnmD6glKYIRK/IJRzzC4Qqqwb5H3ji/BJV27cDUtdC
# w6ns5RYIj5j4uAiG8wQNDggA1bMsTxFxThRDUwxlxaIwAcexrf1oRnxGRePA7PVG
# BXrt5yodrZYR2sR6svmOOIF3wPMUDKdlAItTcEgYyxaVo5rAdpc=
# =p9h4
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 03 Jul 2024 03:41:51 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (85 commits)
hw/pci: Replace -1 with UINT32_MAX for romsize
pcie_sriov: Register VFs after migration
pcie_sriov: Remove num_vfs from PCIESriovPF
pcie_sriov: Release VFs failed to realize
pcie_sriov: Reuse SR-IOV VF device instances
pcie_sriov: Ensure VF function number does not overflow
pcie_sriov: Do not manually unrealize
hw/ppc/spapr_pci: Do not reject VFs created after a PF
hw/ppc/spapr_pci: Do not create DT for disabled PCI device
hw/pci: Rename has_power to enabled
virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged
virtio: remove virtio_tswap16s() call in vring_packed_event_read()
hw/cxl/events: Mark cxl-add-dynamic-capacity and cxl-release-dynamic-capcity unstable
hw/cxl/events: Improve QMP interfaces and documentation for add/release dynamic capacity.
tests/data/acpi/rebuild-expected-aml.sh: Add RISC-V
pc-bios/meson.build: Add support for RISC-V in unpack_edk2_blobs
meson.build: Add RISC-V to the edk2-target list
tests/data/acpi/virt: Move ARM64 ACPI tables under aarch64/${machine} path
tests/data/acpi: Move x86 ACPI tables under x86/${machine} path
tests/qtest/bios-tables-test.c: Set "arch" for x86 tests
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
romsize is an uint32_t variable. Specifying -1 as an uint32_t value is
obscure way to denote UINT32_MAX.
Worse, if int is wider than 32-bit, it will change the behavior of a
construct like the following:
romsize = -1;
if (romsize != -1) {
...
}
When -1 is assigned to romsize, -1 will be implicitly casted into
uint32_t, resulting in UINT32_MAX. On contrary, when evaluating
romsize != -1, romsize will be casted into int, and it will be a
comparison of UINT32_MAX and -1, and result in false.
Replace -1 with UINT32_MAX for statements involving the variable to
clarify the intent and prevent potential breakage.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240627-reuse-v10-10-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie_sriov doesn't have code to restore its state after migration, but
igb, which uses pcie_sriov, naively claimed its migration capability.
Add code to register VFs after migration and fix igb migration.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240627-reuse-v10-9-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240627-reuse-v10-6-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When a VFIO device is hoplugged in a VM using virtio-iommu, IOMMUPciBus
and IOMMUDevice cache entries are created in the .get_address_space()
handler of the machine IOMMU device. However, these entries are never
destroyed, not even when the VFIO device is detached from the machine.
This can lead to an assert if the device is reattached again.
When reattached, the .get_address_space() handler reuses an
IOMMUDevice entry allocated when the VFIO device was first attached.
virtio_iommu_set_host_iova_ranges() is called later on from the
.set_iommu_device() handler an fails with an assert on 'probe_done'
because the device appears to have been already probed when this is
not the case.
The IOMMUDevice entry is allocated in pci_device_iommu_address_space()
called from under vfio_realize(), the VFIO PCI realize handler. Since
pci_device_unset_iommu_device() is called from vfio_exitfn(), a sub
function of the PCIDevice unrealize() handler, it seems that the
.unset_iommu_device() handler is the best place to release resources
allocated at realize time. Clear the IOMMUDevice cache entry there to
fix hotplug.
Fixes: 817ef10da2 ("virtio-iommu: Implement set|unset]_iommu_device() callbacks")
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240701101453.203985-1-clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Markus suggested that we make the unstable. I don't expect these
interfaces to change because of their tight coupling to the Compute
Express Link (CXL) Specification, Revision 3.1 Fabric Management API
definitions which can only be extended in backwards compatible way.
However, there seems little disadvantage in taking a cautious path
for now and marking them as unstable interfaces.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240625170805.359278-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
New DCD command definitions updated in response to review comments
from Markus.
- Used CxlXXXX instead of CXLXXXXX for newly added types.
- Expanded some abreviations in type names to be easier to read.
- Additional documentation for some fields.
- Replace slightly vague cxl r3.1 references with
"Compute Express Link (CXL) Specification, Revision 3.1, XXXX"
to bring them inline with what it says on the specification cover.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240625170805.359278-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To test ACPI tables, edk2 needs to be booted with a disk image having
EFI partition. This image is created using UefiTestToolsPkg.
The image is generated using tests/uefi-test-tools source.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Message-Id: <20240625150839.1358279-5-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It's observed that Linux kernel booting with the VM reports a "conflicting
mapping for input ID" FW_BUG.
The IORT doc defines "Number of IDs" to be "the number of IDs in the range
minus one", while virt-acpi-build.c simply stores the number of IDs in the
id_count without the "minus one". Meanwhile, some of the callers pass in a
0xFFFF following the spec. So, this is a mismatch between the function and
its callers.
Fix build_iort_id_mapping() by internally subtracting one from the pass-in
@id_count. Accordingly make sure that all existing callers pass in a value
without the "minus one", i.e. change all 0xFFFFs to 0x10000s.
Also, add a few lines of comments to highlight this change along with the
referencing document for this build_iort_id_mapping().
Fixes: 42e0f050e3 ("hw/arm/virt-acpi-build: Add IORT support to bypass SMMUv3")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Message-Id: <20240619201243.936819-1-nicolinc@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In e820_add_entry() the e820_table is reallocated with g_renew() to make
space for a new entry. However, fw_cfg_arch_create() just uses the
existing e820_table pointer. This leads to a use-after-free if anything
adds a new entry after fw_cfg is set up.
Shift the addition of the etc/e820 file to the machine done notifier, via
a new fw_cfg_add_e820() function.
Also make e820_table private and use an e820_get_table() accessor function
for it, which sets a flag that will trigger an assert() for any *later*
attempts to add to the table.
Make e820_add_entry() return void, as most callers don't check for error
anyway.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <a2708734f004b224f33d3b4824e9a5a262431568.camel@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Both the other two callers of build_iort_id_mapping() just directly pass
in the IORT_NODE_OFFSET macro. Keeping a "const uint32_t" local variable
storing the same value doesn't have any gain.
Simplify this by replacing the only place using this local variable with
the macro directly.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Message-Id: <20240619001708.926511-1-nicolinc@nvidia.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The unrealize functions of the various vhost-user devices are
calling the corresponding vhost_*_set_status() functions with a
status of 0 to shut down the device correctly.
Now these vhost_*_set_status() functions all follow this scheme:
bool should_start = virtio_device_should_start(vdev, status);
if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
return;
}
if (should_start) {
/* ... do the initialization stuff ... */
} else {
/* ... do the cleanup stuff ... */
}
The problem here is virtio_device_should_start(vdev, 0) currently
always returns "true" since it internally only looks at vdev->started
instead of looking at the "status" parameter. Thus once the device
got started once, virtio_device_should_start() always returns true
and thus the vhost_*_set_status() functions return early, without
ever doing any clean-up when being called with status == 0. This
causes e.g. problems when trying to hot-plug and hot-unplug a vhost
user devices multiple times since the de-initialization step is
completely skipped during the unplug operation.
This bug has been introduced in commit 9f6bcfd99f ("hw/virtio: move
vm_running check to virtio_device_started") which replaced
should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
with
should_start = virtio_device_started(vdev, status);
which later got replaced by virtio_device_should_start(). This blocked
the possibility to set should_start to false in case the status flag
VIRTIO_CONFIG_S_DRIVER_OK was not set.
Fix it by adjusting the virtio_device_should_start() function to
only consider the status flag instead of vdev->started. Since this
function is only used in the various vhost_*_set_status() functions
for exactly the same purpose, it should be fine to fix it in this
central place there without any risk to change the behavior of other
code.
Fixes: 9f6bcfd99f ("hw/virtio: move vm_running check to virtio_device_started")
Buglink: https://issues.redhat.com/browse/RHEL-40708
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240618121958.88673-1-thuth@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.
Since we are here, let`s remove `share=on` which is the default for shm
(and also for memfd).
Acked-by: Thomas Huth <thuth@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100527.145883-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).
The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.
This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.
In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100519.145853-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With recent linux kernels, there is a syscall to probe for various
ISA extensions. These bits were phased in over several kernel
releases, so we still require checks for symbol availability.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
AVX-VNNI-INT16 (CPUID[EAX=7,ECX=1).EDX[10]) is supported by Clearwater
Forest processor, add it to QEMU as it does not need any specific
enablement.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When management tools (e.g. libvirt) query QEMU capabilities,
they start QEMU with a minimalistic configuration and issue
various commands on monitor. One of the command issued is/might
be "query-sev-capabilities" to learn values like cbitpos or
reduced-phys-bits. But as of v9.0.0-1145-g16dcf200dc the monitor
command returns an error instead.
This creates a chicken-egg problem because in order to query
those aforementioned values QEMU needs to be started with a
'sev-guest' object. But to start QEMU with the values must be
known.
I think it's safe to assume that the default path ("/dev/sev")
provides the same data as user provided one. So fall back to it.
Fixes: 16dcf200dc
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Link: https://lore.kernel.org/r/157f93712c23818be193ce785f648f0060b33dee.1719218926.git.mprivozn@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit d7c72735f6 ("target/i386: Add new EPYC CPU versions with updated
cache_info", 2023-05-08) ensured that AMD-defined CPU models did not
have the 'complex_indexing' bit set, but left it set in "-cpu host"
which uses the default ("legacy") cache information.
Reimplement that commit using a CPU feature, so that it can be applied
to all guests using a new machine type, independent of the CPU model.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The recent addition of the SUCCOR bit to kvm_arch_get_supported_cpuid()
causes the bit to be visible when "-cpu host" VMs are started on Intel
processors.
While this should in principle be harmless, it's not tidy and we don't
even know for sure that it doesn't cause any guest OS to take unexpected
paths. Since x86_cpu_get_supported_feature_word() can return different
different values depending on the guest, adjust it to hide the SUCCOR
bit if the guest has non-AMD vendor.
Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: John Allen <john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows modifying the bits in "-cpu max"/"-cpu host" depending on
the guest CPU vendor (which, at least by default, is the host vendor in
the case of KVM).
For example, machine check architecture differs between Intel and AMD,
and bits from AMD should be dropped when configuring the guest for
an Intel model.
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: John Allen <john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The non-standard .fa library suffix breaks the link source
de-duplication done by Meson so drop it.
The lack of link source de-duplication causes AddressSanitizer to
complain ODR violations, and makes GNU ld abort when combined with
clang's LTO.
Fortunately, the non-standard suffix is not necessary anymore for
two reasons.
First, the non-standard suffix was necessary for fork-fuzzing.
Meson wraps all standard-suffixed libraries with --start-group and
--end-group. This made a fork-fuzz.ld linker script wrapped as well and
broke builds. Commit d2e6f9272d ("fuzz: remove fork-fuzzing
scaffolding") dropped fork-fuzzing so we can now restore the standard
suffix.
Second, the libraries are not even built anymore, because it is
possible to just use the object files directly via extract_all_objects().
The occurences of the suffix were detected and removed by performing
a tree-wide search with 'fa' and .fa (note the quotes and dot).
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240524-xkb-v4-4-2de564e5c859@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 3eacf70bb5.
It was only needed because of duplicate objects caused by
declare_dependency(link_whole: ...), and can be dropped now
that meson.build specifies objects and dependencies separately
for the internal dependencies.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240524-objects-v1-2-07cbbe96166b@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We used to request declare_dependency() to link_whole static libraries.
If a static library is a thin archive, GNU ld keeps all object files
referenced by the archive open, and sometimes exceeds the open file limit.
Another problem with link_whole is that suboptimal handling of nested
dependencies.
link_whole by itself does not propagate dependencies. In particular,
gnutls, a dependency of crypto, is not propagated to its users, and we
currently workaround the issue by declaring gnutls as a dependency for
each crypto user. On the other hand, if you write something like
libfoo = static_library('foo', 'foo.c', dependencies: gnutls)
foo = declare_dependency(link_whole: libfoo)
libbar = static_library('bar', 'bar.c', dependencies: foo)
bar = declare_dependency(link_whole: libbar, dependencies: foo)
executable('prog', sources: files('prog.c'), dependencies: [foo, bar])
hoping to propagate the gnutls dependency into bar.c, you'll see a
linking failure for "prog", because the foo.c.o object file is included in
libbar.a and therefore it is linked twice into "prog": once from libfoo.a
and once from libbar.a. Here Meson does not see the duplication, it
just asks the linker to link all of libfoo.a and libbar.a into "prog".
Instead of using link_whole, extract objects included in static libraries
and pass them to declare_dependency(); and then the dependencies can be
added as well so that they are propagated, because object files on the
linker command line are always deduplicated.
This requires Meson 1.1.0 or later.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240524-objects-v1-1-07cbbe96166b@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These serve the same purpose, except plugin_ldflags ends up in the linker
command line in a more roundabout way (through specific_ss). Simplify.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to define libqemuutil symbols that are requested by block modules,
QEMU currently uses a combination of the "link_depends" argument of
libraries (which is propagated into dependencies, but not available in
dependencies) and the "link_args" argument of declare_dependency()
(which _is_ available in static_library, but probably not used for
historical reasons only).
Unfortunately the link_depends will not be propagated into the
"block" dependency if it is defined using
declare_dependency(objects: ...); and it is not possible to
add it directly to the dependency because the keyword argument
simply is not available.
The only solution, in order to switch to defining the dependency
without using "link_whole" (which has problems of its own, see
https://github.com/mesonbuild/meson/pull/8151#issuecomment-754796420),
is unfortunately to add the link_args and link_depends to the
executables directly; fortunately there is just four of them.
It is possible (and I will look into it) to add "link_depends"
to declare_dependency(), but it probably will be a while before
QEMU can use it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches (CVE-2024-4467)
- Don't open qcow2 data files in 'qemu-img info'
- Disallow protocol prefixes for qcow2 data files, VMDK extent files and
other child nodes that are neither 'file' nor 'backing'
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmaEKQwRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9YgMA/+OeQf0veFb02ZNqf907Etz8/DvnqbiWUN
# 0aT5z5x8ilZQIiEDbFtLKgF3A/WO7phyCKk1q1dbRNbc1ZaWFW7mTaJM2ew++EuB
# fq0mnskLt/GVSqTReO4od7flsssp3sEDxs74yuyNITIUqui4we9WK2lLRiAv3aco
# 2NbyNeMHJxIW+QlOO3R62i24yjQaLyg/YekmiIK8itQkpKuI80fiVgor5W3RR0P0
# 71AVSHC0Edv5eavmiRqmQ+pfSI8tlINsN1s5jvxge6XpVTaL8NHsgH3LVv1R3Qtx
# Uo9hp6lQboAfc4I06gf+fcsYSBRiGCwA/J+JsWusX4FLaaTNHLt5eJAEJhfZlioj
# wgTqpy2ImRu5lcuLjLWRu4cLapPLI6CSwf4/lG9/szmRA/1UtOKpquKeTuCwMl9Y
# XEVoNDzo7GpfSb7YONo7fU7kq00OuEEAn0he7eNd2UU+Ao9Abi7JvY+fKx71FHo3
# k24SQVhVJihV1IEC4psCtaQm2bB/jdMr0jB44zHLtmqeUMLrrVf64cSAntp+2KRa
# sINBXA5OeblGKQ7FoAzc5NNNveSdF1ioRCvKB3MlHzI+efzRS7+I3wwh2Uz1Uwfo
# sivg+dAXQQBKVXn8UbfznFyEKueT0RW5CUbfeEqGQ/ocw7iTrXABsX+tjcktxl8Q
# zrHZNoAz6Ds=
# =7LWn
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 02 Jul 2024 09:21:32 AM PDT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin:
block: Parse filenames only when explicitly requested
iotests/270: Don't store data-file with json: prefix in image
iotests/244: Don't store data-file with protocol in image
qcow2: Don't open data_file with BDRV_O_NO_IO
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
aspeed queue:
* Coverity fixes
* Deprecation of tacoma-bmc machine
* Buffer overflow fix in GPIO model
* Minor cleanup
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmaDs3QACgkQUaNDx8/7
# 7KEc/BAAj5AS3rLm3NPpU13y1P1hcjuSm1/PVGTJQH+m4K9UaAkJ8VhRB0Y/rdU6
# ygGhKaCHyk96+I49Csz886YU9Wg9qnxaYJAbornHZJVGNy5tuVpQKM20kfgN3XFN
# ENJR3e+J6Ye7kCtR1ujcf0mydWDaDyq0i82ykURsudcQLMnGq1gBQGadYjt1hJoN
# F9HDPgUJ8/wjQnG8BomsrnuvUSpRTbGNV66FNxXdQ6C6d6OTKQfNnXXqrKO+8QPK
# B5XB9FjTk017DUog1jdE1SaEMowml8CmUhjMwLHOcyWhcZpEk90aMX8cQhefUs9y
# O6kNin2UYEjcTHA/lyfMQJQMNDDZTE32MyP1LwRE/5ZiHqrT7ViqNvZSPBGBueUz
# 9B0xiQTuYqcRqlwgyU73DvnTgrsKFdKQSldj5dXYVnWCKeKY/sCWApHMJxN9xMCA
# Uw1E4QfCLkd+TM6DoJAkBHWFsgi44Aym11VU4VviGNRNTgmTptgQzmHiYGNFiGZG
# OypVPM8Ti6UeVnW65l9J9f7xA0jDB+XQjhCCaoax9GlUMA4C4/Aln5OXXxIWRWFd
# XA3Gn3c/S2j7rMqdfAk68xDHuAJ3wShHlw6HLRd1Xki05WFTeLj1lejLHMdfpNmr
# DkQimzHShBqZzZGxc7FsO0keGY8kyIJkZhbCCbZrFXJXQGRdBao=
# =LxwO
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 02 Jul 2024 12:59:48 AM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20240702' of https://github.com/legoater/qemu:
hw/net:ftgmac100: fix coding style
aspeed/sdmc: Remove extra R_MAIN_STATUS case
aspeed/soc: Fix possible divide by zero
aspeed/sdmc: Check RAM size value at realize time
aspeed: Deprecate the tacoma-bmc machine
hw/gpio/aspeed: Add reg_table_count to AspeedGPIOClass
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When handling image filenames from legacy options such as -drive or from
tools, these filenames are parsed for protocol prefixes, including for
the json:{} pseudo-protocol.
This behaviour is intended for filenames that come directly from the
command line and for backing files, which may come from the image file
itself. Higher level management tools generally take care to verify that
untrusted images don't contain a bad (or any) backing file reference;
'qemu-img info' is a suitable tool for this.
However, for other files that can be referenced in images, such as
qcow2 data files or VMDK extents, the string from the image file is
usually not verified by management tools - and 'qemu-img info' wouldn't
be suitable because in contrast to backing files, it already opens these
other referenced files. So here the string should be interpreted as a
literal local filename. More complex configurations need to be specified
explicitly on the command line or in QMP.
This patch changes bdrv_open_inherit() so that it only parses filenames
if a new parameter parse_filename is true. It is set for the top level
in bdrv_open(), for the file child and for the backing file child. All
other callers pass false and disable filename parsing this way.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
We want to disable filename parsing for data files because it's too easy
to abuse in malicious image files. Make the test ready for the change by
passing the data file explicitly in command line options.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
We want to disable filename parsing for data files because it's too easy
to abuse in malicious image files. Make the test ready for the change by
passing the data file explicitly in command line options.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
One use case for 'qemu-img info' is verifying that untrusted images
don't reference an unwanted external file, be it as a backing file or an
external data file. To make sure that calling 'qemu-img info' can't
already have undesired side effects with a malicious image, just don't
open the data file at all with BDRV_O_NO_IO. If nothing ever tries to do
I/O, we don't need to have it open.
This changes the output of iotests case 061, which used 'qemu-img info'
to show that opening an image with an invalid data file fails. After
this patch, it succeeds. Replace this part of the test with a qemu-io
call, but keep the final 'qemu-img info' to show that the invalid data
file is correctly displayed in the output.
Fixes: CVE-2024-4467
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Misc HW patches queue
- Prevent NULL deref in sPAPR network model (Oleg)
- Automatic deprecation of versioned machine types (Daniel)
- Correct 'dump-guest-core' property name in hint (Akihiko)
- Prevent IRQ leak in MacIO IDE model (Mark)
- Remove dead #ifdef'ry related to unsupported macOS 12.0 (Akihiko)
- Remove "hw/hw.h" where unnecessary (Thomas)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmaDiSQACgkQ4+MsLN6t
# wN4jmBAA2kxwFAGbKvokANDAZBwWmJdnuIPcqS+jdo/wCuQXOo1ROADd3NFlgQWx
# z1xOv/LiAmQiUeeiP+nlA8gWCdW93PErU07og1p1+N2D1sBO6oG5QDlT/tTFuEGd
# IL21jG2xWkEemd3PSN2pHKrytpS0e4S0cNZIKgTUTKdv+Mb2ZEiQi7K4zUTjcmjz
# nlsSjTXdyKBmoiqNGhITWfbR2IUWjtCpzUO44ceqXd5HDpvfGhpKI7Uwun1W2xNU
# yw1XrAFd64Qhd/lvc28G1DLfDdtRIoaRGxgLzQbU6621s0o50Ecs6TNHseuUAKvd
# tQhOtM8IEuZ6jVw8nswCPIcJyjbeY29kjI4WmD2weF1fZbDey6Emlrf+dkJUIuCb
# TximyTXw3rb1nREUVsEQLF69BKjTjE5+ETaplcTWGHCoH2+uA/5MqygalTH1Ub9W
# TwVWSUwpNvIJ3RTsT20YVowkill8piF+ECldTKzJuWjqDviiJDoMm5EFdkkcUB20
# nMyhGoiXtiQ4NYU0/B6HbHOXZkqLbhWcx9G281xJ+RRwjUyVxXD3zHGR9AoOp9ls
# EAo/2URJtGN95LJmzCtaD+oo0wRZ5+7lmnqHPPXkYUdwFm4bhe3dP4NggIrS0cXn
# 19wvBqQuPwywxIbFEu6327YtfPRcImWIlFthWnm9lUyDmbOqDKw=
# =fLCx
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 01 Jul 2024 09:59:16 PM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240702' of https://github.com/philmd/qemu: (22 commits)
Remove inclusion of hw/hw.h from files that don't need it
net/vmnet: Drop ifdef for macOS versions older than 12.0
block/file-posix: Drop ifdef for macOS versions older than 12.0
audio: Drop ifdef for macOS versions older than 12.0
hvf: Drop ifdef for macOS versions older than 12.0
hw/ide/macio: switch from using qemu_allocate_irq() to qdev input GPIOs
system/physmem: Fix reference to dump-guest-core
docs: document special exception for machine type deprecation & removal
hw/i386: remove obsolete manual deprecation reason string of i440fx machines
hw/ppc: remove obsolete manual deprecation reason string of spapr machines
hw: skip registration of outdated versioned machine types
hw: set deprecation info for all versioned machine types
include/hw: temporarily disable deletion of versioned machine types
include/hw: add macros for deprecation & removal of versioned machines
hw/i386: convert 'q35' machine definitions to use new macros
hw/i386: convert 'i440fx' machine definitions to use new macros
hw/m68k: convert 'virt' machine definitions to use new macros
hw/ppc: convert 'spapr' machine definitions to use new macros
hw/s390x: convert 'ccw' machine definitions to use new macros
hw/arm: convert 'virt' machine definitions to use new macros
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On macOS passing `-s /tmp/vhost.socket` parameter to the vhost-user-blk
application, the bind was done on `/tmp/vhost.socke` pathname,
missing the last character.
This sounds like one of the portability problems described in the
unix(7) manpage:
Pathname sockets
When binding a socket to a pathname, a few rules should
be observed for maximum portability and ease of coding:
• The pathname in sun_path should be null-terminated.
• The length of the pathname, including the terminating
null byte, should not exceed the size of sun_path.
• The addrlen argument that describes the enclosing
sockaddr_un structure should have a value of at least:
offsetof(struct sockaddr_un, sun_path) +
strlen(addr.sun_path)+1
or, more simply, addrlen can be specified as
sizeof(struct sockaddr_un).
So let's follow the last advice and simplify the code as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100440.145664-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In vhost-user-server we set all fd received from the other peer
in non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
it's not really needed, because we don't use these fd with blocking
operations, but only to map memory.
In addition, in some systems this operation can fail (e.g. in macOS
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).
So, let's avoid setting fd non-blocking for those messages that we
know carry memory fd (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100043.144657-6-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.
VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100043.144657-5-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.
If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.
Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100043.144657-4-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The default value of the @share option of the @MemoryBackendProperties
really depends on the backend type, so let's document the default
values in the same place where we define the option to avoid
dispersing the information.
Cc: David Hildenbrand <david@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100043.144657-2-sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The CSD::CSR_IMP bit defines whether the Driver Stage
Register (DSR) is implemented or not. We do not set
this bit in CSD:
static void sd_set_csd(SDState *sd, uint64_t size)
{
...
if (size <= SDSC_MAX_CAPACITY) { /* Standard Capacity SD */
...
sd->csd[6] = 0xe0 | /* Partial block for read allowed */
((csize >> 10) & 0x03);
...
} else { /* SDHC */
...
sd->csd[6] = 0x00;
...
}
...
}
The sd_normal_command() switch case for the SEND_DSR
command do nothing and fallback to "illegal command".
Since the command is mandatory (although the register
isn't...) call the sd_cmd_unimplemented() handler.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240628070216.92609-43-philmd@linaro.org>
All commands switching from TRANSFER state to (receiving)DATA
do the same: receive stream of data from the DAT lines. Instead
of duplicating the same code many times, introduce 2 helpers:
- sd_cmd_to_receivingdata() on the I/O line setup the data to
be received on the data[] buffer,
- sd_generic_write_byte() on the DAT lines to push the data.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240628070216.92609-30-philmd@linaro.org>
All commands switching from TRANSFER state to (sending)DATA
do the same: send stream of data on the DAT lines. Instead
of duplicating the same code many times, introduce 2 helpers:
- sd_cmd_to_sendingdata() on the I/O line setup the data to
be transferred,
- sd_generic_read_byte() on the DAT lines to fetch the data.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <4c9f7f51-83ee-421a-8690-9af2e80b134b@linaro.org>
Card entering sd_inactive_state powers off, and won't respond
anymore. Handle that once when entering sd_do_command().
Remove condition always true in sd_cmd_GO_IDLE_STATE().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240628070216.92609-12-philmd@linaro.org>
SDCardStates enum values are specified, so assign them
correspondingly. It will be useful later when we add
states from later specs, which might not be continuous.
See CURRENT_STATE bits in section 4.10.1 "Card Status".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240628070216.92609-11-philmd@linaro.org>
Per sections 3.6.1 (SD Bus Protocol), 4.3.4 "Data Write"
and 7.3.2 (Responses):
In the CMD line the Most Significant Bit is transmitted first.
Use the stl_be_p() helper to store the value in big-endian.
Fixes: a1bb27b1e9 ("Initial SD card emulation")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240628070216.92609-9-philmd@linaro.org>
Per sections 3.6.1 (SD Bus Protocol) and 7.3.2 (Responses):
In the CMD line the Most Significant Bit is transmitted first.
Use the stl_be_p() helper to store the value in big-endian.
Fixes: a1bb27b1e9 ("Initial SD card emulation")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240628070216.92609-8-philmd@linaro.org>
The initial virtio-net-ccw devices currently do not have a proper parent
in the QOM tree, so they show up under /machine/unattached - which is
somewhat ugly. Let's attach them to /machine/virtual-css-bridge/virtual-css
instead.
Message-ID: <20240701200108.154271-1-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The command is selected on the I/O lines, and further
processing might be done on the DAT lines via the
sd_read_byte() and sd_write_byte() handlers. Since
these methods can't distinct between normal and APP
commands, keep the name of the current command in
the SDState and use it in the DAT handlers. This
fixes a bug that all normal commands were displayed
as APP commands.
Fixes: 2ed61fb57b ("sdcard: Display command name when tracing CMD/ACMD")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240628070216.92609-4-philmd@linaro.org>
We use the v2.00 spec by default since commit 2f0939c234
("sdcard: Add a 'spec_version' property, default to Spec v2.00").
Time to deprecate the v1.10 which doesn't bring much, and
is not tested.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240627071040.36190-2-philmd@linaro.org>
Migration of a s390x guest with TCG was long known to be very unstable,
so the tests in tests/qtest/migration-test.c are disabled if running
with TCG instead of KVM.
Nicholas Piggin did a great analysis of the problem:
"The flic pending state is not migrated, so if the machine is migrated
while an interrupt is pending, it can be lost. This shows up in
qtest migration test, an extint is pending (due to console writes?)
and the CPU waits via s390_cpu_set_psw and expects the interrupt to
wake it. However when the flic pending state is lost, s390_cpu_has_int
returns false, so s390_cpu_exec_interrupt falls through to halting
again."
Thus let's finally migrate the pending state, and to be on the safe
side, also the other state variables of the QEMUS390FLICState structure.
Message-ID: <20240619144421.261342-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Coverity reports that the newly added 'case R_MAIN_STATUS' is DEADCODE
because it can not be reached. This is because R_MAIN_STATUS is handled
before in the "Unprotected registers" switch statement. Remove it.
Fixes: Coverity CID 1547112
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[ clg: Rewrote commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Coverity reports a possible DIVIDE_BY_ZERO issue regarding the
"ram_size" object property. This can not happen because RAM has
predefined valid sizes per SoC. Nevertheless, add a test to
close the issue.
Fixes: Coverity CID 1547113
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[ clg: Rewrote commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The RAM size of the SDMC device is validated for the SoC and set when
the Aspeed machines are initialized and then later used by several
SoC implementations. However, the SDMC model never checks that the RAM
size has been actually set before being used. Do that at realize.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Jamin_lin < jamin_lin@aspeedtech.com>
The tacoma-bmc machine was a board including an AST2600 SoC based BMC
and a witherspoon like OpenPOWER system. It was used for bring up of
the AST2600 SoC in labs. It can be easily replaced by the rainier-bmc
machine which is part of a real product offering.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
ASan detected a global-buffer-overflow error in the aspeed_gpio_read()
function. This issue occurred when reading beyond the bounds of the
reg_table.
To enhance the safety and maintainability of the Aspeed GPIO code, this commit
introduces a reg_table_count member to the AspeedGPIOClass structure. This
change ensures that the size of the GPIO register table is explicitly tracked
and initialized, reducing the risk of errors if new register tables are
introduced in the future.
Reproducer:
cat << EOF | qemu-system-aarch64 -display none \
-machine accel=qtest, -m 512M -machine ast1030-evb -qtest stdio
readq 0x7e780272
EOF
ASAN log indicating the issue:
==2602930==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a5da29e128 at pc 0x55a5d700dc62 bp 0x7fff096c4e90 sp 0x7fff096c4e88
READ of size 2 at 0x55a5da29e128 thread T0
#0 0x55a5d700dc61 in aspeed_gpio_read hw/gpio/aspeed_gpio.c:564:14
#1 0x55a5d933f3ab in memory_region_read_accessor system/memory.c:445:11
#2 0x55a5d92fba40 in access_with_adjusted_size system/memory.c:573:18
#3 0x55a5d92f842c in memory_region_dispatch_read1 system/memory.c:1426:16
#4 0x55a5d92f7b68 in memory_region_dispatch_read system/memory.c:1459:9
#5 0x55a5d9376ad1 in flatview_read_continue_step system/physmem.c:2836:18
#6 0x55a5d9376399 in flatview_read_continue system/physmem.c:2877:19
#7 0x55a5d93775b8 in flatview_read system/physmem.c:2907:12
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2355
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
The automatic deprecation mechanism introduced in the preceeding patches
will mark every i440fx machine upto and including 2.12 as deprecated. As
such we can revert the manually added deprecation introduced in:
commit 792b4fdd4e
Author: Philippe Mathieu-Daudé <philmd@linaro.org>
Date: Wed Feb 28 10:34:35 2024 +0100
hw/i386/pc: Deprecate 2.4 to 2.12 pc-i440fx machines
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-14-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The automatic deprecation mechanism introduced in the preceeding patches
will mark every spapr machine upto and including 2.12 as deprecated. As
such we can revert the manually added deprecation which was a subset:
commit 1392617d35
Author: Cédric Le Goater <clg@kaod.org>
Date: Tue Jan 23 16:37:02 2024 +1000
spapr: Tag pseries-2.1 - 2.11 machines as deprecated
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-13-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This calls the MACHINE_VER_DELETION() macro in the machine type
registration method, so that when a versioned machine type reaches
the end of its life, it is no longer registered with QOM and thus
cannot be used.
The actual definition of the machine type should be deleted at
this point, but experience shows that can easily be forgotten.
By skipping registration the manual code deletion task can be
done at any later date.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-12-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This calls the MACHINE_VER_DEPRECATION() macro in the definition of
all machine type classes which support versioning. This ensures
that they will automatically get deprecation info set when they
reach the appropriate point in their lifecycle.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-11-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The new deprecation and deletion policy for versioned machine types is
being introduced in QEMU 9.1.0.
Under the new policy a number of old machine types (any prior to 2.12)
would be liable for immediate deletion which would be a violation of our
historical deprecation and removal policy
Thus automatic deletions (by skipping QOM registration) are temporarily
gated on existance of the env variable "QEMU_DELETE_MACHINES" / QEMU
version number >= 10.1.0. This allows opt-in testing of the automatic
deletion logic, while activating it fully in QEMU >= 10.1.0.
This whole commit should be reverted in the 10.1.0 dev cycle or shortly
thereafter.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240620165742.1711389-10-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Versioned machines live for a long time to provide back compat for
incoming migration and restore of saved images. To guide users away from
usage of old machines, however, we want to deprecate any older than 3
years (equiv of 9 releases), and delete any older than 6 years (equiva
of 18 releases).
To get a standardized deprecation message and avoid having to remember
to manually add it after three years, this introduces two macros to be
used by targets when defining versioned machines.
* MACHINE_VER_DEPRECATION(major, minor)
Automates the task of setting the 'deprecation_reason' field on the
machine, if-and-only-if the major/minor version is older than 3 years.
* MACHINE_VER_DELETION(major, minor)
Simulates the deletion of by skipping registration of the QOM type
for a versioned machine, if-and-only-if the major/minor version is
older than 6 years.
By using these two macros there is no longer any manual work required
per-release to deprecate old machines. By preventing the use of machines
that have reached their deletion date, it is also not necessary to
manually delete machines per-release. Deletion can be batched up once a
year or whenever makes most sense.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-9-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This changes the DEFINE_Q35_MACHINE macro to use the common
helpers for constructing versioned symbol names and strings,
bringing greater consistency across targets.
The added benefit is that it avoids the need to repeat the
version number thrice in three different formats in the calls
to DEFINE_Q35_MACHINE.
Due to the odd-ball '4.0.1' machine type version, this
commit introduces a DEFINE_Q35_BUGFIX helper, to allow
defining of "bugfix" machine types which have a three
digit version.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-8-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This changes the DEFINE_I440FX_MACHINE macro to use the common
helpers for constructing versioned symbol names and strings,
bringing greater consistency across targets.
The added benefit is that it avoids the need to repeat the
version number thrice in three different formats in the calls
to DEFINE_I440FX_MACHINE.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-7-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This changes the DEFINE_VIRT_MACHINE macro to use the common
helpers for constructing versioned symbol names and strings,
bringing greater consistency across targets.
A DEFINE_VIRT_MACHINE_AS_LATEST helper is added so that it
is not required to pass 'false' for every single historical
machine type.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-6-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This changes the DEFINE_SPAPR_MACHINE macro to use the common
helpers for constructing versioned symbol names and strings,
bringing greater consistency across targets.
The added benefit is that it avoids the need to repeat the
version number twice in two different formats in the calls
to DEFINE_SPAPR_MACHINE.
A DEFINE_SPAPR_MACHINE_AS_LATEST helper is added so that it
is not required to pass 'false' for every single historical
machine type.
Due to the odd-ball '2.12-sxxm' machine type version, this
commit introduces a DEFINE_SPAPR_MACHINE_TAGGED helper to
allow defining of "tagged" machine types which have a string
suffix.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240620165742.1711389-5-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This changes the DEFINE_CCW_MACHINE macro to use the common
helpers for constructing versioned symbol names and strings,
bringing greater consistency across targets.
The added benefit is that it avoids the need to repeat the
version number twice in two different formats in the calls
to DEFINE_CCW_MACHINE.
A DEFINE_CCW_MACHINE_AS_LATEST helper is added so that it
is not required to pass 'false' for every single historical
machine type.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240620165742.1711389-4-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The various targets which define versioned machine types have
a bunch of obfuscated macro code for defining unique function
and variable names using string concatenation.
This adds a couple of helpers to improve the clarity of such
code macro.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240620165742.1711389-2-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
A crash found while fuzzing device virtio-net-socket-check-used.
Assertion "offset == 0" in iov_copy() fails if less than guest_hdr_len bytes
were transmited.
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Message-Id: <20240613143529.602591-2-frolov@swemel.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VHOST_USER_SET_LOG_BASE requests should be categorized into
non-vring specific messages, and should be sent only once.
If send more than once, dpdk will munmap old log_addr which may has been used and cause segmentation fault.
Signed-off-by: BillXiang <xiangwencheng@dayudpu.com>
Message-Id: <20240613065150.3100-1-xiangwencheng@dayudpu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, the Q35 supports up to 4096 vCPUs (since v9.0), but for TCG
cases, if x2APIC is not actively enabled to boot more than 255 vCPUs (
e.g., qemu-system-i386 -M pc-q35-9.0 -smp 666), the following error is
reported:
Unexpected error in apic_common_set_id() at ../hw/intc/apic_common.c:449:
qemu-system-i386: APIC ID 255 requires x2APIC feature in CPU
Aborted (core dumped)
This error can be resolved by setting x2apic=on in -cpu. In order to
better help users deal with this scenario, add the error hint to
instruct users on how to enable the x2apic feature. Then, the error
report becomes the following:
Unexpected error in apic_common_set_id() at ../hw/intc/apic_common.c:448:
qemu-system-i386: APIC ID 255 requires x2APIC feature in CPU
Try x2apic=on in -cpu.
Aborted (core dumped)
Note since @errp is &error_abort, error_append_hint() can't be applied
on @errp. And in order to separate the exact error message from the
(perhaps effectively) hint, adding a hint via error_append_hint() is
also necessary. Therefore, introduce @local_error in
apic_common_set_id() to handle both the error message and the error
hint.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240606140858.2157106-1-zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the scenario where vhost-user sets eventfd to -1,
qemu_chr_fe_get_msgfds retrieves fd as -1. When vhost_user_read
receives, it does not perform blocking operations on the descriptor
with fd=-1, so non-blocking operations should not be performed here
either.This is a normal use case. Calling g_unix_set_fd_nonblocking
at this point will cause the test to interrupt.
When vhost_user_write sets the call fd to -1, it sets the number of
fds to 0, so the fds obtained by qemu_chr_fe_get_msgfds will also
be 0.
Signed-off-by: Yuxue Liu <yuxue.liu@jaguarmicro.com>
Message-Id: <20240411073555.1357-1-yuxue.liu@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In current code, when guest does S3, virtio-gpu are reset due to the
bit No_Soft_Reset is not set. After resetting, the display resources
of virtio-gpu are destroyed, then the display can't come back and only
show blank after resuming.
Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check
this bit, if this bit is set, the devices resetting will not be done, and
then the display can work after resuming.
No_Soft_Reset bit is implemented for all virtio devices, and was tested
only on virtio-gpu device. Set it false by default for safety.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Emit a QMP event on receiving a PVPANIC_SHUTDOWN event. Even though a typical
SHUTDOWN event will be sent, it will be indistinguishable from a shutdown
originating from other cases (e.g. KVM exit due to KVM_SYSTEM_EVENT_SHUTDOWN)
that also issue the guest-shutdown cause.
A management layer application can detect the new GUEST_PVSHUTDOWN event to
determine if the guest is using the pvpanic interface to request shutdowns.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Message-Id: <20240527-pvpanic-shutdown-v8-6-5a28ec02558b@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Shutdown requests are normally hardware dependent.
By extending pvpanic to also handle shutdown requests, guests can
submit such requests with an easily implementable and cross-platform
mechanism.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240527-pvpanic-shutdown-v8-5-5a28ec02558b@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The different components of pvpanic duplicate the list of supported
events. Move it to the shared header file to minimize changes when new
events are added.
MST: tweak: keep header included in pvpanic.c to avoid header
dependency, rebase.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240527-pvpanic-shutdown-v8-3-5a28ec02558b@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Before the change, the QMP interface used for add/release DC extents
only allows to release an extent whose DPA range is contained by a single
accepted extent in the device.
With the change, we relax the constraints. As long as the DPA range of
the extent is covered by accepted extents, we allow the release.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-15-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All DPA ranges in the DC regions are invalid to access until an extent
covering the range has been successfully accepted by the host. A bitmap
is added to each region to record whether a DC block in the region has
been backed by a DC extent. Each bit in the bitmap represents a DC block.
When a DC extent is accepted, all the bits representing the blocks in the
extent are set, which will be cleared when the extent is released.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-13-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To simulate FM functionalities for initiating Dynamic Capacity Add
(Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
add/release dynamic capacity extents requests.
With the change, we allow to release an extent only when its DPA range
is contained by a single accepted extent in the device. That is to say,
extent superset release is not supported yet.
1. Add dynamic capacity extents:
For example, the command to add two continuous extents (each 128MiB long)
to region 0 (starting at DPA offset 0) looks like below:
{ "execute": "qmp_capabilities" }
{ "execute": "cxl-add-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-dcd0",
"host-id": 0,
"selection-policy": "prescriptive",
"region": 0,
"extents": [
{
"offset": 0,
"len": 134217728
},
{
"offset": 134217728,
"len": 134217728
}
]
}
}
2. Release dynamic capacity extents:
For example, the command to release an extent of size 128MiB from region 0
(DPA offset 128MiB) looks like below:
{ "execute": "cxl-release-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-dcd0",
"host-id": 0,
"removal-policy":"prescriptive",
"region": 0,
"extents": [
{
"offset": 134217728,
"len": 134217728
}
]
}
}
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-12-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per CXL spec 3.1, two mailbox commands are implemented:
Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and
Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4.
For the process of the above two commands, we use two-pass approach.
Pass 1: Check whether the input payload is valid or not; if not, skip
Pass 2 and return mailbox process error.
Pass 2: Do the real work--add or release extents, respectively.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-11-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add (file/memory backed) host backend for DCD. All the dynamic capacity
regions will share a single, large enough host backend. Set up address
space for DC regions to support read/write operations to dynamic capacity
for DCD.
With the change, the following support is added:
1. Add a new property to type3 device "volatile-dc-memdev" to point to host
memory backend for dynamic capacity. Currently, all DC regions share one
host backend;
2. Add namespace for dynamic capacity for read/write support;
3. Create cdat entries for each dynamic capacity region.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-9-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With the change, when setting up memory for type3 memory device, we can
create DC regions.
A property 'num-dc-regions' is added to ct3_props to allow users to pass the
number of DC regions to create. To make it easier, other region parameters
like region base, length, and block size are hard coded. If needed,
these parameters can be added easily.
With the change, we can create DC regions with proper kernel side
support like below:
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dc_region)
echo $region > /sys/bus/cxl/devices/decoder0.0/create_dc_region
echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
echo 1 > /sys/bus/cxl/devices/$region/interleave_ways
echo "dc0" >/sys/bus/cxl/devices/decoder2.0/mode
echo 0x40000000 >/sys/bus/cxl/devices/decoder2.0/dpa_size
echo 0x40000000 > /sys/bus/cxl/devices/$region/size
echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0
echo 1 > /sys/bus/cxl/devices/$region/commit
echo $region > /sys/bus/cxl/drivers/cxl_region/bind
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-7-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Per cxl spec r3.1, add dynamic capacity (DC) region representative based on
Table 8-165 and extend the cxl type3 device definition to include DC region
information. Also, based on info in 8.2.9.9.9.1, add 'Get Dynamic Capacity
Configuration' mailbox support.
Note: we store region decode length as byte-wise length on the device, which
should be divided by 256 * MiB before being returned to the host
for "Get Dynamic Capacity Configuration" mailbox command per
specification.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-5-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This enables wrapper devices to customize the base device's CCI
(for example, with custom commands outside the specification)
without the need to change the base device.
The also enabled the base device to dispatch those commands without
requiring additional driver support.
Heavily edited by Jonathan Cameron to increase code reuse
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-3-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows devices to have fully customized CCIs, along with complex
devices where wrapper devices can override or add additional CCI
commands without having to replicate full command structures or
pollute a base device with every command that might ever be used.
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-2-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.
The reason is:
When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
immediately.
vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
The reconnect path is:
vhost_user_blk_event
vhost_user_async_close(.. vhost_user_blk_disconnect ..)
qemu_chr_fe_set_handlers <----- clear the notifier callback
schedule vhost_user_async_close_bh
The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.
We need to ensure that even if vhost_dev_init initialization fails, the event
handler still needs to be reinstalled when s->connected is false.
All vhost-user devices have this issue, including vhost-user-blk/scsi.
Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-3-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit f02a4b8e64.
Since the current patch cannot completely fix the lost reconnect
problem, there is a scenario that is not considered:
- When the virtio-blk driver is removed from the guest os,
s->connected has no chance to be set to false, resulting in
subsequent reconnection not being executed.
The next patch will completely fix this issue with a better approach.
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-2-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When using vhost-user-gpu with GL, qemu -display gtk doesn't show output
and prints: qemu: eglCreateImageKHR failed
Since commit 9ac06df8b ("virtio-gpu-udmabuf: correct naming of
QemuDmaBuf size properties"), egl_dmabuf_import_texture() uses
backing_{width,height} for the texture dimension.
Fixes: 9ac06df8b ("virtio-gpu-udmabuf: correct naming of QemuDmaBuf size properties")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240515105237.1074116-1-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix bug imported by 27ce0f3afc ("fix Power Management Control Register for PCI Express virtio devices"
After this change, observe that QEMU may erroneously clear the power status of the device,
or may erroneously clear non writable registers, such as NO_SOFT_RESET, etc.
Only state of PM_CTRL is writable.
Only when flag VIRTIO_PCI_FLAG_INIT_PM is set, need to reset state.
Fixes: 27ce0f3afc ("fix Power Management Control Register for PCI Express virtio devices"
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Message-Id: <20240515073526.17297-2-Jiqian.Chen@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Not having VIRTIO_F_RING_PACKED in feature_bits[] is a problem when the
vhost-vsock device does not offer the feature bit VIRTIO_F_RING_PACKED
but the in QEMU device is configured to try to use the packed layout
(the virtio property "packed" is on).
As of today, the Linux kernel vhost-vsock device does not support the
packed queue layout (as vhost does not support packed), and does not
offer VIRTIO_F_RING_PACKED. Thus when for example a vhost-vsock-ccw is
used with packed=on, VIRTIO_F_RING_PACKED ends up being negotiated,
despite the fact that the device does not actually support it, and
one gets to keep the pieces.
Fixes: 74b3e46630 ("virtio: add property to enable packed virtqueue")
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20240429113334.2454197-1-pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the client sends more than one region this assert triggers. The
reason is that two fd's are 8 bytes and VHOST_MEMORY_BASELINE_NREGIONS
is exactly 8.
The assert is wrong because it should not test for the size of the fd
array, but for the numbers of regions.
Signed-off-by: Christian Pötzsch <christian.poetzsch@kernkonzept.com>
Message-Id: <20240426083313.3081272-1-christian.poetzsch@kernkonzept.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support for the VIRTIO_F_NOTIFICATION_DATA feature across a variety
of vhost devices.
The inclusion of VIRTIO_F_NOTIFICATION_DATA in the feature bits arrays
for these devices ensures that the backend is capable of offering and
providing support for this feature, and that it can be disabled if the
backend does not support it.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-6-jonah.palmer@oracle.com>
Acked-by: Srujana Challa <schalla@marvell.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support to virtio-ccw devices for handling the extra data sent from
the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport
feature has been negotiated.
The extra data that's passed to the virtio-ccw device when this feature
is enabled varies depending on the device's virtqueue layout.
That data passed to the virtio-ccw device is in the same format as the
data passed to virtio-pci devices.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-5-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support to virtio-mmio devices for handling the extra data sent from
the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport
feature has been negotiated.
The extra data that's passed to the virtio-mmio device when this feature
is enabled varies depending on the device's virtqueue layout.
The data passed to the virtio-mmio device is in the same format as the
data passed to virtio-pci devices.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Prevent the realization of a virtio device that attempts to use the
VIRTIO_F_NOTIFICATION_DATA transport feature without disabling
ioeventfd.
Due to ioeventfd not being able to carry the extra data associated with
this feature, having both enabled is a functional mismatch and therefore
Qemu should not continue the device's realization process.
Although the device does not yet know if the feature will be
successfully negotiated, many devices using this feature wont actually
work without this extra data and would fail FEATURES_OK anyway.
If ioeventfd is able to work with the extra notification data in the
future, this compatibility check can be removed.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-3-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support to virtio-pci devices for handling the extra data sent
from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA
transport feature has been negotiated.
The extra data that's passed to the virtio-pci device when this
feature is enabled varies depending on the device's virtqueue
layout.
In a split virtqueue layout, this data includes:
- upper 16 bits: shadow_avail_idx
- lower 16 bits: virtqueue index
In a packed virtqueue layout, this data includes:
- upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
- lower 16 bits: virtqueue index
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-2-jonah.palmer@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On setups with one or more virtio-net devices with vhost on,
dirty tracking iteration increases cost the bigger the number
amount of queues are set up e.g. on idle guests migration the
following is observed with virtio-net with vhost=on:
48 queues -> 78.11% [.] vhost_dev_sync_region.isra.13
8 queues -> 40.50% [.] vhost_dev_sync_region.isra.13
1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
2 devices, 1 queue -> 18.60% [.] vhost_dev_sync_region.isra.14
With high memory rates the symptom is lack of convergence as soon
as it has a vhost device with a sufficiently high number of queues,
the sufficient number of vhost devices.
On every migration iteration (every 100msecs) it will redundantly
query the *shared log* the number of queues configured with vhost
that exist in the guest. For the virtqueue data, this is necessary,
but not for the memory sections which are the same. So essentially
we end up scanning the dirty log too often.
To fix that, select a vhost device responsible for scanning the
log with regards to memory sections dirty tracking. It is selected
when we enable the logger (during migration) and cleared when we
disable the logger. If the vhost logger device goes away for some
reason, the logger will be re-selected from the rest of vhost
devices.
After making mem-section logger a singleton instance, constant cost
of 7%-9% (like the 1 queue report) will be seen, no matter how many
queues or how many vhost devices are configured:
48 queues -> 8.71% [.] vhost_dev_sync_region.isra.13
2 devices, 8 queues -> 7.97% [.] vhost_dev_sync_region.isra.14
Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1710448055-11709-2-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
There could be a mix of both vhost-user and vhost-kernel clients
in the same QEMU process, where separate vhost loggers for the
specific vhost type have to be used. Make the vhost logger per
backend type, and have them properly reference counted.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1710448055-11709-1-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
EXTI's new field `irq_levels` tracks irq levels between tests when using
`global_qtest`.
This happens in `stm32l4x5_exti-test.c`, `stm32l4x5_syscfg-test.c` and
`stm32l4x5_gpio-test.c` (`dm163.c` doesn't use `global_qtest`).
To ensure that `irq_levels` has the same value before and after each
QTest, this commit toggles back the irq lines that were changed at the
end of each problematic test. Most QTests were already doing this.
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240629110800.539969-3-ines.varhol@telecom-paris.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The QTest `test_irq_pin_multiplexer` makes the assumption that the
reset state of irq line 15 is low, which is false since STM32L4x5 GPIO
was implemented (the reset state of pin GPIOA15 is high because there's
pull-up and it results in the irq line 15 also being high at reset).
It wasn't triggering an error because `test_interrupt` was mistakenly
"resetting" the line low.
This commit corrects these two mistakes by :
- not setting the line low in `test_interrupt`
- using an irq line in `test_irq_pin_multiplexer` which is low at reset
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240629104454.366283-1-ines.varhol@telecom-paris.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.
To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20240404140833.1557953-1-ross.lagerwall@citrix.com>
Signed-off-by: Anthony PERARD <anthony@xenproject.org>
The version of the sbsa-ref EDK2 firmware we used to use in this test
had a bug where it might make an unaligned access to the framebuffer,
which causes a guest crash on newer versions of QEMU where we enforce
the architectural requirement that unaligned accesses to Device memory
should take an exception.
We happened to not notice this because our test was booting with "-smp
1" and through luck this didn't write the boot logo to the framebuffer
at an unaligned address; but trying to boot the same firmware with two
CPUs would result in a guest crash. Now we have updated the firmware
we're using for the test, we can make the test use all the cores on the
board, so we are testing the SMP boot path.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240620-b4-new-firmware-v3-2-29a3a2f1be1e@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Update firmware to have graphics card memory fix from EDK2 commit
c1d1910be6e04a8b1a73090cf2881fb698947a6e:
OvmfPkg/QemuVideoDxe: add feature PCD to remap framebuffer W/C
Some platforms (such as SBSA-QEMU on recent builds of the emulator) only
tolerate misaligned accesses to normal memory, and raise alignment
faults on such accesses to device memory, which is the default for PCIe
MMIO BARs.
When emulating a PCIe graphics controller, the framebuffer is typically
exposed via a MMIO BAR, while the disposition of the region is closer to
memory (no side effects on reads or writes, except for the changing
picture on the screen; direct random access to any pixel in the image).
In order to permit the use of such controllers on platforms that only
tolerate these types of accesses for normal memory, it is necessary to
remap the memory. Use the DXE services to set the desired capabilities
and attributes.
Hide this behavior under a feature PCD so only platforms that really
need it can enable it. (OVMF on x86 has no need for this)
With this fix enabled we can boot sbsa-ref with more than one cpu core.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240620-b4-new-firmware-v3-1-29a3a2f1be1e@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Four mailbox properties are implemented as follows:
1. Customer OTP: GET_CUSTOMER_OTP and SET_CUSTOMER_OTP
2. Device-specific private key: GET_PRIVATE_KEY and
SET_PRIVATE_KEY.
The customer OTP is located in the rows 36-43. The device-specific private key
is located in the rows 56-63.
The customer OTP can be locked with the magic numbers 0xffffffff 0xaffe0000
when running the SET_CUSTOMER_OTP mailbox command. Bit 6 of row 32 indicates
this lock, which is undocumented. The lock also applies to the device-specific
private key.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The OTP device registers are currently stubbed. For now, the device
houses the OTP rows which will be accessed directly by other peripherals.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We should call inflateEnd() like on success path to cleanup state in s
variable.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We don't ship a binary that is simply called "qemu", so we should
avoid this in the documentation. Use the configurable binary name
via "|qemu_system|" instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This patch corrects minor typographical errors to ensure the ASCII art
aligns with the explanations provided. Specifically, it fixes an
incorrect root port reference and removes redundant words.
Signed-off-by: Hyeongtak Ji <hyeongtak.ji@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Darwin uses a subtly different version of the setrlimit() syscall as
described in the COMPATIBILITY section of the macOS man page. The value
of the rlim_cur member has been adjusted accordingly for Darwin-based
systems.
Signed-off-by: Trent Huber <trentmhuber@gmail.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As far as I can tell this struct has never been used in this
file (it is used in can_core.c).
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This struct has been unused since
Commit f932093ae1 ("hw/arm/bcm2836: Split out common part of BCM283X
classes")
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This struct is unused since Peter's
Commit b8ae597f0e ("linux-user/sparc: Fix errors in target_ucontext
structures")
However, hmm, I'm a bit confused since that commit modifies the
structure and then removes it, was that intentional?
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hmp_info_roms() was removed in commit dd98234c05 ("qapi:
introduce x-query-roms QMP command"),
hmp_info_numa() in commit 1b8ae799d8 ("qapi: introduce
x-query-numa QMP command"),
hmp_info_ramblock() in commit ca411b7c8a ("qapi: introduce
x-query-ramblock QMP command")
and hmp_info_irq() in commit 91f2fa7045 ("qapi: introduce
x-query-irq QMP command").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
host_cpu_realizefn() sets CPUID_EXT_MONITOR without consulting host/KVM
capabilities. This may cause problems:
- If MWAIT/MONITOR is not available on the host, advertising this
feature to the guest and executing MWAIT/MONITOR from the guest
triggers #UD and the guest doesn't boot. This is because typically
#UD takes priority over VM-Exit interception checks and KVM doesn't
emulate MONITOR/MWAIT on #UD.
- If KVM doesn't support KVM_X86_DISABLE_EXITS_MWAIT, MWAIT/MONITOR
from the guest are intercepted by KVM, which is not what cpu-pm=on
intends to do.
In these cases, MWAIT/MONITOR should not be exposed to the guest.
The logic in kvm_arch_get_supported_cpuid() to handle CPUID_EXT_MONITOR
is correct and sufficient, and we can't set CPUID_EXT_MONITOR after
x86_cpu_filter_features().
This was not an issue before commit 662175b91f ("i386: reorder call to
cpu_exec_realizefn") because the feature added in the accel-specific
realizefn could be checked against host availability and filtered out.
Additionally, it seems not a good idea to handle guest CPUID leaves in
host_cpu_realizefn(), and this patch merges host_cpu_enable_cpu_pm()
into kvm_cpu_realizefn().
Fixes: f5cc5a5c16 ("i386: split cpu accelerators from cpu.c, using AccelCPUClass")
Fixes: 662175b91f ("i386: reorder call to cpu_exec_realizefn")
Signed-off-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Both cpu-pm and mem-lock are related to system resource overcommit, but
they are separate from each other, in terms of how they are realized,
and of course, they are applied to different system resources.
It's tempting to use separate command lines to specify their behavior.
e.g., in the following example, the cpu-pm command is quietly
overwritten, and it's not easy to notice it without careful inspection.
--overcommit mem-lock=on
--overcommit cpu-pm=on
Fixes: c8c9dc42b7 ("Remove the deprecated -realtime option")
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since a4c2735f35 (cpu: move Qemu[Thread|Cond] setup into common code,
2024-05-30) these fields are now allocated at cpu_common_initfn(). So
let's make sure we also free them at cpu_common_finalize().
Furthermore, the code also frees these on round robin, but we missed
'halt_cond'.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* configure: detect --cpu=mipsisa64r6
* target/i386: decode address before going back to translate.c
* meson: allow configuring the x86-64 baseline
* meson: remove dead optimization option
* exec: small changes to allow compilation with C++ in Android emulator
* fix SEV compilation on 32-bit systems
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZ+8mEUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMVmAf+PjJBpMYNFb2qxJDw5jI7hITsrtm4
# v5TKo9x7E3pna5guae5ODFencYhBITQznHFa3gO9w09QN7Gq/rKjuBBST9VISslU
# dW3HtxY9A1eHQtNqHuD7jBWWo9N0hhNiLRa6xz/VDTjEJSxhjSdK2bRW9Yz9hZAe
# 8bbEEC9us21RdFTS+eijOMo9SPyASUlqIq4RbQpbAVuzzOMeXnfOuX9VSTcBy9o2
# 7cKMg7zjL8WQugJKynyl5lny7m1Ji55LD2UrYMF6Mik3Wz5kwgHcUITJ+ZHd/9hR
# a+MI7o/jyCPdmX9pBvJCxyerCVYBu0ugLqYKpAcsqU6111FLrnGgDvHf/g==
# =LdYd
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 28 Jun 2024 10:26:57 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
target/i386/sev: Fix printf formats
target/i386/sev: Use size_t for object sizes
target/i386: SEV: store pointer to decoded id_auth in SevSnpGuest
target/i386: SEV: rename sev_snp_guest->id_auth
target/i386: SEV: store pointer to decoded id_block in SevSnpGuest
target/i386: SEV: rename sev_snp_guest->id_block
target/i386: remove unused enum
target/i386: give CC_OP_POPCNT low bits corresponding to MO_TL
target/i386: use cpu_cc_dst for CC_OP_POPCNT
target/i386: fix CC_OP dump
include: move typeof_strip_qual to compiler.h, use it in QAPI_LIST_LENGTH()
exec: don't use void* in pointer arithmetic in headers
exec: avoid using C++ keywords in function parameters
block: rename former bdrv_file_open callbacks
block: remove separate bdrv_file_open callback
block: do not check bdrv_file_open
block: make assertion more generic
meson: remove dead optimization option
meson: allow configuring the x86-64 baseline
Revert "host/i386: assume presence of SSE2"
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not rely on finish->id_auth_uaddr, so that there are no casts from
pointer to uint64_t. They break on 32-bit hosts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not rely on finish->id_block_uaddr, so that there are no casts from
pointer to uint64_t. They break on 32-bit hosts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Handle it like the other arithmetic cc_ops. This simplifies a
bit the implementation of bit test instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is the only CCOp, among those that compute ZF from one of the cc_op_*
registers, that uses cpu_cc_src. Do not make it the odd one off,
instead use cpu_cc_dst like the others.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
POPCNT was missing, and the entries were all out of order after
ADCX/ADOX/ADCOX were moved close to EFLAGS. Just use designated
initializers.
Fixes: 4885c3c495 ("target-i386: Use ctpop helper", 2017-01-10)
Fixes: cc155f1971 ("target/i386: rewrite flags writeback for ADCX/ADOX", 2024-06-11)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The typeof_strip_qual() is most useful for the atomic fetch-and-modify
operations in atomic.h, but it can be used elsewhere as well. For example,
QAPI_LIST_LENGTH() assumes that the argument is not const, which is not a
requirement.
Move the macro to compiler.h and, while at it, move it under #ifndef
__cplusplus to emphasize that it uses C-only constructs. A C++ version
of typeof_strip_qual() using type traits is possible[1], but beyond the
scope of this patch because the little C++ code that is in QEMU does not
use QAPI.
The patch was tested by changing the declaration of strv_from_str_list()
in qapi/qapi-type-helpers.c to:
char **strv_from_str_list(const strList *const list)
This is valid C code, and it fails to compile without this change.
[1] https://lore.kernel.org/qemu-devel/20240624205647.112034-1-flwu@google.com/
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Tested-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since there is no bdrv_file_open callback anymore, rename the implementations
so that they end with "_open" instead of "_file_open". NFS is the exception
because all the functions are named nfs_file_*.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bdrv_file_open and bdrv_open are completely equivalent, they are
never checked except to see which one to invoke. So merge them
into a single one.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The set of BlockDrivers that have .bdrv_file_open coincides with those
that have .protocol_name and guess what---checking drv->bdrv_file_open
is done to see if the driver is a protocol. So check drv->protocol_name
instead.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
.bdrv_needs_filename is only set for drivers that also set bdrv_file_open,
i.e. protocol drivers.
So we can make the assertion always, it will always pass for those drivers
that use bdrv_open.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a Meson option to configure which x86-64 instruction
set to use. QEMU will now default to x86-64-v1 + cmpxchg16b for
64-bit builds (that corresponds to a Pentium 4 for 32-bit builds).
The baseline can be tuned down to Pentium Pro for 32-bit builds (with
-Dx86_version=0), or up as desired.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit b18236897c.
The x86-64 instruction set can now be tuned down to x86-64 v1
or i386 Pentium Pro.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 433cd6d94a.
The x86-64 instruction set can now be tuned down to x86-64 v1
or i386 Pentium Pro.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 45ccdbcb24.
The x86-64 instruction set can now be tuned down to x86-64 v1
or i386 Pentium Pro.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have implemented trigger_common_match(), which checks if the enabled
privilege levels of the trigger match CPU's current privilege level.
Remove the related code in riscv_cpu_debug_check_watchpoint() and invoke
trigger_common_match() to check the privilege levels of the type 2 and
type 6 triggers for the watchpoints.
This commit also changes the behavior of looping the triggers. In
previous implementation, if we have a type 2 trigger and
env->virt_enabled is true, we directly return false to stop the loop.
Now we keep looping all the triggers until we find a matched trigger.
Only load/store bits and loaded/stored address should be further checked
in riscv_cpu_debug_check_watchpoint().
Signed-off-by: Alvin Chang <alvinga@andestech.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240626132247.2761286-3-alvinga@andestech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
According to RISC-V Debug specification version 0.13 [1] (also applied
to version 1.0 [2] but it has not been ratified yet), there are several
common matching conditions before firing a trigger, including the
enabled privilege levels of the trigger.
This commit adds trigger_common_match() to prepare the common matching
conditions for the type 2/3/6 triggers. For now, we just implement
trigger_priv_match() to check if the enabled privilege levels of the
trigger match CPU's current privilege level.
Remove the related code in riscv_cpu_debug_check_breakpoint() and invoke
trigger_common_match() to check the privilege levels of the type 2 and
type 6 triggers for the breakpoints.
This commit also changes the behavior of looping the triggers. In
previous implementation, if we have a type 2 trigger and
env->virt_enabled is true, we directly return false to stop the loop.
Now we keep looping all the triggers until we find a matched trigger.
Only the execution bit and the executed PC should be futher checked in
riscv_cpu_debug_check_breakpoint().
[1]: https://github.com/riscv/riscv-debug-spec/releases/tag/task_group_vote
[2]: https://github.com/riscv/riscv-debug-spec/releases/tag/1.0.0-rc1-asciidoc
Signed-off-by: Alvin Chang <alvinga@andestech.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240626132247.2761286-2-alvinga@andestech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Introduce helpers to enable the extensions based on the implied rules.
The implied extensions are enabled recursively, so we don't have to
expand all of them manually. This also eliminates the old-fashioned
ordering requirement. For example, Zvksg implies Zvks, Zvks implies
Zvksed, etc., removing the need to check the implied rules of Zvksg
before Zvks.
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
Tested-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240625114629.27793-3-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
RISCVCPUImpliedExtsRule is created to store the implied rules.
'is_misa' flag is used to distinguish whether the rule is derived
from the MISA or other extensions.
'ext' stores the MISA bit if 'is_misa' is true. Otherwise, it stores
the offset of the extension defined in RISCVCPUConfig. 'ext' will also
serve as the key of the hash tables to look up the rule in the following
commit.
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
Tested-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240625114629.27793-2-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When icount is enabled, rather than returning the virtual CPU time, we
should return the instruction count itself. Add an instructions bool
parameter to get_ticks() to correctly return icount_get_raw() when
icount_enabled() == 1 and instruction count is queried. This will modify
the existing behavior which was returning an instructions count close to
the number of cycles (CPI ~= 1).
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240618112649.76683-1-cleger@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
RISC-V virt is currently missing default type for block devices. Without
this being set, proper backend is not created when option like -cdrom
is used. So, make the virt board's default block device type be
IF_VIRTIO similar to other architectures.
We also need to set no_cdrom to avoid getting a default cdrom device.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240620064718.275427-1-sunilvl@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Based on privileged spec 1.13, the RV32 needs to implement MEDELEGH
and HEDELEGH for exception codes 32-47 for reserving and exception codes
48-63 for custom use. Add the CSR number though the implementation is
just reading zero and writing ignore. Besides, for accessing HEDELEGH, it
should be controlled by mstateen0 'P1P13' bit.
Signed-off-by: Fea.Wang <fea.wang@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240606135454.119186-5-fea.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This patch implements insert/remove software breakpoint process.
For RISC-V, GDB treats single-step similarly to breakpoint: add a
breakpoint at the next step address, then continue. So this also
works for single-step debugging.
Implement kvm_arch_update_guest_debug(): Set the control flag
when there are active breakpoints. This will help KVM to know
the status in the userspace.
Add some stubs which are necessary for building, and will be
implemented later.
Signed-off-by: Chao Du <duchao@eswincomputing.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240606014501.20763-2-duchao@eswincomputing.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The Linux DT docs for imsic [1] predicts an 'interrupt-controller@addr'
node, not 'imsic@addr', given this node inherits the
'interrupt-controller' node.
[1] Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
Reported-by: Conor Dooley <conor@kernel.org>
Fixes: 28d8c28120 ("hw/riscv: virt: Add optional AIA IMSIC support to virt machine")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240531202759.911601-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We'll change the aplic DT nodename in the next patch and the name is
hardcoded in 2 different functions. Create a helper to change a single
place later.
While we're at it, in create_fdt_socket_aplic(), move 'aplic_name'
inside the conditional to avoid allocating a string that won't be used
when socket == NULL.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240531202759.911601-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Qemu maps IRQs 0:15 for core interrupts and 16 onward for
guest interrupts which are later translated to hgiep in
`riscv_cpu_set_irq()` function.
With virtual IRQ support added, software now can fully
use the whole local interrupt range without any actual
hardware attached.
This change moves the guest interrupt range after the
core local interrupt range to avoid clash.
Fixes: 1697837ed9 ("target/riscv: Add M-mode virtual interrupt and IRQ filtering support.")
Fixes: 40336d5b1d ("target/riscv: Add HS-mode virtual interrupt and IRQ filtering support.")
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240520125157.311503-3-rkanwal@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
AIA extends the width of all IRQ CSRs to 64bit even
in 32bit systems by adding missing half CSRs.
This seems to be missed while adding support for
virtual IRQs. The whole logic seems to be correct
except the width of the masks.
Fixes: 1697837ed9 ("target/riscv: Add M-mode virtual interrupt and IRQ filtering support.")
Fixes: 40336d5b1d ("target/riscv: Add HS-mode virtual interrupt and IRQ filtering support.")
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240520125157.311503-2-rkanwal@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
SD/MMC patches queue
One fix and various cleanups for the SD card model.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmZ5cRUACgkQ4+MsLN6t
# wN59Qw//cUdjD287pB5Ml5aQqr9sOTyVnHUceZtz7AOZ5w8RM2tlPDgOImeLOvU6
# OV7qfWvNaUxtQxhfh5jpe8Pj4eHBtRQzA6a1AWToEvnN4189QWHZpqf5TUa4AlFS
# uAk7k2TkoNv9zbNKca0bP3L1x6sT9l0VPZBLaLbgdXDIX2ycD0r3NVQxXb/bJRgM
# 6pFRcLCF/isKzLQDwqnTa11hB/JDTvOU7xnY0kazGRvyWjbSvE2sOJzLNJXHkW0I
# /FNfRbOKJo2t+47Z5qSXUFFLeIEBTy7VqNBsOQ6sMIgrWzbOSrtBcuxKp0p9NCGH
# fdZHlDVRnNGXewUya4RjbmXiCNuGL4zJ82b2BaQZVd5ZwU2opIr8xO96WCojQ4dZ
# +Dq3uv7su3PUVOh95i38Eo93OG9jXFx642XD4q2uKu5j70IoGXAkIoLUcFkZZdGS
# 9rCsaNUHyHJrN6nXf3Cekvkqxz36p6QXaUF9I1vB0JF6CrexMD35sBUK+RE9k4uW
# LnqL7ZwQDGDGVl3kPS/VCXv1mMim4aRLSEIveq7Ui6dKzaaJMIIodZ8CFMuyTTsD
# cGE+Cd053nf6SzX3+kEZftNdjtJ906O8xIAw+RNdARYx003l4kUxgsPDk7ELyzIP
# Tb+VlZl2P+ROJmeWvRMTW7ZQ49M9IEMrg8zlGF4hLCxB1JndeOA=
# =O5er
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 24 Jun 2024 06:13:57 AM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'sdmmc-20240624' of https://github.com/philmd/qemu:
hw/sd/sdcard: Add comments around registers and commands
hw/sd/sdcard: Inline BLK_READ_BLOCK / BLK_WRITE_BLOCK macros
hw/sd/sdcard: Add sd_invalid_mode_for_cmd to report invalid mode switch
hw/sd/sdcard: Only call sd_req_get_address() where address is used
hw/sd/sdcard: Factor sd_req_get_address() method out
hw/sd/sdcard: Only call sd_req_get_rca() where RCA is used
hw/sd/sdcard: Factor sd_req_get_rca() method out
hw/sd/sdcard: Have cmd_valid_while_locked() return a boolean value
hw/sd/sdcard: Trace update of block count (CMD23)
hw/sd/sdcard: Remove explicit entries for illegal commands
hw/sd/sdcard: Remove ACMD6 handler for SPI mode
hw/sd/sdcard: Use Load/Store API to fill some CID/CSD registers
hw/sd/sdcard: Use registerfield CSR::CURRENT_STATE definition
hw/sd/sdcard: Use HWBLOCK_SHIFT definition instead of magic values
hw/sd/sdcard: Fix typo in SEND_OP_COND command name
hw/sd/sdcard: Rewrite sd_cmd_ALL_SEND_CID using switch case (CMD2)
hw/sd/sdcard: Correct code indentation
hw/sd/sdcard: Avoid OOB in sd_read_byte() during unexpected CMD switch
bswap: Add st24_be_p() to store 24 bits in big-endian order
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
vfio_container_destroy() clears the resources allocated
VFIOContainerBase object. Now that VFIOContainerBase is a QOM object,
add an instance_finalize() handler to do the cleanup. It will be
called through object_unref().
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Just as we did for the VFIOContainerBase object, introduce an
instance_init() handler for the legacy VFIOContainer object and do the
specific initialization there.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIOContainerBase was made a QOM interface because we believed that a
QOM object would expose all the IOMMU backends to the QEMU machine and
human interface. This only applies to user creatable devices or objects.
Change the VFIOContainerBase nature from interface to object and make
the necessary adjustments in the VFIO_IOMMU hierarchy.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since the QEMU struct type representing the VFIO container is deduced
from the IOMMU type exposed by the host, this type should be well
defined *before* creating the container struct. This will be necessary
to instantiate a QOM object of the correct type in future changes.
Rework vfio_set_iommu() to extract the part doing the container
initialization and move it under vfio_create_container().
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This routine allocates the QEMU struct type representing the VFIO
container. It is minimal currently and future changes will do more
initialization.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Rework vfio_get_iommu_class() to return a literal class name instead
of a class object. We will need this name to instantiate the object
later on. Since the default case asserts, remove the error report as
QEMU will simply abort before.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Assign the base container VFIOAddressSpace 'space' pointer in
vfio_address_space_insert(). The ultimate goal is to remove
vfio_container_init() and instead rely on an .instance_init() handler
to perfom the initialization of VFIOContainerBase.
To be noted that vfio_connect_container() will assign the 'space'
pointer later in the execution flow. This should not have any
consequence.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
It prepares ground for a future change initializing the 'space' pointer
of VFIOContainerBase. The goal is to replace vfio_container_init() by
an .instance_init() handler when VFIOContainerBase is QOMified.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Extract vIOMMU code from vfio_sync_dirty_bitmap() to a new function and
restructure the code.
This is done in preparation for optimizing vIOMMU device dirty page
tracking. No functional changes intended.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[ clg: - Rebased on upstream
- Fixed typo in commit log ]
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Separate the changes that update the ranges from the listener, to
make it reusable in preparation to expand its use to vIOMMU support.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
[ clg: - Rebased on upstream
- Introduced vfio_dirty_tracking_update_range()
- Fixed typ in commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_devices_dma_logging_start() takes an 'Error **' argument,
best practices suggest to return a bool. See the api/error.h Rules
section. It will simplify potential changes coming after.
vfio_container_set_dirty_page_tracking() could be modified in the same
way but the errno value can be saved in the migration stream when
called from vfio_listener_log_global_stop().
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since the host IOVA ranges are now passed through the
PCIIOMMUOps set_host_resv_regions and we have removed
the only implementation of iommu_set_iova_range() in
the virtio-iommu and the only call site in vfio/common,
let's retire the IOMMU MR API and its memory wrapper.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
As we have just removed the only implementation of
iommu_set_iova_ranges IOMMU MR callback in the virtio-iommu,
let's remove the call to the memory wrapper. Usable IOVA ranges
are now conveyed through the PCIIOMMUOps in VFIO-PCI.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Now that we use PCIIOMMUOps to convey information about usable IOVA
ranges we do not to implement the iommu_set_iova_ranges IOMMU MR
callback.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Compute the host reserved regions in virtio_iommu_set_iommu_device().
The usable IOVA regions are retrieved from the HostIOMMUDevice.
The virtio_iommu_set_host_iova_ranges() helper turns usable regions
into complementary reserved regions while testing the inclusion
into existing ones. virtio_iommu_set_host_iova_ranges() reuse the
implementation of virtio_iommu_set_iova_ranges() which will be
removed in subsequent patches. rebuild_resv_regions() is just moved.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Store the aliased bus and devfn in the HostIOMMUDevice.
This will be useful to handle info that are iommu group
specific and not device specific (such as reserved
iova ranges).
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a new HostIOMMUDevice callback that allows to
retrieve the usable IOVA ranges.
Implement this callback in the legacy VFIO and IOMMUFD VFIO
host iommu devices. This relies on the VFIODevice agent's
base container iova_ranges resource.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Implement PCIIOMMUOPs [set|unset]_iommu_device() callbacks.
In set(), the HostIOMMUDevice handle is stored in a hash
table indexed by PCI BDF. The object will allow to retrieve
information related to the physical IOMMU.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Store the agent device (VFIO or VDPA) in the host IOMMU device.
This will allow easy access to some of its resources.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
If check fails, host device (either VFIO or VDPA device) is not
compatible with current vIOMMU config and should not be passed to
guest.
Only aw_bits is checked for now, we don't care about other caps
before scalable modern mode is introduced.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
In set call, we take a reference of HostIOMMUDevice and store it
in hash table indexed by PCI BDF.
Note this BDF index is device's real BDF not the aliased one which
is different from the index of VTDAddressSpace. There can be multiple
assigned devices under same virtual iommu group and share same
VTDAddressSpace, but each has its own HostIOMMUDevice.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Extract cap/ecap initialization in vtd_cap_init() to make code
cleaner.
No functional change intended.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
set/unset HostIOMMUDevice for a given PCI device.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Extract out pci_device_get_iommu_bus_devfn() from
pci_device_iommu_address_space() to facilitate
implementation of pci_device_[set|unset]_iommu_device()
in following patch.
No functional change intended.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Create host IOMMU device instance in vfio_attach_device() and call
.realize() to initialize it further.
Introuduce attribute VFIOIOMMUClass::hiod_typename and initialize
it based on VFIO backend type. It will facilitate HostIOMMUDevice
creation in vfio_attach_device().
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
It calls iommufd_backend_get_device_info() to get host IOMMU
related information and translate it into HostIOMMUDeviceCaps
for query with .get_cap().
For aw_bits, use the same way as legacy backend by calling
vfio_device_get_aw_bits() which is common for different vendor
IOMMU.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
The realize function populates the capabilities. For now only the
aw_bits caps is computed for legacy backend.
Introduce a helper function vfio_device_get_aw_bits() which calls
range_get_last_bit() to get host aw_bits and package it in
HostIOMMUDeviceCaps for query with .get_cap(). This helper will
also be used by iommufd backend.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
This helper get the highest 1 bit position of the upper bound.
If the range is empty or upper bound is zero, -1 is returned.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device under
iommufd backend. It is abstract, because it is going to be derived
into VFIO or VDPA type'd device.
It will have its own .get_cap() implementation.
TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO is a sub-class of
TYPE_HOST_IOMMU_DEVICE_IOMMUFD, represents a VFIO type'd host IOMMU
device under iommufd backend. It will be created during VFIO device
attaching and passed to vIOMMU.
It will have its own .realize() implementation.
Opportunistically, add missed header to include/sysemu/iommufd.h.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO represents a host IOMMU device under
VFIO legacy container backend.
It will have its own realize implementation.
Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
Different platform IOMMU can support different elements.
Currently only two elements, type and aw_bits, type hints the host
platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
host IOMMU address width.
Introduce .get_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
is supported.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
A HostIOMMUDevice is an abstraction for an assigned device that is protected
by a physical IOMMU (aka host IOMMU). The userspace interaction with this
physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
backend or the new iommufd backend. The assigned device can be a VFIO device
or a VDPA device. The HostIOMMUDevice is needed to interact with the host
IOMMU that protects the assigned device. It is especially useful when the
device is also protected by a virtual IOMMU as this latter use the translation
services of the physical IOMMU and is constrained by it. In that context the
HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
capabilities such as the supported address width. In the future, the virtual
IOMMU will use the HostIOMMUDevice to program the guest page tables in the
first translation stage of the physical IOMMU.
Introduce .realize() to initialize HostIOMMUDevice further after instance init.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
maintainer updates (plugins, gdbstub):
- add missing include guard comment to gdbstub.h
- move gdbstub enums into separate header
- move qtest_[get|set]_virtual_clock functions
- allow plugins to manipulate the virtual clock
- introduce an Instructions Per Second plugin
- fix inject_mem_cb rw mask tests
- allow qemu_plugin_vcpu_mem_cb to shortcut when no memory cbs
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZ5OjoACgkQ+9DbCVqe
# KkQPlwf/VK673BAjYktuCLnf3DgWvIkkiHWwzBREP5MmseUloLjK2CQPLY/xWZED
# pbA/1OSzHViD/mvG5wTxwef36b9PIleWj5/YwBxGlrb/rh6hCd9004pZK4EMI3qU
# 53SK8Qron8TIXjey6XfmAY8rcl030GsHr0Zqf5i2pZKE5g0iaGlM3Cwkpo0SxQsu
# kMNqiSs9NzX7LxB+YeuAauIvC1YA2F/MGTXeFCTtO9Beyp5oV7oOI+2zIvLjlG5M
# Z5hKjG/STkNOteoIBGZpe1+QNpoGHSBoGE3nQnGpXb82iLx1KVBcKuQ6GoWGv1Wo
# hqiSh9kJX479l0mLML+IzaDsgSglbg==
# =pvWx
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 24 Jun 2024 02:19:54 AM PDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-june24-240624-1' of https://gitlab.com/stsquad/qemu:
accel/tcg: Avoid unnecessary call overhead from qemu_plugin_vcpu_mem_cb
plugins: fix inject_mem_cb rw masking
contrib/plugins: add Instructions Per Second (IPS) example for cost modeling
plugins: add migration blocker
plugins: add time control API
qtest: move qtest_{get, set}_virtual_clock to accel/qtest/qtest.c
sysemu: generalise qtest_warp_clock as qemu_clock_advance_virtual_time
qtest: use cpu interface in qtest_clock_warp
sysemu: add set_virtual_time to accel ops
plugins: Ensure register handles are not NULL
gdbstub: move enums into separate header
include/exec: add missing include guard comment
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This plugin uses the new time control interface to make decisions
about the state of time during the emulation. The algorithm is
currently very simple. The user specifies an ips rate which applies
per core. If the core runs ahead of its allocated execution time the
plugin sleeps for a bit to let real time catch up. Either way time is
updated for the emulation as a function of total executed instructions
with some adjustments for cores that idle.
Examples
--------
Slow down execution of /bin/true:
$ num_insn=$(./build/qemu-x86_64 -plugin ./build/tests/plugin/libinsn.so -d plugin /bin/true |& grep total | sed -e 's/.*: //')
$ time ./build/qemu-x86_64 -plugin ./build/contrib/plugins/libips.so,ips=$(($num_insn/4)) /bin/true
real 4.000s
Boot a Linux kernel simulating a 250MHz cpu:
$ /build/qemu-system-x86_64 -kernel /boot/vmlinuz-6.1.0-21-amd64 -append "console=ttyS0" -plugin ./build/contrib/plugins/libips.so,ips=$((250*1000*1000)) -smp 1 -m 512
check time until kernel panic on serial0
Tested in system mode by booting a full debian system, and using:
$ sysbench cpu run
Performance decrease linearly with the given number of ips.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240530220610.1245424-7-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240620152220.2192768-11-alex.bennee@linaro.org>
It will be useful later to assert only ADTC commands
(Addressed point-to-point Data Transfer Commands, defined
as the 'sd_adtc' enum) extract the address value from the
command argument.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240621080554.18986-18-philmd@linaro.org>
Extract sd_cmd_get_address() so we can re-use it
in various SDProto handlers. Use CARD_CAPACITY and
HWBLOCK_SHIFT definitions instead of magic values.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240621080554.18986-17-philmd@linaro.org>
It will be useful later to assert only AC commands
(Addressed point-to-point Commands, defined as the
'sd_ac' enum) extract the RCA value from the command
argument.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240621080554.18986-16-philmd@linaro.org>
There is no ACMD6 command in SPI mode, remove the pointless
handler introduced in commit 946897ce18 ("sdcard: handles
more commands in SPI mode"). Keep sd_cmd_unimplemented()
since we'll reuse it later.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240621080554.18986-8-philmd@linaro.org>
The oldest model that IBM still supports is the z13. Considering
that each generation can "emulate" the previous two generations
in hardware (via the "IBC" feature of the CPUs), this means that
everything that is older than z114/196 is not an officially supported
CPU model anymore. The Linux kernel still support the z10, so if
we also take this into account, everything older than that can
definitely be considered as a legacy CPU model.
For downstream builds of QEMU, we would like to be able to disable
these legacy CPUs in the build. Thus add a CONFIG switch that can be
used to disable them (and old machine types that use them by default).
Message-Id: <20240614125019.588928-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Beside migration-test.c, there is nowadays migration-helpers.[ch],
too, so update the entry in the migration section to also cover these
files now.
While we're at it, exclude these files in the common qtest section,
since the migration test is well covered by the migration maintainers
already. Since the test is under very active development, it was causing
a lot of distraction to the generic qtest maintainers with regards to
the patches that need to be reviewed by the migration maintainers anyway.
Message-ID: <20240619055447.129943-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Last use of VMSTATE_ARRAY_TEST() was removed in commit 46baa9007f
("migration/i386: Remove old non-softfloat 64bit FP support"), we
can safely get rid of it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Make sure there will be an event for postcopy recovery, irrelevant of
whether the reconnect will success, or when the failure happens.
The added new case is to fail early in postcopy recovery, in which case it
didn't even reach RECOVER stage on src (and in real life it'll be the same
to dest, but the test case is just slightly more involved due to the dual
socketpair setup).
To do that, rename the postcopy_recovery_test_fail to reflect either stage
to fail, instead of a boolean.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Making sure the postcopy-recover-setup status is present in the postcopy
failure unit test. Note that it only applies to src QEMU not dest.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This changes the way the ohci emulation handles a Transfer Descriptor
with "Buffer End" set to "Current Buffer Pointer" - 1, specifically
in the case of a zero-length packet.
The OHCI spec 4.3.1.2 Table 4-2 specifies td.cbp to be zero for a
zero-length packet. Peter Maydell tracked down commit 1328fe0c32
(hw: usb: hcd-ohci: check len and frame_number variables) where qemu
started checking this according to the spec.
What this patch does is loosen the qemu ohci implementation to allow a
zero-length packet if td.be (Buffer End) is set to td.cbp - 1, and with a
non-zero td.cbp value.
The spec is unclear whether this is valid or not -- it is not the
clearly documented way to send a zero length TD (which is CBP=BE=0),
but it isn't specifically forbidden. Actual hw seems to be ok with it.
Does any OS rely on this behavior? There have been no reports to
qemu-devel of this problem.
This is attempting to have qemu behave like actual hardware,
but this is just a minor change.
With a tiny OS[1] that boots and executes a test, the issue can be seen:
* OS that sends USB requests to a USB mass storage device
but sends td.cbp = td.be + 1
* qemu 4.2
* qemu HEAD (4e66a0854)
* Actual OHCI controller (hardware)
Command line:
qemu-system-x86_64 -m 20 \
-device pci-ohci,id=ohci \
-drive if=none,format=raw,id=d,file=testmbr.raw \
-device usb-storage,bus=ohci.0,drive=d \
--trace "usb_*" --trace "ohci_*" -D qemu.log
Results are:
qemu 4.2 | qemu HEAD | actual HW
-----------+------------+-----------
works fine | ohci_die() | works fine
Tip: if the flags "-serial pty -serial stdio" are added to the command line
the test will output USB requests like this:
Testing qemu HEAD:
> Free mem 2M ohci port2 conn FS
> setup { 80 6 0 1 0 0 8 0 }
> ED info=80000 { mps=8 en=0 d=0 } tail=c20920
> td0 c20880 nxt=c20960 f2000000 setup cbp=c20900 be=c20907
> td1 c20960 nxt=c20980 f3140000 in cbp=c20908 be=c2090f
> td2 c20980 nxt=c20920 f3080000 out cbp=c20910 be=c2090f ohci20 host err
> usb stopped
And in qemu.log:
usb_ohci_iso_td_bad_cc_overrun ISO_TD start_offset=0x00c20910 > next_offset=0x00c2090f
Testing qemu 4.2:
> Free mem 2M ohci port2 conn FS
> setup { 80 6 0 1 0 0 8 0 }
> ED info=80000 { mps=8 en=0 d=0 } tail=620920
> td0 620880 nxt=620960 f2000000 setup cbp=620900 be=620907 cbp=0 be=620907
> td1 620960 nxt=620980 f3140000 in cbp=620908 be=62090f cbp=0 be=62090f
> td2 620980 nxt=620920 f3080000 out cbp=620910 be=62090f cbp=0 be=62090f
> rx { 12 1 0 2 0 0 0 8 }
> setup { 0 5 1 0 0 0 0 0 } tx {}
> ED info=80000 { mps=8 en=0 d=0 } tail=620880
> td0 620920 nxt=620960 f2000000 setup cbp=620900 be=620907 cbp=0 be=620907
> td1 620960 nxt=620880 f3100000 in cbp=620908 be=620907 cbp=0 be=620907
> setup { 80 6 0 1 0 0 12 0 }
> ED info=80001 { mps=8 en=0 d=1 } tail=620960
> td0 620880 nxt=6209c0 f2000000 setup cbp=620920 be=620927 cbp=0 be=620927
> td1 6209c0 nxt=6209e0 f3140000 in cbp=620928 be=620939 cbp=0 be=620939
> td2 6209e0 nxt=620960 f3080000 out cbp=62093a be=620939 cbp=0 be=620939
> rx { 12 1 0 2 0 0 0 8 f4 46 1 0 0 0 1 2 3 1 }
> setup { 80 6 0 2 0 0 0 1 }
> ED info=80001 { mps=8 en=0 d=1 } tail=620880
> td0 620960 nxt=6209a0 f2000000 setup cbp=620a20 be=620a27 cbp=0 be=620a27
> td1 6209a0 nxt=6209c0 f3140004 in cbp=620a28 be=620b27 cbp=620a48 be=620b27
> td2 6209c0 nxt=620880 f3080000 out cbp=620b28 be=620b27 cbp=0 be=620b27
> rx { 9 2 20 0 1 1 4 c0 0 9 4 0 0 2 8 6 50 0 7 5 81 2 40 0 0 7 5 2 2 40 0 0 }
> setup { 0 9 1 0 0 0 0 0 } tx {}
> ED info=80001 { mps=8 en=0 d=1 } tail=620900
> td0 620880 nxt=620940 f2000000 setup cbp=620a00 be=620a07 cbp=0 be=620a07
> td1 620940 nxt=620900 f3100000 in cbp=620a08 be=620a07 cbp=0 be=620a07
[1] The OS disk image has been emailed to philmd@linaro.org, mjt@tls.msk.ru,
and kraxel@redhat.com:
* testCbpOffBy1.img.xz
* sha256: f87baddcb86de845de12f002c698670a426affb40946025cc32694f9daa3abed
Signed-off-by: David Hubbard <dmamfmgm@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Multiple warning messages and corresponding backtraces are observed when Linux
guest is booted on the host with Fujitsu CPUs. One of them is shown as below.
[ 0.032443] ------------[ cut here ]------------
[ 0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
CTR_EL0.CWG (128 < 256)
[ 0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
arch_setup_dma_ops+0xbc/0xcc
[ 0.032470] Modules linked in:
[ 0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
[ 0.032481] Hardware name: linux,dummy-virt (DT)
[ 0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
[ 0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
[ 0.032501] sp : ffff80008003b860
[ 0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
[ 0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
[ 0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
[ 0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
[ 0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
[ 0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
[ 0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
[ 0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
[ 0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
[ 0.032564] Call trace:
[ 0.032566] arch_setup_dma_ops+0xbc/0xcc
[ 0.032572] of_dma_configure_id+0x138/0x300
[ 0.032591] amba_dma_configure+0x34/0xc0
[ 0.032600] really_probe+0x78/0x3dc
[ 0.032614] __driver_probe_device+0x108/0x160
[ 0.032619] driver_probe_device+0x44/0x114
[ 0.032624] __device_attach_driver+0xb8/0x14c
[ 0.032629] bus_for_each_drv+0x88/0xe4
[ 0.032634] __device_attach+0xb0/0x1e0
[ 0.032638] device_initial_probe+0x18/0x20
[ 0.032643] bus_probe_device+0xa8/0xb0
[ 0.032648] device_add+0x4b4/0x6c0
[ 0.032652] amba_device_try_add.part.0+0x48/0x360
[ 0.032657] amba_device_add+0x104/0x144
[ 0.032662] of_amba_device_create.isra.0+0x100/0x1c4
[ 0.032666] of_platform_bus_create+0x294/0x35c
[ 0.032669] of_platform_populate+0x5c/0x150
[ 0.032672] of_platform_default_populate_init+0xd0/0xec
[ 0.032697] do_one_initcall+0x4c/0x2e0
[ 0.032701] do_initcalls+0x100/0x13c
[ 0.032707] kernel_init_freeable+0x1c8/0x21c
[ 0.032712] kernel_init+0x28/0x140
[ 0.032731] ret_from_fork+0x10/0x20
[ 0.032735] ---[ end trace 0000000000000000 ]---
In Linux, a check is applied to every device which is exposed through
device-tree node. The warning message is raised when the device isn't
DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
(128 bytes). The cache line is sorted from CTR_EL0[CWG], which corresponds
to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
through 'dma-coherent' in their device-tree nodes or parent nodes.
This happens even when the device doesn't implement or use DMA at all,
for legacy reasons.
Fix the issue by adding 'dma-coherent' property to the device-tree root
node, meaning all devices are capable of DMA coherent by default.
This both suppresses the spurious kernel warnings and also guards
against possible future QEMU bugs where we add a DMA-capable device
and forget to mark it as dma-coherent.
Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-id: 20240612020506.307793-1-zhenyzha@redhat.com
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For some use-cases, it is helpful to have more than one UART
available to the guest. If the second UART slot is not already used
for a TrustZone Secure-World-only UART, create it as a NonSecure UART
only when the user provides a serial backend (e.g. via a second
-serial command line option).
This avoids problems where existing guest software only expects a
single UART, and gets confused by the second UART in the DTB. The
major example of this is older EDK2 firmware, which will send the
GRUB bootloader output to UART1 and the guest serial output to UART0.
Users who want to use both UARTs with a guest setup including EDK2
are advised to update to EDK2 release edk2-stable202311 or newer.
(The prebuilt EDK2 blobs QEMU upstream provides are new enough.)
The relevant EDK2 changes are the ones described here:
https://bugzilla.tianocore.org/show_bug.cgi?id=4577
Inspired-by: Axel Heider <axel.heider@hensoldt.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240610162343.2131524-4-peter.maydell@linaro.org
If there is more than one UART in the DTB, then there is no guarantee
on which order a guest is supposed to initialise them. The standard
solution to this is "serialN" entries in the "/aliases" node of the
dtb which give the nodename of the UARTs.
At the moment we only have two UARTs in the DTB when one is for
the Secure world and one for the Non-Secure world, so this isn't
really a problem. However if we want to add a second NS UART we'll
need the aliases to ensure guests pick the right one.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240610162343.2131524-2-peter.maydell@linaro.org
This commit modifies the dwc2_hsotg_read() and dwc2_hsotg_write() functions
to handle invalid address access gracefully. Instead of using
g_assert_not_reached(), which causes the program to abort, the functions
now log an error message and return a default value for reads or do
nothing for writes.
This change prevents the program from aborting and provides clear log
messages indicating when an invalid memory address is accessed.
Reproducer:
cat << EOF | qemu-system-aarch64 -display none \
-machine accel=qtest, -m 512M -machine raspi2b -m 1G -nodefaults \
-usb -drive file=null-co://,if=none,format=raw,id=disk0 -device \
usb-storage,port=1,drive=disk0 -qtest stdio
readl 0x3f980dfb
EOF
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Paul Zimmerman <pauldzim@gmail.com>
Message-id: 20240618135610.3109175-1-zheyuma97@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This commit updates the a9_gtimer_get_current_cpu() function to handle
cases where QTest is enabled. When QTest is used, it returns 0 instead
of dereferencing the current_cpu, which can be NULL. This prevents the
program from crashing during QTest runs.
Reproducer:
cat << EOF | qemu-system-aarch64 -display \
none -machine accel=qtest, -m 512M -machine npcm750-evb -qtest stdio
writel 0xf03fe20c 0x26d7468c
EOF
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240618144009.3137806-1-zheyuma97@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The 'char' component:
* includes the no-longer-present qemu-char.c, which has been
long since split into the chardev/ backend code
* also includes the hw/char devices
Split it into two components:
* char is the hw/char devices
* chardev is the chardev backends
with regexes matching our current sources.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240604145934.1230583-3-peter.maydell@linaro.org
Since commit 83aa1baa06 we have been running the build for Coverity
Scan as a Gitlab CI job, rather than the old setup where it was run
on a local developer's machine. This is working well, but the
absolute paths of files are different for the Gitlab CI job, which
means that the regexes we use to identify Coverity components no
longer work. With Gitlab CI builds the file paths are of the form
/builds/qemu-project/qemu/accel/kvm/kvm-all.c
rather than the old
/qemu/accel/kvm/kvm-all.c
and our regexes all don't match.
Update all the regexes to start with .*/qemu/ . This will hopefully
avoid the need to change them again in future if the build path
changes again.
This change was made with a search-and-replace of (/qemu)?
to .*/qemu .
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240604145934.1230583-2-peter.maydell@linaro.org
Julien reported that he has seen strange behaviour when running
Xen on QEMU using GICv2. When Xen migrates a guest's vCPU from
one pCPU to another while the vCPU is handling an interrupt, the
guest is unable to properly deactivate interrupts.
Looking at it a little closer, our GICv2 model treats
deactivation of SPI lines as if they were PPI's, i.e banked per
CPU core. The state for active interrupts should only be banked
for PPI lines, not for SPI lines.
Make deactivation of SPI lines unbanked, similar to how we
handle writes to GICD_ICACTIVER.
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Message-id: 20240605143044.2029444-2-edgar.iglesias@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Returning an uint32_t casted to a gint from g_cmp_ids causes the tx queue to
become wrongly sorted when executing g_slist_sort. Fix this by always
returning -1 or 1 from g_cmp_ids based on the ID comparison instead.
Also, if two message IDs are the same, sort them by using their index and
transmit the message at the lowest index first.
Signed-off-by: Shiva sagar Myana <Shivasagar.Myana@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Message-id: 20240603051732.3334571-1-Shivasagar.Myana@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce a small helper to wait for a migration event, generalized from
the incoming migration path. Make the helper easier to use by allowing it
to keep waiting until the expected event is received.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Libvirt should always enable it, so it'll be nice qtest also cover that for
all tests on both sides. migrate_incoming_qmp() used to enable it only on
dst, now we enable them on both, as we'll start to sanity check events even
on the src QEMU.
We'll need to leave the one in migrate_incoming_qmp(), because
virtio-net-failover test uses that one only, and it relies on the events to
work.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Most of them are not needed, we can stick with one ifdef inside
postcopy_recover_fail() so as to cover the scm right tricks only.
The tests won't run on windows anyway due to has_uffd always false.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Firstly, the "Paused" state was added in the wrong place before. The state
machine section was describing PostcopyState, rather than MigrationStatus.
Drop the Paused state descriptions.
Then in the postcopy recover session, add more information on the state
machine for MigrationStatus in the lines. Add the new RECOVER_SETUP phase.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
[fix typo s/reconnects/reconnect]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This patch adds a migration state on src called "postcopy-recover-setup".
The new state will describe the intermediate step starting from when the
src QEMU received a postcopy recovery request, until the migration channels
are properly established, but before the recovery process take place.
The request came from Libvirt where Libvirt currently rely on the migration
state events to detect migration state changes. That works for most of the
migration process but except postcopy recovery failures at the beginning.
Currently postcopy recovery only has two major states:
- postcopy-paused: this is the state that both sides of QEMU will be in
for a long time as long as the migration channel was interrupted.
- postcopy-recover: this is the state where both sides of QEMU handshake
with each other, preparing for a continuation of postcopy which used to
be interrupted.
The issue here is when the recovery port is invalid, the src QEMU will take
the URI/channels, noticing the ports are not valid, and it'll silently keep
in the postcopy-paused state, with no event sent to Libvirt. In this case,
the only thing Libvirt can do is to poll the migration status with a proper
interval, however that's less optimal.
Considering that this is the only case where Libvirt won't get a
notification from QEMU on such events, let's add postcopy-recover-setup
state to mimic what we have with the "setup" state of a newly initialized
migration, describing the phase of connection establishment.
With that, postcopy recovery will have two paths to go now, and either path
will guarantee an event generated. Now the events will look like this
during a recovery process on src QEMU:
- Initially when the recovery is initiated on src, QEMU will go from
"postcopy-paused" -> "postcopy-recover-setup". Old QEMUs don't have
this event.
- Depending on whether the channel re-establishment is succeeded:
- In succeeded case, src QEMU will move from "postcopy-recover-setup"
to "postcopy-recover". Old QEMUs also have this event.
- In failure case, src QEMU will move from "postcopy-recover-setup" to
"postcopy-paused" again. Old QEMUs don't have this event.
This guarantees that Libvirt will always receive a notification for
recovery process properly.
One thing to mention is, such new status is only needed on src QEMU not
both. On dest QEMU, the state machine doesn't change. Hence the events
don't change either. It's done like so because dest QEMU may not have an
explicit point of setup start. E.g., it can happen that when dest QEMUs
doesn't use migrate-recover command to use a new URI/channel, but the old
URI/channels can be reused in recovery, in which case the old ports simply
can work again after the network routes are fixed up.
Add a new helper postcopy_is_paused() detecting whether postcopy is still
paused, taking RECOVER_SETUP into account too. When using it on both
src/dst, a slight change is done altogether to always wait for the
semaphore before checking the status, because for both sides a sem_post()
will be required for a recovery.
Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Buglink: https://issues.redhat.com/browse/RHEL-38485
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Destination QEMU can setup incoming ports for two purposes: either a fresh
new incoming migration, in which QEMU will switch to SETUP for channel
establishment, or a paused postcopy migration, in which QEMU will stay in
POSTCOPY_PAUSED until kicking off the RECOVER phase.
Now the state machine worked on dest node for the latter, only because
migrate_set_state() implicitly will become a noop if the current state
check failed. It wasn't clear at all.
Clean it up by providing a helper migration_incoming_state_setup() doing
proper checks over current status. Postcopy-paused will be explicitly
checked now, and then we can bail out for unknown states.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
QEMU uses "int" in most cases even if it stores MigrationStatus. I don't
know why, so let's try to do that right and see what blows up..
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The postcopy thread names on dest QEMU are slightly confusing, partly I'll
need to blame myself on 36f62f11e4 ("migration: Postcopy preemption
preparation on channel creation"). E.g., "fault-fast" reads like a fast
version of "fault-default", but it's actually the fast version of
"postcopy/listen".
Taking this chance, rename all the migration threads with proper rules.
Considering we only have 15 chars usable, prefix all threads with "mig/",
meanwhile identify src/dst threads properly this time. So now most thread
names will look like "mig/DIR/xxx", where DIR will be "src"/"dst", except
the bg-snapshot thread which doesn't have a direction.
For multifd threads, making them "mig/{src|dst}/{send|recv}_%d".
We used to have "live_migration" thread for a very long time, now it's
called "mig/src/main". We may hope to have "mig/dst/main" soon but not
yet.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We always do the flush when finishing one round of scan, and during
complete() phase we should scan one more round making sure no dirty page
existed. In that case we shouldn't need one explicit FLUSH at the end of
complete(), as when reaching there all pages should have been flushed.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add a multifd test for mapped-ram with passing of fds into QEMU. This
is how libvirt will consume the feature.
There are a couple of details to the fdset mechanism:
- multifd needs two distinct file descriptors (not duplicated with
dup()) so it can enable O_DIRECT only on the channels that do
aligned IO. The dup() system call creates file descriptors that
share status flags, of which O_DIRECT is one.
- the open() access mode flags used for the fds passed into QEMU need
to match the flags QEMU uses to open the file. Currently O_WRONLY
for src and O_RDONLY for dst.
Note that fdset code goes under _WIN32 because fd passing is not
supported on Windows.
Reviewed-by: Peter Xu <peterx@redhat.com>
[brought back the qmp_remove_fd() call at the end of the tests]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
With the last few changes to the fdset infrastructure, we now allow
multifd to use an fdset when migrating to a file. This is useful for
the scenario where the management layer wants to have control over the
migration file.
By receiving the file descriptors directly, QEMU can delegate some
high level operating system operations to the management layer (such
as mandatory access control). The management layer might also want to
add its own headers before the migration stream.
Document the "file:/dev/fdset/#" syntax for the multifd migration with
mapped-ram. The requirements for the fdset mechanism are:
- the fdset must contain two fds that are not duplicates between
themselves;
- if direct-io is to be used, exactly one of the fds must have the
O_DIRECT flag set;
- the file must be opened with WRONLY on the migration source side;
- the file must be opened with RDONLY on the migration destination
side.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're about to enable the use of O_DIRECT in the migration code and
due to the alignment restrictions imposed by filesystems we need to
make sure the flag is only used when doing aligned IO.
The migration will do parallel IO to different regions of a file, so
we need to use more than one file descriptor. Those cannot be obtained
by duplicating (dup()) since duplicated file descriptors share the
file status flags, including O_DIRECT. If one migration channel does
unaligned IO while another sets O_DIRECT to do aligned IO, the
filesystem would fail the unaligned operation.
The add-fd QMP command along with the fdset code are specifically
designed to allow the user to pass a set of file descriptors with
different access flags into QEMU to be later fetched by code that
needs to alternate between those flags when doing IO.
Extend the fdset matching to behave the same with the O_DIRECT flag.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The tests are only allowed to run in systems that know about the
O_DIRECT flag and in filesystems which support it.
Note: this also brings back migrate_set_parameter_bool() which went
away when we removed the compression tests. I copied it verbatim.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
When multifd is used along with mapped-ram, we can take benefit of a
filesystem that supports the O_DIRECT flag and perform direct I/O in
the multifd threads. This brings a significant performance improvement
because direct-io writes bypass the page cache which would otherwise
be thrashed by the multifd data which is unlikely to be needed again
in a short period of time.
To be able to use a multifd channel opened with O_DIRECT, we must
ensure that a certain aligment is used. Filesystems usually require a
block-size alignment for direct I/O. The way to achieve this is by
enabling the mapped-ram feature, which already aligns its I/O properly
(see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c).
By setting O_DIRECT on the multifd channels, all writes to the same
file descriptor need to be aligned as well, even the ones that come
from outside multifd, such as the QEMUFile I/O from the main migration
code. This makes it impossible to use the same file descriptor for the
QEMUFile and for the multifd channels. The various flags and metadata
written by the main migration code will always be unaligned by virtue
of their small size. To workaround this issue, we'll require a second
file descriptor to be used exclusively for direct I/O.
The second file descriptor can be obtained by QEMU by re-opening the
migration file (already possible), or by being provided by the user or
management application (support to be added in future patches).
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add the direct-io migration parameter that tells the migration code to
use O_DIRECT when opening the migration stream file whenever possible.
This is currently only used with the mapped-ram migration that has a
clear window guaranteed to perform aligned writes.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We want to make use of the Error object to report fdset errors from
qemu_open_internal() and passing the error pointer to qemu_open_old()
would require changing all callers. Move the file channel to the new
API instead.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
I'm keeping the EACCES because callers expect to be able to look at
errno.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Remove fds right away instead of setting the ->removed flag. We don't
need the extra complexity of having a cleanup function reap the
removed entries at a later time.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
monitor_fdsets_cleanup() currently has three responsibilities:
1- Remove the fds that have been marked for removal(->removed=true) by
qmp_remove_fd(). This is overly complicated, but ok.
2- Remove any file descriptors that have been passed into QEMU and
never duplicated[1,2]. A file descriptor without duplicates
indicates that no part of QEMU has made use of it. This is
problematic because the current implementation does it only if the
guest is not running and the monitor is closed.
3- Remove/free fdsets that have become empty due to the above
removals. This is ok.
The scenario described in (2) is starting to show some cracks now that
we're trying to consume fds from the migration code:
- Doing cleanup every time the last monitor connection closes works to
reap unused fds, but also has the side effect of forcing the
management layer to pass the file descriptors again in case of a
disconnect/re-connect, if that happened to be the only monitor
connection.
Another side effect is that removing an fd with qmp_remove_fd() is
effectively delayed until the last monitor connection closes.
The usage of mon_refcount is also problematic because it's racy.
- Checking runstate_is_running() skips the cleanup unless the VM is
running and avoids premature cleanup of the fds, but also has the
side effect of blocking the legitimate removal of an fd via
qmp_remove_fd() if the VM happens to be in another state.
This affects qmp_remove_fd() and qmp_query_fdsets() in particular
because requesting a removal at a bad time (guest stopped) might
cause an fd to never be removed, or to be removed at a much later
point in time, causing the query command to continue showing the
supposedly removed fd/fdset.
Note that file descriptors that *have* been duplicated are owned by
the code that uses them and will be removed after qemu_close() is
called. Therefore we've decided that the best course of action to
avoid the undesired side-effects is to stop managing non-duplicated
file descriptors.
1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
2- ebe52b592d ("monitor: Prevent removing fd from set during init")
Reviewed-by: Peter Xu <peterx@redhat.com>
[fix logic mistake: s/fdset_free/fdset_free_if_empty]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Introduce new functions to remove and free no longer used fds and
fdsets.
We need those to decouple the remove/free routines from
monitor_fdset_cleanup() which will go away in the next patches.
The new functions:
- monitor_fdset_free/_if_empty() will be used when a monitor
connection closes and when an fd is removed to cleanup any fdset
that is now empty.
- monitor_fdset_fd_free() will be used to remove one or more fds that
have been explicitly targeted by qmp_remove_fd().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Those functions are not needed, one remove function should already
work. Clean it up.
Here the code doesn't really care about whether we need to keep that dupfd
around if close() failed: when that happens something got very wrong,
keeping the dup_fd around the fdsets may not help that situation so far.
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[add missing return statement, removal during traversal is not safe]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add a test for file migration using fdset. The passing of fds is more
complex than using a file path. This is also the scenario where it's
most important we ensure that the initial migration stream offset is
respected because the fdset interface is the one used by the
management layer when providing a non empty migration file.
Note that fd passing is not available on Windows, so anything that
uses add-fd needs to exclude that platform.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
When doing file migration, QEMU accepts an offset that should be
skipped when writing the migration stream to the file. The purpose of
the offset is to allow the management layer to put its own metadata at
the start of the file.
We have tests for this in migration-test, but only testing that the
migration stream starts at the correct offset and not that it actually
leaves the data intact. Unsurprisingly, there's been a bug in that
area that the tests didn't catch.
Fix the tests to write some data to the offset region and check that
it's actually there after the migration.
While here, switch to using g_get_file_contents() which is more
portable than mmap().
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
When the "file:" migration support was added we missed the special
case in the qemu_open_old implementation that allows for a particular
file name format to be used to refer to a set of file descriptors that
have been previously provided to QEMU via the add-fd QMP command.
When using this fdset feature, we should not truncate the migration
file because being given an fd means that the management layer is in
control of the file and will likely already have some data written to
it. This is further indicated by the presence of the 'offset'
argument, which indicates the start of the region where QEMU is
allowed to write.
Fix the issue by replacing the O_TRUNC flag on open by an ftruncate
call, which will take the offset into consideration.
Fixes: 385f510df5 ("migration: file URI offset")
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We forgot to drop the reference to the QIOChannel in the error path of
the offset adjustment. Do it now.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
tcg/loongarch64: Support 64- and 256-bit vectors
tcg/loongarch64: Fix tcg_out_movi vs some pcrel pointers
util/bufferiszero: Split out host include files
util/bufferiszero: Add loongarch64 vector acceleration
accel/tcg: Fix typo causing tb->page_addr[1] to not be recorded
target/sparc: use signed denominator in sdiv helper
linux-user: Make TARGET_NR_setgroups affect only the current thread
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZzRoMdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9Y7gf/ZUTGjCUdAO7W7J5e
# Z3JLUNOfUHO6PxoE05963XJc+APwKiuL6Yo2bnJo6km7WM50CoaX9/7L9CXD7STg
# s3eUJ2p7FfvOADZgO373nqRrB/2mhvoywhDbVJBl+NcRvRUDW8rMqrlSKIAwDIsC
# kwwTWlCfpBSlUgm/c6yCVmt815+sGUPD2k/p+pIzAVUG6fGYAosC2fwPzPajiDGX
# Q+obV1fryKq2SRR2dMnhmPRtr3pQBBkISLuTX6xNM2+CYhYqhBrAlQaOEGhp7Dx3
# ucKjvQFpHgPOSdQxb/HaDv81A20ZUQaydiNNmuKQcTtMx3MsQFR8NyVjH7L+fbS8
# JokjaQ==
# =yVKz
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 19 Jun 2024 01:58:43 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240619' of https://gitlab.com/rth7680/qemu: (24 commits)
tcg/loongarch64: Fix tcg_out_movi vs some pcrel pointers
target/sparc: use signed denominator in sdiv helper
linux-user: Make TARGET_NR_setgroups affect only the current thread
accel/tcg: Fix typo causing tb->page_addr[1] to not be recorded
util/bufferiszero: Add loongarch64 vector acceleration
util/bufferiszero: Split out host include files
tcg/loongarch64: Enable v256 with LASX
tcg/loongarch64: Support LASX in tcg_out_vec_op
tcg/loongarch64: Split out vdvjukN in tcg_out_vec_op
tcg/loongarch64: Remove temp_vec from tcg_out_vec_op
tcg/loongarch64: Support LASX in tcg_out_{mov,ld,st}
tcg/loongarch64: Split out vdvjvk in tcg_out_vec_op
tcg/loongarch64: Support LASX in tcg_out_addsub_vec
tcg/loongarch64: Simplify tcg_out_addsub_vec
tcg/loongarch64: Support LASX in tcg_out_dupi_vec
tcg/loongarch64: Use tcg_out_dup_vec in tcg_out_dupi_vec
tcg/loongarch64: Support LASX in tcg_out_dupm_vec
tcg/loongarch64: Support LASX in tcg_out_dup_vec
tcg/loongarch64: Simplify tcg_out_dup_vec
util/loongarch64: Detect LASX vector support
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Simplify the logic for two-part, 32-bit pc-relative addresses.
Rather than assume all such fit in int32_t, do some arithmetic
and assert a result, do some arithmetic first and then check
to see if the pieces are in range.
Cc: qemu-stable@nongnu.org
Fixes: dacc51720d ("tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi")
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reported-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use inline assembly because no release compiler allows
per-function selection of the ISA.
Tested-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out host/bufferiszero.h.inc for x86, aarch64 and generic
in order to avoid an overlong ifdef ladder.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes a bug in the immediate shifts, because the exact
encoding depends on the element size.
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Each element size has a different encoding, so code cannot
be shared in the same way as with tcg_out_dup_vec.
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can implement this with fld_d, fst_d for load and store,
and then use the normal v128 operations in registers.
This will improve support for guests which use v64.
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit c9274b6bf0 ("target/s390x: start moving TCG-only code
to tcg/") moved mem_helper.c, but the trace-events file is
still in the parent directory, so is the generated trace.h.
Call the s390_skeys_get|set() helper, removing the need
for the trace event shared with the tcg/ sub-directory,
fixing the following build failure:
In file included from ../target/s390x/tcg/mem_helper.c:33:
../target/s390x/tcg/trace.h:1:10: fatal error: 'trace/trace-target_s390x_tcg.h' file not found
#include "trace/trace-target_s390x_tcg.h"
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20240613104415.9643-3-philmd@linaro.org>
The real IPI hardware have dedicated MMIO registers mapped into
memory address space for every core. This is not used by LoongArch
guest software but it is essential for CPU without IOCSR such as
Loongson-3A1000.
Implement it with existing infrastructure.
Acked-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-ID: <20240605-loongson3-ipi-v3-2-ddd2c0e03fa3@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In order to compute the amount of free space (in bytes), the number
of available blocks (f_bavail) should be multiplied by the block
size (f_frsize) instead of the total number of blocks (f_blocks).
Signed-off-by: Fabio D'Urso <fdurso@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240618003657.3344685-1-fdurso@google.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This is a counterpart to the HMP "info pic" command. It is being
added with an "x-" prefix because this QMP command is intended as an
adhoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240610063518.50680-3-philmd@linaro.org>
The VIRTIO Sound Device conforms with the Virtio spec v1.2,
thus only use little endianness.
Remove the suspicious target_words_bigendian() noticed during
code review.
Cc: qemu-stable@nongnu.org
Fixes: eb9ad377bb ("virtio-sound: handle control messages and streams")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240422211830.25606-1-philmd@linaro.org>
PCMachineClass::acpi_data_size was only used by the pc-i440fx-2.0
machine, which got removed. Since it is constant, replace the class
field by a definition (local to hw/i386/pc.c, since not used
elsewhere).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240617071118.60464-24-philmd@linaro.org>
PCMachineClass::enforce_aligned_dimm was only used by the
pc-i440fx-2.1 machine, which got removed. It is now always
true. Remove it, simplifying pc_get_device_memory_range().
Update the comment in Avocado test_phybits_low_pse36().
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240617071118.60464-14-philmd@linaro.org>
Fix the transmission return size because not all bytes could be
transmitted successfully. So, return a successful length instead of a
constant value.
Signed-off-by: Fea.Wang <fea.wang@sifive.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Loading a description from memory may cause a bus-error. In this
case, the DMA should stop working, set the error flag, and return
the failure value.
When calling the loading a description function, it should be noticed
that the function may return a failure value. Breaking the loop in this
case is one of the possible ways to handle it.
Signed-off-by: Fea.Wang <fea.wang@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
* i386: fix issue with cache topology passthrough
* scsi-disk: migrate emulated requests
* i386/sev: fix Coverity issues
* i386/tcg: more conversions to new decoder
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZv6kMUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOn4Af/evnpsae1fm8may1NQmmezKiks/4X
# cR0GaQ7w75Oas05jKsG7Xnrq3Vn6p5wllf3Wf00p7F1iJX18azY9rQgIsUVUgVem
# /EIZk1eM6+mDxuIG0taPxc5Aw3cfIBWAjUmzsXrSr55e/wyiIxZCeUo2zk8Il+iL
# Z4ceNzY5PZzc2Fl10D3cGs/+ynfiDM53ucwe3ve2T6NrxEVfKQPp5jkIUkBUba6z
# zM5O4Q5KTEZYVth1gbDTB/uUJLUFjQ12kCQfRCNX+bEPDHwARr0UWr/Oxtz0jZSd
# FvXohz7tI+v+ph0xHyE4tEFqryvLCII1td2ohTAYZZXNGkjK6XZildngBw==
# =m4BE
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 17 Jun 2024 12:48:19 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (25 commits)
target/i386: SEV: do not assume machine->cgs is SEV
target/i386: convert CMPXCHG to new decoder
target/i386: convert XADD to new decoder
target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder
target/i386: convert SHLD/SHRD to new decoder
target/i386: adapt gen_shift_count for SHLD/SHRD
target/i386: pull load/writeback out of gen_shiftd_rm_T1
target/i386: convert non-grouped, helper-based 2-byte opcodes
target/i386: split X86_CHECK_prot into PE and VM86 checks
target/i386: finish converting 0F AE to the new decoder
target/i386: fix bad sorting of entries in the 0F table
target/i386: replace read_crN helper with read_cr8
target/i386: convert MOV from/to CR and DR to new decoder
target/i386: fix processing of intercept 0 (read CR0)
target/i386: replace NoSeg special with NoLoadEA
target/i386: change X86_ENTRYwr to use T0, use it for moves
target/i386: change X86_ENTRYr to use T0
target/i386: put BLS* input in T1, use generic flag writeback
target/i386: rewrite flags writeback for ADCX/ADOX
target/i386: remove CPUX86State argument from generator functions
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
aspeed queue:
* Add AST2700 support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmZvtLUACgkQUaNDx8/7
# 7KH8Ew/+K7OJYUsRhuLByLjaQ8kCsVdxMCFLtpCL9t6AgrMUXaI6WkkynPMKITQQ
# AHocO76TsWRMp962obnjvXgVRCrtvOI2W5jvgp1Gr554tW7YQClLiGhuf1FeORS9
# ZQhWryoC8vK8ymC7dAS5cyuiddWFUGC04P9lb9oXr88n6goZ1xRfKwM+RttgfCAm
# 79SsK7g3TS8QOWH1kQwIQZyJKzwrw7bTM3Ijv9NmVKa050zWquMRZQeY18fgO6Ae
# p/pGpkf4Bc5iv+kIXoI4UN7Cx74aZoKInQ+DA71gtCWh/s09j9PkvOAfKWYAozD+
# VSaLvw4rvhRxgbs1SjoiMb5dDjJhngfzLhJX/P2FD1LCHRk+/uxk3fDDp2AqvQ6z
# IuWPb8FgWHqeiigcXkTW1JgUS85quIbjWBxreIrQiq+zR50EQy49elMRhzJlKsqZ
# 3/ulk7xf+5M1+wS4bo7r8LPk5K8mFw9b4cxfnx0feZCjrl4ZfeWyDtaKzCAU0MJq
# KfpHo9R98imjVmcRWUouTaFow33OXheLdPFO8PofVnT38a4KIWlkin3zFMdTOAk+
# f8kWMPlXlRpKBYsjvP2aCpoY6CY8bHskdBH7xysM2W1FfKTw3dwZRpt4dgVPxqYj
# KZXiKxzwnC2gGi/wn+EdhZwYy1nNSZYGK8s+jxBXi2UBrwv4PpA=
# =TnR8
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 16 Jun 2024 08:59:49 PM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20240617' of https://github.com/legoater/qemu:
MAINTAINERS: Add reviewers for ASPEED BMCs
docs:aspeed: Add AST2700 Evaluation board
test/avocado/machine_aspeed.py: Add AST2700 test case
aspeed/soc: fix incorrect dram size for AST2700
aspeed: Add an AST2700 eval board
aspeed/soc: Add AST2700 support
aspeed/intc: Add AST2700 support
aspeed/scu: Add AST2700 support
aspeed/smc: Add AST2700 support
aspeed/smc: support different memory region ops for SMC flash region
aspeed/smc: support 64 bits dma dram address
aspeed/smc: support dma start length and 1 byte length unit
aspeed/smc: correct device description
aspeed/sdmc: Add AST2700 support
aspeed/sdmc: fix coding style
aspeed/sdmc: remove redundant macros
aspeed/sli: Add AST2700 support
aspeed/wdt: Add AST2700 support
aspeed/smc: Reintroduce "dram-base" property for AST2700
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since the kvm_dirty_ring_enabled function accesses a null kvm_state
pointer when the KVM acceleration parameter is not specified, running
calc_dirty_rate with the -r or -b option causes a segmentation fault.
Signed-off-by: Masato Imai <mii@sfc.wide.ad.jp>
Message-ID: <20240507025010.1968881-1-mii@sfc.wide.ad.jp>
[Assert kvm_state when kvm_dirty_ring_enabled was called to fix it. - Hyman]
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
There can be other confidential computing classes that are not derived
from sev-common. Avoid aborting when encountering them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the same flag generation code as SHL and SHR, but use
the existing gen_shiftd_rm_T1 function to compute the result
as well as CC_SRC.
Decoding-wise, SHLD/SHRD by immediate count as a 4 operand
instruction because s->T0 and s->T1 actually occupy three op
slots. The infrastructure used by opcodes in the 0F 3A table
works fine.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SHLD/SHRD can have 3 register operands - s->T0, s->T1 and either
1 or CL - and therefore decode->op[2] is taken by the low part
of the register being shifted. Pass X86_OP_* to gen_shift_count
from its current callers and hardcode cpu_regs[R_ECX] as the
shift count.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use gen_ld_modrm/gen_st_modrm, moving them and gen_shift_flags to the
caller. This way, gen_shiftd_rm_T1 becomes something that the new
decoder can call.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These have very simple generators and no need for complex group
decoding. Apart from LAR/LSL which are simplified to use
gen_op_deposit_reg_v and movcond, the code is generally lifted
from translate.c into the generators.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SYSENTER is allowed in VM86 mode, but not in real mode. Split the check
so that PE and !VM86 are covered by separate bits.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is already partly implemented due to VLDMXCSR and VSTMXCSR; finish
the job.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a test case to test Aspeed OpenBMC SDK v09.01 on AST2700 board.
It loads u-boot-nodtb.bin, u-boot.dtb, tfa and optee-os
images to dram first which base address is 0x400000000.
Then, boot and launch 4 cpu cores.
```
qemu-system-aarch64 -machine ast2700-evb
-device loader,force-raw=on,addr=0x400000000,file=workdir/u-boot-nodtb.bin \
-device loader,force-raw=on,addr=uboot_dtb_load_addr,file=workdir/u-boot.dtb\
-device loader,force-raw=on,addr=0x430000000,file=workdir/bl31.bin\
-device loader,force-raw=on,addr=0x430080000,file=workdir/optee/tee-raw.bin\
-device loader,cpu-num=0,addr=0x430000000 \
-device loader,cpu-num=1,addr=0x430000000 \
-device loader,cpu-num=2,addr=0x430000000 \
-device loader,cpu-num=3,addr=0x430000000 \
-smp 4 \
-drive file=workdir/image-bmc,format=raw,if=mtd
```
A test image is downloaded from the ASPEED Forked OpenBMC GitHub release repository :
https://github.com/AspeedTech-BMC/openbmc/releases/
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
AST2700 dram size calculation is not back compatible AST2600.
According to the DDR capacity hardware behavior,
if users write the data to the address which is beyond the ram size,
it would write the data to the "address % ram_size".
For example:
a. sdram base address "0x4 00000000"
b. sdram size 1 GiB
The available address range is from "0x4 00000000" to "0x4 3FFFFFFF".
If users write 0x12345678 to address "0x5 00000000",
the value of DRAM address 0 (base address 0x4 00000000) will be 0x12345678.
Add aspeed_soc_ast2700_dram_init to calculate the dram size and add
memory I/O whose address range is from "max_ram_size - ram_size" to max_ram_size
and its read/write handler to emulate DDR capacity hardware behavior.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
AST2700 CPU is ARM Cortex-A35 which is 64 bits.
Add TARGET_AARCH64 to build this machine.
According to the design of ast2700, it has a bootmcu(riscv-32) which
is used for executing SPL.
Then, CPUs(cortex-a35) execute u-boot, kernel and rofs.
Currently, qemu not support emulate two CPU architectures
at the same machine. Therefore, qemu will only support
to emulate CPU(cortex-a35) side for ast2700
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Initial definitions for a simple machine using an AST2700 SOC (Cortex-a35 CPU).
AST2700 SOC and its interrupt controller are too complex to handle
in the common Aspeed SoC framework. We introduce a new ast2700
class with instance_init and realize handlers.
AST2700 is a 64 bits quad core cpus and support 8 watchdog.
Update maximum ASPEED_CPUS_NUM to 4 and ASPEED_WDTS_NUM to 8.
In addition, update AspeedSocState to support scuio, sli, sliio and intc.
Add TYPE_ASPEED27X0_SOC machine type.
The SDMC controller is unlocked at SPL stage.
At present, only supports to emulate booting
start from u-boot stage. Set SDMC controller
unlocked by default.
In INTC, each interrupt of INT 128 to INT 136 combines 32 interrupts.
It connect GICINT IRQ GPIO-OUTPUT pins to GIC device with irq 128 to 136.
And, if a device irq is 128 to 136, its irq GPIO-OUTPUT pin is connected to
GICINT or-gates instead of GIC device.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
AST2700 interrupt controller(INTC) provides hardware interrupt interfaces
to interrupt of processors PSP, SSP and TSP. In INTC, each interrupt of
INT 128 to INT136 combines 32 interrupts.
Introduce a new aspeed_intc class with instance_init and realize handlers.
So far, this model only supports GICINT128 to GICINT136.
It creates 9 GICINT or-gates to connect 32 interrupts sources
from GICINT128 to GICINT136 as IRQ GPIO-OUTPUT pins.
Then, this model registers IRQ handler with its IRQ GPIO-INPUT pins which
connect to GICINT or-gates. And creates 9 GICINT IRQ GPIO-OUTPUT pins which
connect to GIC device with GIC IRQ 128 to 136.
If one interrupt source from GICINT128 to GICINT136
set irq, the OR-GATE irq callback function is called and set irq to INTC by
OR-GATE GPIO-OUTPUT pins. Then, the INTC irq callback function is called and
set irq to GIC by its GICINT IRQ GPIO-OUTPUT pins. Finally, the GIC irq
callback function is called and set irq to CPUs and
CPUs execute Interrupt Service Routine (ISR).
Block diagram of GICINT132:
GICINT132
ETH1 +-----------+
+-------->+0 3|
ETH2 | 4|
+-------->+1 5|
ETH3 | 6|
+-------->+2 19| INTC GIC
UART0 | 20| +--------------------------+
+-------->+7 21| | | +--------------+
UART1 | 22| |orgate0 +----> output_pin0+----------->+GIC128 |
+-------->+8 23| | | | |
UART2 | 24| |orgate1 +----> output_pin1+----------->+GIC129 |
+-------->+9 25| | | | |
UART3 | 26| |orgate2 +----> output_pin2+----------->+GIC130 |
+--------->10 27| | | | |
UART5 | 28| |orgate3 +----> output_pin3+----------->+GIC131 |
+-------->+11 29| | | | |
UART6 | +----------->+orgate4 +----> output_pin4+----------->+GIC132 |
+-------->+12 30| | | | |
UART7 | 31| |orgate5 +----> output_pin5+----------->+GIC133 |
+-------->+13 | | | | |
UART8 | OR[0:31] | |orgate6 +----> output_pin6+----------->+GIC134 |
---------->14 | | | | |
UART9 | | |orgate7 +----> output_pin7+----------->+GIC135 |
--------->+15 | | | | |
UART10 | | |orgate8 +----> output_pin8+----------->+GIC136 |
--------->+16 | | | +--------------+
UART11 | | +--------------------------+
+-------->+17 |
UART12 | |
+--------->18 |
| |
| |
| |
+-----------+
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[clg: Fixed class_size in TYPE_ASPEED_INTC definition ]
AST2700 have two SCU controllers which are SCU and SCUIO.
Both SCU and SCUIO registers are not compatible previous SOCs
, introduces new registers and adds ast2700 scu, sucio class init handler.
The pclk divider selection of SCUIO is defined in SCUIO280[20:18] and
the pclk divider selection of SCU is defined in SCU280[25:23].
Both of them are not compatible AST2600 SOCs, adds a get_apb_freq function
and trace-event for AST2700 SCU and SCUIO.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[clg: Fixed spelling : Unhandeled -> Unhandled ]
AST2700 fmc/spi controller's address decoding unit is 64KB
and only bits [31:16] are used for decoding. Introduce seg_to_reg
and reg_to_seg handlers for ast2700 fmc/spi controller.
In addition, adds ast2700 fmc, spi0, spi1, and spi2 class init handler.
AST2700 is a 64 bits quad core CPUs(Cortex-a35). Introduce a new
"aspeed_2700_smc_flash_ops" and set its valid "max_access_size"
8 for 64 bits data format access.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
It set "aspeed_smc_flash_ops" struct which containing
read and write callbacks to be used when I/O is performed
on the SMC flash region. And it set the valid max_access_size 4
by default for all ASPEED SMC models.
However, the valid max_access_size 4 only support 32 bits CPUs.
To support all ASPEED SMC model, introduce a new
"const MemoryRegionOps *" attribute in AspeedSMCClass and
use it in aspeed_smc_flash_realize function.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
AST2700 support the maximum dram size is 8GiB
and has a "DMA DRAM Side Address High Part(0x7C)"
register to support 64 bits dma dram address.
Add helper routines functions to compute the dma dram
address, new features and update trace-event
to support 64 bits dram address.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
DMA length is from 1 byte to 32MB for AST2600 and AST10x0
and DMA length is from 4 bytes to 32MB for AST2500.
In other words, if "R_DMA_LEN" is 0, it should move at least 1 byte
data for AST2600 and AST10x0 and 4 bytes data for AST2500.
To support all ASPEED SOCs, adds dma_start_length parameter to store
the start length, add helper routines function to compute the dma length
and update DMA_LENGTH mask to "1FFFFFF" to support dma 1 byte
length unit for AST2600 and AST1030.
Currently, only supports dma length 4 bytes aligned.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
The SDRAM memory controller(DRAMC) controls the access to external
DDR4 and DDR5 SDRAM and power up to DDR4 and DDR5 PHY.
The DRAM memory controller of AST2700 is not backward compatible
to previous chips such AST2600, AST2500 and AST2400.
Max memory is now 8GiB on the AST2700. Introduce new
aspeed_2700_sdmc and class with read/write operation and
reset handlers.
Define DRAMC necessary protected registers and
unprotected registers for AST2700 and increase
the register set to 0x1000.
Add unlocked property to change controller protected status.
Incrementing the version of vmstate to 2.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Fix coding style issues from checkpatch.pl
Test command:
scripts/checkpatch.pl --no-tree -f hw/misc/aspeed_sdmc.c
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
AST2700 SLI engine is designed to accelerate the
throughput between cross-die connections.
It have CPU_SLI at CPU die and IO_SLI at IO die.
Introduce dummy AST2700 SLI and SLIIO models.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
AST2700 wdt controller is similiar to AST2600's wdt, but
the AST2700 has 8 watchdogs, and they each have 0x80 of registers.
Introduce ast2700 object class and increase the number of regs(offset) of
ast2700 model.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The Aspeed SMC device model use to have a 'sdram_base' property. It
was removed by commit d177892d4a ("aspeed/smc: Remove unused
"sdram-base" property") because previous changes simplified the DMA
transaction model to use an offset in RAM and not the physical
address.
The AST2700 SoC has larger address space (64-bit) and a new register
DMA DRAM Side Address High Part (0x7C) is introduced to deal with the
high bits of the DMA address. To be able to compute the offset of the
DMA transaction, as done on the other SoCs, we will need to know where
the DRAM is mapped in the address space. Re-introduce a "dram-base"
property to hold this value.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Jamin Lin <jamin_lin@aspeedtech.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
virtio-grants-v8
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEE0E4zq6UfZ7oH0wrqiU+PSHDhrpAFAmZqEk4ACgkQiU+PSHDh
# rpBaBxAA1jTfkty2RWJ0LfU5ekxnEWSx63zVzDWESFOQRjp/rOk/FhHbqbHzXISk
# cbHjz2PX6mNSOiFoSOWsNP7Utg+7xPH34D+D/EH59bmrXYFHCXxYjIK/T8T2Jr2p
# /qx3x/qxGRXFq38WFHvLhdK/0obdOuF3M6W/Zz82z8ruo7uHBX4XuCsF2rWV0ydb
# mvfAh+iMwh1JQN/g/vHIf0h+2RQjGCfsez+xVnG4rSeE4UWn/4iaU5c6SJ80arwE
# mwlnDOysEXwIZuy0fi+RX8o4tUie8rcS19+rBoMskXCAJXQblV/Aqhq4qww2DtA+
# kjL7HTHZrccZOJME9dj5gIUHvjAa9wxDZ5luelNVGY+VNO1hWXfk8Rcl9rtvOmNZ
# FKwcj3HW0ggQQMlH5+QizFQhNM3iRoirzX3t9Vw3uNbmwyTjSHcN3qVBExeCLAaT
# +N6t+aBfCOL5ZVskFb6YYxvWe3gLSghFH4cN/l0VLngzuGFl4BUNny5aNaENQYbX
# OSwH3rsE45j6X4B0gtwBXWFC31WpA1wPBwKYwcPZNmKWl30oJsXUs9UrTMHu4H/Z
# NnpFTgGYBaPCqlhkdIVQkOTpY9q85LzxQ8A+uwBUK+4uZwnw9rPXf+If8kyX/5eL
# 1AlVfBAG9uSVT/+AqxW/49jQ6jHRQ9ZgL9y6H0N0Ql3nrQBMasI=
# =4mj9
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jun 2024 02:25:34 PM PDT
# gpg: using RSA key D04E33ABA51F67BA07D30AEA894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" [unknown]
# gpg: aka "Stefano Stabellini <sstabellini@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* tag 'virtio-grants-v8-tag' of https://gitlab.com/sstabellini/qemu:
hw/arm: xen: Enable use of grant mappings
xen: mapcache: Add support for grant mappings
xen: mapcache: Pass the ram_addr offset to xen_map_cache()
xen: mapcache: Unmap first entries in buckets
xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Send raw packets over if UADK hardware support is not available. This is to
satisfy Qemu qtest CI which may run on platforms that don't have UADK
hardware support. Subsequent patch will add support for uadk migration
qtest.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add --enable-uadk and --disable-uadk options to enable and disable
UADK compression accelerator. This is for using UADK based hardware
accelerators for live migration.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Document UADK(User Space Accelerator Development Kit) library details
and how to use that for migration.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
[s/Qemu/QEMU in docs]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add qpl to compression method test for multifd migration
the qpl compression supports software path and hardware
path(IAA device), and the hardware path is used first by
default. If the hardware path is unavailable, it will
automatically fallback to the software path for testing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
QPL compression and decompression will use IAA hardware path if the IAA
hardware is available. Otherwise the QPL library software path is used.
The hardware path will automatically fall back to QPL software path if
the IAA queues are busy. In some scenarios, this may happen frequently,
such as configuring 4 channels but only one IAA device is available. In
the case of insufficient IAA hardware resources, retry and fallback can
help optimize performance:
1. Retry + SW fallback:
total time: 14649 ms
downtime: 25 ms
throughput: 17666.57 mbps
pages-per-second: 1509647
2. No fallback, always wait for work queues to become available
total time: 18381 ms
downtime: 25 ms
throughput: 13698.65 mbps
pages-per-second: 859607
If both the hardware and software paths fail, the uncompressed page is
sent directly.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
during initialization, a software job is allocated to each channel
for software path fallabck when the IAA hardware is unavailable or
the hardware job submission fails. If the IAA hardware is available,
multiple hardware jobs are allocated for batch processing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add the Query Processing Library (QPL) compression method
Introduce the qpl as a new multifd migration compression method, it can
use In-Memory Analytics Accelerator(IAA) to accelerate compression and
decompression, which can not only reduce network bandwidth requirement
but also reduce host compression and decompression CPU overhead.
How to enable qpl compression during migration:
migrate_set_parameter multifd-compression qpl
There is no qpl compression level parameter added since it only supports
level one, users do not need to specify the qpl compression level.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed docs spacing in migration.json]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add --enable-qpl and --disable-qpl options to enable and disable
the QPL compression method for multifd migration.
The Query Processing Library (QPL) is an open-source library
that supports data compression and decompression features. It
is based on the deflate compression algorithm and use Intel
In-Memory Analytics Accelerator(IAA) hardware for compression
and decompression acceleration.
For more live migration with IAA, please refer to the document
docs/devel/migration/qpl-compression.rst
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Different compression methods may require different numbers of IOVs.
Based on streaming compression of zlib and zstd, all pages will be
compressed to a data block, so two IOVs are needed for packet header
and compressed data block.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Similar to other archs, build a custom bios memory updater. Running the
test with OF code is a cool trick, but SLOF takes a long time to boot.
This reduces test time by around 3x (150s to 50s).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
ppc64 with TCG seems to no longer be failing this test, perhaps since
commit 03bfc2188f ("physmem: Fix migration dirty bitmap coherency
with TCG memory access") which is not ppc specific but was seen to hit
ppc64 quite easily.
Let's enable it again.
The s390x problem has been identified so mention it while we are
adjusting the comment.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The spapr QEMU machine defaults is useful outside libqos, so create a
new header for ppc specific qtests and move it there.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
* Fix loongarch64 avocado test
* Make qtests more flexible with regards to non-available CPU models
* Improvements for the test-smp-parse unit test
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZpoEoRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVF6g/+JYTRKmaIduQIP9g2+NkM+qMTbjI9Ow47
# 8Vdj/ePMXNWOZsgMPkCUdisYeMZEC+XMcDN1xvwZXwLTMTJRacZCSFRpeN4P0m0W
# 6aQ28+tPNgx+B9Eh2kc4TpxbiqSH8u5u4GEN4Y07rcX/3YbYyjFgZD8orRu/nJ+H
# 0wV7Riq9csi1BkLxrgKaHocFSOl4eOga4OFi+u4wIn/xoW3MN0laxe4iuoQRMZPf
# gJLPRhEija4lto8iIKNxJbTABB0wEcWRWtgqcbHxdatqh1lPTPBpWxmdD/v1LJn+
# H/eO+oh05NQdlhw7+xfWF9PD+MpIePbZ28oNb3X3uURROTdcxpBAgpPipv07FsT4
# LmU2nIBQ4FcpDOkhLnLmBmFBNO6uDCzuGzxFRhX1SIiGMABqTDOKynBQSgQI2iB0
# 5J47XUwHtnOoCvf4SRA/MZG8zNSQZdJbnuOBLgZ+vsCG14mWM2NbfSUwRkH6pd/J
# fEbODuzHZoYgUTxjR9+WMbINAbNjMy+SP2sGZIBzcAIIkybKynOy58LoCyNT684U
# ean9bnc65908PJxEfsQ6k9kNwkK4GwOqZi+X383nVgMJ9+3dDw8M76IVU59hsq1n
# wnz4VgFcRdXMYhj9zghaCgH2Ezw8gZHILXH+RlX0Bav4LQ5vSZQ6tRNwM4+rfXBe
# okF1Sxmz31U=
# =s7+V
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jun 2024 06:19:06 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-06-12' of https://gitlab.com/thuth/qemu:
tests/tcg/s390x: Allow specifying extra QEMU options on the command line
tests/unit/test-smp-parse: Test the full 8-levels topology hierarchy
tests/unit/test-smp-parse: Test "modules" and "dies" combination case
tests/unit/test-smp-parse: Test "modules" parameter in -smp
tests/unit/test-smp-parse: Make test cases aware of module level
tests/unit/test-smp-parse: Use default parameters=0 when not set in -smp
tests/unit/test-smp-parse: Fix an invalid topology case
tests/unit/test-smp-parse: Fix comment of parameters=1 case
tests/unit/test-smp-parse: Fix comments of drawers and books case
test: Remove libibumad dependence
meson: Remove libibumad dependence
tests/qtest/x86: check for availability of older cpu models before running tests
tests/qtest/libqtest: add qtest_has_cpu_model() api
qtest/x86/numa-test: do not use the obsolete 'pentium' cpu
tests/avocado: Update LoongArch bios file
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The use case for this is `make check-tcg EXTFLAGS="-accel kvm"`,
which allows validating the system TCG testcases on real hardware.
EXTFLAGS name is borrowed from tests/tcg/xtensa/Makefile.softmmu-target.
While at it, use += instead of = in order to be consistent with the
other architectures.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240522184116.35975-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
RDMA based migration has no dependence on libumad. libibverbs and
librdmacm are enough.
libumad was used by rdmacm-mux which has been already removed. It's
remained mistakenly.
Fixes: 1dfd42c426 ("hw/rdma: Remove deprecated pvrdma device and rdmacm-mux helper")
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240611105427.61395-2-pizhenwei@bytedance.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It is better to check if some older cpu models like 486, athlon, pentium,
penryn, phenom, core2duo etc are available before running their corresponding
tests. Some downstream distributions may no longer support these older cpu
models.
Signature of add_feature_test() has been modified to return void as
FeatureTestArgs* was not used by the caller.
One minor correction. Replaced 'phenom' with '486' in the test
'x86/cpuid/auto-level/phenom/arat' matching the cpu used.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-4-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Added a new test api qtest_has_cpu_model() in order to check availability of
some cpu models in the current QEMU binary. The specific architecture of the
QEMU binary is selected using the QTEST_QEMU_BINARY environment variable.
This api would be useful to run tests against some older cpu models after
checking if QEMU actually supported these models.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-3-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Just like X86_ENTRYr, X86_ENTRYwr is easily changed to use only T0.
In this case, the motivation is to use it for the MOV instruction
family. The case when you need to preserve the input value is the
odd one, as it is used basically only for BLS* instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I am not sure why I made it use T1. It is a bit more symmetric with
respect to X86_ENTRYwr (which uses T0 for the "w"ritten operand
and T1 for the "r"ead operand), but it is also less flexible because it
does not let you apply zextT0/sextT0.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes for easier cpu_cc_* setup, and not using set_cc_op()
should come in handy if QEMU ever implements APX.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid using set_cc_op() in preparation for implementing APX; treat
CC_OP_EFLAGS similar to the case where we have the "opposite" cc_op
(CC_OP_ADOX for ADCX and CC_OP_ADCX for ADOX), except the resulting
cc_op is not CC_OP_ADCOX. This is written easily as two "if"s, whose
conditions are both false for CC_OP_EFLAGS, both true for CC_OP_ADCOX,
and one each true for CC_OP_ADCX/ADOX.
The new logic also makes it easy to drop usage of tmp0.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUX86State argument would only be used to fetch bytes, but that has to be
done before the generator function is called. So remove it, and all
temptation together with it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].
When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on the multicore Host) between the Guest core
topology information in this field and the Guest's actual cores number.
Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.
Fixes: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240611032314.64076-1-xuchuangxclwt@bytedance.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For VMs configured with the USB CDROM device:
-drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
-device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
QEMU process may crash after live migration, to reproduce the issue,
configure VM (Guest OS ubuntu 20.04 or 21.10) with the following XML:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/path/to/share_fs/cdrom.iso'/>
<target dev='sda' bus='usb'/>
<readonly/>
<address type='usb' bus='0' port='2'/>
</disk>
<controller type='usb' index='0' model='piix3-uhci'/>
Do the live migration repeatedly, crash may happen after live migratoin,
trace log at the source before live migration is as follows:
324808@1711972823.521945:usb_uhci_frame_start nr 319
324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x0, token 0xffe07f69
324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180, ctrl 0x19800000, token 0x3c903e1
324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td 0x75ac5180
324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state undef -> setup
324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags 0x00000080, len 10, data-len 8
324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138 command 74 dir 1 length 8
324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138 command 74 lba 4096
324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state setup -> complete
324808@1711972823.522209:usb_uhci_packet_complete_success token 0x103e1, td 0x75ac5180
324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td 0x75ac5180
324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td 0x75ac5180
trace log at the destination after live migration is as follows:
3286206@1711972823.951646:usb_uhci_frame_start nr 320
3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x1000000, token 0xffe07f69
3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td 0x75ac5240, ctrl 0x39800000, token 0xe08369
3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td 0x75ac5240
3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1, packet 0x56066b2fb5a0, state undef -> setup
3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
2024-04-01 12:00:24.665+0000: shutting down, reason=crashed
The backtrace reveals the following:
Program terminated with signal SIGSEGV, Segmentation fault.
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
312 movq -8(%rsi,%rdx), %rcx
[Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
(gdb) bt
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
1 memcpy (__len=8, __src=<optimized out>, __dest=<optimized out>) at /usr/include/bits/string_fortified.h:34
2 iov_from_buf_full (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
3 iov_from_buf (bytes=8, buf=<optimized out>, offset=<optimized out>, iov_cnt=<optimized out>, iov=<optimized out>)
at /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
4 usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=<optimized out>, bytes=bytes@entry=8) at ../hw/usb/core.c:636
5 usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:186
6 usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:496
7 usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/core.c:455
8 uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, td_addr=<optimized out>,
int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
9 uhci_process_frame (s=s@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1061
10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1159
11 timerlist_run_timers (timer_list=0x56066af26bd0) at ../util/qemu-timer.c:642
12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at ../util/qemu-timer.c:656
13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
14 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:542
15 qemu_main_loop () at ../softmmu/runstate.c:739
16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:52
(gdb) frame 5
(gdb) p ((SCSIDiskReq *)s->req)->iov
$1 = {iov_base = 0x0, iov_len = 0}
(gdb) p/x s->req->tag
$2 = 0x472
When designing the USB mass storage device model, QEMU places SCSI disk
device as the backend of USB mass storage device. In addition, USB mass
device driver in Guest OS conforms to the "Universal Serial Bus Mass
Storage Class Bulk-Only Transport" specification in order to simulate
the transform behavior between a USB controller and a USB mass device.
The following shows the protocol hierarchy:
+----------------+
CDROM driver | scsi command | CDROM
+----------------+
+-----------------------+
USB mass | USB Mass Storage Class| USB mass
storage driver | Bulk-Only Transport | storage device
+-----------------------+
+----------------+
USB Controller | USB Protocol | USB device
+----------------+
In the USB protocol layer, between the USB controller and USB device, at
least two USB packets will be transformed when guest OS send a
read operation to USB mass storage device:
1. The CBW packet, which will be delivered to the USB device's Bulk-Out
endpoint. In order to simulate a read operation, the USB mass storage
device parses the CBW and converts it to a SCSI command, which would be
executed by CDROM(represented as SCSI disk in QEMU internally), and store
the result data of the SCSI command in a buffer.
2. The DATA-IN packet, which will be delivered from the USB device's
Bulk-In endpoint(fetched directly from the preceding buffer) to the USB
controller.
We consider UHCI to be the controller. The two packets mentioned above may
have been processed by UHCI in two separate frame entries of the Frame List
, and also described by two different TDs. Unlike the physical environment,
a virtualized environment requires the QEMU to make sure that the result
data of CBW is not lost and is delivered to the UHCI controller.
Currently, these types of SCSI requests are not migrated, so QEMU cannot
ensure the result data of the IO operation is not lost if there are
inflight emulated SCSI requests during the live migration.
Assume for the moment that the USB mass storage device is processing the
CBW and storing the result data of the read operation to a buffre, live
migration happens and moves the VM to the destination while not migrating
the result data of the read operation.
After migration, when UHCI at the destination issues a DATA-IN request to
the USB mass storage device, a crash happens because USB mass storage device
fetches the result data and get nothing.
The scenario this patch addresses is this one.
Theoretically, any device that uses the SCSI disk as a back-end would be
affected by this issue. In this case, it is the USB CDROM.
To fix it, inflight emulated SCSI request be migrated during live migration,
similar to the DMA SCSI request.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Message-ID: <878c8f093f3fc2f584b5c31cb2490d9f6a12131a.1716531409.git.yong.huang@smartx.com>
[Do not bump migration version, introduce compat property instead. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ciphers are pre-allocated by qcrypto_block_init_cipher() depending on
the given number of threads. The -device
virtio-blk-pci,iothread-vq-mapping= feature allows users to assign
multiple IOThreads to a virtio-blk device, but the association between
the virtio-blk device and the block driver happens after the block
driver is already open.
When the number of threads given to qcrypto_block_init_cipher() is
smaller than the actual number of threads at runtime, the
block->n_free_ciphers > 0 assertion in qcrypto_block_pop_cipher() can
fail.
Get rid of qcrypto_block_init_cipher() n_thread's argument and allocate
ciphers on demand.
Reported-by: Qing Wang <qinwang@redhat.com>
Buglink: https://issues.redhat.com/browse/RHEL-36159
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Libaio defines IO_CMD_FDSYNC command to sync all outstanding
asynchronous I/O operations, by flushing out file data to the
disk storage. Enable linux-aio to submit such aio request.
When using aio=native without fdsync() support, QEMU creates
pthreads, and destroying these pthreads results in TLB flushes.
In a real-time guest environment, TLB flushes cause a latency
spike. This patch helps to avoid such spikes.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-ID: <20240425070412.37248-1-ppandit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
rather than the uint32_t for which the maximum is slightly more than 4
seconds and larger values would overflow. The QAPI interface allows
specifying the number of seconds, so only values 0 to 4 are safe right
now, other values lead to a much lower timeout than a user expects.
The block_copy() call where this is used already takes a uint64_t for
the timeout, so no change required there.
Fixes: 6db7fd1ca9 ("block/copy-before-write: implement cbw-timeout option")
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240429141934.442154-1-f.ebner@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The main loop has two AioContexts: qemu_aio_context and iohandler_ctx.
The main loop runs them both, but nested aio_poll() calls on
qemu_aio_context exclude iohandler_ctx.
Which one should qemu_get_current_aio_context() return when called from
the main loop? Document that it's always qemu_aio_context.
This has subtle effects on functions that use
qemu_get_current_aio_context(). For example, aio_co_reschedule_self()
does not work when moving from iohandler_ctx to qemu_aio_context because
qemu_get_current_aio_context() does not differentiate these two
AioContexts.
Document this in order to reduce the chance of future bugs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 1f25c172f8 ("monitor: use aio_co_reschedule_self()") was a code
cleanup that uses aio_co_reschedule_self() instead of open coding
coroutine rescheduling.
Bug RHEL-34618 was reported and Kevin Wolf <kwolf@redhat.com> identified
the root cause. I missed that aio_co_reschedule_self() ->
qemu_get_current_aio_context() only knows about
qemu_aio_context/IOThread AioContexts and not about iohandler_ctx. It
does not function correctly when going back from the iohandler_ctx to
qemu_aio_context.
Go back to open coding the AioContext transitions to avoid this bug.
This reverts commit 1f25c172f8.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-34618
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bsd-user: Baby Steps towards eliminating qemu_host_page_size, et al
First baby-steps towards eliminating qemu_host_page_size: tackle the reserve_va
calculation (which is easier to copy from linux-user than to fix).
# -----BEGIN PGP SIGNATURE-----
# Comment: GPGTools - https://gpgtools.org
#
# iQIzBAABCgAdFiEEIDX4lLAKo898zeG3bBzRKH2wEQAFAmZl3pgACgkQbBzRKH2w
# EQBfpg//U4YdJAA0H4okwPtowP1wIK1gpWvVd5FIN17pCXLKT4FR4efhWeEnQh8U
# +dXvkCpX/MnhBkStYoGZBmYe1rNKkEAn8BPCsQqX4y3af5RzKyKWo0gZXOjN3L9e
# ixmeFcg/7BTwnSbcO02xd9BOPPaRiFBDSidh28gr/1sxpXRxlbQHzIUpTBncDaN6
# 4w5DnF+b1RFHCz05ytrP517cj7E32Ig9S/cVMmBd1pGJiLnHiOp/peMprCL6tnI+
# YNBzttCbRPNH2z0zVd9En/hDnVirGPYX+LXg0Djkw3I+stJj4jwbJTuDG+5Lzghp
# YrYfiU6x7OG9ywjFJgY1/pExVT1cwkNjuGCXL+F4R49R5LfIEHq5/MlQp+tjpYYO
# g5WmpiLnFpFosmXIPJmxr16zqm2sLD+P0Jr/kdIz58fTWmIQeKwi/Vu/73h4kxST
# vjBbhC3eg56lQDaospc4h8+RehmI6LdSWYx0kxv2JKpXH3lQPqsDSrOcm9hEbWYS
# DeV++vkyQcXrbCnwomfxG1U+dVYBlJ1L1wClxc/1WD9KxXXJIwlvGmIu3o3c2+xj
# BM6eRe3evWioqdqhc2lY+XxATwbIUxiect6ml+F6E0KJxlm3Ajqy6qw49G+uhZxa
# XWUEIYGDd6/xHMlBeo6FKUpe/Ez/i3eCFXr4AD4iO7AtTuukrO4=
# =3EaH
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 09 Jun 2024 09:55:52 AM PDT
# gpg: using RSA key 2035F894B00AA3CF7CCDE1B76C1CD1287DB01100
# gpg: Good signature from "Warner Losh <wlosh@netflix.com>" [unknown]
# gpg: aka "Warner Losh <imp@bsdimp.com>" [unknown]
# gpg: aka "Warner Losh <imp@freebsd.org>" [unknown]
# gpg: aka "Warner Losh <imp@village.org>" [unknown]
# gpg: aka "Warner Losh <wlosh@bsdimp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2035 F894 B00A A3CF 7CCD E1B7 6C1C D128 7DB0 1100
* tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu:
bsd-user: Catch up to run-time reserved_va math
bsd-user: port linux-user:ff8a8bbc2ad1 for variable page sizes
linux-user: Adjust comment to reflect the code.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a second mapcache for grant mappings. The mapcache for
grants needs to work with XC_PAGE_SIZE granularity since
we can't map larger ranges than what has been granted to us.
Like with foreign mappings (xen_memory), machines using grants
are expected to initialize the xen_grants MR and map it
into their address-map accordingly.
CC: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Pass the ram_addr offset to xen_map_cache.
This is in preparation for adding grant mappings that need
to compute the address within the RAMBlock.
No functional changes.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When invalidating memory ranges, if we happen to hit the first
entry in a bucket we were never unmapping it. This was harmless
for foreign mappings but now that we're looking to reuse the
mapcache for transient grant mappings, we must unmap entries
when invalidated.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Catch up to linux-user's 8f67b9c694, 13c1339755, 2f7828b572, and
95059f9c31 by Richard Henderson which made reserved_va a run-time
calculation, defaulting to nothing except in the case of 64-bit host
32-bit target. Also include the adjustment of the comment heading that
work submitted in the same patch stream. Since this is a direct copy,
squash it into one patch rather than follow the Linux evolution since
breaking this down further at this point doesn't make sense for this
"new code".
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Bring in Richard Henderson's ff8a8bbc2a to finalize the page size to
allow TARGET_PAGE_BITS_VARY. bsd-user's "blitz" fork has aarch64
support, which is now variable page size. Add support for it here, even
though it's effectively a nop in upstream qemu.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
If the user didn't specify reserved_va, there's an else for 64-bit host
32-bit (or fewer) target to reserve 32-bits of address space. Update the
comments to reflect this, and rejustify comment to 80 columns.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
gen_inst_init_args() is called for instructions using a predicate as an
rvalue. Upon first call, the list of arguments which might need
initialization init_list is freed to indicate that they have been
processed. For instructions without an rvalue predicate,
gen_inst_init_args() isn't called and init_list will never be freed.
Free init_list from free_instruction() if it hasn't already been freed.
A comment in free_instruction is also updated.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-4-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
Before switching to GArray/g_string_printf we used fixed size arrays for
output buffers and instructions arguments among other things.
Macros defining the sizes of these buffers were left behind, remove
them.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-2-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
At 09a7e7db0f (Hexagon (target/hexagon) Remove uses of
op_regs_generated.h.inc, 2024-03-06), we've changed the logic of
check_new_value() to use the new pre-calculated
packet->insn[...].dest_idx instead of calculating the index on the fly
using opcode_reginfo[...]. The dest_idx index is calculated roughly like
the following:
for reg in iset[tag]["syntax"]:
if reg.is_written():
dest_idx = regno
break
Thus, we take the first register that is writtable. Before that,
however, we also used to follow an alphabetical order on the register
type: 'd', 'e', 'x', and 'y'. No longer following that makes us select
the wrong register index and the HVX store new instruction does not
update the memory like expected.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Message-Id: <f548dc1c240819c724245e887f29f918441e9125.1716220379.git.quic_mathbern@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
* scsi-disk: Don't silently truncate serial number
* backends/hostmem: Report error on unavailable qemu_madvise() features or unaligned memory sizes
* target/i386: fixes and documentation for INHIBIT_IRQ/TF/RF and debugging
* i386/hvf: Adds support for INVTSC cpuid bit
* i386/hvf: Fixes for dirty memory tracking
* i386/hvf: Use hv_vcpu_interrupt() and hv_vcpu_run_until()
* hvf: Cleanups
* stubs: fixes for --disable-system build
* i386/kvm: support for FRED
* i386/kvm: fix MCE handling on AMD hosts
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZkF2oUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPNlQf+N9y6Eh0nMEEQ69twtV8ytglTY+uX
# FsogvnsXHNMVubOWmmeItM6kFXTAkR9cmFaL8dqI1Gs03xEQdQXbF1KejJZOAZVl
# RQMOW8Fg2Afr+0lwqCXHvhsmZ4hr5yUkRndyucA/E9AO2uGrtgwsWGDBGaHJOZIA
# lAsEMOZgKjXHZnefXjhMrvpk/QNovjEV6f1RHX3oKZjKSI5/G4IqGSmwNYToot8p
# 2fgs4Qti4+1gNyM2oBLq7cCMjMS61tSxOMH4uqVoIisjyckPlAFRvc+DXtKsUAAs
# 9AgM++pNgpB0IXv67czRUNdRoK7OI8I0ULhI4qHXi6Yg2QYAHqpQ6WL4Lg==
# =RP7U
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jun 2024 01:33:46 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (42 commits)
python: mkvenv: remove ensure command
Revert "python: use vendored tomli"
i386: Add support for overflow recovery
i386: Add support for SUCCOR feature
i386: Fix MCE support for AMD hosts
docs: i386: pc: Avoid mentioning limit of maximum vCPUs
target/i386: Add get/set/migrate support for FRED MSRs
target/i386: enumerate VMX nested-exception support
vmxcap: add support for VMX FRED controls
target/i386: mark CR4.FRED not reserved
target/i386: add support for FRED in CPUID enumeration
hvf: Makes assert_hvf_ok report failed expression
i386/hvf: Updates API usage to use modern vCPU run function
i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit
i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change
hvf: Consistent types for vCPU handles
i386/hvf: Fixes some compilation warnings
i386/hvf: Adds support for INVTSC cpuid bit
stubs/meson: Fix qemuutil build when --disable-system
scsi-disk: Don't silently truncate serial number
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This was used to bootstrap the venv with a TOML parser, after which
ensuregroup is used. Now that we expect it to be present as a system
package (either tomli or, for Python 3.11, tomllib), it is not needed
anymore.
Note that this means that, when implemented, the hypothetical "isolated"
mode that does not use any system packages will only work with Python
3.11+.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that Ubuntu 20.04 is not included anymore, there is no need to ship
it as part of QEMU; Ubuntu 22.04 includes it and Leap users anyway
need to install all the required dependencies from PyPI.
This mostly reverts commit ec77ee7634de123b7c899739711000fd21dab68b,
with just some changes to the wording.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add cpuid bit definition for overflow recovery. This is needed in the case
where a deferred error has been sent to the guest, a guest process accesses the
poisoned memory, but the machine_check_poll function has not yet handled the
original deferred error. If overflow recovery is not set in this case, when we
handle the uncorrected error from the poisoned memory access, the overflow bit
will be set and will result in the guest being shut down.
By the time the MCE reaches the guest, the overflow has been handled
by the host and has not caused a shutdown, so include the bit unconditionally.
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-4-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.
----
v2:
- Add "succor" feature word.
- Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-3-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the most part, AMD hosts can use the same MCE injection code as Intel, but
there are instances where the qemu implementation is Intel specific. First, MCE
delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.
Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-2-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Different versions of PC machine support different maximum vCPUs, and
even different features have limits on the maximum number of vCPUs (
For example, if x2apic is not enabled in the TCG case, the maximum of
255 vCPUs are supported).
It is difficult to list the maximum vCPUs under all restrictions. Thus,
to avoid confusion, avoid mentioning specific maximum vCPU number
limitations here.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240606085436.2028900-1-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
FRED CPU states are managed in 9 new FRED MSRs, in addtion to a few
existing CPU registers and MSRs, e.g., CR4.FRED and MSR_IA32_PL0_SSP.
Save/restore/migrate FRED MSRs if FRED is exposed to the guest.
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-7-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
FRED, i.e., the Intel flexible return and event delivery architecture,
defines simple new transitions that change privilege level (ring
transitions).
The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
In addition to these transitions, the FRED architecture defines a new
instruction (LKGS) for managing the state of the GS segment register.
The LKGS instruction can be used by 64-bit operating systems that do
not use the new FRED transitions.
WRMSRNS is an instruction that behaves exactly like WRMSR, with the
only difference being that it is not a serializing instruction by
default. Under certain conditions, WRMSRNS may replace WRMSR to improve
performance. FRED uses it to switch RSP0 in a faster manner.
Search for the latest FRED spec in most search engines with this search
pattern:
site:intel.com FRED (flexible return and event delivery) specification
The CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[17] enumerates FRED, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[18] enumerates LKGS, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[19] enumerates WRMSRNS.
Add CPUID definitions for FRED/LKGS/WRMSRNS, and expose them to KVM guests.
Because FRED relies on LKGS and WRMSRNS, add that to feature dependency
map.
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-2-xin3.li@intel.com>
[Fix order of dependencies, add dependencies from LM to FRED. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a macOS Hypervisor.framework call fails which is checked by
assert_hvf_ok(), Qemu exits printing the error value, but not the
location
in the code, as regular assert() macro expansions would.
This change turns assert_hvf_ok() into a macro similar to other
assertions, which expands to a call to the corresponding _impl()
function together with information about the expression that failed
the assertion and its location in the code.
Additionally, stringifying the numeric hv_return_t code is factored
into a helper function that can be reused for diagnostics and debugging
outside of assertions.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-8-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
macOS 10.15 introduced the more efficient hv_vcpu_run_until() function
to supersede hv_vcpu_run(). According to the documentation, there is no
longer any reason to use the latter on modern host OS versions, especially
after 11.0 added support for an indefinite deadline.
Observed behaviour of the newer function is that as documented, it exits
much less frequently - and most of the original function’s exits seem to
have been effectively pointless.
Another reason to use the new function is that it is a prerequisite for
using newer features such as in-kernel APIC support. (Not covered by
this patch.)
This change implements the upgrade by selecting one of three code paths
at compile time: two static code paths for the new and old functions
respectively, when building for targets where the new function is either
not available, or where the built executable won’t run on older
platforms lacking the new function anyway. The third code path selects
dynamically based on runtime detected availability of the weakly-linked
symbol.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-7-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When interrupting a vCPU thread, this patch actually tells the hypervisor to
stop running guest code on that vCPU.
Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
hv_vcpus_exit on aarch64. Alternatively, if the vCPU thread
is not
running the VM, it will immediately cause an exit when it attempts
to do so.
Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
frequently, including many spurious exits, which made it less of a problem that
nothing was actively done to stop the vCPU thread running guest code.
The newer, more efficient hv_vcpu_run_until exits much more rarely, so a true
"kick" is needed before switching to that.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-6-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When using x86 macOS Hypervisor.framework as accelerator, detection of
dirty memory regions is implemented by marking logged memory region
slots as read-only in the EPT, then setting the dirty flag when a
guest write causes a fault. The area marked dirty should then be marked
writable in order for subsequent writes to succeed without a VM exit.
However, dirty bits are tracked on a per-page basis, whereas the fault
handler was marking the whole logged memory region as writable. This
change fixes the fault handler so only the protection of the single
faulting page is marked as dirty.
(Note: the dirty page tracking appeared to work despite this error
because HVF’s hv_vcpu_run() function generated unnecessary EPT fault
exits, which ended up causing the dirty marking handler to run even
when the memory region had been marked RW. When using
hv_vcpu_run_until(), a change planned for a subsequent commit, these
spurious exits no longer occur, so dirty memory tracking malfunctions.)
Additionally, the dirty page is set to permit code execution, the same
as all other guest memory; changing memory protection from RX to RW not
RWX appears to have been an oversight.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-5-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
macOS Hypervisor.framework uses different types for identifying vCPUs, hv_vcpu_t or hv_vcpuid_t, depending on host architecture. They are not just differently named typedefs for the same primitive type, but reference different-width integers.
Instead of using an integer type and casting where necessary, this change introduces a typedef which resolves the active architecture’s hvf typedef. It also removes a now-unnecessary cast.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-4-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A bunch of function definitions used empty parentheses instead of (void) syntax, yielding the following warning when building with clang on macOS:
warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
In addition to fixing these function headers, it also fixes what appears to be a typo causing a variable to be unused after initialisation.
warning: variable 'entry_ctls' set but not used [-Wunused-but-set-variable]
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-3-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds the INVTSC bit to the Hypervisor.framework accelerator's
CPUID bit passthrough allow-list. Previously, specifying +invtsc in the CPU
configuration would fail with the following warning despite the host CPU
advertising the feature:
qemu-system-x86_64: warning: host doesn't support requested feature:
CPUID.80000007H:EDX.invtsc [bit 8]
x86 macOS itself relies on a fixed rate TSC for its own Mach absolute time
timestamp mechanism, so there's no reason we can't enable this bit for guests.
When the feature is enabled, a migration blocker is installed.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-2-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compiling without system, user, tools or guest-agent fails with the
following error message:
./configure --disable-system --disable-user --disable-tools \
--disable-guest-agent
error message:
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function `error_printf':
/media/liuzhao/data/qemu-cook/build/../util/error-report.c:38: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function `vreport':
/media/liuzhao/data/qemu-cook/build/../util/error-report.c:215: undefined reference to `error_vprintf'
collect2: error: ld returned 1 exit status
This is because tests/bench and tests/unit both need qemuutil, which
requires error_vprintf stub when system is disabled.
Add error_vprintf stub into stub_ss for all cases other than disabling
system.
Fixes: 3a15604900 ("stubs: include stubs only if needed")
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240605152549.1795762-1-zhao1.liu@intel.com>
[Include error-printf.c unconditionally. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this commit, scsi-disk accepts a string of arbitrary length for
its "serial" property. However, the value visible on the guest is
actually truncated to 36 characters. This limitation doesn't come from
the SCSI specification, it is an arbitrary limit that was initially
picked as 20 and later bumped to 36 by commit 48b62063.
Similarly, device_id was introduced as a copy of the serial number,
limited to 20 characters, but commit 48b62063 forgot to actually bump
it.
As long as we silently truncate the given string, extending the limit is
actually not a harmless change, but break the guest ABI. This is the
most important reason why commit 48b62063 was really wrong (and it's
also why we can't change device_id to be in sync with the serial number
again and use 36 characters now, it would be another guest ABI
breakage).
In order to avoid future breakage, don't silently truncate the serial
number string any more, but just error out if it would be truncated.
Buglink: https://issues.redhat.com/browse/RHEL-3542
Suggested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240604161755.63448-1-kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No semantic change, just simpler control flow.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Detect early unsupported MADV_MERGEABLE and MADV_DONTDUMP, and print a clearer
error message that points to the deficiency of the host.
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If memory-backend-{file,ram} has a size that's not aligned to
underlying page size it is not only wasteful, but also may lead
to hard to debug behaviour. For instance, in case
memory-backend-file and hugepages, madvise() and mbind() fail.
Rightfully so, page is the smallest unit they can work with. And
even though an error is reported, the root cause it not very
clear:
qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument
After this commit:
qemu-system-x86_64: backend 'memory-backend-file' memory size must be multiple of 2 MiB
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Message-ID: <b5b9f9c6bba07879fb43f3c6f496c69867ae3716.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The unspoken premise of qemu_madvise() is that errno is set on
error. And it is mostly the case except for posix_madvise() which
is documented to return either zero (on success) or a positive
error number. This means, we must set errno ourselves. And while
at it, make the function return a negative value on error, just
like other error paths do.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <af17113e7c1f2cc909ffd36d23f5a411b63b8764.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise, starting any guest on a non-Linux guests results in
qemu-system-arm: Couldn't set property 'merge' on 'memory-backend-ram': Invalid argument
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The calculation of FrameTemp is done using the size indicated by mo_pushpop()
before being written back to EBP, but the final writeback to EBP is done using
the size indicated by mo_stacksize().
In the case where mo_pushpop() is MO_32 and mo_stacksize() is MO_16 then the
final writeback to EBP is done using MO_16 which can leave junk in the top
16-bits of EBP after executing ENTER.
Change the writeback of EBP to use the same size indicated by mo_pushpop() to
ensure that the full value is written back.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-5-mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When OS/2 Warp configures its segment descriptors, many of them are configured with
the P flag clear to allow for a fault-on-demand implementation. In the case where
the stack value is POPped into the segment registers, the SP is incremented before
calling gen_helper_load_seg() to validate the segment descriptor:
IN:
0xffef2c0c: 66 07 popl %es
OP:
ld_i32 loc9,env,$0xfffffffffffffff8
sub_i32 loc9,loc9,$0x1
brcond_i32 loc9,$0x0,lt,$L0
st16_i32 loc9,env,$0xfffffffffffffff8
st8_i32 $0x1,env,$0xfffffffffffffffc
---- 0000000000000c0c 0000000000000000
ext16u_i64 loc0,rsp
add_i64 loc0,loc0,ss_base
ext32u_i64 loc0,loc0
qemu_ld_a64_i64 loc0,loc0,noat+un+leul,5
add_i64 loc3,rsp,$0x4
deposit_i64 rsp,rsp,loc3,$0x0,$0x10
extrl_i64_i32 loc5,loc0
call load_seg,$0x0,$0,env,$0x0,loc5
add_i64 rip,rip,$0x2
ext16u_i64 rip,rip
exit_tb $0x0
set_label $L0
exit_tb $0x7fff58000043
If helper_load_seg() generates a fault when validating the segment descriptor then as
the SP has already been incremented, the topmost word of the stack is overwritten by
the arguments pushed onto the stack by the CPU before taking the fault handler. As a
consequence things rapidly go wrong upon return from the fault handler due to the
corrupted stack.
Update the logic for the existing writeback condition so that a POP into the segment
registers also calls helper_load_seg() first before incrementing the SP, so that if a
fault occurs the SP remains unaltered.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-4-mark.cave-ayland@ilande.co.uk>
Fixes: cc1d28bdbe ("target/i386: move 00-5F opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DISAS_NORETURN suppresses the work normally done by gen_eob(), and therefore
must be used in special cases only. Document them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HLT uses DISAS_NORETURN because the corresponding helper calls
cpu_loop_exit(). However, while gen_eob() clears HF_RF_MASK and
synthesizes a #DB exception if single-step is active, none of this is
done by HLT. Note that the single-step trap is generated after the halt
is finished.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PAUSE uses DISAS_NORETURN because the corresponding helper
calls cpu_loop_exit(). However, while HLT clear HF_INHIBIT_IRQ_MASK
to correctly handle "STI; HLT", the same is missing from PAUSE.
And also gen_eob() clears HF_RF_MASK and synthesizes a #DB exception
if single-step is active; none of this is done by HLT and PAUSE.
Start fixing PAUSE, HLT will follow.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
From vm entry to exit, VMRUN is handled as a single instruction. It
uses DISAS_NORETURN in order to avoid processing TF or RF before
the first instruction executes in the guest. However, the corresponding
handling is missing in vmexit. Add it, and at the same time reorganize
the comments with quotes from the manual about the tasks performed
by a #VMEXIT.
Another gen_eob() task that is missing in VMRUN is preparing the
HF_INHIBIT_IRQ flag for the next instruction, in this case by loading
it from the VMCB control state.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the required DR7 (either from the VMCB or from the host save
area) disables a breakpoint that was enabled prior to vmentry
or vmexit, it is left enabled and will trigger EXCP_DEBUG.
This causes a spurious #DB on the next crossing of the breakpoint.
To disable it, vmentry/vmexit must use cpu_x86_update_dr7
to load DR7.
Because cpu_x86_update_dr7 takes a 32-bit argument, check
reserved bits prior to calling cpu_x86_update_dr7, and do the
same for DR6 as well for consistency.
This scenario is tested by the "host_rflags" test in kvm-unit-tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DR7.GD triggers a #DB exception on any access to debug registers.
The GD bit is cleared so that the #DB handler itself can access
the debug registers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use decode.c's support for intercepts, doing the check in TCG-generated
code rather than the helper. This is cleaner because it allows removing
the eip_addend argument to helper_pause(), even though it adds a bit of
bloat for opcode 0x90's new decoding function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use decode.c's support for intercepts, doing the check in TCG-generated
code rather than the helper. This is cleaner because it allows removing
the eip_addend argument to helper_hlt().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ICEBP generates a trap-like exception, while gen_exception() produces
a fault. Resurrect gen_update_eip_next() to implement the desired
semantics.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When preparing an exception stack frame for a fault exception, the value
pushed for RF is 1. Take that into account. The same should be true
of interrupts for repeated string instructions, but the situation there
is complicated.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pull-loongarch-20240606
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZmE0HwAKCRBAov/yOSY+
# 396sA/90m/zr91pLQlkhFuYLHg958Ow3L5ysblcuAAmcTXGi8iE9IeTTeZru6WEO
# H/CL/njUkIgP+/Tio0n0Lx6rWkxOzGxWCpvzqrabsPGvs4GUtFEjI/2pvEWP6C9/
# S6Jon3py0oZeoVx8D6Tr/CJrhD0IBptbEn1aiQNDRuSzeuCo1Q==
# =xpjH
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 08:59:27 PM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240606' of https://gitlab.com/gaosong/qemu:
target/loongarch: fix a wrong print in cpu dump
hw/loongarch/virt: Enable extioi virt extension
hw/loongarch/virt: Use MemTxAttrs interface for misc ops
hw/intc/loongarch_extioi: Add extioi virt extension definition
tests/qtest: Add numa test for loongarch system
tests/libqos: Add loongarch virt machine node
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
testing cleanups (ci, vm, lcitool, ansible):
- clean up left over Centos 8 references
- use -fno-sanitize=function to avoid non-useful errors
- bump lcitool and update images (alpine, fedora)
- make sure we have mingw-w64-tools for windows builds
- drive ansible scripts with lcitool package lists
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZhgb4ACgkQ+9DbCVqe
# KkQNMAf/eyGgbU6ASgbwGqiJCOrkWo8CM7G1dXZ5GpVvKVnlDioMaefFCWt3ftB6
# kKtiskC1xVx3vM1mmomosSGxTNxT93HMLulKJLXK8/SvOFU9phzzUeZXTqS7JKNb
# NrawL0vkygRn+mmTgr3M+Z7rh4yI9e7e2yeX+oQiXsSGGNM114EdcUqKG82tpO8G
# cDlNtlp2jgptNnmzQ7ufZIRD5ckg+2os6XFB+bhmfaWCu9PXTTNwBfLJkEnXENi3
# NpRHPRGsUwZ3QcGt0JLkxTT/yeHngYBP7us2cwMXuf9lxfFCF3Ixt1jF/9ZJRrRm
# wUYG6TBKh6S24cJSBiu1zguQ/EE7WA==
# =myWO
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 06 Jun 2024 02:30:38 AM PDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-june24-060624-1' of https://gitlab.com/stsquad/qemu:
scripts/ci: drive ubuntu/build-environment.yml from lcitool
tests/lcitool: generate package lists for ansible
tests/lcitool: Install mingw-w64-tools for the Windows cross-builds
tests/lcitool: Bump to latest libvirt-ci and update Fedora and Alpine version
.gitlab-ci.d/buildtest.yml: Use -fno-sanitize=function in the clang-system job
tests/lcitool: Delete obsolete centos-stream-8.yml file
docs/ci: clean-up references for consistency
scripts/ci: remove CentOS bits from common build-environment
tests/vm: remove plain centos image
tests/vm: update centos.aarch64 image to 9
docs/devel: update references to centos to non-versioned container
ci: remove centos-steam-8 customer runner
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Update to the latest version of lcitool. It dropped support for Fedora 38
and Alpine 3.18, so we have to update these to newer versions here, too.
Python 3.12 dropped the "imp" module which we still need for running
Avocado. Fortunately Fedora 40 still ships with a work-around package
that we can use until somebody updates our Avocado to a newer version.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240601070543.37786-3-thuth@redhat.com>
[AJB: regen on rebase]
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-10-alex.bennee@linaro.org>
The latest version of Clang (version 18 from Fedora 40) now reports
bad function pointer casts as undefined behavior. Unfortunately, we are
still doing this in quite a lot of places in the QEMU code and some of
them are not easy to fix. So for the time being, temporarily switch this
off in the failing clang-system job until all spots in the QEMU sources
have been tackled.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240601070543.37786-4-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-9-alex.bennee@linaro.org>
>From the website:
"After May 31, 2024, CentOS Stream 8 will be archived and no further
updates will be provided."
We have updated a few bits but there are still references that need
fixing. Rather than bump I've replaced them with references to the
Debian image so we don't have to bump at the next update.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-3-alex.bennee@linaro.org>
This broke since eef0bae3a7 (migration: Remove block migration) but
even after that was addressed it still fails to complete. As it will
shortly be EOL lets to remove the runner definition and the related
ansible setup bits.
We still have centos9 docker images build and test.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-2-alex.bennee@linaro.org>
This patch adds a new board attribute 'v-eiointc'.
A value of true enables the virt extended I/O interrupt controller.
VMs working in kvm mode have 'v-eiointc' enabled by default.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20240528083855.1912757-4-gaosong@loongson.cn>
Add loongarch virt machine to the graph. It is a modified copy of
the existing riscv virtmachine in riscv-virt-machine.c
It contains a generic-pcihost controller, and an extra function
loongarch_config_qpci_bus() to configure GPEX pci host controller
information, such as ecam and pio_base addresses.
Also hotplug handle checking about TYPE_VIRTIO_IOMMU_PCI device is
added on loongarch virt machine, since virtio_mmu_pci device requires
it.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240528082053.938564-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
util/hexdump: Use a GString for qemu_hexdump_line.
system/qtest: Replace sprintf by qemu_hexdump_line
hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf
disas/microblaze: Reorg to avoid intermediate sprintf
disas/riscv: Use GString in format_inst
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZg1RMdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+6mgf6AjEdU91vBXAUxabs
# kmVl5HaAD3NHU1VCM+ruPQkm6xv4kLlMsTibmkiS7+WZYvHfPlGfozjRJxtvZj8K
# 8J2Qp9iHjny8NQPkMCValDvmzkxaIT7ZzYCBdS4jfTdIThuYNJnXsI3NNP7ghnl6
# xv8O62dQbc5gjWF8G+q6PKWSxY6BEuFJ3Pt82cJ/Fj/8bhsjd48pgiLv66F/+q1z
# U9Gy8fWqmkKEzTqBigSYU98yae5CA89T6JBKtgFV07pkYa4A7BUyCR5EBirARyhM
# P0OAqR1GCAbSXWFaJ1sSpU8ATq33FoSQYwWwcmEET7FZYZqvbd6Jd4HtpOPqmu9W
# Fc4taw==
# =VgLB
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 02:13:55 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-misc-20240605' of https://gitlab.com/rth7680/qemu:
disas/riscv: Use GString in format_inst
disas/microblaze: Split get_field_special
disas/microblaze: Print registers directly with PRIrfsl
disas/microblaze: Print immediates directly with PRIimm
disas/microblaze: Print registers directly with PRIreg
disas/microblaze: Merge op->name output into each fprintf
disas/microblaze: Re-indent print_insn_microblaze
disas/microblaze: Split out print_immval_addr
hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf
hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
system/qtest: Replace sprintf by qemu_hexdump_line
hw/mips/malta: Add re-usable rng_seed_hex_new() method
util/hexdump: Inline g_string_append_printf "%02x"
util/hexdump: Add unit_len and block_len to qemu_hexdump_line
util/hexdump: Use a GString for qemu_hexdump_line
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.
Note that this drops the "0x" prefix to every byte, which should
be of no consequence to tracing.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240412073346.458116-9-richard.henderson@linaro.org>
sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Extract common code from reinitialize_rng_seed and load_kernel
to rng_seed_hex_new. Using qemu_hexdump_line both fixes the
deprecation warning and simplifies the code base.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Use qemu_hexdump_line.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240412073346.458116-7-richard.henderson@linaro.org>
Ignore the "monitor" portion and treat them the same
as their base ASIs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
VIS4 completes the set, adding missing signed 8-bit ops
and missing unsigned 16 and 32-bit ops.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The manual separates VIS 3 and VIS 3B, even though they are both
present in all extant cpus. For clarity, let the translator
match the manual but otherwise leave them on the same feature bit.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rearrange PDIST so that do_dddd is general purpose and may
be re-used for FMADDd etc. Add pickNaN and pickNaNMulAdd.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Form the proper register decoding from the start.
Because we're removing the translation from the inner-most
gen_load_fpr_* and gen_store_fpr_* routines, this must be
done for all insns at once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This operation returns the high 16 bits of a 24-bit multiply
that has been sign-extended to 32 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Follow the Oracle Sparc 2015 implementation note and bound
the input value of N to 5 from the lower 3 bits of rs2.
Spell out all of the intermediate values, matching the diagram
in the manual. Fix extraction of upper_x and upper_y for N=0.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* virtio-blk: remove SCSI passthrough functionality
* require x86-64-v2 baseline ISA
* SEV-SNP host support
* fix xsave.flat with TCG
* fixes for CPUID checks done by TCG
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZgKVYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPKYgf/QkWrNXdjjD3yAsv5LbJFVTVyCYW3
# b4Iax29kEDy8k9wbzfLxOfIk9jXIjmbOMO5ZN9LFiHK6VJxbXslsMh6hm50M3xKe
# 49X1Rvf9YuVA7KZX+dWkEuqLYI6Tlgj3HaCilYWfXrjyo6hY3CxzkPV/ChmaeYlV
# Ad4Y8biifoUuuEK8OTeTlcDWLhOHlFXylG3AXqULsUsXp0XhWJ9juXQ60eATv/W4
# eCEH7CSmRhYFu2/rV+IrWFYMnskLRTk1OC1/m6yXGPKOzgnOcthuvQfiUgPkbR/d
# llY6Ni5Aaf7+XX3S7Avcyvoq8jXzaaMzOrzL98rxYGDR1sYBYO+4h4ZToA==
# =qQeP
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 02:01:10 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits)
hw/i386: Add support for loading BIOS using guest_memfd
hw/i386/sev: Use guest_memfd for legacy ROMs
memory: Introduce memory_region_init_ram_guest_memfd()
i386/sev: Allow measured direct kernel boot on SNP
i386/sev: Reorder struct declarations
i386/sev: Extract build_kernel_loader_hashes
i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
i386/sev: Invoke launch_updata_data() for SNP class
i386/sev: Invoke launch_updata_data() for SEV class
hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
i386/sev: Add support for SNP CPUID validation
i386/sev: Add support for populating OVMF metadata pages
hw/i386/sev: Add function to get SEV metadata from OVMF header
i386/sev: Set CPU state to protected once SNP guest payload is finalized
i386/sev: Add handling to encrypt/finalize guest launch data
i386/sev: Add the SNP launch start context
i386/sev: Update query-sev QAPI format to handle SEV-SNP
i386/sev: Add a class method to determine KVM VM type for SNP guests
i386/sev: Don't return launch measurements for SEV-SNP guests
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When guest_memfd is enabled, the BIOS is generally part of the initial
encrypted guest image and will be accessed as private guest memory. Add
the necessary changes to set up the associated RAM region with a
guest_memfd backend to allow for this.
Current support centers around using -bios to load the BIOS data.
Support for loading the BIOS via pflash requires additional enablement
since those interfaces rely on the use of ROM memory regions which make
use of the KVM_MEM_READONLY memslot flag, which is not supported for
guest_memfd-backed memslots.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-29-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current SNP guest kernels will attempt to access these regions with
with C-bit set, so guest_memfd is needed to handle that. Otherwise,
kvm_convert_memory() will fail when the guest kernel tries to access it
and QEMU attempts to call KVM_SET_MEMORY_ATTRIBUTES to set these ranges
to private.
Whether guests should actually try to access ROM regions in this way (or
need to deal with legacy ROM regions at all), is a separate issue to be
addressed on kernel side, but current SNP guest kernels will exhibit
this behavior and so this handling is needed to allow QEMU to continue
running existing SNP guest kernels.
Signed-off-by: Michael Roth <michael.roth@amd.com>
[pankaj: Added sev_snp_enabled() check]
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-28-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In SNP, the hashes page designated with a specific metadata entry
published in AmdSev OVMF.
Therefore, if the user enabled kernel hashes (for measured direct boot),
QEMU should prepare the content of hashes table, and during the
processing of the metadata entry it copy the content into the designated
page and encrypt it.
Note that in SNP (unlike SEV and SEV-ES) the measurements is done in
whole 4KB pages. Therefore QEMU zeros the whole page that includes the
hashes table, and fills in the kernel hashes area in that page, and then
encrypts the whole page. The rest of the page is reserved for SEV
launch secrets which are not usable anyway on SNP.
If the user disabled kernel hashes, QEMU pre-validates the kernel hashes
page as a zero page.
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-24-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_HC_MAP_GPA_RANGE will be used to send requests to userspace for
private/shared memory attribute updates requested by the guest.
Implement handling for that use-case along with some basic
infrastructure for enabling specific hypercall events.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-31-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV-SNP firmware allows a special guest page to be populated with a
table of guest CPUID values so that they can be validated through
firmware before being loaded into encrypted guest memory where they can
be used in place of hypervisor-provided values[1].
As part of SEV-SNP guest initialization, use this interface to validate
the CPUID entries reported by KVM_GET_CPUID2 prior to initial guest
start and populate the CPUID page reserved by OVMF with the resulting
encrypted data.
[1] SEV SNP Firmware ABI Specification, Rev. 0.8, 8.13.2.6
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-21-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A recent version of OVMF expanded the reset vector GUID list to add
SEV-specific metadata GUID. The SEV metadata describes the reserved
memory regions such as the secrets and CPUID page used during the SEV-SNP
guest launch.
The pc_system_get_ovmf_sev_metadata_ptr() is used to retieve the SEV
metadata pointer from the OVMF GUID list.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-19-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Once KVM_SNP_LAUNCH_FINISH is called the vCPU state is copied into the
vCPU's VMSA page and measured/encrypted. Any attempt to read/write CPU
state afterward will only be acting on the initial data and so are
effectively no-ops.
Set the vCPU state to protected at this point so that QEMU don't
continue trying to re-sync vCPU data during guest runtime.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-18-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Most of the current 'query-sev' command is relevant to both legacy
SEV/SEV-ES guests and SEV-SNP guests, with 2 exceptions:
- 'policy' is a 64-bit field for SEV-SNP, not 32-bit, and
the meaning of the bit positions has changed
- 'handle' is not relevant to SEV-SNP
To address this, this patch adds a new 'sev-type' field that can be
used as a discriminator to select between SEV and SEV-SNP-specific
fields/formats without breaking compatibility for existing management
tools (so long as management tools that add support for launching
SEV-SNP guest update their handling of query-sev appropriately).
The corresponding HMP command has also been fixed up similarly.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by:Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-15-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV guests can use either KVM_X86_DEFAULT_VM, KVM_X86_SEV_VM,
or KVM_X86_SEV_ES_VM depending on the configuration and what
the host kernel supports. SNP guests on the other hand can only
ever use KVM_X86_SNP_VM, so split determination of VM type out
into a separate class method that can be set accordingly for
sev-guest vs. sev-snp-guest objects and add handling for SNP.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-14-pankaj.gupta@amd.com>
[Remove unnecessary function pointer declaration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a simple helper to check if the current guest type is SNP. Also have
SNP-enabled imply that SEV-ES is enabled as well, and fix up any places
where the sev_es_enabled() check is expecting a pure/non-SNP guest.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-9-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV-SNP support relies on a different set of properties/state than the
existing 'sev-guest' object. This patch introduces the 'sev-snp-guest'
object, which can be used to configure an SEV-SNP guest. For example,
a default-configured SEV-SNP guest with no additional information
passed in for use with attestation:
-object sev-snp-guest,id=sev0
or a fully-specified SEV-SNP guest where all spec-defined binary
blobs are passed in as base64-encoded strings:
-object sev-snp-guest,id=sev0, \
policy=0x30000, \
init-flags=0, \
id-block=YWFhYWFhYWFhYWFhYWFhCg==, \
id-auth=CxHK/OKLkXGn/KpAC7Wl1FSiisWDbGTEKz..., \
author-key-enabled=on, \
host-data=LNkCWBRC5CcdGXirbNUV1OrsR28s..., \
guest-visible-workarounds=AA==, \
See the QAPI schema updates included in this patch for more usage
details.
In some cases these blobs may be up to 4096 characters, but this is
generally well below the default limit for linux hosts where
command-line sizes are defined by the sysconf-configurable ARG_MAX
value, which defaults to 2097152 characters for Ubuntu hosts, for
example.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (for QAPI schema)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-8-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When sev-snp-guest objects are introduced there will be a number of
differences in how the launch finish is handled compared to the existing
sev-guest object. Move sev_launch_finish() to a class method to make it
easier to implement SNP-specific launch update functionality later.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-7-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When sev-snp-guest objects are introduced there will be a number of
differences in how the launch data is handled compared to the existing
sev-guest object. Move sev_launch_start() to a class method to make it
easier to implement SNP-specific launch update functionality later.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-6-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently all SEV/SEV-ES functionality is managed through a single
'sev-guest' QOM type. With upcoming support for SEV-SNP, taking this
same approach won't work well since some of the properties/state
managed by 'sev-guest' is not applicable to SEV-SNP, which will instead
rely on a new QOM type with its own set of properties/state.
To prepare for this, this patch moves common state into an abstract
'sev-common' parent type to encapsulate properties/state that are
common to both SEV/SEV-ES and SEV-SNP, leaving only SEV/SEV-ES-specific
properties/state in the current 'sev-guest' type. This should not
affect current behavior or command-line options.
As part of this patch, some related changes are also made:
- a static 'sev_guest' variable is currently used to keep track of
the 'sev-guest' instance. SEV-SNP would similarly introduce an
'sev_snp_guest' static variable. But these instances are now
available via qdev_get_machine()->cgs, so switch to using that
instead and drop the static variable.
- 'sev_guest' is currently used as the name for the static variable
holding a pointer to the 'sev-guest' instance. Re-purpose the name
as a local variable referring the 'sev-guest' instance, and use
that consistently throughout the code so it can be easily
distinguished from sev-common/sev-snp-guest instances.
- 'sev' is generally used as the name for local variables holding a
pointer to the 'sev-guest' instance. In cases where that now points
to common state, use the name 'sev_common'; in cases where that now
points to state specific to 'sev-guest' instance, use the name
'sev_guest'
In order to enable kernel-hashes for SNP, pull it from
SevGuestProperties to its parent SevCommonProperties so
it will be available for both SEV and SNP.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-5-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ask the ConfidentialGuestSupport object whether to use guest_memfd
for KVM-backend private memory. This bool can be set in instance_init
(or user_complete) so that it is available when the machine is created.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now QEMU is importing arch/x86/include/uapi/asm/kvm_para.h
because it includes definitions for kvmclock and for KVM CPUID
bits. However, other definitions for KVM hypercall values and return
codes are included in include/uapi/linux/kvm_para.h and they will be
used by SEV-SNP.
To ensure that it is possible to include both <linux/kvm_para.h> and
"standard-headers/asm-x86/kvm_para.h" without conflicts, provide
linux/kvm_para.h as a portable header too, and forward linux-headers/
files to those in include/standard-headers. Note that <linux/kvm_para.h>
will include architecture-specific definitions as well, but
"standard-headers/linux/kvm_para.h" will not because it can be used in
architecture-independent files.
This could easily be extended to other architectures, but right now
they do not need any symbol in their specific kvm_para.h files.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This updates kernel headers to commit 6f627b425378 ("KVM: SVM: Add module
parameter to enable SEV-SNP", 2024-05-12). The SNP host patches will
be included in Linux 6.11, to be released next July.
Also brings in an linux-headers/linux/vhost.h fix from v6.9-rc4.
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-3-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux has <misc/pvpanic.h>, not <linux/pvpanic.h>. Use the same
directory for QEMU's include/standard-headers/ copy.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Afer commit 3efc75ad9d ("scripts/update-linux-headers.sh: Remove
temporary directory inbetween", 2024-05-29), updating linux-headers/
results in errors such as
cp: cannot stat '/tmp/tmp.1A1Eejh1UE/headers/include/asm/bitsperlong.h': No such file or directory
because Loongarch does not have an asm/bitsperlong.h file and uses the
generic version. Before commit 3efc75ad9d, the missing file would
incorrectly cause stale files to be included in linux-headers/. The files
were never committed to qemu.git, but were wrong nevertheless. The build
would just use the system version of the files, which is opposite to
the idea of importing Linux header files into QEMU's tree.
Create forwarding headers, resembling the ones that are generated during a
kernel build by scripts/Makefile.asm-generic, if a file is only installed
under include/asm-generic/.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
xsave.flat checks that "executing the XSETBV instruction causes a general-
protection fault (#GP) if ECX = 0 and EAX[2:1] has the value 10b". QEMU allows
that option, so the test fails. Add the condition.
Cc: qemu-stable@nongnu.org
Fixes: 892544317f ("target/i386: implement XSAVE and XRSTOR of AVX registers", 2022-10-18)
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This commit fixes an issue with MOV instructions (0x8C and 0x8E)
involving segment registers; MOV to segment register's source is
16-bit, while MOV from segment register has to explicitly set the
memory operand size to 16 bits. Introduce a new flag
X86_SPECIAL_Op0_Mw to handle this specification correctly.
Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn>
Message-ID: <20240602100528.2135717-1-lixinyu20s@ict.ac.cn>
Fixes: 5e9e21bcc4 ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has the POPCNT instruction.
Use it freely in TCG-generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has SSSE3 instructions
(notably, PSHUFB which is used by QEMU's AES implementation).
Do not bother checking it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has SSE2.
Use it freely in buffer_is_zero.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which always has CMOV.
Use it freely in TCG generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86-64-v2 processors were released in 2008, assume that we have one.
Unfortunately there is no GCC flag to enable all the features
without disabling what came after; so enable them one by one.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The only user was the SSE4.1 variant of buffer_is_zero, which has
been removed; code to compute CPUINFO_SSE4 is dead.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The legacy SCSI passthrough functionality has never been enabled for
VIRTIO 1.0 and was deprecated more than four years ago.
Get rid of it---almost, because QEMU is advertising it unconditionally
for legacy virtio-blk devices. Just parse the header and return a
nonzero status.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmZewo4ACgkQ7wSWWzmN
# YhHhxgf/ZaECxru4fP8wi34XdSG/PR+BF+W5M9gZIRGrHg3vIf3/LRTpZTDccbRN
# Qpwtypr9O6/AWG9Os80rn7alsmMDxN8PDDNLa9T3wf5pJUQSyQ87Yy0MiuTNPSKD
# HKYUIfIlbFCM5WUW4huMmg98gKTgnzZMqOoRyMFZitbkR59qCm+Exws4HtXvCH68
# 3k4lgvnFccmzO9iIzaOUIPs+Yf04Kw/FrY0Q/6nypvqbF2W80Md6w02JMQuTLwdF
# Guxeg/n6g0NLvCBbkjiM2VWfTaWJYbwFSwRTAMxM/geqh7qAgGsmD0N5lPlgqRDy
# uAy2GvFyrwzcD0lYqf0/fRK0Go0HPA==
# =J70K
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 04 Jun 2024 02:30:22 AM CDT
# gpg: using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
ebpf: Added traces back. Changed source set for eBPF to 'system'.
virtio-net: drop too short packets early
ebpf: Add a separate target for skeleton
ebpf: Refactor tun_rss_steering_prog()
ebpf: Return 0 when configuration fails
ebpf: Fix RSS error handling
virtio-net: Do not write hashes to peer buffer
virtio-net: Always set populate_hash
virtio-net: Unify the logic to update NIC state for RSS
virtio-net: Disable RSS on reset
virtio-net: Shrink header byte swapping buffer
virtio-net: Copy header only when necessary
virtio-net: Add only one queue pair when realizing
virtio-net: Do not propagate ebpf-rss-fds errors
tap: Shrink zeroed virtio-net header
tap: Call tap_receive_iov() from tap_receive()
net: Remove receive_raw()
net: Move virtio-net header length assertion
tap: Remove qemu_using_vnet_hdr()
tap: Remove tap_probe_vnet_hdr_len()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The 'blacklist' argument / config key are deprecated since commit
582a098e6c ("qga: Replace 'blacklist' command line and config file
options by 'block-rpcs'"), time to remove them.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-Id: <20240530070413.19181-1-philmd@linaro.org>
In fdf029762f we factored out the handling of reading and writing
DMA descriptors from guest memory. Unfortunately we accidentally
made the descriptor-read read the descriptor into the address of the
buffer rather than into the buffer, because we didn't notice we
needed to update the arguments to the dma_memory_read() call. Before
the refactoring, "&desc" is the address of a local struct DPDMADescriptor
variable in xlnx_dpdma_start_operation(), which is the correct target
for the guest-memory-read. But after the refactoring 'desc' is the
"DPDMADescriptor *desc" argument to the new function, and so it is
already an address.
This bug is an overrun of a stack variable, since a pointer is at
most 8 bytes long and we try to read 64 bytes, as well as being
incorrect behaviour.
Pass 'desc' rather than '&desc' as the dma_memory_read() argument
to fix this.
(The same bug is not present in xlnx_dpdma_write_descriptor(),
because there we are writing the descriptor from a local struct
variable "DPDMADescriptor tmp_desc" and so passing &tmp_desc to
dma_memory_write() is correct.)
Spotted by Coverity: CID 1546649
Fixes: fdf029762f ("xlnx_dpdma: fix descriptor endianness bug")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240531124628.476938-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Directly calling exit() prevents any kind of management or handling.
Instead use the corresponding runstate API.
The default behavior of the runstate API is the same as exit().
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240523-debugexit-v1-1-d52fcaf7bf8b@t-8ch.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Align the framebuffer backend with the other legacy ones,
register it via xen_backend_init() when '-vga xenfb' is
used. It is safe because MODULE_INIT_XEN_BACKEND is called
in xen_bus_realize(), long after CLI processing initialized
the vga_interface_type variable.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20240510104908.76908-8-philmd@linaro.org>
For xen, when checking for the first RAM (xen_memory), use
xen_mr_is_memory() rather than checking for a RAMBlock with
offset 0.
All Xen machines create xen_memory first so this has no
functional change for existing machines.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240529140739.1387692-6-edgar.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Originally I tried to move where vCPU thread initialisation to later
in realize. However pulling that thread (sic) got gnarly really
quickly. It turns out some steps of CPU realization need values that
can only be determined from the running vCPU thread.
However having moved enough out of the thread creation we can now
queue work before the thread starts (at least for TCG guests) and
avoid the race between vcpu_init and other vcpu states a plugin might
subscribe to.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240530194250.1801701-6-alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Aside from the round robin threads this is all common code. By
moving the halt_cond setup we also no longer need hacks to work around
the race between QOM object creation and thread creation.
It is a little ugly to free stuff up for the round robin thread but
better it deal with its own specialises than making the other
accelerators jump through hoops.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240530194250.1801701-3-alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
There was an issue with Qemu build with "--disable-system".
The traces could be generated and the build fails.
The traces were 'cut out' for previous patches, and overall,
the 'system' source set should be used like in pre-'eBPF blob' patches.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reproducer from https://gitlab.com/qemu-project/qemu/-/issues/1451
creates small packet (1 segment, len = 10 == n->guest_hdr_len),
then destroys queue.
"if (n->host_hdr_len != n->guest_hdr_len)" is triggered, if body creates
zero length/zero segment packet as there is nothing after guest header.
qemu_sendv_packet_async() tries to send it.
slirp discards it because it is smaller than Ethernet header,
but returns 0 because tx hooks are supposed to return total length of data.
0 is propagated upwards and is interpreted as "packet has been sent"
which is terrible because queue is being destroyed, nobody is waiting for TX
to complete and assert it triggered.
Fix is discard such empty packets instead of sending them.
Length 1 packets will go via different codepath:
virtqueue_push(q->tx_vq, elem, 0);
virtio_notify(vdev, q->tx_vq);
g_free(elem);
and aren't problematic.
Signed-off-by: Alexey Dobriyan <adobriyan@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This generalizes the rule to generate the skeleton and allows to add
another.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The kernel interprets the returned value as an unsigned 32-bit so -1
will mean queue 4294967295, which is awkward. Return 0 instead.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
calculate_rss_hash() was using hash value 0 to tell if it calculated
a hash, but the hash value may be 0 on a rare occasion. Have a
distinct bool value for correctness.
Fixes: f3fa412de2 ("ebpf: Added eBPF RSS program.")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The peer buffer is qualified with const and not meant to be modified.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The code to attach or detach the eBPF program to RSS were duplicated so
unify them into one function to save some code.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Byte swapping is only performed for the part of header shared with the
legacy standard and the buffer only needs to cover it.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Multiqueue usage is not negotiated yet when realizing. If more than
one queue is added and the guest never requests to enable multiqueue,
the extra queues will not be deleted when unrealizing and leak.
Fixes: f9d6dbf0bf ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Propagating ebpf-rss-fds errors has several problems.
First, it makes device realization fail and disables the fallback to the
conventional eBPF loading.
Second, it leaks memory by making device realization fail without
freeing memory already allocated.
Third, the convention is to set an error when a function returns false,
but virtio_net_load_ebpf_fds() and virtio_net_load_ebpf() returns false
without setting an error, which is confusing.
Remove the propagation to fix these problems.
Fixes: 0524ea0510 ("ebpf: Added eBPF initialization by fds.")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
tap prepends a zeroed virtio-net header when writing a packet to a
tap with virtio-net header enabled but not in use. This only happens
when s->host_vnet_hdr_len == sizeof(struct virtio_net_hdr).
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
While netmap implements virtio-net header, it does not implement
receive_raw(). Instead of implementing receive_raw for netmap, add
virtio-net headers in the common code and use receive_iov()/receive()
instead. This also fixes the buffer size for the virtio-net header.
Fixes: fbbdbddec0 ("tap: allow extended virtio header with hash info")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The virtio-net header length assertion should happen for any clients.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since qemu_set_vnet_hdr_len() is always called when
qemu_using_vnet_hdr() is called, we can merge them and save some code.
For consistency, express that the virtio-net header is not in use by
returning 0 with qemu_get_vnet_hdr_len() instead of having a dedicated
function, qemu_get_using_vnet_hdr().
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It was necessary since an Linux older than 2.6.35 may implement the
virtio-net header but may not allow to change its length. Remove it
since such an old Linux is no longer supported.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
RISC-V PR for 9.1
* APLICs add child earlier than realize
* Fix exposure of Zkr
* Raise exceptions on wrs.nto
* Implement SBI debug console (DBCN) calls for KVM
* Support 64-bit addresses for initrd
* Change RISCV_EXCP_SEMIHOST exception number to 63
* Tolerate KVM disable ext errors
* Set tval in breakpoints
* Add support for Zve32x extension
* Add support for Zve64x extension
* Relax vector register check in RISCV gdbstub
* Fix the element agnostic Vector function problem
* Fix Zvkb extension config
* Implement dynamic establishment of custom decoder
* Add th.sxstatus CSR emulation
* Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
* Check single width operator for vector fp widen instructions
* Check single width operator for vfncvt.rod.f.f.w
* Remove redudant SEW checking for vector fp narrow/widen instructions
* Prioritize pmp errors in raise_mmu_exception()
* Do not set mtval2 for non guest-page faults
* Remove experimental prefix from "B" extension
* Fixup CBO extension register calculation
* Fix the hart bit setting of AIA
* Fix reg_width in ricsv_gen_dynamic_vector_feature()
* Decode all of the pmpcfg and pmpaddr CSRs
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmZdVzcACgkQr3yVEwxT
# gBPxSBAAsuzhDCbaOl9jXhIL6Q0IDHULz4U16AZypHYID7T6rDaNoRmNVdqBKZuM
# IMby8qm5XFmcUGM9itcM7IKV2BNHuWSye3/Y7GOYZQyToR7U6lvLpAm4pNj4AgTC
# PLV2VPt1XLZRSthkgwp6ylBXzdNSiZMWggqTb7QbyfR5hJfG+VsZjTGaIwyZbtKI
# +CJG6gZSPv6JGNtwnJq+v0VBEkj1ryo/gg2EAAzA+EWU4nw5mJCLWoDLrYZalTv9
# vCTqJuMViTjeHqAm/IIMoFzYR94+ug0usqcmnx/E7ALTOsmBh5K+KWndAW4vqAlP
# mZOONfr3h7zc81jThC961kjGVPiTjTGbHHlKwlB2JEggwctcVqGRyWeM9wHSUr2W
# S6F56hpForzVW9IkCt/fDUxamr23303s5miIsronrwiihqkNpxKYAuqPTXFGkFKg
# ilBLGcbHcWxNmjpfIEXnTjDB6qFEceWqbjJejrsKusoSPkKQm0ktIZZUwCbTsu45
# 0ScYrBieUPjDWDFYlmWrr5byekyCXCzfpBgq8qo60FA+aP29Nx+GlFR0eWTXXY4V
# O5/WTKjQM4+/uNYIuFDCFPV1Ja5GERDhXoNkjkY5ErsSZL2c2UEp3UTxzbEl5dOm
# NRH7C26Z/xVMDwT08kDDq0t8Rkz4836txPO7y+aPbtvGfENRI8E=
# =mtVb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Jun 2024 12:40:07 AM CDT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240603' of https://github.com/alistair23/qemu: (27 commits)
disas/riscv: Decode all of the pmpcfg and pmpaddr CSRs
riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()
target/riscv/kvm.c: Fix the hart bit setting of AIA
target/riscv: rvzicbo: Fixup CBO extension register calculation
target/riscv: Remove experimental prefix from "B" extension
target/riscv: do not set mtval2 for non guest-page faults
target/riscv: prioritize pmp errors in raise_mmu_exception()
target/riscv: rvv: Remove redudant SEW checking for vector fp narrow/widen instructions
target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
target/riscv: rvv: Check single width operator for vector fp widen instructions
target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
riscv: thead: Add th.sxstatus CSR emulation
target/riscv: Implement dynamic establishment of custom decoder
target/riscv/cpu.c: fix Zvkb extension config
target/riscv: Fix the element agnostic function problem
target/riscv: Relax vector register check in RISCV gdbstub
target/riscv: Add support for Zve64x extension
target/riscv: Add support for Zve32x extension
trans_privileged.c.inc: set (m|s)tval on ebreak breakpoint
target/riscv/debug: set tval=pc in breakpoint exceptions
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prevent regressions when using NBD with TLS in the presence of
iothreads, adding coverage the fix to qio channels made in the
previous patch.
The shell function pick_unused_port() was copied from
nbdkit.git/tests/functions.sh.in, where it had all authors from Red
Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2.
CC: qemu-stable@nongnu.org
CC: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240531180639.1392905-6-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
hw/ufs patches
- Add support MCQ of UFSHCI 4.0
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEUBfYMVl8eKPZB+73EuIgTA5dtgIFAmZdf7EACgkQEuIgTA5d
# tgLI8Q//SXqscGTP/FrjaUj0SsQE90E0AEEfIX4juVP8e1FiFcM9x6tmEU/CT5CM
# BQYk1zn0d3JY09ClIyr+AlVyy5RYMdlL/LyhGElxU0MPZh6u4X/6QJ/oHcxAx96r
# sRGSjJp5k2maHXgjQmqVUFkp22SZGt5vpD+AQT83wvuWshL302b3MDJ8B3f9zX90
# mCcwSmk+JgSjceaXueuBAbOJud6Ie3jAqXf4w8Gv21nLwzRmDBacjfn5LGSVzQxd
# BLkADuwRcRkxQ9hpoCPWOdKvXCAXtTIYqw8BRCG7Avl478UDI+CrYNjvK62SystM
# el2ql5ZvGjL8w+k7aQxqJi44RyQH3k1NJwvss3pdRyJzwL9x9pVRVlsM6tQbW56u
# COVexJnKDoufWmQs7o6rOv5OzexC6cD6yEoH2mi60F35jO8j3skJi1od7ehHwbzm
# 2a6dt1glBKvRWgfcLgEGuFERji3vV++9T6bAg8To0GYTryZKZzLKMSbIHvYI/49t
# u4ZwqhYkp36gpcQ8eA7Byr5FWd19UzEin6sVw3uCYibr6oONFTjp+XXnFz/LK9Fu
# XbrSOe8943FCUs9dCBHHtJFLmw4j1Ck60GpnogFkzjPEhlKDHO8+4/lm1gFgPV2h
# K2wqmq5kw8JtTKvHSDGa4iGTfJ/zxMOv1ePc3wiulwSH28kl6r8=
# =dQMQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Jun 2024 03:32:49 AM CDT
# gpg: using RSA key 5017D831597C78A3D907EEF712E2204C0E5DB602
# gpg: Good signature from "Jeuk Kim <jeuk20.kim@samsung.com>" [unknown]
# gpg: aka "Jeuk Kim <jeuk20.kim@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5017 D831 597C 78A3 D907 EEF7 12E2 204C 0E5D B602
* tag 'pull-ufs-20240603' of https://gitlab.com/jeuk20.kim/qemu:
hw/ufs: Add support MCQ of UFSHCI 4.0
hw/ufs: Update MCQ-related fields to block/ufs.h
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch adds support for MCQ defined in UFSHCI 4.0. This patch
utilized the legacy I/O codes as much as possible to support MCQ.
MCQ operation & runtime register is placed at 0x1000 offset of UFSHCI
register statically with no spare space among four registers (48B):
UfsMcqSqReg, UfsMcqSqIntReg, UfsMcqCqReg, UfsMcqCqIntReg
The maxinum number of queue is 32 as per spec, and the default
MAC(Multiple Active Commands) are 32 in the device.
Example:
-device ufs,serial=foo,id=ufs0,mcq=true,mcq-maxq=8
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Jeuk Kim <jeuk20.kim@samsung.com>
Message-Id: <20240528023106.856777-3-minwoo.im@samsung.com>
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
In AIA spec, each hart (or each hart within a group) has a unique hart
number to locate the memory pages of interrupt files in the address
space. The number of bits required to represent any hart number is equal
to ceil(log2(hmax + 1)), where hmax is the largest hart number among
groups.
However, if the largest hart number among groups is a power of 2, QEMU
will pass an inaccurate hart-index-bit setting to Linux. For example, when
the guest OS has 4 harts, only ceil(log2(3 + 1)) = 2 bits are sufficient
to represent 4 harts, but we passes 3 to Linux. The code needs to be
updated to ensure accurate hart-index-bit settings.
Additionally, a Linux patch[1] is necessary to correctly recover the hart
index when the guest OS has only 1 hart, where the hart-index-bit is 0.
[1] https://lore.kernel.org/lkml/20240415064905.25184-1-yongxuan.wang@sifive.com/t/
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240515091129.28116-1-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When running the instruction
```
cbo.flush 0(x0)
```
QEMU would segfault.
The issue was in cpu_gpr[a->rs1] as QEMU does not have cpu_gpr[0]
allocated.
In order to fix this let's use the existing get_address()
helper. This also has the benefit of performing pointer mask
calculations on the address specified in rs1.
The pointer masking specificiation specifically states:
"""
Cache Management Operations: All instructions in Zicbom, Zicbop and Zicboz
"""
So this is the correct behaviour and we previously have been incorrectly
not masking the address.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reported-by: Fabian Thomas <fabian.thomas@cispa.de>
Fixes: e05da09b7c ("target/riscv: implement Zicbom extension")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240514023910.301766-1-alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Previous patch fixed the PMP priority in raise_mmu_exception() but we're still
setting mtval2 incorrectly. In riscv_cpu_tlb_fill(), after pmp check in 2 stage
translation part, mtval2 will be set in case of successes 2 stage translation but
failed pmp check.
In this case we gonna set mtval2 via env->guest_phys_fault_addr in context of
riscv_cpu_tlb_fill(), as this was a guest-page-fault, but it didn't and mtval2
should be zero, according to RISCV privileged spec sect. 9.4.4: When a guest
page-fault is taken into M-mode, mtval2 is written with either zero or guest
physical address that faulted, shifted by 2 bits. *For other traps, mtval2
is set to zero...*
Signed-off-by: Alexei Filippov <alexei.filippov@syntacore.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240503103052.6819-1-alexei.filippov@syntacore.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
raise_mmu_exception(), as is today, is prioritizing guest page faults by
checking first if virt_enabled && !first_stage, and then considering the
regular inst/load/store faults.
There's no mention in the spec about guest page fault being a higher
priority that PMP faults. In fact, privileged spec section 3.7.1 says:
"Attempting to fetch an instruction from a PMP region that does not have
execute permissions raises an instruction access-fault exception.
Attempting to execute a load or load-reserved instruction which accesses
a physical address within a PMP region without read permissions raises a
load access-fault exception. Attempting to execute a store,
store-conditional, or AMO instruction which accesses a physical address
within a PMP region without write permissions raises a store
access-fault exception."
So, in fact, we're doing it wrong - PMP faults should always be thrown,
regardless of also being a first or second stage fault.
The way riscv_cpu_tlb_fill() and get_physical_address() work is
adequate: a TRANSLATE_PMP_FAIL error is immediately reported and
reflected in the 'pmp_violation' flag. What we need is to change
raise_mmu_exception() to prioritize it.
Reported-by: Joseph Chan <jchan@ventanamicro.com>
Fixes: 82d53adfbb ("target/riscv/cpu_helper.c: Invalid exception on MMU translation stage")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240413105929.7030-1-alexei.filippov@syntacore.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The require_scale_rvf function only checks the double width operator for
the vector floating point widen instructions, so most of the widen
checking functions need to add require_rvf for single width operator.
The vfwcvt.f.x.v and vfwcvt.f.xu.v instructions convert single width
integer to double width float, so the opfxv_widen_check function doesn’t
need require_rvf for the single width operator(integer).
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-3-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
According v spec 18.4, only the vfwcvt.f.f.v and vfncvt.f.f.w
instructions will be affected by Zvfhmin extension.
And the vfwcvt.f.f.v and vfncvt.f.f.w instructions only support the
conversions of
* From 1*SEW(16/32) to 2*SEW(32/64)
* From 2*SEW(32/64) to 1*SEW(16/32)
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-2-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
RISCVCPU, each cpu can have their own decoder, and the decoders can be
different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
can be added to the dynamic_decoder when building up the decoder. Therefore,
there is no need to run the guard_func when decoding each instruction. It can
improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
functions to improve decoding efficiency, especially when vendor-defined
instruction sets increase. Because of dynamic building up, it can skip the other
decoder guard functions when decoding.
4. Pre patch for allowing adding a vendor decoder before decode_insn32() with minimal
overhead for users that don't need this particular vendor decoder.
Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
Suggested-by: Christoph Muellner <christoph.muellner@vrull.eu>
Co-authored-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240506023607.29544-1-eric.huang@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
In current implementation, the gdbstub allows reading vector registers
only if V extension is supported. However, all vector extensions and
vector crypto extensions have the vector registers and they all depend
on Zve32x. The gdbstub should check for Zve32x instead.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Message-ID: <20240328022343.6871-4-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Privileged spec section 4.1.9 mentions:
"When a trap is taken into S-mode, stval is written with
exception-specific information to assist software in handling the trap.
(...)
If stval is written with a nonzero value when a breakpoint,
address-misaligned, access-fault, or page-fault exception occurs on an
instruction fetch, load, or store, then stval will contain the faulting
virtual address."
A similar text is found for mtval in section 3.1.16.
Setting mtval/stval in this scenario is optional, but some softwares read
these regs when handling ebreaks.
Write 'badaddr' in all ebreak breakpoints to write the appropriate
'tval' during riscv_do_cpu_interrrupt().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240416230437.1869024-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're not setting (s/m)tval when triggering breakpoints of type 2
(mcontrol) and 6 (mcontrol6). According to the debug spec section
5.7.12, "Match Control Type 6":
"The Privileged Spec says that breakpoint exceptions that occur on
instruction fetches, loads, or stores update the tval CSR with either
zero or the faulting virtual address. The faulting virtual address for
an mcontrol6 trigger with action = 0 is the address being accessed and
which caused that trigger to fire."
A similar text is also found in the Debug spec section 5.7.11 w.r.t.
mcontrol.
Note that what we're doing ATM is not violating the spec, but it's
simple enough to set mtval/stval and it makes life easier for any
software that relies on this info.
Given that we always use action = 0, save the faulting address for the
mcontrol and mcontrol6 trigger breakpoints into env->badaddr, which is
used as as scratch area for traps with address information. 'tval' is
then set during riscv_cpu_do_interrupt().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-ID: <20240416230437.1869024-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Running a KVM guest using a 6.9-rc3 kernel, in a 6.8 host that has zkr
enabled, will fail with a kernel oops SIGILL right at the start. The
reason is that we can't expose zkr without implementing the SEED CSR.
Disabling zkr in the guest would be a workaround, but if the KVM doesn't
allow it we'll error out and never boot.
In hindsight this is too strict. If we keep proceeding, despite not
disabling the extension in the KVM vcpu, we'll not add the extension in
the riscv,isa. The guest kernel will be unaware of the extension, i.e.
it doesn't matter if the KVM vcpu has it enabled underneath or not. So
it's ok to keep booting in this case.
Change our current logic to not error out if we fail to disable an
extension in kvm_set_one_reg(), but show a warning and keep booting. It
is important to throw a warning because we must make the user aware that
the extension is still available in the vcpu, meaning that an
ill-behaved guest can ignore the riscv,isa settings and use the
extension.
The case we're handling happens with an EINVAL error code. If we fail to
disable the extension in KVM for any other reason, error out.
We'll also keep erroring out when we fail to enable an extension in KVM,
since adding the extension in riscv,isa at this point will cause a guest
malfunction because the extension isn't enabled in the vcpu.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240422171425.333037-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The current semihost exception number (16) is a reserved number (range
[16-17]). The upcoming double trap specification uses that number for
the double trap exception. Since the privileged spec (Table 22) defines
ranges for custom uses change the semihosting exception number to 63
which belongs to the range [48-63] in order to avoid any future
collisions with reserved exception.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240422135840.1959967-1-cleger@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
SBI defines a Debug Console extension "DBCN" that will, in time, replace
the legacy console putchar and getchar SBI extensions.
The appeal of the DBCN extension is that it allows multiple bytes to be
read/written in the SBI console in a single SBI call.
As far as KVM goes, the DBCN calls are forwarded by an in-kernel KVM
module to userspace. But this will only happens if the KVM module
actually supports this SBI extension and we activate it.
We'll check for DBCN support during init time, checking if get-reg-list
is advertising KVM_RISCV_SBI_EXT_DBCN. In that case, we'll enable it via
kvm_set_one_reg() during kvm_arch_init_vcpu().
Finally, change kvm_riscv_handle_sbi() to handle the incoming calls for
SBI_EXT_DBCN, reading and writing as required.
A simple KVM guest with 'earlycon=sbi', running in an emulated RISC-V
host, takes around 20 seconds to boot without using DBCN. With this
patch we're taking around 14 seconds to boot due to the speed-up in the
terminal output. There's no change in boot time if the guest isn't
using earlycon.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20240425155012.581366-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Implementing wrs.nto to always just return is consistent with the
specification, as the instruction is permitted to terminate the
stall for any reason, but it's not useful for virtualization, where
we'd like the guest to trap to the hypervisor in order to allow
scheduling of the lock holding VCPU. Change to always immediately
raise exceptions when the appropriate conditions are present,
otherwise continue to just return. Note, immediately raising
exceptions is also consistent with the specification since the
time limit that should expire prior to the exception is
implementation-specific.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240424142808.62936-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The Zkr extension may only be exposed to KVM guests if the VMM
implements the SEED CSR. Use the same implementation as TCG.
Without this patch, running with a KVM which does not forward the
SEED CSR access to QEMU will result in an ILL exception being
injected into the guest (this results in Linux guests crashing on
boot). And, when running with a KVM which does forward the access,
QEMU will crash, since QEMU doesn't know what to do with the exit.
Fixes: 3108e2f1c6 ("target/riscv/kvm: update KVM exts to Linux 6.8")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240422134605.534207-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
FEAT_WFxT introduces new instructions WFIT and WFET, which are like
the existing WFI and WFE but allow the guest to pass a timeout value
in a register. The instructions will wait for an interrupt/event as
usual, but will also stop waiting when the value of CNTVCT_EL0 is
greater than or equal to the specified timeout value.
We implement WFIT by setting up a timer to expire at the right
point; when the timer expires it sets the EXITTB interrupt, which
will cause the CPU to leave the halted state. If we come out of
halt for some other reason, we unset the pending timer.
We implement WFET as a nop, which is architecturally permitted and
matches the way we currently make WFE a nop.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240430140035.3889879-3-peter.maydell@linaro.org
The TCGCPUOps::cpu_exec_halt method is called from cpu_handle_halt()
when the CPU is halted, so that a target CPU emulation can do
anything target-specific it needs to do. (At the moment we only use
this on i386.)
The current specification of the method doesn't allow the target
specific code to do something different if the CPU is about to come
out of the halt state, because cpu_handle_halt() only determines this
after the method has returned. (If the method called cpu_has_work()
itself this would introduce a potential race if an interrupt arrived
between the target's method implementation checking and
cpu_handle_halt() repeating the check.)
Change the definition of the method so that it returns a bool to
tell cpu_handle_halt() whether to stay in halt or not.
We will want this for the Arm target, where FEAT_WFxT wants to do
some work only for the case where the CPU is in halt but about to
leave it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240430140035.3889879-2-peter.maydell@linaro.org
The board list in target-arm.rst is supposed to be in alphabetical
order by the title text of each file (which is not the same as
alphabetical order by filename). A few items had got out of order;
correct them.
The entry for
"Facebook Yosemite v3.5 Platform and CraterLake Server (fby35)"
remains out-of-order, because this is not its own file
but is currently part of the aspeed.rst file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240520141421.1895138-1-peter.maydell@linaro.org
Moving to Neoverse-N2 gives us several cpu features to use for expanding
our platform:
- branch target identification
- pointer authentication
- RME for confidential computing
- RNG for EFI_PROTOCOL_RNG
- SVE being enabled by default
We do not go for "max" as default to have stable set of features enabled
by default. It is still supported and can be selected with "--cpu"
argument.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20240523165353.6547-1-marcin.juszkiewicz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Partial support for NUMA setup:
- cpu nodes
- memory nodes
Used versions:
- Trusted Firmware v2.11.0
- Tianocore EDK2 stable202405
- Tianocore EDK2 Platforms code commit 4bbd0ed
Firmware is built using Debian 'bookworm' cross toolchain (gcc 12.2.0).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to the GICv2 specification section 4.3.12, "Interrupt Processor
Targets Registers, GICD_ITARGETSRn":
"Any change to a CPU targets field value:
[...]
* Has an effect on any pending interrupts. This means:
- adding a CPU interface to the target list of a pending interrupt makes that
interrupt pending on that CPU interface
- removing a CPU interface from the target list of a pending interrupt
removes the pending state of that interrupt on that CPU interface."
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Message-id: 20240524113256.8102-3-sebastian.huber@embedded-brains.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:
qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.
It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner. (I'll add that in the next patch)
But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Daniel P. Berrangé <berrange@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240518025246.791593-5-eblake@redhat.com>
Using -fsanitize=undefined with Clang v18 causes an error if function
pointers are casted:
qapi/qapi-clone-visitor.c:188:5: runtime error: call to function visit_type_SocketAddress through pointer to incorrect function type 'bool (*)(struct Visitor *, const char *, void **, struct Error **)'
/tmp/qemu-ubsan/qapi/qapi-visit-sockets.c:487: note: visit_type_SocketAddress defined here
#0 0x5642aa2f7f3b in qapi_clone qapi/qapi-clone-visitor.c:188:5
#1 0x5642aa2c8ce5 in qio_channel_socket_listen_async io/channel-socket.c:285:18
#2 0x5642aa2b8903 in test_io_channel_setup_async tests/unit/test-io-channel-socket.c:116:5
#3 0x5642aa2b8204 in test_io_channel tests/unit/test-io-channel-socket.c:179:9
#4 0x5642aa2b8129 in test_io_channel_ipv4 tests/unit/test-io-channel-socket.c:323:5
...
It also prevents enabling the strict mode of CFI which is currently
disabled with -fsanitize-cfi-icall-generalize-pointers.
The problematic casts are necessary to pass visit_type_T() and
visit_type_T_members() as callbacks to qapi_clone() and qapi_clone_members(),
respectively. Open-code these two functions to avoid the callbacks, and
thus the type casts.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2346
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240524-xkb-v4-3-2de564e5c859@daynix.com>
[thuth: Improve commit message according to Markus' suggestions]
Signed-off-by: Thomas Huth <thuth@redhat.com>
LeakSanitizer complains about allocations whose references are held
only by automatic variables. It is possible to free them to suppress
the complaints, but it is a chore to make sure they are freed in all
exit paths so make them static instead.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240524-xkb-v4-1-2de564e5c859@daynix.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When running the update-linx-headers.sh script, it currently fails with:
scripts/update-linux-headers.sh: line 73: .../qemu/standard-headers/asm-x86/setup_data.h: No such file or directory
The "include" folder is obviously missing here - no clue how this could
have worked before?
Fixes: 66210a1a30 ("scripts/update-linux-headers: Add setup_data.h to import list")
Message-ID: <20240527060126.12578-1-thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are reusing the same temporary directory for installing the headers
of all targets, so there could be stale files here when switching from
one target to another. Make sure to delete the folder before installing
a new set of target headers into it.
Message-ID: <20240527060243.12647-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When we are building for OSS-Fuzz, we want to ensure that the fuzzer
targets are actually created, regardless of leaks. Leaks will be
detected by the subsequent tests of the individual fuzz-targets.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240527150001.325565-1-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Set per_address and ilen in per_ifetch; this is valid for
all PER exceptions and will last until the end of the
instruction. Therefore we don't need to give the same
data to per_check_exception.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-13-richard.henderson@linaro.org>
[thuth: Silence checkpatch.pl errors]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop from argument, since gbea has always been updated with
this address. Add ilen argument for setting int_pgm_ilen.
Use update_cc_op before calling per_branch.
By raising the exception here, we need not call
per_check_exception later, which means we can clean up the
normal non-exception branch path.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-10-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Always use a tcg branch, instead of movcond. The movcond
was not a bad idea before PER was added, but since then
we have either 2 or 3 actions to perform on each leg of
the branch, and multiple movcond is inefficient.
Reorder the taken branch to be fallthrough of the tcg branch.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-8-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Using exception unwind via tcg_s390_program_interrupt,
we discard the current value of psw.addr, which discards
the result of a branch.
Pass in the address of the next instruction, which may
not be sequential. Pass in ilen, which we would have
gotten from unwind and is passed to the exception handler.
Sync cc_op before the call, which we would have gotten
from unwind.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240502054417.234340-2-richard.henderson@linaro.org>
[thuth: Silence checkpatch.pl errors]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Block jobs patches for 2024-04-29
v2: add "iotests/pylintrc: allow up to 10 similar lines" to fix
check-python-minreqs
- backup: discard-source parameter
- blockcommit: Reopen base image as RO after abort
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEi5wmzbL9FHyIDoahVh8kwfGfefsFAmZV4UwACgkQVh8kwfGf
# eftBIA/9Em1xR7yEK5gE9kiGc+qSBsRPB8sJZ/JB+GukDPvzQ+/CktIJJgTryI/q
# QC08KyHnuE6WknUfJPkV5kfINj8vTDtkMjwgccrMu8enc9W5wnRfVBQomS8qWpZY
# maJhyW+Sva7k82v/U1mpdur5cTF1cu8VmwMSNurBYVd84E33KHkgQikEbXSLzFBu
# N8dG4WOgtwuLmP5BMgg5ftzwC3W7qv+sq1DhnZwDATUKVbjX1lLtKAYwu66bH8du
# ekZtWqtJNJqRTcOIiSyl52lPm3xo9+U8khXWQ/lmq1jjvdKcC90y76bT16yIQw98
# 74aBiKSRu2MO/EraEgPQKU2LpSzbzr4Eu1kRjmDXcVDAB183vaFW3Ogym8BuGJ9n
# ZiNFYLZqOqUL4RkyaXEwci6THEyjHqQvK2HYGmjoidZPvATf5G52FWrKZT3S9LVT
# Q4oUhb6dQW4EtU4WoVJpqSg7xozVI/swJ04+gLTjQskitXQm2jX8ifD6MI+85tVp
# nntS5BtMfTe/z5K4L7bv8KOe7J+gK0NUo3YCdw3zQKa+u7tX/QQKnPmNtUK8ohjO
# g6wIuwrxn/GsHxvXaeOKftHyXBGDHYUSuIr7ByQ/WxS9nQaWW1UKk9WFC/XtUFND
# bHMfL+DidkUxMnZBe7Snz6gb16oEr0DsrsSyHe/J2dWrid6QJVA=
# =aSvT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 28 May 2024 06:51:08 AM PDT
# gpg: using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>" [unknown]
# gpg: aka "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E 86A1 561F 24C1 F19F 79FB
* tag 'pull-block-jobs-2024-04-29-v2' of https://gitlab.com/vsementsov/qemu:
iotests/pylintrc: allow up to 10 similar lines
iotests: add backup-discard-source
qapi: blockdev-backup: add discard-source parameter
block/copy-before-write: create block_copy bitmap in filter node
block/copy-before-write: support unligned snapshot-discard
block/copy-before-write: fix permission
blockcommit: Reopen base image as RO after abort
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This fixes a bug in that neither PLI nor PLDW are present in ARMv6T2,
but are introduced with ARMv7 and ARMv7MP respectively.
For clarity, do not use NOP for PLD.
Note that there is no PLDW (literal). Architecturally in the
T1 encoding of "PLD (literal)" bit 5 is "(0)", which means
that it should be zero and if it is not then the behaviour
is CONSTRAINED UNPREDICTABLE (might UNDEF, NOP, or ignore the
value of the bit).
In our implementation we have patterns for both:
+ PLD 1111 1000 -001 1111 1111 ------------ # (literal)
+ PLD 1111 1000 -011 1111 1111 ------------ # (literal)
and so we effectively ignore the value of bit 5. (This is a
permitted option for this CONSTRAINED UNPREDICTABLE.) This isn't a
behaviour change in this commit, since we previously had NOP lines
for both those patterns.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-3-richard.henderson@linaro.org
[PMM: adjusted commit message to note that PLD (lit) T1 bit 5
being 1 is an UNPREDICTABLE case.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some of the source files for older devices use hardcoded tabs
instead of our current coding standard's required spaces.
Fix these in the following files:
- hw/arm/boot.c
- hw/char/omap_uart.c
- hw/gpio/zaurus.c
- hw/input/tsc2005.c
This commit is mostly whitespace-only changes; it also
adds curly-braces to some 'if' statements.
This addresses part of https://gitlab.com/qemu-project/qemu/-/issues/373
but some other files remain to be handled.
Signed-off-by: Tanmay Patil <tanmaynpatil105@gmail.com>
Message-id: 20240508081502.88375-1-tanmaynpatil105@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Check the function index is in range and use an unsigned
variable to avoid the following warning with GCC 13.2.0:
[666/5358] Compiling C object libcommon.fa.p/hw_input_tsc2005.c.o
hw/input/tsc2005.c: In function 'tsc2005_timer_tick':
hw/input/tsc2005.c:416:26: warning: array subscript has type 'char' [-Wchar-subscripts]
416 | s->dav |= mode_regs[s->function];
| ~^~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240508143513.44996-1-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: fixed missing ')']
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In gic_cpu_read() and gic_cpu_write(), we delegate the handling of
reading and writing the Non-Secure view of the GICC_APR<n> registers
to functions gic_apr_ns_view() and gic_apr_write_ns_view().
Unfortunately we got the order of the arguments wrong, swapping the
CPU number and the register number (which the compiler doesn't catch
because they're both integers).
Most guests probably didn't notice this bug because directly
accessing the APR registers is typically something only done by
firmware when it is doing state save for going into a sleep mode.
Correct the mismatched call arguments.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: 51fd06e0ee ("hw/intc/arm_gic: Fix handling of GICC_APR<n>, GICC_NSAPR<n> registers")
Signed-off-by: Andrey Shumilin <shum.sdl@nppct.ru>
[PMM: Rewrote commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée<alex.bennee@linaro.org>
The value of the mp-affinity property being set in npcm7xx_realize is
always the same as the default value it would have when arm_cpu_realizefn
is called if the property is not set here. So there is no need to set
the property value in npcm7xx_realize function.
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240504141733.14813-1-dorjoychy111@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We wrongly encoded ID_AA64PFR1_EL1 using {3,0,0,4,2} in hvf_sreg_match[] so
we fail to get the expected ARMCPRegInfo from cp_regs hash table with the
wrong key.
Fix it with the correct encoding {3,0,0,4,1}. With that fixed, the Linux
guest can properly detect FEAT_SSBS2 on my M1 HW.
All DBG{B,W}{V,C}R_EL1 registers are also wrongly encoded with op0 == 14.
It happens to work because HVF_SYSREG(CRn, CRm, 14, op1, op2) equals to
HVF_SYSREG(CRn, CRm, 2, op1, op2), by definition. But we shouldn't rely on
it.
Cc: qemu-stable@nongnu.org
Fixes: a1477da3dd ("hvf: Add Apple Silicon support")
Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Reviewed-by: Alexander Graf <agraf@csgraf.de>
Message-id: 20240503153453.54389-1-zenghui.yu@linux.dev
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add xlnx_dpdma_read_descriptor() and
xlnx_dpdma_write_descriptor() functions.
xlnx_dpdma_read_descriptor() combines reading a
descriptor from desc_addr by calling dma_memory_read()
and swapping the desc fields from guest memory order
to host memory order. xlnx_dpdma_write_descriptor()
performs similar actions when writing a descriptor.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: d3c6369a96 ("introduce xlnx-dpdma")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
[PMM: tweaked indent, dropped behaviour change for write-failure case]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We want to have similar QMP objects in different tests. Reworking these
objects to make common parts by calling some helper functions doesn't
seem good. It's a lot more comfortable to see the whole QAPI request in
one place.
So, let's increase the limit, to unblock further commit
"iotests: add backup-discard-source"
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Add a parameter that enables discard-after-copy. That is mostly useful
in "push backup with fleecing" scheme, when source is snapshot-access
format driver node, based on copy-before-write filter snapshot-access
API:
[guest] [snapshot-access] ~~ blockdev-backup ~~> [backup target]
| |
| root | file
v v
[copy-before-write]
| |
| file | target
v v
[active disk] [temp.img]
In this case discard-after-copy does two things:
- discard data in temp.img to save disk space
- avoid further copy-before-write operation in discarded area
Note that we have to declare WRITE permission on source in
copy-before-write filter, for discard to work. Still we can't take it
unconditionally, as it will break normal backup from RO source. So, we
have to add a parameter and pass it thorough bdrv_open flags.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240313152822.626493-5-vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Currently block_copy creates copy_bitmap in source node. But that is in
bad relation with .independent_close=true of copy-before-write filter:
source node may be detached and removed before .bdrv_close() handler
called, which should call block_copy_state_free(), which in turn should
remove copy_bitmap.
That's all not ideal: it would be better if internal bitmap of
block-copy object is not attached to any node. But that is not possible
now.
The simplest solution is just create copy_bitmap in filter node, where
anyway two other bitmaps are created.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240313152822.626493-4-vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
In case when source node does not have any parents, the condition still
works as required: backup job do create the parent by
block_job_create -> block_job_add_bdrv -> bdrv_root_attach_child
Still, in this case checking @perm variable doesn't work, as backup job
creates the root blk with empty permissions (as it rely on CBW filter
to require correct permissions and don't want to create extra
conflicts).
So, we should not check @perm.
The hack may be dropped entirely when transactional insertion of
filter (when we don't try to recalculate permissions in intermediate
state, when filter does conflict with original parent of the source
node) merged (old big series
"[PATCH v5 00/45] Transactional block-graph modifying API"[1] and it's
current in-flight part is "[PATCH v8 0/7] blockdev-replace"[2])
[1] https://patchew.org/QEMU/20220330212902.590099-1-vsementsov@openvz.org/
[2] https://patchew.org/QEMU/20231017184444.932733-1-vsementsov@yandex-team.ru/
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240313152822.626493-2-vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
If a blockcommit is aborted the base image remains in RW mode, that leads
to a fail of subsequent live migration.
How to reproduce:
$ virsh snapshot-create-as vm snp1 --disk-only
*** write something to the disk inside the guest ***
$ virsh blockcommit vm vda --active --shallow && virsh blockjob vm vda --abort
$ lsof /vzt/vm.qcow2
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
qemu-syst 433203 root 45u REG 253,0 1724776448 133 /vzt/vm.qcow2
$ cat /proc/433203/fdinfo/45
pos: 0
flags: 02140002 <==== The last 2 means RW mode
If the base image is in RW mode at the end of blockcommit and was in RO
mode before blockcommit, reopen the base BDS in RO.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20240404091136.129811-1-alexander.ivanov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Some, but not all error messages are of the form
Guest agent command failed, error was '<actual error message>'
For instance, command guest-exec can fail with an error message like
Guest agent command failed, error was 'Failed to execute child process “/bin/invalid-cmd42” (No such file or directory)'
Shorten this to just just the actual error message. The guest-exec
example becomes
Failed to execute child process “/bin/invalid-cmd42” (No such file or directory)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240514105829.729342-3-armbru@redhat.com>
[Superfluous #include "qapi/qmp/qerror.h" deleted]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
When guest-set-user-password's argument @password can't be converted
from UTF-8 to UTF-16, we report something like
Guest agent command failed, error was 'Invalid sequence in conversion input'
Improve this to
can't convert 'password' to UTF-16: Invalid sequence in conversion input
Likewise for argument @username, and guest-file-open argument @path,
even though I'm not sure you can actually get invalid input past the
QMP core there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240514105829.729342-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
this principle: they call qemu_save_device_state() and
qemu_loadvm_state(), which call error_report_err().
I wish I could clean this up now, but migration's error reporting is
too complicated (confused?) for me to mess with it.
Instead, I'm merely improving the error reported by
qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
QMP core from
An IO error has occurred
to
saving Xen device state failed
and
loading Xen device state failed
respectively.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-6-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
qmp_memsave() and qmp_pmemsave() report fwrite() error as
An IO error has occurred
Improve this to
writing memory to '<filename>' failed
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-5-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
vmdk_init_extent() reports blk_co_pwrite() failure to its caller as
An IO error has occurred
The errno code returned by blk_co_pwrite() is lost.
Improve this to
failed to write VMDK <what>: <description of errno>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
create_win_dump() and write_run report qemu_write_full() failure to
their callers as
An IO error has occurred
The errno set by qemu_write_full() is lost.
Improve this to
win-dump: failed to write header: <description of errno>
and
win-dump: failed to save memory: <description of errno>
This matches how dump.c reports similar errors.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-3-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
external_snapshot_action() reports bdrv_flush() failure to its caller
as
An IO error has occurred
The errno code returned by bdrv_flush() is lost.
Improve this to
Write to node '<device or node name>' failed: <description of errno>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
target/i386: Introduce X86Access and use for xsave and friends
linux-user/i386: Fix allocation and alignment of fp state in signal frame
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZT2GwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV87pQf9F/cmrKQG1mVWKmJd
# MI7l63lbxejdgAADv1nmro+oapCsJSaQeUSrYp904ydqJjVfBJkaoXfknGsvxrNA
# oW7nEuYt0sBKdaBUKhYpMOJ3ivfw7lVVMJmjNv9ngZRhW+WOoJrBHoleUkVLiM7D
# rxkMLL+LQ7BR9i0Lv1unorOkqUPGNOnEd45qRn6k1g/Qnqi8SNMzxFwO8+232u8m
# EG9un/oh4mKPyb5vSg3Y4JLg+yDKCRScBqBU1wcKFe1u+umBkv2BNcU+k62AJh1q
# bv8i1n+X/dFAd1aj0NEupi04EOZIof5m3T4YIWg7M4I94NiFWNZ18vgskkmiO+Mo
# 0KPd/A==
# =sYrE
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 26 May 2024 05:48:44 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-lu-20240526' of https://gitlab.com/rth7680/qemu: (28 commits)
target/i386: Pass host pointer and size to cpu_x86_{xsave,xrstor}
target/i386: Pass host pointer and size to cpu_x86_{fxsave,fxrstor}
target/i386: Pass host pointer and size to cpu_x86_{fsave,frstor}
target/i386: Convert do_xrstor to X86Access
target/i386: Convert do_xsave to X86Access
linux-user/i386: Honor xfeatures in xrstor_sigcontext
linux-user/i386: Fix allocation and alignment of fp state
linux-user/i386: Return boolean success from xrstor_sigcontext
linux-user/i386: Return boolean success from restore_sigcontext
linux-user/i386: Fix -mregparm=3 for signal delivery
linux-user/i386: Split out struct target_fregs_state
linux-user/i386: Replace target_fpstate_fxsave with X86LegacyXSaveArea
linux-user/i386: Remove xfeatures from target_fpstate_fxsave
linux-user/i386: Drop xfeatures_size from sigcontext arithmetic
target/i386: Add {hw,sw}_reserved to X86LegacyXSaveArea
target/i386: Add rbfm argument to cpu_x86_{xsave,xrstor}
target/i386: Split out do_xsave_chk
target/i386: Convert do_xrstor_* to X86Access
target/i386: Convert do_xsave_* to X86Access
tagret/i386: Convert do_fxsave, do_fxrstor to X86Access
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already validated the memory region in the course of
validating the signal frame. No need to do it again within
the helper function.
In addition, return failure when the header contains invalid
xstate_bv. The kernel handles this via exception handling
within XSTATE_OP within xrstor_from_user_sigframe.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already validated the memory region in the course of
validating the signal frame. No need to do it again within
the helper function.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already validated the memory region in the course of
validating the signal frame. No need to do it again within
the helper function.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For modern cpus, the kernel uses xsave to store all extra
cpu state across the signal handler. For xsave/xrstor to
work, the pointer must be 64 byte aligned. Moreover, the
regular part of the signal frame must be 16 byte aligned.
Attempt to mirror the kernel code as much as possible.
Use enum FPStateKind instead of use_xsave() and use_fxsr().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1648
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the structure definition from target/i386/cpu.h.
The only minor quirk is re-casting the sw_reserved
area to the OS specific struct target_fpx_sw_bytes.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is easily computed by advancing past the structure.
At the same time, replace the magic number "64".
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is subtracting sizeof(target_fpstate_fxsave) in
TARGET_FXSAVE_SIZE, then adding it again via &fxsave->xfeatures.
Perform the same computation using xstate_size alone.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This completes the 512 byte structure, allowing the union to
be removed. Assert that the structure layout is as expected.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This path is not required by user-only, and can in fact
be shared between xsave and xrstor.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the alignment fault from do_* to helper_*, as it need
not apply to usage from within user-only signal handling.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Build system and target/i386/translate.c cleanups
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZRy1gUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMTtQf/ZQskuqZyTrDhB/uVUT8oT5JNKQNS
# GbFSgDK7jDdBeU3UmoYrlx9vfFR/mH5cA88MlusUy0SjQBNo4onD725o6Vvum/LW
# DPe5ZyE34wvOasM7KXqJsD+2SttjaVjCXN4ip+E9WL5By2TWJgrk6IgTtvAhT9cd
# LWb5OEIInaq7ZiWz3EpjmGvZd0M4mxqXi5OeDvmoFyf38xElfbWZWbfhJv+H5L1X
# stivPBtUbXOzh63NL491hUYQtiAWlow8Qcnn7CYRflb6Vdd4QPK+6W8FX5KyU2eC
# bXRXloW7wjEAC9pyiVky1SCvtNg7AVFL+9kxwiGreoZfo+/IMA+NP6pGOg==
# =hpWy
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 25 May 2024 04:28:24 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
migration: remove unnecessary zlib dependency
meson: do not query modules before they are processed
tcg: include dependencies in static_library()
meson: remove unnecessary dependency
meson: remove unnecessary reference to libm
target/i386: remove aflag argument of gen_lea_v_seg
target/i386: clean up repeated string operations
target/i386: introduce gen_lea_ss_ofs
target/i386: use mo_stacksize more
target/i386: inline gen_add_A0_ds_seg
target/i386: split gen_ldst_modrm for load and store
target/i386: reg in gen_ldst_modrm is always OR_TMP0
target/i386: raze the gen_eob* jungle
target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop
target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
target/i386: avoid calling gen_eob_syscall before tb_stop
target/i386: document and group DISAS_* constants
target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
target/i386: cpu_load_eflags already sets cc_op
target/i386: remove unnecessary gen_update_cc_op before gen_eob*
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The dbus_display1_dep is not really used since all occurrences also
request gio independently. Just list the generated sources and drop
dbus_display1_dep.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination. This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT. Inline it in the latter.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The is_store argument of gen_ldst_modrm has only ever been passed
a constant. Just split the function in two.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes. Now that these have been converted to the new decoder,
remove the argument.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is an invariant now that there are no calls to gen_eob_inhibit_irq()
outside tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sti only has one exit, so it does not need to generate the
end-of-translation code inline. It can be deferred to tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline. It can be
deferred to tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last. Add comments explaining the differences
and usage.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:
* it results in translations that are a tiny bit smaller
* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten
* anyway the cost is probably dwarfed by that of computing flags.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No need to set it again at the end of the translation block, cc_op_dirty
can be set to false.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is already handled in gen_eob(). Before adding another DISAS_*
case, remove the double calls.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.
It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper. But
let's clean it up.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel SDM 18.3.1.4 "If an occurrence of the MOV or POP instruction
loads the SS register executes with EFLAGS.TF = 1, no single-step debug
exception occurs following the MOV or POP instruction."
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The point of CPU_CFLAGS is really just to select the appropriate multilib,
for example for library linking tests, and -mcx16 is not needed for
that purpose.
Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
instruction set, it should be applied to cross-compiled x86_64 code too;
it is plausible that tests/tcg would want to cover cmpxchg16b as well,
for example. In the end this makes just as much sense as a per sub-build
tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.
This leaves out contrib/plugins, which would fail when attempting to use
__sync_val_compare_and_swap_16 (note it does not do yet); while minor,
this *is* a disadvantage of this change. But building contrib/plugins
with a Makefile instead of meson.build is something self-inflicted just
for the sake of showing that it can be done, and if this kind of papercut
started becoming a problem we could make the directory part of the meson
build. Until then, we can live with the limitation.
Signed-off-by: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
Message-ID: <20240523051118.29367-1-artyomkunakovsky@gmail.com>
[rewrite commit message, remove from configure. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*** NOTE ***
This replaces the previous PR for tags/pull-ppc-for-9.1-1-20240524
* Fix an interesting TLB invalidate race
* Implement more instructions with decodetree
* Add the POWER8/9/10 BHRB facility
* Add missing instructions, registers, SMT support
* First round of a big MMU xlate cleanup
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmZP1bsACgkQZ7MCdqhi
# HK7TuQ/7BQugpF2yOYroQmo0Yl4RPfFp6ACqfYQgehcGegg3SWpEselTeOJla3G9
# UyVd0mlWf7DciYi61qit/WyLOeuRXMtRjrnFLV2wz9o7D/Ey5/aLQfUL4oCDt/i2
# hmmq3ZAcr7WWxaz338pLJx9gIVjaNiqSoRz9HgHNkQq0pxkbEo1eSjZ6QLSvqYC2
# dwtJHywFrHNo14aq1Nc7PZ5MFxNN6t7hm7KRHKFrt8Obar15n64MSHyRvMzHI9EO
# RgNzz9/qe5yvJ4kmaNiZjntxojXCBUhhlCTtaDIG1LDBc2yNG5VWQUnwThvyNxxX
# h+Ia4Pv7blXikQ6RuqsvFyrLCgUvwXwBiQwiQCJyITk0asLyJVwhkUpiI/jJvOun
# AujSA/6e2pbSe4RUZytkzygx2KVODrVtcSoOvo8kRw+2aTOWMv7DbfBalmWJQWgx
# 0xSeuUz22eNKEL2XbZWNM5v0OgXUXIs9BVeCqn7RB4lC2RNi72v111UPuKYq6Ijx
# SHWQMGPGu9FNBsIdriclRWXVXHpVHz/s/l8AJT8ad6E57UHVk5zCPrbFZFImvQkL
# E7xlctijeST8V5qGyBPG3M4aPoER9+6J32ORSx7KwDwr+fzkbNUXC8UUC4OjAZ+d
# 2vhie9Vs5xWq/E8gGovTymeQ4yHArobDz/j7+rrr0qeppnKLWjM=
# =jHL7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 23 May 2024 04:48:11 PM PDT
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.1-1-20240524-1' of https://gitlab.com/npiggin/qemu: (72 commits)
target/ppc: Remove pp_check() and reuse ppc_hash32_pp_prot()
target/ppc: Move out BookE and related MMU functions from mmu_common.c
target/ppc: Add a function to check for page protection bit
target/ppc/mmu-radix64.c: Drop a local variable
target/ppc/mmu-hash32.c: Drop a local variable
target/ppc: Split off common embedded TLB init
target/ppc: Remove id_tlbs flag from CPU env
target/ppc: Move mmu_ctx_t type to mmu_common.c
target/ppc: Transform ppc_jumbo_xlate() into ppc_6xx_xlate()
target/ppc: Split off 40x cases from ppc_jumbo_xlate()
target/ppc: Split off real mode handling from get_physical_address_wtlb()
target/ppc: Simplify ppc_booke_xlate() part 2
target/ppc: Simplify ppc_booke_xlate() part 1
target/ppc: Split off BookE handling from ppc_jumbo_xlate()
target/ppc: Remove BookE from direct store handling
target/ppc: Don't use mmu_ctx_t in mmubooke206_get_physical_address()
target/ppc: Don't use mmu_ctx_t in mmubooke_get_physical_address()
target/ppc: Don't use mmu_ctx_t for mmu40x_get_physical_address()
target/ppc: Replace hard coded constants in ppc_jumbo_xlate()
target/ppc: Deindent ppc_jumbo_xlate()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ppc_hash32_pp_prot() function in mmu-hash32.c is the same as
pp_check() in mmu_common.c, merge these to remove duplicated code.
Define the common function as static lnline otherwise exporting the
function from mmu-hash32.c would stop the compiler inlining it which
results in slightly lower performance.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
[np: move ppc_hash32_pp_prot inline without changing it]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add a new mmu-booke.c file for BookE and related MMU bits from
mmu_common.c.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Checking if a page protection bit is set for a given access type is a
common operation. Add a function to avoid repeating the same check at
multiple places. As this relies on access type and page protection bit
values having certain relation also add an assert to ensure that this
assumption holds.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The value is only used once so no need to introduce a local variable
for it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In ppc_hash32_xlate() the value of need_prop is checked in two places
but precalculating it does not help because when we reach the first
check we always return and not reach the second place so the value
will only be used once. We can drop the local variable and calculate
it when needed, which makes these checks using it similar to other
places with such checks.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Several 4xx CPUs and e200 share the same TLB settings enclosed in an
ifdef. Split it off in a common function to reduce code duplication
and the number of ifdefs.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This flag for split instruction/data TLBs is only set for 6xx soft TLB
MMU model and not used otherwise so no need to have a separate flag
for that.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Remove mmu_ctx_t definition from internal.h as this type is only used
within mmu_common.c.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Now that only 6xx cases left in ppc_jumbo_xlate() we can change it
to ppc_6xx_xlate() also removing get_physical_address_wtlb().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce ppc_40x_xlate() to split off 40x handlning leaving only 6xx
in ppc_jumbo_xlate() now.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add ppc_real_mode_xlate() to handle real mode translation and allow
removing this case from ppc_jumbo_xlate().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Merge the code fetch and data access cases in a common switch.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Move setting error_code that appears in every case out in front and
hoist the common fall through case for BOOKE206 as well which allows
removing the nested switches.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce ppc_booke_xlate() to handle BookE and BookE 2.06 cases to
reduce ppc_jumbo_xlate() further.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
As BookE never returns -4 we can drop BookE from the direct store case
in ppc_jumbo_xlate().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
mmubooke206_get_physical_address() only uses the raddr and prot fields
from mmu_ctx_t. Pass these directly instead of using a ctx struct.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
mmubooke_get_physical_address() only uses the raddr and prot fields
from mmu_ctx_t. Pass these directly instead of using a ctx struct.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
mmu40x_get_physical_address() only uses the raddr and prot fields from
mmu_ctx_t. Pass these directly instead of using a ctx struct.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The "2" in booke206_update_mas_tlb_miss() call corresponds to
MMU_INST_FETCH which is the value of access_type in this branch;
mmubooke206_esr() only checks for MMU_DATA_STORE and it's called from
code access so using MMU_DATA_LOAD here seems wrong so replace it with
access_type here as well that yields the same result. This also makes
these calls the same as the data access branch further down.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Instead of putting a large block of code in an if, invert the
condition and return early to be able to deindent the code block.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This function just does two assignments and and unnecessary check that
is always true so inline it in the only caller left and remove it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The real mode handling is identical in the remaining switch cases.
Split off these common real mode cases into a separate conditional to
leave only the else branches in the switch that are different.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
BookE does not have real mode so split off and handle it first in
get_physical_address_wtlb() before checking for real mode for other
MMU models.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Return directly, which is simpler than dragging a return value through
multpile if and else blocks.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Move the debug logging within ppc6xx_tlb_check() from after its only
call to simplify the caller.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmu6xx_get_physical_address() we have a large if block with a two
line else branch that effectively returns. Invert the condition and
move the else there to allow deindenting the large if block to make
the flow easier to follow.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Repurpose get_segment_6xx_tlb() to do the whole address translation
for POWERPC_MMU_SOFT_6xx MMU model by moving the BAT check there and
renaming it to match other similar functions. These are only called
once together so no need to keep these separate functions and
combining them simplifies the caller allowing further restructuring.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Drop MPC8xx cases from get_physical_address_wtlb() and ppc_jumbo_xlate().
The default case would still catch this and abort the same way and
there is still a warning about it in ppc_tlb_invalidate_all() which is
called in ppc_cpu_reset_hold() so likely we never get here but to make
sure add a case to ppc_xlate() to the same effect.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In get_physical_address_wtlb() the real_mode flag depends on either
the MSR[IR] or MSR[DR] bit depending on access_type. Extract just the
needed bit in a more straight forward way instead of doing unnecessary
computation.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmubooke_check_tlb() and mmubooke206_check_tlb() we can assign the
value of prot2 directly to the destination, no need to have a separate
local variable for it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmubooke_check_tlb() and mmubooke206_check_tlb() prot2 is
calculated first but only used after an unrelated check that can
return before tha value is used. Move the calculation after the check,
closer to where it is used, to keep them together and avoid computing
it when not needed.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The helper_rac function is defined but not used, remove it.
Fixes: 005b69fdcc (target/ppc: Remove PowerPC 601 CPUs)
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
I think it's use was removed by
Commit 5883d8b296 ("mmu-hash*: Don't use full ppc_hash{32,
64}_translate() path for get_phys_page_debug()")
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
msgsnd has a broadcast mode that sends hypervisor doorbells to all
threads belonging to the same core as the target. A "subcore" mode
sends to all or one thread depending on 1LPAR mode.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This implements the POWER SPRC/SPRD SPRs, and SCRATCH0-7 registers that
can be accessed via these indirect SPRs.
SCRATCH registers only provide storage, but they are used by firmware
for low level crash and progress data, so this implementation logs
writes to the registers to help with analysis.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
LDBAR, TTR are a Power-specific SPRs. These simple implementations
are enough for IBM proprietary firmware for now.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
AMOR, MMCRC, HRMOR, TSCR, HMEER, RPR SPRs are per-core or per-LPAR
registers with simple (generic) implementations.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
An SPR can be either per-thread, per-core, or per-LPAR. Per-LPAR means
per-thread or per-core, depending on 1LPAR mode.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
attn is an implementation-specific instruction that on POWER (and G5/
970) can be enabled with a HID bit (disabled = illegal), and executing
it causes the host processor to stop and the service processor to be
notified. Generally used for debugging.
Implement attn and make it checkstop the system, which should be good
enough for QEMU debugging.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Change the logging not to print to stderr as well, because a
checkstop is a guest error (or perhaps a simulated machine error)
rather than a QEMU error, so send it to the log.
Update the checkstop message, and log CPU registers too.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
checkstop state does not halt the system, interrupts continue to be
serviced, and other CPUs run. Make it stop the machine with
qemu_system_guest_panicked.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use DEF_MEMOP() consistently in larx and stcx. generation, and apply it
once when it's used rather than where the macros are expanded, to reduce
typing.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add support for the clrbhrb and mfbhrbe instructions.
Since neither instruction is believed to be critical to
performance, both instructions were implemented using helper
functions.
Access to both instructions is controlled by bits in the
HFSCR (for privileged state) and MMCR0 (for problem state).
A new function, helper_mmcr0_facility_check, was added for
checking MMCR0[BHRBA] and raising a facility_unavailable exception
if required.
NOTE: For P8 and P9, due to a performance issue, branch history will
not be kept, but the instructions will be allowed to execute
as normal with the exception that the mfbhrbe instruction will
always return a zero value.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This commit continues adding support for the Branch History
Rolling Buffer (BHRB) as is provided starting with the P8
processor and continuing with its successors. This commit
is limited to the recording and filtering of taken branches.
The following changes were made:
- Enabled functionality on P10 processors only due to
performance impact seen with P8 and P9 where it is not
disabled for non problem state branches.
- Added a BHRB buffer for storing branch instruction and
target addresses for taken branches
- Renamed gen_update_cfar to gen_update_branch_history and
added a 'target' parameter to hold the branch target
address and 'inst_type' parameter to use for filtering
- Added TCG code to gen_update_branch_history that stores
data to the BHRB and updates the BHRB offset.
- Added BHRB resource initialization and reset functions
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This commit is preparatory to the addition of Branch History
Rolling Buffer (BHRB) functionality, which is being provided
today starting with the P8 processor.
BHRB uses several SPR register fields to control whether or not
a branch instruction's address (and sometimes target address)
should be recorded. Checking each of these fields with each
branch instruction using jitted code would lead to a significant
decrease in performance.
Therefore, it was decided that BHRB configuration bits that are
not expected to change frequently should have their state summarized
in an hflag so that the amount of checking done by jitted code can
be reduced.
This commit contains the changes for summarizing the state of the
following register fields in the HFLAGS_BHRB_ENABLE hflag:
MMCR0[FCP] - Determines if BHRB recording is frozen in the
problem state
MMCR0[FCPC] - A modifier for MMCR0[FCP]
MMCRA[BHRBRD] - Disables all BHRB recording for a thread
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
v{max, min}{u, s}{b, h, w, d} : VX-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification:
v{and, andc, nand, or, orc, nor, xor, eqv} : VX-form
The changes were verified by validating that the tcp ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
{l,st}ve{b,h,w}x,
{l,st}v{x,xl},
lvs{l,r} : X-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the below instructions to decodetree specification :
andi[s]., {ori, xori}[s] : D-form
{and, andc, nand, or, orc, nor, xor, eqv}[.],
exts{b, h, w}[.], cnt{l, t}z{w, d}[.],
popcnt{b, w, d}, prty{w, d}, cmp, bpermd : X-form
With this patch, all the fixed-point logical instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
cmp{rb, eqb}, t{w, d} : X-form
t{w, d}i : D-form
isel : A-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also for CMPRB, following review comments :
Replaced repetition of arithmetic right shifting (tcg_gen_shri_i32) followed
by extraction of last 8 bits (tcg_gen_ext8u_i32) with extraction of the required
bits using offsets (tcg_gen_extract_i32).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the below instructions to decodetree specification :
divd[u, e, eu][o][.] : XO-form
mod{sd, ud} : X-form
With this patch, all the fixed-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is
now used to divide doubleword operands as well, and not just words.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree :
mul{ld, ldo, hd, hdu}[.] : XO-form
madd{hd, hdu, ld} : VA-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op'
flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the below instructions to decodetree specification :
neg[o][.] : XO-form
mod{sw, uw}, darn : X-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
divw[u, e, eu][o][.] : XO-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The handler methods for divw[u] instructions internally use Rc(ctx->opcode),
for extraction of Rc field of instructions, which poses a problem if we move
the above said instructions to decodetree, as the ctx->opcode field is not
popluated in decodetree. Hence, making it decodetree compatible, so that the
mentioned insns can be safely move to decodetree specs.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
mulli : D-form
mul{lw, lwo, hw, hwu}[.] : XO-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Also cleaned up code for mullw[o][.] as per review comments while
keeping the logic of the tcg ops generated semantically same.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This patch moves the below instructions to decodetree specification :
f{add, sub, mul, div, re, rsqrte, madd, msub, nmadd, nmsub}[s][.] : A-form
ft{div, sqrt} : X-form
With this patch, all the floating-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This patch merges the definitions of the following set of fpu helper methods,
which are similar, using macros :
1. f{add, sub, mul, div}(s)
2. fre(s)
3. frsqrte(s)
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER10 adds a new field to sync for store-store syncs, and some
new variants of the existing syncs that include persistent memory.
Implement the store-store syncs and plwsync/phwsync.
Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Memory barriers are supposed to do something on BookE systems, these
were probably just missed during MTTCG enablement, maybe no targets
support SMP. Either way, add proper BookE implementations.
Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This tries to faithfully reproduce the odd BookE logic. Note the
e206 check in gen_msync_4xx() is always false, so not carried over.
It does change the handling of non-zero reserved bits outside the
defined fields from being illegal to being ignored, which the
architecture specifies ot help with backward compatibility of new
fields. The existing behaviour causes illegal instruction exceptions
when using new POWER10 sync variants that add new fields, after this
the instructions are accepted and are implemented as supersets of
the new behaviour, as intended.
Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Some TLB flush operations can flush other CPUs. The problem with this
is they used non-synced variants of flushes (i.e., that return
before the destination has completed the flush). Since all TLB flush
users need the _synced variants, and that last user (ppc) of the
non-synced flush was buggy, this is a footgun waiting to go off. There
do not seem to be any callers that flush other CPUs, so remove the
capability.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
These are no longer used.
tlb_flush_all_cpus: removed by previous commit.
tlb_flush_page_all_cpus: removed by previous commit.
tlb_flush_page_bits_by_mmuidx_all_cpus: never used.
tlb_flush_page_by_mmuidx_all_cpus: never used.
tlb_flush_page_bits_by_mmuidx_all_cpus: never used, thus:
tlb_flush_range_by_mmuidx_all_cpus: never used.
tlb_flush_by_mmuidx_all_cpus: never used.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
With mttcg, broadcast tlbie instructions do not wait until other vCPUs
have been kicked out of TCG execution before they complete (including
necessary subsequent tlbsync, etc., instructions). This is contrary to
the ISA, and it permits other vCPUs to use translations after the TLB
flush. For example:
CPU0
// *memP is initially 0, memV maps to memP with *pte
*pte = 0;
ptesync ; tlbie ; eieio ; tlbsync ; ptesync
*memP = 1;
CPU1
assert(*memV == 0);
It is possible for the assertion to fail because CPU1 translates memV
using the TLB after CPU0 has stored 1 to the underlying memory. This
race was observed with a careful test case where CPU1 checks run in a
very large expensive TB so it can run for the entire CPU0 period between
clearing the pte and storing the memory, but host vCPU thread preemption
could cause the race to hit anywhere.
As explained in commit 4ddc104689 ("target/ppc: Fix tlbie"), it is not
enough to just use tlb_flush_all_cpus_synced(), because that does not
execute until the calling CPU has finished its TB. It is also required
that the TB is ended at the point where the TLB flush must subsequently
take effect.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ibm,pi-features property has a bit to say whether or not
msgsndp should be used. Linux checks if it is being run under
KVM and avoids msgsndp anyway, but it would be preferable to
rely on this bit.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
interrupts and TLB misses and is quite costly. Running the
kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
TLB and spends a lot of time in TLB and page table walking code. The
test takes 67 seconds to complete with a lot of time being spent in
code related to finding the vhyp class:
12.01% [.] g_str_hash
8.94% [.] g_hash_table_lookup
8.06% [.] object_class_dynamic_cast
6.21% [.] address_space_ldq
4.94% [.] __strcmp_avx2
4.28% [.] tlb_set_page_full
4.08% [.] address_space_translate_internal
3.17% [.] object_class_dynamic_cast_assert
2.84% [.] ppc_radix64_xlate
Keep a pointer to the class and avoid this lookup. This reduces the
execution time to 40 seconds.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/i386/pc_sysfw: Alias rather than copy isa-bios region
* target/i386: add control bits support for LAM
* target/i386: tweaks to new translator
* target/i386: add support for LAM in CPUID enumeration
* hw/i386/pc: Support smp.modules for x86 PC machine
* target-i386: hyper-v: Correct kvm_hv_handle_exit return value
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZOMlAUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTSwf8DOPgipepNcsxUQoV9nOBfNXqEWa6
# DilQGwuu/3eMSPITUCGKVrtLR5azwCwvNfYYErVBPVIhjImnk3XHwfKpH1csadgq
# 7Np8WGjAyKEIP/yC/K1VwsanFHv3hmC6jfcO3ZnsnlmbHsRINbvU9uMlFuiQkKJG
# lP/dSUcTVhwLT6eFr9DVDUnq4Nh7j3saY85pZUoDclobpeRLaEAYrawha1/0uQpc
# g7MZYsxT3sg9PIHlM+flpRvJNPz/ZDBdj4raN1xo4q0ET0KRLni6oEOVs5GpTY1R
# t4O8a/IYkxeI15K9U7i0HwYI2wVwKZbHgp9XPMYVZFJdKBGT8bnF56pV9A==
# =lp7q
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 10:58:40 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
target-i386: hyper-v: Correct kvm_hv_handle_exit return value
i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[4]
i386: Add cache topology info in CPUCacheInfo
hw/i386/pc: Support smp.modules for x86 PC machine
tests: Add test case of APIC ID for module level parsing
i386/cpu: Introduce module-id to X86CPU
i386: Support module_id in X86CPUTopoIDs
i386: Expose module level in CPUID[0x1F]
i386: Support modules_per_die in X86CPUTopoInfo
i386: Introduce module level cpu topology to CPUX86State
i386/cpu: Decouple CPUID[0x1F] subleaf with specific topology level
i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB]
i386/cpu: Introduce bitmap to cache available CPU topology levels
i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
i386/cpu: Use APIC ID info get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
i386/cpu: Use APIC ID info to encode cache topo in CPUID[4]
i386/cpu: Fix i/d-cache topology to core level for Intel CPU
target/i386: add control bits support for LAM
target/i386: add support for LAM in CPUID enumeration
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
pull-loongarch-20240523
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZk6fPgAKCRBAov/yOSY+
# 35rwA/98G/tODhR2PAl7qZr6+6z8vazkiT4iNNHgxnw/T2TKsh2YONe+2gtKhTa1
# HKYANMykWTxOtBZeCYY9Z5QNj8DuC3xKc1zY1pC1AwRcflsMlGz0WoAC78Gbl9TC
# PBCwyu01hsFoYpIstH/dOGbNsR2OFRLnnGUVFUKtPuS3O+59hg==
# =OzUv
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 06:43:26 PM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240523' of https://gitlab.com/gaosong/qemu:
hw/loongarch/virt: Fix FDT memory node address width
target/loongarch: Add loongarch vector property unconditionally
hw/loongarch: Remove minimum and default memory size
hw/loongarch: Refine system dram memory region
hw/loongarch: Refine fwcfg memory map
hw/loongarch: Refine fadt memory table for numa memory
hw/loongarch: Refine acpi srat table for numa memory
hw/loongarch: Add VM mode in IOCSR feature register in kvm mode
target/loongarch/kvm: fpu save the vreg registers high 192bit
target/loongarch/kvm: Fix VM recovery from disk failures
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When passing disassembly data to plugin callbacks,
translator_st_len relies on db->tb->size having been set.
Fixes: 4c833c60e0 ("disas: Use translator_st to get disassembly data")
Reported-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Currently LSX/LASX vector property is decided by the default value.
Instead vector property should be added unconditionally, and it is
irrelative with its default value. If vector is disabled by default,
vector also can be enabled from command line.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240521080549.434197-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Some qtest test cases such as numa use default memory size of generic
machine class, which is 128M by fault.
Here generic default memory size is used, and also remove minimum memory
size which is 1G originally.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-6-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
For system dram memory region, it is not necessary to use numa node
information. There is only low memory region and high memory region.
Remove numa node information for ddr memory region here, it can reduce
memory region number on LoongArch virt machine.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-5-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Memory map table for fwcfg is used for UEFI BIOS, UEFI BIOS uses the first
entry from fwcfg memory map as the first memory HOB, the second memory HOB
will be used if the first memory HOB is used up.
Memory map table for fwcfg does not care about numa node, however in
generic the first memory HOB is part of numa node0, so that runtime
memory of UEFI which is allocated from the first memory HOB is located
at numa node0.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-4-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
One LoongArch virt machine platform, there is limitation for memory
map information. The minimum memory size is 256M and minimum memory
size for numa node0 is 256M also. With qemu numa qtest, it is possible
that memory size of numa node0 is 128M.
Limitations for minimum memory size for both total memory and numa
node0 is removed for fadt numa memory table creation.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
One LoongArch virt machine platform, there is limitation for memory
map information. The minimum memory size is 256M and minimum memory
size for numa node0 is 256M also. With qemu numa qtest, it is possible
that memory size of numa node0 is 128M.
Limitations for minimum memory size for both total memory and numa
node0 is removed for acpi srat table creation.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Migration pull request
- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net USO fix
- A couple of migration-test fixes from Thomas
# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt
# 5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk
# fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X
# 13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx
# raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL
# DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT
# aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt
# Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi
# UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok
# M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE
# fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7
# P45Dym1u4g==
# =DDrh
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 03:13:28 PM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D
* tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu:
tests/qtest/migration-test: Fix the check for a successful run of analyze-migration.py
tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too
hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
virtio-gpu: fix v2 migration
migration: fix a typo
migration: add "exists" info to load-state-field trace
migration/colo: Tidy up bql_unlock() around bdrv_activate_all()
migration/colo: make colo_incoming_co() return void
migration/colo: Minor fix for colo error message
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If analyze-migration.py cannot be run or crashes, the error is currently
ignored since the code only checks for nonzero values in case the child
exited properly. For example, if you run the test with a non-existing
Python interpreter, it still succeeds:
$ PYTHON=wrongpython QTEST_QEMU_BINARY=./qemu-system-x86_64 tests/qtest/migration-test
...
# Running /x86_64/migration/analyze-script
# Using machine type: pc-q35-9.1
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-417639.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-417639.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-9.1, -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-XPLUN2/src_serial -drive if=none,id=d0,file=/tmp/migration-test-XPLUN2/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -uuid 11111111-1111-1111-1111-111111111111 -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-417639.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-417639.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-9.1, -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-XPLUN2/dest_serial -incoming tcp:127.0.0.1:0 -drive if=none,id=d0,file=/tmp/migration-test-XPLUN2/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -accel qtest
**
ERROR:../../devel/qemu/tests/qtest/migration-test.c:1603:test_analyze_script: code should not be reached
migration-test: ../../devel/qemu/tests/qtest/libqtest.c:240: qtest_wait_qemu: Assertion `pid == s->qemu_pid' failed.
migration-test: ../../devel/qemu/tests/qtest/libqtest.c:240: qtest_wait_qemu: Assertion `pid == s->qemu_pid' failed.
ok 2 /x86_64/migration/analyze-script
...
Let's better fail the test in case the child did not exit properly, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
On s390x, we recently had a regression that broke migration / savevm
(see commit bebe9603fc ("hw/intc/s390_flic: Fix crash that occurs when
saving the machine state"). The problem was merged without being noticed
since we currently do not run any migration / savevm related tests on
x86 hosts.
While we currently cannot run all migration tests for the s390x target
on x86 hosts yet (due to some unresolved issues with TCG), we can at
least run some of the non-live tests to avoid such problems in the future.
Thus enable the "analyze-script" and the "bad_dest" tests before checking
for KVM on s390x or ppc64 (this also fixes the problem that the
"analyze-script" test was not run on s390x at all anymore since it got
disabled again by accident in a previous refactoring of the code).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Migration from an 8.2 or 9.0 binary to an 8.1 binary with machine
version 8.1 can fail with:
> kvm: Features 0x1c0010130afffa7 unsupported. Allowed features: 0x10179bfffe7
> kvm: Failed to load virtio-net:virtio
> kvm: error while loading state for instance 0x0 of device '0000:00:12.0/virtio-net'
> kvm: load of migration failed: Operation not permitted
The series
53da8b5a99 virtio-net: Add support for USO features
9da1684954 virtio-net: Add USO flags to vhost support.
f03e0cf63b tap: Add check for USO features
2ab0ec3121 tap: Add USO support to tap device.
only landed in QEMU 8.2, so the compatibility flags should be part of
machine version 8.1.
Moving the flags unfortunately breaks forward migration with machine
version 8.1 from a binary without this patch to a binary with this
patch.
Fixes: 53da8b5a99 ("virtio-net: Add support for USO features")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Commit dfcf74fa ("virtio-gpu: fix scanout migration post-load") broke
forward/backward version migration. Versioning of nested VMSD structures
is not straightforward, as the wire format doesn't have nested
structures versions. Introduce x-scanout-vmstate-version and a field
test to save/load appropriately according to the machine version.
Fixes: dfcf74fa ("virtio-gpu: fix scanout migration post-load")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
[fixed long lines]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Currently, it always returns 0, no need to check the return value at all.
In addition, enter colo coroutine only if migration_incoming_colo_enabled()
is true.
Once the destination side enters the COLO* state, the COLO process will
take over the remaining processes until COLO exits.
Cc: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
[fixed mangled author email address]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This bug fix addresses the incorrect return value of kvm_hv_handle_exit for
KVM_EXIT_HYPERV_SYNIC, which should be EXCP_INTERRUPT.
Handling of KVM_EXIT_HYPERV_SYNIC in QEMU needs to be synchronous.
This means that async_synic_update should run in the current QEMU vCPU
thread before returning to KVM, returning EXCP_INTERRUPT to guarantee this.
Returning 0 can cause async_synic_update to run asynchronously.
One problem (kvm-unit-tests's hyperv_synic test fails with timeout error)
caused by this bug:
When a guest VM writes to the HV_X64_MSR_SCONTROL MSR to enable Hyper-V SynIC,
a VM exit is triggered and processed by the kvm_hv_handle_exit function of the
QEMU vCPU. This function then calls the async_synic_update function to set
synic->sctl_enabled to true. A true value of synic->sctl_enabled is required
before creating SINT routes using the hyperv_sint_route_new() function.
If kvm_hv_handle_exit returns 0 for KVM_EXIT_HYPERV_SYNIC, the current QEMU
vCPU thread may return to KVM and enter the guest VM before running
async_synic_update. In such case, the hyperv_synic test’s subsequent call to
synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, ...) immediately after writing to
HV_X64_MSR_SCONTROL can cause QEMU’s hyperv_sint_route_new() function to return
prematurely (because synic->sctl_enabled is false).
If the SINT route is not created successfully, the SINT interrupt will not be
fired, resulting in a timeout error in the hyperv_synic test.
Fixes: 267e071bd6 (“hyperv: make overlay pages for SynIC”)
Suggested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Message-ID: <20240521200114.11588-1-dongsheng.x.zhang@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
processors sharing cache.
The number of logical processors sharing this cache is
NumSharingCache + 1.
After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[0x8000001D].EAX[bits 25:14].
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-22-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.
After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[4].EAX[bits 25:14].
And since with the helper max_processor_ids_for_cache(), the filed
CPUID[4].EAX[bits 25:14] (original virable "num_apic_ids") is parsed
based on cpu topology levels, which are verified when parsing -smp, it's
no need to check this value by "assert(num_apic_ids > 0)" again, so
remove this assert().
Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
helper to make the code cleaner.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-21-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.
This default general setting has caused a misunderstanding, that is, the
cache topology is completely equated with a specific cpu topology, such
as the connection between L2 cache and core level, and the connection
between L3 cache and die level.
In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.
Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.
Except legacy_l2_cache_cpuid2 (its default topo level is
CPU_TOPO_LEVEL_UNKNOW), explicitly set the corresponding topology level
for all other cache models. In order to be compatible with the existing
cache topology, set the CPU_TOPO_LEVEL_CORE level for the i/d cache, set
the CPU_TOPO_LEVEL_CORE level for L2 cache, and set the
CPU_TOPO_LEVEL_DIE level for L3 cache.
The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-20-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As module-level topology support is added to X86CPU, now we can enable
the support for the modules parameter on PC machines. With this support,
we can define a 5-level x86 CPU topology with "-smp":
-smp cpus=*,maxcpus=*,sockets=*,dies=*,modules=*,cores=*,threads=*.
So, add the 5-level topology example in description of "-smp".
Additionally, add the missed drawers and books options in previous
example.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-19-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add module_id member in X86CPUTopoIDs.
module_id can be parsed from APIC ID, so also update APIC ID parsing
rule to support module level. With this support, the conversions with
module level between X86CPUTopoIDs, X86CPUTopoInfo and APIC ID are
completed.
module_id can be also generated from cpu topology, and before i386
supports "modules" in smp, the default "modules per die" (modules *
clusters) is only 1, thus the module_id generated in this way is 0,
so that it will not conflict with the module_id generated by APIC ID.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-16-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
erroneous smp_num_siblings on Intel Hybrid platforms") is able to
handle platforms with Module level enumerated via CPUID.1F.
Expose the module level in CPUID[0x1F] if the machine has more than 1
modules.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-15-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Support module level in i386 cpu topology structure "X86CPUTopoInfo".
Since x86 does not yet support the "modules" parameter in "-smp",
X86CPUTopoInfo.modules_per_die is currently always 1.
Therefore, the module level width in APIC ID, which can be calculated by
"apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0 for
now, so we can directly add APIC ID related helpers to support module
level parsing.
In addition, update topology structure in test-x86-topo.c.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-14-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel CPUs implement module level on hybrid client products (e.g.,
ADL-N, MTL, etc) and E-core server products.
A module contains a set of cores that share certain resources (in
current products, the resource usually includes L2 cache, as well as
module scoped features and MSRs).
Module level support is the prerequisite for L2 cache topology on
module level. With module level, we can implement the Guest's CPU
topology and future cache topology to be consistent with the Host's on
Intel hybrid client/E-core server platforms.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-13-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At present, the subleaf 0x02 of CPUID[0x1F] is bound to the "die" level.
In fact, the specific topology level exposed in 0x1F depends on the
platform's support for extension levels (module, tile and die).
To help expose "module" level in 0x1F, decouple CPUID[0x1F] subleaf
with specific topology level.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-12-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[0xB] defines SMT, Core and Invalid types, and this leaf is shared
by Intel and AMD CPUs.
But for extended topology levels, Intel CPU (in CPUID[0x1F]) and AMD CPU
(in CPUID[0x80000026]) have the different definitions with different
enumeration values.
Though CPUID[0x80000026] hasn't been implemented in QEMU, to avoid
possible misunderstanding, split topology types of CPUID[0x1F] from the
definitions of CPUID[0xB] and introduce CPUID[0x1F]-specific topology
types.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-11-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, QEMU checks the specify number of topology domains to detect
if there's extended topology levels (e.g., checking nr_dies).
With this bitmap, the extended CPU topology (the levels other than SMT,
core and package) could be easier to detect without touching the
topology details.
This is also in preparation for the follow-up to decouple CPUID[0x1F]
subleaf with specific topology level.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In cpu_x86_cpuid(), there are many variables in representing the cpu
topology, e.g., topo_info, cs->nr_cores and cs->nr_threads.
Since the names of cs->nr_cores and cs->nr_threads do not accurately
represent its meaning, the use of cs->nr_cores or cs->nr_threads is
prone to confusion and mistakes.
And the structure X86CPUTopoInfo names its members clearly, thus the
variable "topo_info" should be preferred.
In addition, in cpu_x86_cpuid(), to uniformly use the topology variable,
replace env->dies with topo_info.dies_per_pkg as well.
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-9-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The commit 8f4202fb10 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
the number of sharing threads directly.
From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:
The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:
ShareId = LocalApicId >> log2(NumSharingCache+1)
Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.
From the description above, the calculation of this field should be same
as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
APIC ID to calculate this field.
[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
Information
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-8-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
nearest power-of-2 integer.
The nearest power-of-2 integer can be calculated by pow2ceil() or by
using APIC ID offset/width (like L3 topology using 1 << die_offset [3]).
But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with APIC ID. For example, in linux kernel, the field
"num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
matched with actual core numbers and it's calculated by:
"(1 << (pkg_offset - core_offset)) - 1".
Therefore the topology information of APIC ID should be preferred to
calculate nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
CPUID.04H:EAX[bits 31:26]:
1. d/i cache is shared in a core, 1 << core_offset should be used
instead of "cs->nr_threads" in encode_cache_cpuid4() for
CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. L2 cache is supposed to be shared in a core as for now, thereby
1 << core_offset should also be used instead of "cs->nr_threads" in
encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
calculated with the bit width between the package and SMT levels in
the APIC ID (1 << (pkg_offset - core_offset) - 1).
In addition, use APIC ID bits calculations to replace "pow2ceil()" for
cache_info_passthrough case.
[1]: efb3934adf ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
[2]: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
[3]: d65af288a8 ("i386: Update new x86_apicid parsing rules with die_offset support")
Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-7-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For i-cache and d-cache, current QEMU hardcodes the maximum IDs for CPUs
sharing cache (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits
25:14]) to 0, and this means i-cache and d-cache are shared in the SMT
level.
This is correct if there's single thread per core, but is wrong for the
hyper threading case (one core contains multiple threads) since the
i-cache and d-cache are shared in the core level other than SMT level.
For AMD CPU, commit 8f4202fb10 ("i386: Populate AMD Processor Cache
Information for cpuid 0x8000001D") has already introduced i/d cache
topology as core level by default.
Therefore, in order to be compatible with both multi-threaded and
single-threaded situations, we should set i-cache and d-cache be shared
at the core level by default.
This fix changes the default i/d cache topology from per-thread to
per-core. Potentially, this change in L1 cache topology may affect the
performance of the VM if the user does not specifically specify the
topology or bind the vCPU. However, the way to achieve optimal
performance should be to create a reasonable topology and set the
appropriate vCPU affinity without relying on QEMU's default topology
structure.
Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-6-zhao1.liu@intel.com>
[Add compat property. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
LAM uses CR3[61] and CR3[62] to configure/enable LAM on user pointers.
LAM uses CR4[28] to configure/enable LAM on supervisor pointers.
For CR3 LAM bits, no additional handling needed:
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check max physical address bits before calling
cpu_x86_update_cr3(), no change needed, i.e. CR3 LAM bits are not allowed
to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr3() to update cr3.
Allow gdb to set the LAM bit(s) to CR3, if vcpu doesn't support LAM,
KVM_SET_SREGS will fail as other reserved bits.
For CR4 LAM bit, its reservation depends on vcpu supporting LAM feature or
not.
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check CR4 reserved bit before calling cpu_x86_update_cr4(),
i.e. CR4 LAM bit is not allowed to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr4() to update cr4.
Mask out LAM bit on CR4 if vcpu doesn't support LAM.
- x86_cpu_reset_hold() doesn't need special handling.
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-3-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the -bios case the "isa-bios" memory region is an alias to the BIOS mapped
to the top of the 4G memory boundary. Do the same in the -pflash case, but only
for new machine versions for migration compatibility. This establishes common
behavior and makes pflash commands work in the "isa-bios" region which some
real-world legacy bioses rely on.
Note that in the sev_enabled() case, the "isa-bios" memory region in the -pflash
case will now also point to encrypted memory, just like it already does in the
-bios case.
When running `info mtree` before and after this commit with
`qemu-system-x86_64 -S -drive \
if=pflash,format=raw,readonly=on,file=/usr/share/qemu/bios-256k.bin` and running
`diff -u before.mtree after.mtree` results in the following changes in the
memory tree:
--- before.mtree
+++ after.mtree
@@ -71,7 +71,7 @@
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
- 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
+ 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff
00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff
00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff
@@ -108,7 +108,7 @@
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
- 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
+ 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff
00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff
00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff
@@ -131,11 +131,14 @@
memory-region: pc.ram
0000000000000000-0000000007ffffff (prio 0, ram): pc.ram
+memory-region: system.flash0
+ 00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0
+
memory-region: pci
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
- 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
+ 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
memory-region: smram
00000000000a0000-00000000000bffff (prio 0, ram): alias smram-low @pc.ram 00000000000a0000-00000000000bffff
Note that in both cases the "system" memory region contains the entry
00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0
but the "system.flash0" memory region only appears standalone when "isa-bios" is
an alias.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-7-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 32-bit AAM/AAD opcodes are using helpers that read and write flags and
env->regs[R_EAX]. Clean them up so that the table correctly includes AX
as a 16-bit input and output.
No real reason to do it to be honest, but they are nice one-output helpers
and it removes the masking of env->regs[R_EAX] that generic load/writeback
code already does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123912.608497-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_rot_carry and gen_rot_overflow are meant to be called with count == NULL
if the count cannot be zero. However this is not done in gen_ROL and gen_ROR,
and writing everywhere "can_be_zero ? count : NULL" is burdensome and less
readable. Just pass can_be_zero as a separate argument.
gen_RCL and gen_RCR use a conditional branch to skip the computation
if count is zero, so they can pass false unconditionally to gen_rot_overflow.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123914.608516-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vfio queue:
* Improvement of error reporting during migration
* Removed Vendor Specific Capability check on newer machine
* Addition of a VFIO migration QAPI event
* Changed prototype of routines using an error parameter to return bool
* Several cleanups regarding autofree variables
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmZNwDEACgkQUaNDx8/7
# 7KHaYQ/+MUFOiWEiAwJdP8I1DkY6mJV3ZDixKMHLmr8xH6fAkR2htEw6UUcYijcn
# Z0wVvcB7A1wetgIAB2EPc2o6JtRD1uEW2pPq3SVpdWO2rWYa4QLvldOiJ8A+Kvss
# 0ZugWirgZsM7+ka9TCuysmqWdQD+P6z2RURMSwiPi6QPHwv1Tt69gLSxFeV5WWai
# +mS6wUbaU3LSt6yRhORRvFkCss4je3D3YR73ivholGHANxi/7C5T22KwOHrW6Qzf
# uk3W/zq1yL1YLXSu6WoKPw0mMCvNtGyKK2oAlhG3Ln1tPYnctNrlfXlApqxEOGl3
# adGtwd6fyg6UTRR+vOXEy1QPCGcHtKWc5SuV5E677JftARJMwzbXrJw9Y9xS2RCQ
# oRYS5814k9RdubTxu+/l8NLICMdox7dNy//QLyrIdD7nJKYhFODkV1giWh4NWkt6
# m0T3PGLlUJ/V2ngWQu9Aw150m3lCPEKt+Nv/mGOEFDRu9dv55Vb7oJwr1dBB/n+e
# 1lNNpDmV0YipoKYMzrlBwNwxhXGJOtNPwHtw/vZuiy70CXUwo0t4XLMpWbWasxZc
# 0yz4O9RLRJEhPtPqv54aLsE2kNY10I8vwHBlhyNgIEsA7eCDduA+65aPBaqIF7z6
# GjvYdixF+vAZFexn0mDi1gtM3Yh60Hiiq1j7kKyyti/q0WUQzIc=
# =awMc
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 02:51:45 AM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20240522' of https://github.com/legoater/qemu: (47 commits)
vfio/igd: Use g_autofree in vfio_probe_igd_bar4_quirk()
vfio: Use g_autofree in all call site of vfio_get_region_info()
vfio/pci-quirks: Make vfio_add_*_cap() return bool
vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return bool
vfio/pci: Use g_autofree for vfio_region_info pointer
vfio/pci: Make capability related functions return bool
vfio/pci: Make vfio_populate_vga() return bool
vfio/pci: Make vfio_intx_enable() return bool
vfio/pci: Make vfio_populate_device() return a bool
vfio/pci: Make vfio_pci_relocate_msix() and vfio_msix_early_setup() return a bool
vfio/pci: Make vfio_intx_enable_kvm() return a bool
vfio/ccw: Make vfio_ccw_get_region() return a bool
vfio/platform: Make vfio_populate_device() and vfio_base_device_init() return bool
vfio/helpers: Make vfio_device_get_name() return bool
vfio/helpers: Make vfio_set_irq_signaling() return bool
vfio/helpers: Use g_autofree in vfio_set_irq_signaling()
vfio/display: Make vfio_display_*() return bool
vfio/display: Fix error path in call site of ramfb_setup()
backends/iommufd: Make iommufd_backend_*() return bool
vfio/cpr: Make vfio_cpr_register_container() return bool
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pointer opregion, host and lpc are allocated and freed in
vfio_probe_igd_bar4_quirk(). Use g_autofree to automatically
free them.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
There are some exceptions when pointer to vfio_region_info is reused.
In that case, the pointed memory is freed manually.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Include below functions:
vfio_add_virt_caps()
vfio_add_nv_gpudirect_cap()
vfio_add_vmd_shadow_cap()
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Pointer opregion is freed after vfio_pci_igd_opregion_init().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The functions operating on capability don't have a consistent return style.
Below functions are in bool-valued functions style:
vfio_msi_setup()
vfio_msix_setup()
vfio_add_std_cap()
vfio_add_capabilities()
Below two are integer-valued functions:
vfio_add_vendor_specific_cap()
vfio_setup_pcie_cap()
But the returned integer is only used for check succeed/failure.
Change them all to return bool so now all capability related
functions follow the coding standand in qapi/error.h to return
bool.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_populate_device() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h
Rules section.
By this chance, pass errp directly to vfio_populate_device() to
avoid calling error_propagate().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_pci_relocate_msix() and vfio_msix_early_setup() takes
an 'Error **' argument, best practices suggest to return a bool.
See the qapi/error.h Rules section.
By this chance, pass errp directly to vfio_msix_early_setup() to avoid
calling error_propagate().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_intx_enable_kvm() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h
Rules section.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_populate_device() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h
Rules section.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Local pointer irq_set is freed before return from
vfio_set_irq_signaling().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_display_dmabuf_init() and vfio_display_region_init() calls
ramfb_setup() without checking its return value.
So we may run into a situation that vfio_display_probe() succeed
but errp is set. This is risky and may lead to assert failure in
error_setv().
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: b290659fc3 ("hw/vfio/display: add ramfb support")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
GDB commit a207f6b3a38 ('Rewrite "python" command exception handling')
changed how exit() called from Python scripts loaded by GDB behave,
turning it into an exception instead of a generic error code that is
returned. This change caused several QEMU tests to crash with the
following exception:
Python Exception <class 'SystemExit'>: 0
Error occurred in Python: 0
This happens because in tests/guest-debug/test_gdbstub.py exit is
called after the tests have completed.
This commit fixes it by politely asking GDB to exit via gdb.execute,
passing the proper fail_count to be reported to 'make', instead of
abruptly calling exit() from the Python script.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240515173132.2462201-4-gustavo.romero@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This effectively reverts
commit 54c4ea8f3a
Author: Zhao Liu <zhao1.liu@intel.com>
Date: Sat Mar 9 00:01:37 2024 +0800
hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations
but is not done as a 'git revert' since the part of the changes to the
file hw/core/machine-smp.c which add 'has_XXX' checks remain desirable.
Furthermore, we have to tweak the subsequently added unit test to
account for differing warning message.
The rationale for the original deprecation was:
"Currently, it was allowed for users to specify the unsupported
topology parameter as "1". For example, x86 PC machine doesn't
support drawer/book/cluster topology levels, but user could specify
"-smp drawers=1,books=1,clusters=1".
This is meaningless and confusing, so that the support for this kind
of configurations is marked deprecated since 9.0."
There are varying POVs on the topic of 'unsupported' topology levels.
It is common to say that on a system without hyperthreading, that there
is always 1 thread. Likewise when new CPUs introduced a concept of
multiple "dies', it was reasonable to say that all historical CPUs
before that implicitly had 1 'die'. Likewise for the more recently
introduced 'modules' and 'clusters' parameter'. From this POV, it is
valid to set 'parameter=1' on the -smp command line for any machine,
only a value > 1 is strictly an error condition.
It doesn't cause any functional difficulty for QEMU, because internally
the QEMU code is itself assuming that all "unsupported" parameters
implicitly have a value of '1'.
At the libvirt level, we've allowed applications to set 'parameter=1'
when configuring a guest, and pass that through to QEMU.
Deprecating this creates extra difficulty for because there's no info
exposed from QEMU about which machine types "support" which parameters.
Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
a given machine type, or whether it will trigger deprecation messages.
Since there's no apparent functional benefit to deleting this deprecated
behaviour from QEMU, and it creates problems for consumers of QEMU,
remove this deprecation.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Message-ID: <20240513123358.612355-2-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
but the function belongs to a VMStateDescription that is attached to a
TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
user tries to save or migrate the VM state. Fix it by using s390_get_flic()
to get the correct device here instead.
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
Message-ID: <20240517061553.564529-1-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Run "make lcitool-refresh" after the previous changes to the
lcitool files. This removes the g++ and xfslibs-dev packages
from the dockerfiles (except for the fedora-win64-cross dockerfile
where we keep the C++ compiler).
Message-ID: <20240516084059.511463-6-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We don't need C++ for the normal QEMU builds anymore, so installing
g++ in each and every container seems to be a waste of time and disk
space. The only container that still needs it is the Fedora MinGW
container that builds the only remaining C++ code in ./qga/vss-win32/
and we can install it there with an extra project yml file instead.
Message-ID: <20240516084059.511463-4-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU's commit a5730b8bd3 ("block/file-posix: Simplify the
XFS_IOC_DIOINFO handling") removed the need for the 'xfsprogs'
package.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[thuth: Adjusted the patch from the lcitools repo to QEMU's repo]
Message-ID: <20240516084059.511463-3-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
The changed functions include:
iommufd_backend_connect
iommufd_backend_alloc_ioas
By this chance, simplify the functions a bit by avoiding duplicate
recordings, e.g., log through either error interface or trace, not
both.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
The changed functions include:
iommufd_cdev_kvm_device_add
iommufd_cdev_connect_and_bind
iommufd_cdev_attach_ioas_hwpt
iommufd_cdev_detach_ioas_hwpt
iommufd_cdev_attach_container
iommufd_cdev_get_info_iova_range
After the change, all functions in hw/vfio/iommufd.c follows the
standand.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Make VFIOIOMMUClass::add_window() and its wrapper function
vfio_container_add_section_window() return bool.
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Make VFIOIOMMUClass::attach_device() and its wrapper function
vfio_attach_device() return bool.
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Local pointer info is freed before return from
iommufd_cdev_get_info_iova_range().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Local pointer name is allocated before vfio_attach_device() call
and freed after the call.
Same for tmp when calling realpath().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move trace_vfio_migration_set_state() to the top of the function, add
recover_state to it, and add a new trace event to
vfio_migration_set_device_state().
This improves tracing of device state changes as state changes are now
also logged when vfio_migration_set_state() fails (covering recover
state and device reset transitions) and in no-op state transitions to
the same state.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
When migrating a VFIO device that supports pre-copy, it is transitioned
to STOP_COPY twice: once in vfio_vmstate_change() and second time in
vfio_save_complete_precopy().
The second transition is harmless, as it's a STOP_COPY->STOP_COPY no-op
transition. However, with the newly added VFIO migration QAPI event, the
STOP_COPY event is undesirably emitted twice.
Prevent this by returning early in vfio_migration_set_state() if
new_state is the same as current device state.
Note that the STOP_COPY transition in vfio_save_complete_precopy() is
essential for VFIO devices that don't support pre-copy, for migrating an
already stopped guest and for snapshots.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Emit VFIO migration QAPI event when a VFIO device changes its migration
state. This can be used by management applications to get updates on the
current state of the VFIO device for their own purposes.
A new per VFIO device capability, "migration-events", is added so events
can be enabled only for the required devices. It is disabled by default.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add a new QAPI event for VFIO migration. This event will be emitted when
a VFIO device changes its migration state, for example, during migration
or when stopping/starting the guest.
This event can be used by management applications to get updates on the
current state of the VFIO device for their own purposes.
Note that this new event is introduced since VFIO devices have a unique
set of migration states which cannot be described as accurately by other
existing events such as run state or migration status.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
In case of migration, during restore operation, qemu checks config space of the
pci device with the config space in the migration stream captured during save
operation. In case of config space data mismatch, restore operation is failed.
config space check is done in function get_pci_config_device(). By default VSC
(vendor-specific-capability) in config space is checked.
Due to qemu's config space check for VSC, live migration is broken across NVIDIA
vGPU devices in situation where source and destination host driver is different.
In this situation, Vendor Specific Information in VSC varies on the destination
to ensure vGPU feature capabilities exposed to the guest driver are compatible
with destination host.
If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with
volatile Vendor Specific Info in VSC then qemu should exempt config space check
for Vendor Specific Info. It is vendor driver's responsibility to ensure that
VSC is consistent across migration. Here consistency could mean that VSC format
should be same on source and destination, however actual Vendor Specific Info
may not be byte-to-byte identical.
This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI
device by clearing pdev->cmask[] offsets. Config space check is still enforced
for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips
config space check for that offset.
VSC check is skipped for machine types >= 9.1. The check would be enforced on
older machine types (<= 9.0).
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vinayak Kale <vkale@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_ccw_register_irq_notifier() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h Rules
section.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_ap_register_irq_notifier() takes and 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h Rules
section.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_save_complete_precopy() currently returns before doing the trace
event. Change that.
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Let the callers do the error reporting. Add documentation while at it.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Use vmstate_save_state_with_err() to improve error reporting in the
callers and store a reported error under the migration stream. Add
documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add an Error** argument to vfio_migration_set_state() and adjust
callers, including vfio_save_setup(). The error will be propagated up
to qemu_savevm_state_setup() where the save_setup() handler is
executed.
Modify vfio_vmstate_change_prepare() and vfio_vmstate_change() to
store a reported error under the migration stream if a migration is in
progress.
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Use it to update the current error of the migration stream if
available and if not, simply print out the error. Next changes will
update with an error to report.
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This allows to update the Error argument of the VFIO log_global_start()
handler. Errors for container based logging will also be propagated to
qemu_savevm_state_setup() when the ram save_setup() handler is executed.
Also, errors from vfio_container_set_dirty_page_tracking() are now
collected and reported.
The vfio_set_migration_error() call becomes redundant in
vfio_listener_log_global_start(). Remove it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
We will use the Error object to improve error reporting in the
.log_global*() handlers of VFIO. Add documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
plugin and testing updates
- don't duplicate options for microbit test
- don't spam the linux source tree when importing headers
- add STORE_U64 inline op to TCG plugins
- add conditional callback op to TCG plugins
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZFvCMACgkQ+9DbCVqe
# KkSrYQf/aj9+eCWCKZk3Hym0lT+qNKxUeNSx3juUN8h7iG1vkA1f/XaQle5XvKDr
# ROIdo8urcr8onJ4PBH+4C7VZhUmnpL8zLH80pCuuTkF03MCNhaW/5qJ67niWmPVM
# QJHVqNomkykKOMBh+WtD5M0m/BYPT5lsa10sE3bDH8ziGjp0An2v24R89tzYEXnf
# 1QePItQN5vzEvhrZj6oKWVmeucqLsqS6yqS8V3sEpmF0+zqNjGZlrI86A4SAp74k
# 8vuduVuRbeyki7zWBTOLUeoiuHM2Zmh7v74zm/Hc1ITBaDjWMwPctcI/vFjsrCI/
# yoFRhgrV87DtIZdkrJzk5qBYFOWoeQ==
# =znN0
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 16 May 2024 09:56:19 AM CEST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-may24-160524-2' of https://gitlab.com/stsquad/qemu:
plugins: remove op from qemu_plugin_inline_cb
plugins: extract cpu_index generate
plugins: distinct types for callbacks
tests/plugin/inline: add test for conditional callback
plugins: conditional callbacks
tests/plugin/inline: add test for STORE_U64 inline op
plugins: add new inline op STORE_U64
plugins: extract generate ptr for qemu_plugin_u64
plugins: prepare introduction of new inline ops
scripts/update-linux-header.sh: be more src tree friendly
tests/tcg: don't append QEMU_OPTS for armv6m-undef test
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Running "install_headers" in the Linux source tree is fairly
unfriendly as out-of-tree builds will start complaining about the
kernel source being non-pristine. As we have a temporary directory for
the install we should also do the build step here. So now we have:
$tmpdir/
$blddir/
$hdrdir/
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240514174253.694591-3-alex.bennee@linaro.org>
target/hppa:
- Use TCG_COND_TST where applicable.
- Use CF_BP_PAGE instead of a local breakpoint search.
- Clean up IAOQ handling during translation.
- Implement CF_PCREL.
- Implement PSW.B.
- Implement PSW.X.
- Log cpu state on interrupt and rfi.
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
# 7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
# obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
# WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
# TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
# NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
# SDxS8Q==
# =A4Qf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 11:38:04 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu: (43 commits)
target/hppa: Log cpu state on return-from-interrupt
target/hppa: Log cpu state at interrupt
target/hppa: Implement CF_PCREL
target/hppa: Adjust priv for B,GATE at runtime
target/hppa: Drop tlb_entry return from hppa_get_physical_address
target/hppa: Implement PSW_X
target/hppa: Implement PSW_B
target/hppa: Manage PSW_X and PSW_B in translator
target/hppa: Split PSW X and B into their own field
target/hppa: Improve hppa_cpu_dump_state
target/hppa: Do not mask in copy_iaoq_entry
target/hppa: Store full iaoq_f and page offset of iaoq_b in TB
linux-user/hppa: Force all code addresses to PRIV_USER
target/hppa: Use delay_excp for conditional trap on overflow
target/hppa: Use delay_excp for conditional traps
target/hppa: Introduce DisasDelayException
target/hppa: Remove cond_free
target/hppa: Use TCG_COND_TST* in trans_ftest
target/hppa: Use registerfields.h for FPSR
target/hppa: Use TCG_COND_TST* in trans_bb_imm
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
accel/tcg: Improve disassembly for target and plugin
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEXT0dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/FbQf+P3ppcAA+5smxaQyi
# dsfCJaGOMqRTWYuSmNsJ7AlxQobxLKVsJrAHraNU1AnDfwKrX3XXJcU4Gwt0eQyN
# lGiF/24KLElvb+w6fkjuLdK+DbGWTrNabXJAnBw1h21x+go0mvVCVSuQQw7a/RDS
# btPnGkmoi0H340JC1MVSDRgFkB3RV0kOMXGGm70S+mw0WhjVgdInhLv0jjnj2QFM
# tYzJ5g+00v0HPo8Lun5kRSaI7EGG7J/XfGa71WHIHrB0o7FAzslap4fGTcfOB+7a
# f2jTGErezJQj1pvJLvFTNX4YQ02ORnDKsz4EC0G9QU8rk+S1bD2vTVoi5IY5ayfJ
# oqxyRw==
# =Q16M
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 08:59:09 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240515' of https://gitlab.com/rth7680/qemu: (34 commits)
tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
accel/tcg: Remove cpu_ldsb_code / cpu_ldsw_code
target/s390x: Use translator_lduw in get_next_pc
target/xtensa: Use translator_ldub in xtensa_insn_len
target/rx: Use translator_ld*
target/riscv: Use translator_ld* for everything
target/cris: Use cris_fetch in translate_v10.c.inc
target/cris: Use translator_ld* in cris_fetch
target/avr: Use translator_lduw
target/i386: Use translator_ldub for everything
target/microblaze: Use translator_ldl
target/hexagon: Use translator_ldl in pkt_crosses_page
target/s390x: Disassemble EXECUTEd instructions
target/s390x: Fix translator_fake_ld length
accel/tcg: Introduce translator_fake_ld
disas: Use translator_st to get disassembly data
disas: Split disas.c
accel/tcg: Return bool from TranslatorOps.disas_log
accel/tcg: Provide default implementation of disas_log
plugins: Merge alloc_tcg_plugin_context into plugin_gen_tb_start
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that the groundwork has been laid, enabling CF_PCREL within the
translator proper is a simple matter of updating copy_iaoq_entry
and install_iaq_entries.
We also need to modify the unwind info, since we no longer have
absolute addresses to install.
As expected, this reduces the runtime overhead of compilation when
running a Linux kernel with address space randomization enabled.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not compile in the priv change based on the first translation;
look up the PTE at execution time. This is required for CF_PCREL,
where a page may be mapped multiple times with different attributes.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use PAGE_WRITE_INV to temporarily enable write permission
on for a given page, driven by PSW_X being set.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_B causes B,GATE to trap as an illegal instruction, removing our
previous sequential execution test that was merely an approximation.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_X is cleared after every instruction, and only set by RFI.
PSW_B is cleared after every non-branch, or branch not taken,
and only set by taken branches. We can clear both bits with a
single store, at most once per TB. Taken branches set PSW_B,
at most once per TB.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Generally, both of these bits are cleared at the end of each
instruction. By separating these, we will be able to clear
both with a single insn, instead of 2 or 3.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Print both raw IAQ_Front and IAQ_Back as well as the GVAs.
Print control registers in system mode.
Print floating point registers if CPU_DUMP_FPU.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As with loads and stores, code offsets are kept intact until the
full gva is formed. In qemu, this is in cpu_get_tb_cpu_state.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for CF_PCREL. store the iaoq_f in 3 parts: high
bits in cs_base, middle bits in pc, and low bits in priv.
For iaoq_b, set a bit for either of space or page differing,
else the page offset.
Install iaq entries before goto_tb. The change to not record
the full direct branch difference in TB means that we have to
store at least iaoq_b before goto_tb. But since a later change
to enable CF_PCREL will require both iaoq_f and iaoq_b to be
updated before goto_tb, go ahead and update both fields now.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allow an exception to be emitted at the end of the TranslationBlock,
leaving only the conditional branch inline. Use it for simple
exception instructions like break, which happen to be nullified.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we do not need to free tcg temporaries, the only
thing cond_free does is reset the condition to never.
Instead, simply write a new condition over the old, which
may be simply cond_make_f() for the never condition.
The do_*_cond functions do the right thing with c or cf == 0,
so there's no need for a special case anymore.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Define all of the context dependent field definitions.
Use FIELD_EX32 and FIELD_DP32 with named fields instead
of extract32 and deposit32 with raw constants.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use 'v' for a variable that needs copying, 't' for a temp that
doesn't need copying, and 'i' for an immediate, and use this
naming for both arguments of the comparison. So:
cond_make_tmp -> cond_make_tt
cond_make_0_tmp -> cond_make_ti
cond_make_0 -> cond_make_vi
cond_make -> cond_make_vv
Pass 0 explictly, rather than implicitly in the function name.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a first step in enabling CF_PCREL, but for now
we regenerate the absolute address before writeback.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrap offset and space together in one structure, ensuring
that they're copied together as required.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This simplifies callers, which might otherwise have
to make another copy.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using umax is clearer than the same operation using movcond.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows unification of BE, BLR, BV, BVE with a common helper.
Since we can now track space with IAQ_Next, we can now let the
TranslationBlock continue across the delay slot with BE, BVE.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add variable to track space changes to IAQ. So far, no such changes
are introduced, but the new checks vs ctx->iasq_b may eliminate an
unnecessary copy to cpu_iasq_f with e.g. BLR.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Minimize the amount of code in hppa_tr_translate_insn advancing the
insn queue for the next insn. Move the goto_tb path to hppa_tr_tb_stop.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We no longer have to allocate a temp and perform an
addition before translation of the rest of the insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of two separate cpu_iaoq_entry calls, use one call to update
both IAQ_Front and IAQ_Back. Simplify with an argument combination
that automatically handles a simple increment from Front to Back.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The generic tcg driver will have already checked for breakpoints.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Simplify the function by not attempting a conditional move
on the branch destination -- just use nullify_over normally.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a displacement instead of an absolute value.
In trans_be, remove the user-only do_dbranch case. The branch we are
attempting to optimize is to the zero page, which is perforce on a
different page than the code currently executing, which means that
we will *not* use a goto_tb. Use a plain indirect branch instead,
which is what we got out of the attempted direct branch anyway.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Share this check between gen_goto_tb and hppa_tr_translate_insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This function is for log_pc(), which needs to produce a
similar result to cpu_get_tb_cpu_state().
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ilen value extracted from ex_value is the length of the
EXECUTE instruction itself, and so is the increment to the pc.
However, the length of the synthetic insn is located in the
opcode like all other instructions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace translator_fake_ldb, which required multiple calls,
with translator_fake_ld, which can take all data at once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The routines in disas-common.c are also used from disas-mon.c.
Otherwise the rest of disassembly is only used from tcg.
While we're at it, put host and target code into separate files.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have eliminated most uses of this hook. Reduce
further by allowing the hook to handle only the
special cases, returning false for normal processing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Almost all of the disas_log implementations are identical.
Unify them within translator_loop.
Drop extra Priv/Virt logging from target/riscv.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't need to allocate plugin context at startup,
we can wait until we actually use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not pass around a boolean between multiple structures,
just read it from the TranslationBlock in the TCGContext.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the bytes that we record for the entire TB, rather than
a per-insn GByteArray. Record the length of the insn in
plugin_gen_insn_end rather than infering from the length
of the array.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Copy data out of a completed translation. This will be used
for both plugins and disassembly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of returning a host pointer, copy the data into
storage provided by the caller.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will be able to replace plugin_insn_append, and will
be usable for disassembly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not allow translation to proceed beyond one insn with mmio,
as we will not be caching the TranslationBlock.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reorg translator_access into translator_ld, with a more
memcpy-ish interface. If both pages are in ram, do not
go through the caller's slow path.
Assert that the access is within the two pages that we are
prepared to protect, per TranslationBlock. Allow access
prior to pc_first, so long as it is within the first page.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While there are other methods that could be used to replace
TARGET_PAGE_MASK, the function is not really required outside
the context of target-specific translation.
This makes the header usable by target independent code.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Fix the "tsan-build" CI job on the shared gitlab CI runners
* Bump minimum glib version and use URI code from the newer glib
* Fix error message from "configure" when C compiler is not working
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZDXcYRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVPJw//bK/NuMKOHlnwgowkQ/x41t8nc0jAR38+
# aMhJBTSB+9EOlPd+/y7+IeFlD9lS2JzoX/CWeBrNlKc6juWQahABJYcvscmdGiYr
# a/dUy9iZoqJyY220TMjCWYwORRtNqPDXaiUIR8hBZZBmW51xs1hRc3aazxPm6dOD
# cj1yFKwWGY5g72SkRNTMWi3qWX1tXNOh7sbuhWKkZ3eiRCllHb0RwrhA341ze4TI
# ckmlSA6stMjls4XNAIAKVdRKLPE1BsJ/UKxpnOEO3F640cbe69B0+z13wIBNfTOY
# Mk3zSjrdLY6thSY+2iOb2FLt3wC5QCBjyRluv+0kwdSnz6xsafEDWNx5VneZH+Iu
# ZQWLGvN4qUUBBqHKY8eWnrsij3ABXioHLK8eHj2JuHidcG15tku/1cwAJvy/8P/O
# iup0elZ3MXaAk6ce3dwYY4t6QecuzqX9cdJkTuRNlzysK1xKQdBiYTdeZikfUAoM
# InuFUh732yPXDSiZcG+uMXUTAJXHWASr7bvPydDx/gL1tYGYBqYepfPF2uWYfNwg
# VZRgsN6WVDBGPyXv8Z7eQ9lye5JoAGYrSDxZE87q8RwRV5holiYDxtf10zeLz3Wf
# RI5L/bb2eFSHzi3quzOC1uLflLqNKwq+9UZEjdLv2z8zuhwVwxbcDV9+ox6zA8zi
# dnVC3Yp/3ik=
# =VRHz
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 14 May 2024 02:49:10 PM CEST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-05-14' of https://gitlab.com/thuth/qemu:
util/uri: Remove the old URI parsing code
block/ssh: Use URI parsing code from glib
block/nfs: Use URI parsing code from glib
block/nbd: Use URI parsing code from glib
block/gluster: Use URI parsing code from glib
Remove glib compatibility code that is not required anymore
Bump minimum glib version to v2.66
gitlab: use 'setarch -R' to workaround tsan bug
gitlab: use $MAKE instead of 'make'
dockerfiles: add 'MAKE' env variable to remaining containers
configure: Fix error message when C compiler is not working
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
By default, SDL disables the screen saver which prevents the host from powering
down the screen even if the screen is locked. This results in draining the
battery needlessly when the host isn't connected to a wall charger. Fix that by
enabling the screen saver.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240512095945.1879-1-shentey@gmail.com>
Remove gtk_widget_get_scale_factor() usage from the calculation of
the motion events in the GTK backend to make it work correctly on
environments that have `gtk_widget_get_scale_factor() != 1`.
This scale factor usage had been introduced in the commit f14aab420c and
at that time the window size was used for calculating the things and it
was working correctly. However, in the commit 2f31663ed4 the logic
switched to use the widget size instead of window size and because of
the change the usage of scale factor becomes invalid (since widgets use
`vc->gfx.scale_{x, y}` for scaling).
Tested on Crostini on ChromeOS (15823.51.0) with an external display.
Fixes: 2f31663ed4 ("ui/gtk: use widget size for cursor motion event")
Fixes: f14aab420c ("ui: fix incorrect pointer position on highdpi with
gtk")
Signed-off-by: hikalium <hikalium@hikalium.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240512111435.30121-3-hikalium@hikalium.com>
This commit introduces utility functions for the creation and deallocation
of QemuDmaBuf instances. Additionally, it updates all relevant sections
of the codebase to utilize these new utility functions.
v7: remove prefix, "dpy_gl_" from all helpers
qemu_dmabuf_free() returns without doing anything if input is null
(Daniel P. Berrangé <berrange@redhat.com>)
call G_DEFINE_AUTOPTR_CLEANUP_FUNC for qemu_dmabuf_free()
(Daniel P. Berrangé <berrange@redhat.com>)
v8: Introduction of helpers was removed as those were already added
by the previous commit
v9: set dmabuf->allow_fences to 'true' when dmabuf is created in
virtio_gpu_create_dmabuf()/virtio-gpu-udmabuf.c
removed unnecessary spaces were accidently added in the patch,
'ui/console: Use qemu_dmabuf_new() a...'
v11: Calling qemu_dmabuf_close was removed as closing dmabuf->fd will be
done in qemu_dmabuf_free anyway.
(Daniel P. Berrangé <berrange@redhat.com>)
v12: --- Calling qemu_dmabuf_close separately as qemu_dmabuf_free doesn't
do it.
--- 'dmabuf' is now allocated space so it should be freed at the end of
dbus_scanout_texture
v13: --- Immediately free dmabuf after it is released to prevent possible
leaking of the ptr
(Marc-André Lureau <marcandre.lureau@redhat.com>)
--- Use g_autoptr macro to define *dmabuf for auto clean up instead of
calling qemu_dmabuf_free
(Marc-André Lureau <marcandre.lureau@redhat.com>)
v14: --- (vhost-user-gpu) Change qemu_dmabuf_free back to g_clear_pointer
as it was done because of some misunderstanding (v13).
--- (vhost-user-gpu) g->dmabuf[m->scanout_id] needs to be set to NULL
to prevent freed dmabuf to be accessed again in case if(fd==-1)break;
happens (before new dmabuf is allocated). Otherwise, it would cause
invalid memory access when the same function is executed. Also NULL
check should be done before qemu_dmabuf_close (it asserts dmabuf!=NULL.).
(Marc-André Lureau <marcandre.lureau@redhat.com>)
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-6-dongwon.kim@intel.com>
This commit updates all instances where fields within the QemuDmaBuf
struct are directly accessed, replacing them with calls to these new
helper functions.
v6: fix typos in helper names in ui/spice-display.c
v7: removed prefix, "dpy_gl_" from all helpers
v8: Introduction of helpers was removed as those were already added
by the previous commit
v11: -- Use new qemu_dmabuf_close() instead of close(qemu_dmabuf_get_fd()).
(Daniel P. Berrangé <berrange@redhat.com>)
-- Use new qemu_dmabuf_dup_fd() instead of dup(qemu_dmabuf_get_fd()).
(Daniel P. Berrangé <berrange@redhat.com>)
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-4-dongwon.kim@intel.com>
New header and source files are added for containing QemuDmaBuf struct
definition and newly introduced helpers for creating/freeing the struct
and accessing its data.
v10: Change the license type for both dmabuf.h and dmabuf.c from MIT to
GPL to be in line with QEMU's default license
v11: -- Added new helpers, qemu_dmabuf_close for closing dmabuf->fd,
qemu_dmabuf_dup_fd for duplicating dmabuf->fd
(Daniel P. Berrangé <berrange@redhat.com>)
-- Let qemu_dmabuf_fee to call qemu_dmabuf_close before freeing
the struct to make sure fd is closed.
(Daniel P. Berrangé <berrange@redhat.com>)
v12: Not closing fd in qemu_dmabuf_free because there are cases fd
should still be available even after the struct is destroyed
(e.g. virtio-gpu: res->dmabuf_fd).
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-3-dongwon.kim@intel.com>
Draw routine needs to be manually invoked in the next refresh
if there is a scanout blob from the guest. This is to prevent
a situation where there is a scheduled draw event but it won't
happen bacause the window is currently in inactive state
(minimized or tabified). If draw is not done for a long time,
gl_block timeout and/or fence timeout (on the guest) will happen
eventually.
v2: Use gd_gl_area_draw(vc) in gtk-gl-area.c
Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240426225059.3871283-1-dongwon.kim@intel.com>
Now that we switched all consumers of the URI code to use the URI
parsing functions from glib instead, we can remove our internal
URI parsing code since it is not used anymore.
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240418101056.302103-14-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter.
While we're at it, also emit a warning when encountering unknown
parameters in the URI, so that the users have a chance to detect
their typos or other mistakes.
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Message-ID: <20240418101056.302103-13-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter.
While we're at it, slightly rephrase one of the error messages:
Use "Invalid value..." instead of "Illegal value..." since the
latter rather sounds like the users were breaking a law here.
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240418101056.302103-12-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter. The g_uri_get_host() also takes care
of removing the square brackets from IPv6 addresses, so we can
drop that part of the QEMU code now, too.
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240418101056.302103-11-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter.
Since g_uri_get_path() returns a const pointer, we also need to
tweak the parameter of parse_volume_options() (where we use the
result of g_uri_get_path() as input).
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240418101056.302103-10-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Now that we dropped support for CentOS 8 and Ubuntu 20.04, we can
look into bumping the glib version to a new minimum for further
clean-ups. According to repology.org, available versions are:
CentOS Stream 9: 2.66.7
Debian 11: 2.66.8
Fedora 38: 2.74.1
Freebsd: 2.78.4
Homebrew: 2.80.0
Openbsd: 2.78.4
OpenSuse leap 15.5: 2.70.5
pkgsrc_current: 2.78.4
Ubuntu 22.04: 2.72.1
Thus it should be safe to bump the minimum glib version to 2.66 now.
Version 2.66 comes with new functions for URI parsing which will
allow further clean-ups in the following patches.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-8-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The TSAN job started failing when gitlab rolled out their latest
release. The root cause is a change in the Google COS version used
on shared runners. This brings a kernel running with
vm.mmap_rnd_bits = 31
which is incompatible with TSAN in LLVM < 18, which only supports
upto '28'. LLVM 18 can support upto '30', and failing that will
re-exec itself to turn off VA randomization.
Our LLVM is too old for now, but we can run with 'setarch -R make ..'
to turn off VA randomization ourselves.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240513111551.488088-4-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If you try to run the configure script on a system without a working
C compiler, you get a very misleading error message:
ERROR: Unrecognized host OS (uname -s reports 'Linux')
Some people already opened bug tickets because of this problem:
https://gitlab.com/qemu-project/qemu/-/issues/2057https://gitlab.com/qemu-project/qemu/-/issues/2288
We should rather tell the user that we were not able to use the C
compiler instead, otherwise they will have a hard time to figure
out what was going wrong.
While we're at it, let's also suppress the "unrecognized host CPU"
message in this case since it is rather misleading than helpful.
Fixes: 264b803721 ("configure: remove compiler sanity check")
Message-ID: <20240513114010.51608-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* target/i386: miscellaneous changes, mostly TCG-related
* fix --without-default-devices build
* fix --without-default-devices qtests on s390x and arm
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY+JWIUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOGmwf+JKY/i7ihXvfINQIRSKaz+H7KM3Br
# BGv/iXj4hrRA+zflcZswwoWmPrkrXM3J5JqGG6zTqqhGne+fRKt60KBFwn+lRaMY
# n48icR4zOSaEcGKBOFKs9CB1JgL7SWMe+fZ8d02amYlIZ005af0d69ACenF9r/oX
# pTxYIrR90FdZStbF4Yl0G5CzMLBdHZd/b6bMNmbefVPv3/d2zuL7VgqLX3y3J0ee
# ASYkYjn8Wpda4KX9s2rvH9ENXj80Q7EqhuDvoBlyK72/2lE5aTojbUiyGB4n5AuX
# 5OHA+0HEpuCXXToijOeDXD1NDOk9E5DP8cEwwZfZ2gjWKjja0U6OODGLVw==
# =woTe
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 May 2024 03:47:14 PM CEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (27 commits)
configs: disable emulators that require it if libfdt is not found
hw/xtensa: require libfdt
kconfig: express dependency of individual boards on libfdt
kconfig: allow compiling out QEMU device tree code per target
meson: move libfdt together with other dependencies
meson: pick libfdt from common_ss when building target-specific files
tests/qtest: arm: fix operation in a build without any boards or devices
i386: select correct components for no-board build
hw/i386: move rtc-reset-reinjection command out of hw/rtc
hw/i386: split x86.c in multiple parts
i386: pc: remove unnecessary MachineClass overrides
i386: correctly select code in hw/i386 that depends on other components
xen: register legacy backends via xen_backend_init
xen: initialize legacy backends from xen_bus_init()
tests/qtest: s390x: fix operation in a build without any boards or devices
s390x: select correct components for no-board build
s390: move css_migration_enabled from machine to css.c
s390_flic: add migration-enabled property
s390x: move s390_cpu_addr2state to target/s390x/sigp.c
sh4: select correct components for no-board build
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since boards can express their dependency on libfdt and
system/device_tree.c, only leave TARGET_NEED_FDT if the target has a
hard dependency.
Those emulators will be skipped if libfdt is disabled, or if it
is "auto" and not found and --disable-download is passed; unless
the target is mentioned explicitly in --target-list, in which case
the build will fail.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All other boards require libfdt if it can be used (including for example
i386/x86_64), so change the "imply" to "select" and always allow -dtb
in qemu-system-xtensa.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that boards are enabled by default and the "CONFIG_FOO=y"
entries are gone from configs/devices/, there cannot be any more
a conflicts between the default contents of configs/devices/
and a failed "depends on" clause.
With this change, each individual board or target can express
whether it needs FDT. It can then include the common code in the
build via "select DEVICE_TREE", which will also as tell meson to link
with libfdt.
This allows building non-microvm x86 emulators without having
libfdt available.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a new Kconfig symbol, CONFIG_DEVICE_TREE, that specifies whether
to include the common device tree code in system/device_tree.c and to
link to libfdt. For now, include it unconditionally if libfdt is
available.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the libfdt detection code together with other dependencies instead
of keeping it with subprojects. This has the disadvantage of performing
the detection even if no target requires libfdt; but it has the advantage
that Kconfig will be able to observe the availability of the library.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid having to list dependencies such as libfdt twice, both on common_ss
and specific_ss. Instead, just take all the dependencies in common_ss
and allow the target-specific libqemu-*.fa library to use them.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The local APIC is a part of the CPU and has callbacks that are invoked
from multiple accelerators.
The IOAPIC on the other hand is optional, but ioapic_eoi_broadcast is
used by common x86 code to implement the IOAPIC's implicit EOI mode.
Add a stub in case the IOAPIC device is not included but the APIC is.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The rtc-reset-reinjection QMP command is specific to x86, other boards do not
have the ACK tracking functionality that is needed for RTC interrupt
reinjection. Therefore the QMP command is only included in x86, but
qmp_rtc_reset_reinjection() is implemented by hw/rtc/mc146818rtc.c
and requires tracking of all created RTC devices. Move the implementation
to hw/i386, so that 1) it is available even if no RTC device exist
2) the only RTC that exists is easily found in x86ms->rtc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-12-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Keep the basic X86MachineState definition in x86.c. Move out functions that
are only needed by other files: x86-common.c for the pc and microvm machines,
x86-cpu.c for those used by accelerator code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
fw_cfg.c and vapic.c are currently included unconditionally but
depend on other components. vapic.c depends on the local APIC,
while fw_cfg.c includes a piece of AML builder code that depends
on CONFIG_ACPI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is okay to register legacy backends in the middle of xen_bus_init().
All that the registration does is record the existence of the backend
in xenstore.
This makes it possible to remove them from the build without introducing
undefined symbols in xen_be_init(). It also removes the need for the
backend_register callback, whose only purpose is to avoid registering
nonfunctional backends.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240509170044.190795-8-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prepare for moving the calls to xen_be_register() under the
control of xen_bus_init(), using the normal xen_backend_init()
method that is used by the "modern" backends.
This requires the xenstore global variable to be initialized,
which is done by xen_be_init(). To ensure that everything is
ready at the time the xen_backend_init() functions are called,
remove the xen_be_init() function from all the boards and
place it directly in xen_bus_init().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240509170044.190795-7-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do the bare minimum to ensure that at least a vanilla
--without-default-devices build works for all targets except i386,
x86_64 and ppc64. In particular this fixes s390x-softmmu; i386 and
x86_64 have about a dozen failing tests that do not pass -M and therefore
require a default machine type; ppc64 has the same issue, though only
with numa-test.
If we can for now ignore the cases where boards and devices are picked
by hand, drive_del-test however can be fixed easily; almost all tests
check for the virtio-blk or virtio-scsi device that they use, and are
already skipped. Only one didn't get the memo; plus another one does
not need a machine at all and can be run with -M none.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-6-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CSS subsystem uses global variables, just face the truth and use
a variable also for whether the CSS vmstate is in use; remove the
indirection of fetching it from the machine type, which makes the
TCG code depend unnecessarily on the virtio-ccw machine.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240509170044.190795-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function has no dependency on the virtio-ccw machine type, though it
assumes that the CPU address corresponds to the core_id and the index.
If there is any need of something different or more fancy (unlikely)
S390 can include a MachineClass subclass and implement it there. For
now, move it to sigp.c for simplicity.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ensure that they go through unmodified, instead of removing one layer
of quoting.
-D is a pretty specialized option and most options that can have spaces
do not need it (for example, c_args is covered by --extra-cflags).
Therefore it's unlikely that this causes actual trouble. However,
a somewhat realistic failure case would be with -Dpkg_config_path
and a pkg-config directory that contains spaces.
Cc: qemu-stable@nongnu.org
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VMX feature bit depends on general availability of WAITPKG,
not the other way round.
Fixes: 33cc88261c ("target/i386: add support for VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE", 2023-08-28)
Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are trivial to add, and moving them to the new decoder fixes some
corner cases: raising #UD instead of an instruction fetch page fault for
the undefined opcodes, and incorrectly rejecting 0F 18 prefetches with
register operands (which are treated as reserved NOPs).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to the manual, 32-bit vs 64-bit is governed by REX.W
and REX ignores the 0x66 prefix. This can be confirmed with this
program:
#include <stdio.h>
int main()
{
int x = 0x12340000;
int y;
asm("popcntl %1, %0" : "=r" (y) : "r" (x)); printf("%x\n", y);
asm("mov $-1, %0; .byte 0x66; popcntl %1, %0" : "+r" (y) : "r" (x)); printf("%x\n", y);
asm("mov $-1, %0; .byte 0x66; popcntq %q1, %q0" : "+r" (y) : "r" (x)); printf("%x\n", y);
}
which prints 5/ffff0000/5 on real hardware and 5/ffff0000/ffff0000
on QEMU.
Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PCOMMIT instruction was never included in any physical processor.
TCG implements it as a no-op instruction, but its utility is debatable
to say the least. Drop it from the decoder since it is only available
with "-cpu max", which does not guarantee migration compatibility
across versions, and deprecate the property just in case someone is
using it as "pcommit=off".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The old "-runas" option has the disadvantage that it is not visible
in the QAPI schema, so it is not available via the normal introspection
mechanisms. We've recently introduced the "-run-with" option for exactly
this purpose, which is meant to handle the options that affect the
runtime behavior. Thus let's introduce a "user=..." parameter here now
and deprecate the old "-runas" option.
Message-ID: <20240506112058.51446-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Retain a list of deprecated features disjoint from any particular
CPU model. A query-cpu-model-expansion reply will now provide a list of
properties (i.e. features) that are flagged as deprecated. Example:
{
"return": {
"model": {
"name": "z14.2-base",
"deprecated-props": [
"bpb",
"csske"
],
"props": {
"pfmfi": false,
"exrl": true,
...a lot more props...
"skey": false,
"vxpdeh2": false
}
}
}
}
It is recommended that s390 guests operate with these features
explicitly disabled to ensure compatibility with future hardware.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240429191059.11806-2-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
get_sclp_device() scans the whole machine to find a TYPE_SCLP object.
Now that the SCLPDevice instance is available under the machine state,
use it to simplify the lookup. While at it, remove the inline to let
the compiler decide on how to optimize.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240502131533.377719-4-clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
sclp_get_event_facility_bus() scans the whole machine to find a
TYPE_SCLP_EVENTS_BUS object. The SCLPDevice instance is now available
under the machine state, use it to simplify the lookup and adjust the
creation of the consoles.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240502131533.377719-3-clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Initialize directly SCLPDevice from the machine init handler and
remove s390_sclp_init(). We will use the SCLPDevice pointer later to
create the consoles.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240502131533.377719-2-clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The sclpconsole currently does not have a proper parent in the QOM
tree, so it shows up under /machine/unattached - which is somewhat
ugly. We should rather attach it to /machine/sclp/s390-sclp-event-facility
where the other devices of type TYPE_SCLP_EVENT already reside.
Message-ID: <20240430190843.453903-1-thuth@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
pull-loongarch-20240509
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZjyDAgAKCRBAov/yOSY+
# 33cfA/4jE0x+eLAT161caSwM3wBOfZRClfUhXdkxLP6GvWbACVQ8l0rEZiw2PuI8
# DFReU2gqs7wAfYKt7Yy62xXlCw1B3aSUzE45gS2TGIP1GqKBwigvpW4i1SgiOoMX
# 4TA+GG16KgR9zaxO48bjjyJ1epc7S3SxdAL09p2U08D9EdSwCA==
# =RLFu
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 May 2024 10:02:10 AM CEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240509' of https://gitlab.com/gaosong/qemu:
target/loongarch: Put cpucfg operation before CSR register
target/loongarch: Add TCG macro in structure CPUArchState
hw/loongarch: Refine default numa id calculation
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On Loongarch, cpucfg is register for cpu feature, some other registers
depend on cpucfg feature such as perf CSR registers. Here put cpucfg
read/write operations before CSR register, so that KVM knows how many
perf CSR registers are valid from pre-set cpucfg feature information.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240428031651.1354587-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
In structure CPUArchState some struct elements are only used in TCG
mode, and it is not used in KVM mode. Macro CONFIG_TCG is added to
make it simpiler in KVM mode, also there is the same modification
in c code when these structure elements are used.
When VM runs in KVM mode, TLB entries are not used and do not need
migrate. It is only useful when it runs in TCG mode.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240506011912.2108842-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
With numa_test test case, there is subcase named test_def_cpu_split(),
there are 8 sockets and 2 numa nodes. Here is command line:
"-machine smp.cpus=8,smp.sockets=8 -numa node,memdev=ram -numa node"
The required result is:
node 0 cpus: 0 2 4 6
node 1 cpus: 1 3 5 7
Test case numa_test fails on LoongArch, since the actual result is:
node 0 cpus: 0 1 2 3
node 1 cpus: 4 5 6 7
It will be better if all the cpus in one socket share the same numa
node. Here socket id is used to calculate numa id in function
virt_get_default_cpu_node_id().
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240319022606.2994565-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Implement IOCSR address space get functions for MIPS/Loongson CPUs.
For MIPS/Loongson without IOCSR (i.e. Loongson-3A1000), get_cpu_iocsr_as
will return as null, and send_ipi_data will fail with MEMTX_DECODE_ERROR,
which matches expected behavior on hardware.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240508-loongson3-ipi-v1-3-1a7b67704664@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Suspend function is emulated as what hardware actually do.
Doorbell register fields are updates to include suspend value,
suspend vector is encoded in firmware blob and fw_cfg is updated
to include S3 bits as what x86 did.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-ID: <20240508-loongson3v-suspend-v1-1-186725524a39@flygoat.com>
[PMD: Use g_memdup2(), constify suspend array]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Rename LoongArchMachineState with LoongArchVirtMachineState, and change
variable name LoongArchMachineState *lams with LoongArchVirtMachineState
*lvms.
Rename function specific for virtmachine loongarch_xxx()
with virt_xxx(). However some common functions keep unchanged such as
loongarch_acpi_setup()/loongarch_load_kernel(), since there functions
can be used for real hw boards.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240508031110.2507477-3-maobibo@loongson.cn>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
On LoongArch system, there is only virt machine type now, name
LOONGARCH_MACHINE is confused, rename it with LOONGARCH_VIRT_MACHINE.
Machine name about Other real hw boards can be added in future.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240508031110.2507477-2-maobibo@loongson.cn>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The 'ref405ep' machine and PPC 405 CPU have no known users, firmware
images are not available, OpenWRT dropped support in 2019, U-Boot in
2017, Linux also is dropping support in 2024. It is time to let go of
this ancient hardware and focus on newer CPUs and platforms.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20240507123332.641708-1-clg@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The function is inspired by pc_isa_bios_init() and should eventually replace it.
Using x86_isa_bios_init() rather than pc_isa_bios_init() fixes pflash commands
to work in the isa-bios region.
While at it convert the magic number 0x100000 (== 1MiB) to increase readability.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The function creates and leaks two MemoryRegion objects regarding the BIOS which
will be moved into X86MachineState in the next steps to avoid the leakage.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430150643.111976-3-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Given that memory_region_set_readonly() is a no-op when the readonlyness is
already as requested it is possible to simplify the pattern
if (condition) {
foo(true);
}
to
foo(condition);
which is shorter and allows to see the invariant of the code more easily.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430150643.111976-2-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The i440fx and the isapc machines can be used in binaries without
FDC, too. We just have to make sure that they don't try to instantiate
the FDC when it is not available.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240425184315.553329-4-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The q35 machine can be used without floppy disk controller (FDC),
but due to our current Kconfig setup, the FDC code is still always
included in the binary. To fix this, the "PC" config option should
only imply the "FDC_ISA" instead of always selecting it.
The i440fx and the isa-pc machine currently always instantiate
the FDC, so we have to add the select statements now there instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240425184315.553329-3-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The q35 machine can work without FDC. But to be able to also link
a QEMU binary that does not include the FDC code, we have to make
it possible to disable the spots that call into the FDC code.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240425184315.553329-2-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Instead of using a single global bounce buffer, give each AddressSpace
its own bounce buffer. The MapClient callback mechanism moves to
AddressSpace accordingly.
This is in preparation for generalizing bounce buffer handling further
to allow multiple bounce buffers, with a total allocation limit
configured per AddressSpace.
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-ID: <20240507094210.300566-2-mnissler@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMD: Split patch, part 2/2]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Propagate AddressSpace handler to following helpers:
- register_map_client()
- unregister_map_client()
- notify_map_clients[_locked]()
Rename them using 'address_space_' prefix instead of 'cpu_'.
The AddressSpace argument will be used in the next commit.
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-ID: <20240507094210.300566-2-mnissler@rivosinc.com>
[PMD: Split patch, part 1/2]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropber.id.au>
Message-Id: <20210903174510.751630-17-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210903174510.751630-12-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210903174510.751630-27-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210903174510.751630-6-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Peter missed the Sphinx HMP document for the "resume/-r" flag in commit
7a4da28b26 ("qmp: hmp: add migrate "resume" option"). Add it.
When at it, slightly cleanup the lines around:
- Move "detach/-d" to a separate section rather than appending it at the
end of the command description. Add a hint for how to query the migration
results in detached mode.
- Add "postcopy" keyword to "resume/-r" help messages, as it only applies
to postcopy.
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Cc: Fabiano Rosas <farosas@suse.de>
Fixes: 7a4da28b26 ("qmp: hmp: add migrate "resume" option")
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The fd: URI can currently trigger two different types of migration, a
TCP migration using sockets and a file migration using a plain
file. This is in conflict with the recently introduced (8.2) QMP
migrate API that takes structured data as JSON-like format. We cannot
keep the same backend for both types of migration because with the new
API the code is more tightly coupled to the type of transport. This
means a TCP migration must use the 'socket' transport and a file
migration must use the 'file' transport.
If we keep allowing fd: when using a file, this creates an issue when
the user converts the old-style (fd:) to the new style ("transport":
"socket") invocation because the file descriptor in question has
previously been allowed to be either a plain file or a socket.
To avoid creating too much confusion, we can simply deprecate the fd:
+ file usage, which is thought to be rarely used currently and instead
establish a 1:1 correspondence between fd: URI and socket transport,
and file: URI and file transport.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The 'compress' migration capability enables the old compression code
which has shown issues over the years and is thought to be less stable
and tested than the more recent multifd-based compression. The old
compression code has been deprecated in 8.2 and now is time to remove
it.
Deprecation commit 864128df46 ("migration: Deprecate old compression
method").
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The block migration has been considered obsolete since QEMU 8.2 in
favor of the more flexible storage migration provided by the
blockdev-mirror driver. Two releases have passed so now it's time to
remove it.
Deprecation commit 66db46ca83 ("migration: Deprecate block
migration").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The block migration is considered obsolete and has been deprecated in
8.2. Remove the migrate command option that enables it. This only
affects the QMP and HMP commands, the feature can still be accessed by
setting the migration 'block' capability. The whole feature will be
removed in a future patch.
Deprecation commit 8846b5bfca ("migration: migrate 'blk' command
option is deprecated.").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The block incremental option for block migration has been deprecated
in 8.2 in favor of using the block-mirror feature. Remove it now.
Deprecation commit 40101f320d ("migration: migrate 'inc' command
option is deprecated.").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The 'skipped' field of the MigrationStats struct has been deprecated
in 8.1. Time to remove it.
Deprecation commit 7b24d32634 ("migration: skipped field is really
obsolete.").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Now we do set MIGRATION_FAILED state, but don't give a chance to
orchestrator to query migration state and get the error.
Let's provide a possibility for QMP-based orchestrators to get an error
like with outgoing migration.
For hmp_migrate_incoming(), let's enable the new behavior: HMP is not
and ABI, it's mostly intended to use by developer and it makes sense
not to stop the process.
For x-exit-preconfig, let's keep the old behavior:
- it's called from init(), so here we want to keep current behavior by
default
- it does exit on error by itself as well
So, if we want to change the behavior of x-exit-preconfig, it should be
another patch.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Unify error reporting in the function. This simplifies the following
commit, which will not-exit-on-error behavior variant to the function.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
It's bad idea to leave critical section with error object freed, but
s->error still set, this theoretically may lead to use-after-free
crash. Let's avoid it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Make call to migration_incoming_state_destroy(), instead of doing only
partial of it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
migration/ram.c: API Conversion qemu_mutex_lock(),
and qemu_mutex_unlock() to WITH_QEMU_LOCK_GUARD macro
Signed-off-by: Will Gyda <vilhelmgyda@gmail.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
* target/i386/tcg: conversion of one byte opcodes to table-based decoder
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY5z/QUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1YQf/WMAoB/lR31fzu/Uh36hF1Ke/NHNU
# gefqKRAol6xJXxavKH8ym9QMlCTzrCLVt0e8RalZH76gLqYOjRhSLSSL+gUo5HEo
# lsGSfkDAH2pHO0ZjQUkXcjJQQKkH+4+Et8xtyPc0qmq4uT1pqQZRgOeI/X/DIFNb
# sMoKaRKfj+dB7TSp3qCSOp77RqL13f4QTP8mUQ4XIfzDDXdTX5n8WNLnyEIKjoar
# ge4U6/KHjM35hAjCG9Av/zYQx0E084r2N2OEy0ESYNwswFZ8XYzTuL4SatN/Otf3
# F6eQZ7Q7n6lQbTA+k3J/jR9dxiSqVzFQnL1ePGoe9483UnxVavoWd0PSgw==
# =jCyB
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 11:53:40 PM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (26 commits)
target/i386: remove duplicate prefix decoding
target/i386: split legacy decoder into a separate function
target/i386: decode x87 instructions in a separate function
target/i386: remove now-converted opcodes from old decoder
target/i386: port extensions of one-byte opcodes to new decoder
target/i386: move BSWAP to new decoder
target/i386: move remaining conditional operations to new decoder
target/i386: merge and enlarge a few ranges for call to disas_insn_new
target/i386: move C0-FF opcodes to new decoder (except for x87)
target/i386: generalize gen_movl_seg_T0
target/i386: move 60-BF opcodes to new decoder
target/i386: allow instructions with more than one immediate
target/i386: extract gen_far_call/jmp, reordering temporaries
target/i386: move 00-5F opcodes to new decoder
target/i386: reintroduce debugging mechanism
target/i386: cleanup *gen_eob*
target/i386: clarify the "reg" argument of functions returning CCPrepare
target/i386: do not use s->T0 and s->T1 as scratch registers for CCPrepare
target/i386: extend cc_* when using them to compute flags
target/i386: pull cc_op update to callers of gen_jmp_rel{,_csize}
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that a bulk of opcodes go through the new decoder, it is sensible
to do some cleanup. Go immediately through disas_insn_new and only jump
back after parsing the prefixes.
disas_insn() now only contains the three sigsetjmp cases, and they
are more easily managed if they are inlined into i386_tr_translate_insn.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split the bits that have some duplication with disas_insn_new, from
those that should be the main topic of the conversion. This is the
first step towards removing duplicate decoding of prefixes between
disas_insn and disas_insn_new.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are unlikely to be converted to the table-based decoding
soon (perhaps there could be generic ESC decoding in decode-new.c.inc
for the Mod/RM byte, but not operand decoding), so keep them separate
from the remaining legacy-decoded instructions.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Send all converted opcodes to disas_insn_new() directly from the big
decoding switch statement; once more, the debugging/bisecting logic
disappears.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A few two-byte opcodes are simple extensions of existing one-byte opcodes;
they are easy to decode and need no change to emit.c.inc. Port them to
the new decoder.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move long-displacement Jcc, SETcc and CMOVcc to the new decoder.
While filling in the tables makes the code seem longer, the new
emitters are all just one line of code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since new opcodes are not going to be added in translate.c, round the
case labels that call to disas_insn_new(), including whole sets of
eight opcodes when possible.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The shift instructions are rewritten instead of reusing code from the old
decoder. Rotates use CC_OP_ADCOX more extensively and generally rely
more on the optimizer, so that the code generators are shared between
the immediate-count and variable-count cases.
In particular, this makes gen_RCL and gen_RCR pretty efficient for the
count == 1 case, which becomes (apart from a few extra movs) something like:
(compute_cc_all if needed)
// save old value for OF calculation
mov cc_src2, T0
// the bulk of RCL is just this!
deposit T0, cc_src, T0, 1, TARGET_LONG_BITS - 1
// compute carry
shr cc_dst, cc_src2, length - 1
and cc_dst, cc_dst, 1
// compute overflow
xor cc_src2, cc_src2, T0
extract cc_src2, cc_src2, length - 1, 1
32-bit MUL and IMUL are also slightly more efficient on 64-bit hosts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the new decoder it is sometimes easier to put the segment
in T1 instead of T0, usually because another operand was loaded
by common code in T0. Genrealize gen_movl_seg_T0 to allow
using any source.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compared to the old decoder, the main differences in translation
are for the little-used ARPL instruction. IMUL is adjusted a bit
to share more code to produce flags, but is otherwise very similar.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While keeping decode->immediate for convenience and for 4-operand instructions,
store the immediate in X86DecodedOp as well. This enables instructions
with more than one immediate such as ENTER. It can also be used for far
calls and jumps.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extract the code into new functions, and swap T0/T1 so that T0 corresponds
to the first immediate in the instruction stream.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create a new wrapper for syscall/sysret, and do not go through multiple
layers of wrappers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of using s->T0 or s->T1, create a scratch register
when computing the C, NC, L or LE conditions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of using s->tmp0 or s->tmp4 as the result, just extend the cc_*
registers in place. It is harmless and, if multiple setcc instructions
are used, the optimizer will be able to remove the redundant ones.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_update_cc_op must be called before control flow splits. Doing it
in gen_jmp_rel{,_csize} may hide bugs, instead assert that cc_op is
clean---even if that means a few more calls to gen_update_cc_op().
With this new invariant, setting cc_op to CC_OP_DYNAMIC is unnecessary
since the caller should have done it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_update_cc_op must be called before control flow splits. Do it
where the jump on ECX!=0 is translated.
On the other hand, remove the call before gen_jcc1, which takes care of
it already, and explain why REPZ/REPNZ need not use CC_OP_DYNAMIC---the
translation block ends before any control-flow-dependent cc_op could
be observed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Resetting cc_op to CC_OP_DYNAMIC should be done at control flow junctions,
which is not the case here. This translation block is ending and the
only effect of calling set_cc_op() would be a discard of s->cc_srcT.
This discard is useless (it's a temporary, not a global) and in fact
prevents gen_prepare_cc from returning s->cc_srcT.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With the introduction of TSTEQ and TSTNE the .mask field is always -1,
so remove all the now-unnecessary code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new conditions obviously come in handy when testing individual bits
of EFLAGS, and they make it possible to remove the .mask field of
CCPrepare.
Lowering to shift+and is done by the optimizer if necessary.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When testing the sign bit or equality to zero of a partial register, it
is useful to use a single TSTEQ or TSTNE operation. It can also be used
to test the parity flag, using bit 0 of the population count.
Do not do this for target_ulong-sized values however; the optimizer would
produce a comparison against zero anyway, and it avoids shifts by 64
which are undefined behavior.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Observed the following failure while booting the SEV-SNP guest and the
guest fails to boot with the smp parameters:
"-smp 192,sockets=1,dies=12,cores=8,threads=2".
qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 'Invalid parameter'
qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e, index: 0x0.
provided: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000b00, edx: 0x00000000
expected: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000300, edx: 0x00000000
qemu-system-x86_64: SEV-SNP: failed update CPUID page
Reason for the failure is due to overflowing of bits used for "Node per
processor" in CPUID Fn8000001E_ECX. This field's width is 3 bits wide and
can hold maximum value 0x7. With dies=12 (0xB), it overflows and spills
over into the reserved bits. In the case of SEV-SNP, this causes CPUID
enforcement failure and guest fails to boot.
The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
=================================================================
Bits Description
31:11 Reserved.
10:8 NodesPerProcessor: Node per processor. Read-only.
ValidValues:
Value Description
0h 1 node per processor.
7h-1h Reserved.
7:0 NodeId: Node ID. Read-only. Reset: Fixed,XXh.
=================================================================
As in the spec, the valid value for "node per processor" is 0 and rest
are reserved.
Looking back at the history of decoding of CPUID_Fn8000001E_ECX, noticed
that there were cases where "node per processor" can be more than 1. It
is valid only for pre-F17h (pre-EPYC) architectures. For EPYC or later
CPUs, the linux kernel does not use this information to build the L3
topology.
Also noted that the CPUID Function 0x8000001E_ECX is available only when
TOPOEXT feature is enabled. This feature is enabled only for EPYC(F17h)
or later processors. So, previous generation of processors do not not
enumerate 0x8000001E_ECX leaf.
There could be some corner cases where the older guests could enable the
TOPOEXT feature by running with -cpu host, in which case legacy guests
might notice the topology change. To address those cases introduced a
new CPU property "legacy-multi-node". It will be true for older machine
types to maintain compatibility. By default, it will be false, so new
decoding will be used going forward.
The documentation is taken from Preliminary Processor Programming
Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
Rev 0.25 - Oct 6, 2022.
Cc: qemu-stable@nongnu.org
Fixes: 31ada106d8 ("Simplify CPUID_8000_001E for AMD")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-ID: <0ee4b0a8293188a53970a2b0e4f4ef713425055e.1714757834.git.babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have one job to build user binaries and one job for system.
Disable tools and docs in the user job, and disable building
the user binaries in the system job.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The host does not have the correct libraries installed for static pie,
which causes host/guest address space interference for some tests.
There's no real gain from linking statically, so drop it.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Match the extra inserts of INDEX_op_insn_start, fixing
the db->num_insns != 1 assert in translator_loop.
Fixes: dcd092a063 ("accel/tcg: Improve can_do_io management")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Record the fact that we've found a breakpoint on the page
in which a TranslationBlock is running.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If we can show that high bits of an input are zero,
then we may optimize away some comparisons.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This may be treated as a 32-bit EQ/NE comparison against 0,
which is in turn treated as a LTU/GEU comparison against 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The x86 isa does not have this operation, so we need an expansion.
Use the same algorithm that we use for expanding this vector
operation with integers: perform the shift with a wider type
and then mask the bits that must be zero.
This reduces the instruction count from 5 to 2.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu-sparc queue
# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmY4wZceHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIftQsH+wfIWymTdQMowfM6
# Ze/T8KODn+MqU5eg25VPSTojnmr7LFaCj2yK6zWX61RwIqtMc3NaxX0G7ksW12/g
# 35ACqiEEd5WRDhAtVhj5Wp+WEDoR4AD3LWIaN7a/qjO3qb78l7Bujw3qXzGSq4lQ
# hST6dTgMwn5LhJOyz+5dORVUK1UZSBuDxHeKRHgdoFi6yqGQ5bao5TpaDYOnGSbx
# 8KPrAFfXG1T6xRS8Ih5HXAPE5VJztLFPiVtCTTrETDP/o8EzvOZj5y/nJVZXXC3N
# 57g+QyJX9EdrRZvobef4LnNnoZyiqG+uQNugglqZqjiiLjl6AzYxI+ed0hU+cZR9
# pz76Hr8=
# =i2cV
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 04:40:07 AM PDT
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
* tag 'qemu-sparc-20240506' of https://github.com/mcayland/qemu:
target/sparc: Split out do_ms16b
target/sparc: Fix FPMERGE
target/sparc: Fix FMULD8*X16
target/sparc: Fix FMUL8x16A{U,L}
target/sparc: Fix FMUL8x16
target/sparc: Fix FEXPAND
linux-user/sparc: Add more hwcap bits for sparc64
hw/sparc64: set iommu_platform=on for virtio devices attached to the sun4u machine
docs/about: Deprecate the old "UltraSparc" CPU names that contain a "+"
docs/system/target-sparc: Improve the Sparc documentation
target/sparc/cpu: Avoid spaces by default in the CPU names
target/sparc/cpu: Rename the CPU models with a "+" in their names
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Accelerator patches
- Extract page-protection definitions to page-protection.h
- Rework in accel/tcg in preparation of extracting TCG fields from CPUState
- More uses of get_task_state() in user emulation
- Xen refactors in preparation for adding multiple map caches (Juergen & Edgar)
- MAINTAINERS updates (Aleksandar and Bin)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmY40CAACgkQ4+MsLN6t
# wN5drxAA1oIsuUzpAJmlMIxZwlzbICiuexgn/HH9DwWNlrarKo7V1l4YB8jd9WOg
# IKuj7c39kJKsDEB8BXApYwcly+l7DYdnAAI8Z7a+eN+ffKNl/0XBaLjsGf58RNwY
# fb39/cXWI9ZxKxsHMSyjpiu68gOGvZ5JJqa30Fr+eOGuug9Fn/fOe1zC6l/dMagy
# Dnym72stpD+hcsN5sVwohTBIk+7g9og1O/ctRx6Q3ZCOPz4p0+JNf8VUu43/reaR
# 294yRK++JrSMhOVFRzP+FH1G25NxiOrVCFXZsUTYU+qPDtdiKtjH1keI/sk7rwZ7
# U573lesl7ewQFf1PvMdaVf0TrQyOe6kUGr9Mn2k8+KgjYRAjTAQk8V4Ric/+xXSU
# 0rd7Cz7lyQ8jm0DoOElROv+lTDQs4dvm3BopF3Bojo4xHLHd3SFhROVPG4tvGQ3H
# 72Q5UPR2Jr2QZKiImvPceUOg0z5XxoN6KRUkSEpMFOiTRkbwnrH59z/qPijUpe6v
# 8l5IlI9GjwkL7pcRensp1VC6e9KC7F5Od1J/2RLDw3UQllMQXqVw2bxD3CEtDRJL
# QSZoS4d1jUCW4iAYdqh/8+2cOIPiCJ4ai5u7lSdjrIJkRErm32FV/pQLZauoHlT5
# eTPUgzDoRXVgI1X1slTpVXlEEvRNbhZqSkYLkXr80MLn5hTafo0=
# =3Qkg
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 05:42:08 AM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'accel-20240506' of https://github.com/philmd/qemu: (28 commits)
MAINTAINERS: Update my email address
MAINTAINERS: Update Aleksandar Rikalo email
system: Pass RAM MemoryRegion and is_write in xen_map_cache()
xen: mapcache: Break out xen_map_cache_init_single()
xen: mapcache: Break out xen_invalidate_map_cache_single()
xen: mapcache: Refactor xen_invalidate_map_cache_entry_unlocked
xen: mapcache: Refactor xen_replace_cache_entry_unlocked
xen: mapcache: Break out xen_ram_addr_from_mapcache_single
xen: mapcache: Refactor xen_remap_bucket for multi-instance
xen: mapcache: Refactor xen_map_cache for multi-instance
xen: mapcache: Refactor lock functions for multi-instance
xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry
system: let qemu_map_ram_ptr() use qemu_ram_ptr_length()
user: Use get_task_state() helper
user: Declare get_task_state() once in 'accel/tcg/vcpu-state.h'
user: Forward declare TaskState type definition
accel/tcg: Move @plugin_mem_cbs from CPUState to CPUNegativeOffsetState
accel/tcg: Restrict cpu_plugin_mem_cbs_enabled() to TCG
accel/tcg: Restrict qemu_plugin_vcpu_exit_hook() to TCG plugins
accel/tcg: Update CPUNegativeOffsetState::can_do_io field documentation
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target/i386: Introduce SapphireRapids-v3 to add missing features
* switch boards to "default y"
* allow building emulators without any board
* configs: list "implied" device groups in the default configs
* remove unnecessary declarations from typedefs.h
* target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY1ILsUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNtIwf+MEehq2HudZvsK1M8FrvNmkB/AssO
# x4tqL8DlTus23mQDBu9+rANTB93ManJdK9ybtf6NfjEwK+R8RJslLVnuy/qT+aQX
# PD208L88fjZg17G8uyawwvD1VmqWzHFSN14ShmKzqB2yPXXo/1cJ30w78DbD50yC
# 6rw/xbC5j195CwE2u8eBcIyY4Hh2PUYEE4uyHbYVr57cMjfmmA5Pg4I4FJrpLrF3
# eM2Avl/4pIbsW3zxXVB8QbAkgypxZErk3teDK1AkPJnlnBYM1jGKbt/GdKe7vcHR
# V/o+7NlcbS3oHVItQ2gP3m91stjFq+NhixaZpa0VlmuqayBa3xNGl0G6OQ==
# =ZbNW
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 May 2024 10:36:59 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits)
qga/commands-posix: fix typo in qmp_guest_set_user_password
migration: do not include coroutine_int.h
kvm: move target-dependent interrupt routing out of kvm-all.c
pci: remove some types from typedefs.h
tcg: remove CPU* types from typedefs.h
display: remove GraphicHwOps from typedefs.h
qapi/machine: remove types from typedefs.h
monitor: remove MonitorDef from typedefs.h
migration: remove PostcopyDiscardState from typedefs.h
lockable: remove QemuLockable from typedefs.h
intc: remove PICCommonState from typedefs.h
qemu-option: remove QemuOpt from typedefs.h
net: remove AnnounceTimer from typedefs.h
numa: remove types from typedefs.h
qdev-core: remove DeviceListener from typedefs.h
fw_cfg: remove useless declarations from typedefs.h
build: do not build virtio-vga-gl if virgl/opengl not available
bitmap: Use g_try_new0/g_new0/g_renew
target/i386: Introduce SapphireRapids-v3 to add missing features
docs: document new convention for Kconfig board symbols
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Short-circuit for packets with r/w and no overlap
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEPWaq5HRZSCTIjOD4GlSvuOVkbDIFAmY4FR8ACgkQGlSvuOVk
# bDLEfxAAup6v9J4n2/q88FXfLGgx1EfZrT01gOM/48mwngNNQJGJQySe2GLl0G8S
# 1hx/Ym3jbikic8HL80v8FyCr4gNRshEY7xKpCfvY9lsgnCRbhEvoV/hZqucmLQAt
# 1SIhFSsi5h8gyZDTvXhH75v3qGvYjQ7fQBhy2JbRsPjthdHBh9xi6Na60wlqfNZq
# oGsVtY7sv1uHsvDKBi3JoXWckSK99R38BHY6zPoStarRZACkkLdX6KHxeX88TUt1
# whIUYUS/K0nRVxzekdq/+m8UJYrXnW/0cliM5mLFHDGlsV+qjdcIRrfaPWBO0eFN
# kXeZU2BWLCdP2M52FHI4FllnIRpX5OGkxjR6x8Pc9r+EGciwGRU7xeAlqBxKQSZP
# e3oXtV6oKxg69xBgHE5HcKbt6bX5EZR/sUcbAoGA41UssaiMyj3wbg1cy2UxXu2J
# 7oJyywJUggWGSoCIIJJ95YgpUrIg73Yg6pOjfhKW1w/V2SuQPGG0XTXrwe7J6uGi
# VAqyu55p2oiW8Gk4Lvl1SfWgxkVeZa/NcxTmXNEWFnT7vatqwez0O5pxIkxdSCFE
# lRv7PuFT5nhQ/gg12zGqqRiOrMOMQitHFzJ9sUNu7J4Y7W5R4gzRW19ucojLt0lH
# fT83Ra+Eex1Cu3DsuvWkokxFikxXP1Ll297Jr1JhOPewTtvlxvI=
# =Q8/k
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 05 May 2024 04:24:15 PM PDT
# gpg: using RSA key 3D66AAE474594824C88CE0F81A54AFB8E5646C32
# gpg: Good signature from "Brian Cain (QUIC) <quic_bcain@quicinc.com>" [unknown]
# gpg: aka "Brian Cain <bcain@kernel.org>" [unknown]
# gpg: aka "Brian Cain (QuIC) <bcain@quicinc.com>" [unknown]
# gpg: aka "Brian Cain (CAF) <bcain@codeaurora.org>" [unknown]
# gpg: aka "bcain" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6350 20F9 67A7 7164 79EF 49E0 175C 464E 541B 6D47
# Subkey fingerprint: 3D66 AAE4 7459 4824 C88C E0F8 1A54 AFB8 E564 6C32
* tag 'pull-hex-20240505' of https://github.com/quic/qemu:
Hexagon (target/hexagon) Remove hex_common.read_attribs_file
Hexagon (target/hexagon) Remove gen_shortcode.py
Hexagon (target/hexagon) Remove gen_op_regs.py
Hexagon (target/hexagon) Remove uses of op_regs_generated.h.inc
Hexagon (tests/tcg/hexagon) Test HVX .new read from high half of pair
Hexagon (target/hexagon) Mark has_pred_dest in trans functions
Hexagon (target/hexagon) Mark dest_idx in trans functions
Hexagon (target/hexagon) Mark new_read_idx in trans functions
Hexagon (target/hexagon) Add is_old/is_new to Register class
Hexagon (target/hexagon) Only pass env to generated helper when needed
Hexagon (target/hexagon) Pass SP explicitly to helpers that need it
Hexagon (target/hexagon) Pass P0 explicitly to helpers that need it
Hexagon (target/hexagon) Enable more short-circuit packets (HVX)
Hexagon (target/hexagon) Enable more short-circuit packets (scalar core)
Hexagon (target/hexagon) Analyze reads before writes
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add MapCache argument to xen_replace_cache_entry_unlocked in
preparation for supporting multiple map caches.
No functional change.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430164939.925307-8-edgar.iglesias@gmail.com>
[PMD: Remove last global mapcache pointer, reported by sstabellini]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
While each user emulation implentation defines its own
TaskState structure, both use the same get_task_state()
declaration, in particular in common code (such gdbstub).
Declare the method once in "accel/tcg/vcpu-state.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240428221450.26460-10-philmd@linaro.org>
For union types, the tag member is known only after .check().
We used to code this in a simple way: QAPISchemaVariants attribute
.tag_member was None for union types until .check().
Since this complicated typing, recent commit "qapi/schema: fix typing
for QAPISchemaVariants.tag_member" hid it behind a property.
The previous commit lets us treat .tag_member just like the other
attributes that become known only in .check(): declare, but don't
initialize it in .__init__().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
QAPISchemaVariants.check()'s code is almost entirely conditional on
union vs. alternate type.
Move the conditional code to QAPISchemaBranches.check() and
QAPISchemaAlternatives.check(), where the conditions are always
satisfied.
Attribute QAPISchemaVariants.tag_name is now only used by
QAPISchemaBranches. Move it there.
Refactor the three types' .__init__() to make them a bit simpler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A previous commit narrowed the type of
QAPISchemaAlternateType.variants from QAPISchemaVariants to
QAPISchemaAlternatives. Rename it to .alternatives.
Same for .__init__() parameter @variants.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A previous commit narrowed the type of QAPISchemaObjectType.variants
from QAPISchemaVariants to QAPISchemaBranches. Rename it to
.branches.
Same for .__init__() parameter @variants.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A previous commit narrowed the type of .visit_alternate_type()
parameter @variants from QAPISchemaVariants to QAPISchemaAlternatives.
Rename it to @alternatives.
One of them passes @alternatives to helper function
gen_visit_alternate(). Rename its @variants parameter to
@alternatives as well.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The previous commit narrowed the type of .visit_object_type()
parameter @variants from QAPISchemaVariants to QAPISchemaBranches.
Rename it to @branches.
Same for .visit_object_type_flat().
A few of these pass @branches to helper functions:
QAPISchemaGenRSTVisitor.visit_object_type() to ._nodes_for_members()
and ._nodes_for_variant_when(), and
QAPISchemaGenVisitVisitor.visit_object_type() to
gen_visit_object_members(). Rename the helpers' @variants parameters
to @branches as well.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
QAPISchemaVariants represents either a union type's branches, or an
alternate type's alternatives. Much of its code is conditional on
which one it actually is.
Create QAPISchemaBranches for branches, and QAPISchemaAlternatives for
alternatives, both subtypes of QAPISchemaVariants.
Replace QAPISchemaVariants by one of them where possible. Keep it
only where we actually deal with either of them.
QAPISchemaVariants.__init__() takes @tag_name and @tag_member, where
exactly one must be None: @tag_name for alternatives, @tag_member for
branches. Let QAPISchemaBranches.__init__() take just @tag_name, and
QAPISchemaAlternatives.__init__() take just @tag_member.
A later patch will move the conditional code to the subtypes.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu_plugin_vcpu_exit_hook() is specific to TCG plugins,
so must be restricted to it in cpu_common_unrealizefn(),
similarly to how qemu_plugin_create_vcpu_state() is
restricted in the cpu_common_realizefn() counterpart.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240429213050.55177-2-philmd@linaro.org>
Extract page-protection definitions from "exec/cpu-all.h"
to "exec/page-protection.h".
The list of files requiring the new header was generated
using:
$ git grep -wE \
'PAGE_(READ|WRITE|EXEC|RWX|VALID|ANON|RESERVED|TARGET_.|PASSTHROUGH)'
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240427155714.53669-3-philmd@linaro.org>
Currently, we pass env to every generated helper. When the semantics of
the instruction only depend on the arguments, this is unnecessary and
adds extra overhead to the helper call.
We add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that don't get
the ptr to env.
The A2_nop and SA1_setin1 instructions end up with no arguments. This
results in a "old-style function definition" error from the compiler, so
we write overrides for them.
With this change, the number of helpers with env argument is
idef-parser enabled: 329 total, 23 with env
idef-parser disabled: 1543 total, 550 with env
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240214042726.19290-4-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
We divide gen_analyze_funcs.py into 3 phases
Declare the operands
Analyze the register reads
Analyze the register writes
We also create special versions of ctx_log_*_read for new operands
Check that the operand is written before the read
This is a precursor to improving the analysis for short-circuiting
the packet semantics in a subsequent commit
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240201103340.119081-2-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
The sun4u machine has an IOMMU and therefore it is possible to program it such
that the virtio-device IOVA does not map directly to the CPU physical address.
This is not a problem with Linux which always maps the IOVA directly to the CPU
physical address, however it is required for the NetBSD virtio driver where this
is not the case.
Set the sun4u machine defaults for all virtio devices so that disable-legacy=on
and iommu_platform=on to ensure a default configuration will allow virtio
devices to function correctly on both Linux and NetBSD.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20240418205730.31396-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The output of "-cpu help" is currently rather confusing to the users:
It might not be fully clear which part of the output defines the CPU
names since the CPU names contain white spaces (which we later have to
convert into dashes internally). At best it's at least a nuisance since
the users might need to specify the CPU names with quoting on the command
line if they are not aware of the fact that the CPU names could be written
with dashes instead. So let's finally clean up this mess by using dashes
instead of white spaces for the CPU names, like we're doing it internally
later (and like we're doing it in most other targets of QEMU).
Note that it is still possible to pass the CPU names with spaces to the
"-cpu" option, since sparc_cpu_type_name() still translates those to "-".
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2141
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240419084812.504779-3-thuth@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Commit b447378e12 ("qom/object: Limit type names to alphanumerical ...")
cut down the amount of allowed characters for QOM types to a saner set.
The "+" character was meant to be included in this set, so we had to
add a hack there to still allow the legacy names of POWER and Sparc64
CPUs. However, instead of putting such a hack in the common QOM code,
there is a much better place to do this: The sparc_cpu_class_by_name()
function which is used to look up the names of all Sparc CPUs.
Thus let's finally get rid of the "+" in the Sparc CPU names, and provide
backward compatibility for the old names via some simple checks in the
sparc_cpu_class_by_name() function.
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240419084812.504779-2-thuth@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Richard Henderson explained on IRC:
bcond_internal() used to insist that both branch
destination and branch fallthrough are use_goto_tb;
if not, we'd use movcond to compute an indirect jump.
But it's perfectly fine for e.g. the branch fallthrough
to use_goto_tb, and the branch destination to use
an indirect branch.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424234436.995410-4-richard.henderson@linaro.org>
[PMD: Split bigger patch, part 4/5]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240503072014.24751-7-philmd@linaro.org>
qga/commands-posix.c does not compile on FreeBSD due to a confusion
between "chpasswdata" (wrong) and "chpasswddata" (used in the #else
branch).
Fixes: 0e5b75a390 ("qga/commands-posix: qmp_guest_set_user_password: use ga_run_command helper")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We only support the most recent two versions of macOS (currently
macOS 13 Ventura and macOS 14 Sonoma), and our ui/cocoa.m code
already assumes at least macOS 12 Monterey or better, because it uses
NSScreen safeAreaInsets, which is 12.0-or-newer.
Remove the ifdefs that were providing backwards compatibility for
building on 10.12 and earlier versions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240502142904.62644-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Benchmark each acceleration function vs an aligned buffer of zeros.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because non-embedded aarch64 is expected to have AdvSIMD enabled, merely
double-check with the compiler flags for __ARM_NEON and don't bother with
a runtime check. Otherwise, model the loop after the x86 SSE2 function.
Use UMAXV for the vector reduction. This is 3 cycles on cortex-a76 and
2 cycles on neoverse-n1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because the three alternatives are monotonic, we don't need
to keep a couple of bitmasks, just identify the strongest
alternative at startup.
Generalize test_buffer_is_zero_next_accel and init_accel
by always defining an accel_table array.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split less-than and greater-than 256 cases.
Use unaligned accesses for head and tail.
Avoid using out-of-bounds pointers in loop boundary conditions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Increase unroll factor in SIMD loops from 4x to 8x in order to move
their bottlenecks from ALU port contention to load issue rate (two loads
per cycle on popular x86 implementations).
Avoid using out-of-bounds pointers in loop boundary conditions.
Follow SSE2 implementation strategy in the AVX2 variant. Avoid use of
PTEST, which is not profitable there (like in the removed SSE4 variant).
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-6-amonakov@ispras.ru>
Use of prefetching in bufferiszero.c is quite questionable:
- prefetches are issued just a few CPU cycles before the corresponding
line would be hit by demand loads;
- they are done for simple access patterns, i.e. where hardware
prefetchers can perform better;
- they compete for load ports in loops that should be limited by load
port throughput rather than ALU throughput.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-5-amonakov@ispras.ru>
Test for length >= 256 inline, where is is often a constant.
Before calling into the accelerated routine, sample three bytes
from the buffer, which handles most non-zero buffers.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Message-Id: <20240206204809.9859-3-amonakov@ispras.ru>
[rth: Use __builtin_constant_p; move the indirect call out of line.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The SSE4.1 variant is virtually identical to the SSE2 variant, except
for using 'PTEST+JNZ' in place of 'PCMPEQB+PMOVMSKB+CMP+JNE' for testing
if an SSE register is all zeroes. The PTEST instruction decodes to two
uops, so it can be handled only by the complex decoder, and since
CMP+JNE are macro-fused, both sequences decode to three uops. The uops
comprising the PTEST instruction dispatch to p0 and p5 on Intel CPUs, so
PCMPEQB+PMOVMSKB is comparatively more flexible from dispatch
standpoint.
Hence, the use of PTEST brings no benefit from throughput standpoint.
Its latency is not important, since it feeds only a conditional jump,
which terminates the dependency chain.
I never observed PTEST variants to be faster on real hardware.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-2-amonakov@ispras.ru>
Migration code needs no private fields of the coroutine backend.
Include the "regular" coroutine.h header.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let hw/hyperv/hyperv.c and hw/intc/s390_flic.c handle (respectively)
SynIC and adapter routes, removing the code from target-independent
files. This also removes the only occurrence of AdapterInfo outside
s390 code, so remove that from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For types that are embedded in structs defined by pci.h, the definition
is pretty much required to be available. Remove them from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hw/core/cpu.h is already using struct forward declarations in some cases
to avoid inclusions, and otherwise CPUAddressSpace and CPUJumpCache
are only used together with their definition. CPUTLBEntryFull is
always used when their definition is available. Remove all three
from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Basically all uses of GraphicHwOps are defining an instance of it, which requires the
full definition of the struct. It is pointless to have it in typedefs.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
They are needed in very few places, which already depends on other generated QAPI
files. The benefit of having these types in typedefs.h is small.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MonitorDef is defined by hmp-target.h, and all users except one already
include it; the reason why the stubs do not include it, is because
hmp-target.h currently can only be used in files that are compiled
per target. However, that is easily fixed. Because the benefit of
having MonitorDef in typedefs.h is very small, do it and remove the
type from typedefs.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is defined and referred to exclusively from a .c file.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using QemuLockable almost always requires going through QEMU_MAKE_LOCKABLE().
Therefore, there is little point in having the typedef always present. Move
it to lockable.h, with only a small adjustment to coroutine.h (which has
a tricky co-dependency with lockable.h due to defining CoMutex *and*
using QemuLockable as a part of the CoQueue API).
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move it to the existing "PIC related things" header, hw/intc/i8259.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QemuOpt is basically an internal data structure. It has no business
being defined except if you need functions from include/qemu/option.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Exactly nobody needs it there. Place the typedef in the header
that defines the struct.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Exactly nobody needs them there. Place the typedef in the header
that defines the struct.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is needed in very few places, which already depend on other parts of
qdev-core.h files. The benefit of having it in typedefs.h is small.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only FWCfgState is used as part of APIs such as acpi_ghes_add_fw_cfg.
Everything else need not be in typedefs.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If virgl and opengl are not available, the build process creates a useless
libvirtio-vga-gl module that does not have any device in it. Follow the
example of virtio-vga-rutabaga and do not build the module at all in that
case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoids an explicit use of sizeof(). The GLib allocation macros
ensure that the multiplication by the size of the element
uses the right type and does not overflow.
While at it, change bitmap_new() to use g_new0 directly. Its current
impl of calling bitmap_try_new() followed by a plain abort() has
worse diagnostics than g_new0, which uses g_error to report the actual
allocation size that failed.
Cc: qemu-trivial@nongnu.org
Cc: Roman Kiryanov <rkir@google.com>
Reviewed-by: Daniel Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Boards have been switched to use "default y" and are now listed
in default-configs/*.mak only for convenience.
Document this change and the new possibilities that it allows.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with Xtensa.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with TriCore.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with SPARC and SPARC64.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with SH.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with s390.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with RX.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with RISC-V.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with PowerPC/POWER.
No changes to generated config-devices.mak files, other than
adding CONFIG_PPC to the ppc64-softmmu target.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with OpenRISC.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with MIPS.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
MIPS boards may only be available for big-endian or only for
little-endian emulators, add a symbol so that this can be described
with a "depends on" clause.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with Microblaze.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with m68k.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with Loongarch.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with i386.
No changes to generated config-devices.mak files, other than
adding CONFIG_I386 to the x86_64-softmmu target.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with PARISC.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with CRIS.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with AVR.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For ARM targets, boards that require TCG are already using "default y".
Switch ARM_VIRT to the same selection mechanism.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Start with Alpha.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
target/ppc/kvm.c calls out to code in hw/ppc/spapr*.c; that code is
not present and fails to link if CONFIG_PSERIES is not enabled.
Adjust kvm.c to depend on CONFIG_PSERIES instead of TARGET_PPC64,
and compile out anything that requires cap_papr, because only
the pseries machine will call kvmppc_set_papr().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sparc-softmmu is able to run a subset of qtests when compiled --without-default-devices,
so use it instead of x86_64-softmmu for the msys2 run.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM code might have to call functions on the PCIDevice that is
passed to kvm_arch_fixup_msi_route(). This fails in the case
where --without-default-devices is used and no board is
configured. While this is not really a useful configuration,
and therefore setting up stubs for CONFIG_PCI is overkill,
failing the build is impolite. Just include the PCI
subsystem if kvm_arch_fixup_msi_route() requires it, as
is the case for ARM and x86.
Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When emulated with QEMU, interrupts will never come in the following
loop. However, if the NOP instruction is uncommented, interrupts will
fire as normal.
loop:
cli
call do_sti
jmp loop
do_sti:
sti
# nop
ret
This behavior is different from that of a real processor. For example,
if KVM is enabled, interrupts will always fire regardless of whether the
NOP instruction is commented or not. Also, the Intel Software Developer
Manual states that after the STI instruction is executed, the interrupt
inhibit should end as soon as the next instruction (e.g., the RET
instruction if the NOP instruction is commented) is executed.
This problem is caused because the previous code may choose not to end
the TB even if the HF_INHIBIT_IRQ_MASK has just been reset (e.g., in the
case where the STI instruction is immediately followed by the RET
instruction), so that IRQs may not have a change to trigger. This commit
fixes the problem by always terminating the current TB to give IRQs a
chance to trigger when HF_INHIBIT_IRQ_MASK is reset.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
Message-ID: <20240415064518.4951-4-lrh2000@pku.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
plugins: Rewrite plugin tcg expansion
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmYyUpkdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV98VAgAoTqIWPHtPJOS800G
# TlFuQjkEzQCPSKAh6ZbotsAMvfNwBloPpdrUlFr/jT7mURjEl2B7UC/4LzdhuGeQ
# U/xZt5rXsYvyfS3VwLf8pKBIscF7XjJ1rdfYMvBg9XaNp5VV0aEIk3+6P0uYtzXG
# cREF0uCYfdK6uoiuifhqRAkgrNnamdwpPbbfvzDQI13wICW7SfR7dcd629clVZ1O
# QvD1M4bpTWyhClbZzaoHqyPs+HQEM/AY0wOTfYZNbQBu6zFZXNDZCvYhIEWonPBO
# AKe5KWUrQMwLJhRVejaSSZZDjMdcz3HLaGJppP89/WB+gpY09+LsiuqT7k5c12Bw
# ueLEhw==
# =mn63
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 01 May 2024 07:32:57 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240501' of https://gitlab.com/rth7680/qemu:
plugins: Update the documentation block for plugin-gen.c
plugins: Inline plugin_gen_empty_callback
plugins: Merge qemu_plugin_tb_insn_get to plugin-gen.c
plugins: Split out common cb expanders
plugins: Replace pr_ops with a proper debug dump flag
plugins: Introduce PLUGIN_CB_MEM_REGULAR
plugins: Simplify callback queues
tcg: Remove INDEX_op_plugin_cb_{start,end}
tcg: Remove TCG_CALL_PLUGIN
plugins: Remove plugin helpers
plugins: Use emit_before_op for PLUGIN_GEN_FROM_MEM
plugins: Use emit_before_op for PLUGIN_GEN_FROM_INSN
plugins: Add PLUGIN_GEN_AFTER_TB
plugins: Use emit_before_op for PLUGIN_GEN_FROM_TB
plugins: Use emit_before_op for PLUGIN_GEN_AFTER_INSN
plugins: Create TCGHelperInfo for all out-of-line callbacks
plugins: Move function pointer in qemu_plugin_dyn_cb
plugins: Zero new qemu_plugin_dyn_cb entries
tcg: Pass function pointer to tcg_gen_call*
tcg: Make tcg/helper-info.h self-contained
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When executing guest commands in *nix environment, we repeat the same
fork/exec pattern multiple times. Let's just separate it into a single
helper which would also be able to feed input data into the launched
process' stdin. This way we can avoid code duplication.
To keep the history more bisectable, let's replace qmp commands
implementations one by one. Also add G_GNUC_UNUSED attribute to the
helper and remove it in the next commit.
Originally-by: Yuri Pudgorodskiy <yur@virtuozzo.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Link: https://lore.kernel.org/r/20240320161648.158226-3-andrey.drobyshev@virtuozzo.com
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Since the commit 25b5ff1a86 ("qga: add mountpoint usage info to
GuestFilesystemInfo") we have 2 values reported in guest-get-fsinfo:
used = (f_blocks - f_bfree), total = (f_blocks - f_bfree + f_bavail) as
returned by statvfs(3). While on Windows guests that's all we can get
with GetDiskFreeSpaceExA(), on POSIX guests we might also be interested in
total file system size, as it's visible for root user. Let's add an
optional field 'total-bytes-privileged' to GuestFilesystemInfo struct,
which'd only be reported on POSIX and represent f_blocks value as returned
by statvfs(3).
While here, also tweak the docs to reflect better where those values
come from.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Link: https://lore.kernel.org/r/20240320161648.158226-2-andrey.drobyshev@virtuozzo.com
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Merge qemu_plugin_insn_alloc and qemu_plugin_tb_insn_get into
plugin_gen_insn_start, since it is used nowhere else.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The DEBUG_PLUGIN_GEN_OPS ifdef is replaced with "-d op_plugin".
The second pr_ops call can be obtained with "-d op".
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have qemu_plugin_dyn_cb.type to differentiate the various
callback types, so we do not need to keep them in separate queues.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since we no longer emit plugin helpers during the initial code
translation phase, we don't need to specially mark plugin helpers.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a new plugin_mem_cb op to hold the address temp
and meminfo computed by tcg-op-ldst.c. Because this now
has its own opcode, we no longer need PLUGIN_GEN_FROM_MEM.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
By having the qemu_plugin_cb_flags be recorded in the TCGHelperInfo,
we no longer need to distinguish PLUGIN_CB_REGULAR from
PLUGIN_CB_REGULAR_R, so place all TB callbacks in the same queue.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a new plugin_cb op and migrate one operation.
By using emit_before_op, we do not need to emit opcodes
early and modify them later -- we can simply emit the
final set of opcodes once.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The out-of-line function pointer is mutually exclusive
with inline expansion, so move it into the union.
Wrap the pointer in a structure named 'regular' to match
PLUGIN_CB_REGULAR.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For normal helpers, read the function pointer from the
structure earlier. For plugins, this will allow the
function pointer to come from elsewhere.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Clean-ups for "errp" handling in s390x cpu_model code
* Fix a possible abort in the "edu" device
* Add missing qga stubs for stand-alone qga builds and re-enable qga-ssh-test
* Fix memory corruption caused by the stm32l4x5 uart device
* Update the s390x custom runner to Ubuntu 22.04
* Fix READ NATIVE MAX ADDRESS IDE commands to avoid a possible crash
* Shorten the runtime of Cirrus-CI jobs
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmYwmaMRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUCERAAss5PJMG8rI4i4X/3nW49JYTlPOpgm/YX
# /UWF+eHUlqaqDdE0s+Pdw4Ozo3hXQt/E/CkcyflUTzVpnZtpv9vkhNWyjOoPV31v
# GQyQEzGvxZXl2S595XefyAyaMTP5maBhUTlyZWJo385cQraa60Ot5d4Mibr2CobY
# gIBRxEGB/frJYpbHJPxd/FxJV120gtuWAdZwGGYYYjwMzf2IKu2veODB8CnUErlX
# WNUsIzjtAslfh8Ek2ZmPzD7uktCUeigkukqIrLC1oEU3wzbJHkISv1kXCKPW/Nf6
# ISjVa5TqGwkiiF8fw9aYKvWrnPJS7JkhXw7Gz+b39d846kUdNyDfgLcYJeNS3cZ2
# R1xgR9B6hX8ZmikMbGC+0/Sv15u2Yr+bFxJBTJzq6zdOAb9EJNQY1hW2w/Lbrg3X
# LjY+ltcVweoSILT6AE6vGDPCHfBzO+6FcptFvw7ePvRGOlwAPZ3tEB9G2LEbCYgg
# BjWNP4aRuSfbUebO4x4Todz65WN8aY1EIBXORU/wgUlF2+zajWiOI5JRDKjWz2qQ
# gAMeCbLplli5bYrChWtouRIXtb061cQloULddu/SRFcaJOlV3SCzx4JfN15pU90s
# jRMIhMESAEj4NSfclhxsOiYp3ywZTvlQsVA6MgPlu2i3HJakQnt5zbg59TesRn2d
# r5PfAk83UnA=
# =0OB7
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Apr 2024 12:11:31 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-04-30' of https://gitlab.com/thuth/qemu:
.gitlab-ci.d/cirrus: Remove the netbsd and openbsd jobs
.gitlab-ci.d/cirrus.yml: Shorten the runtime of the macOS and FreeBSD jobs
tests/qtest/ide-test: Verify READ NATIVE MAX ADDRESS is not limited
hw/ide/core.c (cmd_read_native_max): Avoid limited device parameters
gitlab: remove stale s390x-all-linux-static conf hacks
gitlab: migrate the s390x custom machine to 22.04
build-environment: make some packages optional
hw/char/stm32l4x5_usart: Fix memory corruption by adding correct class_size
qga: Re-enable the qga-ssh-test when running without fuzzing
stubs: Add missing qga stubs
hw: misc: edu: use qemu_log_mask instead of hw_error
hw: misc: edu: rename local vars in edu_check_range
hw: misc: edu: fix 2 off-by-one errors
target/s390x/cpu_models_sysemu: Drop local @err in apply_cpu_model()
target/s390x/cpu_models: Make kvm_s390_apply_cpu_model() return boolean
target/s390x/cpu_models: Drop local @err in get_max_cpu_model()
target/s390x/cpu_models: Make kvm_s390_get_host_cpu_model() return boolean
target/s390x/cpu_model: Drop local @err in s390_realize_cpu_model()
target/s390x/cpu_model: Make check_compatibility() return boolean
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
"make check-qtest-aarch64" recently started failing on FreeBSD builds,
and valgrind on Linux also detected that there is something fishy with
the new stm32l4x5-usart: The code forgot to set the correct class_size
here, so the various class_init functions in this file wrote beyond
the allocated buffer when setting the subc->type field.
Fixes: 4fb37aea7e ("hw/char: Implement STM32L4x5 USART skeleton")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240429075908.36302-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The DMA descriptor structures for this device have
a set of "address extension" fields which extend the 32
bit source addresses with an extra 16 bits to give a
48 bit address:
https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field
However, we misimplemented this address extension in several ways:
* we only extracted 12 bits of the extension fields, not 16
* we didn't shift the extension field up far enough
* we accidentally did the shift as 32-bit arithmetic, which
meant that we would have an overflow instead of setting
bits [47:32] of the resulting 64-bit address
Add a type cast and use extract64() instead of extract32()
to avoid integer overflow on addition. Fix bit fields
extraction according to documentation.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: d3c6369a96 ("introduce xlnx-dpdma")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Message-id: 20240428181131.23801-1-adiupina@astralinux.ru
[PMM: adjusted commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In previous versions of the Arm architecture, the frequency of the
generic timers as reported in CNTFRQ_EL0 could be any IMPDEF value,
and for QEMU we picked 62.5MHz, giving a timer tick period of 16ns.
In Armv8.6, the architecture standardized this frequency to 1GHz.
Because there is no ID register feature field that indicates whether
a CPU is v8.6 or that it ought to have this counter frequency, we
implement this by changing our default CNTFRQ value for all CPUs,
with exceptions for backwards compatibility:
* CPU types which we already implement will retain the old
default value. None of these are v8.6 CPUs, so this is
architecturally OK.
* CPUs used in versioned machine types with a version of 9.0
or earlier will retain the old default value.
The upshot is that the only CPU type that changes is 'max'; but any
new type we add in future (whether v8.6 or not) will also get the new
1GHz default.
It remains the case that the machine model can override the default
value via the 'cntfrq' QOM property (regardless of the CPU type).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240426122913.3427983-5-peter.maydell@linaro.org
Currently the sbsa_gdwt watchdog device hardcodes its frequency at
62.5MHz. In real hardware, this watchdog is supposed to be driven
from the system counter, which also drives the CPU generic timers.
Newer CPU types (in particular from Armv8.6) should have a CPU
generic timer frequency of 1GHz, so we can't leave the watchdog
on the old QEMU default of 62.5GHz.
Make the frequency a QOM property so it can be set by the board,
and have our only board that uses this device set that frequency
to the same value it sets the CPU frequency.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240426122913.3427983-4-peter.maydell@linaro.org
Currently QEMU CPUs always run with a generic timer counter frequency
of 62.5MHz, but ARMv8.6 CPUs will run at 1GHz. For older versions of
the TF-A firmware that sbsa-ref runs, the frequency of the generic
timer is hardcoded into the firmware, and so if the CPU actually has
a different frequency then timers in the guest will be set
incorrectly.
The default frequency used by the 'max' CPU is about to change, so
make the sbsa-ref board force the CPU frequency to the value which
the firmware expects.
Newer versions of TF-A will read the frequency from the CPU's
CNTFRQ_EL0 register:
4c77fac98d
so in the longer term we could make this board use the 1GHz
frequency. We will need to make sure we update the binaries used
by our avocado test
Aarch64SbsarefMachine.test_sbsaref_alpine_linux_max_pauth_impdef
before we can do that.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240426122913.3427983-3-peter.maydell@linaro.org
The generic timer frequency is settable by board code via a QOM
property "cntfrq", but otherwise defaults to 62.5MHz. The way this
is done includes some complication resulting from how this was
originally a fixed value with no QOM property. Clean it up:
* always set cpu->gt_cntfrq_hz to some sensible value, whether
the CPU has the generic timer or not, and whether it's system
or user-only emulation
* this means we can always use gt_cntfrq_hz, and never need
the old GTIMER_SCALE define
* set the default value in exactly one place, in the realize fn
The aim here is to pave the way for handling the ARMv8.6 requirement
that the generic timer frequency is always 1GHz. We're going to do
that by having old CPU types keep their legacy-in-QEMU behaviour and
having the default for any new CPU types be a 1GHz rather han 62.5MHz
cntfrq, so we want the point where the default is decided to be in
one place, and in code, not in a DEFINE_PROP_UINT64() initializer.
This commit should have no behavioural changes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240426122913.3427983-2-peter.maydell@linaro.org
FEAT_Spec_FPACC is a feature describing speculative behaviour in the
event of a PAC authontication failure when FEAT_FPACCOMBINE is
implemented. FEAT_Spec_FPACC means that the speculative use of
pointers processed by a PAC Authentication is not materially
different in terms of the impact on cached microarchitectural state
(caches, TLBs, etc) between passing and failing of the PAC
Authentication.
QEMU doesn't do speculative execution, so we can advertise
this feature.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-6-peter.maydell@linaro.org
Newer versions of the Arm ARM (e.g. rev K.a) now define fields for
ID_AA64MMFR3_EL1. Implement this register, so that we can set the
fields if we need to. There's no behaviour change here since we
don't currently set the register value to non-zero.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-5-peter.maydell@linaro.org
FEAT_ETS2 is a tighter set of guarantees about memory ordering
involving translation table walks than the old FEAT_ETS; FEAT_ETS has
been retired from the Arm ARM and the old ID_AA64MMFR1.ETS == 1
now gives no greater guarantees than ETS == 0.
FEAT_ETS2 requires:
* the virtual address of a load or store that appears in program
order after a DSB cannot be translated until after the DSB
completes (section B2.10.9)
* TLB maintenance operations that only affect translations without
execute permission are guaranteed complete after a DSB
(R_BLDZX)
* if a memory access RW2 is ordered-before memory access RW2,
then RW1 is also ordered-before any translation table walk
generated by RW2 that generates a Translation, Address size
or Access flag fault (R_NNFPF, I_CLGHP)
As with FEAT_ETS, QEMU is already compliant, because we do not
reorder translation table walk memory accesses relative to other
memory accesses, and we always guarantee to have finished TLB
maintenance as soon as the TLB op is done.
Update the documentation to list FEAT_ETS2 instead of the
no-longer-existent FEAT_ETS, and update the 'max' CPU ID registers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-4-peter.maydell@linaro.org
FEAT_CSV2_3 adds a mechanism to identify if hardware cannot disclose
information about whether branch targets and branch history trained
in one hardware described context can control speculative execution
in a different hardware context.
There is no branch prediction in TCG, so we don't need to do anything
to be compliant with this. Upadte the '-cpu max' ID registers to
advertise the feature.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-3-peter.maydell@linaro.org
As of version DDI0487K.a of the Arm ARM, some architectural features
which previously didn't have official names have been named. Add
these to the list of features which QEMU's TCG emulation supports.
Mostly these are features which we thought of as part of baseline 8.0
support. For SVE and SVE2, the names have been brought into line
with the FEAT_* naming convention of other extensions, and some
sub-components split into separate FEAT_ items. In a few cases (eg
FEAT_CCIDX, FEAT_DPB2) the omission from our list was just an oversight.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240418152004.2106516-2-peter.maydell@linaro.org
clock_propagate() has an assert that clk->source is NULL, i.e. that
you are calling it on a clock which has no source clock. This made
sense in the original design where the only way for a clock's
frequency to change if it had a source clock was when that source
clock changed. However, we subsequently added multiplier/divider
support, but didn't look at what that meant for propagation.
If a clock-management device changes the multiplier or divider value
on a clock, it needs to propagate that change down to child clocks,
even if the clock has a source clock set. So the assertion is now
incorrect.
Remove the assertion.
Signed-off-by: Raphael Poggi <raphael.poggi@lynxleap.co.uk>
Message-id: 20240419162951.23558-1-raphael.poggi@lynxleap.co.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Rewrote the commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During the past months, the netbsd and openbsd jobs in the Cirrus-CI
were broken most of the time - the setup to run a BSD in KVM on Cirrus-CI
from gitlab via the cirrus-run script was very fragile, and since the
jobs were not run by default, it used to bitrot very fast.
Now Cirrus-CI also introduce a limit on the amount of free CI minutes
that you get there, so it is not appealing at all anymore to run
these BSDs in this setup - it's better to run the checks locally via
"make vm-build-openbsd" and "make vm-build-netbsd" instead. Thus let's
remove these CI jobs now.
Message-ID: <20240426113742.654748-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cirrus-CI introduced limitations to the free CI minutes. To avoid that
we are consuming them too fast, let's drop the usual targets that are
not that important since they are either a subset of another target
(like i386 or ppc being a subset of x86_64 or ppc64 respectively), or
since there is still a similar target with the opposite endianness
(like xtensa/xtensael, microblaze/microblazeel etc.).
Message-ID: <20240429100113.53357-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Verify that the ATA command READ NATIVE MAX ADDRESS returns the last
valid CHS tuple for the native device rather than any limit
established by INITIALIZE DEVICE PARAMETERS.
Signed-off-by: Lev Kujawski <lkujaw@mailbox.org>
Message-ID: <20221010085229.2431276-2-lkujaw@mailbox.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Always use the native CHS device parameters for the ATA commands READ
NATIVE MAX ADDRESS and READ NATIVE MAX ADDRESS EXT, not those limited
by the ATA command INITIALIZE_DEVICE_PARAMETERS (introduced in patch
176e4961, hw/ide/core.c: Implement ATA INITIALIZE_DEVICE_PARAMETERS
command, 2022-07-07.)
As stated by the ATA/ATAPI specification, "[t]he native maximum is the
highest address accepted by the device in the factory default
condition." Therefore this patch substitutes the native values in
drive_heads and drive_sectors before calling ide_set_sector().
One consequence of the prior behavior was that setting zero sectors
per track could lead to an FPE within ide_set_sector(). Thanks to
Alexander Bulekov for reporting this issue.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1243
Signed-off-by: Lev Kujawski <lkujaw@mailbox.org>
Message-ID: <20221010085229.2431276-1-lkujaw@mailbox.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
"make check-qtest-aarch64" recently started failing on FreeBSD builds,
and valgrind on Linux also detected that there is something fishy with
the new stm32l4x5-usart: The code forgot to set the correct class_size
here, so the various class_init functions in this file wrote beyond
the allocated buffer when setting the subc->type field.
Fixes: 4fb37aea7e ("hw/char: Implement STM32L4x5 USART skeleton")
Message-ID: <20240429075908.36302-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
According to the comment in qga/meson.build, the test got disabled
since there were problems with the fuzzing job. But instead of
disabling this test completely, we should still be fine running
it when fuzzing is disabled.
Message-ID: <20240426162348.684143-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Compilation QGA without system and user fails
./configure --disable-system --disable-user --enable-guest-agent
Link failure:
/usr/bin/ld: libqemuutil.a.p/util_main-loop.c.o: in function
`os_host_main_loop_wait':
../util/main-loop.c:303: undefined reference to `replay_mutex_unlock'
/usr/bin/ld: ../util/main-loop.c:307: undefined reference to
`replay_mutex_lock'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function
`error_printf':
../util/error-report.c:38: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function
`vreport':
../util/error-report.c:225: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_qemu-timer.c.o: in function
`timerlist_run_timers':
../util/qemu-timer.c:562: undefined reference to `replay_checkpoint'
/usr/bin/ld: ../util/qemu-timer.c:530: undefined reference to
`replay_checkpoint'
/usr/bin/ld: ../util/qemu-timer.c:525: undefined reference to
`replay_checkpoint'
ninja: build stopped: subcommand failed.
Fixes: 3a15604900 ("stubs: include stubs only if needed")
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <20240426121347.18843-2-kkostiuk@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Log a guest error instead of a hardware error when
the guest tries to DMA to / from an invalid address.
Signed-off-by: Chris Friedt <cfriedt@meta.com>
Message-ID: <20221018122551.94567-3-cfriedt@meta.com>
[thuth: Add missing #include statement, fix error reported by checkpatch.pl]
Signed-off-by: Thomas Huth <thuth@redhat.com>
In the case that size1 was zero, because of the explicit
'end1 > addr' check, the range check would fail and the error
message would read as shown below. The correct comparison
is 'end1 >= addr'.
EDU: DMA range 0x40000-0x3ffff out of bounds (0x40000-0x40fff)!
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1254
Signed-off-by: Chris Friedt <cfriedt@meta.com>
[thuth: Adjust patch with regards to the "end1 <= end2" check]
Message-ID: <20221018122551.94567-1-cfriedt@meta.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As error.h suggested, the best practice for callee is to return
something to indicate success / failure.
So make kvm_s390_apply_cpu_model() return boolean and check the
returned boolean in apply_cpu_model() instead of accessing @err.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240425031232.1586401-7-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As error.h suggested, the best practice for callee is to return
something to indicate success / failure.
So make kvm_s390_get_host_cpu_model() return boolean and check the
returned boolean in get_max_cpu_model() instead of accessing @err.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240425031232.1586401-5-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit d424db2354 removed an instance of strerrorname_np() because it
was breaking building with musl libc. A recent RISC-V patch ended up
re-introducing it again by accident.
Put this function in the baddies list in checkpatch.pl to avoid this
situation again. This is what it will look like next time:
$ ./scripts/checkpatch.pl 0001-temp-test.patch
ERROR: use strerror() instead of strerrorname_np()
#22: FILE: target/riscv/kvm/kvm-cpu.c:1058:
+ strerrorname_np(errno));
total: 1 errors, 0 warnings, 10 lines checked
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit d424db2354 excluded some strerrorname_np() instances because they
break musl libc builds. Another instance happened to slip by via commit
d4ff3da8f4.
Remove it before it causes trouble again.
Fixes: d4ff3da8f4 (target/riscv/kvm: initialize 'vlenb' via get-reg-list)
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 1590154ee4 ("target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int'")
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The .mailmap file fixes mistake we already did.
Do not use it when running checkpatch.pl, otherwise
we might commit the very same mistakes.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit f5177798d8 ("scripts: report on author emails
that are mangled by the mailing list") added a check
for qemu-devel@ list, extend the regexp to cover more
such qemu-trivial@, qemu-block@ and qemu-ppc@.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Printing a "PowerPC" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now and use two spaces at the
beginning of the lines for the indentation of the entries instead,
and add a "Available CPUs" in the very first line, like most other
target architectures are doing it for their CPU help output already.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Printing an "s390x" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now!
While we're at it, use two spaces at the beginning of the lines for
the indentation of the entries, and add a "Available CPUs" in the
very first line, like most other target architectures are doing it
for their "-cpu help" output already.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Printing an "x86" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now and use two spaces at the
beginning of the lines for the indentation of the entries instead,
like most other target architectures are doing it for their CPU help
output already.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
libslirp provides a newer slirp_*_hostxfwd API meant for
address-agnostic forwarding instead of the is_udp parameter which is
limited to just TCP/UDP.
This paves the way for IPv6 and Unix socket support.
Signed-off-by: Nicholas Ngai <nicholas@ngai.me>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Breno Leitao <leitao@debian.org>
Message-Id: <20210925214820.18078-1-nicholas@ngai.me>
The following CPUTLBEntry helpers are only used in accel/tcg/cputlb.c:
- tlb_index()
- tlb_entry()
- tlb_read_idx()
- tlb_addr_write()
Move them to this file, allowing to remove the huge "cpu.h" header
inclusion from "exec/cpu_ldst.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240418192525.97451-13-philmd@linaro.org>
Declare 'have_guest_base' in "user/guest-base.h".
Very few files require this header, so explicitly include
it there instead of "exec/cpu-all.h" which is used in many
source files.
Assert this user-specific header is only included from user
emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231211212003.21686-23-philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
The CPUBreakpoint and CPUWatchpoint structures are declared
in "hw/core/cpu.h", which contains declarations related to
CPUState and CPUClass. Some source files only require the
BP/WP definitions and don't need to pull in all CPU* API.
In order to simplify, create a new "exec/breakpoint.h" header.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240418192525.97451-3-philmd@linaro.org>
The MMUAccessType enum is declared in "hw/core/cpu.h".
"hw/core/cpu.h" contains declarations related to CPUState
and CPUClass. Some source files only require MMUAccessType
and don't need to pull in all CPU* declarations. In order
to simplify, create a new "exec/mmu-access-type.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240418192525.97451-2-philmd@linaro.org>
The abi_ptr type is declared in "exec/cpu_ldst.h" with all
the load/store helpers. Some source files requiring abi_ptr
type don't need the load/store helpers. In order to simplify,
create a new "exec/abi_ptr.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231212123401.37493-21-philmd@linaro.org>
"exec/user/abitypes.h" requires:
- "exec/cpu-defs.h" (TARGET_LONG_BITS)
- "exec/tswap.h" (tswap32)
In order to avoid "cpu.h", pick the minimum required headers.
Assert this user-specific header is only included from user
emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231212123401.37493-20-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We usually check target endianess before swapping values,
so target_words_bigendian() declaration makes sense in
"exec/tswap.h" with the target swapping helpers.
Remove "hw/core/cpu.h" when it was only included to get
the target_words_bigendian() declaration.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20231212123401.37493-16-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
HVF has a specific use of the CPUState::vcpu_dirty field
(CPUState::vcpu_dirty is not used by common code).
To make this field accel-specific, add and use a new
@dirty variable in the AccelCPUState structure.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424174506.326-4-philmd@linaro.org>
NVMM has a specific use of the CPUState::vcpu_dirty field
(CPUState::vcpu_dirty is not used by common code).
To make this field accel-specific, add and use a new
@dirty variable in the AccelCPUState structure.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424174506.326-3-philmd@linaro.org>
WHPX has a specific use of the CPUState::vcpu_dirty field
(CPUState::vcpu_dirty is not used by common code).
To make this field accel-specific, add and use a new
@dirty variable in the AccelCPUState structure.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424174506.326-2-philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename "exec/helper-head.h" as "exec/helper-head.h.inc".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-4-philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename 'store-insert-al16.h' as 'store-insert-al16.h.inc'
and 'load-extract-al16-al8.h' as 'load-extract-al16-al8.h.inc'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-3-philmd@linaro.org>
Due to missing headers, when including "tb-jmp-cache.h" we might get:
accel/tcg/tb-jmp-cache.h:21:21: error: field ‘rcu’ has incomplete type
21 | struct rcu_head rcu;
| ^~~
accel/tcg/tb-jmp-cache.h:24:9: error: unknown type name ‘vaddr’
24 | vaddr pc;
| ^~~~~
Add the missing "qemu/rcu.h" and "exec/cpu-common.h" headers.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240111162442.43755-1-philmd@linaro.org>
set_helper_retaddr() is only used in accel/tcg/user-exec.c.
clear_helper_retaddr() is only used in accel/tcg/cpu-exec.c
and accel/tcg/user-exec.c.
No need to expose their definitions to all user-emulation
files including "exec/cpu_ldst.h", move them to a new
"user-retaddr.h" header (restricted to accel/tcg/).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231211212003.21686-19-philmd@linaro.org>
accel/tcg/ files requires the following definitions:
- TARGET_LONG_BITS
- TARGET_PAGE_BITS
- TARGET_PHYS_ADDR_SPACE_BITS
- TCG_GUEST_DEFAULT_MO
The first 3 are defined in "cpu-param.h". The last one
in "cpu.h", with a bunch of definitions irrelevant for
TCG. By moving the TCG_GUEST_DEFAULT_MO definition to
"cpu-param.h", we can simplify various accel/tcg includes.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20231211212003.21686-4-philmd@linaro.org>
"semihosting/uaccess.h" only requires the following headers:
- "exec/cpu-defs.h" for target_ulong,
- "exec/cpu-common.h" for cpu_memory_rw_debug()
- "exec/tswap.h" for tswap32() and tswap64().
Include them instead of the huge "cpu.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <42c6471e-8383-45e0-85ee-e20ca32ecbad@linaro.org>
CPUArchState 'env' field is defined within the ArchCPU structure,
so we need to include each target "cpu.h" header which defines it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Message-Id: <20231211212003.21686-2-philmd@linaro.org>
'NEED_CPU_H' guard target-specific code; it is defined by meson
altogether with the 'CONFIG_TARGET' definition. Rename NEED_CPU_H
as COMPILING_PER_TARGET to clarify its meaning.
Mechanical change running:
$ sed -i s/NEED_CPU_H/COMPILING_PER_TARGET/g $(git grep -l NEED_CPU_H)
then manually add a /* COMPILING_PER_TARGET */ comment
after the '#endif' when the block is large.
Inspired-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240322161439.6448-4-philmd@linaro.org>
nbd_negotiate() is already marked coroutine_fn. And given the fix in
the previous patch to have nbd_negotiate_handle_starttls not create
and wait on a g_main_loop (as that would violate coroutine
constraints), it is worth marking the rest of the related static
functions reachable only during option negotiation as also being
coroutine_fn.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240408160214.1200629-6-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[eblake: drop one spurious coroutine_fn marking]
Signed-off-by: Eric Blake <eblake@redhat.com>
Misc HW patch queue
- Script to compare machines compat_props[] (Maksim)
- Introduce 'module' CPU topology level (Zhao)
- Various cleanups (Thomas, Zhao, Inès, Bernhard)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmYqN3wACgkQ4+MsLN6t
# wN4hTw/9FHsItnEkme/864DRPSP7A9mCGa+JfzJmsL8oUb9fBjXXKm+lNchMLu3B
# uvzfXB2Ea24yf5vyrldo0XlU3i/4GDvqXTI6YFYqBvitGICauYBu+6n2NZh2Y/Pn
# zZCcVo167o0q7dHu2WSrZ6cSUchsF2C80HjuS07QaN2YZ7QMuN1+uqTjCQ/JHQWA
# MH4xHh7cXdfCbbv8iNhMWn6sa+Bw/UyfRcc2W6w9cF5Q5cuuTshgDyd0JBOzkM1i
# Mcul7TuKrSiLUeeeqfTjwtw3rtbNfkelV3ycgvgECFAlzPSjF5a6d/EGdO2zo3T/
# aFZnQBYrb4U0SzsmfXFHW7cSylIc1Jn2CCuZZBIvdVcu8TGDD5XsgZbGoCfKdWxp
# l67qbQJy1Mp3LrRzygJIaxDOfE8fhhRrcIxfK/GoTHaCkqeFRkGjTeiDTVBqAES2
# zs6kUYZyG/xGaa2tsMu+HbtSO5EEqPC2QCdHayY3deW42Kwjj/HFV50Ya8YgYSVp
# gEAjTDOle2dDjlkYud+ymTJz7LnGb3G7q0EZRI9DWolx/bu+uZGQqTSRRre4qFQY
# SgN576hsFGN4NdM7tyJWiiqD/OC9ZeqUx3gGBtmI52Q6obBCE9hcow0fPs55Tk95
# 1YzPrt/3IoPI5ZptCoA8DFiysQ46OLtpIsQO9YcrpJmxWyLDSr0=
# =tm+U
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Apr 2024 03:59:08 AM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240425' of https://github.com/philmd/qemu: (22 commits)
hw/core: Support module-id in numa configuration
hw/core: Introduce module-id as the topology subindex
hw/core/machine: Support modules in -smp
hw/core/machine: Introduce the module as a CPU topology level
hw/i386/pc_sysfw: Remove unused parameter from pc_isa_bios_init()
hw/misc : Correct 5 spaces indents in stm32l4x5_exti
hw/xtensa: Include missing 'exec/cpu-common.h' in 'bootparam.h'
hw/elf_ops: Rename elf_ops.h -> elf_ops.h.inc
hw/cxl/cxl-cdat: Make cxl_doe_cdat_init() return boolean
hw/cxl/cxl-cdat: Make ct3_build_cdat() return boolean
hw/cxl/cxl-cdat: Make ct3_load_cdat() return boolean
hw: Add a Kconfig switch for the TYPE_CPU_CLUSTER device
hw: Fix problem with the A*MPCORE switches in the Kconfig files
hw/riscv/virt: Replace sprintf by g_strdup_printf
hw/misc/imx: Replace sprintf() by snprintf()
hw/misc/applesmc: Simplify DeviceReset handler
target/i386: Move APIC related code to cpu-apic.c
hw/core: Remove check on NEED_CPU_H in tcg-cpu-ops.h
scripts: add script to compare compatibility properties
python/qemu/machine: add method to retrieve QEMUMachine::binary field
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target-arm queue:
* Implement FEAT_NMI and NMI support in the GICv3
* hw/dma: avoid apparent overflow in soc_dma_set_request
* linux-user/flatload.c: Remove unused bFLT shared-library and ZFLAT code
* Add ResetType argument to Resettable hold and exit phase methods
* Add RESET_TYPE_SNAPSHOT_LOAD ResetType
* Implement STM32L4x5 USART
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmYqMhMZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3uVlD/47U3zYP33y4+wJcRScC0QI
# jYd82jS7GhD5YP5QPrIEMaSbDwtYGi4Rez1taaHvZ2fWLg2gE973iixmTaM2mXCd
# xPEqMsRXkFrQnC89K5/v9uR04AvHxoM8J2mD2OKnUT0RVBs38WxCUMLETBsD18/q
# obs1RzDRhEs5BnwwPMm5HI1iQeVvDRe/39O3w3rZfA8DuqerrNOQWuJd43asHYjO
# Gc1QzCGhALlXDoqk11IzjhJ7es8WbJ5XGvrSNe9QLGNJwNsu9oi1Ez+5WK2Eht9r
# eRvGNFjH4kQY1YCShZjhWpdzU9KT0+80KLirMJFcI3vUztrYZ027/rMyKLHVOybw
# YAqgEUELwoGVzacpaJg73f77uknKoXrfTH25DfoLX0yFCB35JHOPcjU4Uq1z1pfV
# I80ZcJBDJ95mXPfyKLrO+0IyVBztLybufedK2aiH16waEGDpgsJv66FB2QRuQBYW
# O0i6/4DEUZmfSpOmr8ct+julz7wCWSjbvo6JFWxzzxvD0M5T3AFKXZI244g1SMdh
# LS8V7WVCVzVJ5mK8Ujp2fVaIIxiBzlXVZrQftWv5rhyDOiIIeP8pdekmPlI6p5HK
# 3/2efzSYNL2UCDZToIq24El/3md/7vHR6DBfBT1/pagxWUstqqLgkJO42jQtTG0E
# JY1cZ/EQY7cqXGrww8lhWA==
# =WEsU
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Apr 2024 03:36:03 AM PDT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20240425' of https://git.linaro.org/people/pmaydell/qemu-arm: (37 commits)
tests/qtest: Add tests for the STM32L4x5 USART
hw/arm: Add the USART to the stm32l4x5 SoC
hw/char/stm32l4x5_usart: Add options for serial parameters setting
hw/char/stm32l4x5_usart: Enable serial read and write
hw/char: Implement STM32L4x5 USART skeleton
reset: Add RESET_TYPE_SNAPSHOT_LOAD
docs/devel/reset: Update to new API for hold and exit phase methods
hw, target: Add ResetType argument to hold and exit phase methods
scripts/coccinelle: New script to add ResetType to hold and exit phases
allwinner-i2c, adm1272: Use device_cold_reset() for software-triggered reset
hw/misc: Don't special case RESET_TYPE_COLD in npcm7xx_clk, gcr
linux-user/flatload.c: Remove unused bFLT shared-library and ZFLAT code
hw/dma: avoid apparent overflow in soc_dma_set_request
hw/arm/virt: Enable NMI support in the GIC if the CPU has FEAT_NMI
target/arm: Add FEAT_NMI to max
hw/intc/arm_gicv3: Report the VINMI interrupt
hw/intc/arm_gicv3: Report the NMI interrupt in gicv3_cpuif_update()
hw/intc/arm_gicv3: Implement NMI interrupt priority
hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()
hw/intc/arm_gicv3: Add NMI handling CPU interface registers
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Update OpenBSD CI image to 7.5
* Update/remove Ubuntu 20.04 CI jobs
* Update (most) CentOS 8 CI jobs to CentOS 9
* Some clean-ups and improvements to travis.yml
* Minor test fixes
* s390x header clean-ups
* Doc updates
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmYqbu4RHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWqeA//cFAzjjmayCRZzuwwCFH0ILPrMNViRLFc
# ZuslNEFygDPl1H1wnw3MKzhHy1hbaC3gf30MjtejU61OMMyxS4exZd1rvw94a0cm
# OEisE8UG52kRsqPwKPktB2bybgX3BZbrFEwp1P0DsvpLTX7wI5nOZyNR4zWOf5Ym
# mODN/MjMFOhWjONOnNDRn4TbySQqolIQBTCq+f1J5Ej74V+p17aC5Fe3xhlFp8Ip
# aRockPW6dLpNt26zx6kKvQSYtkyLgQJSeUUyUCgcla03yzNSuV/kJPUoW0ewa+q8
# DZg1Ru5WJ6O8lELQdYq630cmdwg3e9EeI6q/U/1A11auuLaafOBi0eZW9LdPlrqD
# 6a+zwVn+ipyRdz8eRGZVRGdhJ6XT27YfFuKxdiu4BxnS0LRks4vDcXreIcQmiIUN
# bg/zSp6snCpYf7+GlUReZXWnVx401nu59+BNNKUV0qIxdORNm8kwd9ZpSQwXP/nF
# BMPhj2hoqvWb4C4r3WlTaSPlkJGhkb2lMLucjCbeGrdnmna0RFOFB301fllbpnVm
# 11SRipMEfrj/G5qp4giPLcruzesvRaZm85nmwDyOQWxr5Q0KWWfBVXZMt+qqOckR
# 2SUtLPd9nWruCy7KN15BrOWkmXc+OU8UFUqXIOvflkI6aF1bmFYRyrXgqX2q7QDT
# kEfWnBvBqxw=
# =1uVo
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Apr 2024 07:55:42 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-04-25' of https://gitlab.com/thuth/qemu:
target/s390x: Remove KVM stubs in cpu_models.h
tests/unit: Remove debug statements in test-nested-aio-poll.c
docs/devel: fix minor typo in submitting-a-patch.rst
hw/s390x: Include missing 'cpu.h' header
tests: Update our CI to use CentOS Stream 9 instead of 8
tests/docker/dockerfiles: Run lcitool-refresh after the lcitool update
tests/lcitool/libvirt-ci: Update to the latest master branch
tests: Remove Ubuntu 20.04 container
.travis.yml: Do some more testing with Clang
.travis.yml: Update the jobs to Ubuntu 22.04
.travis.yml: Remove the unused UNRELIABLE environment variable
Revert ".travis.yml: Cache Avocado cache"
tests/vm: update openbsd image to 7.5
docs: i386: pc: Update maximum CPU numbers for PC Q35
tests/qtest : Use `g_assert_cmphex` instead of `g_assert_cmpuint`
MAINTAINERS: update email of Peter Lieven
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have been running this test for almost a year; it
is safe to remove its debug statements, which clutter
CI jobs output:
▶ 88/100 /nested-aio-poll OK
io_read 0x16bb26158
io_poll_true 0x16bb26158
> io_poll_ready
io_read 0x16bb26164
< io_poll_ready
io_poll_true 0x16bb26158
io_poll_false 0x16bb26164
> io_poll_ready
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_read 0x16bb26164
< io_poll_ready
88/100 qemu:unit / test-nested-aio-poll OK
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240422112246.83812-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
"cpu.h" is implicitly included. Include it explicitly to
avoid the following error when refactoring headers:
hw/s390x/s390-stattrib.c:86:40: error: use of undeclared identifier 'TARGET_PAGE_SIZE'
len = sac->peek_stattr(sas, addr / TARGET_PAGE_SIZE, buflen, vals);
^
hw/s390x/s390-stattrib.c:94:58: error: use of undeclared identifier 'TARGET_PAGE_MASK'
addr / TARGET_PAGE_SIZE, len, addr & ~TARGET_PAGE_MASK);
^
hw/s390x/s390-stattrib.c:224:40: error: use of undeclared identifier 'TARGET_PAGE_BITS'
qemu_put_be64(f, (start_gfn << TARGET_PAGE_BITS) | STATTR_FLAG_MORE);
^
In file included from hw/s390x/s390-virtio-ccw.c:17:
hw/s390x/s390-virtio-hcall.h:22:27: error: unknown type name 'CPUS390XState'
int s390_virtio_hypercall(CPUS390XState *env);
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Message-ID: <20240322162822.7391-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Coroutines are not supposed to block. Instead, they should yield.
The client performs TLS upgrade outside of an AIOContext, during
synchronous handshake; this still requires g_main_loop. But the
server responds to TLS upgrade inside a coroutine, so a nested
g_main_loop is wrong. Since the two callbacks no longer share more
than the setting of data.complete and data.error, it's just as easy to
use static helpers instead of trying to share a common code path. It
is also possible to add assertions that no other code is interfering
with the eventual path to qio reaching the callback, whether or not it
required a yield or main loop.
Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation")
Signed-off-by: Zhu Yangyang <zhuyangyang14@huawei.com>
[eblake: move callbacks to their use point, add assertions]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240408160214.1200629-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Module is a level above the core, thereby supporting numa
configuration on the module level can bring user more numa flexibility.
This is the natural further support for module level.
Add module level support in numa configuration.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-5-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In x86, module is the topology level above core, which contains a set
of cores that share certain resources (in current products, the resource
usually includes L2 cache, as well as module scoped features and MSRs).
Though smp.clusters could also share the L2 cache resource [1], there
are following reasons that drive us to introduce the new smp.modules:
* As the CPU topology abstraction in device tree [2], cluster supports
nesting (though currently QEMU hasn't support that). In contrast,
(x86) module does not support nesting.
* Due to nesting, there is great flexibility in sharing resources
on cluster, rather than narrowing cluster down to sharing L2 (and
L3 tags) as the lowest topology level that contains cores.
* Flexible nesting of cluster allows it to correspond to any level
between the x86 package and core.
* In Linux kernel, x86's cluster only represents the L2 cache domain
but QEMU's smp.clusters is the CPU topology level. Linux kernel will
also expose module level topology information in sysfs for x86. To
avoid cluster ambiguity and keep a consistent CPU topology naming
style with the Linux kernel, we introduce module level for x86.
The module is, in existing hardware practice, the lowest layer that
contains the core, while the cluster is able to have a higher
topological scope than the module due to its nesting.
Therefore, place the module between the cluster and the core:
drawer/book/socket/die/cluster/module/core/thread
With the above topological hierarchy order, introduce module level
support in MachineState and MachineClass.
[1]: https://lore.kernel.org/qemu-devel/c3d68005-54e0-b8fe-8dc1-5989fe3c7e69@huawei.com/
[2]: https://www.kernel.org/doc/Documentation/devicetree/bindings/cpu/cpu-topology.txt
Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-2-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename "hw/elf_ops.h" as "hw/elf_ops.h.inc".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-2-philmd@linaro.org>
A9MPCORE, ARM11MPCORE and A15MPCORE are defined twice, once in
hw/cpu/Kconfig and once in hw/arm/Kconfig. This is only possible
by accident, since hw/cpu/Kconfig is never included from hw/Kconfig.
Fix it by declaring the switches only in hw/cpu/Kconfig (since the
related files reside in the hw/cpu/ folder) and by making sure that
the file hw/cpu/Kconfig is now properly included from hw/Kconfig.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240415065655.130099-2-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Have applesmc_find_key() return a const pointer.
Since the returned buffers are not modified in
applesmc_io_data_write(), it is pointless to
delete and re-add the keys in the DeviceReset
handler. Add them once in DeviceRealize, and
discard them in the DeviceUnrealize handler.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240410180819.92332-1-philmd@linaro.org>
Add the basic infrastructure (register read/write, type...)
to implement the STM32L4x5 USART.
Also create different types for the USART, UART and LPUART
of the STM32L4x5 to deduplicate code and enable the
implementation of different behaviors depending on the type.
Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240329174402.60382-2-arnaud.minier@telecom-paris.fr
[PMM: update to new reset hold method signature;
fixed a few checkpatch nits]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some devices and machines need to handle the reset before a vmsave
snapshot is loaded differently -- the main user is the handling of
RNG seed information, which does not want to put a new RNG seed into
a ROM blob when we are doing a snapshot load.
Currently this kind of reset handling is supported only for:
* TYPE_MACHINE reset methods, which take a ShutdownCause argument
* reset functions registered with qemu_register_reset_nosnapshotload
To allow a three-phase-reset device to also distinguish "snapshot
load" reset from the normal kind, add a new ResetType
RESET_TYPE_SNAPSHOT_LOAD. All our existing reset methods ignore
the reset type, so we don't need to update any device code.
Add the enum type, and make qemu_devices_reset() use the
right reset type for the ShutdownCause it is passed. This
allows us to get rid of the device_reset_reason global we
were using to implement qemu_register_reset_nosnapshotload().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-7-peter.maydell@linaro.org
We pass a ResetType argument to the Resettable class enter
phase method, but we don't pass it to hold and exit, even though
the callsites have it readily available. This means that if
a device cared about the ResetType it would need to record it
in the enter phase method to use later on. Pass the type to
all three of the phase methods to avoid having to do that.
Commit created with
for dir in hw target include; do \
spatch --macro-file scripts/cocci-macro-file.h \
--sp-file scripts/coccinelle/reset-type.cocci \
--keep-comments --smpl-spacing --in-place \
--include-headers --dir $dir; done
and no manual edits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-5-peter.maydell@linaro.org
We pass a ResetType argument to the Resettable class enter phase
method, but we don't pass it to hold and exit, even though the
callsites have it readily available. This means that if a device
cared about the ResetType it would need to record it in the enter
phase method to use later on. We should pass the type to all three
of the phase methods to avoid having to do that.
This coccinelle script adds the ResetType argument to the hold and
exit phases of the Resettable interface.
The first part of the script (rules holdfn_assigned, holdfn_defined,
exitfn_assigned, exitfn_defined) update implementations of the
interface within device models, both to change the signature of their
method implementations and to pass on the reset type when they invoke
reset on some other device.
The second part of the script is various special cases:
* method callsites in resettable_phase_hold(), resettable_phase_exit()
and device_phases_reset()
* updating the typedefs for the methods
* isl_pmbus_vr.c has some code where one device's reset method directly
calls the implementation of a different device's method
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-4-peter.maydell@linaro.org
The npcm7xx_clk and npcm7xx_gcr device reset methods look at
the ResetType argument and only handle RESET_TYPE_COLD,
producing a warning if another reset type is passed. This
is different from how every other three-phase-reset method
we have works, and makes it difficult to add new reset types.
A better pattern is "assume that any reset type you don't know
about should be handled like RESET_TYPE_COLD"; switch these
devices to do that. Then adding a new reset type will only
need to touch those devices where its behaviour really needs
to be different from the standard cold reset.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-2-peter.maydell@linaro.org
Ever since the bFLT format support was added in 2006, there has been
a chunk of code in the file guarded by CONFIG_BINFMT_SHARED_FLAT
which is supposedly for shared library support. This is not enabled
and it's not possible to enable it, because if you do you'll run into
the "#error needs checking" in the calc_reloc() function.
Similarly, CONFIG_BINFMT_ZFLAT exists but can't be enabled because of
an "#error code needs checking" in load_flat_file().
This code is obviously unfinished and has never been used; nobody in
the intervening 18 years has complained about this or fixed it, so
just delete the dead code. If anybody ever wants the feature they
can always pull it out of git, or (perhaps better) write it from
scratch based on the current Linux bFLT loader rather than the one of
18 years ago.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240411115313.680433-1-peter.maydell@linaro.org
In soc_dma_set_request() we try to set a bit in a uint64_t, but we
do it with "1 << ch->num", which can't set any bits past 31;
any use for a channel number of 32 or more would fail due to
integer overflow.
This doesn't happen in practice for our current use of this code,
because the worst case is when we call soc_dma_init() with an
argument of 32 for the number of channels, and QEMU builds with
-fwrapv so the shift into the sign bit is well-defined. However,
it's obviously not the intended behaviour of the code.
Add casts to force the shift to be done as 64-bit arithmetic,
allowing up to 64 channels.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: afbb5194d4 ("Handle on-chip DMA controllers in one place, convert OMAP DMA to use it.")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Message-id: 20240409115301.21829-1-abelova@astralinux.ru
[PMM: Edit commit message to clarify that this doesn't actually
bite us in our current usage of this code.]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the CPU implements FEAT_NMI, then turn on the NMI support in the
GICv3 too. It's permitted to have a configuration with FEAT_NMI in
the CPU (and thus NMI support in the CPU interfaces too) but no NMI
support in the distributor and redistributor, but this isn't a very
useful setup as it's close to having no NMI support at all.
We don't need to gate the enabling of NMI in the GIC behind a
machine version property, because none of our current CPUs
implement FEAT_NMI, and '-cpu max' is not something we maintain
migration compatibility across versions for. So we can always
enable the GIC NMI support when the CPU has it.
Neither hvf nor KVM support NMI in the GIC yet, so we don't enable
it unless we're using TCG.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240407081733.3231820-25-ruanjinjie@huawei.com
[PMM: Update comment and commit message]
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If GICD_CTLR_DS bit is zero and the NMI is non-secure, the NMI priority is
higher than 0x80, otherwise it is higher than 0x0. And save the interrupt
non-maskable property in hppi.nmi to deliver NMI exception. Since both GICR
and GICD can deliver NMI, it is both necessary to check whether the pending
irq is NMI in gicv3_redist_update_noirqset and gicv3_update_noirqset.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-21-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for
ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit.
If FEAT_GICv3_NMI is supported, ich_ap_write() should consider ICV_AP1R_EL1.NMI
bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit
should be set or clear according to the Non-maskable property. And the RPR
priority should also update the NMI bit according to the APR priority NMI bit.
By the way, add gicv3_icv_nmiar1_read trace event.
If the hpp irq is a NMI, the icv iar read should return 1022 and trap for
NMI again
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: use cs->nmi_support instead of cs->gic->nmi_support]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-20-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the NMIAR CPU interface registers which deal with acknowledging NMI.
When introduce NMI interrupt, there are some updates to the semantics for the
register ICC_IAR1_EL1 and ICC_HPPIR1_EL1. For ICC_IAR1_EL1 register, it
should return 1022 if the intid has non-maskable property. And for
ICC_NMIAR1_EL1 register, it should return 1023 if the intid do not have
non-maskable property. Howerever, these are not necessary for ICC_HPPIR1_EL1
register.
And the APR and RPR has NMI bits which should be handled correctly.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: Separate out whether cpuif supports NMI from whether the
GIC proper (IRI) supports NMI]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-19-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to Arm GIC section 4.6.3 Interrupt superpriority, the interrupt
with superpriority is always IRQ, never FIQ, so the NMI exception trap entry
behave like IRQ. And VINMI(vIRQ with Superpriority) can be raised from the
GIC or come from the hcrx_el2.HCRX_VINMI bit, VFNMI(vFIQ with Superpriority)
come from the hcrx_el2.HCRX_VFNMI bit.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-13-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When PSTATE.ALLINT is set, an IRQ or FIQ interrupt that is targeted to
ELx, with or without superpriority is masked. As Richard suggested, place
ALLINT bit in PSTATE in env->pstate.
In the pseudocode, AArch64.ExceptionReturn() calls SetPSTATEFromPSR(), which
treats PSTATE.ALLINT as one of the bits which are reinstated from SPSR to
PSTATE regardless of whether this is an illegal exception return or not. So
handle PSTATE.ALLINT the same way as PSTATE.DAIF in the illegal_return exit
path of the exception_return helper. With the change, exception entry and
return are automatically handled.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-3-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This script runs QEMU to obtain compat_props of machines and default
values of different types of drivers to produce comparison table. This
table can be used to compare machine types to choose the most suitable
machine or compare binaries to be sure that migration to the newer version
will save all device properties. Also the json or csv format of this
table can be used to check does a new machine affect the previous ones by
comparing tables with and without the new machine.
Default values (that will be used without machine compat_props) of
properties are needed to fill "holes" in the table (one machine has
the property but another machine not. For instance, 2.12 machine has
`{ "EPYC-" TYPE_X86_CPU, "xlevel", "0x8000000a" }`, but compat_pros of
3.1 machine doesn't have it. Thus, to compare these machines we need to
get unknown value of "EPYC-x86_64-cpu-xlevel" for 3.1 machine. These
unknown values in the table are called "holes". To get values for these
"holes" the script uses list of appropriate methods.)
Notes:
* Some init values from the devices can't be available like properties
from virtio-9p when configure has --disable-virtfs. This situations will
be seen in the table as "unavailable driver".
* Default values can be obtained in an unobvious way, like x86 features.
If the script doesn't know how to get property default value to compare
one machine with another it fills "holes" with "unavailable method". This
is done because script uses whitelist model to get default values of
different types. It means that the method that can't be applied to a new
type that can crash this script. It is better to get an "unavailable
driver" when creating a new machine with new compatible properties than
to break this script. So it turns out a more stable and generic script.
* If the default value can't be obtained because this property doesn't
exist or because this property can't have default value, appropriate
"hole" will be filled by "unknown property" or "no default value"
* If the property is applied to the abstract class, the script collects
default values from all child classes and prints all these classes
* Raw table (--raw flag) should be used with json/csv parameters for
scripts and etc. Human-readable (default) format contains transformed
and simplified values and it doesn't contain lines with the same values
in columns
Example:
./scripts/compare-machine-types.py --mt pc-q35-6.2 pc-q35-7.1
╒══════════════════╤══════════════════════════╤════════════════════════════╤════════════════════════════╕
│ Driver │ Property │ build/qemu-system-x86_64 │ build/qemu-system-x86_64 │
│ │ │ pc-q35-6.2 │ pc-q35-7.1 │
╞══════════════════╪══════════════════════════╪════════════════════════════╪════════════════════════════╡
│ PIIX4_PM │ x-not-migrate-acpi-index │ True │ False │
├──────────────────┼──────────────────────────┼────────────────────────────┼────────────────────────────┤
│ arm-gicv3-common │ force-8-bit-prio │ True │ unavailable driver │
├──────────────────┼──────────────────────────┼────────────────────────────┼────────────────────────────┤
│ nvme-ns │ eui64-default │ True │ False │
├──────────────────┼──────────────────────────┼────────────────────────────┼────────────────────────────┤
│ virtio-mem │ unplugged-inaccessible │ False │ auto │
╘══════════════════╧══════════════════════════╧════════════════════════════╧════════════════════════════╛
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240318213550.155573-5-davydov-max@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
To control that creating new machine type doesn't affect the previous
types (their compat_props) and to check complex compat_props inheritance
we need qmp command to print machine type compatibility properties.
This patch adds the ability to get list of all the compat_props of the
corresponding supported machines for their comparison via new optional
argument of "query-machines" command. Since information on compatibility
properties can increase the command output by a factor of 40, add an
argument to enable it, default off.
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240318213550.155573-3-davydov-max@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
qmp_qom_list_properties can print default values if they are available
as qmp_device_list_properties does, because both of them use the
ObjectPropertyInfo structure with default_value field. This can be useful
when working with "not device" types (e.g. memory-backend).
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240318213550.155573-2-davydov-max@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
RHEL 9 (and thus also the derivatives) have been available since two
years now, so according to QEMU's support policy, we can drop the active
support for the previous major version 8 now.
Another reason for doing this is that Centos Stream 8 will go EOL soon:
https://blog.centos.org/2023/04/end-dates-are-coming-for-centos-stream-8-and-centos-linux-7/
"After May 31, 2024, CentOS Stream 8 will be archived
and no further updates will be provided."
Thus upgrade our CentOS Stream container to major version 9 now.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-5-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This update adds the removing of the EXTERNALLY-MANAGED marker files
that has been added to the lcitool recently.
Quoting Daniel:
"For those who don't know, python now commonly blocks the ability to
run 'pip install' outside of a venv. This generally makes sense for
a precious installation environment. Our containers are disposable
though, so a venv has no benefit. Removing the 'EXTERNALLY-MANAGED'
allows the historical arbitrary use of 'pip' outside a venv.
lcitool just does this unconditionally given the containers are
not precious."
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since Ubuntu 22.04 has now been available for more than two years, we
can stop actively supporting the previous LTS version of Ubuntu now.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240418101056.302103-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are doing a lot of cross-compilation tests with GCC in the gitlab-CI
already, so we could get some more test coverage by using Clang in the
Travis-CI instead. Thus let's switch two additional jobs to use Clang
for compilation.
Message-ID: <20240320104144.823425-7-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
According to our support policy, we'll soon drop our official support
for Ubuntu 20.04 ("Focal Fossa") in QEMU. Thus we should update the
Travis jobs now to a newer release (Ubuntu 22.04 - "Jammy Jellyfish")
for future testing. Since all jobs are using this release now, we
can drop the entries from the individual jobs and use the global
setting again.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This variable was used to allow jobs to fail without spoiling the
overall result. But the required "allow_failures:" hunk has been
accidentally removed in commit 9d03f5abed ("travis.yml: Remove the
"Release tarball" job"), and it was anyway only useful while we
still had the x86 jobs here around that were our main CI jobs.
Thus let's simply remove this useless variable now.
Message-ID: <20240320104144.823425-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit c1073e44b4.
The Avocado tests have been removed from Travis a long time ago with
commit c5008c76ee ("gitlab: add acceptance testing to system builds"),
so we don't need to cache the avocado files here anymore.
Message-ID: <20240320104144.823425-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
meson: Make DEBUG_REMAP a meson option
target/m68k: Support semihosting on non-ColdFire targets
linux-user: do_setsockopt cleanups
linux-user: Add FITRIM ioctl
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmYpjHcdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+a/Af7BHmDB27U61b9i8et
# cObewYH9y9M+iaCrIflNZPAaoguHDRKOuvw+PFT/dIo5FL2D509vYOuxUow1qLsy
# q6b6kdvXROq9WU2NiuB86Abl/4mwwzxRhFah+Eh+OYSA2/pQnkcULkouLqxjFfF0
# xTBzZtHtYdTbCTVRbpd6XrwLo7Qrs85ovl4wVD1r+T2T8FkvrryoNOA/VjUWxyeh
# 3b1X1I0wtOTnEA7JSr17JCXWZGENCmTO35r6WSYzJy5U/C59PjjgaaeMi3R3lQTJ
# gg21EH0hlU1nTiPLg2ypj3l9NbIGAincAdDF/jufee+R75YSPdpKoDH8tUlUGsnM
# CRx5Xg==
# =J+5K
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Apr 2024 03:49:27 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240424' of https://gitlab.com/rth7680/qemu:
target/m68k: Support semihosting on non-ColdFire targets
target/m68k: Perform the semihosting test during translate
target/m68k: Pass semihosting arg to exit
linux-user: Add FITRIM ioctl
linux-user: do_setsockopt: eliminate goto in switch for SO_SNDTIMEO
linux-user: do_setsockopt: make ip_mreq_source local to the place where it is used
linux-user: do_setsockopt: make ip_mreq local to the place it is used and inline target_to_host_ip_mreq()
linux-user: do_setsockopt: fix SOL_ALG.ALG_SET_KEY
meson: Make DEBUG_REMAP a meson option
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
According to the m68k semihosting spec:
"The instruction used to trigger a semihosting request depends on the
m68k processor variant. On ColdFire, "halt" is used; on other processors
(which don't implement "halt"), "bkpt #0" may be used."
Add support for non-CodeFire processors by matching BKPT #0 instructions.
Signed-off-by: Keith Packard <keithp@keithp.com>
[rth: Use semihosting_test()]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace EXCP_HALT_INSN by EXCP_SEMIHOSTING. Perform the pre-
and post-insn tests during translate, leaving only the actual
semihosting operation for the exception.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There's identical code for SO_SNDTIMEO and SO_RCVTIMEO, currently
implemented using an ugly goto into another switch case. Eliminate
that using arithmetic if, making code flow more natural.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-5-mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
ip_mreq is declared at the beginning of do_setsockopt(), while
it is used in only one place. Move its declaration to that very
place and replace pointer to alloca()-allocated memory with the
structure itself.
target_to_host_ip_mreq() is used only once, inline it.
This change also properly handles TARGET_EFAULT when the address
is wrong.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-3-mjt@tls.msk.ru>
[rth: Fix braces, adjust optlen to match host structure size]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently DEBUG_REMAP is a macro that needs to be manually #defined to
be activated, which makes it hard to have separate build directories
dedicated to testing the code with it. Promote it to a meson option.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240312002402.14344-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
QAPI patches patches for 2024-04-24
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmYovX4SHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZT0RQP/1R1oOSdfWmiSR5gdW+IDE88VrjNY+Md
# lZJ7dTtbIDbwABP6s7/wGIxlc4bf6lwIwhWNnfOa+y71jJ/u9aX2Q6D8wHY5lQnu
# b8jhzoP3UyB+LQV5TTX1nTqTaO72Bj0pxw/0/DeKCzg7kgjys06a1Ln9P0oyWs6x
# li/Cs133wPtZ5rISqif5yOmssber0h2D584k5MN2VK7eaGidLioQQIRmSDikPE6Z
# TpnOEqySInIFhPJmm77il19ZDpCrgdCoD7lXoqX1C6XScYz7dU+m/TaToFk1lECw
# VstR2SJT39TzbOLdis1O5/vsLP0QfciMRQbUktSrf4jQHumrkSa/OE6xKwJC3x82
# axJWc+BygcosylKc5CYVzwSlHugHw6Lf39qui//yzi5CXakzO6owKX7Q8AcH3PPy
# 6thncy0dvw1ggq/BZGYhjyG+6MXBCWipPGVXFp9Gf2cAayTALhwyNtPvDNX57fzT
# UZA/fa+/Wc9xZv1YAnxLaKyo4o65YWXVnq1eHXV17ny08BzZNYOC2ZuXim+rMThM
# lxzcrTkrMLgsGXetp59uhZw9JRnVvaxLqNXfC2bwpoRzXzuifvnPithl6nkSm1YB
# TvHGZZdO3B498kOW6947KrMVFh3t4aNWkdbOyetAMECy71H3Q4CTHYGFa5TzXT9O
# Rjz31NYdVNBE
# =uyfg
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Apr 2024 01:06:22 AM PDT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-qapi-2024-04-24' of https://repo.or.cz/qemu/armbru: (25 commits)
qapi: Dumb down QAPISchema.lookup_entity()
qapi: Tighten check whether implicit object type already exists
qapi/schema: remove unnecessary asserts
qapi/schema: turn on mypy strictness
qapi/schema: add type hints
qapi/parser.py: assert member.info is present in connect_member
qapi/parser: demote QAPIExpression to Dict[str, Any]
qapi/schema: assert inner type of QAPISchemaVariants in check_clash()
qapi/schema: fix typing for QAPISchemaVariants.tag_member
qapi/schema: Don't initialize "members" with `None`
qapi/schema: add _check_complete flag
qapi/schema: assert info is present when necessary
qapi/schema: fix QAPISchemaArrayType.check's call to resolve_type
qapi: Assert built-in types exist
qapi/schema: assert resolve_type has 'info' and 'what' args on error
qapi/schema: add type narrowing to lookup_type()
qapi/schema: adjust type narrowing for mypy's benefit
qapi/schema: make c_type() and json_type() abstract methods
qapi/schema: declare type for QAPISchemaArrayType.element_type
qapi/schema: declare type for QAPISchemaObjectTypeMember.type
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
GlusterFS+RDMA has been deprecated 8 years ago in commit
0552ff2465 ("block/gluster: deprecate rdma support"):
gluster volfile server fetch happens through unix and/or tcp,
it doesn't support volfile fetch over rdma. The rdma code may
actually mislead, so to make sure things do not break, for now
we fallback to tcp when requested for rdma, with a warning.
If you are wondering how this worked all these days, its the
gluster libgfapi code which handles anything other than unix
transport as socket/tcp, sad but true.
Besides, the whole RDMA subsystem was deprecated in commit
e9a54265f5 ("hw/rdma: Deprecate the pvrdma device and the rdma
subsystem") released in v8.2.
Cc: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240328130255.52257-4-philmd@linaro.org>
The Nios II target is deprecated since v8.2 in commit 9997771bc1
("target/nios2: Deprecate the Nios II architecture").
Remove:
- Buildsys / CI infra
- User emulation
- System emulation (10m50-ghrd & nios2-generic-nommu machines)
- Tests
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Marek Vasut <marex@denx.de>
Message-Id: <20240327144806.11319-3-philmd@linaro.org>
QAPISchema.lookup_entity() takes an optional type argument, a subtype
of QAPISchemaDefinition, and returns that type or None. Callers can
use this to save themselves an isinstance() test.
The only remaining user of this convenience feature is .lookup_type().
But we don't actually save anything anymore there: we still need the
isinstance() to help mypy over the hump.
Drop the .lookup_entity() argument, and adjust .lookup_type().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-26-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[Commit message typo fixed]
Entities with names starting with q_obj_ are implicit object types.
Therefore, QAPISchema._make_implicit_object_type()'s .lookup_entity()
can only return a QAPISchemaObjectType. Assert that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-25-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: John Snow <jsnow@redhat.com>
This patch only adds type hints, which aren't utilized at runtime and
don't change the behavior of this module in any way.
In a scant few locations, type hints are removed where no longer
necessary due to inference power from typing all of the rest of
creation; and any type hints that no longer need string quotes are
changed.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-22-armbru@redhat.com>
Dict[str, object] is a stricter type, but with the way that code is
currently arranged, it is infeasible to enforce this strictness.
In particular, although expr.py's entire raison d'être is normalization
and type-checking of QAPI Expressions, that type information is not
"remembered" in any meaningful way by mypy because each individual
expression is not downcast to a specific expression type that holds all
the details of each expression's unique form.
As a result, all of the code in schema.py that deals with actually
creating type-safe specialized structures has no guarantee (myopically)
that the data it is being passed is correct.
There are two ways to solve this:
(1) Re-assert that the incoming data is in the shape we expect it to be, or
(2) Disable type checking for this data.
(1) is appealing to my sense of strictness, but I gotta concede that it
is asinine to re-check the shape of a QAPIExpression in schema.py when
expr.py has just completed that work at length. The duplication of code
and the nightmare thought of needing to update both locations if and
when we change the shape of these structures makes me extremely
reluctant to go down this route.
(2) allows us the chance to miss updating types in the case that types
are updated in expr.py, but it *is* an awful lot simpler and,
importantly, gets us closer to type checking schema.py *at
all*. Something is better than nothing, I'd argue.
So, do the simpler dumber thing and worry about future strictness
improvements later.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-20-armbru@redhat.com>
QAPISchemaVariant's "variants" field is typed as
List[QAPISchemaVariant], where the typing for QAPISchemaVariant allows
its type field to be any QAPISchemaType.
However, QAPISchemaVariant expects that all of its variants contain the
narrower QAPISchemaObjectType. This relationship is enforced at runtime
in QAPISchemaVariants.check(). This relationship is not embedded in the
type system though, so QAPISchemaVariants.check_clash() needs to
re-assert this property in order to call
QAPISchemaVariant.type.check_clash().
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-19-armbru@redhat.com>
There are two related changes here:
(1) We need to perform type narrowing for resolving the type of
tag_member during check(), and
(2) tag_member is a delayed initialization field, but we can hide it
behind a property that raises an Exception if it's called too
early. This simplifies the typing in quite a few places and avoids
needing to assert that the "tag_member is not None" at a dozen
callsites, which can be confusing and suggest the wrong thing to a
drive-by contributor.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-18-armbru@redhat.com>
Declare, but don't initialize the "members" field with type
List[QAPISchemaObjectTypeMember].
This simplifies the typing from what would otherwise be
Optional[List[T]] to merely List[T]. This removes the need to add
assertions to several callsites that this value is not None - which it
never will be after the delayed initialization in check() anyway.
The type declaration without initialization trick will cause accidental
uses of this field prior to full initialization to raise an
AttributeError.
(Note that it is valid to have an empty members list, see the internal
q_empty object as an example. For this reason, we cannot use the empty
list as a replacement test for full initialization and instead rely on
the _checked/_check_complete fields.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-17-armbru@redhat.com>
Instead of using the None value for the members field, use a dedicated
flag to detect recursive misconfigurations.
This is intended to assist with subsequent patches that seek to remove
the "None" value from the members field (which can never hold that value
after the final call to check()) in order to simplify the static typing
of that field; avoiding the need of assertions littered at many
callsites to eliminate the possibility of the None value.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-16-armbru@redhat.com>
QAPISchemaInfo arguments can often be None because built-in definitions
don't have such information. The type hint can only be
Optional[QAPISchemaInfo] then. But, mypy gets upset about all the
places where we exploit that it can't actually be None there. Add
assertions that will help mypy over the hump, to enable adding type
hints in a forthcoming commit.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-15-armbru@redhat.com>
Adjust the expression at the callsite to work around mypy's weak type
introspection that believes this expression can resolve to
QAPISourceInfo; it cannot.
(Fundamentally: self.info only resolves to false in a boolean expression
when it is None; therefore this expression may only ever produce
Optional[str]. mypy does not know that 'info', when it is a
QAPISourceInfo object, cannot ever be false.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-14-armbru@redhat.com>
QAPISchema.lookup_type('FOO') returns a QAPISchemaType when type 'FOO'
exists, else None. It won't return None for built-in types like
'int'.
Since mypy can't see that, it'll complain that we assign the
Optional[QAPISchemaType] returned by .lookup_type() to QAPISchemaType
variables.
Add assertions to help it over the hump.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-13-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
resolve_type() is generally used to resolve configuration-provided type
names into type objects, and generally requires valid 'info' and 'what'
parameters.
In some cases, such as with QAPISchemaArrayType.check(), resolve_type
may be used to resolve built-in types and as such will not have an
'info' argument, but also must not fail in this scenario.
Use an assertion to sate mypy that we will indeed have 'info' and 'what'
parameters for the error pathway in resolve_type.
Note: there are only three callsites to resolve_type at present where
"info" is perceived by mypy to be possibly None:
1) QAPISchemaArrayType.check()
2) QAPISchemaObjectTypeMember.check()
3) QAPISchemaEvent.check()
Of those three, only the first actually ever passes None; the other two
are limited by their base class initializers which accept info=None, but
neither subclass actually use a None value in practice, currently.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-12-armbru@redhat.com>
We already take care to perform some type narrowing for arg_type and
ret_type, but not in a way where mypy can utilize the result once we add
type hints, e.g.:
qapi/schema.py:833: error: Incompatible types in assignment (expression
has type "QAPISchemaType", variable has type
"Optional[QAPISchemaObjectType]") [assignment]
qapi/schema.py:893: error: Incompatible types in assignment (expression
has type "QAPISchemaType", variable has type
"Optional[QAPISchemaObjectType]") [assignment]
A simple change to use a temporary variable helps the medicine go down.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-10-armbru@redhat.com>
These methods should always return a str, it's only the default abstract
implementation that doesn't. They can be marked "abstract", which
requires subclasses to override the method with the proper return type.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-9-armbru@redhat.com>
A QAPISchemaArrayType's element type gets resolved only during .check().
We have QAPISchemaArrayType.__init__() initialize self.element_type =
None, and .check() assign the actual type. Using .element_type before
.check() is wrong, and hopefully crashes due to the value being None.
Works.
However, it makes for awkward typing. With .element_type:
Optional[QAPISchemaType], mypy is of course unable to see that it's None
before .check(), and a QAPISchemaType after. To help it over the hump,
we'd have to assert self.element_type is not None before all the (valid)
uses. The assertion catches invalid uses, but only at run time; mypy
can't flag them.
Instead, declare .element_type in .__init__() as QAPISchemaType
*without* initializing it. Using .element_type before .check() now
certainly crashes, which is an improvement. Mypy still can't flag
invalid uses, but that's okay.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-8-armbru@redhat.com>
A QAPISchemaObjectTypeMember's type gets resolved only during .check().
We have QAPISchemaObjectTypeMember.__init__() initialize self.type =
None, and .check() assign the actual type. Using .type before .check()
is wrong, and hopefully crashes due to the value being None. Works.
However, it makes for awkward typing. With .type:
Optional[QAPISchemaType], mypy is of course unable to see that it's None
before .check(), and a QAPISchemaType after. To help it over the hump,
we'd have to assert self.type is not None before all the (valid) uses.
The assertion catches invalid uses, but only at run time; mypy can't
flag them.
Instead, declare .type in .__init__() as QAPISchemaType *without*
initializing it. Using .type before .check() now certainly crashes,
which is an improvement. Mypy still can't flag invalid uses, but that's
okay.
Addresses typing errors such as these:
qapi/schema.py:657: error: "None" has no attribute "alternate_qtype" [attr-defined]
qapi/schema.py:662: error: "None" has no attribute "describe" [attr-defined]
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-7-armbru@redhat.com>
Include entities don't have names, but we generally expect "entities" to
have names. Reclassify all entities with names as *definitions*, leaving
the nameless include entities as QAPISchemaEntity instances.
This is primarily to help simplify typing around expectations of what
callers expect for properties of an "entity".
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-6-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Manual change. Remove the definition in
include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-11-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, manually
removing the definition in include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-10-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[Straightforward conflict with commit aeaafb1e59 (migration: export
migration_is_running) resolved]
QERR_INVALID_PARAMETER_VALUE is defined as:
#define QERR_INVALID_PARAMETER_VALUE \
"Parameter '%s' expects %s"
The current error is formatted as:
"Parameter 'vcpu_dirty_limit' expects is invalid, it must greater then 1 MB/s"
Replace by:
"Parameter 'vcpu_dirty_limit' must be greater than 1 MB/s"
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-9-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[New error message corrected, commit message updated accordingly]
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Manual changes (escaping the format in qapi/visit.py).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-8-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using the following
coccinelle semantic patch:
@match@
expression errp;
expression param;
constant value;
@@
error_setg(errp, QERR_INVALID_PARAMETER_TYPE, param, value);
@script:python strformat depends on match@
value << match.value;
fixedfmt; // new var
@@
fixedfmt = f'"Invalid parameter type for \'%s\', expected: {value[1:-1]}"'
coccinelle.fixedfmt = cocci.make_ident(fixedfmt)
@replace@
expression match.errp;
expression match.param;
constant match.value;
identifier strformat.fixedfmt;
@@
- error_setg(errp, QERR_INVALID_PARAMETER_TYPE, param, value);
+ error_setg(errp, fixedfmt, param);
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-7-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using:
$ sed -i -e "s/QERR_INVALID_PARAMETER,/\"Invalid parameter '%s'\",/" \
$(git grep -lw QERR_INVALID_PARAMETER)
Manually simplify qemu_opts_create(), and remove the macro definition
in include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-6-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, and manual cleanup.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-5-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, and manual cleanup.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-4-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, and manual cleanup.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-3-armbru@redhat.com>
Migration pull for 9.1
- Het's new test cases for "channels"
- Het's fix for a typo for vsock parsing
- Cedric's VFIO error report series
- Cedric's one more patch for dirty-bitmap error reports
- Zhijian's rdma deprecation patch
- Yuan's zeropage optimization to fix double faults on anon mem
- Zhijian's COLO fix on a crash
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZig4HxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbQiwD/V5nSJzSuAG4Ra1Fjo+LRG2TT6qk8eNCi
# fIytehSw6cYA/0wqarxOF0tr7ikeyhtG3w4xFf44kk6KcPkoVSl1tqoL
# =pJmQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Apr 2024 03:37:19 PM PDT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
# gpg: aka "Peter Xu <peterx@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240423-pull-request' of https://gitlab.com/peterx/qemu: (26 commits)
migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
migration/multifd: solve zero page causing multiple page faults
migration: Add Error** argument to add_bitmaps_to_list()
migration: Modify ram_init_bitmaps() to report dirty tracking errors
migration: Add Error** argument to xbzrle_init()
migration: Add Error** argument to ram_state_init()
memory: Add Error** argument to the global_dirty_log routines
migration: Introduce ram_bitmaps_destroy()
memory: Add Error** argument to .log_global_start() handler
migration: Add Error** argument to .load_setup() handler
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to qemu_savevm_state_setup()
migration: Add Error** argument to vmstate_save()
migration: Always report an error in ram_save_setup()
migration: Always report an error in block_save_setup()
vfio: Always report an error in vfio_save_setup()
s390/stattrib: Add Error** argument to set_migrationmode() handler
tests/qtest/migration: Fix typo for vsock in SocketAddress_to_str
tests/qtest/migration: Add negative tests to validate migration QAPIs
tests/qtest/migration: Add multifd_tcp_plain test using list of channels instead of uri
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* cleanups for stubs
* do not link pixman automatically into all targets
* optimize computation of VGA dirty memory region
* kvm: use configs/ definition to conditionalize debug support
* hw: Add compat machines for 9.1
* target/i386: add guest-phys-bits cpu property
* target/i386: Introduce Icelake-Server-v7 and SierraForest models
* target/i386: Export RFDS bit to guests
* q35: SMM ranges cleanups
* target/i386: basic support for confidential guests
* linux-headers: update headers
* target/i386: SEV: use KVM_SEV_INIT2 if possible
* kvm: Introduce support for memory_attributes
* RAMBlock: Add support of KVM private guest memfd
* Consolidate use of warn_report_once()
* pythondeps.toml: warn about updates needed to docs/requirements.txt
* target/i386: always write 32-bits for SGDT and SIDT
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmYn1UkUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroO1nwgAhRQhkYcdtFc649WJWTNvJCNzmek0
# Sg7trH2NKlwA75zG8Qv4TR3E71UrXoY9oItwYstc4Erz+tdf73WyaHMF3cEk1p82
# xx3LcBYhP7jGSjabxTkZsFU8+MM1raOjRN/tHvfcjYLaJOqJZplnkaVhMbNPsVuM
# IPJ5bVQohxpmHKPbeFNpF4QJ9wGyZAYOfJOFCk09xQtHnA8CtFjS9to33QPAR/Se
# OVZwRCigVjf0KNmCnHC8tJHoW8pG/cdQAr3qqd397XbM1vVELv9fiXiMoGF78UsY
# trO4K2yg6N5Sly4Qv/++zZ0OZNkL3BREGp3wf4eTSvLXxqSGvfi8iLpFGA==
# =lwSL
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Apr 2024 08:35:37 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (63 commits)
target/i386/translate.c: always write 32-bits for SGDT and SIDT
pythondeps.toml: warn about updates needed to docs/requirements.txt
accel/tcg/icount-common: Consolidate the use of warn_report_once()
target/i386/cpu: Merge the warning and error messages for AMD HT check
target/i386/cpu: Consolidate the use of warn_report_once()
target/i386/host-cpu: Consolidate the use of warn_report_once()
kvm/tdx: Ignore memory conversion to shared of unassigned region
kvm/tdx: Don't complain when converting vMMIO region to shared
kvm: handle KVM_EXIT_MEMORY_FAULT
physmem: Introduce ram_block_discard_guest_memfd_range()
RAMBlock: make guest_memfd require uncoordinated discard
HostMem: Add mechanism to opt in kvm guest memfd via MachineState
kvm/memory: Make memory type private by default if it has guest memfd backend
kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
RAMBlock: Add support of KVM private guest memfd
kvm: Introduce support for memory_attributes
trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
hw/i386/sev: Use legacy SEV VM types for older machine types
i386/sev: Add 'legacy-vm-type' parameter for SEV guest objects
target/i386: SEV: use KVM_SEV_INIT2 if possible
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
bdrv_activate_all() should not be called from the coroutine context, move
it to the QEMU thread colo_process_incoming_thread() with the bql_lock
protected.
The backtrace is as follows:
#4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
#5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
#6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
#7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
#8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
#9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
#10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
Cc: qemu-stable@nongnu.org
Cc: Fabiano Rosas <farosas@suse.de>
Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Tested-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240417025634.1014582-1-lizhijian@fujitsu.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Implemented recvbitmap tracking of received pages in multifd.
If the zero page appears for the first time in the recvbitmap, this
page is not checked and set.
If the zero page has already appeared in the recvbitmap, there is no
need to check the data but directly set the data to 0, because it is
unlikely that the zero page will be migrated multiple times.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240401154110.2028453-2-yuan1.liu@intel.com
[peterx: touch up the comment, as the bitmap is used outside postcopy now]
Signed-off-by: Peter Xu <peterx@redhat.com>
The .save_setup() handler has now an Error** argument that we can use
to propagate errors reported by the .log_global_start() handler. Do
that for the RAM. The caller qemu_savevm_state_setup() will store the
error under the migration stream for later detection in the migration
sequence.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320064911.545001-15-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Now that the log_global*() handlers take an Error** parameter and
return a bool, do the same for memory_global_dirty_log_start() and
memory_global_dirty_log_stop(). The error is reported in the callers
for now and it will be propagated in the call stack in the next
changes.
To be noted a functional change in ram_init_bitmaps(), if the dirty
pages logger fails to start, there is no need to synchronize the dirty
pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
similar way.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-12-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier setting the migration state from
MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
In qemu_savevm_state(), move the cleanup to preserve the error
reported by .save_setup() handlers.
Since the previous behavior was to ignore errors at this step of
migration, this change should be examined closely to check that
cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-7-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Refactor migrate_get_socket_address to internally utilize 'socket-address'
parameter, reducing redundancy in the function definition.
migrate_get_socket_address implicitly converts SocketAddress into str.
Move migrate_get_socket_address inside migrate_get_connect_uri which
should return the uri string instead.
Signed-off-by: Het Gala <het.gala@nutanix.com>
Suggested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240312202634.63349-4-het.gala@nutanix.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits
or 32-bits depending upon the operand size, but this is incorrect. Not only do
the Intel CPU manuals give contradictory information between processor
revisions, but this information doesn't even match real-life behaviour.
In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT
and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with
Win32s to function correctly. Remove the masking applied due to the operand size
for SGDT and SIDT so that the TCG behaviour matches the behaviour on real
hardware.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
--
MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this
patch fixes the issue in WFW 3.11 with Win32s. For more technical information I
highly recommend the excellent write-up at
https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/.
Message-ID: <20240419195147.434894-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
docs/requirements.txt is expected by readthedocs and should be in sync
with pythondeps.toml. Add a comment to both.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, the difference between warn_report_once() and
error_report_once() is the former has the "warning:" prefix, while the
latter does not have a similar level prefix.
At the meantime, considering that there is no error handling logic here,
and the purpose of error_report_once() is only to prompt the user with
an abnormal message, there is no need to use an error-level message here,
and instead we can just use a warning.
Therefore, downgrade the message in error_report_once() to warning, and
merge it into the previous warn_report_once().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240327103951.3853425-4-zhao1.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The difference between error_printf() and error_report() is the latter
may contain more information, such as the name of the program
("qemu-system-x86_64").
Thus its variant error_report_once() and warn_report()'s variant
warn_report_once() can be used here to print the information only once
without a static local variable "ht_warned".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240327103951.3853425-3-zhao1.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region
which kvm memslot isn't assigned to (except in-kernel emulation).
qemu has the memory region for vMMIO at each device level.
While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO
region, qemu doesn't find corresponding vMMIO region because it's before
PCI device allocation and memory_region_find() finds the device region, not
PCI bus region. It's safe to ignore MapGPA(to-shared) because when guest
accesses those region they use GPA with shared bit set for vMMIO. Ignore
memory conversion request of non-assigned region to shared and return
success. Otherwise OVMF is confused and panics there.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240229063726.610065-35-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Upon an KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory
conversion on the RAMBlock to turn the memory into desired attribute,
switching between private and shared.
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when
KVM_EXIT_MEMORY_FAULT happens.
Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has
guest_memfd memory backend.
Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is
added.
When page is converted from shared to private, the original shared
memory can be discarded via ram_block_discard_range(). Note, shared
memory can be discarded only when it's not back'ed by hugetlb because
hugetlb is supposed to be pre-allocated and no need for discarding.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240320083945.991426-13-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some subsystems like VFIO might disable ram block discard, but guest_memfd
uses discard operations to implement conversions between private and
shared memory. Because of this, sequences like the following can result
in stale IOMMU mappings:
1. allocate shared page
2. convert page shared->private
3. discard shared page
4. convert page private->shared
5. allocate shared page
6. issue DMA operations against that shared page
This is not a use-after-free, because after step 3 VFIO is still pinning
the page. However, DMA operations in step 6 will hit the old mapping
that was allocated in step 1.
Address this by taking ram_block_discard_is_enabled() into account when
deciding whether or not to discard pages.
Since kvm_convert_memory()/guest_memfd doesn't implement a
RamDiscardManager handler to convey and replay discard operations,
this is a case of uncoordinated discard, which is blocked/released
by ram_block_discard_require(). Interestingly, this function had
no use so far.
Alternative approaches would be to block discard of shared pages, but
this would cause guests to consume twice the memory if they use VFIO;
or to implement a RamDiscardManager and only block uncoordinated
discard, i.e. use ram_block_coordinated_discard_require().
[Commit message mostly by Michael Roth <michael.roth@amd.com>]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new member "guest_memfd" to memory backends. When it's set
to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
guest_memfd will be allocated during RAMBlock allocation.
Memory backend's @guest_memfd is wired with @require_guest_memfd
field of MachineState. It avoid looking up the machine in phymem.c.
MachineState::require_guest_memfd is supposed to be set by any VMs
that requires KVM guest memfd as private memory, e.g., TDX VM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-8-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM side leaves the memory to shared by default, which may incur the
overhead of paging conversion on the first visit of each page. Because
the expectation is that page is likely to private for the VMs that
require private memory (has guest memfd).
Explicitly set the memory to private when memory region has valid
guest memfd backend.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240320083945.991426-16-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add KVM guest_memfd support to RAMBlock so both normal hva based memory
and kvm guest memfd based private memory can be associated in one RAMBlock.
Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to
create private guest_memfd during RAMBlock setup.
Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
is more flexible and extensible than simply relying on the VM type because
in the future we may have the case that not all the memory of a VM need
guest memfd. As a benefit, it also avoid getting MachineState in memory
subsystem.
Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as TDX VM. How and when to set it for memory
backends will be implemented in the following patches.
Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
KVM guest_memfd allocated.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-7-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce the helper functions to set the attributes of a range of
memory to private or shared.
This is necessary to notify KVM the private/shared attribute of each gpa
range. KVM needs the information to decide the GPA needs to be mapped at
hva-based shared memory or guest_memfd based private memory.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240320083945.991426-11-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer 9.1 machine types will default to using the KVM_SEV_INIT2 API for
creating SEV/SEV-ES going forward. However, this API results in guest
measurement changes which are generally not expected for users of these
older guest types and can cause disruption if they switch to a newer
QEMU/kernel version. Avoid this by continuing to use the older
KVM_SEV_INIT/KVM_SEV_ES_INIT APIs for older machine types.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240409230743.962513-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU will currently automatically make use of the KVM_SEV_INIT2 API for
initializing SEV and SEV-ES guests verses the older
KVM_SEV_INIT/KVM_SEV_ES_INIT interfaces.
However, the older interfaces will silently avoid sync'ing FPU/XSAVE
state to the VMSA prior to encryption, thus relying on behavior and
measurements that assume the related fields to be allow zero.
With KVM_SEV_INIT2, this state is now synced into the VMSA, resulting in
measurements changes and, theoretically, behaviorial changes, though the
latter are unlikely to be seen in practice.
To allow a smooth transition to the newer interface, while still
providing a mechanism to maintain backward compatibility with VMs
created using the older interfaces, provide a new command-line
parameter:
-object sev-guest,legacy-vm-type=true,...
and have it default to false.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240409230743.962513-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement support for the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM virtual
machine types, and the KVM_SEV_INIT2 function of KVM_MEMORY_ENCRYPT_OP.
These replace the KVM_SEV_INIT and KVM_SEV_ES_INIT functions, and have
several advantages:
- sharing the initialization sequence with SEV-SNP and TDX
- allowing arguments including the set of desired VMSA features
- protection against invalid use of KVM_GET/SET_* ioctls for guests
with encrypted state
If the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types are not supported,
fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT (which use the
default x86 VM type).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM is introducing a new API to create confidential guests, which
will be used by TDX and SEV-SNP but is also available for SEV and
SEV-ES. The API uses the VM type argument to KVM_CREATE_VM to
identify which confidential computing technology to use.
Since there are no other expected uses of VM types, delegate
mc->kvm_type() for x86 boards to the confidential-guest-support
object pointed to by ms->cgs.
For example, if a sev-guest object is specified to confidential-guest-support,
like,
qemu -machine ...,confidential-guest-support=sev0 \
-object sev-guest,id=sev0,...
it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
is supported, and if so use them together with the KVM_SEV_INIT2
function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to
KVM_SEV_INIT and KVM_SEV_ES_INIT.
This is a preparatory work towards TDX and SEV-SNP support, but it
will also enable support for VMSA features such as DebugSwap, which
are only available via KVM_SEV_INIT2.
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a common superclass for x86 confidential guest implementations.
It will extend ConfidentialGuestSupportClass with a method that provides
the VM type to be passed to KVM_CREATE_VM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Board reset requires writing a fresh CPU state. As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
guest state is encrypted, in which case they do nothing. For the new
API using VM types, instead, the ioctls will fail which is a safer and
more robust approach.
The new API will be the only one available for SEV-SNP and TDX, but it
is also usable for SEV and SEV-ES. In preparation for that, require
architecture-specific KVM code to communicate the point at which guest
state is protected (which must be after kvm_cpu_synchronize_post_init(),
though that might change in the future in order to suppor migration).
From that point, skip reading registers so that cpu->vcpu_dirty is
never true: if it ever becomes true, kvm_arch_put_registers() will
fail miserably.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now, the system reset is concluded by a call to
cpu_synchronize_all_post_reset() in order to sync any changes
that the machine reset callback applied to the CPU state.
However, for VMs with encrypted state such as SEV-ES guests (currently
the only case of guests with non-resettable CPUs) this cannot be done,
because guest state has already been finalized by machine-init-done notifiers.
cpu_synchronize_all_post_reset() does nothing on these guests, and actually
we would like to make it fail if called once guest has been encrypted.
So, assume that boards that support non-resettable CPUs do not touch
CPU state and that all such setup is done before, at the time of
cpu_synchronize_all_post_init().
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Data structures like struct setup_data have been moved to a separate
setup_data.h header which bootparam.h relies on. Add setup_data.h to
the cp_portable() list and sync it along with the other header files.
Note that currently struct setup_data is stripped away as part of
generating bootparam.h, but that handling is no currently needed for
setup_data.h since it doesn't pull in many external
headers/dependencies. However, QEMU currently redefines struct
setup_data in hw/i386/x86.c, so that will need to be removed as part of
any header update that pulls in the new setup_data.h to avoid build
bisect breakage.
Because <asm/setup_data.h> is the first architecture specific #include
in include/standard-headers/, add a new sed substitution to rewrite
asm/ include to the standard-headers/asm-* subdirectory for the current
architecture.
And while at it, remove asm-generic/kvm_para.h from the list of
allowed includes: it does not have a matching substitution, and therefore
it would not be possible to use it on non-Linux systems where there is
no /usr/include/asm-generic/ directory.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the unified interface to call confidential guest related kvm_init()
and kvm_reset(), to avoid exposing pef specific functions.
As a bonus, pef.h goes away since there is no direct call from sPAPR
board code to PEF code anymore.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use confidential_guest_kvm_init() instead of calling SEV
specific sev_kvm_init(). This allows the introduction of multiple
confidential-guest-support subclasses for different x86 vendors.
As a bonus, stubs are not needed anymore since there is no
direct call from target/i386/kvm/kvm.c to SEV code.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20240229060038.606591-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Different confidential VMs in different architectures all have the same
needs to do their specific initialization (and maybe resetting) stuffs
with KVM. Currently each of them exposes individual *_kvm_init()
functions and let machine code or kvm code to call it.
To facilitate the introduction of confidential guest technology from
different x86 vendors, add two virtual functions, kvm_init() and kvm_reset()
in ConfidentialGuestSupportClass, and expose two helpers functions for
invodking them.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20240229060038.606591-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A value 1 of PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system
also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. When PIC
is not enabled (pic=off) for x86 machine, the PCAT_COMPAT bit needs to
be cleared. The PIC probe should then print:
[ 0.155970] Using NULL legacy PIC
However, no such log printed in guest kernel unless PCAT_COMPAT is
cleared.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240403145953.3082491-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to table 1-2 in Intel Architecture Instruction Set Extensions and
Future Features (rev 051) [1], SierraForest has the following new features
which have already been virtualized:
- CMPCCXADD CPUID.(EAX=7,ECX=1):EAX[bit 7]
- AVX-IFMA CPUID.(EAX=7,ECX=1):EAX[bit 23]
- AVX-VNNI-INT8 CPUID.(EAX=7,ECX=1):EDX[bit 4]
- AVX-NE-CONVERT CPUID.(EAX=7,ECX=1):EDX[bit 5]
Add above features to new CPU model SierraForest. Comparing with GraniteRapids
CPU model, SierraForest bare-metal removes the following features:
- HLE CPUID.(EAX=7,ECX=0):EBX[bit 4]
- RTM CPUID.(EAX=7,ECX=0):EBX[bit 11]
- AVX512F CPUID.(EAX=7,ECX=0):EBX[bit 16]
- AVX512DQ CPUID.(EAX=7,ECX=0):EBX[bit 17]
- AVX512_IFMA CPUID.(EAX=7,ECX=0):EBX[bit 21]
- AVX512CD CPUID.(EAX=7,ECX=0):EBX[bit 28]
- AVX512BW CPUID.(EAX=7,ECX=0):EBX[bit 30]
- AVX512VL CPUID.(EAX=7,ECX=0):EBX[bit 31]
- AVX512_VBMI CPUID.(EAX=7,ECX=0):ECX[bit 1]
- AVX512_VBMI2 CPUID.(EAX=7,ECX=0):ECX[bit 6]
- AVX512_VNNI CPUID.(EAX=7,ECX=0):ECX[bit 11]
- AVX512_BITALG CPUID.(EAX=7,ECX=0):ECX[bit 12]
- AVX512_VPOPCNTDQ CPUID.(EAX=7,ECX=0):ECX[bit 14]
- LA57 CPUID.(EAX=7,ECX=0):ECX[bit 16]
- TSXLDTRK CPUID.(EAX=7,ECX=0):EDX[bit 16]
- AMX-BF16 CPUID.(EAX=7,ECX=0):EDX[bit 22]
- AVX512_FP16 CPUID.(EAX=7,ECX=0):EDX[bit 23]
- AMX-TILE CPUID.(EAX=7,ECX=0):EDX[bit 24]
- AMX-INT8 CPUID.(EAX=7,ECX=0):EDX[bit 25]
- AVX512_BF16 CPUID.(EAX=7,ECX=1):EAX[bit 5]
- fast zero-length MOVSB CPUID.(EAX=7,ECX=1):EAX[bit 10]
- fast short CMPSB, SCASB CPUID.(EAX=7,ECX=1):EAX[bit 12]
- AMX-FP16 CPUID.(EAX=7,ECX=1):EAX[bit 21]
- PREFETCHI CPUID.(EAX=7,ECX=1):EDX[bit 14]
- XFD CPUID.(EAX=0xD,ECX=1):EAX[bit 4]
- EPT_PAGE_WALK_LENGTH_5 VMX_EPT_VPID_CAP(0x48c)[bit 7]
Add all features of GraniteRapids CPU model except above features to
SierraForest CPU model.
SierraForest doesn’t support TSX and RTM but supports TAA_NO. When RTM is
not enabled in host, KVM will not report TAA_NO. So, just don't include
TAA_NO in SierraForest CPU model.
[1] https://cdrdv2.intel.com/v1/dl/getContent/671368
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240320021044.508263-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When start L2 guest with both L1/L2 using Icelake-Server-v3 or above,
QEMU reports below warning:
"warning: host doesn't support requested feature: MSR(10AH).taa-no [bit 8]"
Reason is QEMU Icelake-Server-v3 has TSX feature disabled but enables taa-no
bit. It's meaningless that TSX isn't supported but still claim TSX is secure.
So L1 KVM doesn't expose taa-no to L2 if TSX is unsupported, then starting L2
triggers the warning.
Fix it by introducing a new version Icelake-Server-v7 which has both TSX
and taa-no features. Then guest can use TSX securely when it see taa-no.
This matches the production Icelake which supports TSX and isn't susceptible
to TSX Async Abort (TAA) vulnerabilities, a.k.a, taa-no.
Ideally, TSX should have being enabled together with taa-no since v3, but for
compatibility, we'd better to add v7 to enable it.
Fixes: d965dc3559 ("target/i386: Add ARCH_CAPABILITIES related bits into Icelake-Server CPU model")
Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-ID: <20240320093138.80267-2-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.
For now this is just a cleanup, so keep the function static.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240229063726.610065-23-xiaoyao.li@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Query kvm for supported guest physical address bits, in cpuid
function 80000008, eax[23:16]. Usually this is identical to host
physical address bits. With NPT or EPT being used this might be
restricted to 48 (max 4-level paging address space size) even if
the host cpu supports more physical address bits.
When set pass this to the guest, using cpuid too. Guest firmware
can use this to figure how big the usable guest physical address
space is, so PCI bar mapping are actually reachable.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240318155336.156197-2-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If an architecture adds support for KVM_CAP_SET_GUEST_DEBUG but QEMU does not
have the necessary code, QEMU will fail to build after updating kernel headers.
Avoid this by using a #define in config-target.h instead of KVM_CAP_SET_GUEST_DEBUG.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Take into account split screen mode close to wrap around, which is the
other special case for dirty memory region computation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The depth == 0 and depth == 15 have to be special cased because
width * depth / 8 does not provide the correct scanline length.
However, thanks to the recent reorganization of vga_draw_graphic()
the correct value of VRAM bits per pixel is available in "bits".
Use it (via the same "bwidth" computation that is used later in
the function), thus restricting the slow path to the wraparound case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently it is not documented anywhere why some functions need to
be stubbed.
Group the files in stubs/meson.build according to who needs them, both
to reduce the size of the compilation and to clarify the use of stubs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-18-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
replay.c symbols are only needed by user mode emulation, with the
exception of replay_mode that is needed by both user mode emulation
(by way of qemu_guest_getrandom) and block layer tools (by way of
util/qemu-timer.c).
Since it is needed by libqemuutil rather than specific files that
are part of the tools and emulators, split the replay_mode stub
into its own file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-17-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the semihosting stubs are needed exactly when the Kconfig symbols
are not needed, move them to semihosting/ and conditionalize them
on CONFIG_SEMIHOSTING and/or CONFIG_SYSTEM_ONLY.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only the files in hwcore_ss[] are required to link a user emulation
binary.
Have meson process the hw/ sub-directories if system emulation is
selected, otherwise directly process hw/core/ to get hwcore_ss[], which
is the only set required by user emulation.
This removes about 10% from the time needed to run
"../configure --disable-system --disable-tools --disable-guest-agent".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240404194757.9343-8-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Try not to test code that is not used by user mode emulation, or by the
block layer, unless they are being compiled; and fix test-timed-average
which was not compiled with --disable-system --enable-tools.
This is by no means complete, it only touches the more blatantly
wrong cases.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240408155330.522792-5-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 30896374 started to pass the full BlockConf from usb-storage to
scsi-disk, while previously only a few select properties would be
forwarded. This enables the user to set more properties, e.g. the block
size, that are actually taking effect.
However, now the calls to blkconf_apply_backend_options() and
blkconf_blocksizes() in usb_msd_storage_realize() that modify some of
these properties take effect, too, instead of being silently ignored.
This means at least that the block sizes get an unconditional default of
512 bytes before the configuration is passed to scsi-disk.
Before commit 30896374, the property wouldn't be set for scsi-disk and
therefore the device dependent defaults would apply - 512 for scsi-hd,
but 2048 for scsi-cd. The latter default has now become 512, too, which
makes at least Windows 11 installation fail when installing from
usb-storage.
Fix this by simply not calling these functions any more in usb-storage
and passing BlockConf on unmodified (except for the BlockBackend). The
same functions are called by the SCSI code anyway and it sets the right
defaults for the actual media type.
Fixes: 3089637461 ('scsi: Don't ignore most usb-storage properties')
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2260
Reported-by: Jonas Svensson
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-id: 20240412144202.13786-1-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move calculation of mask after the switch which sets the function
number for PIRQ/PINT pins to make sure the state of these pins are
kept track of separately and IRQ is raised if any of them is active.
Cc: qemu-stable@nongnu.org
Fixes: 7e01bd80c1 hw/isa/vt82c686: Bring back via_isa_set_irq()
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240410222543.0EA534E6005@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
During the booting process of the non-standard image, the behavior of the
called function in qemu is as follows:
1. vhost_net_stop() was triggered by guest image. This will call the function
virtio_pci_set_guest_notifiers() with assgin= false,
virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
2. virtio_reset() was triggered, this will set configure vector to VIRTIO_NO_VECTOR
3.vhost_net_start() was called (at this time, the configure vector is
still VIRTIO_NO_VECTOR) and then call virtio_pci_set_guest_notifiers() with
assgin=true, so the irqfd for vector 0 is still not "init" during this process
4. The system continues to boot and sets the vector back to 0. After that
msix_fire_vector_notifier() was triggered to unmask the vector 0 and meet the crash
To fix the issue, we need to support changing the vector after VIRTIO_CONFIG_S_DRIVER_OK is set.
(gdb) bt
0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
at pthread_kill.c:44
1 0x00007fc87148ec53 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
2 0x00007fc87143e956 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
3 0x00007fc8714287f4 in __GI_abort () at abort.c:79
4 0x00007fc87142871b in __assert_fail_base
(fmt=0x7fc8715bbde0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5606413efd53 "ret == 0", file=0x5606413ef87d "../accel/kvm/kvm-all.c", line=1837, function=<optimized out>) at assert.c:92
5 0x00007fc871437536 in __GI___assert_fail
(assertion=0x5606413efd53 "ret == 0", file=0x5606413ef87d "../accel/kvm/kvm-all.c", line=1837, function=0x5606413f06f0 <__PRETTY_FUNCTION__.19> "kvm_irqchip_commit_routes") at assert.c:101
6 0x0000560640f884b5 in kvm_irqchip_commit_routes (s=0x560642cae1f0) at ../accel/kvm/kvm-all.c:1837
7 0x0000560640c98f8e in virtio_pci_one_vector_unmask
(proxy=0x560643c65f00, queue_no=4294967295, vector=0, msg=..., n=0x560643c6e4c8)
at ../hw/virtio/virtio-pci.c:1005
8 0x0000560640c99201 in virtio_pci_vector_unmask (dev=0x560643c65f00, vector=0, msg=...)
at ../hw/virtio/virtio-pci.c:1070
9 0x0000560640bc402e in msix_fire_vector_notifier (dev=0x560643c65f00, vector=0, is_masked=false)
at ../hw/pci/msix.c:120
10 0x0000560640bc40f1 in msix_handle_mask_update (dev=0x560643c65f00, vector=0, was_masked=true)
at ../hw/pci/msix.c:140
11 0x0000560640bc4503 in msix_table_mmio_write (opaque=0x560643c65f00, addr=12, val=0, size=4)
at ../hw/pci/msix.c:231
12 0x0000560640f26d83 in memory_region_write_accessor
(mr=0x560643c66540, addr=12, value=0x7fc86b7bc628, size=4, shift=0, mask=4294967295, attrs=...)
at ../system/memory.c:497
13 0x0000560640f270a6 in access_with_adjusted_size
(addr=12, value=0x7fc86b7bc628, size=4, access_size_min=1, access_size_max=4, access_fn=0x560640f26c8d <memory_region_write_accessor>, mr=0x560643c66540, attrs=...) at ../system/memory.c:573
14 0x0000560640f2a2b5 in memory_region_dispatch_write (mr=0x560643c66540, addr=12, data=0, op=MO_32, attrs=...)
at ../system/memory.c:1521
15 0x0000560640f37bac in flatview_write_continue
(fv=0x7fc65805e0b0, addr=4273803276, attrs=..., ptr=0x7fc871e9c028, len=4, addr1=12, l=4, mr=0x560643c66540)
at ../system/physmem.c:2714
16 0x0000560640f37d0f in flatview_write
(fv=0x7fc65805e0b0, addr=4273803276, attrs=..., buf=0x7fc871e9c028, len=4) at ../system/physmem.c:2756
17 0x0000560640f380bf in address_space_write
(as=0x560642161ae0 <address_space_memory>, addr=4273803276, attrs=..., buf=0x7fc871e9c028, len=4)
at ../system/physmem.c:2863
18 0x0000560640f3812c in address_space_rw
(as=0x560642161ae0 <address_space_memory>, addr=4273803276, attrs=..., buf=0x7fc871e9c028, len=4, is_write=true) at ../system/physmem.c:2873
--Type <RET> for more, q to quit, c to continue without paging--
19 0x0000560640f8aa55 in kvm_cpu_exec (cpu=0x560642f205e0) at ../accel/kvm/kvm-all.c:2915
20 0x0000560640f8d731 in kvm_vcpu_thread_fn (arg=0x560642f205e0) at ../accel/kvm/kvm-accel-ops.c:51
21 0x00005606411949f4 in qemu_thread_start (args=0x560642f292b0) at ../util/qemu-thread-posix.c:541
22 0x00007fc87148cdcd in start_thread (arg=<optimized out>) at pthread_create.c:442
23 0x00007fc871512630 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)
MST: coding style and typo fixups
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-ID: <2321ade5f601367efe7380c04e3f61379c59b48f.1713173550.git.mst@redhat.com>
Cc: Lei Yang <leiyang@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cindy Lu <lulu@redhat.com>
Our Makefile massages the given make arguments to invoke ninja
accordingly. One key difference is that ninja will parallelize by
default, whereas make only does so with -j<n> or -j. The make man page
says that "if the -j option is given without an argument, make will not
limit the number of jobs that can run simultaneously". We use to support
that by replacing -j with "" (empty string) when calling ninja, so that
it would do its auto-parallelization based on the number of CPU cores.
This was accidentally broken at d1ce2cc95b (Makefile: preserve
--jobserver-auth argument when calling ninja, 2024-04-02),
causing `make -j` to fail:
$ make -j V=1
/usr/bin/ninja -v -j -d keepdepfile all | cat
make -C contrib/plugins/ V="1" TARGET_DIR="contrib/plugins/" all
ninja: fatal: invalid -j parameter
make: *** [Makefile:161: run-ninja] Error
Let's fix that and indent the touched code for better readability.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Fixes: d1ce2cc95b ("Makefile: preserve --jobserver-auth argument when calling ninja", 2024-04-02)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Per "SD Host Controller Standard Specification Version 3.00":
* 2.2.5 Transfer Mode Register (Offset 00Ch)
Writes to this register shall be ignored when the Command
Inhibit (DAT) in the Present State register is 1.
Do not update the TRNMOD register when Command Inhibit (DAT)
bit is set to avoid the present-status register going out of
sync, leading to malicious guest using DMA mode and overflowing
the FIFO buffer:
$ cat << EOF | qemu-system-i386 \
-display none -nographic -nodefaults \
-machine accel=qtest -m 512M \
-device sdhci-pci,sd-spec-version=3 \
-device sd-card,drive=mydrive \
-drive if=none,index=0,file=null-co://,format=raw,id=mydrive \
-qtest stdio
outl 0xcf8 0x80001013
outl 0xcfc 0x91
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0x9100002c 0x1 0x05
write 0x91000058 0x1 0x16
write 0x91000005 0x1 0x04
write 0x91000028 0x1 0x08
write 0x16 0x1 0x21
write 0x19 0x1 0x20
write 0x9100000c 0x1 0x01
write 0x9100000e 0x1 0x20
write 0x9100000f 0x1 0x00
write 0x9100000c 0x1 0x00
write 0x91000020 0x1 0x00
EOF
Stack trace (part):
=================================================================
==89993==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x615000029900 at pc 0x55d5f885700d bp 0x7ffc1e1e9470 sp 0x7ffc1e1e9468
WRITE of size 1 at 0x615000029900 thread T0
#0 0x55d5f885700c in sdhci_write_dataport hw/sd/sdhci.c:564:39
#1 0x55d5f8849150 in sdhci_write hw/sd/sdhci.c:1223:13
#2 0x55d5fa01db63 in memory_region_write_accessor system/memory.c:497:5
#3 0x55d5fa01d245 in access_with_adjusted_size system/memory.c:573:18
#4 0x55d5fa01b1a9 in memory_region_dispatch_write system/memory.c:1521:16
#5 0x55d5fa09f5c9 in flatview_write_continue system/physmem.c:2711:23
#6 0x55d5fa08f78b in flatview_write system/physmem.c:2753:12
#7 0x55d5fa08f258 in address_space_write system/physmem.c:2860:18
...
0x615000029900 is located 0 bytes to the right of 512-byte region
[0x615000029700,0x615000029900) allocated by thread T0 here:
#0 0x55d5f7237b27 in __interceptor_calloc
#1 0x7f9e36dd4c50 in g_malloc0
#2 0x55d5f88672f7 in sdhci_pci_realize hw/sd/sdhci-pci.c:36:5
#3 0x55d5f844b582 in pci_qdev_realize hw/pci/pci.c:2092:9
#4 0x55d5fa2ee74b in device_set_realized hw/core/qdev.c:510:13
#5 0x55d5fa325bfb in property_set_bool qom/object.c:2358:5
#6 0x55d5fa31ea45 in object_property_set qom/object.c:1472:5
#7 0x55d5fa332509 in object_property_set_qobject om/qom-qobject.c:28:10
#8 0x55d5fa31f6ed in object_property_set_bool qom/object.c:1541:15
#9 0x55d5fa2e2948 in qdev_realize hw/core/qdev.c:292:12
#10 0x55d5f8eed3f1 in qdev_device_add_from_qdict system/qdev-monitor.c:719:10
#11 0x55d5f8eef7ff in qdev_device_add system/qdev-monitor.c:738:11
#12 0x55d5f8f211f0 in device_init_func system/vl.c:1200:11
#13 0x55d5fad0877d in qemu_opts_foreach util/qemu-option.c:1135:14
#14 0x55d5f8f0df9c in qemu_create_cli_devices system/vl.c:2638:5
#15 0x55d5f8f0db24 in qmp_x_exit_preconfig system/vl.c:2706:5
#16 0x55d5f8f14dc0 in qemu_init system/vl.c:3737:9
...
SUMMARY: AddressSanitizer: heap-buffer-overflow hw/sd/sdhci.c:564:39
in sdhci_write_dataport
Add assertions to ensure the fifo_buffer[] is not overflowed by
malicious accesses to the Buffer Data Port register.
Fixes: CVE-2024-3447
Cc: qemu-stable@nongnu.org
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58813
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <CAFEAcA9iLiv1XGTGKeopgMa8Y9+8kvptvsb8z2OBeuy+5=NUfg@mail.gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409145524.27913-1-philmd@linaro.org>
When the MAC Interface Layer (MIL) transmit FIFO is full,
truncate the packet, and raise the Transmitter Error (TXE)
flag.
Broken since model introduction in commit 2a42499017
("LAN9118 emulation").
When using the reproducer from
https://gitlab.com/qemu-project/qemu/-/issues/2267 we get:
hw/net/lan9118.c:798:17: runtime error:
index 2048 out of bounds for type 'uint8_t[2048]' (aka 'unsigned char[2048]')
#0 0x563ec9a057b1 in tx_fifo_push hw/net/lan9118.c:798:43
#1 0x563ec99fbb28 in lan9118_writel hw/net/lan9118.c:1042:9
#2 0x563ec99f2de2 in lan9118_16bit_mode_write hw/net/lan9118.c:1205:9
#3 0x563ecbf78013 in memory_region_write_accessor system/memory.c:497:5
#4 0x563ecbf776f5 in access_with_adjusted_size system/memory.c:573:18
#5 0x563ecbf75643 in memory_region_dispatch_write system/memory.c:1521:16
#6 0x563ecc01bade in flatview_write_continue_step system/physmem.c:2713:18
#7 0x563ecc01b374 in flatview_write_continue system/physmem.c:2743:19
#8 0x563ecbff1c9b in flatview_write system/physmem.c:2774:12
#9 0x563ecbff1768 in address_space_write system/physmem.c:2894:18
...
[*] LAN9118 DS00002266B.pdf, Table 5.3.3 "INTERRUPT STATUS REGISTER"
Cc: qemu-stable@nongnu.org
Reported-by: Will Lester
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2267
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240409133801.23503-3-philmd@linaro.org>
The magic 2048 is explained in the LAN9211 datasheet (DS00002414A)
in chapter 1.4, "10/100 Ethernet MAC":
The MAC Interface Layer (MIL), within the MAC, contains a
2K Byte transmit and a 128 Byte receive FIFO which is separate
from the TX and RX FIFOs. [...]
Note, the use of the constant in lan9118_receive() reveals that
our implementation is using the same buffer for both tx and rx.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240409133801.23503-2-philmd@linaro.org>
nand_command() and nand_getio() don't check @offset points
into the block, nor the available data length (s->iolen) is
not negative.
In order to fix:
- check the offset is in range in nand_blk_load_NAND_PAGE_SIZE(),
- do not set @iolen if blk_load() failed.
Reproducer:
$ cat << EOF | qemu-system-arm -machine tosa \
-monitor none -serial none \
-display none -qtest stdio
write 0x10000111 0x1 0xca
write 0x10000104 0x1 0x47
write 0x1000ca04 0x1 0xd7
write 0x1000ca01 0x1 0xe0
write 0x1000ca04 0x1 0x71
write 0x1000ca00 0x1 0x50
write 0x1000ca04 0x1 0xd7
read 0x1000ca02 0x1
write 0x1000ca01 0x1 0x10
EOF
=================================================================
==15750==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61f000000de0
at pc 0x560e61557210 bp 0x7ffcfc4a59f0 sp 0x7ffcfc4a59e8
READ of size 1 at 0x61f000000de0 thread T0
#0 0x560e6155720f in mem_and hw/block/nand.c:101:20
#1 0x560e6155ac9c in nand_blk_write_512 hw/block/nand.c:663:9
#2 0x560e61544200 in nand_command hw/block/nand.c:293:13
#3 0x560e6153cc83 in nand_setio hw/block/nand.c:520:13
#4 0x560e61a0a69e in tc6393xb_nand_writeb hw/display/tc6393xb.c:380:13
#5 0x560e619f9bf7 in tc6393xb_writeb hw/display/tc6393xb.c:524:9
#6 0x560e647c7d03 in memory_region_write_accessor softmmu/memory.c:492:5
#7 0x560e647c7641 in access_with_adjusted_size softmmu/memory.c:554:18
#8 0x560e647c5f66 in memory_region_dispatch_write softmmu/memory.c:1514:16
#9 0x560e6485409e in flatview_write_continue softmmu/physmem.c:2825:23
#10 0x560e648421eb in flatview_write softmmu/physmem.c:2867:12
#11 0x560e64841ca8 in address_space_write softmmu/physmem.c:2963:18
#12 0x560e61170162 in qemu_writeb tests/qtest/videzzo/videzzo_qemu.c:1080:5
#13 0x560e6116eef7 in dispatch_mmio_write tests/qtest/videzzo/videzzo_qemu.c:1227:28
0x61f000000de0 is located 0 bytes to the right of 3424-byte region [0x61f000000080,0x61f000000de0)
allocated by thread T0 here:
#0 0x560e611276cf in malloc /root/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x7f7959a87e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98)
#2 0x560e64b98871 in object_new qom/object.c:749:12
#3 0x560e64b5d1a1 in qdev_new hw/core/qdev.c:153:19
#4 0x560e61547ea5 in nand_init hw/block/nand.c:639:11
#5 0x560e619f8772 in tc6393xb_init hw/display/tc6393xb.c:558:16
#6 0x560e6390bad2 in tosa_init hw/arm/tosa.c:250:12
SUMMARY: AddressSanitizer: heap-buffer-overflow hw/block/nand.c:101:20 in mem_and
==15750==ABORTING
Broken since introduction in commit 3e3d5815cb ("NAND Flash memory
emulation and ECC calculation helpers for use by NAND controllers").
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1445
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1446
Reported-by: Qiang Liu <cyruscyliu@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409135944.24997-4-philmd@linaro.org>
Introduce virtio_bh_new_guarded(), similar to qemu_bh_new_guarded()
but using the transport memory guard, instead of the device one
(there can only be one virtio device per virtio bus).
Inspired-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240409105537.18308-2-philmd@linaro.org>
Passing the tswapped structure to strace means that
our internal si_type is also gone, which then aborts
in print_siginfo.
Fixes: 4d6d8a05a0 ("linux-user: Move tswap_siginfo out of target code")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already attempted to set and clear can_do_io before the first
and last insns, but only used the initial value of max_insns and
the call to translator_io_start to find those insns.
Now that we track insn_start in DisasContextBase, and now that
we have emit_before_op, we can wait until we have finished
translation to identify the true first and last insns and emit
the sets of can_do_io at that time.
This fixes the case of a translation block which crossed a page
boundary, and for which the second page turned out to be mmio.
In this case we truncate the block, and the previous logic for
can_do_io could leave a block with a single insn with can_do_io
set to false, which would fail an assertion in cpu_io_recompile.
Reported-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To keep the multiple update check, replace insn_start
with insn_start_updated.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To keep the multiple update check, replace insn_start
with insn_start_updated.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To keep the multiple update check, replace insn_start
with insn_start_updated.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
CHECK_NOT_DELAY_SLOT is correctly applied to the branch-related
instructions, but not to the PC-relative mov* instructions.
I verified the existence of an illegal slot exception on a SH7091 when
any of these instructions are attempted inside a delay slot.
This also matches the behavior described in the SH-4 ISA manual.
Signed-off-by: Zack Buhman <zack@buhman.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240407150705.5965-1-zack@buhman.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewd-by: Yoshinori Sato <ysato@users.sourceforge.jp>
The contents of IIAOQ depend on PSW_W.
Follow the text in "Interruption Instruction Address Queues",
pages 2-13 through 2-15.
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Reported-by: Sven Schnelle <svens@stackframe.org>
Fixes: b10700d826 ("target/hppa: Update IIAOQ, IIASQ for pa2.0")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Turned out hard-coding version and date in the Makefile wasn't a bright
idea. Updating it on edk2 updates is easily forgotten. Fetch the info
from git instead. Store in edk2-version, so this can be committed to
the repo and is present in tarballs too.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240327102448.61877-2-kraxel@redhat.com>
Let's not care about what was changed and update the whole config,
reasons:
1. config->geometry should be updated together with capacity, so we fix
a bug.
2. Vhost-user protocol doesn't say anything about config change
limitation. Silent ignore of changes doesn't seem to be correct.
3. vhost-user-vsock reads the whole config
4. on realize we don't do any checks on retrieved config, so no reason
to care here
Comment "valid for resize only" exists since introduction the whole
hw/block/vhost-user-blk.c in commit
00343e4b54
"vhost-user-blk: introduce a new vhost-user-blk host device",
seems it was just an extra limitation.
Also, let's notify guest unconditionally:
1. So does vhost-user-vsock
2. We are going to reuse the functionality in new cases when we do want
to notify the guest unconditionally. So, no reason to create extra
branches in the logic.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20240329183758.3360733-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The set_config callback function vhost_vdpa_device_get_config in
vdpa-dev does not fetch the current device status from the hardware
device, causing the guest os to not receive the latest device status
information.
The hardware updates the config status of the vdpa device and then
notifies the os. The guest os receives an interrupt notification,
triggering a get_config access in the kernel, which then enters qemu
internally. Ultimately, the vhost_vdpa_device_get_config function of
vdpa-dev is called
One scenario encountered is when the device needs to bring down the
vdpa net device. After modifying the status field of virtio_net_config
in the hardware, it sends an interrupt notification. However, the guest
os always receives the STATUS field as VIRTIO_NET_S_LINK_UP.
Signed-off-by: Yuxue Liu <yuxue.liu@jaguarmicro.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20240408020003.1979-1-yuxue.liu@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the event of writing many chains of descriptors, the device must
write just the id of the last buffer in the descriptor chain, skip
forward the number of descriptors in the chain, and then repeat the
operations for the rest of chains.
Current QEMU code writes all the buffer ids consecutively, and then
skips all the buffers altogether. This is a bug, and can be reproduced
with a VirtIONet device with _F_MRG_RXBUB and without
_F_INDIRECT_DESC:
If a virtio-net device has the VIRTIO_NET_F_MRG_RXBUF feature
but not the VIRTIO_RING_F_INDIRECT_DESC feature,
'VirtIONetQueue->rx_vq' will use the merge feature
to store data in multiple 'elems'.
The 'num_buffers' in the virtio header indicates how many elements are merged.
If the value of 'num_buffers' is greater than 1,
all the merged elements will be filled into the descriptor ring.
The 'idx' of the elements should be the value of 'vq->used_idx' plus 'ndescs'.
Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Wafer <wafer@jaguarmicro.com>
Message-Id: <20240407015451.5228-2-wafer@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current handling of invalid virtqueue elements inside the TX/RX virt
queue handlers is wrong.
They are added in a per-stream invalid queue to be processed after the
handler is done examining each message, but the invalid message might
not be specifying any stream_id; which means it's invalid to add it to
any stream->invalid queue since stream could be NULL at this point.
This commit moves the invalid queue to the VirtIOSound struct which
guarantees there will always be a valid temporary place to store them
inside the tx/rx handlers. The queue will be emptied before the handler
returns, so the queue must be empty at any other point of the device's
lifetime.
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <virtio-snd-rewrite-invalid-tx-rx-message-handling-v1.manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch improves error handling in virtio_snd_handle_tx_xfer()
and virtio_snd_handle_rx_xfer() in the VirtIO sound driver. Previously,
'goto' statements were used for error paths, leading to unnecessary
processing and potential null pointer dereferences. Now, 'continue' is
used to skip the rest of the current loop iteration for errors such as
message size discrepancies or null streams, reducing crash risks.
ASAN log illustrating the issue addressed:
ERROR: AddressSanitizer: SEGV on unknown address 0x0000000000b4
#0 0x57cea39967b8 in qemu_mutex_lock_impl qemu/util/qemu-thread-posix.c:92:5
#1 0x57cea128c462 in qemu_mutex_lock qemu/include/qemu/thread.h:122:5
#2 0x57cea128d72f in qemu_lockable_lock qemu/include/qemu/lockable.h:95:5
#3 0x57cea128c294 in qemu_lockable_auto_lock qemu/include/qemu/lockable.h:105:5
#4 0x57cea1285eb2 in virtio_snd_handle_rx_xfer qemu/hw/audio/virtio-snd.c:1026:9
#5 0x57cea2caebbc in virtio_queue_notify_vq qemu/hw/virtio/virtio.c:2268:9
#6 0x57cea2cae412 in virtio_queue_host_notifier_read qemu/hw/virtio/virtio.c:3671:9
#7 0x57cea39822f1 in aio_dispatch_handler qemu/util/aio-posix.c:372:9
#8 0x57cea3979385 in aio_dispatch_handlers qemu/util/aio-posix.c:414:20
#9 0x57cea3978eb1 in aio_dispatch qemu/util/aio-posix.c:424:5
#10 0x57cea3a1eede in aio_ctx_dispatch qemu/util/async.c:360:5
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20240322110827.568412-1-zheyuma97@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
subj is calling kvm_add_routing_entry() which simply extends
KVMState::irq_routes::entries[]
but doesn't check if number of routes goes beyond limit the kernel
is willing to accept. Which later leads toi the assert
qemu-kvm: ../accel/kvm/kvm-all.c:1833: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
typically it happens during guest boot for large enough guest
Reproduced with:
./qemu --enable-kvm -m 8G -smp 64 -machine pc \
`for b in {1..2}; do echo -n "-device pci-bridge,id=pci$b,chassis_nr=$b ";
for i in {0..31}; do touch /tmp/vblk$b$i;
echo -n "-drive file=/tmp/vblk$b$i,if=none,id=drive$b$i,format=raw
-device virtio-blk-pci,drive=drive$b$i,bus=pci$b ";
done; done`
While crash at boot time is bad, the same might happen at hotplug time
which is unacceptable.
So instead calling kvm_add_routing_entry() unconditionally, check first
that number of routes won't exceed KVM_CAP_IRQ_ROUTING. This way virtio
device insteads killin qemu, will gracefully fail to initialize device
as expected with following warnings on console:
virtio-blk failed to set guest notifier (-28), ensure -accel kvm is set.
virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <20240408110956.451558-1-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
GCC 14 shows -Wshadow=local warnings if an enum conflicts with a local
variable (including a parameter). To avoid this, move the problematic
enum and all of its dependencies after the hundreds of functions that
have a parameter named "instruction".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migration pull for 9.0-rc3
- Wei/Lei's fix on a rare postcopy race that can hang the channel (since 8.0)
- Avihai's fix on maintainers file, points to the right doc links
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZhLpJBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wa87AEAhvXqJyLxYYdlQ5fqp4hVV6O/3N1vNHMu
# kT3d9tmM0jsBAJ5KxK176iGDp+ej5MEyYSm1gG7ivj3y3v3wlPnSmJMJ
# =T1lk
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 07 Apr 2024 19:42:44 BST
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240407-pull-request' of https://gitlab.com/peterx/qemu:
MAINTAINERS: Adjust migration documentation files
migration/postcopy: ensure preempt channel is ready before loading states
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we do an AT address translation operation, the page table walk
is supposed to be performed in the context of the EL we're doing the
walk for, so for instance an AT S1E2R walk is done for EL2. In the
pseudocode an EL is passed to AArch64.AT(), which calls
SecurityStateAtEL() to find the security state that we should be
doing the walk with.
In ats_write64() we get this wrong, instead using the current
security space always. This is fine for AT operations performed from
EL1 and EL2, because there the current security state and the
security state for the lower EL are the same. But for AT operations
performed from EL3, the current security state is always either
Secure or Root, whereas we want to use the security state defined by
SCR_EL3.{NS,NSE} for the walk. This affects not just guests using
FEAT_RME but also ones where EL3 is Secure state and the EL3 code
is trying to do an AT for a NonSecure EL2 or EL1.
Use arm_security_space_below_el3() to get the SecuritySpace to
pass to do_ats_write() for all AT operations except the
AT S1E3* operations.
Cc: qemu-stable@nongnu.org
Fixes: e1ee56ec23 ("target/arm: Pass security space rather than flag for AT instructions")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2250
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240405180232.3570066-1-peter.maydell@linaro.org
Qemu wraps its call to ninja in a Makefile. Since ninja, as opposed to
make, utilizes all CPU cores by default, the qemu Makefile translates
the absense of a `-jN` argument into `-j1`. This breaks jobserver
functionality, so update the -jN mangling to take the --jobserver-auth
argument into considerationa too.
Signed-off-by: Martin Hundebøll <martin@geanix.com>
Message-Id: <20240402081738.1051560-1-martin@geanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
EL2 accesses to CNTPOFF_EL2 should only ever trap to EL3 if EL3 is
present, as described by the reference manual (for MRS):
/* ... */
elsif PSTATE.EL == EL2 then
if Halted() && HaveEL(EL3) && /*...*/ then
UNDEFINED;
elsif HaveEL(EL3) && SCR_EL3.ECVEn == '0' then
/* ... */
else
X[t, 64] = CNTPOFF_EL2;
However, the existing implementation of gt_cntpoff_access() always
returns CP_ACCESS_TRAP_EL3 for EL2 accesses with SCR_EL3.ECVEn unset. In
pseudo-code terminology, this corresponds to assuming that HaveEL(EL3)
is always true, which is wrong. As a result, QEMU panics in
access_check_cp_reg() when started without EL3 and running EL2 code
accessing the register (e.g. any recent KVM booting a guest).
Therefore, add the HaveEL(EL3) check to gt_cntpoff_access().
Fixes: 2808d3b38a ("target/arm: Implement FEAT_ECV CNTPOFF_EL2 handling")
Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Message-id: m3al6amhdkmsiy2f62w72ufth6dzn45xg5cz6xljceyibphnf4@ezmmpwk4tnhl
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu-sparc queue
# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmYOtvEeHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIf+5oIAJtRPiTP5aUmN4nU
# s72NBtgARBJ+5hHl0fqFFlCrG9elO28F1vhT9DwwBOLwihZCnfIXf+SCoE+pvqDw
# c+AMN/RnDu+1F4LF93W0ZIr305yGDfVlU+S3vKGtB9G4rcLeBDmNlhui2d0Bqx9R
# jwX1y57vcPclObE0KL6AVOfSDPYiVEVQSiTr3j4oW8TqAs2bduEZMRh6esb3XMIA
# hmj8mhZAszfh1YvX8ufbxtPQsnNuFMM+Fxgxp0pux8QaI0addDHwVNObRUYlTUZ1
# o4xCw7TRXXotaHde/OqZApFECs+md3R7rC2wj7s3ae0ynohHHDFfaB5t1f4pm+kA
# /6UN/Jc=
# =XwaI
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 04 Apr 2024 15:19:29 BST
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* tag 'qemu-sparc-20240404' of https://github.com/mcayland/qemu:
esp.c: remove explicit setting of DRQ within ESP state machine
esp.c: ensure esp_pdma_write() always calls esp_fifo_push()
esp.c: update esp_fifo_{push, pop}() to call esp_update_drq()
esp.c: introduce esp_update_drq() and update esp_fifo_{push, pop}_buf() to use it
esp.c: move esp_set_phase() and esp_get_phase() towards the beginning of the file
esp.c: prevent cmdfifo overflow in esp_cdb_ready()
esp.c: rework esp_cdb_length() into esp_cdb_ready()
esp.c: don't assert() if FIFO empty when executing non-DMA SELATNS
esp.c: introduce esp_fifo_push_buf() function for pushing to the FIFO
esp.c: change esp_fifo_pop_buf() to take ESPState
esp.c: use esp_fifo_push() instead of fifo8_push()
esp.c: change esp_fifo_pop() to take ESPState
esp.c: change esp_fifo_push() to take ESPState
esp.c: replace cmdfifo use of esp_fifo_pop() in do_message_phase()
esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_message_phase()
esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_command_phase()
esp.c: move esp_fifo_pop_buf() internals to new esp_fifo8_pop_buf() function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During normal use the cmdfifo will never wrap internally and cmdfifo_cdb_offset
will always indicate the start of the SCSI CDB. However it is possible that a
malicious guest could issue an invalid ESP command sequence such that cmdfifo
wraps internally and cmdfifo_cdb_offset could point beyond the end of the FIFO
data buffer.
Add an extra check to fifo8_peek_buf() to ensure that if the cmdfifo has wrapped
internally then esp_cdb_ready() will exit rather than allow scsi_cdb_length() to
access data outside the cmdfifo data buffer.
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240324191707.623175-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This modification ensures that in scenarios where the buffer size is
insufficient for a zone report, the function will now properly set an
error status and proceed to a cleanup label, instead of merely
returning.
The following ASAN log reveals it:
==1767400==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 312 byte(s) in 1 object(s) allocated from:
#0 0x64ac7b3280cd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x735b02fb9738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x64ac7d23be96 in virtqueue_split_pop hw/virtio/virtio.c:1612:12
#3 0x64ac7d23728a in virtqueue_pop hw/virtio/virtio.c:1783:16
#4 0x64ac7cfcaacd in virtio_blk_get_request hw/block/virtio-blk.c:228:27
#5 0x64ac7cfca7c7 in virtio_blk_handle_vq hw/block/virtio-blk.c:1123:23
#6 0x64ac7cfecb95 in virtio_blk_handle_output hw/block/virtio-blk.c:1157:5
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Message-id: 20240404120040.1951466-1-zheyuma97@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If no bytes are there to process in the message in phase,
the input data latch (s->sidl) is set to s->msg[-1]. Just
do nothing since no DMA is performed.
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Horizontal pel panning bit 3 is only used in text mode. In graphics
mode, it can be treated as if it was zero, thus not extending the
dirty memory region.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When pel panning is active, one more byte is read from each of the VGA
memory planes. This has to be accounted in the computation of region_end,
otherwise vga_draw_graphic() fails an assertion:
qemu-system-i386: ../system/physmem.c:946: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed.
Reported-by: Helge Konetzka <hk@zapateado.de>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2244
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the computation of region_start and region_end after the value of
"bits" is known. This makes it possible to distinguish modes that
support horizontal pel panning from modes that do not.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two sets of conditionals using the shift control bits: one to
verify the palette and adjust disp_width, one to compute the "v" and
"bits" variables. Merge them into one, with the extra benefit that
we now have the "bits" value available early and can use it to
compute region_end.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When vhost-user or vhost-kernel is handling virtio net datapath,
QEMU should not touch used ring.
But with vhost-user socket reconnect scenario, in a very rare case
(has pending kick event). VRING_USED_F_NO_NOTIFY is set by QEMU in
following code path:
#0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511
#1 0x0000559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576
#2 0x0000559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801
#3 0x0000559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248
#4 0x0000559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525
#5 0x0000559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321
#6 0x0000559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774
#7 0x0000559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259
#8 0x0000559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199
#9 0x0000559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783
#10 0x0000559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592
#11 0x0000559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266
#12 0x0000559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412
#13 0x0000559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311
#14 0x0000559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392
#15 0x0000559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455
#16 0x0000559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459
#17 0x0000559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301
#18 0x0000559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62
#19 0x0000559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82
This issue causes guest kernel stop kicking device and traffic stop.
Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong
VRING_USED_F_NO_NOTIFY set.
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240402045109.97729-1-yajunw@nvidia.com>
[PMD: Use unlikely()]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
../hw/nvme/ctrl.c:6081:21: error: ‘result’ may be used uninitialized [-Werror=maybe-uninitialized]
It's not obvious that 'result' is set in all code paths. When &result is
a returned argument, it's even less clear.
Looking at various assignments, 0 seems to be a suitable default value.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-ID: <20240328102052.3499331-18-marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity complains that the check introduced in commit 3f934817 suggests
that qiov could be NULL and we dereference it before reaching the check.
In fact, all of the callers pass a non-NULL pointer, so just remove the
misleading check.
Resolves: Coverity CID 1542668
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240327192750.204197-1-kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The commit 15002f60f7 ("util: rename qemu-error.c to match its header
name") renamed util/qemu-error.c to util/error-report.c but missed to
change the corresponding entry.
To avoid get_maintainer.pl failing, update the error-report.c entry.
Fixes: 15002f60f7 ("util: rename qemu-error.c to match its header name")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240327115539.3860270-1-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Similarly to commit 9de9fa5cf2 ("hw/arm/smmu-common: Avoid using
inlined functions with external linkage"):
None of our code base require / use inlined functions with external
linkage. Some places use internal inlining in the hot path. These
two functions are certainly not in any hot path and don't justify
any inlining, so these are likely oversights rather than intentional.
Fix:
C compiler for the host machine: clang (clang 15.0.0 "Apple clang version 15.0.0 (clang-1500.3.9.4)")
...
hw/arm/smmu-common.c:203:43: error: static function 'smmu_hash_remove_by_vmid' is
used in an inline function with external linkage [-Werror,-Wstatic-in-inline]
g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid);
^
include/hw/arm/smmu-common.h:197:1: note: use 'static' to give inline function 'smmu_iotlb_inv_vmid' internal linkage
void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid);
^
static
hw/arm/smmu-common.c:139:17: note: 'smmu_hash_remove_by_vmid' declared here
static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value,
^
Fixes: ccc3ee3871 ("hw/arm/smmuv3: Add CMDs related to stage-2")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240313184954.42513-2-philmd@linaro.org>
The test mangles the GPIO address and the pin number in the
qtest_add_data_func data parameter. Doing so, it assumes that the host
pointer size is always 64-bit, which breaks on 32-bit :
../tests/qtest/stm32l4x5_gpio-test.c: In function ‘test_gpio_output_mode’:
../tests/qtest/stm32l4x5_gpio-test.c:272:25: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
272 | unsigned int pin = ((uint64_t)data) & 0xF;
| ^
../tests/qtest/stm32l4x5_gpio-test.c:273:22: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
273 | uint32_t gpio = ((uint64_t)data) >> 32;
| ^
To fix, improve the mangling of the GPIO address and pin number fields
by using GPIO_SIZE so that the resulting value fits in a 32-bit pointer.
While at it, include some helpers to hide the details.
Cc: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Cc: Inès Varhol <ines.varhol@telecom-paris.fr>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-id: 20240329092747.298259-1-clg@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the group of the highest priority pending interrupt is disabled
via ICC_IGRPEN*, the ICC_HPPIR* registers should return
INTID_SPURIOUS, not the interrupt ID. (See the GIC architecture
specification pseudocode functions ICC_HPPIR1_EL1[] and
HighestPriorityPendingInterrupt().)
Make HPPIR reads honour the group disable, the way we already do
when determining whether to preempt in icc_hppi_can_preempt().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240328153333.2522667-1-peter.maydell@linaro.org
Hardware of sbsa-ref board is nowadays defined by both BSA and SBSA
specifications. Then BBR defines firmware interface.
Added note about DeviceTree data passed from QEMU to firmware. It is
very minimal and provides only data we use in firmware.
Added NUMA information to list of things reported by DeviceTree.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240328163851.1386176-1-marcin.juszkiewicz@linaro.org
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The HSTR_EL2 register allows the hypervisor to trap AArch32 EL1 and
EL0 accesses to cp15 registers. We incorrectly implemented this so
they trap to EL1 when we detect the need for a HSTR trap at code
generation time. (The check in access_check_cp_reg() which we do at
runtime to catch traps from EL0 is correctly routing them to EL2.)
Use the correct target EL when generating the code to take the trap.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2226
Fixes: 049edada5e ("target/arm: Make HSTR_EL2 traps take priority over UNDEF-at-EL1")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240325133116.2075362-1-peter.maydell@linaro.org
TILE-Gx has been removed during the v6.0 release (see
commit 2cc1a90166 "Remove deprecated target tilegx"),
no need to mention it in the list of "supported targets".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This fixes the invalid bInterfaceProtocol value 0x04 in the USB audio
AudioControl descriptors. It should be zero. While Linux and Windows
forgive this error, macOS 14 Sonoma does not. The usb-audio device does
not appear in macOS sound settings even though the device is recognized
and shows up in USB system information. According to the USB audio class
specs 1.0-4.0, valid values are 0x00, 0x20, 0x30 and 0x40. (Note also
that Linux prints the warning "unknown interface protocol 0x4, assuming
v1", but then proceeds as if the value was zero.)
This also fixes the invalid wTotalLength value in the multi-channel
setup AudioControl interface header descriptor (used when multi=on
and out.mixing-engine off). The combined length of all the descriptors
there add up to 0x37, not 0x38. In Linux, "lsusb -D ..." displays
incomplete descriptor information when this length is incorrect.
Signed-off-by: Joonas Kankaala <joonas.a.kankaala@gmail.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Migration pull for 9.0-rc2
- Avihai's two fixes on error paths
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZgmsOxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1waYKQD9G/B4c5u94Puhkr4o+K4M3FZ3J1pSpYRd
# nMAlrCWYLHQBAKV5q8DvgXbRNzT/Q+1UX7psxIsjyaqljxyJoZ+dIgAD
# =hucV
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 31 Mar 2024 19:32:27 BST
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240331-pull-request' of https://gitlab.com/peterx/qemu:
migration/postcopy: Ensure postcopy_start() sets errp if it fails
migration: Set migration error in migration_completion()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
After commit 9425ef3f99 ("migration: Use migrate_has_error() in
close_return_path_on_source()"), close_return_path_on_source() assumes
that migration error is set if an error occurs during migration.
This may not be true if migration errors in migration_completion(). For
example, if qemu_savevm_state_complete_precopy() errors, migration error
will not be set.
This in turn, will cause a migration hang bug, similar to the bug that
was fixed by commit 22b04245f0 ("migration: Join the return path
thread before releasing to_dst_file"), as shutdown() will not be issued
for the return-path channel.
Fix it by ensuring migration error is set in case of error in
migration_completion().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: 9425ef3f99 ("migration: Use migrate_has_error() in close_return_path_on_source()")
Acked-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Various fixes for recent regressions and new code.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmYJEQMACgkQZ7MCdqhi
# HK6l0BAAkVf/BXKJxMu3jLvCpK/fBYGytvfHBR9PdWeBwIirqsk3L8eI/Fb5qkMZ
# NMrfECyHR9LTcWb6/Pi/PGciNNWeyleN6IuVBeWfraIFyfHcxpwEKH8P+cXr5EWq
# WDg+1GUt9+FHuAC9UdGZ81UzX7qeI9VfD3wHceqJ/XRU3qjj67DPZjTpsvxuP64+
# N7MhdEM69F34uiIAn1aNCceXiS00dvtu6lDl3+18TzT8sNc6S3qdyxVcqfRhTJfY
# FMZIN3j2hQrVOElEQE9vAOeJyjAQCM+U0y3XZIZHFUw/GTwKV0tm08RFnnxprteG
# 67vR5uXrDEELnU/1PA1YeyaBMA3Z3Nc36XbGf8zTD6rKkS2z0lWMcs72pPIxbMXj
# c4FdnHaE+Q5ngy5s1p6bm5xM7WOEhrsJkgIu2N0weRroe0nAxywDWw3uQlMoV8Oc
# Xet/xM2IKdc0PLzTvFO7xKnW3oqavJ4CX/6XgrGBoMDZKO1JRqaMixGtYKmoH/1h
# 96+jdRbPTZAY8aoiFWW7t065lvdWt74A6QITcn2Kqm04j3MGJfyWMU6dakBzwuri
# PhOkf40o8qn8KN0JNfSO+IXhYVRRotLO/s9H7TEyQiXm25qrGMIF9FErnbDseZil
# rGR4eL0lcwJboYH9RSRWg0NNqpUekvqBzdnS+G0Ad3J+qaMYoik=
# =7UPB
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 31 Mar 2024 08:30:11 BST
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.0-3-20240331' of https://gitlab.com/npiggin/qemu:
tests/avocado: ppc_hv_tests.py set alpine time before setup-alpine
tests/avocado: Fix ppc_hv_tests.py xorriso dependency guard
target/ppc: Do not clear MSR[ME] on MCE interrupts to supervisor
target/ppc: Fix GDB register indexing on secondary CPUs
target/ppc: Restore [H]DEXCR to 64-bits
target/ppc/mmu-radix64: Use correct string format in walk_tree()
hw/ppc/spapr: Include missing 'sysemu/tcg.h' header
spapr: nested: use bitwise NOT operator for flags check
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the time is wrong, setup-alpine SSL certificate checks can fail.
setup-alpine is used to bring up the network, but it doesn't seem
to to set NTP time before the failing SSL checks. This test has
recently started failing presumably because the default time has
now fallen too far behind.
Fix this by setting time from the host time before running setup-alpine.
Fixes: c9cb496710 ("tests/avocado: ppc add hypervisor tests")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
For some reason the skipIf missing_deps() check fails to skip the test
if it comes after the skipUnless lines, causing an error running on
systems without xorriso.
Avocado implements skipUnless is just an inverted skipIf, so it's not
clear what the bug is or why this fixes it. For now it's enough to
get things working.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2246
Fixes: c9cb496710 ("tests/avocado: ppc add hypervisor tests")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Hardware clears the MSR[ME] bit when delivering a machine check
interrupt, so that is what QEMU does.
The spapr environment runs in supervisor mode though, and receives
machine check interrupts after they are processed by the hypervisor,
and MSR[ME] must always be enabled in supervisor mode (otherwise it
could checkstop the system). So MSR[ME] must not be cleared when
delivering machine checks to the supervisor.
The fix to prevent supervisor mode from modifying MSR[ME] also
prevented it from re-enabling the incorrectly cleared MSR[ME] bit
when returning from handling the interrupt. Before that fix, the
problem was not very noticable with well-behaved code. So the
Fixes tag is not strictly correct, but practically they go together.
Found by kvm-unit-tests machine check tests (not yet upstream).
Fixes: 678b6f1af7 ("target/ppc: Prevent supervisor from modifying MSR[ME]")
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The GDB server protocol assigns an arbitrary numbering of the SPRs.
We track this correspondence on each SPR with gdb_id, using it to
resolve any SPR requests GDB makes.
Early on we generate an XML representation of the SPRs to give GDB,
including this numbering. However the XML is cached globally, and we
skip setting the SPR gdb_id values on subsequent threads if we detect
it is cached. This causes QEMU to fail to resolve SPR requests against
secondary CPUs because it cannot find the matching gdb_id value on that
thread's SPRs.
This is a minimal fix to first assign the gdb_id values, then return
early if the XML is cached. Otherwise we generate the XML using the
now already initialised gdb_id values.
Fixes: 1b53948ff8 ("target/ppc: Use GDBFeature for dynamic XML")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The DEXCR emulation was recently changed to a 32-bit register, possibly
because it does have a 32-bit read-only view. It is a full 64-bit
SPR though, so use the corresponding 64-bit write functions.
Fixes: fbda88f7ab ("target/ppc: Fix width of some 32-bit SPRs")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
'mask', 'nlb' and 'base_addr' are all uin64_t types.
Use the corresponding PRIx64 format.
Fixes: d2066bc50d ("target/ppc: Check page dir/table base alignment")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
"sysemu/tcg.h" declares tcg_enabled(), and is implicitly included.
Include it explicitly to avoid the following error when refactoring
headers:
hw/ppc/spapr.c:2612:9: error: call to undeclared function 'tcg_enabled'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
if (tcg_enabled()) {
^
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Check for flag bit in H_GUEST_GETSET_STATE_FLAG_GUEST_WIDE need to use
bitwise NOT operator to ensure no other flag bits are set.
Resolves: Coverity CID 1540008
Resolves: Coverity CID 1540009
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Using log_pc produces the pc at the beginning of TB,
not the actual pc installed by cpu_restore_state_from_tb,
which could be any of the guest instructions within TB.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The 32-bit PA-7300LC (PCX-L2) CPU and the 64-bit PA8700 (PCX-W2) CPU
use different diag instructions to save or restore the CPU registers
to/from the shadow registers.
Implement those per-CPU architecture diag instructions to fix those
parts of the HP ODE testcases (L2DIAG and WDIAG, section 1) which test
the shadow registers.
Signed-off-by: Helge Deller <deller@gmx.de>
[rth: Use decodetree to distinguish cases]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
This reverts commit 46d4d36d0b.
The reverted commit changed to emit warnings instead of errors when
vhost is requested but vhost initialization fails if vhostforce option
is not set.
However, vhostforce is not meant to ignore vhost errors. It was once
introduced as an option to commit 5430a28fe4 ("vhost: force vhost off
for non-MSI guests") to force enabling vhost for non-MSI guests, which
will have worse performance with vhost. The option was deprecated with
commit 1e7398a140 ("vhost: enable vhost without without MSI-X") and
changed to behave identical with the vhost option for compatibility.
Worse, commit bf769f742c ("virtio: del net client if net_init_tap_one
failed") changed to delete the client when vhost fails even when the
failure only results in a warning. The leads to an assertion failure
for the -netdev command line option.
The reverted commit was intended to avoid that the vhost initialization
failure won't result in a corrupted netdev. This problem should have
been fixed by deleting netdev when the initialization fails instead of
ignoring the failure with an arbitrary option. Fortunately, commit
bf769f742c ("virtio: del net client if net_init_tap_one failed"),
mentioned earlier, implements this behavior.
Restore the correct semantics and fix the assertion failure for the
-netdev command line option by reverting the problematic commit.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Some of them are only necessary for POSIX systems. The others are
assigned to function pointers in NetClientInfo that can actually be
NULL.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It is incorrect to have the VIRTIO_NET_HDR_F_NEEDS_CSUM set when
checksum offloading is disabled so clear the bit.
TCP/UDP checksum is usually offloaded when the peer requires virtio
headers because they can instruct the peer to compute checksum. However,
igb disables TX checksum offloading when a VF is enabled whether the
peer requires virtio headers because a transmitted packet can be routed
to it and it expects the packet has a proper checksum. Therefore, it
is necessary to have a correct virtio header even when checksum
offloading is disabled.
A real TCP/UDP checksum will be computed and saved in the buffer when
checksum offloading is disabled. The virtio specification requires to
set the packet checksum stored in the buffer to the TCP/UDP pseudo
header when the VIRTIO_NET_HDR_F_NEEDS_CSUM bit is set so the bit must
be cleared in that case.
Fixes: ffbd2dbd8e ("e1000e: Perform software segmentation for loopback")
Buglink: https://issues.redhat.com/browse/RHEL-23067
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
virtio_net_guest_notifier_pending() and virtio_net_guest_notifier_mask()
checked VIRTIO_NET_F_MQ to know there are multiple queues, but
VIRTIO_NET_F_RSS also enables multiple queues. Refer to n->multiqueue,
which is set to true either of VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is
enabled.
Fixes: 68b0a6395f ("virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The local 9p driver in virtio-9p-test.c its temporary dir right at the
start of qos-test (via virtio_9p_create_local_test_dir()) and only
deletes it after qos-test is finished (via
virtio_9p_remove_local_test_dir()).
This means that any qos-test machine that ends up running virtio-9p-test
local tests more than once will end up re-using the same temp dir. This
is what's happening in [1] after we introduced the riscv machine nodes:
if we enable slow tests with the '-m slow' flag using
qemu-system-riscv64, this is what happens:
- a temp dir is created;
- virtio-9p-device tests will run virtio-9p-test successfully;
- virtio-9p-pci tests will run virtio-9p-test, and fail right at the
first slow test at fs_create_dir() because the "01" file was already
created by fs_create_dir() test when running with the virtio-9p-device.
The root cause is that we're creating a single temporary dir, via the
construct/destruct callbacks, and this temp dir is kept for the entire
qos-test run.
We can change each test to clean after themselves. This approach would
make the 'create' tests obsolete since we would need to create and
delete dirs/files/symlinks for the cleanup, turning them into the
'unlinkat' tests that comes right after.
We chose a different approach that handles the root cause: do not use
constructor/destructor to create the temp dir. Create one temp dir for
each test, and remove it after the test is complete. This is the
approach taken for other qtests like vhost-user-test.c where each test
requires a setup() and a subsequent cleanup(), all of those instantiated
in the .before callback.
[1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg05807.html
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20240327142011.805728-2-dbarboza@ventanamicro.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Overflow indicator should include the effect of the shift step.
We had previously left ??? comments about the issue.
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prepare for proper indication of shladd unsigned overflow.
The UV indicator will be zero/not-zero instead of a single bit.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The cond_need_ext predicate was created while we still had a
32-bit compilation mode. It now makes more sense to treat D
as an absolute indicator of a 64-bit operation.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split do_unit_cond to do_unit_zero_cond to only handle conditions
versus zero. These are the only ones that are legal for UXOR.
Simplify trans_uxor accordingly.
Rename do_unit to do_unit_addsub, since xor has been split.
Properly compute carry-out bits for add and subtract, mirroring
the code in do_add and do_sub.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Fixes: b2167459ae ("target-hppa: Implement basic arithmetic")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With r1 as zero is by far the most common usage of UADDCM, as the
easiest way to invert a register. The compiler does occasionally
use the addition step as well, and we can simplify that to avoid
a temp and write directly into the destination.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The carry bits for each nibble N are located in bit (N+1)*4,
so the shift by 3 was off by one. Furthermore, the carry bit
for the most significant carry bit is indeed located in bit 64,
which is located in a different storage word.
Use a double-word shift-right to reassemble into a single word
and place them all at bit 0 of their respective nibbles.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Fixes: b2167459ae ("target-hppa: Implement basic arithmetic")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Call translator_io_start before write to EIRR.
Move evaluation of EIRR vs EIEM to hppa_cpu_exec_interrupt.
Exit TB after write to EIEM, but otherwise use a straight store.
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The call to gen_helper_read_interval_timer is
identical on both sides of the IF.
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not clobber the high bits of the address by using a 32-bit deposit.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The return address comes from IA*Q_Next, and IASQ_Next
is always equal to IASQ_Back, not IASQ_Front.
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the h != g && shmaddr == NULL && !reserved_va case, target_shmat()
incorrectly mmap()s the initial anonymous range with
MAP_FIXED_NOREPLACE, even though the earlier mmap_find_vma() has
already reserved the respective address range.
Fix by using MAP_FIXED when "mapped", which is set after
mmap_find_vma(), is true.
Fixes: 78bc8ed9a8 ("linux-user: Rewrite target_shmat")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240325192436.561154-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The indices of arguments used with semctl() are all off-by-1, because
arg1 is the ipc() command. Fix them. While at it, reuse print_semctl().
New output (for a small test program):
3540333 semctl(999,888,SEM_INFO,0x00007fe5051ee9a0) = -1 errno=14 (Bad address)
Fixes: 7ccfb2eb5f ("Fix warnings that would be caused by gcc flag -Wwrite-strings")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240325192436.561154-2-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
I observed [NSTrackingArea rect] becomes de-synchronized with the view
frame with some unknown condition, and fails to track mouse movement on
some area of the view. Specify NSTrackingInVisibleRect option to let
Cocoa automatically update NSTrackingArea, which also saves code for
synchronization.
Fixes: 91aa508d02 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240323-fixes-v2-3-18651a2b0394@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[NSWindow setContentAspectRatio:] does not trigger window resize itself,
so the wrong aspect ratio will persist if nothing resizes the window.
Call [NSWindow setContentSize:] in such a case.
Fixes: 91aa508d02 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240323-fixes-v2-1-18651a2b0394@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
QEMU build fails with
hw/i386/fw_cfg.c:74: undefined reference to `smbios_get_table_legacy'
when it's built with only 'microvm' enabled i.e. with config patch
+++ b/configs/devices/i386-softmmu/default.mak
@@ -26,7 +26,7 @@
# Boards:
#
-CONFIG_ISAPC=y
-CONFIG_I440FX=y
-CONFIG_Q35=y
+CONFIG_ISAPC=n
+CONFIG_I440FX=n
+CONFIG_Q35=n
It happens because I've fogotten/lost smbios_get_table_legacy() stub.
Fix it by adding missing stub as Philippe suggested.
Fixes: b42b0e4daa "smbios: build legacy mode code only for 'pc' machine"
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <20240326122630.85989-1-imammedo@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
1. The g_pattern_match_string() is deprecated when glib2 version >= 2.70.
Use g_pattern_spec_match_string() instead to avoid this problem.
2. The type of second parameter in g_ptr_array_add() is
'gpointer' {aka 'void *'}, but the type of reg->name is 'const char*'.
Cast the type of reg->name to 'gpointer' to avoid this problem.
compiler warning message:
contrib/plugins/execlog.c:330:17: warning: ‘g_pattern_match_string’
is deprecated: Use 'g_pattern_spec_match_string' instead [-Wdeprecated-declarations]
330 | if (g_pattern_match_string(pat, rd->name) ||
| ^~
In file included from /usr/include/glib-2.0/glib.h:67,
from contrib/plugins/execlog.c:9:
/usr/include/glib-2.0/glib/gpattern.h:57:15: note: declared here
57 | gboolean g_pattern_match_string (GPatternSpec *pspec,
| ^~~~~~~~~~~~~~~~~~~~~~
contrib/plugins/execlog.c:331:21: warning: ‘g_pattern_match_string’
is deprecated: Use 'g_pattern_spec_match_string' instead [-Wdeprecated-declarations]
331 | g_pattern_match_string(pat, rd_lower)) {
| ^~~~~~~~~~~~~~~~~~~~~~
/usr/include/glib-2.0/glib/gpattern.h:57:15: note: declared here
57 | gboolean g_pattern_match_string (GPatternSpec *pspec,
| ^~~~~~~~~~~~~~~~~~~~~~
contrib/plugins/execlog.c:339:63: warning: passing argument 2 of
‘g_ptr_array_add’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
339 | g_ptr_array_add(all_reg_names, reg->name);
| ~~~^~~~~~
In file included from /usr/include/glib-2.0/glib.h:33:
/usr/include/glib-2.0/glib/garray.h:198:62: note: expected
‘gpointer’ {aka ‘void *’} but argument is of type ‘const char *’
198 | gpointer data);
| ~~~~~~~~~~~~~~~~~~^~~~
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2210
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Message-ID: <20240326015257.21516-1-yaoxt.fnst@fujitsu.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Let clock_set_mul_div() return a boolean value whether the
clock has been updated or not, similarly to clock_set().
Return early when clock_set_mul_div() is called with
same mul/div values the clock has.
Acked-by: Luc Michel <luc@lmichel.fr>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20240325152827.73817-2-philmd@linaro.org>
In qemu monitor mode, when we use gpa2hva command to print the host
virtual address corresponding to a guest physical address, if the gpa is
not in RAM, the error message is below:
(qemu) gpa2hva 0x750000000
Memory at address 0x750000000is not RAM
A space is missed between '0x750000000' and 'is'.
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Fixes: e9628441df ("hmp: gpa2hva and gpa2hpa hostaddr command")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org>
Message-ID: <20240319021610.2423844-1-ruansy.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The io_timeout property, introduced in c9b6609 (part of 6.0) is
silently overwritten by the hardcoded default value of 30 seconds
(DEFAULT_IO_TIMEOUT) in scsi_generic_realize because that function is
being called after the properties have already been applied.
The property definition already has a default value which is applied
correctly when no value is explicitly set, so we can just remove the
code which overrides the io_timeout completely.
This has been tested by stracing SG_IO operations with the io_timeout
property set and unset and now sets the timeout field in the ioctl
request to the proper value.
Fixes: c9b6609b69 ("scsi: make io_timeout configurable")
Signed-off-by: Lorenz Brun <lorenz@brun.one>
Message-ID: <20240315145831.2531695-1-lorenz@brun.one>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
CXL emulation of interleave requires read and write hooks due to
requirement for subpage granularity. The Linux kernel stack now enables
using this memory as conventional memory in a separate NUMA node. If a
process is deliberately forced to run from that node
$ numactl --membind=1 ls
the page table walk on i386 fails.
Useful part of backtrace:
(cpu=cpu@entry=0x555556fd9000, fmt=fmt@entry=0x555555fe3378 "cpu_io_recompile: could not find TB for pc=%p")
at ../../cpu-target.c:359
(retaddr=0, addr=19595792376, attrs=..., xlat=<optimized out>, cpu=0x555556fd9000, out_offset=<synthetic pointer>)
at ../../accel/tcg/cputlb.c:1339
(cpu=0x555556fd9000, full=0x7fffee0d96e0, ret_be=ret_be@entry=0, addr=19595792376, size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at ../../accel/tcg/cputlb.c:2030
(cpu=cpu@entry=0x555556fd9000, p=p@entry=0x7ffff56fddc0, mmu_idx=<optimized out>, type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:2356
(cpu=cpu@entry=0x555556fd9000, addr=addr@entry=19595792376, oi=oi@entry=52, ra=ra@entry=0, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2439
at ../../accel/tcg/ldst_common.c.inc:301
at ../../target/i386/tcg/sysemu/excp_helper.c:173
(err=0x7ffff56fdf80, out=0x7ffff56fdf70, mmu_idx=0, access_type=MMU_INST_FETCH, addr=18446744072116178925, env=0x555556fdb7c0)
at ../../target/i386/tcg/sysemu/excp_helper.c:578
(cs=0x555556fd9000, addr=18446744072116178925, size=<optimized out>, access_type=MMU_INST_FETCH, mmu_idx=0, probe=<optimized out>, retaddr=0) at ../../target/i386/tcg/sysemu/excp_helper.c:604
Avoid this by plumbing the address all the way down from
x86_cpu_tlb_fill() where is available as retaddr to the actual accessors
which provide it to probe_access_full() which already handles MMIO accesses.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2180
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2220
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-ID: <20240307155304.31241-2-Jonathan.Cameron@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Previously, bdrv_pad_request() could not deal with a NULL qiov when
a read needed to be aligned. During prefetch, a stream job will pass a
NULL qiov. Add a test case to cover this scenario.
By accident, also covers a previous race during shutdown, where block
graph changes during iteration in bdrv_flush_all() could lead to
unreferencing the wrong block driver state and an assertion failure
later.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240322095009.346989-5-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Same rationale as for commit "block-backend: fix edge case in
bdrv_next() where BDS associated to BB changes". The block graph might
change between the bdrv_next() call and the bdrv_next_cleanup() call,
so it could be that the associated BDS is not the same that was
referenced previously anymore. Instead, rely on bdrv_next() to set
it->bs to the BDS it referenced and unreference that one in any case.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240322095009.346989-4-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some operations, e.g. block-stream, perform reads while discarding the
results (only copy-on-read matters). In this case, they will pass NULL
as the target QEMUIOVector, which will however trip bdrv_pad_request,
since it wants to extend its passed vector. In particular, this is the
case for the blk_co_preadv() call in stream_populate().
If there is no qiov, no operation can be done with it, but the bytes
and offset still need to be updated, so the subsequent aligned read
will actually be aligned and not run into an assertion failure.
In particular, this can happen when the request alignment of the top
node is larger than the allocated part of the bottom node, in which
case padding becomes necessary. For example:
> ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M -o cluster_size=32768
> ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2
> ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "compress", "file": "node0", "node-name": "node1" } }
> {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node1" } }
> EOF
Originally-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: do update bytes and offset in any case
add reproducer to commit message]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240322095009.346989-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VDUSE requires that virtqueues are first enabled before the DRIVER_OK
status flag is set; with the current API of the kernel module, it is
impossible to enable the opposite order in our block export code because
userspace is not notified when a virtqueue is enabled.
This requirement also mathces the normal initialisation order as done by
the generic vhost code in QEMU. However, commit 6c482547 accidentally
changed the order for vdpa-dev and broke access to VDUSE devices with
this.
This changes vdpa-dev to use the normal order again and use the standard
vhost callback .vhost_set_vring_enable for this. VDUSE devices can be
used with vdpa-dev again after this fix.
vhost_net intentionally avoided enabling the vrings for vdpa and does
this manually later while it does enable them for other vhost backends.
Reflect this in the vhost_net code and return early for vdpa, so that
the behaviour doesn't change for this device.
Cc: qemu-stable@nongnu.org
Fixes: 6c4825476a ('vdpa: move vhost_vdpa_set_vring_ready to the caller')
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240315155949.86066-1-kwolf@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tests 157 and 227 use the virtio-blk device, so we have to mark these
tests accordingly to be skipped if this devices is not available (e.g.
when running the tests with qemu-system-avr only).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240325154737.1305063-1-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QAPI patches patches for 2024-03-26
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmYCXs0SHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTRMkP/RR9ipyCh1P6KKuQCyouxM8nXZ945b33
# L0OBFY4cTZsMY98ZTU9i80f/Tjh58y5bYzSdr21jxBSdRlvb6w/Ys8xUsI2Hi/kZ
# czdKZbWT5IeSNrLw04CjhW2FKq8XXR7epHbPms2nBa1Xc5Jf8aIC2UhZKADeVCyE
# FCIUU8Ctkyvre3XGVOrIBhhVPs9hi33uG+HXiveQjjWwGZZ0v915sAZ4WvNEoiBS
# E0NwhJxtbQu8FwfBquE84o0TMzYb19MlU4zq3G5dv7PhLoXX9Lf097/tPEtoP4et
# H37eGD8MzL1hywf+E1vAw0HC/v7Ci5Kx1jpUUL03Hfvfa6Lktb6yYke8qAD1HopK
# o4btXKpfseuyTJuqTVaV4Z9tLqmpTddnUupBFkfKviMPdidAN9SM1qXx0VCM9mUo
# cwf5di6zOPTNYlsKuP6ofSGGaD5so9PGLGAZoGeZ1Cb7TLsgiqqD0GrPP0//1VLn
# GzOOsOA4Su9F2/qjkNu2HZ1jv6J105LbQnaPvqjVfkjzzmoOANYmViX3KmDwdBxX
# vSfy5/UpCP9TBv++N/ljRtrVzW9i4a2rIGeHWC0x/JJglIh45f2MzeyqJ5haV2n5
# MVY82yY7EY7U+YXUUUKSgGVWw7+/wj/R5wQ5a9Gjk9OlmTsTHxJIKIzhcw4lX9da
# ZmRLCSIlVu4Z
# =li36
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 26 Mar 2024 05:36:13 GMT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-qapi-2024-03-26' of https://repo.or.cz/qemu/armbru:
qapi: document parameters of query-cpu-model-* QAPI commands
qapi/block-core: improve Qcow2OverlapCheckFlags documentation
qapi: document leftover members in qapi/stats.json
qapi: document leftover members in qapi/run-state.json
qapi: document InputMultiTouchType
qga/qapi-schema: Refill doc comments to conform to current conventions
qapi: Correct documentation indentation and whitespace
qapi: Refill doc comments to conform to current conventions
qapi: Don't repeat member type in its documentation text
qapi: Start sentences with a capital letter, end them with a period
qapi: Fix abbreviation punctuation in doc comments
qapi: Fix typo in request-ebpf documentation
qapi: Fix argument markup in drive-mirror documentation
qapi: Tidy up indentation of add_client's example
qapi: Tidy up block-latency-histogram-set documentation some more
qapi: Expand a few awkward abbreviations in documentation
qapi: Drop stray Arguments: line from qmp_capabilities docs
qapi: Fix bogus documentation of query-migrationthreads
qapi: Resync MigrationParameter and MigrateSetParameters
qapi: Improve migration TLS documentation
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Most of fields have no description at all. Let's fix that. Still, no
reason to place here more detailed descriptions of what these
structures are, as we have public Qcow2 format specification.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240325120054.2693236-1-vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
[Capitalize "QEMU", update qapi/pragma.json]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
For legibility, wrap text paragraphs so every line is at most 70
characters long.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the refilled
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-13-armbru@redhat.com>
For legibility, wrap text paragraphs so every line is at most 70
characters long.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the refilled
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-11-armbru@redhat.com>
Documentation generated for the arguments of MEMORY_FAILURE looks like
"recipient": "MemoryFailureRecipient"
recipient is defined as "MemoryFailureRecipient".
"action": "MemoryFailureAction"
action that has been taken. action is defined as
"MemoryFailureAction".
"flags": "MemoryFailureFlags"
flags for MemoryFailureAction. action is defined as
"MemoryFailureFlags".
The "action is defined as ..." are redundant. Drop.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-10-armbru@redhat.com>
Commit d23055b8db (qapi: Require descriptions and tagged sections to
be indented) indented add_client's example too much. Revert that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-5-armbru@redhat.com>
[Move a stray hunk to the later patch it belongs to]
The doc comment documents an argument that doesn't exist. Would
fail compilation if it was marked up correctly. Delete.
The Returns: section fails to refer to the data type, leaving the user
to guess. Fix that.
The command name violates QAPI naming rules: it should be
query-migration-threads. Too late to fix.
Reported-by: John Snow <jsnow@redhat.com>
Fixes: 671326201d (migration: Introduce interface query-migrationthreads)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-4-armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Enum MigrationParameter mirrors the members of struct
MigrateSetParameters. Differences to MigrateSetParameters's member
documentation are pointless. Clean them up:
* @compress-level, @compress-threads, @decompress-threads, and
x-checkpoint-delay are more thoroughly documented for
MigrationParameter, so use that version for both.
* @max-cpu-throttle is almost the same. Use MigrationParameter's
version for both.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-3-armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
MigrateSetParameters is about setting parameters, and
MigrationParameters is about querying them. Their documentation of
@tls-creds and @tls-hostname has residual damage from a failed attempt
at de-duplicating them (see commit de63ab6124 "migrate: Share common
MigrationParameters struct" and commit 1bda8b3c69 "migration: Unshare
MigrationParameters struct for now").
MigrateSetParameters documentation issues:
* It claims plain text mode "was reported by omitting tls-creds"
before 2.9. MigrateSetParameters is not used for reporting, so this
is misleading. Delete.
* It similarly claims hostname defaulting to migration URI "was
reported by omitting tls-hostname" before 2.9. Delete as well.
Rephrase the remaining @tls-hostname contents for clarity.
Enum MigrationParameter mirrors the members of struct
MigrateSetParameters. Differences to MigrateSetParameters's member
documentation are pointless. Copy the new text to MigrationParameter.
MigrationParameters documentation issues:
* @tls-creds runs the two last sentences together without punctuation.
Fix that.
* Much of the contents on @tls-hostname only applies to setting
parameters, resulting in confusion. Replace by a suitable abridged
version of the new MigrateSetParameters text, and a note on
@tls-hostname omission in 2.8.
Additional damage is due to flawed doc fix commit
66fcb9d651 (qapi/migration: Add missing tls-authz documentation):
since it copied the missing MigrateSetParameters text from
MigrationParameters instead of MigrationParameter, the part on
recreating @tls-authz on the fly is missing. Copy that, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-2-armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[Some typos corrected]
* Fix timeouts in Travis-CI jobs
* Mark devices with user_creatable = false that can crash QEMU otherwise
* Fix s390x TEST-AND-SET TCG instruction emulation
* Move pc955* devices to hw/gpio/
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmYBhdgRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVcfA/9FulEN4HrjD3ObyboA+WfibXURwChui98
# 8LvL/fAGe3BZXQtspuNmPyrKRtIOrIwHJyFuxf2N5+8BuvGhEHQIuvIhQIj/rvfy
# X14KlldmQ3w3HlI3Ud4YiebLyK3AAFC2ApIywzFsnN+HoaHJR2EyDIb+T7OsGJZf
# ZLE/Z7qANxoNeZ+a3+rQR3SVpijyS3fXxDSaILrq2uW4kCCs/55O8Rt3Qb+PFSVd
# fF+OlpG6o+z73ACZc1u9Io4IO1ZZc/NdkmDTNz4HknkvJLTLF6kOECAxLl0ytgAG
# YRzBGKes29Zpa9wn/9rc75/OYNS0Ks+B19sQnijWUNX0zq5FkReXNXiyVcbT7d4p
# 6jFzlFnjj4ifB8uQkZTGcx/lL4s4VkPzF+f7fgHq9CKNrNsx8uca0TyQ8s4y+NGb
# C98kJdHd+QhCcuNnAbifCwuFaxQ8C4BdgzxVbU/sGDKNkINNkiTp+uue4TxnRKvV
# MfhqdnWRvqgZ0Rl4TxqcNfODK72Z1YNv3933OKE/mRJYS1Q529TIq4vfp8WIMsWQ
# 7+ipo4WKXhkiSOJZD6AkCoFum1W8yaDzUDJTw2Xt2bPBL3+FXcQyKkKVUMfzIJ8M
# KLe0Bb9W/pYU1ToTciTP0dkQF/02tG0Vik273445wPgH0x8OyHJkPF/ny1a7lKFO
# 5jreYdMxWdc=
# =lfZM
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 25 Mar 2024 14:10:32 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2024-03-25' of https://gitlab.com/thuth/qemu:
tests/tcg/s390x: Test TEST AND SET
target/s390x: Use mutable temporary value for op_ts
libqos/virtio.c: Correct 'flags' reading in qvirtqueue_kick
misc/pca955*: Move models under hw/gpio
aspeed: Make the ast1030-a1 SoC not user creatable
aspeed: Make the ast2600-a3 SoC not user creatable
hw/microblaze: Do not allow xlnx-zynqmp-pmu-soc to be created by the user
.travis.yml: Remove the unused xfslib-dev package
.travis.yml: Shorten the runtime of the problematic jobs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull for 9.0-rc1
- Fabiano's patch to revert fd: support on mapped-ram
- Peter's fix on postcopy regression on unnecessary dirty syncs
- Fabiano's fix on mapped-ram rare corrupt on zero page handling
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZf2uIxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1waqTgD/RjaWrcUYlHcfFcWlEQGrYqikCtZYI+oW
# YYdbLcCBOlQBAL/ecCbsFyaWyPnB1Eg3YFcj5g8AgogDHdg37HSxydgL
# =aWGi
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 22 Mar 2024 16:13:23 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240322-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Fix clearing of mapped-ram zero pages
migration/postcopy: Fix high frequency sync
migration: Revert mapped-ram multifd support to fd: URI
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity points out that g_setenv() can fail and we don't
check for this in qtest_inproc_init(). In practice this will
only fail if a memory allocation failed in setenv() or if
the caller passed an invalid architecture name (e.g. one
with an '=' in it), so rather than requiring the callsite
to check for failure, make g_setenv() failure fatal here,
similarly to what we did in commit aca68d95c5.
Resolves: Coverity CID 1497485
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240312183810.557768-8-peter.maydell@linaro.org
In test_compute_wait() we do
double units = bkt.max / 10;
which does an integer division and then assigns it to a double variable,
and similarly later on in the expression for an assertion.
Use 10.0 so that we do a floating point division and calculate the
exact value, rather than doing an integer division.
Spotted by Coverity.
Resolves: Coverity CID 1432564
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240312183810.557768-7-peter.maydell@linaro.org
In qvirtqueue_kick(), the 'flags' were previously being incorrectly read from
vq->avail instead of the correct vq->used location. This update ensures 'flags'
are read from the correct location as per the virtio standard.
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240320090442.267525-1-zheyuma97@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In pca9554_get_pin() and pca9554_set_pin(), we try to detect an
incorrect pin value, but we get the condition wrong, using ">"
when ">=" was intended.
This has no actual effect, because in pca9554_initfn() we
use the correct test when creating the properties and so
we'll never be called with an out of range value. However,
Coverity complains about the mismatch between the check and
the later use of the pin value in a shift operation.
Use the correct condition.
Resolves: Coverity CID 1534917
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240312183810.557768-5-peter.maydell@linaro.org
In net_init_af_xdp() we parse the arguments and allocate
a buffer of ints into sock_fds. However, although we
free this in the error exit path, we don't ever free it
in the successful return path. Coverity spots this leak.
Switch to g_autofree so we don't need to manually free the
array.
Resolves: Coverity CID 1534906
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240312183810.557768-4-peter.maydell@linaro.org
In socket_check_afunix_support() we call socket(PF_UNIX, SOCK_STREAM, 0)
to see if it works, but we call close() on the result whether it
worked or not. Only close the fd if the socket() call succeeded.
Spotted by Coverity.
Resolves: Coverity CID 1497481
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240312183810.557768-3-peter.maydell@linaro.org
Using xlnx-zynqmp-pmu-soc on the command line causes QEMU to crash:
./qemu-system-microblazeel -M petalogix-ml605 -device xlnx-zynqmp-pmu-soc
**
ERROR:tcg/tcg.c:813:tcg_register_thread: assertion failed: (n < tcg_max_ctxs)
Bail out!
Aborted (core dumped)
Mark the device with "user_creatable = false" to avoid that this can happen.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2229
Message-ID: <20240322183153.1023359-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The "[s390x] GCC (other-system)" and the "[s390x] GCC check-tcg"
jobs are hitting the 50 minutes timeout in Travis quite frequently
since a while.
To fix it, we've got to drop a lot of the targets from the target
list in the jobs to make them work again.
With regards to the "check-tcg" test, we can move the check with
"s390x-linux-user" to the "user" job instead which also builds
the s390x-linux-user target.
And while we're at it, remove the "--enable-fdt=system" configure
switch (since this is not required nowadays anymore).
Message-ID: <20240320104144.823425-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When the zero page detection is done in the multifd threads, we need
to iterate the second part of the pages->offset array and clear the
file bitmap for each zero page. The piece of code we merged to do that
is wrong.
The reason this has passed all the tests is because the bitmap is
initialized with zeroes already, so clearing the bits only really has
an effect during live migration and when a data page goes from having
data to no data.
Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240321201242.6009-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
With current code base I can observe extremely high sync count during
precopy, as long as one enables postcopy-ram=on before switchover to
postcopy.
To provide some context of when QEMU decides to do a full sync: it checks
must_precopy (which implies "data must be sent during precopy phase"), and
as long as it is lower than the threshold size we calculated (out of
bandwidth and expected downtime) QEMU will kick off the slow/exact sync.
However, when postcopy is enabled (even if still during precopy phase), RAM
only reports all pages as can_postcopy, and report must_precopy==0. Then
"must_precopy <= threshold_size" mostly always triggers and enforces a slow
sync for every call to migration_iteration_run() when postcopy is enabled
even if not used. That is insane.
It turns out it was a regress bug introduced in the previous refactoring in
8.0 as reported by Nina [1]:
(a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")
Then a workaround patch is applied at the end of release (8.0-rc4) to fix it:
(b) 28ef5339c3 ("migration: fix ram_state_pending_exact()")
However that "workaround" was overlooked when during the cleanup in this
9.0 release in this commit..
(c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Then the issue was re-exposed as reported by Nina [1].
The problem with (b) is that it only fixed the case for RAM, rather than
all the rest of iterators. Here a slow sync should only be required if all
dirty data (precopy+postcopy) is less than the threshold_size that QEMU
calculated. It is even debatable whether a sync is needed when switched to
postcopy. Currently ram_state_pending_exact() will be mostly noop if
switched to postcopy, and that logic seems to apply too for all the rest of
iterators, as sync dirty bitmap during a postcopy doesn't make much sense.
However let's leave such change for later, as we're in rc phase.
So rather than reusing commit (b), this patch provides the complete fix for
all iterators. When at it, cleanup a little bit on the lines around.
[1] https://gitlab.com/qemu-project/qemu/-/issues/1565
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320214453.584374-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This reverts commit decdc76772 in full
and also the relevant migration-tests from
7a09f09283.
After the addition of the new QAPI-based migration address API in 8.2
we've been converting an "fd:" URI into a SocketAddress, missing the
fact that the "fd:" syntax could also be used for a plain file instead
of a socket. This is a problem because the SocketAddress is part of
the API, so we're effectively asking users to create a "socket"
channel to pass in a plain file.
The easiest way to fix this situation is to deprecate the usage of
both SocketAddress and "fd:" when used with a plain file for
migration. Since this has been possible since 8.2, we can wait until
9.1 to deprecate it.
For 9.0, however, we should avoid adding further support to migration
to a plain file using the old "fd:" syntax or the new SocketAddress
API, and instead require the usage of either the old-style "file:" URI
or the FileMigrationArgs::filename field of the new API with the
"/dev/fdset/NN" syntax, both of which are already supported.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240319210941.1907-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
pull-loongarch-20240322
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZf1WZgAKCRBAov/yOSY+
# 35zZBADDPLM3130Q/2zsGhol1C538i4+hYRbrX+OsLnlaldyE3NqCPcgaKwVE3xS
# T9aOln91rDyQedz4DVYYSx+Oa1JpRjGko957REmopL50SJOYi6n7YhHJksaUirjJ
# tMDZdPClOegieOpCu8LgJAVhaxTpZvfLedJVPt7O6Fl/uP3pLg==
# =XLqh
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 22 Mar 2024 09:59:02 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240322' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int'
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
RISC-V PR for 9.0
* Do not enable all named features by default
* A range of Vector fixes
* Update APLIC IDC after claiming iforce register
* Remove the dependency of Zvfbfmin to Zfbfmin
* Fix mode in riscv_tlb_fill
* Fix timebase-frequency when using KVM acceleration
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmX9RscACgkQr3yVEwxT
# gBNaRg/+KUSF6AuY25pS7GawbufBbwWWaWN9G/inPVoCnLbeYrkB3uZw3nBd3iV8
# KiD9Azabl6TLBFC/f7eP9alNDIoSrq5EliayrlFEZIncYvig2Y3CkWUeK6oJqDp2
# Dz1Vah4IB96bU2/M9icyHkh3tnSnbhq0JrbgoAYwWutZy4ERYugTHulOGPxBj64I
# JIfb8wYqaak3Uak+g0mz/YBNHegLEDxIzIRhO4oWPE0MWKSO3t79G9qVAYi7pkFB
# ZQQasZy0h9ZpwKvVajiO8yjwh7COI0IPU+4vZNkNXue0SXQvAvcKA4DdaTwmMTio
# 9UM9HRB371F5LtJLdvAT2TR8FfW26Y7xBe458jheFOnPHKwxEFtUFCQ39UJB3bDN
# k7CYvU3GIqUJHD7PtYZfzTdYkdnIDpr9yKTPP2/nCN53FzXuJs/XTyySphJ6mZ2m
# dsr1bnJn/ncZP7W2vdWGfgQEKt2CHfE5qWM++RwhmQc+IKn2ImMA0hBsg6Gl2imB
# 9WANt3UX784VDmcwcFVgDgr6nftDs7gjVCtHAaRV7Oq2f9hcr17pRxg66mSXs0BX
# fMhcqHBe01LpZQRbaGQ0ImTQksEFyH2KTvt0kjF4SfpVzMfVOi/Zmy9goYNq4iYd
# tfucBbXVhpzbJ/9HeOzKAJQ2Wt0NyLiyDIOkWXj61WquS/0Mr9g=
# =8vP1
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 22 Mar 2024 08:52:23 GMT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240322' of https://github.com/alistair23/qemu:
target/riscv/kvm: fix timebase-frequency when using KVM acceleration
target/riscv: Fix mode in riscv_tlb_fill
target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin
hw/intc: Update APLIC IDC after claiming iforce register
target/riscv/vector_helper.c: optimize loops in ldst helpers
target/riscv: enable 'vstart_eq_zero' in the end of insns
trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
target/riscv: remove 'over' brconds from vector trans
target/riscv/vector_helpers: do early exit when vstart >= vl
target/riscv: always clear vstart for ldst_whole insns
target/riscv: always clear vstart in whole vec move insns
target/riscv/vector_helper.c: fix 'vmvr_v' memcpy endianess
trans_rvv.c.inc: set vstart = 0 in int scalar move insns
target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
target/riscv: do not enable all named features by default
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, QEMU only sets the iforce register to 0 and returns early
when claiming the iforce register. However, this may leave mip.meip
remains at 1 if a spurious external interrupt triggered by iforce
register is the only pending interrupt to be claimed, and the interrupt
cannot be lowered as expected.
This commit fixes this issue by calling riscv_aplic_idc_update() to
update the IDC status after the iforce register is claimed.
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240321104951.12104-1-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The vstart_eq_zero flag is updated at the beginning of the translation
phase from the env->vstart variable. During the execution phase all
functions will set env->vstart = 0 after a successful execution, but the
vstart_eq_zero flag remains the same as at the start of the block. This
will wrongly cause SIGILLs in translations that requires env->vstart = 0
and might be reading vstart_eq_zero = false.
This patch adds a new finalize_rvv_inst() helper that is called at the
end of each vector instruction that will both update vstart_eq_zero and
do a mark_vs_dirty().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1976
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
All helpers that rely on vstart >= vl are now doing early exits using
the VSTART_CHECK_EARLY_EXIT() macro. This macro will not only exit the
helper but also clear vstart.
We're still left with brconds that are skipping the helper, which is the
only place where we're clearing vstart. The pattern goes like this:
tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
(... calls helper that clears vstart ...)
gen_set_label(over);
return true;
This means that every time we jump to 'over' we're not clearing vstart,
which is an oversight that we're doing across the board.
Instead of setting vstart = 0 manually after each 'over' jump, remove
those brconds that are skipping helpers. The exception will be
trans_vmv_s_x() and trans_vfmv_s_f(): they don't use a helper and are
already clearing vstart manually in the 'over' label.
While we're at it, remove the (vl == 0) brconds from trans_rvbf16.c.inc
too since they're unneeded.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're going to make changes that will required each helper to be
responsible for the 'vstart' management, i.e. we will relieve the
'vstart < vl' assumption that helpers have today.
Helpers are usually able to deal with vstart >= vl, i.e. doing nothing
aside from setting vstart = 0 at the end, but the tail update functions
will update the tail regardless of vstart being valid or not. Unifying
the tail update process in a single function that would handle the
vstart >= vl case isn't trivial (see [1] for more info).
This patch takes a blunt approach: do an early exit in every single
vector helper if vstart >= vl, unless the helper is guarded with
vstart_eq_zero in the translation. For those cases the helper is ready
to deal with cases where vl might be zero, i.e. throwing exceptions
based on it like vcpop_m() and first_m().
Helpers that weren't changed:
- vcpop_m(), vfirst_m(), vmsetm(), GEN_VEXT_VIOTA_M(): these are guarded
directly with vstart_eq_zero;
- GEN_VEXT_VCOMPRESS_VM(): guarded with vcompress_vm_check() that checks
vstart_eq_zero;
- GEN_VEXT_RED(): guarded with either reduction_check() or
reduction_widen_check(), both check vstart_eq_zero;
- GEN_VEXT_FRED(): guarded with either freduction_check() or
freduction_widen_check(), both check vstart_eq_zero.
Another exception is vext_ldst_whole(), who operates on effective vector
length regardless of the current settings in vtype and vl.
[1] https://lore.kernel.org/qemu-riscv/1590234b-0291-432a-a0fa-c5a6876097bc@linux.alibaba.com/
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 8ff8ac6329 added a conditional to guard the vext_ldst_whole()
helper if vstart >= evl. But by skipping the helper we're also not
setting vstart = 0 at the end of the insns, which is incorrect.
We'll move the conditional to vext_ldst_whole(), following in line with
the removal of all brconds vstart >= vl that the next patch will do. The
idea is to make the helpers responsible for their own vstart management.
Fix ldst_whole isns by:
- remove the brcond that skips the helper if vstart is >= evl;
- vext_ldst_whole() now does an early exit with the same check, where
evl = (vlenb * nf) >> log2_esz, but the early exit will also clear
vstart.
The 'width' param is now unneeded in ldst_whole_trans() and is also
removed. It was used for the evl calculation for the brcond and has no
other use now. The 'width' is reflected in vext_ldst_whole() via
log2_esz, which is encoded by GEN_VEXT_LD_WHOLE() as
"ctzl(sizeof(ETYPE))".
Suggested-by: Max Chou <max.chou@sifive.com>
Fixes: 8ff8ac6329 ("target/riscv: rvv: Add missing early exit condition for whole register load/store")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Message-ID: <20240314175704.478276-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
These insns have 2 paths: we'll either have vstart already cleared if
vstart_eq_zero or we'll do a brcond to check if vstart >= maxsz to call
the 'vmvr_v' helper. The helper will clear vstart if it executes until
the end, or if vstart >= vl.
For starters, the check itself is wrong: we're checking vstart >= maxsz,
when in fact we should use vstart in bytes, or 'startb' like 'vmvr_v' is
calling, to do the comparison. But even after fixing the comparison we'll
still need to clear vstart in the end, which isn't happening too.
We want to make the helpers responsible to manage vstart, including
these corner cases, precisely to avoid these situations:
- remove the wrong vstart >= maxsz cond from the translation;
- add a 'startb >= maxsz' cond in 'vmvr_v', and clear vstart if that
happens.
This way we're now sure that vstart is being cleared in the end of the
execution, regardless of the path taken.
Fixes: f714361ed7 ("target/riscv: rvv-1.0: implement vstart CSR")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240314175704.478276-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
trans_vmv_x_s, trans_vmv_s_x, trans_vfmv_f_s and trans_vfmv_s_f aren't
setting vstart = 0 after execution. This is usually done by a helper in
vector_helper.c but these functions don't use helpers.
We'll set vstart after any potential 'over' brconds, and that will also
mandate a mark_vs_dirty() too.
Fixes: dedc53cbc9 ("target/riscv: rvv-1.0: integer scalar move instructions")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 3b8022269c added the capability of named features/profile
extensions to be added in riscv,isa. To do that we had to assign priv
versions for each one of them in isa_edata_arr[]. But this resulted in a
side-effect: vendor CPUs that aren't running priv_version_latest started
to experience warnings for these profile extensions [1]:
| $ qemu-system-riscv32 -M sifive_e
| qemu-system-riscv32: warning: disabling zic64b extension for hart
0x00000000 because privilege spec version does not match
| qemu-system-riscv32: warning: disabling ziccamoa extension for
hart 0x00000000 because privilege spec version does not match
This is benign as far as the CPU behavior is concerned since disabling
both extensions is a no-op (aside from riscv,isa). But the warnings are
unpleasant to deal with, especially because we're sending user warnings
for extensions that users can't enable/disable.
Instead of enabling all named features all the time, separate them by
priv version. During finalize() time, after we decided which
priv_version the CPU is running, enable/disable all the named extensions
based on the priv spec chosen. This will be enough for a bug fix, but as
a future work we should look into how we can name these extensions in a
way that we don't need an explicit ext_name => priv_ver as we're doing
here.
The named extensions being added in isa_edata_arr[] that will be
enabled/disabled based solely on priv version can be removed from
riscv_cpu_named_features[]. 'zic64b' is an extension that can be
disabled based on block sizes so it'll retain its own flag and entry.
[1] https://lists.gnu.org/archive/html/qemu-devel/2024-03/msg02592.html
Reported-by: Clément Chigot <chigot@adacore.com>
Fixes: 3b8022269c ("target/riscv: add riscv,isa to named features")
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Clément Chigot <chigot@adacore.com>
Message-ID: <20240312203214.350980-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel P. Berrangé <berrange@redhat.com> pointed out that the coroutine
pool size heuristic is very conservative. Instead of halving
max_map_count, he suggested reserving 5,000 mappings for non-coroutine
users based on observations of guests he has access to.
Fixes: 86a637e481 ("coroutine: cap per-thread local pool size")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240320181232.1464819-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Use EPERM for seccomp filter instead of killing QEMU when
an attempt to spawn child process is made
* Reduce priority of POLLHUP handling for socket chardevs
to increase likelihood of pending data being processed
* Fix chardev I/O main loop integration when TLS is enabled
* Fix broken crypto test suite when distro disables
SM4 algorithm
* Improve diagnosis of failed crypto tests
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmX585EACgkQvobrtBUQ
# T98TIg//ekc/f0JrRs68hjmo/vfcHWGHDMbZagj48zZNIn8DhJmQdt+qrCjMrMGW
# 353nTawFuF3EO9ju/eRLO54T+p1+a3zX8TyO4tL1W+RY9HARPeqssmFemDPfkMfQ
# IFGv0M0vaxGZpBna7jlXfDK/hCbJexKoChyT4eSF9H1Tp9o6T2J9AWvB5WTYLoQ2
# GzusDqBLKTkKhxMTCqevkFD/yCkgIQKlX8mG188PoJnGMqpGzQLTyw9lo5Npi1nE
# nhXa2MrrSfusk0rtwEzT14sQ58U+MF4fLQxUC+knNX81FSv8Q6QDu4Stfhwc+az7
# ynO4b/3IzK+VCICb2QM1ZNoTZNLcLfw1jdFTIAt8wiE+BMSySNQtdneURZOynydy
# Qd0alPNb4zfVRIGVjoOj38HiOmIKp5riIsUsI03jjBAgJu47tYRi60Tq2t6KxVoP
# rpDd5Vmsd0AR+7acO29rp0aLB+x2/ANDY+1N1Xi4tQdblmKIziHPZzx6H49wbwev
# 8Jdghg10RpbdqIGOfZ9fn13iCDO+1/gy6g/jTe2tMZrZsyov904tDqyUCDCzAbTz
# B8lvnr0LfSX2DYBryGEHIa/eMN2TxPuzpvZP0JFO1QxJnOs9w3aHr1T6A1sCV4a3
# JjTu71LsomNMXj3t3ImBHzMlgQZoL5Bxoh7b7jbLO4cvnhRbiJk=
# =4HKW
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Mar 2024 20:20:33 GMT
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu:
crypto: report which ciphers are being skipped during tests
crypto: use error_abort for unexpected failures
crypto: query gcrypt for cipher availability
crypto: factor out conversion of QAPI to gcrypt constants
Revert "chardev: use a child source for qio input source"
Revert "chardev/char-socket: Fix TLS io channels sending too much data to the backend"
chardev: lower priority of the HUP GSource in socket chardev
seccomp: report EPERM instead of killing process for spawn set
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The "link_depends" key has not been used since commit c46f76d158
("meson: specify fuzz linker script as a project arg", 2020-09-08),
and even before that it was only used for fork-fuzzing which we
removed in commit d2e6f9272d ("fuzz: remove fork-fuzzing scaffolding",
2023-02-16).
So, remove it for a very small simplification of meson.build.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids fetching blobs and tree references for branches we are not
going to worry about. Also skip tag references which are similarly not
useful and keep the default --prune. This keeps the .git data to
around 100M rather than the ~400M even a shallow clone takes.
So we can check the savings we also run a quick du while setting up
the build.
We also have to have special settings of GIT_FETCH_EXTRA_FLAGS for the
Windows build, the migration legacy test and the custom runners. In
the case of the custom runners we also move the free floating variable
to the runner template.
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240312170011.1688444-1-alex.bennee@linaro.org>
rec->count.score is inside rec, which is freed before rec->count.score is.
Reorder the instructions
Reported by Coverity as CID 1539967.
Cc: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce()
may have a parameter with NULL monitor pointer. Revert monitor_puts() in
do_inject_x86_mce() to fix, then the fact that we send the same message to
monitor and log is again more obvious.
Fixes: bf0c50d4aa (monitor: expose monitor_puts to rest of code)
Reviwed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240320083640.523287-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Building dbus-display1.c explicitly as a static library drops -fPIC by
default, which may not be correct if it ends up linked to a shared
library.
Let the target decide how to build the unit, with or without -fPIC. This
makes commit 186acfbaf7 ("tests/qtest: Depend on dbus_display1_dep") no
longer relevant, as dbus-display1.c will be recompiled.
Fixes: c172136ea3 ("meson: ensure dbus-display generated code is built
before other units")
Reported-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
ui/cocoa needs to update the UI info and reset the keyboard state
tracker when switching the console, or the new console will see the
stale UI info or keyboard state. Previously, updating the UI info was
done with cocoa_switch(), but it is meant to be called when the surface
is being replaced, and may be called even when not switching the
console. ui/cocoa never reset the keyboard state, which resulted in
stuck keys.
Add ui/cocoa's own implementation of console_select(), which updates the
UI info and resets the keyboard state tracker.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240319-console-v2-3-3fd6feef321a@daynix.com>
console_select() is shared by other displays and a console_select() call
from one of them triggers console switching also in ui/curses,
circumventing key state reinitialization that needs to be performed in
preparation and resulting in stuck keys.
Use its internal state to track the current active console to prevent
such a surprise console switch.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240319-console-v2-2-3fd6feef321a@daynix.com>
A chardev-vc used to inherit the size of a graphic console when its
size not explicitly specified, but it often did not make sense. If a
chardev-vc is instantiated during the startup, the active graphic
console has no content at the time, so it will have the size of graphic
console placeholder, which contains no useful information. It's better
to have the standard size of text console instead.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240319-console-v2-1-3fd6feef321a@daynix.com>
When we use qemu tcg simulation, the page size of bios is 4KB.
When using the level 2 super huge page (page size is 1G) to create the page table,
it is found that the content of the corresponding address space is abnormal,
resulting in the bios can not start the operating system and graphical interface normally.
The lddir and ldpte instruction emulation has
a problem with the use of super huge page processing above level 2.
The page size is not correctly calculated,
resulting in the wrong page size of the table entry found by tlb.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240318070332.1273939-1-lixianglai@loongson.cn>
Since the ciphers can be dynamically disabled at runtime, when running
unit tests it is helpful to report which ciphers we can skipped for
testing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This improves the error diagnosis from the unit test when a cipher
is unexpected not available from
ERROR:../tests/unit/test-crypto-cipher.c:683:test_cipher: assertion failed: (err == NULL)
Bail out! ERROR:../tests/unit/test-crypto-cipher.c:683:test_cipher: assertion failed: (err == NULL)
Aborted (core dumped)
to
Unexpected error in qcrypto_cipher_ctx_new() at ../crypto/cipher-gcrypt.c.inc:262:
./build//tests/unit/test-crypto-cipher: Cannot initialize cipher: Invalid cipher algorithm
Aborted (core dumped)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Just because a cipher is defined in the gcrypt header file, does not
imply that it can be used. Distros can filter the list of ciphers when
building gcrypt. For example, RHEL-9 disables the SM4 cipher. It is
also possible that running in FIPS mode might dynamically change what
ciphers are available at runtime.
qcrypto_cipher_supports must therefore query gcrypt directly to check
for cipher availability.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The conversion of cipher mode will shortly be required in more
than one place.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This reverts commit a7077b8e35,
and add comments to explain why child sources cannot be used.
When a GSource is added as a child of another GSource, if its
'prepare' function indicates readiness, then the parent's
'prepare' function will never be run. The io_watch_poll_prepare
absolutely *must* be run on every iteration of the main loop,
to ensure that the chardev backend doesn't feed data to the
frontend that it is unable to consume.
At the time a7077b8e35 was made,
all the child GSource impls were relying on poll'ing an FD,
so their 'prepare' functions would never indicate readiness
ahead of poll() being invoked. So the buggy behaviour was
not noticed and lay dormant.
Relatively recently the QIOChannelTLS impl introduced a
level 2 child GSource, which checks with GNUTLS whether it
has cached any data that was decoded but not yet consumed:
commit ffda5db65a
Author: Antoine Damhet <antoine.damhet@shadow.tech>
Date: Tue Nov 15 15:23:29 2022 +0100
io/channel-tls: fix handling of bigger read buffers
Since the TLS backend can read more data from the underlying QIOChannel
we introduce a minimal child GSource to notify if we still have more
data available to be read.
Signed-off-by: Antoine Damhet <antoine.damhet@shadow.tech>
Signed-off-by: Charles Frey <charles.frey@shadow.tech>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
With this, it is now quite common for the 'prepare' function
on a QIOChannelTLS GSource to indicate immediate readiness,
bypassing the parent GSource 'prepare' function. IOW, the
critical 'io_watch_poll_prepare' is being skipped on some
iterations of the main loop. As a result chardev frontend
asserts are now being triggered as they are fed data they
are not ready to consume.
A reproducer is as follows:
* In terminal 1 run a GNUTLS *echo* server
$ gnutls-serv --echo \
--x509cafile ca-cert.pem \
--x509keyfile server-key.pem \
--x509certfile server-cert.pem \
-p 9000
* In terminal 2 run a QEMU guest
$ qemu-system-s390x \
-nodefaults \
-display none \
-object tls-creds-x509,id=tls0,dir=$PWD,endpoint=client \
-chardev socket,id=con0,host=localhost,port=9000,tls-creds=tls0 \
-device sclpconsole,chardev=con0 \
-hda Fedora-Cloud-Base-39-1.5.s390x.qcow2
After the previous patch revert, but before this patch revert,
this scenario will crash:
qemu-system-s390x: ../hw/char/sclpconsole.c:73: chr_read: Assertion
`size <= SIZE_BUFFER_VT220 - scon->iov_data_len' failed.
This assert indicates that 'tcp_chr_read' was called without
'tcp_chr_read_poll' having first been checked for ability to
receive more data
QEMU's use of a 'prepare' function to create/delete another
GSource is rather a hack and not normally the kind of thing that
is expected to be done by a GSource. There is no mechanism to
force GLib to always run the 'prepare' function of a parent
GSource. The best option is to simply not use the child source
concept, and go back to the functional approach previously
relied on.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This commit results in unexpected termination of the TLS connection.
When 'fd_can_read' returns 0, the code goes on to pass a zero length
buffer to qio_channel_read. The TLS impl calls into gnutls_recv()
with this zero length buffer, at which point GNUTLS returns an error
GNUTLS_E_INVALID_REQUEST. This is treated as fatal by QEMU's TLS code
resulting in the connection being torn down by the chardev.
Simply skipping the qio_channel_read when the buffer length is zero
is also not satisfactory, as it results in a high CPU burn busy loop
massively slowing QEMU's functionality.
The proper solution is to avoid tcp_chr_read being called at all
unless the frontend is able to accept more data. This will be done
in a followup commit.
This reverts commit 462945cd22
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The socket chardev often has 2 GSource object registered against the
same FD. One is registered all the time and is just intended to handle
POLLHUP events, while the other gets registered & unregistered on the
fly as the frontend is ready to receive more data or not.
It is very common for poll() to signal a POLLHUP event at the same time
as there is pending incoming data from the disconnected client. It is
therefore essential to process incoming data prior to processing HUP.
The problem with having 2 GSource on the same FD is that there is no
guaranteed ordering of execution between them, so the chardev code may
process HUP first and thus discard data.
This failure scenario is non-deterministic but can be seen fairly
reliably by reverting a7077b8e35, and
then running 'tests/unit/test-char', which will sometimes fail with
missing data.
Ideally QEMU would only have 1 GSource, but that's a complex code
refactoring job. The next best solution is to try to ensure ordering
between the 2 GSource objects. This can be achieved by lowering the
priority of the HUP GSource, so that it is never dispatched if the
main GSource is also ready to dispatch. Counter-intuitively, lowering
the priority of a GSource is done by raising its priority number.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When something tries to run one of the spawn syscalls (eg clone),
our seccomp deny filter is set to cause a fatal trap which kills
the process.
This is found to be unhelpful when QEMU has loaded the nvidia
GL library. This tries to spawn a process to modprobe the nvidia
kmod. This is a dubious thing to do, but at the same time, the
code will gracefully continue if this fails. Our seccomp filter
rightly blocks the spawning, but prevent the graceful continue.
Switching to reporting EPERM will make QEMU behave more gracefully
without impacting the level of protect we have.
https://gitlab.com/qemu-project/qemu/-/issues/2116
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Pull request
This fix solves the "failed to set up stack guard page" error that has been
reported on Linux hosts where the QEMU coroutine pool exceeds the
vm.max_map_count limit.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmX5qq0ACgkQnKSrs4Gr
# c8ginQf8DRKzA7K8OivEegKpf0TgGcAcw9/xKc6zJH3X0/GXi1my61tzz+XUkbNy
# /R9HRrjBUb4MhSmJzP9kxuPFcBD5fZeipg4eTqtJCdi+DQ57+YypShVpsDrD7eNv
# X5dxeeONdWwP+k9JiOj9NtSOMmTKExn/Q/w45G2eeBlJh4yRA+56XN/dDXTFlidm
# NEpOGrKbyFKuAf/ZwYmeBr4aqIGTN3UgOVco/rqkGPYPTYpKlCoE5rSTEnQrbR7/
# C9KojlrGawJXlKjxfu/6i7yGHrv0eJ2N1VauvR/DHhQvdRhojVVt3NFGG/WJi+cL
# CMbxNyYeQJLNFtfPWzokjKEudxkshg==
# =lznr
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Mar 2024 15:09:33 GMT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
coroutine: cap per-thread local pool size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The coroutine pool implementation can hit the Linux vm.max_map_count
limit, causing QEMU to abort with "failed to allocate memory for stack"
or "failed to set up stack guard page" during coroutine creation.
This happens because per-thread pools can grow to tens of thousands of
coroutines. Each coroutine causes 2 virtual memory areas to be created.
Eventually vm.max_map_count is reached and memory-related syscalls fail.
The per-thread pool sizes are non-uniform and depend on past coroutine
usage in each thread, so it's possible for one thread to have a large
pool while another thread's pool is empty.
Switch to a new coroutine pool implementation with a global pool that
grows to a maximum number of coroutines and per-thread local pools that
are capped at hardcoded small number of coroutines.
This approach does not leave large numbers of coroutines pooled in a
thread that may not use them again. In order to perform well it
amortizes the cost of global pool accesses by working in batches of
coroutines instead of individual coroutines.
The global pool is a list. Threads donate batches of coroutines to when
they have too many and take batches from when they have too few:
.-----------------------------------.
| Batch 1 | Batch 2 | Batch 3 | ... | global_pool
`-----------------------------------'
Each thread has up to 2 batches of coroutines:
.-------------------.
| Batch 1 | Batch 2 | per-thread local_pool (maximum 2 batches)
`-------------------'
The goal of this change is to reduce the excessive number of pooled
coroutines that cause QEMU to abort when vm.max_map_count is reached
without losing the performance of an adequately sized coroutine pool.
Here are virtio-blk disk I/O benchmark results:
RW BLKSIZE IODEPTH OLD NEW CHANGE
randread 4k 1 113725 117451 +3.3%
randread 4k 8 192968 198510 +2.9%
randread 4k 16 207138 209429 +1.1%
randread 4k 32 212399 215145 +1.3%
randread 4k 64 218319 221277 +1.4%
randread 128k 1 17587 17535 -0.3%
randread 128k 8 17614 17616 +0.0%
randread 128k 16 17608 17609 +0.0%
randread 128k 32 17552 17553 +0.0%
randread 128k 64 17484 17484 +0.0%
See files/{fio.sh,test.xml.j2} for the benchmark configuration:
https://gitlab.com/stefanha/virt-playbooks/-/tree/coroutine-pool-fix-sizing
Buglink: https://issues.redhat.com/browse/RHEL-28947
Reported-by: Sanjay Rao <srao@redhat.com>
Reported-by: Boaz Ben Shabat <bbenshab@redhat.com>
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240318183429.1039340-1-stefanha@redhat.com>
aspeed, pnv, vfio queue:
* user device fixes for Aspeed and PowerNV machines
* coverity fix for iommufd
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmX5mm0ACgkQUaNDx8/7
# 7KE/MQ/9GeX4yNBxY2iTATdmPXwjMw8AtKyfIQb605nIO0ch1Z98ywl5VMwCNohn
# ppY9L5bFpEASgRlFVm73X4DGxKyRGpRPqylsvINh0hKciRpmRkELHY3llhnXsd7P
# Q197pDtFr54FeX8j4+hSAu4paT97fPENlKn0J6lto2I1cXGcD1LYNDFhysoXdGme
# brJgo7KjQJZPZ560ZewskL5FWf3G9EkRjpqd8y0G5OtNmAPgAaahOMHhDCXan182
# J89I9CHI5xN45MRfAs8JamSaj/GyNsr4h04WhPa0+VZQ5vsaeW2Ekt4ypj+oAV+p
# wykhYzQk4ALZcmmph2flSAtLa7uheI+imyqubMthQCDj3G8onSQBMd5/4WRK6O49
# 0oE1DpPDEfhlJEQYxaYhOeqeA9iaP+w6V+yE+L5oGlMO66cR7GZsPu0x7kXailbH
# IoHw9mO+vMkpuyeP7M3hA8WRFCdFpf1Nn1Ao5Jz3KoiTyJWlIvX5VSaj12sjddQ2
# fU9SKu2Q5QqS5uQGakkY64EyUy7RkGIX6zY2NIscVe2lfAfKf3mZwu7OIuLjEy5O
# lRn35vWV8fOdRooKoDPTNcdBCaNPi+RApin8chOv5P+F+ie7+Twf9sb1AgH/pIcv
# HptvTXbvSFNbbdb+OE8a5qsqTvnrN8d31IXzrWRYsJB07x2IyoA=
# =zR3v
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Mar 2024 14:00:13 GMT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-for-9.0-20240319' of https://github.com/legoater/qemu:
aspeed/smc: Only wire flash devices at reset
ppc/pnv: I2C controller is not user creatable
vfio/iommufd: Fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Aspeed machines have many Static Memory Controllers (SMC), up to
8, which can only drive flash memory devices. Commit 27a2c66c92
("aspeed/smc: Wire CS lines at reset") tried to ease the definitions
of these devices by allowing flash devices from the command line to be
attached to a SSI bus. For that, the wiring of the CS lines of the
Aspeed SMC controller was moved at reset. Two assumptions are made
though, first that the device has a SSI_GPIO_CS GPIO line, which is
not always the case, and second that it is a flash device.
Correct this problem by ensuring that the devices attached to the bus
are of the correct flash type. This fixes a QEMU abort when devices
without a CS line, such as the max111x, are passed on the command
line.
While at it, export TYPE_M25P80 used in the Xilinx Versal Virtual
machine.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2228
Fixes: 27a2c66c92 ("aspeed/smc: Wire CS lines at reset")
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
[ clg: minor fixes in the commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The I2C controller is a subunit of the processor. Make it so and avoid
QEMU crashes.
$ build/qemu-system-ppc64 -S -machine powernv9 -device pnv-i2c
qemu-system-ppc64: ../hw/ppc/pnv_i2c.c:521: pnv_i2c_realize: Assertion `i2c->chip' failed.
Aborted (core dumped)
Fixes: 263b81ee15 ("ppc/pnv: Add an I2C controller model")
Cc: Glenn Miles <milesg@linux.vnet.ibm.com>
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Coverity reported a memory leak on variable 'contents' in routine
iommufd_cdev_getfd(). Use g_autofree variables to simplify the exit
path and get rid of g_free() calls.
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Yi Liu <yi.l.liu@intel.com>
Fixes: CID 1540007
Fixes: 5ee3dc7af7 ("vfio/iommufd: Implement the iommufd backend")
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
virtio,pc,pci: bugfixes
Some minor fixes plus a big patchset from Igor fixing
a regression with windows.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmX4NzsPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpkp0H/1foAaDYrApMiIkji4aI94bq/fwTnu5CshNP
# +YEzwJCS4qbl67/Ix2Z+xVz7twjQbgGdLd6hb9ZypAQfclUk5tDoKyCmqHtQMakX
# T080FayOvWmUEostAw7MXvuz0HpJlgnJaJBn29l1hHjA/XXahKqcc705cup+W8hv
# F7xb6AoFcbdETMzNaoqekNaHiiYyQPITY9p/UYPLzj2zyLsspR9kBebIeA1yhtXw
# Tmc3+FMquoM2fMNxpwfhCBswg662MlOXhLN3dmyLqeJRl09x1GvaeJIGMY2MbefM
# RMMv0/jqwAyii5HXew2rPIbLdULGq+hSjZo2NOlx3EOjTCaOkXc=
# =XGMp
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 18 Mar 2024 12:44:43 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (24 commits)
smbios: add extra comments to smbios_get_table_legacy()
tests: acpi: update expected SSDT.dimmpxm blob
pc/q35: set SMBIOS entry point type to 'auto' by default
tests: acpi/smbios: whitelist expected blobs
smbios: error out when building type 4 table is not possible
smbios: in case of entry point is 'auto' try to build v2 tables 1st
smbios: extend smbios-entry-point-type with 'auto' value
smbios: clear smbios_type4_count before building tables
smbios: get rid of global smbios_ep_type
smbios: handle errors consistently
smbios: build legacy mode code only for 'pc' machine
smbios: rename/expose structures/bitmaps used by both legacy and modern code
smbios: add smbios_add_usr_blob_size() helper
smbios: don't check type4 structures in legacy mode
smbios: avoid mangling user provided tables
smbios: get rid of smbios_legacy global
smbios: get rid of smbios_smp_sockets global
smbios: cleanup smbios_get_tables() from legacy handling
tests: smbios: add test for legacy mode CLI options
tests: smbios: add test for -smbios type=11 option
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The low bit of MMU indices for x86 TCG indicates whether the processor is
in 32-bit mode and therefore linear addresses have to be masked to 32 bits.
However, the index was computed incorrectly, leading to possible conflicts
in the TLB for any address above 4G.
Analyzed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: b1661801c1 ("target/i386: Fix physical address truncation", 2024-02-28)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches
- mirror: Fix deadlock
- nbd/server: Fix race in draining the export
- qemu-img snapshot: Fix formatting with large values
- Fix blockdev-snapshot-sync error reporting for no medium
- iotests fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmX4OG8RHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9YdiQ//faXfGmbK6rBW4AkpwfrRM8SDHvm6hz7L
# 043ujAi3ziSXXoiec2/RK5wZ27nMJkfIrRHXpH41hgQvC6/3a4eIW6KSTaFV1PdG
# JtHCeopmVmgu7TZQ+kt/J6eLUTTLovoO94HgEfmxpr4CGZfx9RJftf2kCKILcYkh
# 9r04zSZLByVd4FJ5ZrqsFulWif5mXoGKdT/YisY3tKiCwFRWQDOoTymvJA012VtO
# MVmID593zwem3O3qtlGiGlK9qodBR4yof66xa/0gaYP98BZgv+LWnwLKha+OzSpX
# bQlxT26LY4JnSQkTdjF0QYnQiH4Q1kveUcNRZrGpA4iZxVDq1aks5DisThDwqoGG
# rhaPOWyJwJsonM1Enzim5Jd60JqvGdpTLjSA5oSyTjw62lAulnYihInERYSAFyyz
# UhQaO7qSog1//RpPEXEsiVkJBq8BE9l5I+L7+l5SCBhNr/UwZAOer/4m4X6d0SKN
# GEPRx0kH1voikzx7gIQs+Oldqvb0sg+zAvOynBxzpd+Ac6s8bFtWe+eSyWYL/ZGr
# Jg9+PL1xir/Uh7KmOnzt/iVBAmfSRpAo1O72xQXvHFYYtIP7hTkPO/vzqF206WMc
# WQFHHjfp5gVcMZ5AYg6txw+Bbtzu8g0AfB054lgnhihuShpf0E923TTDQFdV755s
# NUlrzuGu2fs=
# =+JIK
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 18 Mar 2024 12:49:51 GMT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin:
iotests: adapt to output change for recently introduced 'detached header' field
tests/qemu-iotests: Restrict tests using "--blockdev file" to the file protocol
tests/qemu-iotests: Fix some tests that use --image-opts for other protocols
tests/qemu-iotests: Restrict tests that use --image-opts to the 'file' protocol
tests/qemu-iotests: Restrict test 156 to the 'file' protocol
tests/qemu-iotests: Restrict test 134 and 158 to the 'file' protocol
tests/qemu-iotests: Restrict test 130 to the 'file' protocol
tests/qemu-iotests: Restrict test 114 to the 'file' protocol
tests/qemu-iotests: Restrict test 066 to the 'file' protocol
tests/qemu-iotests: Fix test 033 for running with non-file protocols
qemu-img: Fix Column Width and Improve Formatting in snapshot list
blockdev: Fix blockdev-snapshot-sync error reporting for no medium
iotests: Add test for reset/AioContext switches with NBD exports
nbd/server: Fix race in draining the export
mirror: Don't call job_pause_point() under graph lock
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull for 9.0-rc0
- Nicholas/Phil's fix on migration corruption / inconsistent for tcg
- Cedric's fix on block migration over n_sectors==0
- Steve's CPR reboot documentation page
- Fabiano's misc fixes on mapped-ram (IOC leak, dup() errors, fd checks, fd
use race, etc.)
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZfdZEhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wa+1AEA0+f7nCssvsILvCY9KifYO+OUJsLodUuQ
# JW0JBz+1iPMA+wSiyIVl2Xg78Q97nJxv71UJf+1cDJENA5EMmXMnxmYK
# =SLnA
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 17 Mar 2024 20:56:50 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240317-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Duplicate the fd for the outgoing_args
migration/multifd: Ensure we're not given a socket for file migration
migration: Fix iocs leaks during file and fd migration
migration: cpr-reboot documentation
migration: Skip only empty block devices
physmem: Fix migration dirty bitmap coherency with TCG memory access
physmem: Factor cpu_physical_memory_dirty_bits_cleared() out
physmem: Expose tlb_reset_dirty_range_all()
migration: Fix error handling after dup in file migration
io: Introduce qio_channel_file_new_dupfd
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove the unnecessary "Sparc" at the beginning of the line and
put the chip information into parentheses so that it is clearer
which part of the line have to be passed to "-cpu" to specify a
different CPU.
Message-ID: <20240307174334.130407-4-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
some users were confused by this message showing under TCG:
Selected CPU generation is too new. Maximum supported model
in the configuration: 'xyz'
Clarify that the maximum can depend on the accel, and add a
hint to try a different one.
Also add a hint for features mismatch to suggest trying
different accel, QEMU and kernel versions.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-ID: <20240314213746.27163-1-cfontana@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
address shift is caused by switch to 32-bit SMBIOS entry point
which has slightly different size from 64-bit one and happens
to trigger a bit different memory layout.
Expected diff:
- Name (MEMA, 0x07FFE000)
+ Name (MEMA, 0x07FFF000)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-21-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use smbios-entry-point-type='auto' for newer machine types as a workaround
for Windows not detecting SMBIOS tables. Which makes QEMU pick SMBIOS tables
based on configuration (with 2.x preferred and fallback to 3.x if the former
isn't compatible with configuration)
Default compat setting of smbios-entry-point-type after series
for pc/q35 machines:
* 9.0-newer: 'auto'
* 8.1-8.2: '64'
* 8.0-older: '32'
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2008
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-20-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If SMBIOS v2 version is requested but number of cores/threads
are more than it's possible to describe with v2, error out
instead of silently ignoring the fact and filling core/thread
count with bogus values.
This will help caller to decide if it should fallback to
SMBIOSv3 when smbios-entry-point-type='auto'
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU for some time now uses SMBIOS 3.0 for PC/Q35 machines by
default, however Windows has a bug in locating SMBIOS 3.0
entrypoint and fails to find tables when booted on SeaBIOS
(on UEFI SMBIOS 3.0 tables work fine since firmware hands
over tables in another way)
Missing SMBIOS tables may lead to some issues for guest
though (worst are: possible reactiveation, inability to
get virtio drivers from 'Windows Update')
It's unclear at this point if MS will fix the issue on their
side. So instead of it (or rather in addition) this patch
will try to workaround the issue.
aka, use smbios-entry-point-type=auto to make QEMU try
generating conservative SMBIOS 2.0 tables and if that
fails (due to limits/requested configuration) fallback
to SMBIOS 3.0 tables.
With this in place majority of users will use SMBIOS 2.0
tables which work fine with (Windows + legacy BIOS).
The configurations that is not to possible to describe
with SMBIOS 2.0 will switch automatically to SMBIOS 3.0
(which will trigger Windows bug but there is nothing
QEMU can do here, so go and aks Microsoft to real fix).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-17-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Current code uses mix of error_report()+exit(1)
and error_setg() to handle errors.
Use newer error_setg() everywhere, beside consistency
it will allow to detect error condition without killing
QEMU and attempt switch-over to SMBIOS3.x tables/entrypoint
in follow up patch.
while at it, clear smbios_tables pointer after freeing.
that will avoid double free if smbios_get_tables() is called
multiple times.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-13-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
basically moving code around without functional change.
And exposing some symbols so that they could be shared
between smbbios.c and new smbios_legacy.c
plus some meson magic to build smbios_legacy.c only
for 'pc' machine and otherwise replace it with stub
if not selected.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-12-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As a preparation to move legacy handling into a separate file,
add prefix 'smbios_' to type0/type1/have_binfile_bitmap/have_fields_bitmap
and expose them in smbios.h so that they can be reused in
legacy and modern code.
Doing it as a separate patch to avoid rename cluttering follow-up
patch which will move legacy code into a separate file.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-11-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
legacy mode doesn't support structures of type 2 and more,
and CLI has a check for '-smbios type' option, however it's
still possible to sneak in type4 as a blob with '-smbios file'
option. However doing the later makes SMBIOS tables broken
since SeaBIOS doesn't expect that.
Rather than trying to add support for type4 to legacy code
(both QEMU and SeaBIOS), simplify smbios_get_table_legacy()
by dropping not relevant check in legacy code and error out
on type4 blob.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-9-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
currently smbios_entry_add() preserves internally '-smbios type='
options but tables provided with '-smbios file=' are stored directly
into blob that eventually will be exposed to VM. And then later
QEMU adds default/'-smbios type' entries on top into the same blob.
It makes impossible to generate tables more than once, hence
'immutable' guard was used.
Make it possible to regenerate final blob by storing user provided
blobs into a dedicated area (usr_blobs) and then copy it when
composing final blob. Which also makes handling of -smbios
options consistent.
As side effect of this and previous commits there is no need to
generate legacy smbios_entries at the time options are parsed.
Instead compose smbios_entries on demand from usr_blobs like
it is done for non-legacy SMBIOS tables.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
smbios_get_tables() bails out right away if leagacy mode is enabled
and won't generate any SMBIOS tables. At the same time x86 specific
fw_cfg_build_smbios() will genarate legacy tables and then proceed
to preparing temporary mem_array for useless call to
smbios_get_tables() and then discard it.
Drop legacy related check in smbios_get_tables() and return from
fw_cfg_build_smbios() early if legacy tables where built without
proceeding to non legacy part of the function.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-5-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Unfortunately having 2.0 machine type deprecated is not enough
to get rid of legacy SMBIOS handling since 'isapc' also uses
that and it's staying around.
Hence add test for CLI options handling to be sure that it
ain't broken during SMBIOS code refactoring.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cureently it not possible to run SMBIOS test without ACPI one,
which gets into the way when testing ACPI-less configs.
Extract SMBIOS testing into separate routines that could also
be run without ACPI dependency and use that for testing SMBIOS.
As the 1st user add "acpi/piix4/smbios-options" test case.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tests 263, 284 and detect-zeroes-registered-buf use qemu-io
with --image-opts so we have to enforce IMGOPTSSYNTAX=true here
to get $TEST_IMG in shape for other protocols than "file".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-9-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These tests 188, 189 and 198 use qemu-io with --image-opts with additional
hard-coded parameters for the file protocol, so they cannot work for other
protocols. Thus we have to limit these tests to the file protocol only.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-8-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The test fails completely when you try to use it with a different
protocol, e.g. with "./check -ssh -qcow2 156".
The test uses some hand-crafted JSON statements which cannot work with other
protocols, thus let's change this test to only support the 'file' protocol.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-7-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit b25b387fa5 updated the iotests 134 and 158 to use the --image-opts
parameter for qemu-io with file protocol related options, but forgot to
update the _supported_proto line accordingly. So let's do that now.
Fixes: b25b387fa5 ("qcow2: convert QCow2 to use QCryptoBlock for encryption")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-6-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iotest 114 uses "truncate" and the qcow2.py script on the destination file,
which both cannot deal with URIs. Thus this test needs the "file" protocol,
otherwise it fails with an error message like this:
truncate: cannot open 'ssh://127.0.0.1/tmp/qemu-build/tests/qemu-iotests/scratch/qcow2-ssh-114/t.qcow2.orig'
for writing: No such file or directory
Thus mark this test for "file protocol only" accordingly.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-4-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When running iotest 033 with the ssh protocol, it fails with:
033 fail [14:48:31] [14:48:41] 10.2s output mismatch
--- /.../tests/qemu-iotests/033.out
+++ /.../tests/qemu-iotests/scratch/qcow2-ssh-033/033.out.bad
@@ -174,6 +174,7 @@
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
wrote 512/512 bytes at offset 2097152
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-io: warning: Failed to truncate the tail of the image: ssh driver does not support shrinking files
read 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
We already check for the qcow2 format here, so let's simply also
add a check for the protocol here, too, to only test the truncation
with the file protocol.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-2-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When external_snapshot_abort() rejects a BlockDriverState without a
medium, it creates an error like this:
error_setg(errp, "Device '%s' has no medium", device);
Trouble is @device can be null. My system formats null as "(null)",
but other systems might crash. Reproducer:
1. Create a block device without a medium
-> {"execute": "blockdev-add", "arguments": {"driver": "host_cdrom", "node-name": "blk0", "filename": "/dev/sr0"}}
<- {"return": {}}
3. Attempt to snapshot it
-> {"execute":"blockdev-snapshot-sync", "arguments": { "node-name": "blk0", "snapshot-file":"/tmp/foo.qcow2","format":"qcow2"}}
<- {"error": {"class": "GenericError", "desc": "Device '(null)' has no medium"}}
Broken when commit 0901f67ecd made @device optional.
Use bdrv_get_device_or_node_name() instead. Now it fails as it
should:
<- {"error": {"class": "GenericError", "desc": "Device 'blk0' has no medium"}}
Fixes: 0901f67ecd ("qmp: Allow to take external snapshots on bs graphs node.")
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240306142831.2514431-1-armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This replicates the scenario in which the bug was reported.
Unfortunately this relies on actually executing a guest (so that the
firmware initialises the virtio-blk device and moves it to its
configured iothread), so this can't make use of the qtest accelerator
like most other test cases. I tried to find a different easy way to
trigger the bug, but couldn't find one.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240314165825.40261-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When draining an NBD export, nbd_drained_begin() first sets
client->quiescing so that nbd_client_receive_next_request() won't start
any new request coroutines. Then nbd_drained_poll() tries to makes sure
that we wait for any existing request coroutines by checking that
client->nb_requests has become 0.
However, there is a small window between creating a new request
coroutine and increasing client->nb_requests. If a coroutine is in this
state, it won't be waited for and drain returns too early.
In the context of switching to a different AioContext, this means that
blk_aio_attached() will see client->recv_coroutine != NULL and fail its
assertion.
Fix this by increasing client->nb_requests immediately when starting the
coroutine. Doing this after the checks if we should create a new
coroutine is okay because client->lock is held.
Cc: qemu-stable@nongnu.org
Fixes: fd6afc501a ("nbd/server: Use drained block ops to quiesce the server")
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240314165825.40261-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Calling job_pause_point() while holding the graph reader lock
potentially results in a deadlock: bdrv_graph_wrlock() first drains
everything, including the mirror job, which pauses it. The job is only
unpaused at the end of the drain section, which is when the graph writer
lock has been successfully taken. However, if the job happens to be
paused at a pause point where it still holds the reader lock, the writer
lock can't be taken as long as the job is still paused.
Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-28125
Fixes: 004915a96a ("block: Protect bs->backing with graph_lock")
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240313153000.33121-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
At least for now cpu-topology is implemented only for KVM.
We already say this, but this tries to be more explicit,
and also show it in the examples.
This adds a new reference in the introduction that we can point to,
whenever we need to reference accelerators and how to select them.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-ID: <20240314172218.16478-1-cfontana@suse.de>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Tested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
virtio,pc,pci: features, cleanups, fixes
more memslots support in libvhost-user
support PCIe Gen5/Gen6 link speeds in pcie
more traces in vdpa
network simulation devices support in vdpa
SMBIOS type 9 descriptor implementation
Bump max_cpus to 4096 vcpus in q35
aw-bits and granule options in VIRTIO-IOMMU
Support report NUMA nodes for device memory using GI in acpi
Beginning of shutdown event support in pvpanic
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/
# 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt
# +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ
# uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr
# 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G
# 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg=
# =C0UU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits)
docs/specs/pvpanic: document shutdown event
hw/cxl: Fix missing reserved data in CXL Device DVSEC
hmat acpi: Fix out of bounds access due to missing use of indirection
hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory.
qemu-options.hx: Document the virtio-iommu-pci aw-bits option
hw/arm/virt: Set virtio-iommu aw-bits default value to 48
hw/i386/q35: Set virtio-iommu aw-bits default value to 39
virtio-iommu: Add an option to define the input range width
virtio-iommu: Trace domain range limits as unsigned int
qemu-options.hx: Document the virtio-iommu-pci granule option
virtio-iommu: Change the default granule to the host page size
virtio-iommu: Add a granule property
hw/i386/acpi-build: Add support for SRAT Generic Initiator structures
hw/acpi: Implement the SRAT GI affinity structure
qom: new object to associate device to NUMA node
hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it
hw/i386/pc: Set "normal" boot device order in pc_basic_device_init()
hw/i386/pc: Avoid one use of the current_machine global
hw/i386/pc: Remove "rtc_state" link again
Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/core/machine.c
* PAPR nested hypervisor host implementation for spapr TCG
* excp_helper.c code cleanups and improvements
* Move more ops to decodetree
* Deprecate pseries-2.12 machines and P9 and P10 DD1.0 CPUs
* Document running Linux on AmigaNG
* Update dt feature advertising POWER CPUs.
* Add P10 PMU SPRs
* Improve pnv topology calculation for SMT8 CPUs.
* Various bug fixes.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmXwiT8ACgkQZ7MCdqhi
# HK7C/w//XxEO2bQTFPLFDTrP/voq7pcX8XeQNVyXCkXYjvsbu05oQow50k+Y5UAE
# US4MFjt8jFz0vuIKuKyoA3kG41zDSOzoX4TQXMM+tyTWbuFF3KAyfizb1xE6SYAN
# xJEGvmiXv/EgoSBD7BTKQp1tMPdIGZLwSdYiA0lmOo7YaMCgYAXaujW5hnNjQecT
# 873sN+10pHtQY++mINtD9Nfb6AcDGMWw0b+bykqIXhNRkI8IGOS4WF4vAuMBrwfe
# UM00wDnNRb86Dk14bv2XVNDr6/i0VRtUMwM4yiptrQ1TQx18LZaPSQFYjQfPaan7
# LwN4QkMFnBX54yJ7Npvjvu8BCBF47kwOVu4CIAFJ4sIm0WfTmozDpPttwcZ5w7Ve
# iXDOB9ECAB4pQ2rCgbSNG8MYUZgoHHOuThqolOP0Vh9NHRRJxpdw6CyAbmCGftc0
# lvRDPFiKp8xmCNJ/j3XzoUdHoG7NMwpUmHv9ruGU18SdQ8hyJN9AcQGWYrB4v0RV
# /hs2RAbwntG7ahkcwd8uy5aFw88Wph/uGXPXc49EWj7i49vHeIV2y5+gtthMywje
# qqjFXkistXuF+JHVnyoYmqqCyXaHX5CEwtawMv4EQeaJs76bLhMeMTKKl9rRp8qB
# DtbIZphO8iMsocrBnje48sA5HR0PM+H4HTjw10i8R0fLlWitaIY=
# =XnY5
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 16:56:31 GMT
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu: (38 commits)
spapr: nested: Introduce cap-nested-papr for Nested PAPR API
spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
spapr: nested: Use correct source for parttbl info for nested PAPR API.
spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
spapr: nested: Initialize the GSB elements lookup table.
spapr: nested: Extend nested_ppc_state for nested PAPR API
spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
spapr: nested: Document Nested PAPR API
spapr: nested: keep nested-hv related code restricted to its API.
spapr: nested: Introduce SpaprMachineStateNested to store related info.
spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
spapr: nested: register nested-hv api hcalls only for cap-nested-hv
target/ppc: Remove interrupt handler wrapper functions
target/ppc: Clean up ifdefs in excp_helper.c, part 3
target/ppc: Clean up ifdefs in excp_helper.c, part 2
target/ppc: Clean up ifdefs in excp_helper.c, part 1
target/ppc: Add gen_exception_err_nip() function
target/ppc: Readability improvements in exception handlers
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When the terminal GDB_FORK_ENABLED state is reached, the coordination
socket is not needed anymore and is therefore closed. However, if there
is a communication error between QEMU gdbstub and GDB, the generic
error handling code attempts to close it again.
Fix by closing it later - before returning - instead.
Fixes: Coverity CID 1539966
Fixes: d547e711a8 ("gdbstub: Implement follow-fork-mode child")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240312001813.13720-1-iii@linux.ibm.com>
Add stub to handle Xfer:siginfo:read packet query that requests the
machine's siginfo data.
This is used when GDB user executes 'print $_siginfo' and when the
machine stops due to a signal, for instance, on SIGSEGV. The information
in siginfo allows GDB to determiner further details on the signal, like
the fault address/insn when the SIGSEGV is caught.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-Id: <20240309030901.1726211-5-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The "check" target by itself is not enough to ensure we build the user
mode binaries. While we can't test them with check-tcg we can at least
include them in the build.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Gustavo Romero <gustavo.romero@linaro.org>
Shutdown requests are normally hardware dependent.
By extending pvpanic to also handle shutdown requests, guests can
submit such requests with an easily implementable and cross-platform
mechanism.
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240310-pvpanic-shutdown-spec-v1-1-b258e182ce55@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The r3.1 specification introduced a new 2 byte field, but
to maintain DWORD alignment, a additional 2 reserved bytes
were added. Forgot those in updating the structure definition
but did include them in the size define leading to a buffer
overrun.
Also use the define so that we don't duplicate the value.
Fixes: Coverity ID 1534095 buffer overrun
Fixes: 8700ee15de ("hw/cxl: Standardize all references on CXL r3.1 and minor updates")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240308143831.6256-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With a numa set up such as
-numa nodeid=0,cpus=0 \
-numa nodeid=1,memdev=mem \
-numa nodeid=2,cpus=1
and appropriate hmat_lb entries the initiator list is correctly
computed and writen to HMAT as 0,2 but then the LB data is accessed
using the node id (here 2), landing outside the entry_list array.
Stash the reverse lookup when writing the initiator list and use
it to get the correct array index index.
Fixes: 4586a2cb83 ("hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240307160326.31570-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the default input range can extend to 64 bits. On x86,
when the virtio-iommu protects vfio devices, the physical iommu
may support only 39 bits. Let's set the default to 39, as done
for the intel-iommu.
We use hw_compat_8_2 to handle the compatibility for machines
before 9.0 which used to have a virtio-iommu default input range
of 64 bits.
Of course if aw-bits is set from the command line, the default
is overriden.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240307134445.92296-8-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
aw-bits is a new option that allows to set the bit width of
the input address range. This value will be used as a default for
the device config input_range.end. By default it is set to 64 bits
which is the current value.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240307134445.92296-7-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We used to set the default granule to 4KB but with VFIO assignment
it makes more sense to use the actual host page size.
Indeed when hotplugging a VFIO device protected by a virtio-iommu
on a 64kB/64kB host/guest config, we current get a qemu crash:
"vfio: DMA mapping failed, unable to continue"
This is due to the hot-attached VFIO device calling
memory_region_iommu_set_page_size_mask() with 64kB granule
whereas the virtio-iommu granule was already frozen to 4KB on
machine init done.
Set the granule property to "host" and introduce a new compat.
The page size mask used before 9.0 was qemu_target_page_mask().
Since the virtio-iommu currently only supports x86_64 and aarch64,
this matched a 4KB granule.
Note that the new default will prevent 4kB guest on 64kB host
because the granule will be set to 64kB which would be larger
than the guest page size. In that situation, the virtio-iommu
driver fails on viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".
In that case the workaround is to request 4K granule.
The current limitation of global granule in the virtio-iommu
should be removed and turned into per domain granule. But
until we get this upgraded, this new default is probably
better because I don't think anyone is currently interested in
running a 4KB page size guest with virtio-iommu on a 64KB host.
However supporting 64kB guest on 64kB host with virtio-iommu and
VFIO looks a more important feature.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-4-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows to choose which granule will be used by
default by the virtio-iommu. Current page size mask
default is qemu_target_page_mask so this translates
into a 4k granule on ARM and x86_64 where virtio-iommu
is supported.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-3-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ACPI spec provides a scheme to associate "Generic Initiators" [1]
(e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines GPUs) with Proximity Domains. This is
achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
node for each unique PXM ID encountered. Qemu currently do not implement
these structures while building SRAT.
Add GI structures while building VM ACPI SRAT. The association between
device and node are stored using acpi-generic-initiator object. Lookup
presence of all such objects and use them to build these structures.
The structure needs a PCI device handle [2] that consists of the device BDF.
The vfio-pci device corresponding to the acpi-generic-initiator object is
located to determine the BDF.
[1] ACPI Spec 6.3, Section 5.2.16.6
[2] ACPI Spec 6.3, Table 5.80
Cc: Jonathan Cameron <qemu-devel@nongnu.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cedric Le Goater <clg@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-3-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
NVIDIA GPU's support MIG (Mult-Instance GPUs) feature [1], which allows
partitioning of the GPU device resources (including device memory) into
several (upto 8) isolated instances. Each of the partitioned memory needs
a dedicated NUMA node to operate. The partitions are not fixed and they
can be created/deleted at runtime.
Unfortunately Linux OS does not provide a means to dynamically create/destroy
NUMA nodes and such feature implementation is not expected to be trivial. The
nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
we utilize the Generic Initiator (GI) Affinity structures that allows
association between nodes and devices. Multiple GI structures per BDF is
possible, allowing creation of multiple nodes by exposing unique PXM in each
of these structures.
Implement the mechanism to build the GI affinity structures as Qemu currently
does not. Introduce a new acpi-generic-initiator object to allow host admin
link a device with an associated NUMA node. Qemu maintains this association
and use this object to build the requisite GI Affinity Structure.
When multiple NUMA nodes are associated with a device, it is required to
create those many number of acpi-generic-initiator objects, each representing
a unique device:node association.
Following is one of a decoded GI affinity structure in VM ACPI SRAT.
[0C8h 0200 1] Subtable Type : 05 [Generic Initiator Affinity]
[0C9h 0201 1] Length : 20
[0CAh 0202 1] Reserved1 : 00
[0CBh 0203 1] Device Handle Type : 01
[0CCh 0204 4] Proximity Domain : 00000007
[0D0h 0208 16] Device Handle : 00 00 20 00 00 00 00 00 00 00 00
00 00 00 00 00
[0E0h 0224 4] Flags (decoded below) : 00000001
Enabled : 1
[0E4h 0228 4] Reserved2 : 00000000
[0E8h 0232 1] Subtable Type : 05 [Generic Initiator Affinity]
[0E9h 0233 1] Length : 20
An admin can provide a range of acpi-generic-initiator objects, each
associating a device (by providing the id through pci-dev argument)
to the desired NUMA node (using the node argument). Currently, only PCI
device is supported.
For the grace hopper system, create a range of 8 nodes and associate that
with the device using the acpi-generic-initiator object. While a configuration
of less than 8 nodes per device is allowed, such configuration will prevent
utilization of the feature to the fullest. The following sample creates 8
nodes per PCI device for a VM with 2 PCI devices and link them to the
respecitve PCI device using acpi-generic-initiator objects:
-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
-object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
-object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
-object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
-object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
-numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
-numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
-numa node,nodeid=16 -numa node,nodeid=17 \
-device vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
-object acpi-generic-initiator,id=gi8,pci-dev=dev1,node=10 \
-object acpi-generic-initiator,id=gi9,pci-dev=dev1,node=11 \
-object acpi-generic-initiator,id=gi10,pci-dev=dev1,node=12 \
-object acpi-generic-initiator,id=gi11,pci-dev=dev1,node=13 \
-object acpi-generic-initiator,id=gi12,pci-dev=dev1,node=14 \
-object acpi-generic-initiator,id=gi13,pci-dev=dev1,node=15 \
-object acpi-generic-initiator,id=gi14,pci-dev=dev1,node=16 \
-object acpi-generic-initiator,id=gi15,pci-dev=dev1,node=17 \
Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu [1]
Cc: Jonathan Cameron <qemu-devel@nongnu.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-2-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that pc_cmos_init() doesn't populate the X86MachineState::rtc attribute any
longer, its duties can be merged into pc_cmos_init_late() which is called within
machine_done notifier. This frees pc_piix and pc_q35 from explicit CMOS
initialization.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The boot device order may change during the lifetime of a VM. Usually, the
"normal" order is set once during machine init(). However, if a user specifies
`-boot once=...`, the "normal" order is overwritten by the "once" order just
before machine_done, and a reset handler is registered which restores the
"normal" order during the next reset.
In the next patch, pc_cmos_init() will be inlined into pc_cmos_init_late() which
runs during machine_done. This means that the "once" boot order would be
overwritten again with the "normal" boot order -- which renders the user's
choice ineffective. Fix this by setting the "normal" boot order in
pc_basic_device_init() which already registers the boot_set() handler.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The RTC can be accessed through the X86 machine instance, so rather than passing
the RTC it's possible to pass the machine state instead. This avoids
pc_boot_set() from having to access the current_machine global.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Commit 99e1c1137b "hw/i386/pc: Populate RTC attribute directly" made linking
the "rtc_state" property unnecessary and removed it. Commit 84e945aad2 "vl,
pc: turn -no-fd-bootchk into a machine property" accidently reintroduced the
link. Remove it again since it is not needed.
Fixes: 84e945aad2 "vl, pc: turn -no-fd-bootchk into a machine property"
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Commit 6f6ad2b245 "hw/i386/pc: Confine system flash handling to pc_sysfw"
causes a regression when specifying the property `-M pflash0` in the PCI PC
machines:
qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
In order to revert the commit, the commit below must be reverted first.
This reverts commit cb05cc1602.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-2-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since commit f10a570b093e6 ("KVM: x86: Add CONFIG_KVM_MAX_NR_VCPUS to allow up to 4096 vCPUs")
Linux kernel can support upto a maximum number of 4096 vcpus when MAXSMP is
enabled in the kernel. At present, QEMU has been tested to correctly boot a
linux guest with 4096 vcpus using the current edk2 upstream master branch that
has the fixes corresponding to the following two PRs:
https://github.com/tianocore/edk2/pull/5410https://github.com/tianocore/edk2/pull/5418
The changes merged into edk2 with the above PRs will be in the upcoming 2024-05
release. With current seabios firmware, it boots fine with 4096 vcpus already.
So bump up the value max_cpus to 4096 for q35 machines versions 9 and newer.
Q35 machines versions 8.2 and older continue to support 1024 maximum vcpus
as before for compatibility reasons.
If KVM is not able to support the specified number of vcpus, QEMU would
return the following error messages:
$ ./qemu-system-x86_64 -cpu host -accel kvm -machine q35 -smp 1728
qemu-system-x86_64: -accel kvm: warning: Number of SMP cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
qemu-system-x86_64: -accel kvm: warning: Number of hotpluggable cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
Number of SMP cpus requested (1728) exceeds the maximum cpus supported by KVM (1024)
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Julia Suvorova <jusual@redhat.com>
Cc: kraxel@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240228143351.3967-1-anisinha@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The spec does not NumVFs is reset after disabling VFs except when
resetting the PF. Clearing it is guest visible and out of spec, even
though Linux doesn't rely on this value being preserved, so we never
noticed.
Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-4-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie_sriov_pf_disable_vfs() is called when resetting the PF, but it only
disables VFs and does not reset SR-IOV extended capability, leaking the
state and making the VF Enable register inconsistent with the actual
state.
Replace pcie_sriov_pf_disable_vfs() with pcie_sriov_pf_reset(), which
does not only disable VFs but also resets the capability.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-3-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
configurations to know the number of VFs being disabled due to SR-IOV
configuration writes, but the logic was flawed and resulted in
out-of-bound memory access.
It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
VFs, but it actually doesn't in the following cases:
- PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
- PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
- VFs were only partially enabled because of realization failure.
It is a responsibility of pcie_sriov to interpret SR-IOV configurations
and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
provides, to get the number of enabled VFs before and after SR-IOV
configuration writes.
Cc: qemu-stable@nongnu.org
Fixes: CVE-2024-26328
Fixes: 11871f53ef ("hw/nvme: Add support for the Virtualization Management command")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-1-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IOAPICCommonClass implements its own private realize(), and this private
realize() allows error.
Since IOAPICCommonClass.realize() returns void, to check the error,
dereference @errp with ERRP_GUARD().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240223085653.1255438-8-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in iommufd_cdev_getfd(), @errp is dereferenced without ERRP_GUARD():
if (*errp) {
error_prepend(errp, VFIO_MSG_PREFIX, path);
}
Currently, since vfio_attach_device() - the caller of
iommufd_cdev_getfd() - is always called in DeviceClass.realize() context
and doesn't get the NULL @errp parameter, iommufd_cdev_getfd()
hasn't triggered the bug that dereferencing the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
iommufd_cdev_getfd().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-7-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_usp_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_cap;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And since
cxl_usp_realize() - as a PCIDeviceClass.realize() method - doesn't get
the NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_usp_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-6-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in trng_prop_fault_event_set, @errp is dereferenced without
ERRP_GUARD():
visit_type_uint32(v, name, events, errp);
if (*errp) {
return;
}
Currently, since trng_prop_fault_event_set() doesn't get the NULL @errp
parameter as a "set" method of object property, it hasn't triggered the
bug that dereferencing the NULL @errp.
And since visit_type_uint32() returns bool, check the returned bool
directly instead of dereferencing @errp, then we needn't the add missing
ERRP_GUARD().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240223085653.1255438-5-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in ct3_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_free_special_ops;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And
ct3_realize() - as a PCIDeviceClass.realize() method - doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
ct3_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in macfb_nubus_realize(), @errp is dereferenced without
ERRP_GUARD():
ndc->parent_realize(dev, errp);
if (*errp) {
return;
}
Here we check *errp, because the ndc->parent_realize(), as a
DeviceClass.realize() callback, returns void. And since
macfb_nubus_realize(), also as a DeviceClass.realize(), doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing the
NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
macfb_nubus_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-3-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_fixed_memory_window_config(), @errp is dereferenced in 2
places without ERRP_GUARD():
fw->enc_int_ways = cxl_interleave_ways_enc(fw->num_targets, errp);
if (*errp) {
return;
}
and
fw->enc_int_gran =
cxl_interleave_granularity_enc(object->interleave_granularity,
errp);
if (*errp) {
return;
}
For the above 2 places, we check "*errp", because neither function
returns a suitable error code. And since machine_set_cfmw() - the caller
of cxl_fixed_memory_window_config() - doesn't get the NULL @errp
parameter as the "set" method of object property,
cxl_fixed_memory_window_config() hasn't triggered the bug that
dereferencing the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_fixed_memory_window_config().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
This patch adds support for VDPA network simulation devices.
The device is developed based on virtio-net and tap backend,
and supports hardware live migration function.
For more details, please refer to "docs/system/devices/vdpa-net.rst"
Signed-off-by: Hao Chen <chenh@yusur.tech>
Message-Id: <20240221073802.2888022-1-chenh@yusur.tech>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Shared objects lack spoofing protection.
For VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE messages
received by the vhost-user interface, any backend was
allowed to remove entries from the shared table just
by knowing the UUID. Only the owner of the entry
shall be allowed to removed their resources
from the table.
To fix that, add a check for all
*SHARED_OBJECT_REMOVE messages received.
A vhost device can only remove TYPE_VHOST_DEV
entries that are owned by them, otherwise skip
the removal, and inform the device that the entry
has not been removed in the answer.
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20240219143423.272012-2-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The payload size returned by command VIRTIO_SND_R_PCM_INFO is
wrong. The code in process_cmd() assumes that all commands
return only a virtio_snd_hdr payload, but some commands like
VIRTIO_SND_R_PCM_INFO may return an additional payload.
Add a zero initialized payload_size variable to struct
virtio_snd_ctrl_command to allow for additional payloads.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240218083351.8524-1-vr_qemu@t-online.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sometimes, certain parts are not being skipped in
vhost_vdpa_listener_region_del, but they are skipped in
vhost_vdpa_listener_region_add, or vice versa. The vhost-vdpa code
expects all parts to maintain their properties, so we're adding a trace
to help with debugging when any part is skipped.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240215103616.330518-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We already use MADV_NORESERVE to deal with sparse memory regions. Let's
also set madvise(MADV_DONTDUMP), otherwise a crash of the process can
result in us allocating all memory in the mmap'ed region for dumping
purposes.
This change implies that the mmap'ed rings won't be included in a
coredump. If ever required for debugging purposes, we could mark only
the mapped rings MADV_DODUMP.
Ignore errors during madvise() for now.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-15-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, we try to remap all rings whenever we add a single new memory
region. That doesn't quite make sense, because we already map rings when
setting the ring address, and panic if that goes wrong. Likely, that
handling was simply copied from set_mem_table code, where we actually
have to remap all rings.
Remapping all rings might require us to walk quite a lot of memory
regions to perform the address translations. Ideally, we'd simply remove
that remapping.
However, let's be a bit careful. There might be some weird corner cases
where we might temporarily remove a single memory region (e.g., resize
it), that would have worked for now. Further, a ring might be located on
hotplugged memory, and as the VM reboots, we might unplug that memory, to
hotplug memory before resetting the ring addresses.
So let's unmap affected rings as we remove a memory region, and try
dynamically mapping the ring again when required.
Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-14-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the past, QEMU would create memory regions that could partially cover
hugetlb pages, making mmap() fail if we would use the mmap_offset as an
fd_offset. For that reason, we never used the mmap_offset as an offset into
the fd and instead always mapped the fd from the very start.
However, that can easily result in us mmap'ing a lot of unnecessary
parts of an fd, possibly repeatedly.
QEMU nowadays does not create memory regions that partially cover huge
pages -- it never really worked with postcopy. QEMU handles merging of
regions that partially cover huge pages (due to holes in boot memory) since
2018 in c1ece84e7c ("vhost: Huge page align and merge").
Let's be a bit careful and not unconditionally convert the
mmap_offset into an fd_offset. Instead, let's simply detect the hugetlb
size and pass as much as we can as fd_offset, making sure that we call
mmap() with a properly aligned offset.
With QEMU and a virtio-mem device that is fully plugged (50GiB using 50
memslots) the qemu-storage daemon process consumes in the VA space
1281GiB before this change and 58GiB after this change.
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags: 0x9
Size: 40
Fds: 59
Adding region 4
guest_phys_addr: 0x0000000200000000
memory_size: 0x0000000040000000
userspace_addr: 0x00007fb73bffe000
old mmap_offset: 0x0000000080000000
fd_offset: 0x0000000080000000
new mmap_offset: 0x0000000000000000
mmap_addr: 0x00007f02f1bdc000
Successfully added new region
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags: 0x9
Size: 40
Fds: 59
Adding region 5
guest_phys_addr: 0x0000000240000000
memory_size: 0x0000000040000000
userspace_addr: 0x00007fb77bffe000
old mmap_offset: 0x00000000c0000000
fd_offset: 0x00000000c0000000
new mmap_offset: 0x0000000000000000
mmap_addr: 0x00007f0284000000
Successfully added new region
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-12-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's speed up GPA to memory region / virtual address lookup. Store the
memory regions ordered by guest physical addresses, and use binary
search for address translation, as well as when adding/removing memory
regions.
Most importantly, this will speed up GPA->VA address translation when we
have many memslots.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-11-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Memory regions cannot overlap, and if we ever hit that case something
would be really flawed.
For example, when vhost code in QEMU decides to increase the size of memory
regions to cover full huge pages, it makes sure to never create overlaps,
and if there would be overlaps, it would bail out.
QEMU commits 48d7c97577 ("vhost: Merge sections added to temporary
list"), c1ece84e7c ("vhost: Huge page align and merge") and
e7b94a84b6 ("vhost: Allow adjoining regions") added and clarified that
handling and how overlaps are impossible.
Consequently, each GPA can belong to at most one memory region, and
everything else doesn't make sense. Let's factor out our search to prepare
for further changes.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-10-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's factor it out, reducing quite some code duplication and perparing
for further changes.
If we fail to mmap a region and panic, we now simply don't add that
(broken) region.
Note that we now increment dev->nregions as we are successfully
adding memory regions, and don't increment dev->nregions if anything went
wrong.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-6-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's support up to 509 mem slots, just like vhost in the kernel usually
does and the rust vhost-user implementation recently [1] started doing.
This is required to properly support memory hotplug, either using
multiple DIMMs (ACPI supports up to 256) or using virtio-mem.
The 509 used to be the KVM limit, it supported 512, but 3 were
used for internal purposes. Currently, KVM supports more than 512, but
it usually doesn't make use of more than ~260 (i.e., 256 DIMMs + boot
memory), except when other memory devices like PCI devices with BARs are
used. So, 509 seems to work well for vhost in the kernel.
Details can be found in the QEMU change that made virtio-mem consume
up to 256 mem slots across all virtio-mem devices. [2]
509 mem slots implies 509 VMAs/mappings in the worst case (even though,
in practice with virtio-mem we won't be seeing more than ~260 in most
setups).
With max_map_count under Linux defaulting to 64k, 509 mem slots
still correspond to less than 1% of the maximum number of mappings.
There are plenty left for the application to consume.
[1] https://github.com/rust-vmm/vhost/pull/224
[2] https://lore.kernel.org/all/20230926185738.277351-1-david@redhat.com/
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-3-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's prepare for increasing VHOST_USER_MAX_RAM_SLOTS by dynamically
allocating dev->regions. We don't have any ABI guarantees (not
dynamically linked), so we can simply change the layout of VuDev.
Let's zero out the memory, just as we used to do.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-2-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix an issue where cancellation of ongoing migration ends up
with no network connectivity.
When canceling migration, SVQ will be switched back to the
passthrough mode, but the right call fd is not programed to
the device and the svq's own call fd is still used. At the
point of this transitioning period, the shadow_vqs_enabled
hadn't been set back to false yet, causing the installation
of call fd inadvertently bypassed.
Message-Id: <1707910082-10243-13-git-send-email-si-wei.liu@oracle.com>
Fixes: a8ac88585d ("vhost: Add Shadow VirtQueue call forwarding capabilities")
Cc: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Will be used in following patches.
DISABLING(-1) means SVQ is being switched off to passthrough
mode.
ENABLING(1) means passthrough VQs are being switched to SVQ.
DONE(0) means SVQ switching is completed.
Message-Id: <1707910082-10243-11-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xen queue:
* In Xen PCI passthrough, emulate multifunction bit.
* Fix in Xen mapcache.
* Improve performance of kernel+initrd loading in an Xen HVM Direct
Kernel Boot scenario.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEE+AwAYwjiLP2KkueYDPVXL9f7Va8FAmXwZbwACgkQDPVXL9f7
# Va+PhQgAusZBhy3b0hOCCoqC/1ffCE5J2JxUTnN3zN/2FSOe8/kqQYqt4Zk3vi2e
# Eq8FbGupU357eoJSz0gTEPKQ8y+FVBCmFKEHM1PS54TW1yUZchQg4RmlII6+Psoj
# 7u+qC1RqZu/ZQ9f1QZd8YDJ5oVOkfAZYwq5BkWVS6h5gJiQTSkekAXlMNOQBZxz4
# 48fzpokatiJBbyaBGEm6YKEOwkYG76eHhxB4SC0Rgx6zW+EDQpX0s/Lg19SXnj2C
# UOueiPod1GkE+iH6dQFJUSbsnrkAtJZf253bs3BQnoChGiqQLuXn4jC79ffjPzHI
# AKP2+u+bSJ+8C1SdPuoJN6sJIZmOfA==
# =FZ2n
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 14:25:00 GMT
# gpg: using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [marginal]
# gpg: aka "Anthony PERARD <anthony.perard@citrix.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8
# Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF
* tag 'pull-xen-20240312' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm:
i386: load kernel on xen using DMA
xen: Drop out of coroutine context xen_invalidate_map_cache_entry
xen/pt: Emulate multifunction bit in header type
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The --target-type and --target-name args are used to construct
the default probe prefix if '--probe-prefix' is not given. The
meson.build will always pass '--probe-prefix', so the other args
are effectively redundant.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20240108171356.1037059-2-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Add missing ERRP_GUARD() statements in functions that need it
* Prefer fast cpu_env() over slower CPU QOM cast macro
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXwPhYRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWHvBAAgKx5LHFjz3xREVA+LkDTQ49mz0lK3s32
# SGvNlIHjiaDGVttVYhVC4sinBWUruG4Lyv/2QN72OJBzn6WUsEUQE3KPH1d7Y3/s
# wS9X7mj70n4kugWJqeIJP5AXSRasHmWoQ4QJLVQRJd6+Eb9jqwep0x7bYkI1de6D
# bL1Q7bIfkFeNQBXaiPWAm2i+hqmT4C1r8HEAGZIjAsMFrjy/hzBEjNV+pnh6ZSq9
# Vp8BsPWRfLU2XHm4WX0o8d89WUMAfUGbVkddEl/XjIHDrUD+Zbd1HAhLyfhsmrnE
# jXIwSzm+ML1KX4MoF5ilGtg8Oo0gQDEBy9/xck6G0HCm9lIoLKlgTxK9glr2vdT8
# yxZmrM9Hder7F9hKKxmb127xgU6AmL7rYmVqsoQMNAq22D6Xr4UDpgFRXNk2/wO6
# zZZBkfZ4H4MpZXbd/KJpXvYH5mQA4IpkOy8LJdE+dbcHX7Szy9ksZdPA+Z10hqqf
# zqS13qTs3abxymy2Q/tO3hPKSJCk1+vCGUkN60Wm+9VoLWGoU43qMc7gnY/pCS7m
# 0rFKtvfwFHhokX1orK0lP/ppVzPv/5oFIeK8YDY9if+N+dU2LCwVZHIuf2/VJPRq
# wmgH2vAn3JDoRKPxTGX9ly6AMxuZaeP92qBTOPap0gDhihYzIpaCq9ecEBoTakI7
# tdFhV0iRr08=
# =NiP4
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 11:35:50 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu: (55 commits)
user: Prefer fast cpu_env() over slower CPU QOM cast macro
target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast macro
target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
target/i386/hvf: Use CPUState typedef
target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
bulk: Call in place single use cpu_env()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce a SPAPR capability cap-nested-papr which enables nested PAPR
API for nested guests. This new API is to enable support for KVM on PowerVM
and the support in Linux kernel has already merged upstream.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The H_GUEST_RUN_VCPU hcall is used to start execution of a Guest VCPU.
The Hypervisor will update the state of the Guest VCPU based on the
input buffer, restore the saved Guest VCPU state, and start its
execution.
The Guest VCPU can stop running for numerous reasons including HCALLs,
hypervisor exceptions, or an outstanding Host Partition Interrupt.
The reason that the Guest VCPU stopped running is communicated through
R4 and the output buffer will be filled in with any relevant state.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
For nested PAPR API, we use SpaprMachineStateNestedGuest struct to store
partition table info, use the same in spapr_get_pate_nested() via
helper.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcalls:
- H_GUEST_GET_STATE which is used to get state of a nested guest or
a guest VCPU. The value field for each element in the request is
destination to be updated to reflect current state on success.
- H_GUEST_SET_STATE which is used to modify the state of a guest or
a guest VCPU. On success, guest (or its VCPU) state shall be
updated as per the value field for the requested element(s).
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Nested PAPR API provides a standard Guest State Buffer (GSB) format
with unique IDs for each guest state element for which get/set state is
supported by the API. Some of the elements are read-only and/or guest-wide.
Introducing additional required GSB elements and helper routines for state
exchange of each of the nested guest state elements for which get/set state
should be supported by the API.
[amachhiw: set the PCR whenever logical PVR is set]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Currently, nested_ppc_state stores a certain set of registers and works
with nested_[load|save]_state() for state transfer as reqd for nested-hv API.
Extending these with additional registers state as reqd for nested PAPR API.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcall H_GUEST_CREATE_VCPU which is used to
create and initialize the specified VCPU resource for the previously
created guest. Each guest can have multiple VCPUs upto max 2048.
All VCPUs for a guest gets deallocated on guest delete.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcalls:
- H_GUEST_CREATE which is used to create and allocate resources for
nested guest being created.
- H_GUEST_DELETE which is used to delete and deallocate resources
for the nested guest being deleted. It also supports deleting all nested
guests at once using a deleteAll flag.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcalls:
- H_GUEST_GET_CAPABILITIES which is used to query the capabilities
of the API and the L2 guests it provides.
- H_GUEST_SET_CAPABILITIES which is used to set the Guest API
capabilities that the Host Partition supports and may use.
[amachhiw: support for p9 compat mode and return register bug fixes]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Adding initial documentation about Nested PAPR API to describe the set
of APIs and its usage. Also talks about the Guest State Buffer elements
and it's format which is used between L0/L1 to communicate L2 state.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
spapr_exit_nested and spapr_get_pate_nested_hv contains code which
is specific to nested-hv API. Isolating code flows based on API
helps extending it to be used with different API as well.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Currently, nested_ptcr is being used by existing nested-hv API to store
nested guest related info. This need to be organised to extend support
for the nested PAPR API which would need to store additional info
related to nested guests in next series of patches.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Most of the nested code has already been moved to spapr_nested.c
This logic inside spapr_get_pate is related to nested guests and
better suited for spapr_nested.c, hence moving there.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Since cap-nested-hv is an optional capability, it makes sense to register
api specfic hcalls only when respective capability is enabled. This
requires to introduce a new API to unregister hypercalls to maintain
sanity across guest reboot since caps are re-applied across reboots and
re-registeration of hypercalls would hit assert otherwise.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
These wrappers call out to handle POWER7 and newer in separate
functions but reduce to the generic case when TARGET_PPC64 is not
defined. It is easy enough to include the switch in the beginning of
the generic functions to branch out to the specific functions and get
rid of these wrappers. This avoids one indirection and entirely
compiles out the switch without TARGET_PPC64.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Concatenate #if blocks that are ending then beginning on the next line
again.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Remove check for !defined(CONFIG_USER_ONLY) as this is already within
an #ifndef CONFIG_USER_ONLY block.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use #ifdef, #ifndef for brevity and add comments to #endif that are
more than a few lines apart for clarity.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add gen_exception_err_nip() that does the same as gen_exception_err()
but takes the nip as a parameter to allow specifying it instead of
using the current instruction address then change gen_exception_err()
to use it.
The gen_exception() and gen_exception_nip() functions are similar so
remove code duplication from those too while at it.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Improve readability by shortening some long comments, removing
comments that state the obvious and dropping some empty lines so they
don't distract when reading the code.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use the env_cpu function to get the CPUState for cpu_abort. These are
only needed in case of fatal errors so this allows to avoid casting
and storing CPUState in a local variable wnen not needed.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Big (SMT8) cores have a complicated function to map the core, thread ID
to pervasive topology (PIR). Fix this for power8, power9, and power10.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Caleb Schlossin <calebs@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Currently in tcg mode, when reading from power10 pmu spr like MMCR3,
qemu logs this message (when starting qemu with -d guest_errors)
Trying to read invalid spr 754 (0x2f2) at 0000000030056bb0
This is becuase, no read/write call-backs are registered for
these SPRs. Add support to register generic read/write
functions to these power10 pmu sprs to fix it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This patch moves the below instructions to decodetree specification:
{add, subf}[c,e,me,ze][o][.] : XO-form
addic[.], subfic : D-form
addex : Z23-form
This patch introduces XO form instructions into decode tree
specification, for which all the four variations([o][.]) have been
handled with a single pattern. The changes were verified by validating
that the tcg ops generated by those instructions remain the same, which
were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Documentation on how to run Linux on the amigaone, pegasos2 and
sam460ex machines is currently buried in the depths of the qemu-devel
mailing list and in the source code. Let's collect the information in
the QEMU handbook for a one stop solution.
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Co-authored-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
pSeries machines before 3.0 have complex migration back
compatibility code we'd like to get ride of. The last
one is 2.12, which is 6 years old. We just deprecated up
to the 2.11 machine in commit 1392617d35 ("spapr: Tag
pseries-2.1 - 2.11 machines as deprecated").
Take to opportunity to also deprecate the 2.12 machines.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
PPC maintainership has been a side activity for the last 2 years and
it is time to let go some of it now that Nick has taken over.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Copy the pa-features arrays from spapr, adjusting slightly as
described in comments.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This allows different pa-features for powernv8/9/10.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add POWER10 pa-features entry.
Notably DEXCR and [P]HASHST/[P]HASHCHK instruction support is
advertised. Each DEXCR aspect is allocated a bit in the device tree,
using the 68--71 byte range (inclusive). The functionality of the
[P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
bit 0 (BE).
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
[npiggin: reword title and changelog, adjust a few bits]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
"MMR" and "SPR SO" are not implemented in POWER9, so clear those bits.
HTM is not set by default, and only later if the cap is set, so remove
the comment that suggests otherwise.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
TCG does not support copy/paste instructions. Remove it from
ibm,pa-features. This has never been implemented under TCG or
practically usable under KVM, so it won't be missed.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
SAO is a page table attribute that strengthens the memory ordering of
accesses. QEMU with MTTCG does not implement this, so clear it in
ibm,pa-features. This is an obscure feature that has been removed from
POWER10 ISA v3.1, there isn't much concern with removing it.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER10 hardware implements a degenerate transactional memory facility
in POWER8/9 PCR compatibility modes to permit migration from older
CPUs, but POWER10 / ISA v3.1 mode does not support it so the CPU model
should not support it.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of
any use in QEMU. Remove them.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The initial MSR state for the OpenFirmware binding specifies
MSR[ME] and MSR[FP] are set.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Prevent guest state modifying the MSR[ME] bit. Per ISA:
An attempt to modify MSR[ME] in privileged but non-hypervisor state
is ignored (i.e., the bit is not changed).
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Commit 1901b4967c ("hw/block/nvme: move msix table and pba to BAR 0")
moved the MSI-X table and PBA to BAR 0 to make room for enabling CMR and
PMR at the same time. As reported by Julien Grall in #2184, this breaks
migration through system hibernation.
Add a machine compatibility parameter and set it on machines pre 6.0 to
enable the old behavior automatically, restoring the hibernation
migration support.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2184
Fixes: 1901b4967c ("hw/block/nvme: move msix table and pba to BAR 0")
Reported-by: Julien Grall julien@xen.org
Tested-by: Julien Grall julien@xen.org
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Generalize the mbar size helper such that it can handle cases where the
MSI-X table and PBA are expected to be in an exclusive bar.
Cc: qemu-stable@nongnu.org
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
This patch adds a way to specify an NGUID for a given NVMe Namespace using a
string of hexadecimal digits with an optional '-' separator to group bytes. For
instance:
-device nvme-ns,nguid="e9accd3b83904e13167cf0593437f57d"
If provided, the NGUID will be part of the Namespace Identification Descriptor
list and the Identify Namespace data.
Signed-off-by: Roque Arcudia Hernandez <roqueh@google.com>
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
My colleague, Jesper, will be assiting with hw/nvme related reviews. Add
him with R: so he gets automatically bugged going forward.
Cc: Jesper Devantier <foss@defmacro.it>
Acked-by: Jesper Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The number of logical blocks within a source range is converted into a
1s based number at the time of parsing. However, when verifying the copy
length we add one again, causing the check against MCL to fail in error.
Cc: qemu-stable@nongnu.org
Fixes: 381ab99d85 ("hw/nvme: check maximum copy length (MCL) for COPY")
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Currently, when a VF is created, it uses the 'params' object of the PF
as it is. In other words, the 'params.serial' string memory area is also
shared. In this situation, if the VF is removed from the system, the
PF's 'params.serial' object is released with object_finalize() followed
by object_property_del_all() which release the memory for 'serial'
property. If that happens, the next VF created will inherit a serial
from a corrupted memory area.
If this happens, an error will occur when comparing subsys->serial and
n->params.serial in the nvme_subsys_register_ctrl() function.
Cc: qemu-stable@nongnu.org
Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV")
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
xen_invalidate_map_cache_entry is not expected to run in a
coroutine. Without this, there is crash:
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
threadid=<optimized out>) at pthread_kill.c:78
at /usr/src/debug/glibc/2.38+git-r0/sysdeps/posix/raise.c:26
fmt=0xffff9e1ca8a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0xaaaae0d25740 "!qemu_in_coroutine()",
file=file@entry=0xaaaae0d301a8 "../qemu-xen-dir-remote/block/graph-lock.c", line=line@entry=260,
function=function@entry=0xaaaae0e522c0 <__PRETTY_FUNCTION__.3> "bdrv_graph_rdlock_main_loop") at assert.c:92
assertion=assertion@entry=0xaaaae0d25740 "!qemu_in_coroutine()",
file=file@entry=0xaaaae0d301a8 "../qemu-xen-dir-remote/block/graph-lock.c", line=line@entry=260,
function=function@entry=0xaaaae0e522c0 <__PRETTY_FUNCTION__.3> "bdrv_graph_rdlock_main_loop") at assert.c:101
at ../qemu-xen-dir-remote/block/graph-lock.c:260
at /home/Freenix/work/sw-stash/xen/upstream/tools/qemu-xen-dir-remote/include/block/graph-lock.h:259
host=host@entry=0xffff742c8000, size=size@entry=2097152)
at ../qemu-xen-dir-remote/block/io.c:3362
host=0xffff742c8000, size=2097152)
at ../qemu-xen-dir-remote/block/block-backend.c:2859
host=<optimized out>, size=<optimized out>, max_size=<optimized out>)
at ../qemu-xen-dir-remote/block/block-ram-registrar.c:33
size=2097152, max_size=2097152)
at ../qemu-xen-dir-remote/hw/core/numa.c:883
buffer=buffer@entry=0xffff743c5000 "")
at ../qemu-xen-dir-remote/hw/xen/xen-mapcache.c:475
buffer=buffer@entry=0xffff743c5000 "")
at ../qemu-xen-dir-remote/hw/xen/xen-mapcache.c:487
as=as@entry=0xaaaae1ca3ae8 <address_space_memory>, buffer=0xffff743c5000,
len=<optimized out>, is_write=is_write@entry=true,
access_len=access_len@entry=32768)
at ../qemu-xen-dir-remote/system/physmem.c:3199
dir=DMA_DIRECTION_FROM_DEVICE, len=<optimized out>,
buffer=<optimized out>, as=0xaaaae1ca3ae8 <address_space_memory>)
at /home/Freenix/work/sw-stash/xen/upstream/tools/qemu-xen-dir-remote/include/sysemu/dma.h:236
elem=elem@entry=0xaaaaf620aa30, len=len@entry=32769)
at ../qemu-xen-dir-remote/hw/virtio/virtio.c:758
elem=elem@entry=0xaaaaf620aa30, len=len@entry=32769, idx=idx@entry=0)
at ../qemu-xen-dir-remote/hw/virtio/virtio.c:919
elem=elem@entry=0xaaaaf620aa30, len=32769)
at ../qemu-xen-dir-remote/hw/virtio/virtio.c:994
req=req@entry=0xaaaaf620aa30, status=status@entry=0 '\000')
at ../qemu-xen-dir-remote/hw/block/virtio-blk.c:67
ret=0) at ../qemu-xen-dir-remote/hw/block/virtio-blk.c:136
at ../qemu-xen-dir-remote/block/block-backend.c:1559
--Type <RET> for more, q to quit, c to continue without paging--
at ../qemu-xen-dir-remote/block/block-backend.c:1614
i1=<optimized out>) at ../qemu-xen-dir-remote/util/coroutine-ucontext.c:177
at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20240124021450.21656-1-peng.fan@oss.nxp.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
The intention of the code appears to have been to unconditionally set
the multifunction bit but since the emulation mask is 0x00 it has no
effect. Instead, emulate the bit and set it based on the multifunction
property of the PCIDevice (which can be set using QAPI).
This allows making passthrough devices appear as functions in a Xen
guest.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20231103172601.1319375-1-ross.lagerwall@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
When converting test vs UINT32_MAX to compare vs 0, we need to
adjust the condition to match.
Fixes: 34aff3c2e0 ("tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass the type to tcg_out_logicali; remove the assert, duplicated
at the start of tcg_out_logicali.
Fixes: 339adf2f38 ("tcg/aarch64: Support TCG_COND_TST{EQ,NE}")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The current post-loading code for scanout has a FIXME: it doesn't take
the resource region/rect into account. But there is more, when adding
blob migration support in commit f66767f75c, I didn't realize that blob
resources could be used for scanouts. This situationn leads to a crash
during post-load, as they don't have an associated res->image.
virtio_gpu_do_set_scanout() handle all cases, but requires the
associated virtio_gpu_framebuffer, which is currently not saved during
migration.
Add a v2 of "virtio-gpu-one-scanout" with the framebuffer fields, so we
can restore blob scanouts, as well as fixing the existing FIXME.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Sebastian Ott <sebott@redhat.com>
The "Listener" connection, being private and under the control of the
qemu display, allows for the optimization of discarding pending
intermediary messages when queuing a new scanout. This ensures that the
client receives only the latest scanout update, improving communication
efficiency.
While the current implementation does not provide a mechanism for
clients who may wish to receive all updates, making this behavior
optional could be considered in the future. For now, adopting this new
default behavior accelerates the communication process without a
guarantee of delivering all updates.
The filter is removed when the connection is dropped.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
query-cpu-model-expansion takes a CpuModelInfo argument. The
loongarch version of the command silently ignores the argument's
member @props. For instance,
{"execute": "query-cpu-model-expansion", "arguments": {"type": "static", "model": {"name": "la464", "props": null}}}
and
{"execute": "query-cpu-model-expansion", "arguments": {"type": "static", "model": {"name": "la464", "props": {"prop": null}}}}
succeed.
Add skeleton code for property processing that recognizes no
properties. Now the two commands fail as they should:
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'model.props', expected: object"}}
and
{"error": {"class": "GenericError", "desc": "Parameter 'model.props.prop' is unexpected"}}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-5-armbru@redhat.com>
[Drop #include now superfluous]
query-cpu-model-comparison, query-cpu-model-baseline, and
query-cpu-model-expansion take CpuModelInfo arguments. Errors in
@props members of these arguments are reported for 'props', without
further context. For instance, s390x rejects
{"execute": "query-cpu-model-comparison", "arguments": {"modela": {"name": "z13", "props": {}}, "modelb": {"name": "z14", "props": []}}}
with
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'props', expected: object"}}
This is unusual; the common QAPI unmarshaling machinery would complain
about 'modelb.props'. Our hand-written code to visit the @props
member neglects to provide the context.
Tweak it so it provides it. The command above now fails with
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'modelb.props', expected: dict"}}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-4-armbru@redhat.com>
Acked-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
CpuModelInfo member @props is semantically a mapping from name to
value, and syntactically a JSON object on the wire. This translates
to QDict in C. Since the QAPI schema language lacks the means to
express 'object', we use 'any' instead. This is QObject in C.
Commands taking a CpuModelInfo argument need to check the QObject is a
QDict.
The i386 version of qmp_query_cpu_model_expansion() fails to check.
Instead, @props is silently ignored when it's not an object. For
instance,
{"execute": "query-cpu-model-expansion", "arguments": {"type": "full", "model": {"name": "qemu64", "props": null}}}
succeeds.
Fix by refactoring the code to match the other targets. Now the
command fails as it should:
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'props', expected: object"}}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-3-armbru@redhat.com>
CpuModelInfo member @props is semantically a mapping from name to
value, and syntactically a JSON object on the wire. This translates
to QDict in C. Since the QAPI schema language lacks the means to
express 'object', we use 'any' instead. This is QObject in C.
Commands taking a CpuModelInfo argument need to check the QObject is a
QDict.
For arm, riscv, and s390x, the code checks right before passing the
QObject to visit_start_struct(). visit_start_struct() then checks
again.
Delete the first check.
The error message for @props that are not an object changes slightly
to the the message we get for this kind of type error in other
contexts. Minor improvement.
Additionally, error messages about members of @props now refer to
'props.prop-name' instead of just 'prop-name'. Another minor
improvement.
Both changes are visible in tests/qtest/arm-cpu-features.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-2-armbru@redhat.com>
Acked-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[Drop #include now superfluous]
Since CPU() macro is a simple cast, the following are equivalent:
Object *obj;
CPUState *cs = CPU(obj)
In order to ease static analysis when running
scripts/coccinelle/cpu_env.cocci from the previous commit,
replace:
- CPU_GET_CLASS(cpu);
+ CPU_GET_CLASS(obj);
Most code use the 'cs' variable name for CPUState handle.
Replace few 's' -> 'cs' to unify cpu_reset_hold() style.
No logical change in this patch.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240129164514.73104-7-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Avoid CPUArchState local variable when cpu_env() is used once.
Mechanical patch using the following Coccinelle spatch script:
@@
type CPUArchState;
identifier env;
expression cs;
@@
{
- CPUArchState *env = cpu_env(cs);
... when != env
- env
+ cpu_env(cs)
... when != env
}
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240129164514.73104-5-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When a variable is initialized to &struct->field, use it
in place. Rationale: while this makes the code more concise,
this also helps static analyzers.
Mechanical change using the following Coccinelle spatch script:
@@
type S, F;
identifier s, m, v;
@@
S *s;
...
F *v = &s->m;
<+...
- &s->m
+ v
...+>
Inspired-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240129164514.73104-2-philmd@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
[thuth: Dropped hunks that need a rebase, and fixed sizeof() in pmu_realize()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since the commit 05e385d2a9 ("error: Move ERRP_GUARD() to the beginning
of the function"), there are new codes that don't put ERRP_GUARD() at
the beginning of the functions.
As stated in the commit 05e385d2a9: "include/qapi/error.h advises to put
ERRP_GUARD() right at the beginning of the function, because only then
can it guard the whole function.", so clean up the few spots
disregarding the advice.
Inspired-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312060337.3240965-1-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In target/s390x/cpu_models.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- check_compatibility()
- s390_realize_cpu_model()
Though both their @errp parameters point to their callers' local @err
virables and don't cause the issue as [1] said, to follow the
requirement of @errp, also add missing ERRP_GUARD() at their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: David Hildenbrand <david@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-30-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The net_init_vhost_vdpa() passes @errp to error_prepend(), and as a
member of net_client_init_fun[], it's called in net_client_init1() and
gets @errp from this caller.
But because netdev_init_modern() passes &error_fatal to
net_client_init1(), then @errp parameter of net_init_vhost_vdpa() would
point to @error_fatal. This causes the error message in error_prepend()
to be lost because of the above issue.
To fix this, add missing ERRP_GUARD() at the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240311033822.3142585-29-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The migrate_params_check() passes @errp to error_prepend() without
ERRP_GUARD(), and it could be called from migration_object_init(),
where the passed @errp points to @error_fatal.
Therefore, the error message echoed in error_prepend() will be lost
because of the above issue.
To fix this, add missing ERRP_GUARD() at the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Peter Xu <peterx@redhat.com>
Cc: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
Message-ID: <20240311033822.3142585-28-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/virtio/vhost.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vhost_save_backend_state()
- vhost_load_backend_state()
Their @errp both points to callers' @local_err. However, as the APIs
defined in include/hw/virtio/vhost.h, it is necessary to protect their
@errp with ERRP_GUARD().
To follow the requirement of @errp, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-27-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vhost_vsock_device_realize() passes @errp to error_prepend(), and as
a VirtioDeviceClass.realize method, its @errp is from
DeviceClass.realize so that there is no guarantee that the @errp won't
point to @error_fatal.
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-26-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vfio_platform_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-25-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/vfio/pci.c, there are 2 functions passing @errp to error_prepend()
without ERRP_GUARD():
- vfio_add_std_cap()
- vfio_realize()
The @errp of vfio_add_std_cap() is also from vfio_realize(). And
vfio_realize(), as a PCIDeviceClass.realize method, its @errp is from
DeviceClass.realize so that there is no guarantee that the @errp won't
point to @error_fatal.
To avoid the issue like [1] said, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-24-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/vfio/pci-quirks.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vfio_add_nv_gpudirect_cap()
- vfio_add_vmd_shadow_cap()
There are too many possible callers to check the impact of this defect;
it may or may not be harmless. Thus it is necessary to protect their
@errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-23-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The iommufd_cdev_getfd() passes @errp to error_prepend(). Its @errp is
from vfio_attach_device(), and there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-22-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/vfio/helpers.c, there are 3 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vfio_set_irq_signaling()
- vfio_device_get_name()
- vfio_device_set_fd()
There are too many possible callers to check the impact of this defect;
it may or may not be harmless. Thus it is necessary to protect their
@errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-21-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vfio_get_group() passes @errp to error_prepend(). Its @errp is
from vfio_attach_device(), and there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-20-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vfio_ap_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Herne <jjherne@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-19-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vhost_scsi_realize() passes @errp to error_prepend(), and as a
VirtioDeviceClass.realize method, its @errp is from DeviceClass.realize
so that there is no guarantee that the @errp won't point to
@error_fatal.
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-18-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The virtio_blk_vq_aio_context_init() passes @errp to error_prepend().
Though its @errp points its caller's local @err variable, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of
virtio_blk_vq_aio_context_init().
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: "Michael S. Tsirkin" <mst@redhat.com>
Message-ID: <20240311033822.3142585-14-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vmdk_parse_extents() passes @errp to error_prepend(), and its @errp
is from vmdk_open().
Though, vmdk_open(), as a BlockDriver.bdrv_open(), gets the @errp
parameter which is pointer of its caller's local_err, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Fam Zheng <fam@euphon.net>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-13-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vdi_co_do_create() passes @errp to error_prepend() without
ERRP_GUARD(), and its @errp parameter is so widely sourced that it is
necessary to protect it with ERRP_GUARD().
To avoid the potential issues as [1] said, add missing ERRP_GUARD() at
the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Weil <sw@weilnetz.de>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-12-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In block/snapshot.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- bdrv_all_delete_snapshot()
- bdrv_all_goto_snapshot()
As the APIs exposed in include/block/snapshot.h, they could be called
by other modules.
To avoid potential issues as [1] said, add missing ERRP_GUARD() at the
beginning of these 2 functions.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-11-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The bdrv_qed_co_invalidate_cache() passes @errp to error_prepend()
without ERRP_GUARD().
Though it is a BlockDriver.bdrv_co_invalidate_cache() method, and
currently its @errp parameter only points to callers' local_err, to
follow the requirement of @errp, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240311033822.3142585-10-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In block/qcow2.c, there are 2 functions passing @errp to error_prepend()
without ERRP_GUARD():
- qcow2_co_create()
- qcow2_co_truncate()
There are too many possible callers to check the impact of the defect;
it may or may not be harmless. Thus it is necessary to protect @errp with
ERRP_GUARD().
Therefore, to avoid the issue like [1] said, add missing ERRP_GUARD() at
their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240311033822.3142585-9-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The qcow2_co_can_store_new_dirty_bitmap() passes @errp to
error_prepend(). As a BlockDriver.bdrv_co_can_store_new_dirty_bitmap
method, it's called by bdrv_co_can_store_new_dirty_bitmap().
Its caller is not being called anywhere, but as the API in
include/block/block-io.h, we can't ensure what kind of @errp future
users will pass in.
To avoid potential issues as [1] said, add missing ERRP_GUARD() at the
beginning of qcow2_co_can_store_new_dirty_bitmap().
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240311033822.3142585-8-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In nvme.c, there are 3 functions passing @errp to error_prepend()
without ERRP_GUARD():
- nvme_init_queue()
- nvme_create_queue_pair()
- nvme_identify()
All these 3 functions take their @errp parameters from the
nvme_file_open(), which is a BlockDriver.bdrv_nvme() method and its
@errp points to its caller's local_err.
Though these 3 cases haven't trigger the issue like [1] said, to
follow the requirement of @errp, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240311033822.3142585-7-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The nbd_co_do_receive_one_chunk() passes @errp to error_prepend()
without ERRP_GUARD(), and though its @errp parameter points to its
caller's local_err, to follow the requirement of @errp, add missing
ERRP_GUARD() at the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240311033822.3142585-6-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The cbw_open() passes @errp to error_prepend() without ERRP_GUARD().
Though it is the BlockDriver.bdrv_open() method, and currently its
@errp parameter only points to callers' local_err, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: John Snow <jsnow@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240311033822.3142585-5-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In block.c, there are 4 functions passing @errp to error_prepend()
without ERRP_GUARD():
- bdrv_co_create_opts_simple()
- parse_json_filename()
- bdrv_open_backing_file()
- bdrv_append_temp_snapshot()
bdrv_co_create_opts_simple(), is an implementation of
BlockDriver.bdrv_co_create_opts(). There are too many possible callers
to check the impact of this defect; it may or may not be harmless. Thus
it is necessary to protect @errp with ERRP_GUARD().
Though the @errp parameters passed to parse_json_filename(),
bdrv_open_backing_file() and bdrv_append_temp_snapshot() points to their
callers' local_err, to follow the requirement of @errp, also add missing
ERRP_GUARD() at their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240311033822.3142585-4-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The iommufd_backend_set_fd() passes @errp to error_prepend(), to avoid
the above issue, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Yi Liu <yi.l.liu@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-ID: <20240311033822.3142585-3-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The error_vprepend() should use ERRP_GUARD() just as the documentation
of ERRP_GUARD() says:
> It must be used when the function dereferences @errp or passes
> @errp to error_prepend(), error_vprepend(), or error_append_hint().
Considering that error_vprepend() is also an API provided in error.h,
it is necessary to add it to the description of the rules for using
ERRP_GUARD().
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <michael.roth@amd.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240311033822.3142585-2-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
IOAPICCommonClass implements its own private realize(), and this private
realize() allows error.
Since IOAPICCommonClass.realize() returns void, to check the error,
dereference @errp with ERRP_GUARD().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223085653.1255438-8-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_usp_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_cap;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And since
cxl_usp_realize() - as a PCIDeviceClass.realize() method - doesn't get
the NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_usp_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240223085653.1255438-6-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in trng_prop_fault_event_set, @errp is dereferenced without
ERRP_GUARD():
visit_type_uint32(v, name, events, errp);
if (*errp) {
return;
}
Currently, since trng_prop_fault_event_set() doesn't get the NULL @errp
parameter as a "set" method of object property, it hasn't triggered the
bug that dereferencing the NULL @errp.
And since visit_type_uint32() returns bool, check the returned bool
directly instead of dereferencing @errp, then we needn't the add missing
ERRP_GUARD().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223085653.1255438-5-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in ct3_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_free_special_ops;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And
ct3_realize() - as a PCIDeviceClass.realize() method - doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
ct3_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240223085653.1255438-4-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in macfb_nubus_realize(), @errp is dereferenced without
ERRP_GUARD():
ndc->parent_realize(dev, errp);
if (*errp) {
return;
}
Here we check *errp, because the ndc->parent_realize(), as a
DeviceClass.realize() callback, returns void. And since
macfb_nubus_realize(), also as a DeviceClass.realize(), doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing the
NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
macfb_nubus_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240223085653.1255438-3-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_fixed_memory_window_config(), @errp is dereferenced in 2
places without ERRP_GUARD():
fw->enc_int_ways = cxl_interleave_ways_enc(fw->num_targets, errp);
if (*errp) {
return;
}
and
fw->enc_int_gran =
cxl_interleave_granularity_enc(object->interleave_granularity,
errp);
if (*errp) {
return;
}
For the above 2 places, we check "*errp", because neither function
returns a suitable error code. And since machine_set_cfmw() - the caller
of cxl_fixed_memory_window_config() - doesn't get the NULL @errp
parameter as the "set" method of object property,
cxl_fixed_memory_window_config() hasn't triggered the bug that
dereferencing the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_fixed_memory_window_config().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240223085653.1255438-2-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:33 +01:00
2477 changed files with 105101 additions and 78005 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.