Send raw packets over if UADK hardware support is not available. This is to
satisfy Qemu qtest CI which may run on platforms that don't have UADK
hardware support. Subsequent patch will add support for uadk migration
qtest.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add --enable-uadk and --disable-uadk options to enable and disable
UADK compression accelerator. This is for using UADK based hardware
accelerators for live migration.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Document UADK(User Space Accelerator Development Kit) library details
and how to use that for migration.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
[s/Qemu/QEMU in docs]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add qpl to compression method test for multifd migration
the qpl compression supports software path and hardware
path(IAA device), and the hardware path is used first by
default. If the hardware path is unavailable, it will
automatically fallback to the software path for testing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
QPL compression and decompression will use IAA hardware path if the IAA
hardware is available. Otherwise the QPL library software path is used.
The hardware path will automatically fall back to QPL software path if
the IAA queues are busy. In some scenarios, this may happen frequently,
such as configuring 4 channels but only one IAA device is available. In
the case of insufficient IAA hardware resources, retry and fallback can
help optimize performance:
1. Retry + SW fallback:
total time: 14649 ms
downtime: 25 ms
throughput: 17666.57 mbps
pages-per-second: 1509647
2. No fallback, always wait for work queues to become available
total time: 18381 ms
downtime: 25 ms
throughput: 13698.65 mbps
pages-per-second: 859607
If both the hardware and software paths fail, the uncompressed page is
sent directly.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
during initialization, a software job is allocated to each channel
for software path fallabck when the IAA hardware is unavailable or
the hardware job submission fails. If the IAA hardware is available,
multiple hardware jobs are allocated for batch processing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add the Query Processing Library (QPL) compression method
Introduce the qpl as a new multifd migration compression method, it can
use In-Memory Analytics Accelerator(IAA) to accelerate compression and
decompression, which can not only reduce network bandwidth requirement
but also reduce host compression and decompression CPU overhead.
How to enable qpl compression during migration:
migrate_set_parameter multifd-compression qpl
There is no qpl compression level parameter added since it only supports
level one, users do not need to specify the qpl compression level.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed docs spacing in migration.json]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add --enable-qpl and --disable-qpl options to enable and disable
the QPL compression method for multifd migration.
The Query Processing Library (QPL) is an open-source library
that supports data compression and decompression features. It
is based on the deflate compression algorithm and use Intel
In-Memory Analytics Accelerator(IAA) hardware for compression
and decompression acceleration.
For more live migration with IAA, please refer to the document
docs/devel/migration/qpl-compression.rst
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Different compression methods may require different numbers of IOVs.
Based on streaming compression of zlib and zstd, all pages will be
compressed to a data block, so two IOVs are needed for packet header
and compressed data block.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Similar to other archs, build a custom bios memory updater. Running the
test with OF code is a cool trick, but SLOF takes a long time to boot.
This reduces test time by around 3x (150s to 50s).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
ppc64 with TCG seems to no longer be failing this test, perhaps since
commit 03bfc2188f ("physmem: Fix migration dirty bitmap coherency
with TCG memory access") which is not ppc specific but was seen to hit
ppc64 quite easily.
Let's enable it again.
The s390x problem has been identified so mention it while we are
adjusting the comment.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The spapr QEMU machine defaults is useful outside libqos, so create a
new header for ppc specific qtests and move it there.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
* Fix loongarch64 avocado test
* Make qtests more flexible with regards to non-available CPU models
* Improvements for the test-smp-parse unit test
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZpoEoRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVF6g/+JYTRKmaIduQIP9g2+NkM+qMTbjI9Ow47
# 8Vdj/ePMXNWOZsgMPkCUdisYeMZEC+XMcDN1xvwZXwLTMTJRacZCSFRpeN4P0m0W
# 6aQ28+tPNgx+B9Eh2kc4TpxbiqSH8u5u4GEN4Y07rcX/3YbYyjFgZD8orRu/nJ+H
# 0wV7Riq9csi1BkLxrgKaHocFSOl4eOga4OFi+u4wIn/xoW3MN0laxe4iuoQRMZPf
# gJLPRhEija4lto8iIKNxJbTABB0wEcWRWtgqcbHxdatqh1lPTPBpWxmdD/v1LJn+
# H/eO+oh05NQdlhw7+xfWF9PD+MpIePbZ28oNb3X3uURROTdcxpBAgpPipv07FsT4
# LmU2nIBQ4FcpDOkhLnLmBmFBNO6uDCzuGzxFRhX1SIiGMABqTDOKynBQSgQI2iB0
# 5J47XUwHtnOoCvf4SRA/MZG8zNSQZdJbnuOBLgZ+vsCG14mWM2NbfSUwRkH6pd/J
# fEbODuzHZoYgUTxjR9+WMbINAbNjMy+SP2sGZIBzcAIIkybKynOy58LoCyNT684U
# ean9bnc65908PJxEfsQ6k9kNwkK4GwOqZi+X383nVgMJ9+3dDw8M76IVU59hsq1n
# wnz4VgFcRdXMYhj9zghaCgH2Ezw8gZHILXH+RlX0Bav4LQ5vSZQ6tRNwM4+rfXBe
# okF1Sxmz31U=
# =s7+V
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jun 2024 06:19:06 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-06-12' of https://gitlab.com/thuth/qemu:
tests/tcg/s390x: Allow specifying extra QEMU options on the command line
tests/unit/test-smp-parse: Test the full 8-levels topology hierarchy
tests/unit/test-smp-parse: Test "modules" and "dies" combination case
tests/unit/test-smp-parse: Test "modules" parameter in -smp
tests/unit/test-smp-parse: Make test cases aware of module level
tests/unit/test-smp-parse: Use default parameters=0 when not set in -smp
tests/unit/test-smp-parse: Fix an invalid topology case
tests/unit/test-smp-parse: Fix comment of parameters=1 case
tests/unit/test-smp-parse: Fix comments of drawers and books case
test: Remove libibumad dependence
meson: Remove libibumad dependence
tests/qtest/x86: check for availability of older cpu models before running tests
tests/qtest/libqtest: add qtest_has_cpu_model() api
qtest/x86/numa-test: do not use the obsolete 'pentium' cpu
tests/avocado: Update LoongArch bios file
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The use case for this is `make check-tcg EXTFLAGS="-accel kvm"`,
which allows validating the system TCG testcases on real hardware.
EXTFLAGS name is borrowed from tests/tcg/xtensa/Makefile.softmmu-target.
While at it, use += instead of = in order to be consistent with the
other architectures.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240522184116.35975-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
RDMA based migration has no dependence on libumad. libibverbs and
librdmacm are enough.
libumad was used by rdmacm-mux which has been already removed. It's
remained mistakenly.
Fixes: 1dfd42c426 ("hw/rdma: Remove deprecated pvrdma device and rdmacm-mux helper")
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240611105427.61395-2-pizhenwei@bytedance.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It is better to check if some older cpu models like 486, athlon, pentium,
penryn, phenom, core2duo etc are available before running their corresponding
tests. Some downstream distributions may no longer support these older cpu
models.
Signature of add_feature_test() has been modified to return void as
FeatureTestArgs* was not used by the caller.
One minor correction. Replaced 'phenom' with '486' in the test
'x86/cpuid/auto-level/phenom/arat' matching the cpu used.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-4-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Added a new test api qtest_has_cpu_model() in order to check availability of
some cpu models in the current QEMU binary. The specific architecture of the
QEMU binary is selected using the QTEST_QEMU_BINARY environment variable.
This api would be useful to run tests against some older cpu models after
checking if QEMU actually supported these models.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-3-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Ciphers are pre-allocated by qcrypto_block_init_cipher() depending on
the given number of threads. The -device
virtio-blk-pci,iothread-vq-mapping= feature allows users to assign
multiple IOThreads to a virtio-blk device, but the association between
the virtio-blk device and the block driver happens after the block
driver is already open.
When the number of threads given to qcrypto_block_init_cipher() is
smaller than the actual number of threads at runtime, the
block->n_free_ciphers > 0 assertion in qcrypto_block_pop_cipher() can
fail.
Get rid of qcrypto_block_init_cipher() n_thread's argument and allocate
ciphers on demand.
Reported-by: Qing Wang <qinwang@redhat.com>
Buglink: https://issues.redhat.com/browse/RHEL-36159
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Libaio defines IO_CMD_FDSYNC command to sync all outstanding
asynchronous I/O operations, by flushing out file data to the
disk storage. Enable linux-aio to submit such aio request.
When using aio=native without fdsync() support, QEMU creates
pthreads, and destroying these pthreads results in TLB flushes.
In a real-time guest environment, TLB flushes cause a latency
spike. This patch helps to avoid such spikes.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-ID: <20240425070412.37248-1-ppandit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
rather than the uint32_t for which the maximum is slightly more than 4
seconds and larger values would overflow. The QAPI interface allows
specifying the number of seconds, so only values 0 to 4 are safe right
now, other values lead to a much lower timeout than a user expects.
The block_copy() call where this is used already takes a uint64_t for
the timeout, so no change required there.
Fixes: 6db7fd1ca9 ("block/copy-before-write: implement cbw-timeout option")
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240429141934.442154-1-f.ebner@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The main loop has two AioContexts: qemu_aio_context and iohandler_ctx.
The main loop runs them both, but nested aio_poll() calls on
qemu_aio_context exclude iohandler_ctx.
Which one should qemu_get_current_aio_context() return when called from
the main loop? Document that it's always qemu_aio_context.
This has subtle effects on functions that use
qemu_get_current_aio_context(). For example, aio_co_reschedule_self()
does not work when moving from iohandler_ctx to qemu_aio_context because
qemu_get_current_aio_context() does not differentiate these two
AioContexts.
Document this in order to reduce the chance of future bugs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 1f25c172f8 ("monitor: use aio_co_reschedule_self()") was a code
cleanup that uses aio_co_reschedule_self() instead of open coding
coroutine rescheduling.
Bug RHEL-34618 was reported and Kevin Wolf <kwolf@redhat.com> identified
the root cause. I missed that aio_co_reschedule_self() ->
qemu_get_current_aio_context() only knows about
qemu_aio_context/IOThread AioContexts and not about iohandler_ctx. It
does not function correctly when going back from the iohandler_ctx to
qemu_aio_context.
Go back to open coding the AioContext transitions to avoid this bug.
This reverts commit 1f25c172f8.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-34618
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bsd-user: Baby Steps towards eliminating qemu_host_page_size, et al
First baby-steps towards eliminating qemu_host_page_size: tackle the reserve_va
calculation (which is easier to copy from linux-user than to fix).
# -----BEGIN PGP SIGNATURE-----
# Comment: GPGTools - https://gpgtools.org
#
# iQIzBAABCgAdFiEEIDX4lLAKo898zeG3bBzRKH2wEQAFAmZl3pgACgkQbBzRKH2w
# EQBfpg//U4YdJAA0H4okwPtowP1wIK1gpWvVd5FIN17pCXLKT4FR4efhWeEnQh8U
# +dXvkCpX/MnhBkStYoGZBmYe1rNKkEAn8BPCsQqX4y3af5RzKyKWo0gZXOjN3L9e
# ixmeFcg/7BTwnSbcO02xd9BOPPaRiFBDSidh28gr/1sxpXRxlbQHzIUpTBncDaN6
# 4w5DnF+b1RFHCz05ytrP517cj7E32Ig9S/cVMmBd1pGJiLnHiOp/peMprCL6tnI+
# YNBzttCbRPNH2z0zVd9En/hDnVirGPYX+LXg0Djkw3I+stJj4jwbJTuDG+5Lzghp
# YrYfiU6x7OG9ywjFJgY1/pExVT1cwkNjuGCXL+F4R49R5LfIEHq5/MlQp+tjpYYO
# g5WmpiLnFpFosmXIPJmxr16zqm2sLD+P0Jr/kdIz58fTWmIQeKwi/Vu/73h4kxST
# vjBbhC3eg56lQDaospc4h8+RehmI6LdSWYx0kxv2JKpXH3lQPqsDSrOcm9hEbWYS
# DeV++vkyQcXrbCnwomfxG1U+dVYBlJ1L1wClxc/1WD9KxXXJIwlvGmIu3o3c2+xj
# BM6eRe3evWioqdqhc2lY+XxATwbIUxiect6ml+F6E0KJxlm3Ajqy6qw49G+uhZxa
# XWUEIYGDd6/xHMlBeo6FKUpe/Ez/i3eCFXr4AD4iO7AtTuukrO4=
# =3EaH
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 09 Jun 2024 09:55:52 AM PDT
# gpg: using RSA key 2035F894B00AA3CF7CCDE1B76C1CD1287DB01100
# gpg: Good signature from "Warner Losh <wlosh@netflix.com>" [unknown]
# gpg: aka "Warner Losh <imp@bsdimp.com>" [unknown]
# gpg: aka "Warner Losh <imp@freebsd.org>" [unknown]
# gpg: aka "Warner Losh <imp@village.org>" [unknown]
# gpg: aka "Warner Losh <wlosh@bsdimp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2035 F894 B00A A3CF 7CCD E1B7 6C1C D128 7DB0 1100
* tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu:
bsd-user: Catch up to run-time reserved_va math
bsd-user: port linux-user:ff8a8bbc2ad1 for variable page sizes
linux-user: Adjust comment to reflect the code.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Catch up to linux-user's 8f67b9c694, 13c1339755, 2f7828b572, and
95059f9c31 by Richard Henderson which made reserved_va a run-time
calculation, defaulting to nothing except in the case of 64-bit host
32-bit target. Also include the adjustment of the comment heading that
work submitted in the same patch stream. Since this is a direct copy,
squash it into one patch rather than follow the Linux evolution since
breaking this down further at this point doesn't make sense for this
"new code".
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Bring in Richard Henderson's ff8a8bbc2a to finalize the page size to
allow TARGET_PAGE_BITS_VARY. bsd-user's "blitz" fork has aarch64
support, which is now variable page size. Add support for it here, even
though it's effectively a nop in upstream qemu.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
If the user didn't specify reserved_va, there's an else for 64-bit host
32-bit (or fewer) target to reserve 32-bits of address space. Update the
comments to reflect this, and rejustify comment to 80 columns.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
gen_inst_init_args() is called for instructions using a predicate as an
rvalue. Upon first call, the list of arguments which might need
initialization init_list is freed to indicate that they have been
processed. For instructions without an rvalue predicate,
gen_inst_init_args() isn't called and init_list will never be freed.
Free init_list from free_instruction() if it hasn't already been freed.
A comment in free_instruction is also updated.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-4-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
Before switching to GArray/g_string_printf we used fixed size arrays for
output buffers and instructions arguments among other things.
Macros defining the sizes of these buffers were left behind, remove
them.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-2-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
At 09a7e7db0f (Hexagon (target/hexagon) Remove uses of
op_regs_generated.h.inc, 2024-03-06), we've changed the logic of
check_new_value() to use the new pre-calculated
packet->insn[...].dest_idx instead of calculating the index on the fly
using opcode_reginfo[...]. The dest_idx index is calculated roughly like
the following:
for reg in iset[tag]["syntax"]:
if reg.is_written():
dest_idx = regno
break
Thus, we take the first register that is writtable. Before that,
however, we also used to follow an alphabetical order on the register
type: 'd', 'e', 'x', and 'y'. No longer following that makes us select
the wrong register index and the HVX store new instruction does not
update the memory like expected.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Message-Id: <f548dc1c240819c724245e887f29f918441e9125.1716220379.git.quic_mathbern@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
* scsi-disk: Don't silently truncate serial number
* backends/hostmem: Report error on unavailable qemu_madvise() features or unaligned memory sizes
* target/i386: fixes and documentation for INHIBIT_IRQ/TF/RF and debugging
* i386/hvf: Adds support for INVTSC cpuid bit
* i386/hvf: Fixes for dirty memory tracking
* i386/hvf: Use hv_vcpu_interrupt() and hv_vcpu_run_until()
* hvf: Cleanups
* stubs: fixes for --disable-system build
* i386/kvm: support for FRED
* i386/kvm: fix MCE handling on AMD hosts
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZkF2oUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPNlQf+N9y6Eh0nMEEQ69twtV8ytglTY+uX
# FsogvnsXHNMVubOWmmeItM6kFXTAkR9cmFaL8dqI1Gs03xEQdQXbF1KejJZOAZVl
# RQMOW8Fg2Afr+0lwqCXHvhsmZ4hr5yUkRndyucA/E9AO2uGrtgwsWGDBGaHJOZIA
# lAsEMOZgKjXHZnefXjhMrvpk/QNovjEV6f1RHX3oKZjKSI5/G4IqGSmwNYToot8p
# 2fgs4Qti4+1gNyM2oBLq7cCMjMS61tSxOMH4uqVoIisjyckPlAFRvc+DXtKsUAAs
# 9AgM++pNgpB0IXv67czRUNdRoK7OI8I0ULhI4qHXi6Yg2QYAHqpQ6WL4Lg==
# =RP7U
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jun 2024 01:33:46 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (42 commits)
python: mkvenv: remove ensure command
Revert "python: use vendored tomli"
i386: Add support for overflow recovery
i386: Add support for SUCCOR feature
i386: Fix MCE support for AMD hosts
docs: i386: pc: Avoid mentioning limit of maximum vCPUs
target/i386: Add get/set/migrate support for FRED MSRs
target/i386: enumerate VMX nested-exception support
vmxcap: add support for VMX FRED controls
target/i386: mark CR4.FRED not reserved
target/i386: add support for FRED in CPUID enumeration
hvf: Makes assert_hvf_ok report failed expression
i386/hvf: Updates API usage to use modern vCPU run function
i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit
i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change
hvf: Consistent types for vCPU handles
i386/hvf: Fixes some compilation warnings
i386/hvf: Adds support for INVTSC cpuid bit
stubs/meson: Fix qemuutil build when --disable-system
scsi-disk: Don't silently truncate serial number
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This was used to bootstrap the venv with a TOML parser, after which
ensuregroup is used. Now that we expect it to be present as a system
package (either tomli or, for Python 3.11, tomllib), it is not needed
anymore.
Note that this means that, when implemented, the hypothetical "isolated"
mode that does not use any system packages will only work with Python
3.11+.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that Ubuntu 20.04 is not included anymore, there is no need to ship
it as part of QEMU; Ubuntu 22.04 includes it and Leap users anyway
need to install all the required dependencies from PyPI.
This mostly reverts commit ec77ee7634de123b7c899739711000fd21dab68b,
with just some changes to the wording.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add cpuid bit definition for overflow recovery. This is needed in the case
where a deferred error has been sent to the guest, a guest process accesses the
poisoned memory, but the machine_check_poll function has not yet handled the
original deferred error. If overflow recovery is not set in this case, when we
handle the uncorrected error from the poisoned memory access, the overflow bit
will be set and will result in the guest being shut down.
By the time the MCE reaches the guest, the overflow has been handled
by the host and has not caused a shutdown, so include the bit unconditionally.
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-4-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.
----
v2:
- Add "succor" feature word.
- Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-3-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the most part, AMD hosts can use the same MCE injection code as Intel, but
there are instances where the qemu implementation is Intel specific. First, MCE
delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.
Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-2-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Different versions of PC machine support different maximum vCPUs, and
even different features have limits on the maximum number of vCPUs (
For example, if x2apic is not enabled in the TCG case, the maximum of
255 vCPUs are supported).
It is difficult to list the maximum vCPUs under all restrictions. Thus,
to avoid confusion, avoid mentioning specific maximum vCPU number
limitations here.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240606085436.2028900-1-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
FRED CPU states are managed in 9 new FRED MSRs, in addtion to a few
existing CPU registers and MSRs, e.g., CR4.FRED and MSR_IA32_PL0_SSP.
Save/restore/migrate FRED MSRs if FRED is exposed to the guest.
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-7-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
FRED, i.e., the Intel flexible return and event delivery architecture,
defines simple new transitions that change privilege level (ring
transitions).
The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
In addition to these transitions, the FRED architecture defines a new
instruction (LKGS) for managing the state of the GS segment register.
The LKGS instruction can be used by 64-bit operating systems that do
not use the new FRED transitions.
WRMSRNS is an instruction that behaves exactly like WRMSR, with the
only difference being that it is not a serializing instruction by
default. Under certain conditions, WRMSRNS may replace WRMSR to improve
performance. FRED uses it to switch RSP0 in a faster manner.
Search for the latest FRED spec in most search engines with this search
pattern:
site:intel.com FRED (flexible return and event delivery) specification
The CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[17] enumerates FRED, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[18] enumerates LKGS, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[19] enumerates WRMSRNS.
Add CPUID definitions for FRED/LKGS/WRMSRNS, and expose them to KVM guests.
Because FRED relies on LKGS and WRMSRNS, add that to feature dependency
map.
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-2-xin3.li@intel.com>
[Fix order of dependencies, add dependencies from LM to FRED. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a macOS Hypervisor.framework call fails which is checked by
assert_hvf_ok(), Qemu exits printing the error value, but not the
location
in the code, as regular assert() macro expansions would.
This change turns assert_hvf_ok() into a macro similar to other
assertions, which expands to a call to the corresponding _impl()
function together with information about the expression that failed
the assertion and its location in the code.
Additionally, stringifying the numeric hv_return_t code is factored
into a helper function that can be reused for diagnostics and debugging
outside of assertions.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-8-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
macOS 10.15 introduced the more efficient hv_vcpu_run_until() function
to supersede hv_vcpu_run(). According to the documentation, there is no
longer any reason to use the latter on modern host OS versions, especially
after 11.0 added support for an indefinite deadline.
Observed behaviour of the newer function is that as documented, it exits
much less frequently - and most of the original function’s exits seem to
have been effectively pointless.
Another reason to use the new function is that it is a prerequisite for
using newer features such as in-kernel APIC support. (Not covered by
this patch.)
This change implements the upgrade by selecting one of three code paths
at compile time: two static code paths for the new and old functions
respectively, when building for targets where the new function is either
not available, or where the built executable won’t run on older
platforms lacking the new function anyway. The third code path selects
dynamically based on runtime detected availability of the weakly-linked
symbol.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-7-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When interrupting a vCPU thread, this patch actually tells the hypervisor to
stop running guest code on that vCPU.
Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
hv_vcpus_exit on aarch64. Alternatively, if the vCPU thread
is not
running the VM, it will immediately cause an exit when it attempts
to do so.
Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
frequently, including many spurious exits, which made it less of a problem that
nothing was actively done to stop the vCPU thread running guest code.
The newer, more efficient hv_vcpu_run_until exits much more rarely, so a true
"kick" is needed before switching to that.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-6-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When using x86 macOS Hypervisor.framework as accelerator, detection of
dirty memory regions is implemented by marking logged memory region
slots as read-only in the EPT, then setting the dirty flag when a
guest write causes a fault. The area marked dirty should then be marked
writable in order for subsequent writes to succeed without a VM exit.
However, dirty bits are tracked on a per-page basis, whereas the fault
handler was marking the whole logged memory region as writable. This
change fixes the fault handler so only the protection of the single
faulting page is marked as dirty.
(Note: the dirty page tracking appeared to work despite this error
because HVF’s hv_vcpu_run() function generated unnecessary EPT fault
exits, which ended up causing the dirty marking handler to run even
when the memory region had been marked RW. When using
hv_vcpu_run_until(), a change planned for a subsequent commit, these
spurious exits no longer occur, so dirty memory tracking malfunctions.)
Additionally, the dirty page is set to permit code execution, the same
as all other guest memory; changing memory protection from RX to RW not
RWX appears to have been an oversight.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-5-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
macOS Hypervisor.framework uses different types for identifying vCPUs, hv_vcpu_t or hv_vcpuid_t, depending on host architecture. They are not just differently named typedefs for the same primitive type, but reference different-width integers.
Instead of using an integer type and casting where necessary, this change introduces a typedef which resolves the active architecture’s hvf typedef. It also removes a now-unnecessary cast.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-4-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A bunch of function definitions used empty parentheses instead of (void) syntax, yielding the following warning when building with clang on macOS:
warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
In addition to fixing these function headers, it also fixes what appears to be a typo causing a variable to be unused after initialisation.
warning: variable 'entry_ctls' set but not used [-Wunused-but-set-variable]
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-3-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds the INVTSC bit to the Hypervisor.framework accelerator's
CPUID bit passthrough allow-list. Previously, specifying +invtsc in the CPU
configuration would fail with the following warning despite the host CPU
advertising the feature:
qemu-system-x86_64: warning: host doesn't support requested feature:
CPUID.80000007H:EDX.invtsc [bit 8]
x86 macOS itself relies on a fixed rate TSC for its own Mach absolute time
timestamp mechanism, so there's no reason we can't enable this bit for guests.
When the feature is enabled, a migration blocker is installed.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-2-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compiling without system, user, tools or guest-agent fails with the
following error message:
./configure --disable-system --disable-user --disable-tools \
--disable-guest-agent
error message:
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function `error_printf':
/media/liuzhao/data/qemu-cook/build/../util/error-report.c:38: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function `vreport':
/media/liuzhao/data/qemu-cook/build/../util/error-report.c:215: undefined reference to `error_vprintf'
collect2: error: ld returned 1 exit status
This is because tests/bench and tests/unit both need qemuutil, which
requires error_vprintf stub when system is disabled.
Add error_vprintf stub into stub_ss for all cases other than disabling
system.
Fixes: 3a15604900 ("stubs: include stubs only if needed")
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240605152549.1795762-1-zhao1.liu@intel.com>
[Include error-printf.c unconditionally. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this commit, scsi-disk accepts a string of arbitrary length for
its "serial" property. However, the value visible on the guest is
actually truncated to 36 characters. This limitation doesn't come from
the SCSI specification, it is an arbitrary limit that was initially
picked as 20 and later bumped to 36 by commit 48b62063.
Similarly, device_id was introduced as a copy of the serial number,
limited to 20 characters, but commit 48b62063 forgot to actually bump
it.
As long as we silently truncate the given string, extending the limit is
actually not a harmless change, but break the guest ABI. This is the
most important reason why commit 48b62063 was really wrong (and it's
also why we can't change device_id to be in sync with the serial number
again and use 36 characters now, it would be another guest ABI
breakage).
In order to avoid future breakage, don't silently truncate the serial
number string any more, but just error out if it would be truncated.
Buglink: https://issues.redhat.com/browse/RHEL-3542
Suggested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240604161755.63448-1-kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No semantic change, just simpler control flow.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Detect early unsupported MADV_MERGEABLE and MADV_DONTDUMP, and print a clearer
error message that points to the deficiency of the host.
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If memory-backend-{file,ram} has a size that's not aligned to
underlying page size it is not only wasteful, but also may lead
to hard to debug behaviour. For instance, in case
memory-backend-file and hugepages, madvise() and mbind() fail.
Rightfully so, page is the smallest unit they can work with. And
even though an error is reported, the root cause it not very
clear:
qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument
After this commit:
qemu-system-x86_64: backend 'memory-backend-file' memory size must be multiple of 2 MiB
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Message-ID: <b5b9f9c6bba07879fb43f3c6f496c69867ae3716.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The unspoken premise of qemu_madvise() is that errno is set on
error. And it is mostly the case except for posix_madvise() which
is documented to return either zero (on success) or a positive
error number. This means, we must set errno ourselves. And while
at it, make the function return a negative value on error, just
like other error paths do.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <af17113e7c1f2cc909ffd36d23f5a411b63b8764.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise, starting any guest on a non-Linux guests results in
qemu-system-arm: Couldn't set property 'merge' on 'memory-backend-ram': Invalid argument
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The calculation of FrameTemp is done using the size indicated by mo_pushpop()
before being written back to EBP, but the final writeback to EBP is done using
the size indicated by mo_stacksize().
In the case where mo_pushpop() is MO_32 and mo_stacksize() is MO_16 then the
final writeback to EBP is done using MO_16 which can leave junk in the top
16-bits of EBP after executing ENTER.
Change the writeback of EBP to use the same size indicated by mo_pushpop() to
ensure that the full value is written back.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-5-mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When OS/2 Warp configures its segment descriptors, many of them are configured with
the P flag clear to allow for a fault-on-demand implementation. In the case where
the stack value is POPped into the segment registers, the SP is incremented before
calling gen_helper_load_seg() to validate the segment descriptor:
IN:
0xffef2c0c: 66 07 popl %es
OP:
ld_i32 loc9,env,$0xfffffffffffffff8
sub_i32 loc9,loc9,$0x1
brcond_i32 loc9,$0x0,lt,$L0
st16_i32 loc9,env,$0xfffffffffffffff8
st8_i32 $0x1,env,$0xfffffffffffffffc
---- 0000000000000c0c 0000000000000000
ext16u_i64 loc0,rsp
add_i64 loc0,loc0,ss_base
ext32u_i64 loc0,loc0
qemu_ld_a64_i64 loc0,loc0,noat+un+leul,5
add_i64 loc3,rsp,$0x4
deposit_i64 rsp,rsp,loc3,$0x0,$0x10
extrl_i64_i32 loc5,loc0
call load_seg,$0x0,$0,env,$0x0,loc5
add_i64 rip,rip,$0x2
ext16u_i64 rip,rip
exit_tb $0x0
set_label $L0
exit_tb $0x7fff58000043
If helper_load_seg() generates a fault when validating the segment descriptor then as
the SP has already been incremented, the topmost word of the stack is overwritten by
the arguments pushed onto the stack by the CPU before taking the fault handler. As a
consequence things rapidly go wrong upon return from the fault handler due to the
corrupted stack.
Update the logic for the existing writeback condition so that a POP into the segment
registers also calls helper_load_seg() first before incrementing the SP, so that if a
fault occurs the SP remains unaltered.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-4-mark.cave-ayland@ilande.co.uk>
Fixes: cc1d28bdbe ("target/i386: move 00-5F opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DISAS_NORETURN suppresses the work normally done by gen_eob(), and therefore
must be used in special cases only. Document them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HLT uses DISAS_NORETURN because the corresponding helper calls
cpu_loop_exit(). However, while gen_eob() clears HF_RF_MASK and
synthesizes a #DB exception if single-step is active, none of this is
done by HLT. Note that the single-step trap is generated after the halt
is finished.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PAUSE uses DISAS_NORETURN because the corresponding helper
calls cpu_loop_exit(). However, while HLT clear HF_INHIBIT_IRQ_MASK
to correctly handle "STI; HLT", the same is missing from PAUSE.
And also gen_eob() clears HF_RF_MASK and synthesizes a #DB exception
if single-step is active; none of this is done by HLT and PAUSE.
Start fixing PAUSE, HLT will follow.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
From vm entry to exit, VMRUN is handled as a single instruction. It
uses DISAS_NORETURN in order to avoid processing TF or RF before
the first instruction executes in the guest. However, the corresponding
handling is missing in vmexit. Add it, and at the same time reorganize
the comments with quotes from the manual about the tasks performed
by a #VMEXIT.
Another gen_eob() task that is missing in VMRUN is preparing the
HF_INHIBIT_IRQ flag for the next instruction, in this case by loading
it from the VMCB control state.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the required DR7 (either from the VMCB or from the host save
area) disables a breakpoint that was enabled prior to vmentry
or vmexit, it is left enabled and will trigger EXCP_DEBUG.
This causes a spurious #DB on the next crossing of the breakpoint.
To disable it, vmentry/vmexit must use cpu_x86_update_dr7
to load DR7.
Because cpu_x86_update_dr7 takes a 32-bit argument, check
reserved bits prior to calling cpu_x86_update_dr7, and do the
same for DR6 as well for consistency.
This scenario is tested by the "host_rflags" test in kvm-unit-tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DR7.GD triggers a #DB exception on any access to debug registers.
The GD bit is cleared so that the #DB handler itself can access
the debug registers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use decode.c's support for intercepts, doing the check in TCG-generated
code rather than the helper. This is cleaner because it allows removing
the eip_addend argument to helper_pause(), even though it adds a bit of
bloat for opcode 0x90's new decoding function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use decode.c's support for intercepts, doing the check in TCG-generated
code rather than the helper. This is cleaner because it allows removing
the eip_addend argument to helper_hlt().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ICEBP generates a trap-like exception, while gen_exception() produces
a fault. Resurrect gen_update_eip_next() to implement the desired
semantics.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When preparing an exception stack frame for a fault exception, the value
pushed for RF is 1. Take that into account. The same should be true
of interrupts for repeated string instructions, but the situation there
is complicated.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pull-loongarch-20240606
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZmE0HwAKCRBAov/yOSY+
# 396sA/90m/zr91pLQlkhFuYLHg958Ow3L5ysblcuAAmcTXGi8iE9IeTTeZru6WEO
# H/CL/njUkIgP+/Tio0n0Lx6rWkxOzGxWCpvzqrabsPGvs4GUtFEjI/2pvEWP6C9/
# S6Jon3py0oZeoVx8D6Tr/CJrhD0IBptbEn1aiQNDRuSzeuCo1Q==
# =xpjH
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 08:59:27 PM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240606' of https://gitlab.com/gaosong/qemu:
target/loongarch: fix a wrong print in cpu dump
hw/loongarch/virt: Enable extioi virt extension
hw/loongarch/virt: Use MemTxAttrs interface for misc ops
hw/intc/loongarch_extioi: Add extioi virt extension definition
tests/qtest: Add numa test for loongarch system
tests/libqos: Add loongarch virt machine node
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
testing cleanups (ci, vm, lcitool, ansible):
- clean up left over Centos 8 references
- use -fno-sanitize=function to avoid non-useful errors
- bump lcitool and update images (alpine, fedora)
- make sure we have mingw-w64-tools for windows builds
- drive ansible scripts with lcitool package lists
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZhgb4ACgkQ+9DbCVqe
# KkQNMAf/eyGgbU6ASgbwGqiJCOrkWo8CM7G1dXZ5GpVvKVnlDioMaefFCWt3ftB6
# kKtiskC1xVx3vM1mmomosSGxTNxT93HMLulKJLXK8/SvOFU9phzzUeZXTqS7JKNb
# NrawL0vkygRn+mmTgr3M+Z7rh4yI9e7e2yeX+oQiXsSGGNM114EdcUqKG82tpO8G
# cDlNtlp2jgptNnmzQ7ufZIRD5ckg+2os6XFB+bhmfaWCu9PXTTNwBfLJkEnXENi3
# NpRHPRGsUwZ3QcGt0JLkxTT/yeHngYBP7us2cwMXuf9lxfFCF3Ixt1jF/9ZJRrRm
# wUYG6TBKh6S24cJSBiu1zguQ/EE7WA==
# =myWO
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 06 Jun 2024 02:30:38 AM PDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-june24-060624-1' of https://gitlab.com/stsquad/qemu:
scripts/ci: drive ubuntu/build-environment.yml from lcitool
tests/lcitool: generate package lists for ansible
tests/lcitool: Install mingw-w64-tools for the Windows cross-builds
tests/lcitool: Bump to latest libvirt-ci and update Fedora and Alpine version
.gitlab-ci.d/buildtest.yml: Use -fno-sanitize=function in the clang-system job
tests/lcitool: Delete obsolete centos-stream-8.yml file
docs/ci: clean-up references for consistency
scripts/ci: remove CentOS bits from common build-environment
tests/vm: remove plain centos image
tests/vm: update centos.aarch64 image to 9
docs/devel: update references to centos to non-versioned container
ci: remove centos-steam-8 customer runner
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Update to the latest version of lcitool. It dropped support for Fedora 38
and Alpine 3.18, so we have to update these to newer versions here, too.
Python 3.12 dropped the "imp" module which we still need for running
Avocado. Fortunately Fedora 40 still ships with a work-around package
that we can use until somebody updates our Avocado to a newer version.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240601070543.37786-3-thuth@redhat.com>
[AJB: regen on rebase]
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-10-alex.bennee@linaro.org>
The latest version of Clang (version 18 from Fedora 40) now reports
bad function pointer casts as undefined behavior. Unfortunately, we are
still doing this in quite a lot of places in the QEMU code and some of
them are not easy to fix. So for the time being, temporarily switch this
off in the failing clang-system job until all spots in the QEMU sources
have been tackled.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240601070543.37786-4-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-9-alex.bennee@linaro.org>
>From the website:
"After May 31, 2024, CentOS Stream 8 will be archived and no further
updates will be provided."
We have updated a few bits but there are still references that need
fixing. Rather than bump I've replaced them with references to the
Debian image so we don't have to bump at the next update.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-3-alex.bennee@linaro.org>
This broke since eef0bae3a7 (migration: Remove block migration) but
even after that was addressed it still fails to complete. As it will
shortly be EOL lets to remove the runner definition and the related
ansible setup bits.
We still have centos9 docker images build and test.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240603175328.3823123-2-alex.bennee@linaro.org>
This patch adds a new board attribute 'v-eiointc'.
A value of true enables the virt extended I/O interrupt controller.
VMs working in kvm mode have 'v-eiointc' enabled by default.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20240528083855.1912757-4-gaosong@loongson.cn>
Add loongarch virt machine to the graph. It is a modified copy of
the existing riscv virtmachine in riscv-virt-machine.c
It contains a generic-pcihost controller, and an extra function
loongarch_config_qpci_bus() to configure GPEX pci host controller
information, such as ecam and pio_base addresses.
Also hotplug handle checking about TYPE_VIRTIO_IOMMU_PCI device is
added on loongarch virt machine, since virtio_mmu_pci device requires
it.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240528082053.938564-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
util/hexdump: Use a GString for qemu_hexdump_line.
system/qtest: Replace sprintf by qemu_hexdump_line
hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf
disas/microblaze: Reorg to avoid intermediate sprintf
disas/riscv: Use GString in format_inst
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZg1RMdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+6mgf6AjEdU91vBXAUxabs
# kmVl5HaAD3NHU1VCM+ruPQkm6xv4kLlMsTibmkiS7+WZYvHfPlGfozjRJxtvZj8K
# 8J2Qp9iHjny8NQPkMCValDvmzkxaIT7ZzYCBdS4jfTdIThuYNJnXsI3NNP7ghnl6
# xv8O62dQbc5gjWF8G+q6PKWSxY6BEuFJ3Pt82cJ/Fj/8bhsjd48pgiLv66F/+q1z
# U9Gy8fWqmkKEzTqBigSYU98yae5CA89T6JBKtgFV07pkYa4A7BUyCR5EBirARyhM
# P0OAqR1GCAbSXWFaJ1sSpU8ATq33FoSQYwWwcmEET7FZYZqvbd6Jd4HtpOPqmu9W
# Fc4taw==
# =VgLB
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 02:13:55 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-misc-20240605' of https://gitlab.com/rth7680/qemu:
disas/riscv: Use GString in format_inst
disas/microblaze: Split get_field_special
disas/microblaze: Print registers directly with PRIrfsl
disas/microblaze: Print immediates directly with PRIimm
disas/microblaze: Print registers directly with PRIreg
disas/microblaze: Merge op->name output into each fprintf
disas/microblaze: Re-indent print_insn_microblaze
disas/microblaze: Split out print_immval_addr
hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf
hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
system/qtest: Replace sprintf by qemu_hexdump_line
hw/mips/malta: Add re-usable rng_seed_hex_new() method
util/hexdump: Inline g_string_append_printf "%02x"
util/hexdump: Add unit_len and block_len to qemu_hexdump_line
util/hexdump: Use a GString for qemu_hexdump_line
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.
Note that this drops the "0x" prefix to every byte, which should
be of no consequence to tracing.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240412073346.458116-9-richard.henderson@linaro.org>
sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Extract common code from reinitialize_rng_seed and load_kernel
to rng_seed_hex_new. Using qemu_hexdump_line both fixes the
deprecation warning and simplifies the code base.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Use qemu_hexdump_line.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240412073346.458116-7-richard.henderson@linaro.org>
Ignore the "monitor" portion and treat them the same
as their base ASIs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
VIS4 completes the set, adding missing signed 8-bit ops
and missing unsigned 16 and 32-bit ops.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The manual separates VIS 3 and VIS 3B, even though they are both
present in all extant cpus. For clarity, let the translator
match the manual but otherwise leave them on the same feature bit.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rearrange PDIST so that do_dddd is general purpose and may
be re-used for FMADDd etc. Add pickNaN and pickNaNMulAdd.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Form the proper register decoding from the start.
Because we're removing the translation from the inner-most
gen_load_fpr_* and gen_store_fpr_* routines, this must be
done for all insns at once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This operation returns the high 16 bits of a 24-bit multiply
that has been sign-extended to 32 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Follow the Oracle Sparc 2015 implementation note and bound
the input value of N to 5 from the lower 3 bits of rs2.
Spell out all of the intermediate values, matching the diagram
in the manual. Fix extraction of upper_x and upper_y for N=0.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* virtio-blk: remove SCSI passthrough functionality
* require x86-64-v2 baseline ISA
* SEV-SNP host support
* fix xsave.flat with TCG
* fixes for CPUID checks done by TCG
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZgKVYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPKYgf/QkWrNXdjjD3yAsv5LbJFVTVyCYW3
# b4Iax29kEDy8k9wbzfLxOfIk9jXIjmbOMO5ZN9LFiHK6VJxbXslsMh6hm50M3xKe
# 49X1Rvf9YuVA7KZX+dWkEuqLYI6Tlgj3HaCilYWfXrjyo6hY3CxzkPV/ChmaeYlV
# Ad4Y8biifoUuuEK8OTeTlcDWLhOHlFXylG3AXqULsUsXp0XhWJ9juXQ60eATv/W4
# eCEH7CSmRhYFu2/rV+IrWFYMnskLRTk1OC1/m6yXGPKOzgnOcthuvQfiUgPkbR/d
# llY6Ni5Aaf7+XX3S7Avcyvoq8jXzaaMzOrzL98rxYGDR1sYBYO+4h4ZToA==
# =qQeP
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 02:01:10 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits)
hw/i386: Add support for loading BIOS using guest_memfd
hw/i386/sev: Use guest_memfd for legacy ROMs
memory: Introduce memory_region_init_ram_guest_memfd()
i386/sev: Allow measured direct kernel boot on SNP
i386/sev: Reorder struct declarations
i386/sev: Extract build_kernel_loader_hashes
i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
i386/sev: Invoke launch_updata_data() for SNP class
i386/sev: Invoke launch_updata_data() for SEV class
hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
i386/sev: Add support for SNP CPUID validation
i386/sev: Add support for populating OVMF metadata pages
hw/i386/sev: Add function to get SEV metadata from OVMF header
i386/sev: Set CPU state to protected once SNP guest payload is finalized
i386/sev: Add handling to encrypt/finalize guest launch data
i386/sev: Add the SNP launch start context
i386/sev: Update query-sev QAPI format to handle SEV-SNP
i386/sev: Add a class method to determine KVM VM type for SNP guests
i386/sev: Don't return launch measurements for SEV-SNP guests
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When guest_memfd is enabled, the BIOS is generally part of the initial
encrypted guest image and will be accessed as private guest memory. Add
the necessary changes to set up the associated RAM region with a
guest_memfd backend to allow for this.
Current support centers around using -bios to load the BIOS data.
Support for loading the BIOS via pflash requires additional enablement
since those interfaces rely on the use of ROM memory regions which make
use of the KVM_MEM_READONLY memslot flag, which is not supported for
guest_memfd-backed memslots.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-29-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current SNP guest kernels will attempt to access these regions with
with C-bit set, so guest_memfd is needed to handle that. Otherwise,
kvm_convert_memory() will fail when the guest kernel tries to access it
and QEMU attempts to call KVM_SET_MEMORY_ATTRIBUTES to set these ranges
to private.
Whether guests should actually try to access ROM regions in this way (or
need to deal with legacy ROM regions at all), is a separate issue to be
addressed on kernel side, but current SNP guest kernels will exhibit
this behavior and so this handling is needed to allow QEMU to continue
running existing SNP guest kernels.
Signed-off-by: Michael Roth <michael.roth@amd.com>
[pankaj: Added sev_snp_enabled() check]
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-28-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In SNP, the hashes page designated with a specific metadata entry
published in AmdSev OVMF.
Therefore, if the user enabled kernel hashes (for measured direct boot),
QEMU should prepare the content of hashes table, and during the
processing of the metadata entry it copy the content into the designated
page and encrypt it.
Note that in SNP (unlike SEV and SEV-ES) the measurements is done in
whole 4KB pages. Therefore QEMU zeros the whole page that includes the
hashes table, and fills in the kernel hashes area in that page, and then
encrypts the whole page. The rest of the page is reserved for SEV
launch secrets which are not usable anyway on SNP.
If the user disabled kernel hashes, QEMU pre-validates the kernel hashes
page as a zero page.
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-24-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_HC_MAP_GPA_RANGE will be used to send requests to userspace for
private/shared memory attribute updates requested by the guest.
Implement handling for that use-case along with some basic
infrastructure for enabling specific hypercall events.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-31-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV-SNP firmware allows a special guest page to be populated with a
table of guest CPUID values so that they can be validated through
firmware before being loaded into encrypted guest memory where they can
be used in place of hypervisor-provided values[1].
As part of SEV-SNP guest initialization, use this interface to validate
the CPUID entries reported by KVM_GET_CPUID2 prior to initial guest
start and populate the CPUID page reserved by OVMF with the resulting
encrypted data.
[1] SEV SNP Firmware ABI Specification, Rev. 0.8, 8.13.2.6
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-21-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A recent version of OVMF expanded the reset vector GUID list to add
SEV-specific metadata GUID. The SEV metadata describes the reserved
memory regions such as the secrets and CPUID page used during the SEV-SNP
guest launch.
The pc_system_get_ovmf_sev_metadata_ptr() is used to retieve the SEV
metadata pointer from the OVMF GUID list.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-19-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Once KVM_SNP_LAUNCH_FINISH is called the vCPU state is copied into the
vCPU's VMSA page and measured/encrypted. Any attempt to read/write CPU
state afterward will only be acting on the initial data and so are
effectively no-ops.
Set the vCPU state to protected at this point so that QEMU don't
continue trying to re-sync vCPU data during guest runtime.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-18-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Most of the current 'query-sev' command is relevant to both legacy
SEV/SEV-ES guests and SEV-SNP guests, with 2 exceptions:
- 'policy' is a 64-bit field for SEV-SNP, not 32-bit, and
the meaning of the bit positions has changed
- 'handle' is not relevant to SEV-SNP
To address this, this patch adds a new 'sev-type' field that can be
used as a discriminator to select between SEV and SEV-SNP-specific
fields/formats without breaking compatibility for existing management
tools (so long as management tools that add support for launching
SEV-SNP guest update their handling of query-sev appropriately).
The corresponding HMP command has also been fixed up similarly.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by:Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-15-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV guests can use either KVM_X86_DEFAULT_VM, KVM_X86_SEV_VM,
or KVM_X86_SEV_ES_VM depending on the configuration and what
the host kernel supports. SNP guests on the other hand can only
ever use KVM_X86_SNP_VM, so split determination of VM type out
into a separate class method that can be set accordingly for
sev-guest vs. sev-snp-guest objects and add handling for SNP.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-14-pankaj.gupta@amd.com>
[Remove unnecessary function pointer declaration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a simple helper to check if the current guest type is SNP. Also have
SNP-enabled imply that SEV-ES is enabled as well, and fix up any places
where the sev_es_enabled() check is expecting a pure/non-SNP guest.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-9-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV-SNP support relies on a different set of properties/state than the
existing 'sev-guest' object. This patch introduces the 'sev-snp-guest'
object, which can be used to configure an SEV-SNP guest. For example,
a default-configured SEV-SNP guest with no additional information
passed in for use with attestation:
-object sev-snp-guest,id=sev0
or a fully-specified SEV-SNP guest where all spec-defined binary
blobs are passed in as base64-encoded strings:
-object sev-snp-guest,id=sev0, \
policy=0x30000, \
init-flags=0, \
id-block=YWFhYWFhYWFhYWFhYWFhCg==, \
id-auth=CxHK/OKLkXGn/KpAC7Wl1FSiisWDbGTEKz..., \
author-key-enabled=on, \
host-data=LNkCWBRC5CcdGXirbNUV1OrsR28s..., \
guest-visible-workarounds=AA==, \
See the QAPI schema updates included in this patch for more usage
details.
In some cases these blobs may be up to 4096 characters, but this is
generally well below the default limit for linux hosts where
command-line sizes are defined by the sysconf-configurable ARG_MAX
value, which defaults to 2097152 characters for Ubuntu hosts, for
example.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (for QAPI schema)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-8-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When sev-snp-guest objects are introduced there will be a number of
differences in how the launch finish is handled compared to the existing
sev-guest object. Move sev_launch_finish() to a class method to make it
easier to implement SNP-specific launch update functionality later.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-7-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When sev-snp-guest objects are introduced there will be a number of
differences in how the launch data is handled compared to the existing
sev-guest object. Move sev_launch_start() to a class method to make it
easier to implement SNP-specific launch update functionality later.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-6-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently all SEV/SEV-ES functionality is managed through a single
'sev-guest' QOM type. With upcoming support for SEV-SNP, taking this
same approach won't work well since some of the properties/state
managed by 'sev-guest' is not applicable to SEV-SNP, which will instead
rely on a new QOM type with its own set of properties/state.
To prepare for this, this patch moves common state into an abstract
'sev-common' parent type to encapsulate properties/state that are
common to both SEV/SEV-ES and SEV-SNP, leaving only SEV/SEV-ES-specific
properties/state in the current 'sev-guest' type. This should not
affect current behavior or command-line options.
As part of this patch, some related changes are also made:
- a static 'sev_guest' variable is currently used to keep track of
the 'sev-guest' instance. SEV-SNP would similarly introduce an
'sev_snp_guest' static variable. But these instances are now
available via qdev_get_machine()->cgs, so switch to using that
instead and drop the static variable.
- 'sev_guest' is currently used as the name for the static variable
holding a pointer to the 'sev-guest' instance. Re-purpose the name
as a local variable referring the 'sev-guest' instance, and use
that consistently throughout the code so it can be easily
distinguished from sev-common/sev-snp-guest instances.
- 'sev' is generally used as the name for local variables holding a
pointer to the 'sev-guest' instance. In cases where that now points
to common state, use the name 'sev_common'; in cases where that now
points to state specific to 'sev-guest' instance, use the name
'sev_guest'
In order to enable kernel-hashes for SNP, pull it from
SevGuestProperties to its parent SevCommonProperties so
it will be available for both SEV and SNP.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-5-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ask the ConfidentialGuestSupport object whether to use guest_memfd
for KVM-backend private memory. This bool can be set in instance_init
(or user_complete) so that it is available when the machine is created.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now QEMU is importing arch/x86/include/uapi/asm/kvm_para.h
because it includes definitions for kvmclock and for KVM CPUID
bits. However, other definitions for KVM hypercall values and return
codes are included in include/uapi/linux/kvm_para.h and they will be
used by SEV-SNP.
To ensure that it is possible to include both <linux/kvm_para.h> and
"standard-headers/asm-x86/kvm_para.h" without conflicts, provide
linux/kvm_para.h as a portable header too, and forward linux-headers/
files to those in include/standard-headers. Note that <linux/kvm_para.h>
will include architecture-specific definitions as well, but
"standard-headers/linux/kvm_para.h" will not because it can be used in
architecture-independent files.
This could easily be extended to other architectures, but right now
they do not need any symbol in their specific kvm_para.h files.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This updates kernel headers to commit 6f627b425378 ("KVM: SVM: Add module
parameter to enable SEV-SNP", 2024-05-12). The SNP host patches will
be included in Linux 6.11, to be released next July.
Also brings in an linux-headers/linux/vhost.h fix from v6.9-rc4.
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-3-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux has <misc/pvpanic.h>, not <linux/pvpanic.h>. Use the same
directory for QEMU's include/standard-headers/ copy.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Afer commit 3efc75ad9d ("scripts/update-linux-headers.sh: Remove
temporary directory inbetween", 2024-05-29), updating linux-headers/
results in errors such as
cp: cannot stat '/tmp/tmp.1A1Eejh1UE/headers/include/asm/bitsperlong.h': No such file or directory
because Loongarch does not have an asm/bitsperlong.h file and uses the
generic version. Before commit 3efc75ad9d, the missing file would
incorrectly cause stale files to be included in linux-headers/. The files
were never committed to qemu.git, but were wrong nevertheless. The build
would just use the system version of the files, which is opposite to
the idea of importing Linux header files into QEMU's tree.
Create forwarding headers, resembling the ones that are generated during a
kernel build by scripts/Makefile.asm-generic, if a file is only installed
under include/asm-generic/.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
xsave.flat checks that "executing the XSETBV instruction causes a general-
protection fault (#GP) if ECX = 0 and EAX[2:1] has the value 10b". QEMU allows
that option, so the test fails. Add the condition.
Cc: qemu-stable@nongnu.org
Fixes: 892544317f ("target/i386: implement XSAVE and XRSTOR of AVX registers", 2022-10-18)
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This commit fixes an issue with MOV instructions (0x8C and 0x8E)
involving segment registers; MOV to segment register's source is
16-bit, while MOV from segment register has to explicitly set the
memory operand size to 16 bits. Introduce a new flag
X86_SPECIAL_Op0_Mw to handle this specification correctly.
Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn>
Message-ID: <20240602100528.2135717-1-lixinyu20s@ict.ac.cn>
Fixes: 5e9e21bcc4 ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has the POPCNT instruction.
Use it freely in TCG-generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has SSSE3 instructions
(notably, PSHUFB which is used by QEMU's AES implementation).
Do not bother checking it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has SSE2.
Use it freely in buffer_is_zero.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which always has CMOV.
Use it freely in TCG generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86-64-v2 processors were released in 2008, assume that we have one.
Unfortunately there is no GCC flag to enable all the features
without disabling what came after; so enable them one by one.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The only user was the SSE4.1 variant of buffer_is_zero, which has
been removed; code to compute CPUINFO_SSE4 is dead.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The legacy SCSI passthrough functionality has never been enabled for
VIRTIO 1.0 and was deprecated more than four years ago.
Get rid of it---almost, because QEMU is advertising it unconditionally
for legacy virtio-blk devices. Just parse the header and return a
nonzero status.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmZewo4ACgkQ7wSWWzmN
# YhHhxgf/ZaECxru4fP8wi34XdSG/PR+BF+W5M9gZIRGrHg3vIf3/LRTpZTDccbRN
# Qpwtypr9O6/AWG9Os80rn7alsmMDxN8PDDNLa9T3wf5pJUQSyQ87Yy0MiuTNPSKD
# HKYUIfIlbFCM5WUW4huMmg98gKTgnzZMqOoRyMFZitbkR59qCm+Exws4HtXvCH68
# 3k4lgvnFccmzO9iIzaOUIPs+Yf04Kw/FrY0Q/6nypvqbF2W80Md6w02JMQuTLwdF
# Guxeg/n6g0NLvCBbkjiM2VWfTaWJYbwFSwRTAMxM/geqh7qAgGsmD0N5lPlgqRDy
# uAy2GvFyrwzcD0lYqf0/fRK0Go0HPA==
# =J70K
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 04 Jun 2024 02:30:22 AM CDT
# gpg: using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
ebpf: Added traces back. Changed source set for eBPF to 'system'.
virtio-net: drop too short packets early
ebpf: Add a separate target for skeleton
ebpf: Refactor tun_rss_steering_prog()
ebpf: Return 0 when configuration fails
ebpf: Fix RSS error handling
virtio-net: Do not write hashes to peer buffer
virtio-net: Always set populate_hash
virtio-net: Unify the logic to update NIC state for RSS
virtio-net: Disable RSS on reset
virtio-net: Shrink header byte swapping buffer
virtio-net: Copy header only when necessary
virtio-net: Add only one queue pair when realizing
virtio-net: Do not propagate ebpf-rss-fds errors
tap: Shrink zeroed virtio-net header
tap: Call tap_receive_iov() from tap_receive()
net: Remove receive_raw()
net: Move virtio-net header length assertion
tap: Remove qemu_using_vnet_hdr()
tap: Remove tap_probe_vnet_hdr_len()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The 'blacklist' argument / config key are deprecated since commit
582a098e6c ("qga: Replace 'blacklist' command line and config file
options by 'block-rpcs'"), time to remove them.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-Id: <20240530070413.19181-1-philmd@linaro.org>
In fdf029762f we factored out the handling of reading and writing
DMA descriptors from guest memory. Unfortunately we accidentally
made the descriptor-read read the descriptor into the address of the
buffer rather than into the buffer, because we didn't notice we
needed to update the arguments to the dma_memory_read() call. Before
the refactoring, "&desc" is the address of a local struct DPDMADescriptor
variable in xlnx_dpdma_start_operation(), which is the correct target
for the guest-memory-read. But after the refactoring 'desc' is the
"DPDMADescriptor *desc" argument to the new function, and so it is
already an address.
This bug is an overrun of a stack variable, since a pointer is at
most 8 bytes long and we try to read 64 bytes, as well as being
incorrect behaviour.
Pass 'desc' rather than '&desc' as the dma_memory_read() argument
to fix this.
(The same bug is not present in xlnx_dpdma_write_descriptor(),
because there we are writing the descriptor from a local struct
variable "DPDMADescriptor tmp_desc" and so passing &tmp_desc to
dma_memory_write() is correct.)
Spotted by Coverity: CID 1546649
Fixes: fdf029762f ("xlnx_dpdma: fix descriptor endianness bug")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240531124628.476938-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Directly calling exit() prevents any kind of management or handling.
Instead use the corresponding runstate API.
The default behavior of the runstate API is the same as exit().
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240523-debugexit-v1-1-d52fcaf7bf8b@t-8ch.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Align the framebuffer backend with the other legacy ones,
register it via xen_backend_init() when '-vga xenfb' is
used. It is safe because MODULE_INIT_XEN_BACKEND is called
in xen_bus_realize(), long after CLI processing initialized
the vga_interface_type variable.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20240510104908.76908-8-philmd@linaro.org>
For xen, when checking for the first RAM (xen_memory), use
xen_mr_is_memory() rather than checking for a RAMBlock with
offset 0.
All Xen machines create xen_memory first so this has no
functional change for existing machines.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240529140739.1387692-6-edgar.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Originally I tried to move where vCPU thread initialisation to later
in realize. However pulling that thread (sic) got gnarly really
quickly. It turns out some steps of CPU realization need values that
can only be determined from the running vCPU thread.
However having moved enough out of the thread creation we can now
queue work before the thread starts (at least for TCG guests) and
avoid the race between vcpu_init and other vcpu states a plugin might
subscribe to.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240530194250.1801701-6-alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Aside from the round robin threads this is all common code. By
moving the halt_cond setup we also no longer need hacks to work around
the race between QOM object creation and thread creation.
It is a little ugly to free stuff up for the round robin thread but
better it deal with its own specialises than making the other
accelerators jump through hoops.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240530194250.1801701-3-alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
There was an issue with Qemu build with "--disable-system".
The traces could be generated and the build fails.
The traces were 'cut out' for previous patches, and overall,
the 'system' source set should be used like in pre-'eBPF blob' patches.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reproducer from https://gitlab.com/qemu-project/qemu/-/issues/1451
creates small packet (1 segment, len = 10 == n->guest_hdr_len),
then destroys queue.
"if (n->host_hdr_len != n->guest_hdr_len)" is triggered, if body creates
zero length/zero segment packet as there is nothing after guest header.
qemu_sendv_packet_async() tries to send it.
slirp discards it because it is smaller than Ethernet header,
but returns 0 because tx hooks are supposed to return total length of data.
0 is propagated upwards and is interpreted as "packet has been sent"
which is terrible because queue is being destroyed, nobody is waiting for TX
to complete and assert it triggered.
Fix is discard such empty packets instead of sending them.
Length 1 packets will go via different codepath:
virtqueue_push(q->tx_vq, elem, 0);
virtio_notify(vdev, q->tx_vq);
g_free(elem);
and aren't problematic.
Signed-off-by: Alexey Dobriyan <adobriyan@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This generalizes the rule to generate the skeleton and allows to add
another.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The kernel interprets the returned value as an unsigned 32-bit so -1
will mean queue 4294967295, which is awkward. Return 0 instead.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
calculate_rss_hash() was using hash value 0 to tell if it calculated
a hash, but the hash value may be 0 on a rare occasion. Have a
distinct bool value for correctness.
Fixes: f3fa412de2 ("ebpf: Added eBPF RSS program.")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The peer buffer is qualified with const and not meant to be modified.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The code to attach or detach the eBPF program to RSS were duplicated so
unify them into one function to save some code.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Byte swapping is only performed for the part of header shared with the
legacy standard and the buffer only needs to cover it.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Multiqueue usage is not negotiated yet when realizing. If more than
one queue is added and the guest never requests to enable multiqueue,
the extra queues will not be deleted when unrealizing and leak.
Fixes: f9d6dbf0bf ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Propagating ebpf-rss-fds errors has several problems.
First, it makes device realization fail and disables the fallback to the
conventional eBPF loading.
Second, it leaks memory by making device realization fail without
freeing memory already allocated.
Third, the convention is to set an error when a function returns false,
but virtio_net_load_ebpf_fds() and virtio_net_load_ebpf() returns false
without setting an error, which is confusing.
Remove the propagation to fix these problems.
Fixes: 0524ea0510 ("ebpf: Added eBPF initialization by fds.")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
tap prepends a zeroed virtio-net header when writing a packet to a
tap with virtio-net header enabled but not in use. This only happens
when s->host_vnet_hdr_len == sizeof(struct virtio_net_hdr).
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
While netmap implements virtio-net header, it does not implement
receive_raw(). Instead of implementing receive_raw for netmap, add
virtio-net headers in the common code and use receive_iov()/receive()
instead. This also fixes the buffer size for the virtio-net header.
Fixes: fbbdbddec0 ("tap: allow extended virtio header with hash info")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The virtio-net header length assertion should happen for any clients.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since qemu_set_vnet_hdr_len() is always called when
qemu_using_vnet_hdr() is called, we can merge them and save some code.
For consistency, express that the virtio-net header is not in use by
returning 0 with qemu_get_vnet_hdr_len() instead of having a dedicated
function, qemu_get_using_vnet_hdr().
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It was necessary since an Linux older than 2.6.35 may implement the
virtio-net header but may not allow to change its length. Remove it
since such an old Linux is no longer supported.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
RISC-V PR for 9.1
* APLICs add child earlier than realize
* Fix exposure of Zkr
* Raise exceptions on wrs.nto
* Implement SBI debug console (DBCN) calls for KVM
* Support 64-bit addresses for initrd
* Change RISCV_EXCP_SEMIHOST exception number to 63
* Tolerate KVM disable ext errors
* Set tval in breakpoints
* Add support for Zve32x extension
* Add support for Zve64x extension
* Relax vector register check in RISCV gdbstub
* Fix the element agnostic Vector function problem
* Fix Zvkb extension config
* Implement dynamic establishment of custom decoder
* Add th.sxstatus CSR emulation
* Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
* Check single width operator for vector fp widen instructions
* Check single width operator for vfncvt.rod.f.f.w
* Remove redudant SEW checking for vector fp narrow/widen instructions
* Prioritize pmp errors in raise_mmu_exception()
* Do not set mtval2 for non guest-page faults
* Remove experimental prefix from "B" extension
* Fixup CBO extension register calculation
* Fix the hart bit setting of AIA
* Fix reg_width in ricsv_gen_dynamic_vector_feature()
* Decode all of the pmpcfg and pmpaddr CSRs
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmZdVzcACgkQr3yVEwxT
# gBPxSBAAsuzhDCbaOl9jXhIL6Q0IDHULz4U16AZypHYID7T6rDaNoRmNVdqBKZuM
# IMby8qm5XFmcUGM9itcM7IKV2BNHuWSye3/Y7GOYZQyToR7U6lvLpAm4pNj4AgTC
# PLV2VPt1XLZRSthkgwp6ylBXzdNSiZMWggqTb7QbyfR5hJfG+VsZjTGaIwyZbtKI
# +CJG6gZSPv6JGNtwnJq+v0VBEkj1ryo/gg2EAAzA+EWU4nw5mJCLWoDLrYZalTv9
# vCTqJuMViTjeHqAm/IIMoFzYR94+ug0usqcmnx/E7ALTOsmBh5K+KWndAW4vqAlP
# mZOONfr3h7zc81jThC961kjGVPiTjTGbHHlKwlB2JEggwctcVqGRyWeM9wHSUr2W
# S6F56hpForzVW9IkCt/fDUxamr23303s5miIsronrwiihqkNpxKYAuqPTXFGkFKg
# ilBLGcbHcWxNmjpfIEXnTjDB6qFEceWqbjJejrsKusoSPkKQm0ktIZZUwCbTsu45
# 0ScYrBieUPjDWDFYlmWrr5byekyCXCzfpBgq8qo60FA+aP29Nx+GlFR0eWTXXY4V
# O5/WTKjQM4+/uNYIuFDCFPV1Ja5GERDhXoNkjkY5ErsSZL2c2UEp3UTxzbEl5dOm
# NRH7C26Z/xVMDwT08kDDq0t8Rkz4836txPO7y+aPbtvGfENRI8E=
# =mtVb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Jun 2024 12:40:07 AM CDT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240603' of https://github.com/alistair23/qemu: (27 commits)
disas/riscv: Decode all of the pmpcfg and pmpaddr CSRs
riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()
target/riscv/kvm.c: Fix the hart bit setting of AIA
target/riscv: rvzicbo: Fixup CBO extension register calculation
target/riscv: Remove experimental prefix from "B" extension
target/riscv: do not set mtval2 for non guest-page faults
target/riscv: prioritize pmp errors in raise_mmu_exception()
target/riscv: rvv: Remove redudant SEW checking for vector fp narrow/widen instructions
target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
target/riscv: rvv: Check single width operator for vector fp widen instructions
target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
riscv: thead: Add th.sxstatus CSR emulation
target/riscv: Implement dynamic establishment of custom decoder
target/riscv/cpu.c: fix Zvkb extension config
target/riscv: Fix the element agnostic function problem
target/riscv: Relax vector register check in RISCV gdbstub
target/riscv: Add support for Zve64x extension
target/riscv: Add support for Zve32x extension
trans_privileged.c.inc: set (m|s)tval on ebreak breakpoint
target/riscv/debug: set tval=pc in breakpoint exceptions
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prevent regressions when using NBD with TLS in the presence of
iothreads, adding coverage the fix to qio channels made in the
previous patch.
The shell function pick_unused_port() was copied from
nbdkit.git/tests/functions.sh.in, where it had all authors from Red
Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2.
CC: qemu-stable@nongnu.org
CC: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240531180639.1392905-6-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
hw/ufs patches
- Add support MCQ of UFSHCI 4.0
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEUBfYMVl8eKPZB+73EuIgTA5dtgIFAmZdf7EACgkQEuIgTA5d
# tgLI8Q//SXqscGTP/FrjaUj0SsQE90E0AEEfIX4juVP8e1FiFcM9x6tmEU/CT5CM
# BQYk1zn0d3JY09ClIyr+AlVyy5RYMdlL/LyhGElxU0MPZh6u4X/6QJ/oHcxAx96r
# sRGSjJp5k2maHXgjQmqVUFkp22SZGt5vpD+AQT83wvuWshL302b3MDJ8B3f9zX90
# mCcwSmk+JgSjceaXueuBAbOJud6Ie3jAqXf4w8Gv21nLwzRmDBacjfn5LGSVzQxd
# BLkADuwRcRkxQ9hpoCPWOdKvXCAXtTIYqw8BRCG7Avl478UDI+CrYNjvK62SystM
# el2ql5ZvGjL8w+k7aQxqJi44RyQH3k1NJwvss3pdRyJzwL9x9pVRVlsM6tQbW56u
# COVexJnKDoufWmQs7o6rOv5OzexC6cD6yEoH2mi60F35jO8j3skJi1od7ehHwbzm
# 2a6dt1glBKvRWgfcLgEGuFERji3vV++9T6bAg8To0GYTryZKZzLKMSbIHvYI/49t
# u4ZwqhYkp36gpcQ8eA7Byr5FWd19UzEin6sVw3uCYibr6oONFTjp+XXnFz/LK9Fu
# XbrSOe8943FCUs9dCBHHtJFLmw4j1Ck60GpnogFkzjPEhlKDHO8+4/lm1gFgPV2h
# K2wqmq5kw8JtTKvHSDGa4iGTfJ/zxMOv1ePc3wiulwSH28kl6r8=
# =dQMQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Jun 2024 03:32:49 AM CDT
# gpg: using RSA key 5017D831597C78A3D907EEF712E2204C0E5DB602
# gpg: Good signature from "Jeuk Kim <jeuk20.kim@samsung.com>" [unknown]
# gpg: aka "Jeuk Kim <jeuk20.kim@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5017 D831 597C 78A3 D907 EEF7 12E2 204C 0E5D B602
* tag 'pull-ufs-20240603' of https://gitlab.com/jeuk20.kim/qemu:
hw/ufs: Add support MCQ of UFSHCI 4.0
hw/ufs: Update MCQ-related fields to block/ufs.h
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch adds support for MCQ defined in UFSHCI 4.0. This patch
utilized the legacy I/O codes as much as possible to support MCQ.
MCQ operation & runtime register is placed at 0x1000 offset of UFSHCI
register statically with no spare space among four registers (48B):
UfsMcqSqReg, UfsMcqSqIntReg, UfsMcqCqReg, UfsMcqCqIntReg
The maxinum number of queue is 32 as per spec, and the default
MAC(Multiple Active Commands) are 32 in the device.
Example:
-device ufs,serial=foo,id=ufs0,mcq=true,mcq-maxq=8
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Jeuk Kim <jeuk20.kim@samsung.com>
Message-Id: <20240528023106.856777-3-minwoo.im@samsung.com>
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
In AIA spec, each hart (or each hart within a group) has a unique hart
number to locate the memory pages of interrupt files in the address
space. The number of bits required to represent any hart number is equal
to ceil(log2(hmax + 1)), where hmax is the largest hart number among
groups.
However, if the largest hart number among groups is a power of 2, QEMU
will pass an inaccurate hart-index-bit setting to Linux. For example, when
the guest OS has 4 harts, only ceil(log2(3 + 1)) = 2 bits are sufficient
to represent 4 harts, but we passes 3 to Linux. The code needs to be
updated to ensure accurate hart-index-bit settings.
Additionally, a Linux patch[1] is necessary to correctly recover the hart
index when the guest OS has only 1 hart, where the hart-index-bit is 0.
[1] https://lore.kernel.org/lkml/20240415064905.25184-1-yongxuan.wang@sifive.com/t/
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240515091129.28116-1-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When running the instruction
```
cbo.flush 0(x0)
```
QEMU would segfault.
The issue was in cpu_gpr[a->rs1] as QEMU does not have cpu_gpr[0]
allocated.
In order to fix this let's use the existing get_address()
helper. This also has the benefit of performing pointer mask
calculations on the address specified in rs1.
The pointer masking specificiation specifically states:
"""
Cache Management Operations: All instructions in Zicbom, Zicbop and Zicboz
"""
So this is the correct behaviour and we previously have been incorrectly
not masking the address.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reported-by: Fabian Thomas <fabian.thomas@cispa.de>
Fixes: e05da09b7c ("target/riscv: implement Zicbom extension")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240514023910.301766-1-alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Previous patch fixed the PMP priority in raise_mmu_exception() but we're still
setting mtval2 incorrectly. In riscv_cpu_tlb_fill(), after pmp check in 2 stage
translation part, mtval2 will be set in case of successes 2 stage translation but
failed pmp check.
In this case we gonna set mtval2 via env->guest_phys_fault_addr in context of
riscv_cpu_tlb_fill(), as this was a guest-page-fault, but it didn't and mtval2
should be zero, according to RISCV privileged spec sect. 9.4.4: When a guest
page-fault is taken into M-mode, mtval2 is written with either zero or guest
physical address that faulted, shifted by 2 bits. *For other traps, mtval2
is set to zero...*
Signed-off-by: Alexei Filippov <alexei.filippov@syntacore.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240503103052.6819-1-alexei.filippov@syntacore.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
raise_mmu_exception(), as is today, is prioritizing guest page faults by
checking first if virt_enabled && !first_stage, and then considering the
regular inst/load/store faults.
There's no mention in the spec about guest page fault being a higher
priority that PMP faults. In fact, privileged spec section 3.7.1 says:
"Attempting to fetch an instruction from a PMP region that does not have
execute permissions raises an instruction access-fault exception.
Attempting to execute a load or load-reserved instruction which accesses
a physical address within a PMP region without read permissions raises a
load access-fault exception. Attempting to execute a store,
store-conditional, or AMO instruction which accesses a physical address
within a PMP region without write permissions raises a store
access-fault exception."
So, in fact, we're doing it wrong - PMP faults should always be thrown,
regardless of also being a first or second stage fault.
The way riscv_cpu_tlb_fill() and get_physical_address() work is
adequate: a TRANSLATE_PMP_FAIL error is immediately reported and
reflected in the 'pmp_violation' flag. What we need is to change
raise_mmu_exception() to prioritize it.
Reported-by: Joseph Chan <jchan@ventanamicro.com>
Fixes: 82d53adfbb ("target/riscv/cpu_helper.c: Invalid exception on MMU translation stage")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240413105929.7030-1-alexei.filippov@syntacore.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The require_scale_rvf function only checks the double width operator for
the vector floating point widen instructions, so most of the widen
checking functions need to add require_rvf for single width operator.
The vfwcvt.f.x.v and vfwcvt.f.xu.v instructions convert single width
integer to double width float, so the opfxv_widen_check function doesn’t
need require_rvf for the single width operator(integer).
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-3-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
According v spec 18.4, only the vfwcvt.f.f.v and vfncvt.f.f.w
instructions will be affected by Zvfhmin extension.
And the vfwcvt.f.f.v and vfncvt.f.f.w instructions only support the
conversions of
* From 1*SEW(16/32) to 2*SEW(32/64)
* From 2*SEW(32/64) to 1*SEW(16/32)
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-2-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
RISCVCPU, each cpu can have their own decoder, and the decoders can be
different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
can be added to the dynamic_decoder when building up the decoder. Therefore,
there is no need to run the guard_func when decoding each instruction. It can
improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
functions to improve decoding efficiency, especially when vendor-defined
instruction sets increase. Because of dynamic building up, it can skip the other
decoder guard functions when decoding.
4. Pre patch for allowing adding a vendor decoder before decode_insn32() with minimal
overhead for users that don't need this particular vendor decoder.
Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
Suggested-by: Christoph Muellner <christoph.muellner@vrull.eu>
Co-authored-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240506023607.29544-1-eric.huang@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
In current implementation, the gdbstub allows reading vector registers
only if V extension is supported. However, all vector extensions and
vector crypto extensions have the vector registers and they all depend
on Zve32x. The gdbstub should check for Zve32x instead.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Message-ID: <20240328022343.6871-4-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Privileged spec section 4.1.9 mentions:
"When a trap is taken into S-mode, stval is written with
exception-specific information to assist software in handling the trap.
(...)
If stval is written with a nonzero value when a breakpoint,
address-misaligned, access-fault, or page-fault exception occurs on an
instruction fetch, load, or store, then stval will contain the faulting
virtual address."
A similar text is found for mtval in section 3.1.16.
Setting mtval/stval in this scenario is optional, but some softwares read
these regs when handling ebreaks.
Write 'badaddr' in all ebreak breakpoints to write the appropriate
'tval' during riscv_do_cpu_interrrupt().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240416230437.1869024-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're not setting (s/m)tval when triggering breakpoints of type 2
(mcontrol) and 6 (mcontrol6). According to the debug spec section
5.7.12, "Match Control Type 6":
"The Privileged Spec says that breakpoint exceptions that occur on
instruction fetches, loads, or stores update the tval CSR with either
zero or the faulting virtual address. The faulting virtual address for
an mcontrol6 trigger with action = 0 is the address being accessed and
which caused that trigger to fire."
A similar text is also found in the Debug spec section 5.7.11 w.r.t.
mcontrol.
Note that what we're doing ATM is not violating the spec, but it's
simple enough to set mtval/stval and it makes life easier for any
software that relies on this info.
Given that we always use action = 0, save the faulting address for the
mcontrol and mcontrol6 trigger breakpoints into env->badaddr, which is
used as as scratch area for traps with address information. 'tval' is
then set during riscv_cpu_do_interrupt().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-ID: <20240416230437.1869024-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Running a KVM guest using a 6.9-rc3 kernel, in a 6.8 host that has zkr
enabled, will fail with a kernel oops SIGILL right at the start. The
reason is that we can't expose zkr without implementing the SEED CSR.
Disabling zkr in the guest would be a workaround, but if the KVM doesn't
allow it we'll error out and never boot.
In hindsight this is too strict. If we keep proceeding, despite not
disabling the extension in the KVM vcpu, we'll not add the extension in
the riscv,isa. The guest kernel will be unaware of the extension, i.e.
it doesn't matter if the KVM vcpu has it enabled underneath or not. So
it's ok to keep booting in this case.
Change our current logic to not error out if we fail to disable an
extension in kvm_set_one_reg(), but show a warning and keep booting. It
is important to throw a warning because we must make the user aware that
the extension is still available in the vcpu, meaning that an
ill-behaved guest can ignore the riscv,isa settings and use the
extension.
The case we're handling happens with an EINVAL error code. If we fail to
disable the extension in KVM for any other reason, error out.
We'll also keep erroring out when we fail to enable an extension in KVM,
since adding the extension in riscv,isa at this point will cause a guest
malfunction because the extension isn't enabled in the vcpu.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240422171425.333037-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The current semihost exception number (16) is a reserved number (range
[16-17]). The upcoming double trap specification uses that number for
the double trap exception. Since the privileged spec (Table 22) defines
ranges for custom uses change the semihosting exception number to 63
which belongs to the range [48-63] in order to avoid any future
collisions with reserved exception.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240422135840.1959967-1-cleger@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
SBI defines a Debug Console extension "DBCN" that will, in time, replace
the legacy console putchar and getchar SBI extensions.
The appeal of the DBCN extension is that it allows multiple bytes to be
read/written in the SBI console in a single SBI call.
As far as KVM goes, the DBCN calls are forwarded by an in-kernel KVM
module to userspace. But this will only happens if the KVM module
actually supports this SBI extension and we activate it.
We'll check for DBCN support during init time, checking if get-reg-list
is advertising KVM_RISCV_SBI_EXT_DBCN. In that case, we'll enable it via
kvm_set_one_reg() during kvm_arch_init_vcpu().
Finally, change kvm_riscv_handle_sbi() to handle the incoming calls for
SBI_EXT_DBCN, reading and writing as required.
A simple KVM guest with 'earlycon=sbi', running in an emulated RISC-V
host, takes around 20 seconds to boot without using DBCN. With this
patch we're taking around 14 seconds to boot due to the speed-up in the
terminal output. There's no change in boot time if the guest isn't
using earlycon.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20240425155012.581366-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Implementing wrs.nto to always just return is consistent with the
specification, as the instruction is permitted to terminate the
stall for any reason, but it's not useful for virtualization, where
we'd like the guest to trap to the hypervisor in order to allow
scheduling of the lock holding VCPU. Change to always immediately
raise exceptions when the appropriate conditions are present,
otherwise continue to just return. Note, immediately raising
exceptions is also consistent with the specification since the
time limit that should expire prior to the exception is
implementation-specific.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240424142808.62936-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The Zkr extension may only be exposed to KVM guests if the VMM
implements the SEED CSR. Use the same implementation as TCG.
Without this patch, running with a KVM which does not forward the
SEED CSR access to QEMU will result in an ILL exception being
injected into the guest (this results in Linux guests crashing on
boot). And, when running with a KVM which does forward the access,
QEMU will crash, since QEMU doesn't know what to do with the exit.
Fixes: 3108e2f1c6 ("target/riscv/kvm: update KVM exts to Linux 6.8")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240422134605.534207-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
FEAT_WFxT introduces new instructions WFIT and WFET, which are like
the existing WFI and WFE but allow the guest to pass a timeout value
in a register. The instructions will wait for an interrupt/event as
usual, but will also stop waiting when the value of CNTVCT_EL0 is
greater than or equal to the specified timeout value.
We implement WFIT by setting up a timer to expire at the right
point; when the timer expires it sets the EXITTB interrupt, which
will cause the CPU to leave the halted state. If we come out of
halt for some other reason, we unset the pending timer.
We implement WFET as a nop, which is architecturally permitted and
matches the way we currently make WFE a nop.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240430140035.3889879-3-peter.maydell@linaro.org
The TCGCPUOps::cpu_exec_halt method is called from cpu_handle_halt()
when the CPU is halted, so that a target CPU emulation can do
anything target-specific it needs to do. (At the moment we only use
this on i386.)
The current specification of the method doesn't allow the target
specific code to do something different if the CPU is about to come
out of the halt state, because cpu_handle_halt() only determines this
after the method has returned. (If the method called cpu_has_work()
itself this would introduce a potential race if an interrupt arrived
between the target's method implementation checking and
cpu_handle_halt() repeating the check.)
Change the definition of the method so that it returns a bool to
tell cpu_handle_halt() whether to stay in halt or not.
We will want this for the Arm target, where FEAT_WFxT wants to do
some work only for the case where the CPU is in halt but about to
leave it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240430140035.3889879-2-peter.maydell@linaro.org
The board list in target-arm.rst is supposed to be in alphabetical
order by the title text of each file (which is not the same as
alphabetical order by filename). A few items had got out of order;
correct them.
The entry for
"Facebook Yosemite v3.5 Platform and CraterLake Server (fby35)"
remains out-of-order, because this is not its own file
but is currently part of the aspeed.rst file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240520141421.1895138-1-peter.maydell@linaro.org
Moving to Neoverse-N2 gives us several cpu features to use for expanding
our platform:
- branch target identification
- pointer authentication
- RME for confidential computing
- RNG for EFI_PROTOCOL_RNG
- SVE being enabled by default
We do not go for "max" as default to have stable set of features enabled
by default. It is still supported and can be selected with "--cpu"
argument.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20240523165353.6547-1-marcin.juszkiewicz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Partial support for NUMA setup:
- cpu nodes
- memory nodes
Used versions:
- Trusted Firmware v2.11.0
- Tianocore EDK2 stable202405
- Tianocore EDK2 Platforms code commit 4bbd0ed
Firmware is built using Debian 'bookworm' cross toolchain (gcc 12.2.0).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to the GICv2 specification section 4.3.12, "Interrupt Processor
Targets Registers, GICD_ITARGETSRn":
"Any change to a CPU targets field value:
[...]
* Has an effect on any pending interrupts. This means:
- adding a CPU interface to the target list of a pending interrupt makes that
interrupt pending on that CPU interface
- removing a CPU interface from the target list of a pending interrupt
removes the pending state of that interrupt on that CPU interface."
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Message-id: 20240524113256.8102-3-sebastian.huber@embedded-brains.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:
qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.
It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner. (I'll add that in the next patch)
But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Daniel P. Berrangé <berrange@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240518025246.791593-5-eblake@redhat.com>
Using -fsanitize=undefined with Clang v18 causes an error if function
pointers are casted:
qapi/qapi-clone-visitor.c:188:5: runtime error: call to function visit_type_SocketAddress through pointer to incorrect function type 'bool (*)(struct Visitor *, const char *, void **, struct Error **)'
/tmp/qemu-ubsan/qapi/qapi-visit-sockets.c:487: note: visit_type_SocketAddress defined here
#0 0x5642aa2f7f3b in qapi_clone qapi/qapi-clone-visitor.c:188:5
#1 0x5642aa2c8ce5 in qio_channel_socket_listen_async io/channel-socket.c:285:18
#2 0x5642aa2b8903 in test_io_channel_setup_async tests/unit/test-io-channel-socket.c:116:5
#3 0x5642aa2b8204 in test_io_channel tests/unit/test-io-channel-socket.c:179:9
#4 0x5642aa2b8129 in test_io_channel_ipv4 tests/unit/test-io-channel-socket.c:323:5
...
It also prevents enabling the strict mode of CFI which is currently
disabled with -fsanitize-cfi-icall-generalize-pointers.
The problematic casts are necessary to pass visit_type_T() and
visit_type_T_members() as callbacks to qapi_clone() and qapi_clone_members(),
respectively. Open-code these two functions to avoid the callbacks, and
thus the type casts.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2346
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240524-xkb-v4-3-2de564e5c859@daynix.com>
[thuth: Improve commit message according to Markus' suggestions]
Signed-off-by: Thomas Huth <thuth@redhat.com>
LeakSanitizer complains about allocations whose references are held
only by automatic variables. It is possible to free them to suppress
the complaints, but it is a chore to make sure they are freed in all
exit paths so make them static instead.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240524-xkb-v4-1-2de564e5c859@daynix.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When running the update-linx-headers.sh script, it currently fails with:
scripts/update-linux-headers.sh: line 73: .../qemu/standard-headers/asm-x86/setup_data.h: No such file or directory
The "include" folder is obviously missing here - no clue how this could
have worked before?
Fixes: 66210a1a30 ("scripts/update-linux-headers: Add setup_data.h to import list")
Message-ID: <20240527060126.12578-1-thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are reusing the same temporary directory for installing the headers
of all targets, so there could be stale files here when switching from
one target to another. Make sure to delete the folder before installing
a new set of target headers into it.
Message-ID: <20240527060243.12647-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When we are building for OSS-Fuzz, we want to ensure that the fuzzer
targets are actually created, regardless of leaks. Leaks will be
detected by the subsequent tests of the individual fuzz-targets.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240527150001.325565-1-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Set per_address and ilen in per_ifetch; this is valid for
all PER exceptions and will last until the end of the
instruction. Therefore we don't need to give the same
data to per_check_exception.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-13-richard.henderson@linaro.org>
[thuth: Silence checkpatch.pl errors]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop from argument, since gbea has always been updated with
this address. Add ilen argument for setting int_pgm_ilen.
Use update_cc_op before calling per_branch.
By raising the exception here, we need not call
per_check_exception later, which means we can clean up the
normal non-exception branch path.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-10-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Always use a tcg branch, instead of movcond. The movcond
was not a bad idea before PER was added, but since then
we have either 2 or 3 actions to perform on each leg of
the branch, and multiple movcond is inefficient.
Reorder the taken branch to be fallthrough of the tcg branch.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-8-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Using exception unwind via tcg_s390_program_interrupt,
we discard the current value of psw.addr, which discards
the result of a branch.
Pass in the address of the next instruction, which may
not be sequential. Pass in ilen, which we would have
gotten from unwind and is passed to the exception handler.
Sync cc_op before the call, which we would have gotten
from unwind.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240502054417.234340-2-richard.henderson@linaro.org>
[thuth: Silence checkpatch.pl errors]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Block jobs patches for 2024-04-29
v2: add "iotests/pylintrc: allow up to 10 similar lines" to fix
check-python-minreqs
- backup: discard-source parameter
- blockcommit: Reopen base image as RO after abort
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEi5wmzbL9FHyIDoahVh8kwfGfefsFAmZV4UwACgkQVh8kwfGf
# eftBIA/9Em1xR7yEK5gE9kiGc+qSBsRPB8sJZ/JB+GukDPvzQ+/CktIJJgTryI/q
# QC08KyHnuE6WknUfJPkV5kfINj8vTDtkMjwgccrMu8enc9W5wnRfVBQomS8qWpZY
# maJhyW+Sva7k82v/U1mpdur5cTF1cu8VmwMSNurBYVd84E33KHkgQikEbXSLzFBu
# N8dG4WOgtwuLmP5BMgg5ftzwC3W7qv+sq1DhnZwDATUKVbjX1lLtKAYwu66bH8du
# ekZtWqtJNJqRTcOIiSyl52lPm3xo9+U8khXWQ/lmq1jjvdKcC90y76bT16yIQw98
# 74aBiKSRu2MO/EraEgPQKU2LpSzbzr4Eu1kRjmDXcVDAB183vaFW3Ogym8BuGJ9n
# ZiNFYLZqOqUL4RkyaXEwci6THEyjHqQvK2HYGmjoidZPvATf5G52FWrKZT3S9LVT
# Q4oUhb6dQW4EtU4WoVJpqSg7xozVI/swJ04+gLTjQskitXQm2jX8ifD6MI+85tVp
# nntS5BtMfTe/z5K4L7bv8KOe7J+gK0NUo3YCdw3zQKa+u7tX/QQKnPmNtUK8ohjO
# g6wIuwrxn/GsHxvXaeOKftHyXBGDHYUSuIr7ByQ/WxS9nQaWW1UKk9WFC/XtUFND
# bHMfL+DidkUxMnZBe7Snz6gb16oEr0DsrsSyHe/J2dWrid6QJVA=
# =aSvT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 28 May 2024 06:51:08 AM PDT
# gpg: using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>" [unknown]
# gpg: aka "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E 86A1 561F 24C1 F19F 79FB
* tag 'pull-block-jobs-2024-04-29-v2' of https://gitlab.com/vsementsov/qemu:
iotests/pylintrc: allow up to 10 similar lines
iotests: add backup-discard-source
qapi: blockdev-backup: add discard-source parameter
block/copy-before-write: create block_copy bitmap in filter node
block/copy-before-write: support unligned snapshot-discard
block/copy-before-write: fix permission
blockcommit: Reopen base image as RO after abort
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This fixes a bug in that neither PLI nor PLDW are present in ARMv6T2,
but are introduced with ARMv7 and ARMv7MP respectively.
For clarity, do not use NOP for PLD.
Note that there is no PLDW (literal). Architecturally in the
T1 encoding of "PLD (literal)" bit 5 is "(0)", which means
that it should be zero and if it is not then the behaviour
is CONSTRAINED UNPREDICTABLE (might UNDEF, NOP, or ignore the
value of the bit).
In our implementation we have patterns for both:
+ PLD 1111 1000 -001 1111 1111 ------------ # (literal)
+ PLD 1111 1000 -011 1111 1111 ------------ # (literal)
and so we effectively ignore the value of bit 5. (This is a
permitted option for this CONSTRAINED UNPREDICTABLE.) This isn't a
behaviour change in this commit, since we previously had NOP lines
for both those patterns.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-3-richard.henderson@linaro.org
[PMM: adjusted commit message to note that PLD (lit) T1 bit 5
being 1 is an UNPREDICTABLE case.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some of the source files for older devices use hardcoded tabs
instead of our current coding standard's required spaces.
Fix these in the following files:
- hw/arm/boot.c
- hw/char/omap_uart.c
- hw/gpio/zaurus.c
- hw/input/tsc2005.c
This commit is mostly whitespace-only changes; it also
adds curly-braces to some 'if' statements.
This addresses part of https://gitlab.com/qemu-project/qemu/-/issues/373
but some other files remain to be handled.
Signed-off-by: Tanmay Patil <tanmaynpatil105@gmail.com>
Message-id: 20240508081502.88375-1-tanmaynpatil105@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Check the function index is in range and use an unsigned
variable to avoid the following warning with GCC 13.2.0:
[666/5358] Compiling C object libcommon.fa.p/hw_input_tsc2005.c.o
hw/input/tsc2005.c: In function 'tsc2005_timer_tick':
hw/input/tsc2005.c:416:26: warning: array subscript has type 'char' [-Wchar-subscripts]
416 | s->dav |= mode_regs[s->function];
| ~^~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240508143513.44996-1-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: fixed missing ')']
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In gic_cpu_read() and gic_cpu_write(), we delegate the handling of
reading and writing the Non-Secure view of the GICC_APR<n> registers
to functions gic_apr_ns_view() and gic_apr_write_ns_view().
Unfortunately we got the order of the arguments wrong, swapping the
CPU number and the register number (which the compiler doesn't catch
because they're both integers).
Most guests probably didn't notice this bug because directly
accessing the APR registers is typically something only done by
firmware when it is doing state save for going into a sleep mode.
Correct the mismatched call arguments.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: 51fd06e0ee ("hw/intc/arm_gic: Fix handling of GICC_APR<n>, GICC_NSAPR<n> registers")
Signed-off-by: Andrey Shumilin <shum.sdl@nppct.ru>
[PMM: Rewrote commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée<alex.bennee@linaro.org>
The value of the mp-affinity property being set in npcm7xx_realize is
always the same as the default value it would have when arm_cpu_realizefn
is called if the property is not set here. So there is no need to set
the property value in npcm7xx_realize function.
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240504141733.14813-1-dorjoychy111@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We wrongly encoded ID_AA64PFR1_EL1 using {3,0,0,4,2} in hvf_sreg_match[] so
we fail to get the expected ARMCPRegInfo from cp_regs hash table with the
wrong key.
Fix it with the correct encoding {3,0,0,4,1}. With that fixed, the Linux
guest can properly detect FEAT_SSBS2 on my M1 HW.
All DBG{B,W}{V,C}R_EL1 registers are also wrongly encoded with op0 == 14.
It happens to work because HVF_SYSREG(CRn, CRm, 14, op1, op2) equals to
HVF_SYSREG(CRn, CRm, 2, op1, op2), by definition. But we shouldn't rely on
it.
Cc: qemu-stable@nongnu.org
Fixes: a1477da3dd ("hvf: Add Apple Silicon support")
Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Reviewed-by: Alexander Graf <agraf@csgraf.de>
Message-id: 20240503153453.54389-1-zenghui.yu@linux.dev
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add xlnx_dpdma_read_descriptor() and
xlnx_dpdma_write_descriptor() functions.
xlnx_dpdma_read_descriptor() combines reading a
descriptor from desc_addr by calling dma_memory_read()
and swapping the desc fields from guest memory order
to host memory order. xlnx_dpdma_write_descriptor()
performs similar actions when writing a descriptor.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: d3c6369a96 ("introduce xlnx-dpdma")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
[PMM: tweaked indent, dropped behaviour change for write-failure case]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We want to have similar QMP objects in different tests. Reworking these
objects to make common parts by calling some helper functions doesn't
seem good. It's a lot more comfortable to see the whole QAPI request in
one place.
So, let's increase the limit, to unblock further commit
"iotests: add backup-discard-source"
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Add a parameter that enables discard-after-copy. That is mostly useful
in "push backup with fleecing" scheme, when source is snapshot-access
format driver node, based on copy-before-write filter snapshot-access
API:
[guest] [snapshot-access] ~~ blockdev-backup ~~> [backup target]
| |
| root | file
v v
[copy-before-write]
| |
| file | target
v v
[active disk] [temp.img]
In this case discard-after-copy does two things:
- discard data in temp.img to save disk space
- avoid further copy-before-write operation in discarded area
Note that we have to declare WRITE permission on source in
copy-before-write filter, for discard to work. Still we can't take it
unconditionally, as it will break normal backup from RO source. So, we
have to add a parameter and pass it thorough bdrv_open flags.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240313152822.626493-5-vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Currently block_copy creates copy_bitmap in source node. But that is in
bad relation with .independent_close=true of copy-before-write filter:
source node may be detached and removed before .bdrv_close() handler
called, which should call block_copy_state_free(), which in turn should
remove copy_bitmap.
That's all not ideal: it would be better if internal bitmap of
block-copy object is not attached to any node. But that is not possible
now.
The simplest solution is just create copy_bitmap in filter node, where
anyway two other bitmaps are created.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240313152822.626493-4-vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
In case when source node does not have any parents, the condition still
works as required: backup job do create the parent by
block_job_create -> block_job_add_bdrv -> bdrv_root_attach_child
Still, in this case checking @perm variable doesn't work, as backup job
creates the root blk with empty permissions (as it rely on CBW filter
to require correct permissions and don't want to create extra
conflicts).
So, we should not check @perm.
The hack may be dropped entirely when transactional insertion of
filter (when we don't try to recalculate permissions in intermediate
state, when filter does conflict with original parent of the source
node) merged (old big series
"[PATCH v5 00/45] Transactional block-graph modifying API"[1] and it's
current in-flight part is "[PATCH v8 0/7] blockdev-replace"[2])
[1] https://patchew.org/QEMU/20220330212902.590099-1-vsementsov@openvz.org/
[2] https://patchew.org/QEMU/20231017184444.932733-1-vsementsov@yandex-team.ru/
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240313152822.626493-2-vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
If a blockcommit is aborted the base image remains in RW mode, that leads
to a fail of subsequent live migration.
How to reproduce:
$ virsh snapshot-create-as vm snp1 --disk-only
*** write something to the disk inside the guest ***
$ virsh blockcommit vm vda --active --shallow && virsh blockjob vm vda --abort
$ lsof /vzt/vm.qcow2
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
qemu-syst 433203 root 45u REG 253,0 1724776448 133 /vzt/vm.qcow2
$ cat /proc/433203/fdinfo/45
pos: 0
flags: 02140002 <==== The last 2 means RW mode
If the base image is in RW mode at the end of blockcommit and was in RO
mode before blockcommit, reopen the base BDS in RO.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20240404091136.129811-1-alexander.ivanov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Some, but not all error messages are of the form
Guest agent command failed, error was '<actual error message>'
For instance, command guest-exec can fail with an error message like
Guest agent command failed, error was 'Failed to execute child process “/bin/invalid-cmd42” (No such file or directory)'
Shorten this to just just the actual error message. The guest-exec
example becomes
Failed to execute child process “/bin/invalid-cmd42” (No such file or directory)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240514105829.729342-3-armbru@redhat.com>
[Superfluous #include "qapi/qmp/qerror.h" deleted]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
When guest-set-user-password's argument @password can't be converted
from UTF-8 to UTF-16, we report something like
Guest agent command failed, error was 'Invalid sequence in conversion input'
Improve this to
can't convert 'password' to UTF-16: Invalid sequence in conversion input
Likewise for argument @username, and guest-file-open argument @path,
even though I'm not sure you can actually get invalid input past the
QMP core there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240514105829.729342-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
this principle: they call qemu_save_device_state() and
qemu_loadvm_state(), which call error_report_err().
I wish I could clean this up now, but migration's error reporting is
too complicated (confused?) for me to mess with it.
Instead, I'm merely improving the error reported by
qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
QMP core from
An IO error has occurred
to
saving Xen device state failed
and
loading Xen device state failed
respectively.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-6-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
qmp_memsave() and qmp_pmemsave() report fwrite() error as
An IO error has occurred
Improve this to
writing memory to '<filename>' failed
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-5-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
vmdk_init_extent() reports blk_co_pwrite() failure to its caller as
An IO error has occurred
The errno code returned by blk_co_pwrite() is lost.
Improve this to
failed to write VMDK <what>: <description of errno>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
create_win_dump() and write_run report qemu_write_full() failure to
their callers as
An IO error has occurred
The errno set by qemu_write_full() is lost.
Improve this to
win-dump: failed to write header: <description of errno>
and
win-dump: failed to save memory: <description of errno>
This matches how dump.c reports similar errors.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-3-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
external_snapshot_action() reports bdrv_flush() failure to its caller
as
An IO error has occurred
The errno code returned by bdrv_flush() is lost.
Improve this to
Write to node '<device or node name>' failed: <description of errno>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240513141703.549874-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
target/i386: Introduce X86Access and use for xsave and friends
linux-user/i386: Fix allocation and alignment of fp state in signal frame
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZT2GwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV87pQf9F/cmrKQG1mVWKmJd
# MI7l63lbxejdgAADv1nmro+oapCsJSaQeUSrYp904ydqJjVfBJkaoXfknGsvxrNA
# oW7nEuYt0sBKdaBUKhYpMOJ3ivfw7lVVMJmjNv9ngZRhW+WOoJrBHoleUkVLiM7D
# rxkMLL+LQ7BR9i0Lv1unorOkqUPGNOnEd45qRn6k1g/Qnqi8SNMzxFwO8+232u8m
# EG9un/oh4mKPyb5vSg3Y4JLg+yDKCRScBqBU1wcKFe1u+umBkv2BNcU+k62AJh1q
# bv8i1n+X/dFAd1aj0NEupi04EOZIof5m3T4YIWg7M4I94NiFWNZ18vgskkmiO+Mo
# 0KPd/A==
# =sYrE
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 26 May 2024 05:48:44 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-lu-20240526' of https://gitlab.com/rth7680/qemu: (28 commits)
target/i386: Pass host pointer and size to cpu_x86_{xsave,xrstor}
target/i386: Pass host pointer and size to cpu_x86_{fxsave,fxrstor}
target/i386: Pass host pointer and size to cpu_x86_{fsave,frstor}
target/i386: Convert do_xrstor to X86Access
target/i386: Convert do_xsave to X86Access
linux-user/i386: Honor xfeatures in xrstor_sigcontext
linux-user/i386: Fix allocation and alignment of fp state
linux-user/i386: Return boolean success from xrstor_sigcontext
linux-user/i386: Return boolean success from restore_sigcontext
linux-user/i386: Fix -mregparm=3 for signal delivery
linux-user/i386: Split out struct target_fregs_state
linux-user/i386: Replace target_fpstate_fxsave with X86LegacyXSaveArea
linux-user/i386: Remove xfeatures from target_fpstate_fxsave
linux-user/i386: Drop xfeatures_size from sigcontext arithmetic
target/i386: Add {hw,sw}_reserved to X86LegacyXSaveArea
target/i386: Add rbfm argument to cpu_x86_{xsave,xrstor}
target/i386: Split out do_xsave_chk
target/i386: Convert do_xrstor_* to X86Access
target/i386: Convert do_xsave_* to X86Access
tagret/i386: Convert do_fxsave, do_fxrstor to X86Access
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already validated the memory region in the course of
validating the signal frame. No need to do it again within
the helper function.
In addition, return failure when the header contains invalid
xstate_bv. The kernel handles this via exception handling
within XSTATE_OP within xrstor_from_user_sigframe.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already validated the memory region in the course of
validating the signal frame. No need to do it again within
the helper function.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already validated the memory region in the course of
validating the signal frame. No need to do it again within
the helper function.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For modern cpus, the kernel uses xsave to store all extra
cpu state across the signal handler. For xsave/xrstor to
work, the pointer must be 64 byte aligned. Moreover, the
regular part of the signal frame must be 16 byte aligned.
Attempt to mirror the kernel code as much as possible.
Use enum FPStateKind instead of use_xsave() and use_fxsr().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1648
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the structure definition from target/i386/cpu.h.
The only minor quirk is re-casting the sw_reserved
area to the OS specific struct target_fpx_sw_bytes.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is easily computed by advancing past the structure.
At the same time, replace the magic number "64".
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is subtracting sizeof(target_fpstate_fxsave) in
TARGET_FXSAVE_SIZE, then adding it again via &fxsave->xfeatures.
Perform the same computation using xstate_size alone.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This completes the 512 byte structure, allowing the union to
be removed. Assert that the structure layout is as expected.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This path is not required by user-only, and can in fact
be shared between xsave and xrstor.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the alignment fault from do_* to helper_*, as it need
not apply to usage from within user-only signal handling.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Build system and target/i386/translate.c cleanups
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZRy1gUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMTtQf/ZQskuqZyTrDhB/uVUT8oT5JNKQNS
# GbFSgDK7jDdBeU3UmoYrlx9vfFR/mH5cA88MlusUy0SjQBNo4onD725o6Vvum/LW
# DPe5ZyE34wvOasM7KXqJsD+2SttjaVjCXN4ip+E9WL5By2TWJgrk6IgTtvAhT9cd
# LWb5OEIInaq7ZiWz3EpjmGvZd0M4mxqXi5OeDvmoFyf38xElfbWZWbfhJv+H5L1X
# stivPBtUbXOzh63NL491hUYQtiAWlow8Qcnn7CYRflb6Vdd4QPK+6W8FX5KyU2eC
# bXRXloW7wjEAC9pyiVky1SCvtNg7AVFL+9kxwiGreoZfo+/IMA+NP6pGOg==
# =hpWy
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 25 May 2024 04:28:24 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
migration: remove unnecessary zlib dependency
meson: do not query modules before they are processed
tcg: include dependencies in static_library()
meson: remove unnecessary dependency
meson: remove unnecessary reference to libm
target/i386: remove aflag argument of gen_lea_v_seg
target/i386: clean up repeated string operations
target/i386: introduce gen_lea_ss_ofs
target/i386: use mo_stacksize more
target/i386: inline gen_add_A0_ds_seg
target/i386: split gen_ldst_modrm for load and store
target/i386: reg in gen_ldst_modrm is always OR_TMP0
target/i386: raze the gen_eob* jungle
target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop
target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
target/i386: avoid calling gen_eob_syscall before tb_stop
target/i386: document and group DISAS_* constants
target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
target/i386: cpu_load_eflags already sets cc_op
target/i386: remove unnecessary gen_update_cc_op before gen_eob*
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The dbus_display1_dep is not really used since all occurrences also
request gio independently. Just list the generated sources and drop
dbus_display1_dep.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination. This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT. Inline it in the latter.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The is_store argument of gen_ldst_modrm has only ever been passed
a constant. Just split the function in two.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes. Now that these have been converted to the new decoder,
remove the argument.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is an invariant now that there are no calls to gen_eob_inhibit_irq()
outside tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sti only has one exit, so it does not need to generate the
end-of-translation code inline. It can be deferred to tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline. It can be
deferred to tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last. Add comments explaining the differences
and usage.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:
* it results in translations that are a tiny bit smaller
* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten
* anyway the cost is probably dwarfed by that of computing flags.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No need to set it again at the end of the translation block, cc_op_dirty
can be set to false.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is already handled in gen_eob(). Before adding another DISAS_*
case, remove the double calls.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.
It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper. But
let's clean it up.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel SDM 18.3.1.4 "If an occurrence of the MOV or POP instruction
loads the SS register executes with EFLAGS.TF = 1, no single-step debug
exception occurs following the MOV or POP instruction."
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The point of CPU_CFLAGS is really just to select the appropriate multilib,
for example for library linking tests, and -mcx16 is not needed for
that purpose.
Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
instruction set, it should be applied to cross-compiled x86_64 code too;
it is plausible that tests/tcg would want to cover cmpxchg16b as well,
for example. In the end this makes just as much sense as a per sub-build
tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.
This leaves out contrib/plugins, which would fail when attempting to use
__sync_val_compare_and_swap_16 (note it does not do yet); while minor,
this *is* a disadvantage of this change. But building contrib/plugins
with a Makefile instead of meson.build is something self-inflicted just
for the sake of showing that it can be done, and if this kind of papercut
started becoming a problem we could make the directory part of the meson
build. Until then, we can live with the limitation.
Signed-off-by: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
Message-ID: <20240523051118.29367-1-artyomkunakovsky@gmail.com>
[rewrite commit message, remove from configure. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*** NOTE ***
This replaces the previous PR for tags/pull-ppc-for-9.1-1-20240524
* Fix an interesting TLB invalidate race
* Implement more instructions with decodetree
* Add the POWER8/9/10 BHRB facility
* Add missing instructions, registers, SMT support
* First round of a big MMU xlate cleanup
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmZP1bsACgkQZ7MCdqhi
# HK7TuQ/7BQugpF2yOYroQmo0Yl4RPfFp6ACqfYQgehcGegg3SWpEselTeOJla3G9
# UyVd0mlWf7DciYi61qit/WyLOeuRXMtRjrnFLV2wz9o7D/Ey5/aLQfUL4oCDt/i2
# hmmq3ZAcr7WWxaz338pLJx9gIVjaNiqSoRz9HgHNkQq0pxkbEo1eSjZ6QLSvqYC2
# dwtJHywFrHNo14aq1Nc7PZ5MFxNN6t7hm7KRHKFrt8Obar15n64MSHyRvMzHI9EO
# RgNzz9/qe5yvJ4kmaNiZjntxojXCBUhhlCTtaDIG1LDBc2yNG5VWQUnwThvyNxxX
# h+Ia4Pv7blXikQ6RuqsvFyrLCgUvwXwBiQwiQCJyITk0asLyJVwhkUpiI/jJvOun
# AujSA/6e2pbSe4RUZytkzygx2KVODrVtcSoOvo8kRw+2aTOWMv7DbfBalmWJQWgx
# 0xSeuUz22eNKEL2XbZWNM5v0OgXUXIs9BVeCqn7RB4lC2RNi72v111UPuKYq6Ijx
# SHWQMGPGu9FNBsIdriclRWXVXHpVHz/s/l8AJT8ad6E57UHVk5zCPrbFZFImvQkL
# E7xlctijeST8V5qGyBPG3M4aPoER9+6J32ORSx7KwDwr+fzkbNUXC8UUC4OjAZ+d
# 2vhie9Vs5xWq/E8gGovTymeQ4yHArobDz/j7+rrr0qeppnKLWjM=
# =jHL7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 23 May 2024 04:48:11 PM PDT
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.1-1-20240524-1' of https://gitlab.com/npiggin/qemu: (72 commits)
target/ppc: Remove pp_check() and reuse ppc_hash32_pp_prot()
target/ppc: Move out BookE and related MMU functions from mmu_common.c
target/ppc: Add a function to check for page protection bit
target/ppc/mmu-radix64.c: Drop a local variable
target/ppc/mmu-hash32.c: Drop a local variable
target/ppc: Split off common embedded TLB init
target/ppc: Remove id_tlbs flag from CPU env
target/ppc: Move mmu_ctx_t type to mmu_common.c
target/ppc: Transform ppc_jumbo_xlate() into ppc_6xx_xlate()
target/ppc: Split off 40x cases from ppc_jumbo_xlate()
target/ppc: Split off real mode handling from get_physical_address_wtlb()
target/ppc: Simplify ppc_booke_xlate() part 2
target/ppc: Simplify ppc_booke_xlate() part 1
target/ppc: Split off BookE handling from ppc_jumbo_xlate()
target/ppc: Remove BookE from direct store handling
target/ppc: Don't use mmu_ctx_t in mmubooke206_get_physical_address()
target/ppc: Don't use mmu_ctx_t in mmubooke_get_physical_address()
target/ppc: Don't use mmu_ctx_t for mmu40x_get_physical_address()
target/ppc: Replace hard coded constants in ppc_jumbo_xlate()
target/ppc: Deindent ppc_jumbo_xlate()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ppc_hash32_pp_prot() function in mmu-hash32.c is the same as
pp_check() in mmu_common.c, merge these to remove duplicated code.
Define the common function as static lnline otherwise exporting the
function from mmu-hash32.c would stop the compiler inlining it which
results in slightly lower performance.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
[np: move ppc_hash32_pp_prot inline without changing it]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add a new mmu-booke.c file for BookE and related MMU bits from
mmu_common.c.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Checking if a page protection bit is set for a given access type is a
common operation. Add a function to avoid repeating the same check at
multiple places. As this relies on access type and page protection bit
values having certain relation also add an assert to ensure that this
assumption holds.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The value is only used once so no need to introduce a local variable
for it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In ppc_hash32_xlate() the value of need_prop is checked in two places
but precalculating it does not help because when we reach the first
check we always return and not reach the second place so the value
will only be used once. We can drop the local variable and calculate
it when needed, which makes these checks using it similar to other
places with such checks.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Several 4xx CPUs and e200 share the same TLB settings enclosed in an
ifdef. Split it off in a common function to reduce code duplication
and the number of ifdefs.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This flag for split instruction/data TLBs is only set for 6xx soft TLB
MMU model and not used otherwise so no need to have a separate flag
for that.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Remove mmu_ctx_t definition from internal.h as this type is only used
within mmu_common.c.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Now that only 6xx cases left in ppc_jumbo_xlate() we can change it
to ppc_6xx_xlate() also removing get_physical_address_wtlb().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce ppc_40x_xlate() to split off 40x handlning leaving only 6xx
in ppc_jumbo_xlate() now.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add ppc_real_mode_xlate() to handle real mode translation and allow
removing this case from ppc_jumbo_xlate().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Merge the code fetch and data access cases in a common switch.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Move setting error_code that appears in every case out in front and
hoist the common fall through case for BOOKE206 as well which allows
removing the nested switches.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce ppc_booke_xlate() to handle BookE and BookE 2.06 cases to
reduce ppc_jumbo_xlate() further.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
As BookE never returns -4 we can drop BookE from the direct store case
in ppc_jumbo_xlate().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
mmubooke206_get_physical_address() only uses the raddr and prot fields
from mmu_ctx_t. Pass these directly instead of using a ctx struct.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
mmubooke_get_physical_address() only uses the raddr and prot fields
from mmu_ctx_t. Pass these directly instead of using a ctx struct.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
mmu40x_get_physical_address() only uses the raddr and prot fields from
mmu_ctx_t. Pass these directly instead of using a ctx struct.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The "2" in booke206_update_mas_tlb_miss() call corresponds to
MMU_INST_FETCH which is the value of access_type in this branch;
mmubooke206_esr() only checks for MMU_DATA_STORE and it's called from
code access so using MMU_DATA_LOAD here seems wrong so replace it with
access_type here as well that yields the same result. This also makes
these calls the same as the data access branch further down.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Instead of putting a large block of code in an if, invert the
condition and return early to be able to deindent the code block.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This function just does two assignments and and unnecessary check that
is always true so inline it in the only caller left and remove it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The real mode handling is identical in the remaining switch cases.
Split off these common real mode cases into a separate conditional to
leave only the else branches in the switch that are different.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
BookE does not have real mode so split off and handle it first in
get_physical_address_wtlb() before checking for real mode for other
MMU models.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Return directly, which is simpler than dragging a return value through
multpile if and else blocks.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Move the debug logging within ppc6xx_tlb_check() from after its only
call to simplify the caller.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmu6xx_get_physical_address() we have a large if block with a two
line else branch that effectively returns. Invert the condition and
move the else there to allow deindenting the large if block to make
the flow easier to follow.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Repurpose get_segment_6xx_tlb() to do the whole address translation
for POWERPC_MMU_SOFT_6xx MMU model by moving the BAT check there and
renaming it to match other similar functions. These are only called
once together so no need to keep these separate functions and
combining them simplifies the caller allowing further restructuring.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Drop MPC8xx cases from get_physical_address_wtlb() and ppc_jumbo_xlate().
The default case would still catch this and abort the same way and
there is still a warning about it in ppc_tlb_invalidate_all() which is
called in ppc_cpu_reset_hold() so likely we never get here but to make
sure add a case to ppc_xlate() to the same effect.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In get_physical_address_wtlb() the real_mode flag depends on either
the MSR[IR] or MSR[DR] bit depending on access_type. Extract just the
needed bit in a more straight forward way instead of doing unnecessary
computation.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmubooke_check_tlb() and mmubooke206_check_tlb() we can assign the
value of prot2 directly to the destination, no need to have a separate
local variable for it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
In mmubooke_check_tlb() and mmubooke206_check_tlb() prot2 is
calculated first but only used after an unrelated check that can
return before tha value is used. Move the calculation after the check,
closer to where it is used, to keep them together and avoid computing
it when not needed.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The helper_rac function is defined but not used, remove it.
Fixes: 005b69fdcc (target/ppc: Remove PowerPC 601 CPUs)
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
I think it's use was removed by
Commit 5883d8b296 ("mmu-hash*: Don't use full ppc_hash{32,
64}_translate() path for get_phys_page_debug()")
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
msgsnd has a broadcast mode that sends hypervisor doorbells to all
threads belonging to the same core as the target. A "subcore" mode
sends to all or one thread depending on 1LPAR mode.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This implements the POWER SPRC/SPRD SPRs, and SCRATCH0-7 registers that
can be accessed via these indirect SPRs.
SCRATCH registers only provide storage, but they are used by firmware
for low level crash and progress data, so this implementation logs
writes to the registers to help with analysis.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
LDBAR, TTR are a Power-specific SPRs. These simple implementations
are enough for IBM proprietary firmware for now.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
AMOR, MMCRC, HRMOR, TSCR, HMEER, RPR SPRs are per-core or per-LPAR
registers with simple (generic) implementations.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
An SPR can be either per-thread, per-core, or per-LPAR. Per-LPAR means
per-thread or per-core, depending on 1LPAR mode.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
attn is an implementation-specific instruction that on POWER (and G5/
970) can be enabled with a HID bit (disabled = illegal), and executing
it causes the host processor to stop and the service processor to be
notified. Generally used for debugging.
Implement attn and make it checkstop the system, which should be good
enough for QEMU debugging.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Change the logging not to print to stderr as well, because a
checkstop is a guest error (or perhaps a simulated machine error)
rather than a QEMU error, so send it to the log.
Update the checkstop message, and log CPU registers too.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
checkstop state does not halt the system, interrupts continue to be
serviced, and other CPUs run. Make it stop the machine with
qemu_system_guest_panicked.
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use DEF_MEMOP() consistently in larx and stcx. generation, and apply it
once when it's used rather than where the macros are expanded, to reduce
typing.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add support for the clrbhrb and mfbhrbe instructions.
Since neither instruction is believed to be critical to
performance, both instructions were implemented using helper
functions.
Access to both instructions is controlled by bits in the
HFSCR (for privileged state) and MMCR0 (for problem state).
A new function, helper_mmcr0_facility_check, was added for
checking MMCR0[BHRBA] and raising a facility_unavailable exception
if required.
NOTE: For P8 and P9, due to a performance issue, branch history will
not be kept, but the instructions will be allowed to execute
as normal with the exception that the mfbhrbe instruction will
always return a zero value.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This commit continues adding support for the Branch History
Rolling Buffer (BHRB) as is provided starting with the P8
processor and continuing with its successors. This commit
is limited to the recording and filtering of taken branches.
The following changes were made:
- Enabled functionality on P10 processors only due to
performance impact seen with P8 and P9 where it is not
disabled for non problem state branches.
- Added a BHRB buffer for storing branch instruction and
target addresses for taken branches
- Renamed gen_update_cfar to gen_update_branch_history and
added a 'target' parameter to hold the branch target
address and 'inst_type' parameter to use for filtering
- Added TCG code to gen_update_branch_history that stores
data to the BHRB and updates the BHRB offset.
- Added BHRB resource initialization and reset functions
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This commit is preparatory to the addition of Branch History
Rolling Buffer (BHRB) functionality, which is being provided
today starting with the P8 processor.
BHRB uses several SPR register fields to control whether or not
a branch instruction's address (and sometimes target address)
should be recorded. Checking each of these fields with each
branch instruction using jitted code would lead to a significant
decrease in performance.
Therefore, it was decided that BHRB configuration bits that are
not expected to change frequently should have their state summarized
in an hflag so that the amount of checking done by jitted code can
be reduced.
This commit contains the changes for summarizing the state of the
following register fields in the HFLAGS_BHRB_ENABLE hflag:
MMCR0[FCP] - Determines if BHRB recording is frozen in the
problem state
MMCR0[FCPC] - A modifier for MMCR0[FCP]
MMCRA[BHRBRD] - Disables all BHRB recording for a thread
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
v{max, min}{u, s}{b, h, w, d} : VX-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification:
v{and, andc, nand, or, orc, nor, xor, eqv} : VX-form
The changes were verified by validating that the tcp ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
{l,st}ve{b,h,w}x,
{l,st}v{x,xl},
lvs{l,r} : X-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the below instructions to decodetree specification :
andi[s]., {ori, xori}[s] : D-form
{and, andc, nand, or, orc, nor, xor, eqv}[.],
exts{b, h, w}[.], cnt{l, t}z{w, d}[.],
popcnt{b, w, d}, prty{w, d}, cmp, bpermd : X-form
With this patch, all the fixed-point logical instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
cmp{rb, eqb}, t{w, d} : X-form
t{w, d}i : D-form
isel : A-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also for CMPRB, following review comments :
Replaced repetition of arithmetic right shifting (tcg_gen_shri_i32) followed
by extraction of last 8 bits (tcg_gen_ext8u_i32) with extraction of the required
bits using offsets (tcg_gen_extract_i32).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the below instructions to decodetree specification :
divd[u, e, eu][o][.] : XO-form
mod{sd, ud} : X-form
With this patch, all the fixed-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is
now used to divide doubleword operands as well, and not just words.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree :
mul{ld, ldo, hd, hdu}[.] : XO-form
madd{hd, hdu, ld} : VA-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op'
flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the below instructions to decodetree specification :
neg[o][.] : XO-form
mod{sw, uw}, darn : X-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
divw[u, e, eu][o][.] : XO-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The handler methods for divw[u] instructions internally use Rc(ctx->opcode),
for extraction of Rc field of instructions, which poses a problem if we move
the above said instructions to decodetree, as the ctx->opcode field is not
popluated in decodetree. Hence, making it decodetree compatible, so that the
mentioned insns can be safely move to decodetree specs.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Moving the following instructions to decodetree specification :
mulli : D-form
mul{lw, lwo, hw, hwu}[.] : XO-form
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Also cleaned up code for mullw[o][.] as per review comments while
keeping the logic of the tcg ops generated semantically same.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This patch moves the below instructions to decodetree specification :
f{add, sub, mul, div, re, rsqrte, madd, msub, nmadd, nmsub}[s][.] : A-form
ft{div, sqrt} : X-form
With this patch, all the floating-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This patch merges the definitions of the following set of fpu helper methods,
which are similar, using macros :
1. f{add, sub, mul, div}(s)
2. fre(s)
3. frsqrte(s)
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER10 adds a new field to sync for store-store syncs, and some
new variants of the existing syncs that include persistent memory.
Implement the store-store syncs and plwsync/phwsync.
Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Memory barriers are supposed to do something on BookE systems, these
were probably just missed during MTTCG enablement, maybe no targets
support SMP. Either way, add proper BookE implementations.
Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This tries to faithfully reproduce the odd BookE logic. Note the
e206 check in gen_msync_4xx() is always false, so not carried over.
It does change the handling of non-zero reserved bits outside the
defined fields from being illegal to being ignored, which the
architecture specifies ot help with backward compatibility of new
fields. The existing behaviour causes illegal instruction exceptions
when using new POWER10 sync variants that add new fields, after this
the instructions are accepted and are implemented as supersets of
the new behaviour, as intended.
Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Some TLB flush operations can flush other CPUs. The problem with this
is they used non-synced variants of flushes (i.e., that return
before the destination has completed the flush). Since all TLB flush
users need the _synced variants, and that last user (ppc) of the
non-synced flush was buggy, this is a footgun waiting to go off. There
do not seem to be any callers that flush other CPUs, so remove the
capability.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
These are no longer used.
tlb_flush_all_cpus: removed by previous commit.
tlb_flush_page_all_cpus: removed by previous commit.
tlb_flush_page_bits_by_mmuidx_all_cpus: never used.
tlb_flush_page_by_mmuidx_all_cpus: never used.
tlb_flush_page_bits_by_mmuidx_all_cpus: never used, thus:
tlb_flush_range_by_mmuidx_all_cpus: never used.
tlb_flush_by_mmuidx_all_cpus: never used.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
With mttcg, broadcast tlbie instructions do not wait until other vCPUs
have been kicked out of TCG execution before they complete (including
necessary subsequent tlbsync, etc., instructions). This is contrary to
the ISA, and it permits other vCPUs to use translations after the TLB
flush. For example:
CPU0
// *memP is initially 0, memV maps to memP with *pte
*pte = 0;
ptesync ; tlbie ; eieio ; tlbsync ; ptesync
*memP = 1;
CPU1
assert(*memV == 0);
It is possible for the assertion to fail because CPU1 translates memV
using the TLB after CPU0 has stored 1 to the underlying memory. This
race was observed with a careful test case where CPU1 checks run in a
very large expensive TB so it can run for the entire CPU0 period between
clearing the pte and storing the memory, but host vCPU thread preemption
could cause the race to hit anywhere.
As explained in commit 4ddc104689 ("target/ppc: Fix tlbie"), it is not
enough to just use tlb_flush_all_cpus_synced(), because that does not
execute until the calling CPU has finished its TB. It is also required
that the TB is ended at the point where the TLB flush must subsequently
take effect.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ibm,pi-features property has a bit to say whether or not
msgsndp should be used. Linux checks if it is being run under
KVM and avoids msgsndp anyway, but it would be preferable to
rely on this bit.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
interrupts and TLB misses and is quite costly. Running the
kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
TLB and spends a lot of time in TLB and page table walking code. The
test takes 67 seconds to complete with a lot of time being spent in
code related to finding the vhyp class:
12.01% [.] g_str_hash
8.94% [.] g_hash_table_lookup
8.06% [.] object_class_dynamic_cast
6.21% [.] address_space_ldq
4.94% [.] __strcmp_avx2
4.28% [.] tlb_set_page_full
4.08% [.] address_space_translate_internal
3.17% [.] object_class_dynamic_cast_assert
2.84% [.] ppc_radix64_xlate
Keep a pointer to the class and avoid this lookup. This reduces the
execution time to 40 seconds.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/i386/pc_sysfw: Alias rather than copy isa-bios region
* target/i386: add control bits support for LAM
* target/i386: tweaks to new translator
* target/i386: add support for LAM in CPUID enumeration
* hw/i386/pc: Support smp.modules for x86 PC machine
* target-i386: hyper-v: Correct kvm_hv_handle_exit return value
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZOMlAUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTSwf8DOPgipepNcsxUQoV9nOBfNXqEWa6
# DilQGwuu/3eMSPITUCGKVrtLR5azwCwvNfYYErVBPVIhjImnk3XHwfKpH1csadgq
# 7Np8WGjAyKEIP/yC/K1VwsanFHv3hmC6jfcO3ZnsnlmbHsRINbvU9uMlFuiQkKJG
# lP/dSUcTVhwLT6eFr9DVDUnq4Nh7j3saY85pZUoDclobpeRLaEAYrawha1/0uQpc
# g7MZYsxT3sg9PIHlM+flpRvJNPz/ZDBdj4raN1xo4q0ET0KRLni6oEOVs5GpTY1R
# t4O8a/IYkxeI15K9U7i0HwYI2wVwKZbHgp9XPMYVZFJdKBGT8bnF56pV9A==
# =lp7q
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 10:58:40 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
target-i386: hyper-v: Correct kvm_hv_handle_exit return value
i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[4]
i386: Add cache topology info in CPUCacheInfo
hw/i386/pc: Support smp.modules for x86 PC machine
tests: Add test case of APIC ID for module level parsing
i386/cpu: Introduce module-id to X86CPU
i386: Support module_id in X86CPUTopoIDs
i386: Expose module level in CPUID[0x1F]
i386: Support modules_per_die in X86CPUTopoInfo
i386: Introduce module level cpu topology to CPUX86State
i386/cpu: Decouple CPUID[0x1F] subleaf with specific topology level
i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB]
i386/cpu: Introduce bitmap to cache available CPU topology levels
i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
i386/cpu: Use APIC ID info get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
i386/cpu: Use APIC ID info to encode cache topo in CPUID[4]
i386/cpu: Fix i/d-cache topology to core level for Intel CPU
target/i386: add control bits support for LAM
target/i386: add support for LAM in CPUID enumeration
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
pull-loongarch-20240523
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZk6fPgAKCRBAov/yOSY+
# 35rwA/98G/tODhR2PAl7qZr6+6z8vazkiT4iNNHgxnw/T2TKsh2YONe+2gtKhTa1
# HKYANMykWTxOtBZeCYY9Z5QNj8DuC3xKc1zY1pC1AwRcflsMlGz0WoAC78Gbl9TC
# PBCwyu01hsFoYpIstH/dOGbNsR2OFRLnnGUVFUKtPuS3O+59hg==
# =OzUv
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 06:43:26 PM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240523' of https://gitlab.com/gaosong/qemu:
hw/loongarch/virt: Fix FDT memory node address width
target/loongarch: Add loongarch vector property unconditionally
hw/loongarch: Remove minimum and default memory size
hw/loongarch: Refine system dram memory region
hw/loongarch: Refine fwcfg memory map
hw/loongarch: Refine fadt memory table for numa memory
hw/loongarch: Refine acpi srat table for numa memory
hw/loongarch: Add VM mode in IOCSR feature register in kvm mode
target/loongarch/kvm: fpu save the vreg registers high 192bit
target/loongarch/kvm: Fix VM recovery from disk failures
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When passing disassembly data to plugin callbacks,
translator_st_len relies on db->tb->size having been set.
Fixes: 4c833c60e0 ("disas: Use translator_st to get disassembly data")
Reported-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Currently LSX/LASX vector property is decided by the default value.
Instead vector property should be added unconditionally, and it is
irrelative with its default value. If vector is disabled by default,
vector also can be enabled from command line.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240521080549.434197-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Some qtest test cases such as numa use default memory size of generic
machine class, which is 128M by fault.
Here generic default memory size is used, and also remove minimum memory
size which is 1G originally.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-6-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
For system dram memory region, it is not necessary to use numa node
information. There is only low memory region and high memory region.
Remove numa node information for ddr memory region here, it can reduce
memory region number on LoongArch virt machine.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-5-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Memory map table for fwcfg is used for UEFI BIOS, UEFI BIOS uses the first
entry from fwcfg memory map as the first memory HOB, the second memory HOB
will be used if the first memory HOB is used up.
Memory map table for fwcfg does not care about numa node, however in
generic the first memory HOB is part of numa node0, so that runtime
memory of UEFI which is allocated from the first memory HOB is located
at numa node0.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-4-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
One LoongArch virt machine platform, there is limitation for memory
map information. The minimum memory size is 256M and minimum memory
size for numa node0 is 256M also. With qemu numa qtest, it is possible
that memory size of numa node0 is 128M.
Limitations for minimum memory size for both total memory and numa
node0 is removed for fadt numa memory table creation.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
One LoongArch virt machine platform, there is limitation for memory
map information. The minimum memory size is 256M and minimum memory
size for numa node0 is 256M also. With qemu numa qtest, it is possible
that memory size of numa node0 is 128M.
Limitations for minimum memory size for both total memory and numa
node0 is removed for acpi srat table creation.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240515093927.3453674-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Migration pull request
- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net USO fix
- A couple of migration-test fixes from Thomas
# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt
# 5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk
# fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X
# 13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx
# raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL
# DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT
# aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt
# Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi
# UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok
# M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE
# fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7
# P45Dym1u4g==
# =DDrh
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 03:13:28 PM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D
* tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu:
tests/qtest/migration-test: Fix the check for a successful run of analyze-migration.py
tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too
hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
virtio-gpu: fix v2 migration
migration: fix a typo
migration: add "exists" info to load-state-field trace
migration/colo: Tidy up bql_unlock() around bdrv_activate_all()
migration/colo: make colo_incoming_co() return void
migration/colo: Minor fix for colo error message
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If analyze-migration.py cannot be run or crashes, the error is currently
ignored since the code only checks for nonzero values in case the child
exited properly. For example, if you run the test with a non-existing
Python interpreter, it still succeeds:
$ PYTHON=wrongpython QTEST_QEMU_BINARY=./qemu-system-x86_64 tests/qtest/migration-test
...
# Running /x86_64/migration/analyze-script
# Using machine type: pc-q35-9.1
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-417639.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-417639.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-9.1, -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-XPLUN2/src_serial -drive if=none,id=d0,file=/tmp/migration-test-XPLUN2/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -uuid 11111111-1111-1111-1111-111111111111 -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-417639.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-417639.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-9.1, -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-XPLUN2/dest_serial -incoming tcp:127.0.0.1:0 -drive if=none,id=d0,file=/tmp/migration-test-XPLUN2/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -accel qtest
**
ERROR:../../devel/qemu/tests/qtest/migration-test.c:1603:test_analyze_script: code should not be reached
migration-test: ../../devel/qemu/tests/qtest/libqtest.c:240: qtest_wait_qemu: Assertion `pid == s->qemu_pid' failed.
migration-test: ../../devel/qemu/tests/qtest/libqtest.c:240: qtest_wait_qemu: Assertion `pid == s->qemu_pid' failed.
ok 2 /x86_64/migration/analyze-script
...
Let's better fail the test in case the child did not exit properly, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
On s390x, we recently had a regression that broke migration / savevm
(see commit bebe9603fc ("hw/intc/s390_flic: Fix crash that occurs when
saving the machine state"). The problem was merged without being noticed
since we currently do not run any migration / savevm related tests on
x86 hosts.
While we currently cannot run all migration tests for the s390x target
on x86 hosts yet (due to some unresolved issues with TCG), we can at
least run some of the non-live tests to avoid such problems in the future.
Thus enable the "analyze-script" and the "bad_dest" tests before checking
for KVM on s390x or ppc64 (this also fixes the problem that the
"analyze-script" test was not run on s390x at all anymore since it got
disabled again by accident in a previous refactoring of the code).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Migration from an 8.2 or 9.0 binary to an 8.1 binary with machine
version 8.1 can fail with:
> kvm: Features 0x1c0010130afffa7 unsupported. Allowed features: 0x10179bfffe7
> kvm: Failed to load virtio-net:virtio
> kvm: error while loading state for instance 0x0 of device '0000:00:12.0/virtio-net'
> kvm: load of migration failed: Operation not permitted
The series
53da8b5a99 virtio-net: Add support for USO features
9da1684954 virtio-net: Add USO flags to vhost support.
f03e0cf63b tap: Add check for USO features
2ab0ec3121 tap: Add USO support to tap device.
only landed in QEMU 8.2, so the compatibility flags should be part of
machine version 8.1.
Moving the flags unfortunately breaks forward migration with machine
version 8.1 from a binary without this patch to a binary with this
patch.
Fixes: 53da8b5a99 ("virtio-net: Add support for USO features")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Commit dfcf74fa ("virtio-gpu: fix scanout migration post-load") broke
forward/backward version migration. Versioning of nested VMSD structures
is not straightforward, as the wire format doesn't have nested
structures versions. Introduce x-scanout-vmstate-version and a field
test to save/load appropriately according to the machine version.
Fixes: dfcf74fa ("virtio-gpu: fix scanout migration post-load")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
[fixed long lines]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Currently, it always returns 0, no need to check the return value at all.
In addition, enter colo coroutine only if migration_incoming_colo_enabled()
is true.
Once the destination side enters the COLO* state, the COLO process will
take over the remaining processes until COLO exits.
Cc: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
[fixed mangled author email address]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This bug fix addresses the incorrect return value of kvm_hv_handle_exit for
KVM_EXIT_HYPERV_SYNIC, which should be EXCP_INTERRUPT.
Handling of KVM_EXIT_HYPERV_SYNIC in QEMU needs to be synchronous.
This means that async_synic_update should run in the current QEMU vCPU
thread before returning to KVM, returning EXCP_INTERRUPT to guarantee this.
Returning 0 can cause async_synic_update to run asynchronously.
One problem (kvm-unit-tests's hyperv_synic test fails with timeout error)
caused by this bug:
When a guest VM writes to the HV_X64_MSR_SCONTROL MSR to enable Hyper-V SynIC,
a VM exit is triggered and processed by the kvm_hv_handle_exit function of the
QEMU vCPU. This function then calls the async_synic_update function to set
synic->sctl_enabled to true. A true value of synic->sctl_enabled is required
before creating SINT routes using the hyperv_sint_route_new() function.
If kvm_hv_handle_exit returns 0 for KVM_EXIT_HYPERV_SYNIC, the current QEMU
vCPU thread may return to KVM and enter the guest VM before running
async_synic_update. In such case, the hyperv_synic test’s subsequent call to
synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, ...) immediately after writing to
HV_X64_MSR_SCONTROL can cause QEMU’s hyperv_sint_route_new() function to return
prematurely (because synic->sctl_enabled is false).
If the SINT route is not created successfully, the SINT interrupt will not be
fired, resulting in a timeout error in the hyperv_synic test.
Fixes: 267e071bd6 (“hyperv: make overlay pages for SynIC”)
Suggested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Message-ID: <20240521200114.11588-1-dongsheng.x.zhang@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
processors sharing cache.
The number of logical processors sharing this cache is
NumSharingCache + 1.
After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[0x8000001D].EAX[bits 25:14].
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-22-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.
After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[4].EAX[bits 25:14].
And since with the helper max_processor_ids_for_cache(), the filed
CPUID[4].EAX[bits 25:14] (original virable "num_apic_ids") is parsed
based on cpu topology levels, which are verified when parsing -smp, it's
no need to check this value by "assert(num_apic_ids > 0)" again, so
remove this assert().
Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
helper to make the code cleaner.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-21-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.
This default general setting has caused a misunderstanding, that is, the
cache topology is completely equated with a specific cpu topology, such
as the connection between L2 cache and core level, and the connection
between L3 cache and die level.
In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.
Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.
Except legacy_l2_cache_cpuid2 (its default topo level is
CPU_TOPO_LEVEL_UNKNOW), explicitly set the corresponding topology level
for all other cache models. In order to be compatible with the existing
cache topology, set the CPU_TOPO_LEVEL_CORE level for the i/d cache, set
the CPU_TOPO_LEVEL_CORE level for L2 cache, and set the
CPU_TOPO_LEVEL_DIE level for L3 cache.
The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-20-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As module-level topology support is added to X86CPU, now we can enable
the support for the modules parameter on PC machines. With this support,
we can define a 5-level x86 CPU topology with "-smp":
-smp cpus=*,maxcpus=*,sockets=*,dies=*,modules=*,cores=*,threads=*.
So, add the 5-level topology example in description of "-smp".
Additionally, add the missed drawers and books options in previous
example.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-19-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add module_id member in X86CPUTopoIDs.
module_id can be parsed from APIC ID, so also update APIC ID parsing
rule to support module level. With this support, the conversions with
module level between X86CPUTopoIDs, X86CPUTopoInfo and APIC ID are
completed.
module_id can be also generated from cpu topology, and before i386
supports "modules" in smp, the default "modules per die" (modules *
clusters) is only 1, thus the module_id generated in this way is 0,
so that it will not conflict with the module_id generated by APIC ID.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-16-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
erroneous smp_num_siblings on Intel Hybrid platforms") is able to
handle platforms with Module level enumerated via CPUID.1F.
Expose the module level in CPUID[0x1F] if the machine has more than 1
modules.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-15-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Support module level in i386 cpu topology structure "X86CPUTopoInfo".
Since x86 does not yet support the "modules" parameter in "-smp",
X86CPUTopoInfo.modules_per_die is currently always 1.
Therefore, the module level width in APIC ID, which can be calculated by
"apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0 for
now, so we can directly add APIC ID related helpers to support module
level parsing.
In addition, update topology structure in test-x86-topo.c.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-14-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel CPUs implement module level on hybrid client products (e.g.,
ADL-N, MTL, etc) and E-core server products.
A module contains a set of cores that share certain resources (in
current products, the resource usually includes L2 cache, as well as
module scoped features and MSRs).
Module level support is the prerequisite for L2 cache topology on
module level. With module level, we can implement the Guest's CPU
topology and future cache topology to be consistent with the Host's on
Intel hybrid client/E-core server platforms.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-13-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At present, the subleaf 0x02 of CPUID[0x1F] is bound to the "die" level.
In fact, the specific topology level exposed in 0x1F depends on the
platform's support for extension levels (module, tile and die).
To help expose "module" level in 0x1F, decouple CPUID[0x1F] subleaf
with specific topology level.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-12-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[0xB] defines SMT, Core and Invalid types, and this leaf is shared
by Intel and AMD CPUs.
But for extended topology levels, Intel CPU (in CPUID[0x1F]) and AMD CPU
(in CPUID[0x80000026]) have the different definitions with different
enumeration values.
Though CPUID[0x80000026] hasn't been implemented in QEMU, to avoid
possible misunderstanding, split topology types of CPUID[0x1F] from the
definitions of CPUID[0xB] and introduce CPUID[0x1F]-specific topology
types.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-11-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, QEMU checks the specify number of topology domains to detect
if there's extended topology levels (e.g., checking nr_dies).
With this bitmap, the extended CPU topology (the levels other than SMT,
core and package) could be easier to detect without touching the
topology details.
This is also in preparation for the follow-up to decouple CPUID[0x1F]
subleaf with specific topology level.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In cpu_x86_cpuid(), there are many variables in representing the cpu
topology, e.g., topo_info, cs->nr_cores and cs->nr_threads.
Since the names of cs->nr_cores and cs->nr_threads do not accurately
represent its meaning, the use of cs->nr_cores or cs->nr_threads is
prone to confusion and mistakes.
And the structure X86CPUTopoInfo names its members clearly, thus the
variable "topo_info" should be preferred.
In addition, in cpu_x86_cpuid(), to uniformly use the topology variable,
replace env->dies with topo_info.dies_per_pkg as well.
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-9-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The commit 8f4202fb10 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
the number of sharing threads directly.
From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:
The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:
ShareId = LocalApicId >> log2(NumSharingCache+1)
Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.
From the description above, the calculation of this field should be same
as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
APIC ID to calculate this field.
[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
Information
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-8-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
nearest power-of-2 integer.
The nearest power-of-2 integer can be calculated by pow2ceil() or by
using APIC ID offset/width (like L3 topology using 1 << die_offset [3]).
But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with APIC ID. For example, in linux kernel, the field
"num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
matched with actual core numbers and it's calculated by:
"(1 << (pkg_offset - core_offset)) - 1".
Therefore the topology information of APIC ID should be preferred to
calculate nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
CPUID.04H:EAX[bits 31:26]:
1. d/i cache is shared in a core, 1 << core_offset should be used
instead of "cs->nr_threads" in encode_cache_cpuid4() for
CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. L2 cache is supposed to be shared in a core as for now, thereby
1 << core_offset should also be used instead of "cs->nr_threads" in
encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
calculated with the bit width between the package and SMT levels in
the APIC ID (1 << (pkg_offset - core_offset) - 1).
In addition, use APIC ID bits calculations to replace "pow2ceil()" for
cache_info_passthrough case.
[1]: efb3934adf ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
[2]: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
[3]: d65af288a8 ("i386: Update new x86_apicid parsing rules with die_offset support")
Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-7-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For i-cache and d-cache, current QEMU hardcodes the maximum IDs for CPUs
sharing cache (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits
25:14]) to 0, and this means i-cache and d-cache are shared in the SMT
level.
This is correct if there's single thread per core, but is wrong for the
hyper threading case (one core contains multiple threads) since the
i-cache and d-cache are shared in the core level other than SMT level.
For AMD CPU, commit 8f4202fb10 ("i386: Populate AMD Processor Cache
Information for cpuid 0x8000001D") has already introduced i/d cache
topology as core level by default.
Therefore, in order to be compatible with both multi-threaded and
single-threaded situations, we should set i-cache and d-cache be shared
at the core level by default.
This fix changes the default i/d cache topology from per-thread to
per-core. Potentially, this change in L1 cache topology may affect the
performance of the VM if the user does not specifically specify the
topology or bind the vCPU. However, the way to achieve optimal
performance should be to create a reasonable topology and set the
appropriate vCPU affinity without relying on QEMU's default topology
structure.
Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-6-zhao1.liu@intel.com>
[Add compat property. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
LAM uses CR3[61] and CR3[62] to configure/enable LAM on user pointers.
LAM uses CR4[28] to configure/enable LAM on supervisor pointers.
For CR3 LAM bits, no additional handling needed:
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check max physical address bits before calling
cpu_x86_update_cr3(), no change needed, i.e. CR3 LAM bits are not allowed
to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr3() to update cr3.
Allow gdb to set the LAM bit(s) to CR3, if vcpu doesn't support LAM,
KVM_SET_SREGS will fail as other reserved bits.
For CR4 LAM bit, its reservation depends on vcpu supporting LAM feature or
not.
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check CR4 reserved bit before calling cpu_x86_update_cr4(),
i.e. CR4 LAM bit is not allowed to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr4() to update cr4.
Mask out LAM bit on CR4 if vcpu doesn't support LAM.
- x86_cpu_reset_hold() doesn't need special handling.
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-3-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the -bios case the "isa-bios" memory region is an alias to the BIOS mapped
to the top of the 4G memory boundary. Do the same in the -pflash case, but only
for new machine versions for migration compatibility. This establishes common
behavior and makes pflash commands work in the "isa-bios" region which some
real-world legacy bioses rely on.
Note that in the sev_enabled() case, the "isa-bios" memory region in the -pflash
case will now also point to encrypted memory, just like it already does in the
-bios case.
When running `info mtree` before and after this commit with
`qemu-system-x86_64 -S -drive \
if=pflash,format=raw,readonly=on,file=/usr/share/qemu/bios-256k.bin` and running
`diff -u before.mtree after.mtree` results in the following changes in the
memory tree:
--- before.mtree
+++ after.mtree
@@ -71,7 +71,7 @@
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
- 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
+ 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff
00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff
00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff
@@ -108,7 +108,7 @@
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
- 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
+ 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff
00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff
00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff
@@ -131,11 +131,14 @@
memory-region: pc.ram
0000000000000000-0000000007ffffff (prio 0, ram): pc.ram
+memory-region: system.flash0
+ 00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0
+
memory-region: pci
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
- 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
+ 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
memory-region: smram
00000000000a0000-00000000000bffff (prio 0, ram): alias smram-low @pc.ram 00000000000a0000-00000000000bffff
Note that in both cases the "system" memory region contains the entry
00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0
but the "system.flash0" memory region only appears standalone when "isa-bios" is
an alias.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-7-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 32-bit AAM/AAD opcodes are using helpers that read and write flags and
env->regs[R_EAX]. Clean them up so that the table correctly includes AX
as a 16-bit input and output.
No real reason to do it to be honest, but they are nice one-output helpers
and it removes the masking of env->regs[R_EAX] that generic load/writeback
code already does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123912.608497-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_rot_carry and gen_rot_overflow are meant to be called with count == NULL
if the count cannot be zero. However this is not done in gen_ROL and gen_ROR,
and writing everywhere "can_be_zero ? count : NULL" is burdensome and less
readable. Just pass can_be_zero as a separate argument.
gen_RCL and gen_RCR use a conditional branch to skip the computation
if count is zero, so they can pass false unconditionally to gen_rot_overflow.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123914.608516-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vfio queue:
* Improvement of error reporting during migration
* Removed Vendor Specific Capability check on newer machine
* Addition of a VFIO migration QAPI event
* Changed prototype of routines using an error parameter to return bool
* Several cleanups regarding autofree variables
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmZNwDEACgkQUaNDx8/7
# 7KHaYQ/+MUFOiWEiAwJdP8I1DkY6mJV3ZDixKMHLmr8xH6fAkR2htEw6UUcYijcn
# Z0wVvcB7A1wetgIAB2EPc2o6JtRD1uEW2pPq3SVpdWO2rWYa4QLvldOiJ8A+Kvss
# 0ZugWirgZsM7+ka9TCuysmqWdQD+P6z2RURMSwiPi6QPHwv1Tt69gLSxFeV5WWai
# +mS6wUbaU3LSt6yRhORRvFkCss4je3D3YR73ivholGHANxi/7C5T22KwOHrW6Qzf
# uk3W/zq1yL1YLXSu6WoKPw0mMCvNtGyKK2oAlhG3Ln1tPYnctNrlfXlApqxEOGl3
# adGtwd6fyg6UTRR+vOXEy1QPCGcHtKWc5SuV5E677JftARJMwzbXrJw9Y9xS2RCQ
# oRYS5814k9RdubTxu+/l8NLICMdox7dNy//QLyrIdD7nJKYhFODkV1giWh4NWkt6
# m0T3PGLlUJ/V2ngWQu9Aw150m3lCPEKt+Nv/mGOEFDRu9dv55Vb7oJwr1dBB/n+e
# 1lNNpDmV0YipoKYMzrlBwNwxhXGJOtNPwHtw/vZuiy70CXUwo0t4XLMpWbWasxZc
# 0yz4O9RLRJEhPtPqv54aLsE2kNY10I8vwHBlhyNgIEsA7eCDduA+65aPBaqIF7z6
# GjvYdixF+vAZFexn0mDi1gtM3Yh60Hiiq1j7kKyyti/q0WUQzIc=
# =awMc
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 02:51:45 AM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20240522' of https://github.com/legoater/qemu: (47 commits)
vfio/igd: Use g_autofree in vfio_probe_igd_bar4_quirk()
vfio: Use g_autofree in all call site of vfio_get_region_info()
vfio/pci-quirks: Make vfio_add_*_cap() return bool
vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return bool
vfio/pci: Use g_autofree for vfio_region_info pointer
vfio/pci: Make capability related functions return bool
vfio/pci: Make vfio_populate_vga() return bool
vfio/pci: Make vfio_intx_enable() return bool
vfio/pci: Make vfio_populate_device() return a bool
vfio/pci: Make vfio_pci_relocate_msix() and vfio_msix_early_setup() return a bool
vfio/pci: Make vfio_intx_enable_kvm() return a bool
vfio/ccw: Make vfio_ccw_get_region() return a bool
vfio/platform: Make vfio_populate_device() and vfio_base_device_init() return bool
vfio/helpers: Make vfio_device_get_name() return bool
vfio/helpers: Make vfio_set_irq_signaling() return bool
vfio/helpers: Use g_autofree in vfio_set_irq_signaling()
vfio/display: Make vfio_display_*() return bool
vfio/display: Fix error path in call site of ramfb_setup()
backends/iommufd: Make iommufd_backend_*() return bool
vfio/cpr: Make vfio_cpr_register_container() return bool
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pointer opregion, host and lpc are allocated and freed in
vfio_probe_igd_bar4_quirk(). Use g_autofree to automatically
free them.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
There are some exceptions when pointer to vfio_region_info is reused.
In that case, the pointed memory is freed manually.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Include below functions:
vfio_add_virt_caps()
vfio_add_nv_gpudirect_cap()
vfio_add_vmd_shadow_cap()
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Pointer opregion is freed after vfio_pci_igd_opregion_init().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The functions operating on capability don't have a consistent return style.
Below functions are in bool-valued functions style:
vfio_msi_setup()
vfio_msix_setup()
vfio_add_std_cap()
vfio_add_capabilities()
Below two are integer-valued functions:
vfio_add_vendor_specific_cap()
vfio_setup_pcie_cap()
But the returned integer is only used for check succeed/failure.
Change them all to return bool so now all capability related
functions follow the coding standand in qapi/error.h to return
bool.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_populate_device() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h
Rules section.
By this chance, pass errp directly to vfio_populate_device() to
avoid calling error_propagate().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_pci_relocate_msix() and vfio_msix_early_setup() takes
an 'Error **' argument, best practices suggest to return a bool.
See the qapi/error.h Rules section.
By this chance, pass errp directly to vfio_msix_early_setup() to avoid
calling error_propagate().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_intx_enable_kvm() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h
Rules section.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_populate_device() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h
Rules section.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Local pointer irq_set is freed before return from
vfio_set_irq_signaling().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand in qapi/error.h to return bool
for bool-valued functions.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_display_dmabuf_init() and vfio_display_region_init() calls
ramfb_setup() without checking its return value.
So we may run into a situation that vfio_display_probe() succeed
but errp is set. This is risky and may lead to assert failure in
error_setv().
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: b290659fc3 ("hw/vfio/display: add ramfb support")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
GDB commit a207f6b3a38 ('Rewrite "python" command exception handling')
changed how exit() called from Python scripts loaded by GDB behave,
turning it into an exception instead of a generic error code that is
returned. This change caused several QEMU tests to crash with the
following exception:
Python Exception <class 'SystemExit'>: 0
Error occurred in Python: 0
This happens because in tests/guest-debug/test_gdbstub.py exit is
called after the tests have completed.
This commit fixes it by politely asking GDB to exit via gdb.execute,
passing the proper fail_count to be reported to 'make', instead of
abruptly calling exit() from the Python script.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240515173132.2462201-4-gustavo.romero@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This effectively reverts
commit 54c4ea8f3a
Author: Zhao Liu <zhao1.liu@intel.com>
Date: Sat Mar 9 00:01:37 2024 +0800
hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations
but is not done as a 'git revert' since the part of the changes to the
file hw/core/machine-smp.c which add 'has_XXX' checks remain desirable.
Furthermore, we have to tweak the subsequently added unit test to
account for differing warning message.
The rationale for the original deprecation was:
"Currently, it was allowed for users to specify the unsupported
topology parameter as "1". For example, x86 PC machine doesn't
support drawer/book/cluster topology levels, but user could specify
"-smp drawers=1,books=1,clusters=1".
This is meaningless and confusing, so that the support for this kind
of configurations is marked deprecated since 9.0."
There are varying POVs on the topic of 'unsupported' topology levels.
It is common to say that on a system without hyperthreading, that there
is always 1 thread. Likewise when new CPUs introduced a concept of
multiple "dies', it was reasonable to say that all historical CPUs
before that implicitly had 1 'die'. Likewise for the more recently
introduced 'modules' and 'clusters' parameter'. From this POV, it is
valid to set 'parameter=1' on the -smp command line for any machine,
only a value > 1 is strictly an error condition.
It doesn't cause any functional difficulty for QEMU, because internally
the QEMU code is itself assuming that all "unsupported" parameters
implicitly have a value of '1'.
At the libvirt level, we've allowed applications to set 'parameter=1'
when configuring a guest, and pass that through to QEMU.
Deprecating this creates extra difficulty for because there's no info
exposed from QEMU about which machine types "support" which parameters.
Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
a given machine type, or whether it will trigger deprecation messages.
Since there's no apparent functional benefit to deleting this deprecated
behaviour from QEMU, and it creates problems for consumers of QEMU,
remove this deprecation.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Message-ID: <20240513123358.612355-2-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
but the function belongs to a VMStateDescription that is attached to a
TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
user tries to save or migrate the VM state. Fix it by using s390_get_flic()
to get the correct device here instead.
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
Message-ID: <20240517061553.564529-1-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Run "make lcitool-refresh" after the previous changes to the
lcitool files. This removes the g++ and xfslibs-dev packages
from the dockerfiles (except for the fedora-win64-cross dockerfile
where we keep the C++ compiler).
Message-ID: <20240516084059.511463-6-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We don't need C++ for the normal QEMU builds anymore, so installing
g++ in each and every container seems to be a waste of time and disk
space. The only container that still needs it is the Fedora MinGW
container that builds the only remaining C++ code in ./qga/vss-win32/
and we can install it there with an extra project yml file instead.
Message-ID: <20240516084059.511463-4-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU's commit a5730b8bd3 ("block/file-posix: Simplify the
XFS_IOC_DIOINFO handling") removed the need for the 'xfsprogs'
package.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[thuth: Adjusted the patch from the lcitools repo to QEMU's repo]
Message-ID: <20240516084059.511463-3-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
The changed functions include:
iommufd_backend_connect
iommufd_backend_alloc_ioas
By this chance, simplify the functions a bit by avoiding duplicate
recordings, e.g., log through either error interface or trace, not
both.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
The changed functions include:
iommufd_cdev_kvm_device_add
iommufd_cdev_connect_and_bind
iommufd_cdev_attach_ioas_hwpt
iommufd_cdev_detach_ioas_hwpt
iommufd_cdev_attach_container
iommufd_cdev_get_info_iova_range
After the change, all functions in hw/vfio/iommufd.c follows the
standand.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Make VFIOIOMMUClass::add_window() and its wrapper function
vfio_container_add_section_window() return bool.
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Make VFIOIOMMUClass::attach_device() and its wrapper function
vfio_attach_device() return bool.
This is to follow the coding standand to return bool if 'Error **'
is used to pass error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Local pointer info is freed before return from
iommufd_cdev_get_info_iova_range().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Local pointer name is allocated before vfio_attach_device() call
and freed after the call.
Same for tmp when calling realpath().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move trace_vfio_migration_set_state() to the top of the function, add
recover_state to it, and add a new trace event to
vfio_migration_set_device_state().
This improves tracing of device state changes as state changes are now
also logged when vfio_migration_set_state() fails (covering recover
state and device reset transitions) and in no-op state transitions to
the same state.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
When migrating a VFIO device that supports pre-copy, it is transitioned
to STOP_COPY twice: once in vfio_vmstate_change() and second time in
vfio_save_complete_precopy().
The second transition is harmless, as it's a STOP_COPY->STOP_COPY no-op
transition. However, with the newly added VFIO migration QAPI event, the
STOP_COPY event is undesirably emitted twice.
Prevent this by returning early in vfio_migration_set_state() if
new_state is the same as current device state.
Note that the STOP_COPY transition in vfio_save_complete_precopy() is
essential for VFIO devices that don't support pre-copy, for migrating an
already stopped guest and for snapshots.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Emit VFIO migration QAPI event when a VFIO device changes its migration
state. This can be used by management applications to get updates on the
current state of the VFIO device for their own purposes.
A new per VFIO device capability, "migration-events", is added so events
can be enabled only for the required devices. It is disabled by default.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add a new QAPI event for VFIO migration. This event will be emitted when
a VFIO device changes its migration state, for example, during migration
or when stopping/starting the guest.
This event can be used by management applications to get updates on the
current state of the VFIO device for their own purposes.
Note that this new event is introduced since VFIO devices have a unique
set of migration states which cannot be described as accurately by other
existing events such as run state or migration status.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
In case of migration, during restore operation, qemu checks config space of the
pci device with the config space in the migration stream captured during save
operation. In case of config space data mismatch, restore operation is failed.
config space check is done in function get_pci_config_device(). By default VSC
(vendor-specific-capability) in config space is checked.
Due to qemu's config space check for VSC, live migration is broken across NVIDIA
vGPU devices in situation where source and destination host driver is different.
In this situation, Vendor Specific Information in VSC varies on the destination
to ensure vGPU feature capabilities exposed to the guest driver are compatible
with destination host.
If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with
volatile Vendor Specific Info in VSC then qemu should exempt config space check
for Vendor Specific Info. It is vendor driver's responsibility to ensure that
VSC is consistent across migration. Here consistency could mean that VSC format
should be same on source and destination, however actual Vendor Specific Info
may not be byte-to-byte identical.
This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI
device by clearing pdev->cmask[] offsets. Config space check is still enforced
for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips
config space check for that offset.
VSC check is skipped for machine types >= 9.1. The check would be enforced on
older machine types (<= 9.0).
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vinayak Kale <vkale@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_ccw_register_irq_notifier() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h Rules
section.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since vfio_ap_register_irq_notifier() takes and 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h Rules
section.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_save_complete_precopy() currently returns before doing the trace
event. Change that.
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Let the callers do the error reporting. Add documentation while at it.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Use vmstate_save_state_with_err() to improve error reporting in the
callers and store a reported error under the migration stream. Add
documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add an Error** argument to vfio_migration_set_state() and adjust
callers, including vfio_save_setup(). The error will be propagated up
to qemu_savevm_state_setup() where the save_setup() handler is
executed.
Modify vfio_vmstate_change_prepare() and vfio_vmstate_change() to
store a reported error under the migration stream if a migration is in
progress.
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Use it to update the current error of the migration stream if
available and if not, simply print out the error. Next changes will
update with an error to report.
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This allows to update the Error argument of the VFIO log_global_start()
handler. Errors for container based logging will also be propagated to
qemu_savevm_state_setup() when the ram save_setup() handler is executed.
Also, errors from vfio_container_set_dirty_page_tracking() are now
collected and reported.
The vfio_set_migration_error() call becomes redundant in
vfio_listener_log_global_start(). Remove it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
We will use the Error object to improve error reporting in the
.log_global*() handlers of VFIO. Add documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
plugin and testing updates
- don't duplicate options for microbit test
- don't spam the linux source tree when importing headers
- add STORE_U64 inline op to TCG plugins
- add conditional callback op to TCG plugins
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZFvCMACgkQ+9DbCVqe
# KkSrYQf/aj9+eCWCKZk3Hym0lT+qNKxUeNSx3juUN8h7iG1vkA1f/XaQle5XvKDr
# ROIdo8urcr8onJ4PBH+4C7VZhUmnpL8zLH80pCuuTkF03MCNhaW/5qJ67niWmPVM
# QJHVqNomkykKOMBh+WtD5M0m/BYPT5lsa10sE3bDH8ziGjp0An2v24R89tzYEXnf
# 1QePItQN5vzEvhrZj6oKWVmeucqLsqS6yqS8V3sEpmF0+zqNjGZlrI86A4SAp74k
# 8vuduVuRbeyki7zWBTOLUeoiuHM2Zmh7v74zm/Hc1ITBaDjWMwPctcI/vFjsrCI/
# yoFRhgrV87DtIZdkrJzk5qBYFOWoeQ==
# =znN0
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 16 May 2024 09:56:19 AM CEST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-may24-160524-2' of https://gitlab.com/stsquad/qemu:
plugins: remove op from qemu_plugin_inline_cb
plugins: extract cpu_index generate
plugins: distinct types for callbacks
tests/plugin/inline: add test for conditional callback
plugins: conditional callbacks
tests/plugin/inline: add test for STORE_U64 inline op
plugins: add new inline op STORE_U64
plugins: extract generate ptr for qemu_plugin_u64
plugins: prepare introduction of new inline ops
scripts/update-linux-header.sh: be more src tree friendly
tests/tcg: don't append QEMU_OPTS for armv6m-undef test
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Running "install_headers" in the Linux source tree is fairly
unfriendly as out-of-tree builds will start complaining about the
kernel source being non-pristine. As we have a temporary directory for
the install we should also do the build step here. So now we have:
$tmpdir/
$blddir/
$hdrdir/
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240514174253.694591-3-alex.bennee@linaro.org>
target/hppa:
- Use TCG_COND_TST where applicable.
- Use CF_BP_PAGE instead of a local breakpoint search.
- Clean up IAOQ handling during translation.
- Implement CF_PCREL.
- Implement PSW.B.
- Implement PSW.X.
- Log cpu state on interrupt and rfi.
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
# 7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
# obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
# WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
# TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
# NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
# SDxS8Q==
# =A4Qf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 11:38:04 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu: (43 commits)
target/hppa: Log cpu state on return-from-interrupt
target/hppa: Log cpu state at interrupt
target/hppa: Implement CF_PCREL
target/hppa: Adjust priv for B,GATE at runtime
target/hppa: Drop tlb_entry return from hppa_get_physical_address
target/hppa: Implement PSW_X
target/hppa: Implement PSW_B
target/hppa: Manage PSW_X and PSW_B in translator
target/hppa: Split PSW X and B into their own field
target/hppa: Improve hppa_cpu_dump_state
target/hppa: Do not mask in copy_iaoq_entry
target/hppa: Store full iaoq_f and page offset of iaoq_b in TB
linux-user/hppa: Force all code addresses to PRIV_USER
target/hppa: Use delay_excp for conditional trap on overflow
target/hppa: Use delay_excp for conditional traps
target/hppa: Introduce DisasDelayException
target/hppa: Remove cond_free
target/hppa: Use TCG_COND_TST* in trans_ftest
target/hppa: Use registerfields.h for FPSR
target/hppa: Use TCG_COND_TST* in trans_bb_imm
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
accel/tcg: Improve disassembly for target and plugin
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEXT0dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/FbQf+P3ppcAA+5smxaQyi
# dsfCJaGOMqRTWYuSmNsJ7AlxQobxLKVsJrAHraNU1AnDfwKrX3XXJcU4Gwt0eQyN
# lGiF/24KLElvb+w6fkjuLdK+DbGWTrNabXJAnBw1h21x+go0mvVCVSuQQw7a/RDS
# btPnGkmoi0H340JC1MVSDRgFkB3RV0kOMXGGm70S+mw0WhjVgdInhLv0jjnj2QFM
# tYzJ5g+00v0HPo8Lun5kRSaI7EGG7J/XfGa71WHIHrB0o7FAzslap4fGTcfOB+7a
# f2jTGErezJQj1pvJLvFTNX4YQ02ORnDKsz4EC0G9QU8rk+S1bD2vTVoi5IY5ayfJ
# oqxyRw==
# =Q16M
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 08:59:09 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240515' of https://gitlab.com/rth7680/qemu: (34 commits)
tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
accel/tcg: Remove cpu_ldsb_code / cpu_ldsw_code
target/s390x: Use translator_lduw in get_next_pc
target/xtensa: Use translator_ldub in xtensa_insn_len
target/rx: Use translator_ld*
target/riscv: Use translator_ld* for everything
target/cris: Use cris_fetch in translate_v10.c.inc
target/cris: Use translator_ld* in cris_fetch
target/avr: Use translator_lduw
target/i386: Use translator_ldub for everything
target/microblaze: Use translator_ldl
target/hexagon: Use translator_ldl in pkt_crosses_page
target/s390x: Disassemble EXECUTEd instructions
target/s390x: Fix translator_fake_ld length
accel/tcg: Introduce translator_fake_ld
disas: Use translator_st to get disassembly data
disas: Split disas.c
accel/tcg: Return bool from TranslatorOps.disas_log
accel/tcg: Provide default implementation of disas_log
plugins: Merge alloc_tcg_plugin_context into plugin_gen_tb_start
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that the groundwork has been laid, enabling CF_PCREL within the
translator proper is a simple matter of updating copy_iaoq_entry
and install_iaq_entries.
We also need to modify the unwind info, since we no longer have
absolute addresses to install.
As expected, this reduces the runtime overhead of compilation when
running a Linux kernel with address space randomization enabled.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not compile in the priv change based on the first translation;
look up the PTE at execution time. This is required for CF_PCREL,
where a page may be mapped multiple times with different attributes.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use PAGE_WRITE_INV to temporarily enable write permission
on for a given page, driven by PSW_X being set.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_B causes B,GATE to trap as an illegal instruction, removing our
previous sequential execution test that was merely an approximation.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_X is cleared after every instruction, and only set by RFI.
PSW_B is cleared after every non-branch, or branch not taken,
and only set by taken branches. We can clear both bits with a
single store, at most once per TB. Taken branches set PSW_B,
at most once per TB.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Generally, both of these bits are cleared at the end of each
instruction. By separating these, we will be able to clear
both with a single insn, instead of 2 or 3.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Print both raw IAQ_Front and IAQ_Back as well as the GVAs.
Print control registers in system mode.
Print floating point registers if CPU_DUMP_FPU.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As with loads and stores, code offsets are kept intact until the
full gva is formed. In qemu, this is in cpu_get_tb_cpu_state.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for CF_PCREL. store the iaoq_f in 3 parts: high
bits in cs_base, middle bits in pc, and low bits in priv.
For iaoq_b, set a bit for either of space or page differing,
else the page offset.
Install iaq entries before goto_tb. The change to not record
the full direct branch difference in TB means that we have to
store at least iaoq_b before goto_tb. But since a later change
to enable CF_PCREL will require both iaoq_f and iaoq_b to be
updated before goto_tb, go ahead and update both fields now.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allow an exception to be emitted at the end of the TranslationBlock,
leaving only the conditional branch inline. Use it for simple
exception instructions like break, which happen to be nullified.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we do not need to free tcg temporaries, the only
thing cond_free does is reset the condition to never.
Instead, simply write a new condition over the old, which
may be simply cond_make_f() for the never condition.
The do_*_cond functions do the right thing with c or cf == 0,
so there's no need for a special case anymore.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Define all of the context dependent field definitions.
Use FIELD_EX32 and FIELD_DP32 with named fields instead
of extract32 and deposit32 with raw constants.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use 'v' for a variable that needs copying, 't' for a temp that
doesn't need copying, and 'i' for an immediate, and use this
naming for both arguments of the comparison. So:
cond_make_tmp -> cond_make_tt
cond_make_0_tmp -> cond_make_ti
cond_make_0 -> cond_make_vi
cond_make -> cond_make_vv
Pass 0 explictly, rather than implicitly in the function name.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a first step in enabling CF_PCREL, but for now
we regenerate the absolute address before writeback.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrap offset and space together in one structure, ensuring
that they're copied together as required.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This simplifies callers, which might otherwise have
to make another copy.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using umax is clearer than the same operation using movcond.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows unification of BE, BLR, BV, BVE with a common helper.
Since we can now track space with IAQ_Next, we can now let the
TranslationBlock continue across the delay slot with BE, BVE.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add variable to track space changes to IAQ. So far, no such changes
are introduced, but the new checks vs ctx->iasq_b may eliminate an
unnecessary copy to cpu_iasq_f with e.g. BLR.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Minimize the amount of code in hppa_tr_translate_insn advancing the
insn queue for the next insn. Move the goto_tb path to hppa_tr_tb_stop.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We no longer have to allocate a temp and perform an
addition before translation of the rest of the insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of two separate cpu_iaoq_entry calls, use one call to update
both IAQ_Front and IAQ_Back. Simplify with an argument combination
that automatically handles a simple increment from Front to Back.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The generic tcg driver will have already checked for breakpoints.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Simplify the function by not attempting a conditional move
on the branch destination -- just use nullify_over normally.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a displacement instead of an absolute value.
In trans_be, remove the user-only do_dbranch case. The branch we are
attempting to optimize is to the zero page, which is perforce on a
different page than the code currently executing, which means that
we will *not* use a goto_tb. Use a plain indirect branch instead,
which is what we got out of the attempted direct branch anyway.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Share this check between gen_goto_tb and hppa_tr_translate_insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This function is for log_pc(), which needs to produce a
similar result to cpu_get_tb_cpu_state().
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ilen value extracted from ex_value is the length of the
EXECUTE instruction itself, and so is the increment to the pc.
However, the length of the synthetic insn is located in the
opcode like all other instructions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace translator_fake_ldb, which required multiple calls,
with translator_fake_ld, which can take all data at once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The routines in disas-common.c are also used from disas-mon.c.
Otherwise the rest of disassembly is only used from tcg.
While we're at it, put host and target code into separate files.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have eliminated most uses of this hook. Reduce
further by allowing the hook to handle only the
special cases, returning false for normal processing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Almost all of the disas_log implementations are identical.
Unify them within translator_loop.
Drop extra Priv/Virt logging from target/riscv.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't need to allocate plugin context at startup,
we can wait until we actually use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not pass around a boolean between multiple structures,
just read it from the TranslationBlock in the TCGContext.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the bytes that we record for the entire TB, rather than
a per-insn GByteArray. Record the length of the insn in
plugin_gen_insn_end rather than infering from the length
of the array.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Copy data out of a completed translation. This will be used
for both plugins and disassembly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of returning a host pointer, copy the data into
storage provided by the caller.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will be able to replace plugin_insn_append, and will
be usable for disassembly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not allow translation to proceed beyond one insn with mmio,
as we will not be caching the TranslationBlock.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reorg translator_access into translator_ld, with a more
memcpy-ish interface. If both pages are in ram, do not
go through the caller's slow path.
Assert that the access is within the two pages that we are
prepared to protect, per TranslationBlock. Allow access
prior to pc_first, so long as it is within the first page.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While there are other methods that could be used to replace
TARGET_PAGE_MASK, the function is not really required outside
the context of target-specific translation.
This makes the header usable by target independent code.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Fix the "tsan-build" CI job on the shared gitlab CI runners
* Bump minimum glib version and use URI code from the newer glib
* Fix error message from "configure" when C compiler is not working
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZDXcYRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVPJw//bK/NuMKOHlnwgowkQ/x41t8nc0jAR38+
# aMhJBTSB+9EOlPd+/y7+IeFlD9lS2JzoX/CWeBrNlKc6juWQahABJYcvscmdGiYr
# a/dUy9iZoqJyY220TMjCWYwORRtNqPDXaiUIR8hBZZBmW51xs1hRc3aazxPm6dOD
# cj1yFKwWGY5g72SkRNTMWi3qWX1tXNOh7sbuhWKkZ3eiRCllHb0RwrhA341ze4TI
# ckmlSA6stMjls4XNAIAKVdRKLPE1BsJ/UKxpnOEO3F640cbe69B0+z13wIBNfTOY
# Mk3zSjrdLY6thSY+2iOb2FLt3wC5QCBjyRluv+0kwdSnz6xsafEDWNx5VneZH+Iu
# ZQWLGvN4qUUBBqHKY8eWnrsij3ABXioHLK8eHj2JuHidcG15tku/1cwAJvy/8P/O
# iup0elZ3MXaAk6ce3dwYY4t6QecuzqX9cdJkTuRNlzysK1xKQdBiYTdeZikfUAoM
# InuFUh732yPXDSiZcG+uMXUTAJXHWASr7bvPydDx/gL1tYGYBqYepfPF2uWYfNwg
# VZRgsN6WVDBGPyXv8Z7eQ9lye5JoAGYrSDxZE87q8RwRV5holiYDxtf10zeLz3Wf
# RI5L/bb2eFSHzi3quzOC1uLflLqNKwq+9UZEjdLv2z8zuhwVwxbcDV9+ox6zA8zi
# dnVC3Yp/3ik=
# =VRHz
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 14 May 2024 02:49:10 PM CEST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-05-14' of https://gitlab.com/thuth/qemu:
util/uri: Remove the old URI parsing code
block/ssh: Use URI parsing code from glib
block/nfs: Use URI parsing code from glib
block/nbd: Use URI parsing code from glib
block/gluster: Use URI parsing code from glib
Remove glib compatibility code that is not required anymore
Bump minimum glib version to v2.66
gitlab: use 'setarch -R' to workaround tsan bug
gitlab: use $MAKE instead of 'make'
dockerfiles: add 'MAKE' env variable to remaining containers
configure: Fix error message when C compiler is not working
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
By default, SDL disables the screen saver which prevents the host from powering
down the screen even if the screen is locked. This results in draining the
battery needlessly when the host isn't connected to a wall charger. Fix that by
enabling the screen saver.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240512095945.1879-1-shentey@gmail.com>
Remove gtk_widget_get_scale_factor() usage from the calculation of
the motion events in the GTK backend to make it work correctly on
environments that have `gtk_widget_get_scale_factor() != 1`.
This scale factor usage had been introduced in the commit f14aab420c and
at that time the window size was used for calculating the things and it
was working correctly. However, in the commit 2f31663ed4 the logic
switched to use the widget size instead of window size and because of
the change the usage of scale factor becomes invalid (since widgets use
`vc->gfx.scale_{x, y}` for scaling).
Tested on Crostini on ChromeOS (15823.51.0) with an external display.
Fixes: 2f31663ed4 ("ui/gtk: use widget size for cursor motion event")
Fixes: f14aab420c ("ui: fix incorrect pointer position on highdpi with
gtk")
Signed-off-by: hikalium <hikalium@hikalium.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240512111435.30121-3-hikalium@hikalium.com>
This commit introduces utility functions for the creation and deallocation
of QemuDmaBuf instances. Additionally, it updates all relevant sections
of the codebase to utilize these new utility functions.
v7: remove prefix, "dpy_gl_" from all helpers
qemu_dmabuf_free() returns without doing anything if input is null
(Daniel P. Berrangé <berrange@redhat.com>)
call G_DEFINE_AUTOPTR_CLEANUP_FUNC for qemu_dmabuf_free()
(Daniel P. Berrangé <berrange@redhat.com>)
v8: Introduction of helpers was removed as those were already added
by the previous commit
v9: set dmabuf->allow_fences to 'true' when dmabuf is created in
virtio_gpu_create_dmabuf()/virtio-gpu-udmabuf.c
removed unnecessary spaces were accidently added in the patch,
'ui/console: Use qemu_dmabuf_new() a...'
v11: Calling qemu_dmabuf_close was removed as closing dmabuf->fd will be
done in qemu_dmabuf_free anyway.
(Daniel P. Berrangé <berrange@redhat.com>)
v12: --- Calling qemu_dmabuf_close separately as qemu_dmabuf_free doesn't
do it.
--- 'dmabuf' is now allocated space so it should be freed at the end of
dbus_scanout_texture
v13: --- Immediately free dmabuf after it is released to prevent possible
leaking of the ptr
(Marc-André Lureau <marcandre.lureau@redhat.com>)
--- Use g_autoptr macro to define *dmabuf for auto clean up instead of
calling qemu_dmabuf_free
(Marc-André Lureau <marcandre.lureau@redhat.com>)
v14: --- (vhost-user-gpu) Change qemu_dmabuf_free back to g_clear_pointer
as it was done because of some misunderstanding (v13).
--- (vhost-user-gpu) g->dmabuf[m->scanout_id] needs to be set to NULL
to prevent freed dmabuf to be accessed again in case if(fd==-1)break;
happens (before new dmabuf is allocated). Otherwise, it would cause
invalid memory access when the same function is executed. Also NULL
check should be done before qemu_dmabuf_close (it asserts dmabuf!=NULL.).
(Marc-André Lureau <marcandre.lureau@redhat.com>)
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-6-dongwon.kim@intel.com>
This commit updates all instances where fields within the QemuDmaBuf
struct are directly accessed, replacing them with calls to these new
helper functions.
v6: fix typos in helper names in ui/spice-display.c
v7: removed prefix, "dpy_gl_" from all helpers
v8: Introduction of helpers was removed as those were already added
by the previous commit
v11: -- Use new qemu_dmabuf_close() instead of close(qemu_dmabuf_get_fd()).
(Daniel P. Berrangé <berrange@redhat.com>)
-- Use new qemu_dmabuf_dup_fd() instead of dup(qemu_dmabuf_get_fd()).
(Daniel P. Berrangé <berrange@redhat.com>)
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-4-dongwon.kim@intel.com>
New header and source files are added for containing QemuDmaBuf struct
definition and newly introduced helpers for creating/freeing the struct
and accessing its data.
v10: Change the license type for both dmabuf.h and dmabuf.c from MIT to
GPL to be in line with QEMU's default license
v11: -- Added new helpers, qemu_dmabuf_close for closing dmabuf->fd,
qemu_dmabuf_dup_fd for duplicating dmabuf->fd
(Daniel P. Berrangé <berrange@redhat.com>)
-- Let qemu_dmabuf_fee to call qemu_dmabuf_close before freeing
the struct to make sure fd is closed.
(Daniel P. Berrangé <berrange@redhat.com>)
v12: Not closing fd in qemu_dmabuf_free because there are cases fd
should still be available even after the struct is destroyed
(e.g. virtio-gpu: res->dmabuf_fd).
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-3-dongwon.kim@intel.com>
Draw routine needs to be manually invoked in the next refresh
if there is a scanout blob from the guest. This is to prevent
a situation where there is a scheduled draw event but it won't
happen bacause the window is currently in inactive state
(minimized or tabified). If draw is not done for a long time,
gl_block timeout and/or fence timeout (on the guest) will happen
eventually.
v2: Use gd_gl_area_draw(vc) in gtk-gl-area.c
Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240426225059.3871283-1-dongwon.kim@intel.com>
Now that we switched all consumers of the URI code to use the URI
parsing functions from glib instead, we can remove our internal
URI parsing code since it is not used anymore.
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240418101056.302103-14-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter.
While we're at it, also emit a warning when encountering unknown
parameters in the URI, so that the users have a chance to detect
their typos or other mistakes.
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Message-ID: <20240418101056.302103-13-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter.
While we're at it, slightly rephrase one of the error messages:
Use "Invalid value..." instead of "Illegal value..." since the
latter rather sounds like the users were breaking a law here.
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240418101056.302103-12-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter. The g_uri_get_host() also takes care
of removing the square brackets from IPv6 addresses, so we can
drop that part of the QEMU code now, too.
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240418101056.302103-11-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since version 2.66, glib has useful URI parsing functions, too.
Use those instead of the QEMU-internal ones to be finally able
to get rid of the latter.
Since g_uri_get_path() returns a const pointer, we also need to
tweak the parameter of parse_volume_options() (where we use the
result of g_uri_get_path() as input).
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240418101056.302103-10-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Now that we dropped support for CentOS 8 and Ubuntu 20.04, we can
look into bumping the glib version to a new minimum for further
clean-ups. According to repology.org, available versions are:
CentOS Stream 9: 2.66.7
Debian 11: 2.66.8
Fedora 38: 2.74.1
Freebsd: 2.78.4
Homebrew: 2.80.0
Openbsd: 2.78.4
OpenSuse leap 15.5: 2.70.5
pkgsrc_current: 2.78.4
Ubuntu 22.04: 2.72.1
Thus it should be safe to bump the minimum glib version to 2.66 now.
Version 2.66 comes with new functions for URI parsing which will
allow further clean-ups in the following patches.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-8-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The TSAN job started failing when gitlab rolled out their latest
release. The root cause is a change in the Google COS version used
on shared runners. This brings a kernel running with
vm.mmap_rnd_bits = 31
which is incompatible with TSAN in LLVM < 18, which only supports
upto '28'. LLVM 18 can support upto '30', and failing that will
re-exec itself to turn off VA randomization.
Our LLVM is too old for now, but we can run with 'setarch -R make ..'
to turn off VA randomization ourselves.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240513111551.488088-4-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If you try to run the configure script on a system without a working
C compiler, you get a very misleading error message:
ERROR: Unrecognized host OS (uname -s reports 'Linux')
Some people already opened bug tickets because of this problem:
https://gitlab.com/qemu-project/qemu/-/issues/2057https://gitlab.com/qemu-project/qemu/-/issues/2288
We should rather tell the user that we were not able to use the C
compiler instead, otherwise they will have a hard time to figure
out what was going wrong.
While we're at it, let's also suppress the "unrecognized host CPU"
message in this case since it is rather misleading than helpful.
Fixes: 264b803721 ("configure: remove compiler sanity check")
Message-ID: <20240513114010.51608-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* target/i386: miscellaneous changes, mostly TCG-related
* fix --without-default-devices build
* fix --without-default-devices qtests on s390x and arm
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY+JWIUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOGmwf+JKY/i7ihXvfINQIRSKaz+H7KM3Br
# BGv/iXj4hrRA+zflcZswwoWmPrkrXM3J5JqGG6zTqqhGne+fRKt60KBFwn+lRaMY
# n48icR4zOSaEcGKBOFKs9CB1JgL7SWMe+fZ8d02amYlIZ005af0d69ACenF9r/oX
# pTxYIrR90FdZStbF4Yl0G5CzMLBdHZd/b6bMNmbefVPv3/d2zuL7VgqLX3y3J0ee
# ASYkYjn8Wpda4KX9s2rvH9ENXj80Q7EqhuDvoBlyK72/2lE5aTojbUiyGB4n5AuX
# 5OHA+0HEpuCXXToijOeDXD1NDOk9E5DP8cEwwZfZ2gjWKjja0U6OODGLVw==
# =woTe
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 May 2024 03:47:14 PM CEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (27 commits)
configs: disable emulators that require it if libfdt is not found
hw/xtensa: require libfdt
kconfig: express dependency of individual boards on libfdt
kconfig: allow compiling out QEMU device tree code per target
meson: move libfdt together with other dependencies
meson: pick libfdt from common_ss when building target-specific files
tests/qtest: arm: fix operation in a build without any boards or devices
i386: select correct components for no-board build
hw/i386: move rtc-reset-reinjection command out of hw/rtc
hw/i386: split x86.c in multiple parts
i386: pc: remove unnecessary MachineClass overrides
i386: correctly select code in hw/i386 that depends on other components
xen: register legacy backends via xen_backend_init
xen: initialize legacy backends from xen_bus_init()
tests/qtest: s390x: fix operation in a build without any boards or devices
s390x: select correct components for no-board build
s390: move css_migration_enabled from machine to css.c
s390_flic: add migration-enabled property
s390x: move s390_cpu_addr2state to target/s390x/sigp.c
sh4: select correct components for no-board build
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since boards can express their dependency on libfdt and
system/device_tree.c, only leave TARGET_NEED_FDT if the target has a
hard dependency.
Those emulators will be skipped if libfdt is disabled, or if it
is "auto" and not found and --disable-download is passed; unless
the target is mentioned explicitly in --target-list, in which case
the build will fail.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All other boards require libfdt if it can be used (including for example
i386/x86_64), so change the "imply" to "select" and always allow -dtb
in qemu-system-xtensa.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that boards are enabled by default and the "CONFIG_FOO=y"
entries are gone from configs/devices/, there cannot be any more
a conflicts between the default contents of configs/devices/
and a failed "depends on" clause.
With this change, each individual board or target can express
whether it needs FDT. It can then include the common code in the
build via "select DEVICE_TREE", which will also as tell meson to link
with libfdt.
This allows building non-microvm x86 emulators without having
libfdt available.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a new Kconfig symbol, CONFIG_DEVICE_TREE, that specifies whether
to include the common device tree code in system/device_tree.c and to
link to libfdt. For now, include it unconditionally if libfdt is
available.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the libfdt detection code together with other dependencies instead
of keeping it with subprojects. This has the disadvantage of performing
the detection even if no target requires libfdt; but it has the advantage
that Kconfig will be able to observe the availability of the library.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid having to list dependencies such as libfdt twice, both on common_ss
and specific_ss. Instead, just take all the dependencies in common_ss
and allow the target-specific libqemu-*.fa library to use them.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The local APIC is a part of the CPU and has callbacks that are invoked
from multiple accelerators.
The IOAPIC on the other hand is optional, but ioapic_eoi_broadcast is
used by common x86 code to implement the IOAPIC's implicit EOI mode.
Add a stub in case the IOAPIC device is not included but the APIC is.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The rtc-reset-reinjection QMP command is specific to x86, other boards do not
have the ACK tracking functionality that is needed for RTC interrupt
reinjection. Therefore the QMP command is only included in x86, but
qmp_rtc_reset_reinjection() is implemented by hw/rtc/mc146818rtc.c
and requires tracking of all created RTC devices. Move the implementation
to hw/i386, so that 1) it is available even if no RTC device exist
2) the only RTC that exists is easily found in x86ms->rtc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-12-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Keep the basic X86MachineState definition in x86.c. Move out functions that
are only needed by other files: x86-common.c for the pc and microvm machines,
x86-cpu.c for those used by accelerator code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
fw_cfg.c and vapic.c are currently included unconditionally but
depend on other components. vapic.c depends on the local APIC,
while fw_cfg.c includes a piece of AML builder code that depends
on CONFIG_ACPI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is okay to register legacy backends in the middle of xen_bus_init().
All that the registration does is record the existence of the backend
in xenstore.
This makes it possible to remove them from the build without introducing
undefined symbols in xen_be_init(). It also removes the need for the
backend_register callback, whose only purpose is to avoid registering
nonfunctional backends.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240509170044.190795-8-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prepare for moving the calls to xen_be_register() under the
control of xen_bus_init(), using the normal xen_backend_init()
method that is used by the "modern" backends.
This requires the xenstore global variable to be initialized,
which is done by xen_be_init(). To ensure that everything is
ready at the time the xen_backend_init() functions are called,
remove the xen_be_init() function from all the boards and
place it directly in xen_bus_init().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240509170044.190795-7-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do the bare minimum to ensure that at least a vanilla
--without-default-devices build works for all targets except i386,
x86_64 and ppc64. In particular this fixes s390x-softmmu; i386 and
x86_64 have about a dozen failing tests that do not pass -M and therefore
require a default machine type; ppc64 has the same issue, though only
with numa-test.
If we can for now ignore the cases where boards and devices are picked
by hand, drive_del-test however can be fixed easily; almost all tests
check for the virtio-blk or virtio-scsi device that they use, and are
already skipped. Only one didn't get the memo; plus another one does
not need a machine at all and can be run with -M none.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-6-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CSS subsystem uses global variables, just face the truth and use
a variable also for whether the CSS vmstate is in use; remove the
indirection of fetching it from the machine type, which makes the
TCG code depend unnecessarily on the virtio-ccw machine.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240509170044.190795-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function has no dependency on the virtio-ccw machine type, though it
assumes that the CPU address corresponds to the core_id and the index.
If there is any need of something different or more fancy (unlikely)
S390 can include a MachineClass subclass and implement it there. For
now, move it to sigp.c for simplicity.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ensure that they go through unmodified, instead of removing one layer
of quoting.
-D is a pretty specialized option and most options that can have spaces
do not need it (for example, c_args is covered by --extra-cflags).
Therefore it's unlikely that this causes actual trouble. However,
a somewhat realistic failure case would be with -Dpkg_config_path
and a pkg-config directory that contains spaces.
Cc: qemu-stable@nongnu.org
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VMX feature bit depends on general availability of WAITPKG,
not the other way round.
Fixes: 33cc88261c ("target/i386: add support for VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE", 2023-08-28)
Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are trivial to add, and moving them to the new decoder fixes some
corner cases: raising #UD instead of an instruction fetch page fault for
the undefined opcodes, and incorrectly rejecting 0F 18 prefetches with
register operands (which are treated as reserved NOPs).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to the manual, 32-bit vs 64-bit is governed by REX.W
and REX ignores the 0x66 prefix. This can be confirmed with this
program:
#include <stdio.h>
int main()
{
int x = 0x12340000;
int y;
asm("popcntl %1, %0" : "=r" (y) : "r" (x)); printf("%x\n", y);
asm("mov $-1, %0; .byte 0x66; popcntl %1, %0" : "+r" (y) : "r" (x)); printf("%x\n", y);
asm("mov $-1, %0; .byte 0x66; popcntq %q1, %q0" : "+r" (y) : "r" (x)); printf("%x\n", y);
}
which prints 5/ffff0000/5 on real hardware and 5/ffff0000/ffff0000
on QEMU.
Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PCOMMIT instruction was never included in any physical processor.
TCG implements it as a no-op instruction, but its utility is debatable
to say the least. Drop it from the decoder since it is only available
with "-cpu max", which does not guarantee migration compatibility
across versions, and deprecate the property just in case someone is
using it as "pcommit=off".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The old "-runas" option has the disadvantage that it is not visible
in the QAPI schema, so it is not available via the normal introspection
mechanisms. We've recently introduced the "-run-with" option for exactly
this purpose, which is meant to handle the options that affect the
runtime behavior. Thus let's introduce a "user=..." parameter here now
and deprecate the old "-runas" option.
Message-ID: <20240506112058.51446-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Retain a list of deprecated features disjoint from any particular
CPU model. A query-cpu-model-expansion reply will now provide a list of
properties (i.e. features) that are flagged as deprecated. Example:
{
"return": {
"model": {
"name": "z14.2-base",
"deprecated-props": [
"bpb",
"csske"
],
"props": {
"pfmfi": false,
"exrl": true,
...a lot more props...
"skey": false,
"vxpdeh2": false
}
}
}
}
It is recommended that s390 guests operate with these features
explicitly disabled to ensure compatibility with future hardware.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240429191059.11806-2-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
get_sclp_device() scans the whole machine to find a TYPE_SCLP object.
Now that the SCLPDevice instance is available under the machine state,
use it to simplify the lookup. While at it, remove the inline to let
the compiler decide on how to optimize.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240502131533.377719-4-clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
sclp_get_event_facility_bus() scans the whole machine to find a
TYPE_SCLP_EVENTS_BUS object. The SCLPDevice instance is now available
under the machine state, use it to simplify the lookup and adjust the
creation of the consoles.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240502131533.377719-3-clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Initialize directly SCLPDevice from the machine init handler and
remove s390_sclp_init(). We will use the SCLPDevice pointer later to
create the consoles.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240502131533.377719-2-clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The sclpconsole currently does not have a proper parent in the QOM
tree, so it shows up under /machine/unattached - which is somewhat
ugly. We should rather attach it to /machine/sclp/s390-sclp-event-facility
where the other devices of type TYPE_SCLP_EVENT already reside.
Message-ID: <20240430190843.453903-1-thuth@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
pull-loongarch-20240509
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZjyDAgAKCRBAov/yOSY+
# 33cfA/4jE0x+eLAT161caSwM3wBOfZRClfUhXdkxLP6GvWbACVQ8l0rEZiw2PuI8
# DFReU2gqs7wAfYKt7Yy62xXlCw1B3aSUzE45gS2TGIP1GqKBwigvpW4i1SgiOoMX
# 4TA+GG16KgR9zaxO48bjjyJ1epc7S3SxdAL09p2U08D9EdSwCA==
# =RLFu
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 May 2024 10:02:10 AM CEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240509' of https://gitlab.com/gaosong/qemu:
target/loongarch: Put cpucfg operation before CSR register
target/loongarch: Add TCG macro in structure CPUArchState
hw/loongarch: Refine default numa id calculation
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On Loongarch, cpucfg is register for cpu feature, some other registers
depend on cpucfg feature such as perf CSR registers. Here put cpucfg
read/write operations before CSR register, so that KVM knows how many
perf CSR registers are valid from pre-set cpucfg feature information.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240428031651.1354587-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
In structure CPUArchState some struct elements are only used in TCG
mode, and it is not used in KVM mode. Macro CONFIG_TCG is added to
make it simpiler in KVM mode, also there is the same modification
in c code when these structure elements are used.
When VM runs in KVM mode, TLB entries are not used and do not need
migrate. It is only useful when it runs in TCG mode.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240506011912.2108842-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
With numa_test test case, there is subcase named test_def_cpu_split(),
there are 8 sockets and 2 numa nodes. Here is command line:
"-machine smp.cpus=8,smp.sockets=8 -numa node,memdev=ram -numa node"
The required result is:
node 0 cpus: 0 2 4 6
node 1 cpus: 1 3 5 7
Test case numa_test fails on LoongArch, since the actual result is:
node 0 cpus: 0 1 2 3
node 1 cpus: 4 5 6 7
It will be better if all the cpus in one socket share the same numa
node. Here socket id is used to calculate numa id in function
virt_get_default_cpu_node_id().
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240319022606.2994565-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Implement IOCSR address space get functions for MIPS/Loongson CPUs.
For MIPS/Loongson without IOCSR (i.e. Loongson-3A1000), get_cpu_iocsr_as
will return as null, and send_ipi_data will fail with MEMTX_DECODE_ERROR,
which matches expected behavior on hardware.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240508-loongson3-ipi-v1-3-1a7b67704664@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Suspend function is emulated as what hardware actually do.
Doorbell register fields are updates to include suspend value,
suspend vector is encoded in firmware blob and fw_cfg is updated
to include S3 bits as what x86 did.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-ID: <20240508-loongson3v-suspend-v1-1-186725524a39@flygoat.com>
[PMD: Use g_memdup2(), constify suspend array]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Rename LoongArchMachineState with LoongArchVirtMachineState, and change
variable name LoongArchMachineState *lams with LoongArchVirtMachineState
*lvms.
Rename function specific for virtmachine loongarch_xxx()
with virt_xxx(). However some common functions keep unchanged such as
loongarch_acpi_setup()/loongarch_load_kernel(), since there functions
can be used for real hw boards.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240508031110.2507477-3-maobibo@loongson.cn>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
On LoongArch system, there is only virt machine type now, name
LOONGARCH_MACHINE is confused, rename it with LOONGARCH_VIRT_MACHINE.
Machine name about Other real hw boards can be added in future.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240508031110.2507477-2-maobibo@loongson.cn>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The 'ref405ep' machine and PPC 405 CPU have no known users, firmware
images are not available, OpenWRT dropped support in 2019, U-Boot in
2017, Linux also is dropping support in 2024. It is time to let go of
this ancient hardware and focus on newer CPUs and platforms.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20240507123332.641708-1-clg@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The function is inspired by pc_isa_bios_init() and should eventually replace it.
Using x86_isa_bios_init() rather than pc_isa_bios_init() fixes pflash commands
to work in the isa-bios region.
While at it convert the magic number 0x100000 (== 1MiB) to increase readability.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The function creates and leaks two MemoryRegion objects regarding the BIOS which
will be moved into X86MachineState in the next steps to avoid the leakage.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430150643.111976-3-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Given that memory_region_set_readonly() is a no-op when the readonlyness is
already as requested it is possible to simplify the pattern
if (condition) {
foo(true);
}
to
foo(condition);
which is shorter and allows to see the invariant of the code more easily.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430150643.111976-2-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The i440fx and the isapc machines can be used in binaries without
FDC, too. We just have to make sure that they don't try to instantiate
the FDC when it is not available.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240425184315.553329-4-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The q35 machine can be used without floppy disk controller (FDC),
but due to our current Kconfig setup, the FDC code is still always
included in the binary. To fix this, the "PC" config option should
only imply the "FDC_ISA" instead of always selecting it.
The i440fx and the isa-pc machine currently always instantiate
the FDC, so we have to add the select statements now there instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240425184315.553329-3-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The q35 machine can work without FDC. But to be able to also link
a QEMU binary that does not include the FDC code, we have to make
it possible to disable the spots that call into the FDC code.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240425184315.553329-2-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Instead of using a single global bounce buffer, give each AddressSpace
its own bounce buffer. The MapClient callback mechanism moves to
AddressSpace accordingly.
This is in preparation for generalizing bounce buffer handling further
to allow multiple bounce buffers, with a total allocation limit
configured per AddressSpace.
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-ID: <20240507094210.300566-2-mnissler@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMD: Split patch, part 2/2]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Propagate AddressSpace handler to following helpers:
- register_map_client()
- unregister_map_client()
- notify_map_clients[_locked]()
Rename them using 'address_space_' prefix instead of 'cpu_'.
The AddressSpace argument will be used in the next commit.
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-ID: <20240507094210.300566-2-mnissler@rivosinc.com>
[PMD: Split patch, part 1/2]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropber.id.au>
Message-Id: <20210903174510.751630-17-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210903174510.751630-12-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210903174510.751630-27-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210903174510.751630-6-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Peter missed the Sphinx HMP document for the "resume/-r" flag in commit
7a4da28b26 ("qmp: hmp: add migrate "resume" option"). Add it.
When at it, slightly cleanup the lines around:
- Move "detach/-d" to a separate section rather than appending it at the
end of the command description. Add a hint for how to query the migration
results in detached mode.
- Add "postcopy" keyword to "resume/-r" help messages, as it only applies
to postcopy.
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Cc: Fabiano Rosas <farosas@suse.de>
Fixes: 7a4da28b26 ("qmp: hmp: add migrate "resume" option")
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The fd: URI can currently trigger two different types of migration, a
TCP migration using sockets and a file migration using a plain
file. This is in conflict with the recently introduced (8.2) QMP
migrate API that takes structured data as JSON-like format. We cannot
keep the same backend for both types of migration because with the new
API the code is more tightly coupled to the type of transport. This
means a TCP migration must use the 'socket' transport and a file
migration must use the 'file' transport.
If we keep allowing fd: when using a file, this creates an issue when
the user converts the old-style (fd:) to the new style ("transport":
"socket") invocation because the file descriptor in question has
previously been allowed to be either a plain file or a socket.
To avoid creating too much confusion, we can simply deprecate the fd:
+ file usage, which is thought to be rarely used currently and instead
establish a 1:1 correspondence between fd: URI and socket transport,
and file: URI and file transport.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The 'compress' migration capability enables the old compression code
which has shown issues over the years and is thought to be less stable
and tested than the more recent multifd-based compression. The old
compression code has been deprecated in 8.2 and now is time to remove
it.
Deprecation commit 864128df46 ("migration: Deprecate old compression
method").
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The block migration has been considered obsolete since QEMU 8.2 in
favor of the more flexible storage migration provided by the
blockdev-mirror driver. Two releases have passed so now it's time to
remove it.
Deprecation commit 66db46ca83 ("migration: Deprecate block
migration").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The block migration is considered obsolete and has been deprecated in
8.2. Remove the migrate command option that enables it. This only
affects the QMP and HMP commands, the feature can still be accessed by
setting the migration 'block' capability. The whole feature will be
removed in a future patch.
Deprecation commit 8846b5bfca ("migration: migrate 'blk' command
option is deprecated.").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The block incremental option for block migration has been deprecated
in 8.2 in favor of using the block-mirror feature. Remove it now.
Deprecation commit 40101f320d ("migration: migrate 'inc' command
option is deprecated.").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The 'skipped' field of the MigrationStats struct has been deprecated
in 8.1. Time to remove it.
Deprecation commit 7b24d32634 ("migration: skipped field is really
obsolete.").
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Now we do set MIGRATION_FAILED state, but don't give a chance to
orchestrator to query migration state and get the error.
Let's provide a possibility for QMP-based orchestrators to get an error
like with outgoing migration.
For hmp_migrate_incoming(), let's enable the new behavior: HMP is not
and ABI, it's mostly intended to use by developer and it makes sense
not to stop the process.
For x-exit-preconfig, let's keep the old behavior:
- it's called from init(), so here we want to keep current behavior by
default
- it does exit on error by itself as well
So, if we want to change the behavior of x-exit-preconfig, it should be
another patch.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Unify error reporting in the function. This simplifies the following
commit, which will not-exit-on-error behavior variant to the function.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
It's bad idea to leave critical section with error object freed, but
s->error still set, this theoretically may lead to use-after-free
crash. Let's avoid it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Make call to migration_incoming_state_destroy(), instead of doing only
partial of it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
migration/ram.c: API Conversion qemu_mutex_lock(),
and qemu_mutex_unlock() to WITH_QEMU_LOCK_GUARD macro
Signed-off-by: Will Gyda <vilhelmgyda@gmail.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
* target/i386/tcg: conversion of one byte opcodes to table-based decoder
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY5z/QUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1YQf/WMAoB/lR31fzu/Uh36hF1Ke/NHNU
# gefqKRAol6xJXxavKH8ym9QMlCTzrCLVt0e8RalZH76gLqYOjRhSLSSL+gUo5HEo
# lsGSfkDAH2pHO0ZjQUkXcjJQQKkH+4+Et8xtyPc0qmq4uT1pqQZRgOeI/X/DIFNb
# sMoKaRKfj+dB7TSp3qCSOp77RqL13f4QTP8mUQ4XIfzDDXdTX5n8WNLnyEIKjoar
# ge4U6/KHjM35hAjCG9Av/zYQx0E084r2N2OEy0ESYNwswFZ8XYzTuL4SatN/Otf3
# F6eQZ7Q7n6lQbTA+k3J/jR9dxiSqVzFQnL1ePGoe9483UnxVavoWd0PSgw==
# =jCyB
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 11:53:40 PM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (26 commits)
target/i386: remove duplicate prefix decoding
target/i386: split legacy decoder into a separate function
target/i386: decode x87 instructions in a separate function
target/i386: remove now-converted opcodes from old decoder
target/i386: port extensions of one-byte opcodes to new decoder
target/i386: move BSWAP to new decoder
target/i386: move remaining conditional operations to new decoder
target/i386: merge and enlarge a few ranges for call to disas_insn_new
target/i386: move C0-FF opcodes to new decoder (except for x87)
target/i386: generalize gen_movl_seg_T0
target/i386: move 60-BF opcodes to new decoder
target/i386: allow instructions with more than one immediate
target/i386: extract gen_far_call/jmp, reordering temporaries
target/i386: move 00-5F opcodes to new decoder
target/i386: reintroduce debugging mechanism
target/i386: cleanup *gen_eob*
target/i386: clarify the "reg" argument of functions returning CCPrepare
target/i386: do not use s->T0 and s->T1 as scratch registers for CCPrepare
target/i386: extend cc_* when using them to compute flags
target/i386: pull cc_op update to callers of gen_jmp_rel{,_csize}
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that a bulk of opcodes go through the new decoder, it is sensible
to do some cleanup. Go immediately through disas_insn_new and only jump
back after parsing the prefixes.
disas_insn() now only contains the three sigsetjmp cases, and they
are more easily managed if they are inlined into i386_tr_translate_insn.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split the bits that have some duplication with disas_insn_new, from
those that should be the main topic of the conversion. This is the
first step towards removing duplicate decoding of prefixes between
disas_insn and disas_insn_new.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are unlikely to be converted to the table-based decoding
soon (perhaps there could be generic ESC decoding in decode-new.c.inc
for the Mod/RM byte, but not operand decoding), so keep them separate
from the remaining legacy-decoded instructions.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Send all converted opcodes to disas_insn_new() directly from the big
decoding switch statement; once more, the debugging/bisecting logic
disappears.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A few two-byte opcodes are simple extensions of existing one-byte opcodes;
they are easy to decode and need no change to emit.c.inc. Port them to
the new decoder.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move long-displacement Jcc, SETcc and CMOVcc to the new decoder.
While filling in the tables makes the code seem longer, the new
emitters are all just one line of code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since new opcodes are not going to be added in translate.c, round the
case labels that call to disas_insn_new(), including whole sets of
eight opcodes when possible.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The shift instructions are rewritten instead of reusing code from the old
decoder. Rotates use CC_OP_ADCOX more extensively and generally rely
more on the optimizer, so that the code generators are shared between
the immediate-count and variable-count cases.
In particular, this makes gen_RCL and gen_RCR pretty efficient for the
count == 1 case, which becomes (apart from a few extra movs) something like:
(compute_cc_all if needed)
// save old value for OF calculation
mov cc_src2, T0
// the bulk of RCL is just this!
deposit T0, cc_src, T0, 1, TARGET_LONG_BITS - 1
// compute carry
shr cc_dst, cc_src2, length - 1
and cc_dst, cc_dst, 1
// compute overflow
xor cc_src2, cc_src2, T0
extract cc_src2, cc_src2, length - 1, 1
32-bit MUL and IMUL are also slightly more efficient on 64-bit hosts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the new decoder it is sometimes easier to put the segment
in T1 instead of T0, usually because another operand was loaded
by common code in T0. Genrealize gen_movl_seg_T0 to allow
using any source.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compared to the old decoder, the main differences in translation
are for the little-used ARPL instruction. IMUL is adjusted a bit
to share more code to produce flags, but is otherwise very similar.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While keeping decode->immediate for convenience and for 4-operand instructions,
store the immediate in X86DecodedOp as well. This enables instructions
with more than one immediate such as ENTER. It can also be used for far
calls and jumps.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extract the code into new functions, and swap T0/T1 so that T0 corresponds
to the first immediate in the instruction stream.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create a new wrapper for syscall/sysret, and do not go through multiple
layers of wrappers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of using s->T0 or s->T1, create a scratch register
when computing the C, NC, L or LE conditions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of using s->tmp0 or s->tmp4 as the result, just extend the cc_*
registers in place. It is harmless and, if multiple setcc instructions
are used, the optimizer will be able to remove the redundant ones.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_update_cc_op must be called before control flow splits. Doing it
in gen_jmp_rel{,_csize} may hide bugs, instead assert that cc_op is
clean---even if that means a few more calls to gen_update_cc_op().
With this new invariant, setting cc_op to CC_OP_DYNAMIC is unnecessary
since the caller should have done it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_update_cc_op must be called before control flow splits. Do it
where the jump on ECX!=0 is translated.
On the other hand, remove the call before gen_jcc1, which takes care of
it already, and explain why REPZ/REPNZ need not use CC_OP_DYNAMIC---the
translation block ends before any control-flow-dependent cc_op could
be observed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Resetting cc_op to CC_OP_DYNAMIC should be done at control flow junctions,
which is not the case here. This translation block is ending and the
only effect of calling set_cc_op() would be a discard of s->cc_srcT.
This discard is useless (it's a temporary, not a global) and in fact
prevents gen_prepare_cc from returning s->cc_srcT.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With the introduction of TSTEQ and TSTNE the .mask field is always -1,
so remove all the now-unnecessary code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new conditions obviously come in handy when testing individual bits
of EFLAGS, and they make it possible to remove the .mask field of
CCPrepare.
Lowering to shift+and is done by the optimizer if necessary.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When testing the sign bit or equality to zero of a partial register, it
is useful to use a single TSTEQ or TSTNE operation. It can also be used
to test the parity flag, using bit 0 of the population count.
Do not do this for target_ulong-sized values however; the optimizer would
produce a comparison against zero anyway, and it avoids shifts by 64
which are undefined behavior.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Observed the following failure while booting the SEV-SNP guest and the
guest fails to boot with the smp parameters:
"-smp 192,sockets=1,dies=12,cores=8,threads=2".
qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 'Invalid parameter'
qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e, index: 0x0.
provided: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000b00, edx: 0x00000000
expected: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000300, edx: 0x00000000
qemu-system-x86_64: SEV-SNP: failed update CPUID page
Reason for the failure is due to overflowing of bits used for "Node per
processor" in CPUID Fn8000001E_ECX. This field's width is 3 bits wide and
can hold maximum value 0x7. With dies=12 (0xB), it overflows and spills
over into the reserved bits. In the case of SEV-SNP, this causes CPUID
enforcement failure and guest fails to boot.
The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
=================================================================
Bits Description
31:11 Reserved.
10:8 NodesPerProcessor: Node per processor. Read-only.
ValidValues:
Value Description
0h 1 node per processor.
7h-1h Reserved.
7:0 NodeId: Node ID. Read-only. Reset: Fixed,XXh.
=================================================================
As in the spec, the valid value for "node per processor" is 0 and rest
are reserved.
Looking back at the history of decoding of CPUID_Fn8000001E_ECX, noticed
that there were cases where "node per processor" can be more than 1. It
is valid only for pre-F17h (pre-EPYC) architectures. For EPYC or later
CPUs, the linux kernel does not use this information to build the L3
topology.
Also noted that the CPUID Function 0x8000001E_ECX is available only when
TOPOEXT feature is enabled. This feature is enabled only for EPYC(F17h)
or later processors. So, previous generation of processors do not not
enumerate 0x8000001E_ECX leaf.
There could be some corner cases where the older guests could enable the
TOPOEXT feature by running with -cpu host, in which case legacy guests
might notice the topology change. To address those cases introduced a
new CPU property "legacy-multi-node". It will be true for older machine
types to maintain compatibility. By default, it will be false, so new
decoding will be used going forward.
The documentation is taken from Preliminary Processor Programming
Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
Rev 0.25 - Oct 6, 2022.
Cc: qemu-stable@nongnu.org
Fixes: 31ada106d8 ("Simplify CPUID_8000_001E for AMD")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-ID: <0ee4b0a8293188a53970a2b0e4f4ef713425055e.1714757834.git.babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have one job to build user binaries and one job for system.
Disable tools and docs in the user job, and disable building
the user binaries in the system job.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The host does not have the correct libraries installed for static pie,
which causes host/guest address space interference for some tests.
There's no real gain from linking statically, so drop it.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Match the extra inserts of INDEX_op_insn_start, fixing
the db->num_insns != 1 assert in translator_loop.
Fixes: dcd092a063 ("accel/tcg: Improve can_do_io management")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Record the fact that we've found a breakpoint on the page
in which a TranslationBlock is running.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If we can show that high bits of an input are zero,
then we may optimize away some comparisons.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This may be treated as a 32-bit EQ/NE comparison against 0,
which is in turn treated as a LTU/GEU comparison against 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The x86 isa does not have this operation, so we need an expansion.
Use the same algorithm that we use for expanding this vector
operation with integers: perform the shift with a wider type
and then mask the bits that must be zero.
This reduces the instruction count from 5 to 2.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu-sparc queue
# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmY4wZceHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIftQsH+wfIWymTdQMowfM6
# Ze/T8KODn+MqU5eg25VPSTojnmr7LFaCj2yK6zWX61RwIqtMc3NaxX0G7ksW12/g
# 35ACqiEEd5WRDhAtVhj5Wp+WEDoR4AD3LWIaN7a/qjO3qb78l7Bujw3qXzGSq4lQ
# hST6dTgMwn5LhJOyz+5dORVUK1UZSBuDxHeKRHgdoFi6yqGQ5bao5TpaDYOnGSbx
# 8KPrAFfXG1T6xRS8Ih5HXAPE5VJztLFPiVtCTTrETDP/o8EzvOZj5y/nJVZXXC3N
# 57g+QyJX9EdrRZvobef4LnNnoZyiqG+uQNugglqZqjiiLjl6AzYxI+ed0hU+cZR9
# pz76Hr8=
# =i2cV
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 04:40:07 AM PDT
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
* tag 'qemu-sparc-20240506' of https://github.com/mcayland/qemu:
target/sparc: Split out do_ms16b
target/sparc: Fix FPMERGE
target/sparc: Fix FMULD8*X16
target/sparc: Fix FMUL8x16A{U,L}
target/sparc: Fix FMUL8x16
target/sparc: Fix FEXPAND
linux-user/sparc: Add more hwcap bits for sparc64
hw/sparc64: set iommu_platform=on for virtio devices attached to the sun4u machine
docs/about: Deprecate the old "UltraSparc" CPU names that contain a "+"
docs/system/target-sparc: Improve the Sparc documentation
target/sparc/cpu: Avoid spaces by default in the CPU names
target/sparc/cpu: Rename the CPU models with a "+" in their names
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Accelerator patches
- Extract page-protection definitions to page-protection.h
- Rework in accel/tcg in preparation of extracting TCG fields from CPUState
- More uses of get_task_state() in user emulation
- Xen refactors in preparation for adding multiple map caches (Juergen & Edgar)
- MAINTAINERS updates (Aleksandar and Bin)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmY40CAACgkQ4+MsLN6t
# wN5drxAA1oIsuUzpAJmlMIxZwlzbICiuexgn/HH9DwWNlrarKo7V1l4YB8jd9WOg
# IKuj7c39kJKsDEB8BXApYwcly+l7DYdnAAI8Z7a+eN+ffKNl/0XBaLjsGf58RNwY
# fb39/cXWI9ZxKxsHMSyjpiu68gOGvZ5JJqa30Fr+eOGuug9Fn/fOe1zC6l/dMagy
# Dnym72stpD+hcsN5sVwohTBIk+7g9og1O/ctRx6Q3ZCOPz4p0+JNf8VUu43/reaR
# 294yRK++JrSMhOVFRzP+FH1G25NxiOrVCFXZsUTYU+qPDtdiKtjH1keI/sk7rwZ7
# U573lesl7ewQFf1PvMdaVf0TrQyOe6kUGr9Mn2k8+KgjYRAjTAQk8V4Ric/+xXSU
# 0rd7Cz7lyQ8jm0DoOElROv+lTDQs4dvm3BopF3Bojo4xHLHd3SFhROVPG4tvGQ3H
# 72Q5UPR2Jr2QZKiImvPceUOg0z5XxoN6KRUkSEpMFOiTRkbwnrH59z/qPijUpe6v
# 8l5IlI9GjwkL7pcRensp1VC6e9KC7F5Od1J/2RLDw3UQllMQXqVw2bxD3CEtDRJL
# QSZoS4d1jUCW4iAYdqh/8+2cOIPiCJ4ai5u7lSdjrIJkRErm32FV/pQLZauoHlT5
# eTPUgzDoRXVgI1X1slTpVXlEEvRNbhZqSkYLkXr80MLn5hTafo0=
# =3Qkg
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 05:42:08 AM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'accel-20240506' of https://github.com/philmd/qemu: (28 commits)
MAINTAINERS: Update my email address
MAINTAINERS: Update Aleksandar Rikalo email
system: Pass RAM MemoryRegion and is_write in xen_map_cache()
xen: mapcache: Break out xen_map_cache_init_single()
xen: mapcache: Break out xen_invalidate_map_cache_single()
xen: mapcache: Refactor xen_invalidate_map_cache_entry_unlocked
xen: mapcache: Refactor xen_replace_cache_entry_unlocked
xen: mapcache: Break out xen_ram_addr_from_mapcache_single
xen: mapcache: Refactor xen_remap_bucket for multi-instance
xen: mapcache: Refactor xen_map_cache for multi-instance
xen: mapcache: Refactor lock functions for multi-instance
xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry
system: let qemu_map_ram_ptr() use qemu_ram_ptr_length()
user: Use get_task_state() helper
user: Declare get_task_state() once in 'accel/tcg/vcpu-state.h'
user: Forward declare TaskState type definition
accel/tcg: Move @plugin_mem_cbs from CPUState to CPUNegativeOffsetState
accel/tcg: Restrict cpu_plugin_mem_cbs_enabled() to TCG
accel/tcg: Restrict qemu_plugin_vcpu_exit_hook() to TCG plugins
accel/tcg: Update CPUNegativeOffsetState::can_do_io field documentation
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target/i386: Introduce SapphireRapids-v3 to add missing features
* switch boards to "default y"
* allow building emulators without any board
* configs: list "implied" device groups in the default configs
* remove unnecessary declarations from typedefs.h
* target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY1ILsUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNtIwf+MEehq2HudZvsK1M8FrvNmkB/AssO
# x4tqL8DlTus23mQDBu9+rANTB93ManJdK9ybtf6NfjEwK+R8RJslLVnuy/qT+aQX
# PD208L88fjZg17G8uyawwvD1VmqWzHFSN14ShmKzqB2yPXXo/1cJ30w78DbD50yC
# 6rw/xbC5j195CwE2u8eBcIyY4Hh2PUYEE4uyHbYVr57cMjfmmA5Pg4I4FJrpLrF3
# eM2Avl/4pIbsW3zxXVB8QbAkgypxZErk3teDK1AkPJnlnBYM1jGKbt/GdKe7vcHR
# V/o+7NlcbS3oHVItQ2gP3m91stjFq+NhixaZpa0VlmuqayBa3xNGl0G6OQ==
# =ZbNW
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 May 2024 10:36:59 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits)
qga/commands-posix: fix typo in qmp_guest_set_user_password
migration: do not include coroutine_int.h
kvm: move target-dependent interrupt routing out of kvm-all.c
pci: remove some types from typedefs.h
tcg: remove CPU* types from typedefs.h
display: remove GraphicHwOps from typedefs.h
qapi/machine: remove types from typedefs.h
monitor: remove MonitorDef from typedefs.h
migration: remove PostcopyDiscardState from typedefs.h
lockable: remove QemuLockable from typedefs.h
intc: remove PICCommonState from typedefs.h
qemu-option: remove QemuOpt from typedefs.h
net: remove AnnounceTimer from typedefs.h
numa: remove types from typedefs.h
qdev-core: remove DeviceListener from typedefs.h
fw_cfg: remove useless declarations from typedefs.h
build: do not build virtio-vga-gl if virgl/opengl not available
bitmap: Use g_try_new0/g_new0/g_renew
target/i386: Introduce SapphireRapids-v3 to add missing features
docs: document new convention for Kconfig board symbols
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Short-circuit for packets with r/w and no overlap
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEPWaq5HRZSCTIjOD4GlSvuOVkbDIFAmY4FR8ACgkQGlSvuOVk
# bDLEfxAAup6v9J4n2/q88FXfLGgx1EfZrT01gOM/48mwngNNQJGJQySe2GLl0G8S
# 1hx/Ym3jbikic8HL80v8FyCr4gNRshEY7xKpCfvY9lsgnCRbhEvoV/hZqucmLQAt
# 1SIhFSsi5h8gyZDTvXhH75v3qGvYjQ7fQBhy2JbRsPjthdHBh9xi6Na60wlqfNZq
# oGsVtY7sv1uHsvDKBi3JoXWckSK99R38BHY6zPoStarRZACkkLdX6KHxeX88TUt1
# whIUYUS/K0nRVxzekdq/+m8UJYrXnW/0cliM5mLFHDGlsV+qjdcIRrfaPWBO0eFN
# kXeZU2BWLCdP2M52FHI4FllnIRpX5OGkxjR6x8Pc9r+EGciwGRU7xeAlqBxKQSZP
# e3oXtV6oKxg69xBgHE5HcKbt6bX5EZR/sUcbAoGA41UssaiMyj3wbg1cy2UxXu2J
# 7oJyywJUggWGSoCIIJJ95YgpUrIg73Yg6pOjfhKW1w/V2SuQPGG0XTXrwe7J6uGi
# VAqyu55p2oiW8Gk4Lvl1SfWgxkVeZa/NcxTmXNEWFnT7vatqwez0O5pxIkxdSCFE
# lRv7PuFT5nhQ/gg12zGqqRiOrMOMQitHFzJ9sUNu7J4Y7W5R4gzRW19ucojLt0lH
# fT83Ra+Eex1Cu3DsuvWkokxFikxXP1Ll297Jr1JhOPewTtvlxvI=
# =Q8/k
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 05 May 2024 04:24:15 PM PDT
# gpg: using RSA key 3D66AAE474594824C88CE0F81A54AFB8E5646C32
# gpg: Good signature from "Brian Cain (QUIC) <quic_bcain@quicinc.com>" [unknown]
# gpg: aka "Brian Cain <bcain@kernel.org>" [unknown]
# gpg: aka "Brian Cain (QuIC) <bcain@quicinc.com>" [unknown]
# gpg: aka "Brian Cain (CAF) <bcain@codeaurora.org>" [unknown]
# gpg: aka "bcain" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6350 20F9 67A7 7164 79EF 49E0 175C 464E 541B 6D47
# Subkey fingerprint: 3D66 AAE4 7459 4824 C88C E0F8 1A54 AFB8 E564 6C32
* tag 'pull-hex-20240505' of https://github.com/quic/qemu:
Hexagon (target/hexagon) Remove hex_common.read_attribs_file
Hexagon (target/hexagon) Remove gen_shortcode.py
Hexagon (target/hexagon) Remove gen_op_regs.py
Hexagon (target/hexagon) Remove uses of op_regs_generated.h.inc
Hexagon (tests/tcg/hexagon) Test HVX .new read from high half of pair
Hexagon (target/hexagon) Mark has_pred_dest in trans functions
Hexagon (target/hexagon) Mark dest_idx in trans functions
Hexagon (target/hexagon) Mark new_read_idx in trans functions
Hexagon (target/hexagon) Add is_old/is_new to Register class
Hexagon (target/hexagon) Only pass env to generated helper when needed
Hexagon (target/hexagon) Pass SP explicitly to helpers that need it
Hexagon (target/hexagon) Pass P0 explicitly to helpers that need it
Hexagon (target/hexagon) Enable more short-circuit packets (HVX)
Hexagon (target/hexagon) Enable more short-circuit packets (scalar core)
Hexagon (target/hexagon) Analyze reads before writes
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add MapCache argument to xen_replace_cache_entry_unlocked in
preparation for supporting multiple map caches.
No functional change.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430164939.925307-8-edgar.iglesias@gmail.com>
[PMD: Remove last global mapcache pointer, reported by sstabellini]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
While each user emulation implentation defines its own
TaskState structure, both use the same get_task_state()
declaration, in particular in common code (such gdbstub).
Declare the method once in "accel/tcg/vcpu-state.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240428221450.26460-10-philmd@linaro.org>
For union types, the tag member is known only after .check().
We used to code this in a simple way: QAPISchemaVariants attribute
.tag_member was None for union types until .check().
Since this complicated typing, recent commit "qapi/schema: fix typing
for QAPISchemaVariants.tag_member" hid it behind a property.
The previous commit lets us treat .tag_member just like the other
attributes that become known only in .check(): declare, but don't
initialize it in .__init__().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
QAPISchemaVariants.check()'s code is almost entirely conditional on
union vs. alternate type.
Move the conditional code to QAPISchemaBranches.check() and
QAPISchemaAlternatives.check(), where the conditions are always
satisfied.
Attribute QAPISchemaVariants.tag_name is now only used by
QAPISchemaBranches. Move it there.
Refactor the three types' .__init__() to make them a bit simpler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A previous commit narrowed the type of
QAPISchemaAlternateType.variants from QAPISchemaVariants to
QAPISchemaAlternatives. Rename it to .alternatives.
Same for .__init__() parameter @variants.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A previous commit narrowed the type of QAPISchemaObjectType.variants
from QAPISchemaVariants to QAPISchemaBranches. Rename it to
.branches.
Same for .__init__() parameter @variants.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A previous commit narrowed the type of .visit_alternate_type()
parameter @variants from QAPISchemaVariants to QAPISchemaAlternatives.
Rename it to @alternatives.
One of them passes @alternatives to helper function
gen_visit_alternate(). Rename its @variants parameter to
@alternatives as well.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The previous commit narrowed the type of .visit_object_type()
parameter @variants from QAPISchemaVariants to QAPISchemaBranches.
Rename it to @branches.
Same for .visit_object_type_flat().
A few of these pass @branches to helper functions:
QAPISchemaGenRSTVisitor.visit_object_type() to ._nodes_for_members()
and ._nodes_for_variant_when(), and
QAPISchemaGenVisitVisitor.visit_object_type() to
gen_visit_object_members(). Rename the helpers' @variants parameters
to @branches as well.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
QAPISchemaVariants represents either a union type's branches, or an
alternate type's alternatives. Much of its code is conditional on
which one it actually is.
Create QAPISchemaBranches for branches, and QAPISchemaAlternatives for
alternatives, both subtypes of QAPISchemaVariants.
Replace QAPISchemaVariants by one of them where possible. Keep it
only where we actually deal with either of them.
QAPISchemaVariants.__init__() takes @tag_name and @tag_member, where
exactly one must be None: @tag_name for alternatives, @tag_member for
branches. Let QAPISchemaBranches.__init__() take just @tag_name, and
QAPISchemaAlternatives.__init__() take just @tag_member.
A later patch will move the conditional code to the subtypes.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu_plugin_vcpu_exit_hook() is specific to TCG plugins,
so must be restricted to it in cpu_common_unrealizefn(),
similarly to how qemu_plugin_create_vcpu_state() is
restricted in the cpu_common_realizefn() counterpart.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240429213050.55177-2-philmd@linaro.org>
Extract page-protection definitions from "exec/cpu-all.h"
to "exec/page-protection.h".
The list of files requiring the new header was generated
using:
$ git grep -wE \
'PAGE_(READ|WRITE|EXEC|RWX|VALID|ANON|RESERVED|TARGET_.|PASSTHROUGH)'
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240427155714.53669-3-philmd@linaro.org>
Currently, we pass env to every generated helper. When the semantics of
the instruction only depend on the arguments, this is unnecessary and
adds extra overhead to the helper call.
We add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that don't get
the ptr to env.
The A2_nop and SA1_setin1 instructions end up with no arguments. This
results in a "old-style function definition" error from the compiler, so
we write overrides for them.
With this change, the number of helpers with env argument is
idef-parser enabled: 329 total, 23 with env
idef-parser disabled: 1543 total, 550 with env
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240214042726.19290-4-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
We divide gen_analyze_funcs.py into 3 phases
Declare the operands
Analyze the register reads
Analyze the register writes
We also create special versions of ctx_log_*_read for new operands
Check that the operand is written before the read
This is a precursor to improving the analysis for short-circuiting
the packet semantics in a subsequent commit
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240201103340.119081-2-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
The sun4u machine has an IOMMU and therefore it is possible to program it such
that the virtio-device IOVA does not map directly to the CPU physical address.
This is not a problem with Linux which always maps the IOVA directly to the CPU
physical address, however it is required for the NetBSD virtio driver where this
is not the case.
Set the sun4u machine defaults for all virtio devices so that disable-legacy=on
and iommu_platform=on to ensure a default configuration will allow virtio
devices to function correctly on both Linux and NetBSD.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20240418205730.31396-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The output of "-cpu help" is currently rather confusing to the users:
It might not be fully clear which part of the output defines the CPU
names since the CPU names contain white spaces (which we later have to
convert into dashes internally). At best it's at least a nuisance since
the users might need to specify the CPU names with quoting on the command
line if they are not aware of the fact that the CPU names could be written
with dashes instead. So let's finally clean up this mess by using dashes
instead of white spaces for the CPU names, like we're doing it internally
later (and like we're doing it in most other targets of QEMU).
Note that it is still possible to pass the CPU names with spaces to the
"-cpu" option, since sparc_cpu_type_name() still translates those to "-".
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2141
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240419084812.504779-3-thuth@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Commit b447378e12 ("qom/object: Limit type names to alphanumerical ...")
cut down the amount of allowed characters for QOM types to a saner set.
The "+" character was meant to be included in this set, so we had to
add a hack there to still allow the legacy names of POWER and Sparc64
CPUs. However, instead of putting such a hack in the common QOM code,
there is a much better place to do this: The sparc_cpu_class_by_name()
function which is used to look up the names of all Sparc CPUs.
Thus let's finally get rid of the "+" in the Sparc CPU names, and provide
backward compatibility for the old names via some simple checks in the
sparc_cpu_class_by_name() function.
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240419084812.504779-2-thuth@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Richard Henderson explained on IRC:
bcond_internal() used to insist that both branch
destination and branch fallthrough are use_goto_tb;
if not, we'd use movcond to compute an indirect jump.
But it's perfectly fine for e.g. the branch fallthrough
to use_goto_tb, and the branch destination to use
an indirect branch.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424234436.995410-4-richard.henderson@linaro.org>
[PMD: Split bigger patch, part 4/5]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240503072014.24751-7-philmd@linaro.org>
qga/commands-posix.c does not compile on FreeBSD due to a confusion
between "chpasswdata" (wrong) and "chpasswddata" (used in the #else
branch).
Fixes: 0e5b75a390 ("qga/commands-posix: qmp_guest_set_user_password: use ga_run_command helper")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We only support the most recent two versions of macOS (currently
macOS 13 Ventura and macOS 14 Sonoma), and our ui/cocoa.m code
already assumes at least macOS 12 Monterey or better, because it uses
NSScreen safeAreaInsets, which is 12.0-or-newer.
Remove the ifdefs that were providing backwards compatibility for
building on 10.12 and earlier versions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240502142904.62644-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Benchmark each acceleration function vs an aligned buffer of zeros.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because non-embedded aarch64 is expected to have AdvSIMD enabled, merely
double-check with the compiler flags for __ARM_NEON and don't bother with
a runtime check. Otherwise, model the loop after the x86 SSE2 function.
Use UMAXV for the vector reduction. This is 3 cycles on cortex-a76 and
2 cycles on neoverse-n1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because the three alternatives are monotonic, we don't need
to keep a couple of bitmasks, just identify the strongest
alternative at startup.
Generalize test_buffer_is_zero_next_accel and init_accel
by always defining an accel_table array.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split less-than and greater-than 256 cases.
Use unaligned accesses for head and tail.
Avoid using out-of-bounds pointers in loop boundary conditions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Increase unroll factor in SIMD loops from 4x to 8x in order to move
their bottlenecks from ALU port contention to load issue rate (two loads
per cycle on popular x86 implementations).
Avoid using out-of-bounds pointers in loop boundary conditions.
Follow SSE2 implementation strategy in the AVX2 variant. Avoid use of
PTEST, which is not profitable there (like in the removed SSE4 variant).
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-6-amonakov@ispras.ru>
Use of prefetching in bufferiszero.c is quite questionable:
- prefetches are issued just a few CPU cycles before the corresponding
line would be hit by demand loads;
- they are done for simple access patterns, i.e. where hardware
prefetchers can perform better;
- they compete for load ports in loops that should be limited by load
port throughput rather than ALU throughput.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-5-amonakov@ispras.ru>
Test for length >= 256 inline, where is is often a constant.
Before calling into the accelerated routine, sample three bytes
from the buffer, which handles most non-zero buffers.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Message-Id: <20240206204809.9859-3-amonakov@ispras.ru>
[rth: Use __builtin_constant_p; move the indirect call out of line.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The SSE4.1 variant is virtually identical to the SSE2 variant, except
for using 'PTEST+JNZ' in place of 'PCMPEQB+PMOVMSKB+CMP+JNE' for testing
if an SSE register is all zeroes. The PTEST instruction decodes to two
uops, so it can be handled only by the complex decoder, and since
CMP+JNE are macro-fused, both sequences decode to three uops. The uops
comprising the PTEST instruction dispatch to p0 and p5 on Intel CPUs, so
PCMPEQB+PMOVMSKB is comparatively more flexible from dispatch
standpoint.
Hence, the use of PTEST brings no benefit from throughput standpoint.
Its latency is not important, since it feeds only a conditional jump,
which terminates the dependency chain.
I never observed PTEST variants to be faster on real hardware.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-2-amonakov@ispras.ru>
Migration code needs no private fields of the coroutine backend.
Include the "regular" coroutine.h header.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let hw/hyperv/hyperv.c and hw/intc/s390_flic.c handle (respectively)
SynIC and adapter routes, removing the code from target-independent
files. This also removes the only occurrence of AdapterInfo outside
s390 code, so remove that from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For types that are embedded in structs defined by pci.h, the definition
is pretty much required to be available. Remove them from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hw/core/cpu.h is already using struct forward declarations in some cases
to avoid inclusions, and otherwise CPUAddressSpace and CPUJumpCache
are only used together with their definition. CPUTLBEntryFull is
always used when their definition is available. Remove all three
from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Basically all uses of GraphicHwOps are defining an instance of it, which requires the
full definition of the struct. It is pointless to have it in typedefs.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
They are needed in very few places, which already depends on other generated QAPI
files. The benefit of having these types in typedefs.h is small.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MonitorDef is defined by hmp-target.h, and all users except one already
include it; the reason why the stubs do not include it, is because
hmp-target.h currently can only be used in files that are compiled
per target. However, that is easily fixed. Because the benefit of
having MonitorDef in typedefs.h is very small, do it and remove the
type from typedefs.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is defined and referred to exclusively from a .c file.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using QemuLockable almost always requires going through QEMU_MAKE_LOCKABLE().
Therefore, there is little point in having the typedef always present. Move
it to lockable.h, with only a small adjustment to coroutine.h (which has
a tricky co-dependency with lockable.h due to defining CoMutex *and*
using QemuLockable as a part of the CoQueue API).
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move it to the existing "PIC related things" header, hw/intc/i8259.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QemuOpt is basically an internal data structure. It has no business
being defined except if you need functions from include/qemu/option.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Exactly nobody needs it there. Place the typedef in the header
that defines the struct.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Exactly nobody needs them there. Place the typedef in the header
that defines the struct.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is needed in very few places, which already depend on other parts of
qdev-core.h files. The benefit of having it in typedefs.h is small.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only FWCfgState is used as part of APIs such as acpi_ghes_add_fw_cfg.
Everything else need not be in typedefs.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If virgl and opengl are not available, the build process creates a useless
libvirtio-vga-gl module that does not have any device in it. Follow the
example of virtio-vga-rutabaga and do not build the module at all in that
case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoids an explicit use of sizeof(). The GLib allocation macros
ensure that the multiplication by the size of the element
uses the right type and does not overflow.
While at it, change bitmap_new() to use g_new0 directly. Its current
impl of calling bitmap_try_new() followed by a plain abort() has
worse diagnostics than g_new0, which uses g_error to report the actual
allocation size that failed.
Cc: qemu-trivial@nongnu.org
Cc: Roman Kiryanov <rkir@google.com>
Reviewed-by: Daniel Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Boards have been switched to use "default y" and are now listed
in default-configs/*.mak only for convenience.
Document this change and the new possibilities that it allows.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with Xtensa.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with TriCore.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with SPARC and SPARC64.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with SH.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with s390.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with RX.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with RISC-V.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with PowerPC/POWER.
No changes to generated config-devices.mak files, other than
adding CONFIG_PPC to the ppc64-softmmu target.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with OpenRISC.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with MIPS.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
MIPS boards may only be available for big-endian or only for
little-endian emulators, add a symbol so that this can be described
with a "depends on" clause.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with Microblaze.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with m68k.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with Loongarch.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with i386.
No changes to generated config-devices.mak files, other than
adding CONFIG_I386 to the x86_64-softmmu target.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with PARISC.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with CRIS.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Continue with AVR.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For ARM targets, boards that require TCG are already using "default y".
Switch ARM_VIRT to the same selection mechanism.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some targets use "default y" for boards to filter out those that require
TCG. For consistency we are switching all other targets to do the same.
Start with Alpha.
No changes to generated config-devices.mak file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
target/ppc/kvm.c calls out to code in hw/ppc/spapr*.c; that code is
not present and fails to link if CONFIG_PSERIES is not enabled.
Adjust kvm.c to depend on CONFIG_PSERIES instead of TARGET_PPC64,
and compile out anything that requires cap_papr, because only
the pseries machine will call kvmppc_set_papr().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sparc-softmmu is able to run a subset of qtests when compiled --without-default-devices,
so use it instead of x86_64-softmmu for the msys2 run.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM code might have to call functions on the PCIDevice that is
passed to kvm_arch_fixup_msi_route(). This fails in the case
where --without-default-devices is used and no board is
configured. While this is not really a useful configuration,
and therefore setting up stubs for CONFIG_PCI is overkill,
failing the build is impolite. Just include the PCI
subsystem if kvm_arch_fixup_msi_route() requires it, as
is the case for ARM and x86.
Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When emulated with QEMU, interrupts will never come in the following
loop. However, if the NOP instruction is uncommented, interrupts will
fire as normal.
loop:
cli
call do_sti
jmp loop
do_sti:
sti
# nop
ret
This behavior is different from that of a real processor. For example,
if KVM is enabled, interrupts will always fire regardless of whether the
NOP instruction is commented or not. Also, the Intel Software Developer
Manual states that after the STI instruction is executed, the interrupt
inhibit should end as soon as the next instruction (e.g., the RET
instruction if the NOP instruction is commented) is executed.
This problem is caused because the previous code may choose not to end
the TB even if the HF_INHIBIT_IRQ_MASK has just been reset (e.g., in the
case where the STI instruction is immediately followed by the RET
instruction), so that IRQs may not have a change to trigger. This commit
fixes the problem by always terminating the current TB to give IRQs a
chance to trigger when HF_INHIBIT_IRQ_MASK is reset.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
Message-ID: <20240415064518.4951-4-lrh2000@pku.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
plugins: Rewrite plugin tcg expansion
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmYyUpkdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV98VAgAoTqIWPHtPJOS800G
# TlFuQjkEzQCPSKAh6ZbotsAMvfNwBloPpdrUlFr/jT7mURjEl2B7UC/4LzdhuGeQ
# U/xZt5rXsYvyfS3VwLf8pKBIscF7XjJ1rdfYMvBg9XaNp5VV0aEIk3+6P0uYtzXG
# cREF0uCYfdK6uoiuifhqRAkgrNnamdwpPbbfvzDQI13wICW7SfR7dcd629clVZ1O
# QvD1M4bpTWyhClbZzaoHqyPs+HQEM/AY0wOTfYZNbQBu6zFZXNDZCvYhIEWonPBO
# AKe5KWUrQMwLJhRVejaSSZZDjMdcz3HLaGJppP89/WB+gpY09+LsiuqT7k5c12Bw
# ueLEhw==
# =mn63
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 01 May 2024 07:32:57 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240501' of https://gitlab.com/rth7680/qemu:
plugins: Update the documentation block for plugin-gen.c
plugins: Inline plugin_gen_empty_callback
plugins: Merge qemu_plugin_tb_insn_get to plugin-gen.c
plugins: Split out common cb expanders
plugins: Replace pr_ops with a proper debug dump flag
plugins: Introduce PLUGIN_CB_MEM_REGULAR
plugins: Simplify callback queues
tcg: Remove INDEX_op_plugin_cb_{start,end}
tcg: Remove TCG_CALL_PLUGIN
plugins: Remove plugin helpers
plugins: Use emit_before_op for PLUGIN_GEN_FROM_MEM
plugins: Use emit_before_op for PLUGIN_GEN_FROM_INSN
plugins: Add PLUGIN_GEN_AFTER_TB
plugins: Use emit_before_op for PLUGIN_GEN_FROM_TB
plugins: Use emit_before_op for PLUGIN_GEN_AFTER_INSN
plugins: Create TCGHelperInfo for all out-of-line callbacks
plugins: Move function pointer in qemu_plugin_dyn_cb
plugins: Zero new qemu_plugin_dyn_cb entries
tcg: Pass function pointer to tcg_gen_call*
tcg: Make tcg/helper-info.h self-contained
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When executing guest commands in *nix environment, we repeat the same
fork/exec pattern multiple times. Let's just separate it into a single
helper which would also be able to feed input data into the launched
process' stdin. This way we can avoid code duplication.
To keep the history more bisectable, let's replace qmp commands
implementations one by one. Also add G_GNUC_UNUSED attribute to the
helper and remove it in the next commit.
Originally-by: Yuri Pudgorodskiy <yur@virtuozzo.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Link: https://lore.kernel.org/r/20240320161648.158226-3-andrey.drobyshev@virtuozzo.com
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Since the commit 25b5ff1a86 ("qga: add mountpoint usage info to
GuestFilesystemInfo") we have 2 values reported in guest-get-fsinfo:
used = (f_blocks - f_bfree), total = (f_blocks - f_bfree + f_bavail) as
returned by statvfs(3). While on Windows guests that's all we can get
with GetDiskFreeSpaceExA(), on POSIX guests we might also be interested in
total file system size, as it's visible for root user. Let's add an
optional field 'total-bytes-privileged' to GuestFilesystemInfo struct,
which'd only be reported on POSIX and represent f_blocks value as returned
by statvfs(3).
While here, also tweak the docs to reflect better where those values
come from.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Link: https://lore.kernel.org/r/20240320161648.158226-2-andrey.drobyshev@virtuozzo.com
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Merge qemu_plugin_insn_alloc and qemu_plugin_tb_insn_get into
plugin_gen_insn_start, since it is used nowhere else.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The DEBUG_PLUGIN_GEN_OPS ifdef is replaced with "-d op_plugin".
The second pr_ops call can be obtained with "-d op".
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have qemu_plugin_dyn_cb.type to differentiate the various
callback types, so we do not need to keep them in separate queues.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since we no longer emit plugin helpers during the initial code
translation phase, we don't need to specially mark plugin helpers.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a new plugin_mem_cb op to hold the address temp
and meminfo computed by tcg-op-ldst.c. Because this now
has its own opcode, we no longer need PLUGIN_GEN_FROM_MEM.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
By having the qemu_plugin_cb_flags be recorded in the TCGHelperInfo,
we no longer need to distinguish PLUGIN_CB_REGULAR from
PLUGIN_CB_REGULAR_R, so place all TB callbacks in the same queue.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a new plugin_cb op and migrate one operation.
By using emit_before_op, we do not need to emit opcodes
early and modify them later -- we can simply emit the
final set of opcodes once.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The out-of-line function pointer is mutually exclusive
with inline expansion, so move it into the union.
Wrap the pointer in a structure named 'regular' to match
PLUGIN_CB_REGULAR.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For normal helpers, read the function pointer from the
structure earlier. For plugins, this will allow the
function pointer to come from elsewhere.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Clean-ups for "errp" handling in s390x cpu_model code
* Fix a possible abort in the "edu" device
* Add missing qga stubs for stand-alone qga builds and re-enable qga-ssh-test
* Fix memory corruption caused by the stm32l4x5 uart device
* Update the s390x custom runner to Ubuntu 22.04
* Fix READ NATIVE MAX ADDRESS IDE commands to avoid a possible crash
* Shorten the runtime of Cirrus-CI jobs
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmYwmaMRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUCERAAss5PJMG8rI4i4X/3nW49JYTlPOpgm/YX
# /UWF+eHUlqaqDdE0s+Pdw4Ozo3hXQt/E/CkcyflUTzVpnZtpv9vkhNWyjOoPV31v
# GQyQEzGvxZXl2S595XefyAyaMTP5maBhUTlyZWJo385cQraa60Ot5d4Mibr2CobY
# gIBRxEGB/frJYpbHJPxd/FxJV120gtuWAdZwGGYYYjwMzf2IKu2veODB8CnUErlX
# WNUsIzjtAslfh8Ek2ZmPzD7uktCUeigkukqIrLC1oEU3wzbJHkISv1kXCKPW/Nf6
# ISjVa5TqGwkiiF8fw9aYKvWrnPJS7JkhXw7Gz+b39d846kUdNyDfgLcYJeNS3cZ2
# R1xgR9B6hX8ZmikMbGC+0/Sv15u2Yr+bFxJBTJzq6zdOAb9EJNQY1hW2w/Lbrg3X
# LjY+ltcVweoSILT6AE6vGDPCHfBzO+6FcptFvw7ePvRGOlwAPZ3tEB9G2LEbCYgg
# BjWNP4aRuSfbUebO4x4Todz65WN8aY1EIBXORU/wgUlF2+zajWiOI5JRDKjWz2qQ
# gAMeCbLplli5bYrChWtouRIXtb061cQloULddu/SRFcaJOlV3SCzx4JfN15pU90s
# jRMIhMESAEj4NSfclhxsOiYp3ywZTvlQsVA6MgPlu2i3HJakQnt5zbg59TesRn2d
# r5PfAk83UnA=
# =0OB7
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Apr 2024 12:11:31 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-04-30' of https://gitlab.com/thuth/qemu:
.gitlab-ci.d/cirrus: Remove the netbsd and openbsd jobs
.gitlab-ci.d/cirrus.yml: Shorten the runtime of the macOS and FreeBSD jobs
tests/qtest/ide-test: Verify READ NATIVE MAX ADDRESS is not limited
hw/ide/core.c (cmd_read_native_max): Avoid limited device parameters
gitlab: remove stale s390x-all-linux-static conf hacks
gitlab: migrate the s390x custom machine to 22.04
build-environment: make some packages optional
hw/char/stm32l4x5_usart: Fix memory corruption by adding correct class_size
qga: Re-enable the qga-ssh-test when running without fuzzing
stubs: Add missing qga stubs
hw: misc: edu: use qemu_log_mask instead of hw_error
hw: misc: edu: rename local vars in edu_check_range
hw: misc: edu: fix 2 off-by-one errors
target/s390x/cpu_models_sysemu: Drop local @err in apply_cpu_model()
target/s390x/cpu_models: Make kvm_s390_apply_cpu_model() return boolean
target/s390x/cpu_models: Drop local @err in get_max_cpu_model()
target/s390x/cpu_models: Make kvm_s390_get_host_cpu_model() return boolean
target/s390x/cpu_model: Drop local @err in s390_realize_cpu_model()
target/s390x/cpu_model: Make check_compatibility() return boolean
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
"make check-qtest-aarch64" recently started failing on FreeBSD builds,
and valgrind on Linux also detected that there is something fishy with
the new stm32l4x5-usart: The code forgot to set the correct class_size
here, so the various class_init functions in this file wrote beyond
the allocated buffer when setting the subc->type field.
Fixes: 4fb37aea7e ("hw/char: Implement STM32L4x5 USART skeleton")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240429075908.36302-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The DMA descriptor structures for this device have
a set of "address extension" fields which extend the 32
bit source addresses with an extra 16 bits to give a
48 bit address:
https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field
However, we misimplemented this address extension in several ways:
* we only extracted 12 bits of the extension fields, not 16
* we didn't shift the extension field up far enough
* we accidentally did the shift as 32-bit arithmetic, which
meant that we would have an overflow instead of setting
bits [47:32] of the resulting 64-bit address
Add a type cast and use extract64() instead of extract32()
to avoid integer overflow on addition. Fix bit fields
extraction according to documentation.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: d3c6369a96 ("introduce xlnx-dpdma")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Message-id: 20240428181131.23801-1-adiupina@astralinux.ru
[PMM: adjusted commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In previous versions of the Arm architecture, the frequency of the
generic timers as reported in CNTFRQ_EL0 could be any IMPDEF value,
and for QEMU we picked 62.5MHz, giving a timer tick period of 16ns.
In Armv8.6, the architecture standardized this frequency to 1GHz.
Because there is no ID register feature field that indicates whether
a CPU is v8.6 or that it ought to have this counter frequency, we
implement this by changing our default CNTFRQ value for all CPUs,
with exceptions for backwards compatibility:
* CPU types which we already implement will retain the old
default value. None of these are v8.6 CPUs, so this is
architecturally OK.
* CPUs used in versioned machine types with a version of 9.0
or earlier will retain the old default value.
The upshot is that the only CPU type that changes is 'max'; but any
new type we add in future (whether v8.6 or not) will also get the new
1GHz default.
It remains the case that the machine model can override the default
value via the 'cntfrq' QOM property (regardless of the CPU type).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240426122913.3427983-5-peter.maydell@linaro.org
Currently the sbsa_gdwt watchdog device hardcodes its frequency at
62.5MHz. In real hardware, this watchdog is supposed to be driven
from the system counter, which also drives the CPU generic timers.
Newer CPU types (in particular from Armv8.6) should have a CPU
generic timer frequency of 1GHz, so we can't leave the watchdog
on the old QEMU default of 62.5GHz.
Make the frequency a QOM property so it can be set by the board,
and have our only board that uses this device set that frequency
to the same value it sets the CPU frequency.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240426122913.3427983-4-peter.maydell@linaro.org
Currently QEMU CPUs always run with a generic timer counter frequency
of 62.5MHz, but ARMv8.6 CPUs will run at 1GHz. For older versions of
the TF-A firmware that sbsa-ref runs, the frequency of the generic
timer is hardcoded into the firmware, and so if the CPU actually has
a different frequency then timers in the guest will be set
incorrectly.
The default frequency used by the 'max' CPU is about to change, so
make the sbsa-ref board force the CPU frequency to the value which
the firmware expects.
Newer versions of TF-A will read the frequency from the CPU's
CNTFRQ_EL0 register:
4c77fac98d
so in the longer term we could make this board use the 1GHz
frequency. We will need to make sure we update the binaries used
by our avocado test
Aarch64SbsarefMachine.test_sbsaref_alpine_linux_max_pauth_impdef
before we can do that.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240426122913.3427983-3-peter.maydell@linaro.org
The generic timer frequency is settable by board code via a QOM
property "cntfrq", but otherwise defaults to 62.5MHz. The way this
is done includes some complication resulting from how this was
originally a fixed value with no QOM property. Clean it up:
* always set cpu->gt_cntfrq_hz to some sensible value, whether
the CPU has the generic timer or not, and whether it's system
or user-only emulation
* this means we can always use gt_cntfrq_hz, and never need
the old GTIMER_SCALE define
* set the default value in exactly one place, in the realize fn
The aim here is to pave the way for handling the ARMv8.6 requirement
that the generic timer frequency is always 1GHz. We're going to do
that by having old CPU types keep their legacy-in-QEMU behaviour and
having the default for any new CPU types be a 1GHz rather han 62.5MHz
cntfrq, so we want the point where the default is decided to be in
one place, and in code, not in a DEFINE_PROP_UINT64() initializer.
This commit should have no behavioural changes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240426122913.3427983-2-peter.maydell@linaro.org
FEAT_Spec_FPACC is a feature describing speculative behaviour in the
event of a PAC authontication failure when FEAT_FPACCOMBINE is
implemented. FEAT_Spec_FPACC means that the speculative use of
pointers processed by a PAC Authentication is not materially
different in terms of the impact on cached microarchitectural state
(caches, TLBs, etc) between passing and failing of the PAC
Authentication.
QEMU doesn't do speculative execution, so we can advertise
this feature.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-6-peter.maydell@linaro.org
Newer versions of the Arm ARM (e.g. rev K.a) now define fields for
ID_AA64MMFR3_EL1. Implement this register, so that we can set the
fields if we need to. There's no behaviour change here since we
don't currently set the register value to non-zero.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-5-peter.maydell@linaro.org
FEAT_ETS2 is a tighter set of guarantees about memory ordering
involving translation table walks than the old FEAT_ETS; FEAT_ETS has
been retired from the Arm ARM and the old ID_AA64MMFR1.ETS == 1
now gives no greater guarantees than ETS == 0.
FEAT_ETS2 requires:
* the virtual address of a load or store that appears in program
order after a DSB cannot be translated until after the DSB
completes (section B2.10.9)
* TLB maintenance operations that only affect translations without
execute permission are guaranteed complete after a DSB
(R_BLDZX)
* if a memory access RW2 is ordered-before memory access RW2,
then RW1 is also ordered-before any translation table walk
generated by RW2 that generates a Translation, Address size
or Access flag fault (R_NNFPF, I_CLGHP)
As with FEAT_ETS, QEMU is already compliant, because we do not
reorder translation table walk memory accesses relative to other
memory accesses, and we always guarantee to have finished TLB
maintenance as soon as the TLB op is done.
Update the documentation to list FEAT_ETS2 instead of the
no-longer-existent FEAT_ETS, and update the 'max' CPU ID registers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-4-peter.maydell@linaro.org
FEAT_CSV2_3 adds a mechanism to identify if hardware cannot disclose
information about whether branch targets and branch history trained
in one hardware described context can control speculative execution
in a different hardware context.
There is no branch prediction in TCG, so we don't need to do anything
to be compliant with this. Upadte the '-cpu max' ID registers to
advertise the feature.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240418152004.2106516-3-peter.maydell@linaro.org
As of version DDI0487K.a of the Arm ARM, some architectural features
which previously didn't have official names have been named. Add
these to the list of features which QEMU's TCG emulation supports.
Mostly these are features which we thought of as part of baseline 8.0
support. For SVE and SVE2, the names have been brought into line
with the FEAT_* naming convention of other extensions, and some
sub-components split into separate FEAT_ items. In a few cases (eg
FEAT_CCIDX, FEAT_DPB2) the omission from our list was just an oversight.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240418152004.2106516-2-peter.maydell@linaro.org
clock_propagate() has an assert that clk->source is NULL, i.e. that
you are calling it on a clock which has no source clock. This made
sense in the original design where the only way for a clock's
frequency to change if it had a source clock was when that source
clock changed. However, we subsequently added multiplier/divider
support, but didn't look at what that meant for propagation.
If a clock-management device changes the multiplier or divider value
on a clock, it needs to propagate that change down to child clocks,
even if the clock has a source clock set. So the assertion is now
incorrect.
Remove the assertion.
Signed-off-by: Raphael Poggi <raphael.poggi@lynxleap.co.uk>
Message-id: 20240419162951.23558-1-raphael.poggi@lynxleap.co.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Rewrote the commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During the past months, the netbsd and openbsd jobs in the Cirrus-CI
were broken most of the time - the setup to run a BSD in KVM on Cirrus-CI
from gitlab via the cirrus-run script was very fragile, and since the
jobs were not run by default, it used to bitrot very fast.
Now Cirrus-CI also introduce a limit on the amount of free CI minutes
that you get there, so it is not appealing at all anymore to run
these BSDs in this setup - it's better to run the checks locally via
"make vm-build-openbsd" and "make vm-build-netbsd" instead. Thus let's
remove these CI jobs now.
Message-ID: <20240426113742.654748-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cirrus-CI introduced limitations to the free CI minutes. To avoid that
we are consuming them too fast, let's drop the usual targets that are
not that important since they are either a subset of another target
(like i386 or ppc being a subset of x86_64 or ppc64 respectively), or
since there is still a similar target with the opposite endianness
(like xtensa/xtensael, microblaze/microblazeel etc.).
Message-ID: <20240429100113.53357-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Verify that the ATA command READ NATIVE MAX ADDRESS returns the last
valid CHS tuple for the native device rather than any limit
established by INITIALIZE DEVICE PARAMETERS.
Signed-off-by: Lev Kujawski <lkujaw@mailbox.org>
Message-ID: <20221010085229.2431276-2-lkujaw@mailbox.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Always use the native CHS device parameters for the ATA commands READ
NATIVE MAX ADDRESS and READ NATIVE MAX ADDRESS EXT, not those limited
by the ATA command INITIALIZE_DEVICE_PARAMETERS (introduced in patch
176e4961, hw/ide/core.c: Implement ATA INITIALIZE_DEVICE_PARAMETERS
command, 2022-07-07.)
As stated by the ATA/ATAPI specification, "[t]he native maximum is the
highest address accepted by the device in the factory default
condition." Therefore this patch substitutes the native values in
drive_heads and drive_sectors before calling ide_set_sector().
One consequence of the prior behavior was that setting zero sectors
per track could lead to an FPE within ide_set_sector(). Thanks to
Alexander Bulekov for reporting this issue.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1243
Signed-off-by: Lev Kujawski <lkujaw@mailbox.org>
Message-ID: <20221010085229.2431276-1-lkujaw@mailbox.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
"make check-qtest-aarch64" recently started failing on FreeBSD builds,
and valgrind on Linux also detected that there is something fishy with
the new stm32l4x5-usart: The code forgot to set the correct class_size
here, so the various class_init functions in this file wrote beyond
the allocated buffer when setting the subc->type field.
Fixes: 4fb37aea7e ("hw/char: Implement STM32L4x5 USART skeleton")
Message-ID: <20240429075908.36302-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
According to the comment in qga/meson.build, the test got disabled
since there were problems with the fuzzing job. But instead of
disabling this test completely, we should still be fine running
it when fuzzing is disabled.
Message-ID: <20240426162348.684143-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Compilation QGA without system and user fails
./configure --disable-system --disable-user --enable-guest-agent
Link failure:
/usr/bin/ld: libqemuutil.a.p/util_main-loop.c.o: in function
`os_host_main_loop_wait':
../util/main-loop.c:303: undefined reference to `replay_mutex_unlock'
/usr/bin/ld: ../util/main-loop.c:307: undefined reference to
`replay_mutex_lock'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function
`error_printf':
../util/error-report.c:38: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function
`vreport':
../util/error-report.c:225: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_qemu-timer.c.o: in function
`timerlist_run_timers':
../util/qemu-timer.c:562: undefined reference to `replay_checkpoint'
/usr/bin/ld: ../util/qemu-timer.c:530: undefined reference to
`replay_checkpoint'
/usr/bin/ld: ../util/qemu-timer.c:525: undefined reference to
`replay_checkpoint'
ninja: build stopped: subcommand failed.
Fixes: 3a15604900 ("stubs: include stubs only if needed")
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <20240426121347.18843-2-kkostiuk@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Log a guest error instead of a hardware error when
the guest tries to DMA to / from an invalid address.
Signed-off-by: Chris Friedt <cfriedt@meta.com>
Message-ID: <20221018122551.94567-3-cfriedt@meta.com>
[thuth: Add missing #include statement, fix error reported by checkpatch.pl]
Signed-off-by: Thomas Huth <thuth@redhat.com>
In the case that size1 was zero, because of the explicit
'end1 > addr' check, the range check would fail and the error
message would read as shown below. The correct comparison
is 'end1 >= addr'.
EDU: DMA range 0x40000-0x3ffff out of bounds (0x40000-0x40fff)!
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1254
Signed-off-by: Chris Friedt <cfriedt@meta.com>
[thuth: Adjust patch with regards to the "end1 <= end2" check]
Message-ID: <20221018122551.94567-1-cfriedt@meta.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As error.h suggested, the best practice for callee is to return
something to indicate success / failure.
So make kvm_s390_apply_cpu_model() return boolean and check the
returned boolean in apply_cpu_model() instead of accessing @err.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240425031232.1586401-7-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As error.h suggested, the best practice for callee is to return
something to indicate success / failure.
So make kvm_s390_get_host_cpu_model() return boolean and check the
returned boolean in get_max_cpu_model() instead of accessing @err.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240425031232.1586401-5-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit d424db2354 removed an instance of strerrorname_np() because it
was breaking building with musl libc. A recent RISC-V patch ended up
re-introducing it again by accident.
Put this function in the baddies list in checkpatch.pl to avoid this
situation again. This is what it will look like next time:
$ ./scripts/checkpatch.pl 0001-temp-test.patch
ERROR: use strerror() instead of strerrorname_np()
#22: FILE: target/riscv/kvm/kvm-cpu.c:1058:
+ strerrorname_np(errno));
total: 1 errors, 0 warnings, 10 lines checked
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit d424db2354 excluded some strerrorname_np() instances because they
break musl libc builds. Another instance happened to slip by via commit
d4ff3da8f4.
Remove it before it causes trouble again.
Fixes: d4ff3da8f4 (target/riscv/kvm: initialize 'vlenb' via get-reg-list)
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 1590154ee4 ("target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int'")
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The .mailmap file fixes mistake we already did.
Do not use it when running checkpatch.pl, otherwise
we might commit the very same mistakes.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit f5177798d8 ("scripts: report on author emails
that are mangled by the mailing list") added a check
for qemu-devel@ list, extend the regexp to cover more
such qemu-trivial@, qemu-block@ and qemu-ppc@.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Printing a "PowerPC" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now and use two spaces at the
beginning of the lines for the indentation of the entries instead,
and add a "Available CPUs" in the very first line, like most other
target architectures are doing it for their CPU help output already.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Printing an "s390x" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now!
While we're at it, use two spaces at the beginning of the lines for
the indentation of the entries, and add a "Available CPUs" in the
very first line, like most other target architectures are doing it
for their "-cpu help" output already.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Printing an "x86" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now and use two spaces at the
beginning of the lines for the indentation of the entries instead,
like most other target architectures are doing it for their CPU help
output already.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
libslirp provides a newer slirp_*_hostxfwd API meant for
address-agnostic forwarding instead of the is_udp parameter which is
limited to just TCP/UDP.
This paves the way for IPv6 and Unix socket support.
Signed-off-by: Nicholas Ngai <nicholas@ngai.me>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Breno Leitao <leitao@debian.org>
Message-Id: <20210925214820.18078-1-nicholas@ngai.me>
The following CPUTLBEntry helpers are only used in accel/tcg/cputlb.c:
- tlb_index()
- tlb_entry()
- tlb_read_idx()
- tlb_addr_write()
Move them to this file, allowing to remove the huge "cpu.h" header
inclusion from "exec/cpu_ldst.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240418192525.97451-13-philmd@linaro.org>
Declare 'have_guest_base' in "user/guest-base.h".
Very few files require this header, so explicitly include
it there instead of "exec/cpu-all.h" which is used in many
source files.
Assert this user-specific header is only included from user
emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231211212003.21686-23-philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
The CPUBreakpoint and CPUWatchpoint structures are declared
in "hw/core/cpu.h", which contains declarations related to
CPUState and CPUClass. Some source files only require the
BP/WP definitions and don't need to pull in all CPU* API.
In order to simplify, create a new "exec/breakpoint.h" header.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240418192525.97451-3-philmd@linaro.org>
The MMUAccessType enum is declared in "hw/core/cpu.h".
"hw/core/cpu.h" contains declarations related to CPUState
and CPUClass. Some source files only require MMUAccessType
and don't need to pull in all CPU* declarations. In order
to simplify, create a new "exec/mmu-access-type.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240418192525.97451-2-philmd@linaro.org>
The abi_ptr type is declared in "exec/cpu_ldst.h" with all
the load/store helpers. Some source files requiring abi_ptr
type don't need the load/store helpers. In order to simplify,
create a new "exec/abi_ptr.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231212123401.37493-21-philmd@linaro.org>
"exec/user/abitypes.h" requires:
- "exec/cpu-defs.h" (TARGET_LONG_BITS)
- "exec/tswap.h" (tswap32)
In order to avoid "cpu.h", pick the minimum required headers.
Assert this user-specific header is only included from user
emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231212123401.37493-20-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We usually check target endianess before swapping values,
so target_words_bigendian() declaration makes sense in
"exec/tswap.h" with the target swapping helpers.
Remove "hw/core/cpu.h" when it was only included to get
the target_words_bigendian() declaration.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20231212123401.37493-16-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
HVF has a specific use of the CPUState::vcpu_dirty field
(CPUState::vcpu_dirty is not used by common code).
To make this field accel-specific, add and use a new
@dirty variable in the AccelCPUState structure.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424174506.326-4-philmd@linaro.org>
NVMM has a specific use of the CPUState::vcpu_dirty field
(CPUState::vcpu_dirty is not used by common code).
To make this field accel-specific, add and use a new
@dirty variable in the AccelCPUState structure.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424174506.326-3-philmd@linaro.org>
WHPX has a specific use of the CPUState::vcpu_dirty field
(CPUState::vcpu_dirty is not used by common code).
To make this field accel-specific, add and use a new
@dirty variable in the AccelCPUState structure.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424174506.326-2-philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename "exec/helper-head.h" as "exec/helper-head.h.inc".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-4-philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename 'store-insert-al16.h' as 'store-insert-al16.h.inc'
and 'load-extract-al16-al8.h' as 'load-extract-al16-al8.h.inc'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-3-philmd@linaro.org>
Due to missing headers, when including "tb-jmp-cache.h" we might get:
accel/tcg/tb-jmp-cache.h:21:21: error: field ‘rcu’ has incomplete type
21 | struct rcu_head rcu;
| ^~~
accel/tcg/tb-jmp-cache.h:24:9: error: unknown type name ‘vaddr’
24 | vaddr pc;
| ^~~~~
Add the missing "qemu/rcu.h" and "exec/cpu-common.h" headers.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240111162442.43755-1-philmd@linaro.org>
set_helper_retaddr() is only used in accel/tcg/user-exec.c.
clear_helper_retaddr() is only used in accel/tcg/cpu-exec.c
and accel/tcg/user-exec.c.
No need to expose their definitions to all user-emulation
files including "exec/cpu_ldst.h", move them to a new
"user-retaddr.h" header (restricted to accel/tcg/).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231211212003.21686-19-philmd@linaro.org>
accel/tcg/ files requires the following definitions:
- TARGET_LONG_BITS
- TARGET_PAGE_BITS
- TARGET_PHYS_ADDR_SPACE_BITS
- TCG_GUEST_DEFAULT_MO
The first 3 are defined in "cpu-param.h". The last one
in "cpu.h", with a bunch of definitions irrelevant for
TCG. By moving the TCG_GUEST_DEFAULT_MO definition to
"cpu-param.h", we can simplify various accel/tcg includes.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20231211212003.21686-4-philmd@linaro.org>
"semihosting/uaccess.h" only requires the following headers:
- "exec/cpu-defs.h" for target_ulong,
- "exec/cpu-common.h" for cpu_memory_rw_debug()
- "exec/tswap.h" for tswap32() and tswap64().
Include them instead of the huge "cpu.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <42c6471e-8383-45e0-85ee-e20ca32ecbad@linaro.org>
CPUArchState 'env' field is defined within the ArchCPU structure,
so we need to include each target "cpu.h" header which defines it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Message-Id: <20231211212003.21686-2-philmd@linaro.org>
'NEED_CPU_H' guard target-specific code; it is defined by meson
altogether with the 'CONFIG_TARGET' definition. Rename NEED_CPU_H
as COMPILING_PER_TARGET to clarify its meaning.
Mechanical change running:
$ sed -i s/NEED_CPU_H/COMPILING_PER_TARGET/g $(git grep -l NEED_CPU_H)
then manually add a /* COMPILING_PER_TARGET */ comment
after the '#endif' when the block is large.
Inspired-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240322161439.6448-4-philmd@linaro.org>
nbd_negotiate() is already marked coroutine_fn. And given the fix in
the previous patch to have nbd_negotiate_handle_starttls not create
and wait on a g_main_loop (as that would violate coroutine
constraints), it is worth marking the rest of the related static
functions reachable only during option negotiation as also being
coroutine_fn.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240408160214.1200629-6-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[eblake: drop one spurious coroutine_fn marking]
Signed-off-by: Eric Blake <eblake@redhat.com>
Misc HW patch queue
- Script to compare machines compat_props[] (Maksim)
- Introduce 'module' CPU topology level (Zhao)
- Various cleanups (Thomas, Zhao, Inès, Bernhard)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmYqN3wACgkQ4+MsLN6t
# wN4hTw/9FHsItnEkme/864DRPSP7A9mCGa+JfzJmsL8oUb9fBjXXKm+lNchMLu3B
# uvzfXB2Ea24yf5vyrldo0XlU3i/4GDvqXTI6YFYqBvitGICauYBu+6n2NZh2Y/Pn
# zZCcVo167o0q7dHu2WSrZ6cSUchsF2C80HjuS07QaN2YZ7QMuN1+uqTjCQ/JHQWA
# MH4xHh7cXdfCbbv8iNhMWn6sa+Bw/UyfRcc2W6w9cF5Q5cuuTshgDyd0JBOzkM1i
# Mcul7TuKrSiLUeeeqfTjwtw3rtbNfkelV3ycgvgECFAlzPSjF5a6d/EGdO2zo3T/
# aFZnQBYrb4U0SzsmfXFHW7cSylIc1Jn2CCuZZBIvdVcu8TGDD5XsgZbGoCfKdWxp
# l67qbQJy1Mp3LrRzygJIaxDOfE8fhhRrcIxfK/GoTHaCkqeFRkGjTeiDTVBqAES2
# zs6kUYZyG/xGaa2tsMu+HbtSO5EEqPC2QCdHayY3deW42Kwjj/HFV50Ya8YgYSVp
# gEAjTDOle2dDjlkYud+ymTJz7LnGb3G7q0EZRI9DWolx/bu+uZGQqTSRRre4qFQY
# SgN576hsFGN4NdM7tyJWiiqD/OC9ZeqUx3gGBtmI52Q6obBCE9hcow0fPs55Tk95
# 1YzPrt/3IoPI5ZptCoA8DFiysQ46OLtpIsQO9YcrpJmxWyLDSr0=
# =tm+U
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Apr 2024 03:59:08 AM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240425' of https://github.com/philmd/qemu: (22 commits)
hw/core: Support module-id in numa configuration
hw/core: Introduce module-id as the topology subindex
hw/core/machine: Support modules in -smp
hw/core/machine: Introduce the module as a CPU topology level
hw/i386/pc_sysfw: Remove unused parameter from pc_isa_bios_init()
hw/misc : Correct 5 spaces indents in stm32l4x5_exti
hw/xtensa: Include missing 'exec/cpu-common.h' in 'bootparam.h'
hw/elf_ops: Rename elf_ops.h -> elf_ops.h.inc
hw/cxl/cxl-cdat: Make cxl_doe_cdat_init() return boolean
hw/cxl/cxl-cdat: Make ct3_build_cdat() return boolean
hw/cxl/cxl-cdat: Make ct3_load_cdat() return boolean
hw: Add a Kconfig switch for the TYPE_CPU_CLUSTER device
hw: Fix problem with the A*MPCORE switches in the Kconfig files
hw/riscv/virt: Replace sprintf by g_strdup_printf
hw/misc/imx: Replace sprintf() by snprintf()
hw/misc/applesmc: Simplify DeviceReset handler
target/i386: Move APIC related code to cpu-apic.c
hw/core: Remove check on NEED_CPU_H in tcg-cpu-ops.h
scripts: add script to compare compatibility properties
python/qemu/machine: add method to retrieve QEMUMachine::binary field
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target-arm queue:
* Implement FEAT_NMI and NMI support in the GICv3
* hw/dma: avoid apparent overflow in soc_dma_set_request
* linux-user/flatload.c: Remove unused bFLT shared-library and ZFLAT code
* Add ResetType argument to Resettable hold and exit phase methods
* Add RESET_TYPE_SNAPSHOT_LOAD ResetType
* Implement STM32L4x5 USART
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmYqMhMZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3uVlD/47U3zYP33y4+wJcRScC0QI
# jYd82jS7GhD5YP5QPrIEMaSbDwtYGi4Rez1taaHvZ2fWLg2gE973iixmTaM2mXCd
# xPEqMsRXkFrQnC89K5/v9uR04AvHxoM8J2mD2OKnUT0RVBs38WxCUMLETBsD18/q
# obs1RzDRhEs5BnwwPMm5HI1iQeVvDRe/39O3w3rZfA8DuqerrNOQWuJd43asHYjO
# Gc1QzCGhALlXDoqk11IzjhJ7es8WbJ5XGvrSNe9QLGNJwNsu9oi1Ez+5WK2Eht9r
# eRvGNFjH4kQY1YCShZjhWpdzU9KT0+80KLirMJFcI3vUztrYZ027/rMyKLHVOybw
# YAqgEUELwoGVzacpaJg73f77uknKoXrfTH25DfoLX0yFCB35JHOPcjU4Uq1z1pfV
# I80ZcJBDJ95mXPfyKLrO+0IyVBztLybufedK2aiH16waEGDpgsJv66FB2QRuQBYW
# O0i6/4DEUZmfSpOmr8ct+julz7wCWSjbvo6JFWxzzxvD0M5T3AFKXZI244g1SMdh
# LS8V7WVCVzVJ5mK8Ujp2fVaIIxiBzlXVZrQftWv5rhyDOiIIeP8pdekmPlI6p5HK
# 3/2efzSYNL2UCDZToIq24El/3md/7vHR6DBfBT1/pagxWUstqqLgkJO42jQtTG0E
# JY1cZ/EQY7cqXGrww8lhWA==
# =WEsU
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Apr 2024 03:36:03 AM PDT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20240425' of https://git.linaro.org/people/pmaydell/qemu-arm: (37 commits)
tests/qtest: Add tests for the STM32L4x5 USART
hw/arm: Add the USART to the stm32l4x5 SoC
hw/char/stm32l4x5_usart: Add options for serial parameters setting
hw/char/stm32l4x5_usart: Enable serial read and write
hw/char: Implement STM32L4x5 USART skeleton
reset: Add RESET_TYPE_SNAPSHOT_LOAD
docs/devel/reset: Update to new API for hold and exit phase methods
hw, target: Add ResetType argument to hold and exit phase methods
scripts/coccinelle: New script to add ResetType to hold and exit phases
allwinner-i2c, adm1272: Use device_cold_reset() for software-triggered reset
hw/misc: Don't special case RESET_TYPE_COLD in npcm7xx_clk, gcr
linux-user/flatload.c: Remove unused bFLT shared-library and ZFLAT code
hw/dma: avoid apparent overflow in soc_dma_set_request
hw/arm/virt: Enable NMI support in the GIC if the CPU has FEAT_NMI
target/arm: Add FEAT_NMI to max
hw/intc/arm_gicv3: Report the VINMI interrupt
hw/intc/arm_gicv3: Report the NMI interrupt in gicv3_cpuif_update()
hw/intc/arm_gicv3: Implement NMI interrupt priority
hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()
hw/intc/arm_gicv3: Add NMI handling CPU interface registers
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Update OpenBSD CI image to 7.5
* Update/remove Ubuntu 20.04 CI jobs
* Update (most) CentOS 8 CI jobs to CentOS 9
* Some clean-ups and improvements to travis.yml
* Minor test fixes
* s390x header clean-ups
* Doc updates
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmYqbu4RHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWqeA//cFAzjjmayCRZzuwwCFH0ILPrMNViRLFc
# ZuslNEFygDPl1H1wnw3MKzhHy1hbaC3gf30MjtejU61OMMyxS4exZd1rvw94a0cm
# OEisE8UG52kRsqPwKPktB2bybgX3BZbrFEwp1P0DsvpLTX7wI5nOZyNR4zWOf5Ym
# mODN/MjMFOhWjONOnNDRn4TbySQqolIQBTCq+f1J5Ej74V+p17aC5Fe3xhlFp8Ip
# aRockPW6dLpNt26zx6kKvQSYtkyLgQJSeUUyUCgcla03yzNSuV/kJPUoW0ewa+q8
# DZg1Ru5WJ6O8lELQdYq630cmdwg3e9EeI6q/U/1A11auuLaafOBi0eZW9LdPlrqD
# 6a+zwVn+ipyRdz8eRGZVRGdhJ6XT27YfFuKxdiu4BxnS0LRks4vDcXreIcQmiIUN
# bg/zSp6snCpYf7+GlUReZXWnVx401nu59+BNNKUV0qIxdORNm8kwd9ZpSQwXP/nF
# BMPhj2hoqvWb4C4r3WlTaSPlkJGhkb2lMLucjCbeGrdnmna0RFOFB301fllbpnVm
# 11SRipMEfrj/G5qp4giPLcruzesvRaZm85nmwDyOQWxr5Q0KWWfBVXZMt+qqOckR
# 2SUtLPd9nWruCy7KN15BrOWkmXc+OU8UFUqXIOvflkI6aF1bmFYRyrXgqX2q7QDT
# kEfWnBvBqxw=
# =1uVo
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Apr 2024 07:55:42 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-04-25' of https://gitlab.com/thuth/qemu:
target/s390x: Remove KVM stubs in cpu_models.h
tests/unit: Remove debug statements in test-nested-aio-poll.c
docs/devel: fix minor typo in submitting-a-patch.rst
hw/s390x: Include missing 'cpu.h' header
tests: Update our CI to use CentOS Stream 9 instead of 8
tests/docker/dockerfiles: Run lcitool-refresh after the lcitool update
tests/lcitool/libvirt-ci: Update to the latest master branch
tests: Remove Ubuntu 20.04 container
.travis.yml: Do some more testing with Clang
.travis.yml: Update the jobs to Ubuntu 22.04
.travis.yml: Remove the unused UNRELIABLE environment variable
Revert ".travis.yml: Cache Avocado cache"
tests/vm: update openbsd image to 7.5
docs: i386: pc: Update maximum CPU numbers for PC Q35
tests/qtest : Use `g_assert_cmphex` instead of `g_assert_cmpuint`
MAINTAINERS: update email of Peter Lieven
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have been running this test for almost a year; it
is safe to remove its debug statements, which clutter
CI jobs output:
▶ 88/100 /nested-aio-poll OK
io_read 0x16bb26158
io_poll_true 0x16bb26158
> io_poll_ready
io_read 0x16bb26164
< io_poll_ready
io_poll_true 0x16bb26158
io_poll_false 0x16bb26164
> io_poll_ready
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_poll_false 0x16bb26164
io_read 0x16bb26164
< io_poll_ready
88/100 qemu:unit / test-nested-aio-poll OK
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240422112246.83812-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
"cpu.h" is implicitly included. Include it explicitly to
avoid the following error when refactoring headers:
hw/s390x/s390-stattrib.c:86:40: error: use of undeclared identifier 'TARGET_PAGE_SIZE'
len = sac->peek_stattr(sas, addr / TARGET_PAGE_SIZE, buflen, vals);
^
hw/s390x/s390-stattrib.c:94:58: error: use of undeclared identifier 'TARGET_PAGE_MASK'
addr / TARGET_PAGE_SIZE, len, addr & ~TARGET_PAGE_MASK);
^
hw/s390x/s390-stattrib.c:224:40: error: use of undeclared identifier 'TARGET_PAGE_BITS'
qemu_put_be64(f, (start_gfn << TARGET_PAGE_BITS) | STATTR_FLAG_MORE);
^
In file included from hw/s390x/s390-virtio-ccw.c:17:
hw/s390x/s390-virtio-hcall.h:22:27: error: unknown type name 'CPUS390XState'
int s390_virtio_hypercall(CPUS390XState *env);
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Message-ID: <20240322162822.7391-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Coroutines are not supposed to block. Instead, they should yield.
The client performs TLS upgrade outside of an AIOContext, during
synchronous handshake; this still requires g_main_loop. But the
server responds to TLS upgrade inside a coroutine, so a nested
g_main_loop is wrong. Since the two callbacks no longer share more
than the setting of data.complete and data.error, it's just as easy to
use static helpers instead of trying to share a common code path. It
is also possible to add assertions that no other code is interfering
with the eventual path to qio reaching the callback, whether or not it
required a yield or main loop.
Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation")
Signed-off-by: Zhu Yangyang <zhuyangyang14@huawei.com>
[eblake: move callbacks to their use point, add assertions]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240408160214.1200629-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Module is a level above the core, thereby supporting numa
configuration on the module level can bring user more numa flexibility.
This is the natural further support for module level.
Add module level support in numa configuration.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-5-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In x86, module is the topology level above core, which contains a set
of cores that share certain resources (in current products, the resource
usually includes L2 cache, as well as module scoped features and MSRs).
Though smp.clusters could also share the L2 cache resource [1], there
are following reasons that drive us to introduce the new smp.modules:
* As the CPU topology abstraction in device tree [2], cluster supports
nesting (though currently QEMU hasn't support that). In contrast,
(x86) module does not support nesting.
* Due to nesting, there is great flexibility in sharing resources
on cluster, rather than narrowing cluster down to sharing L2 (and
L3 tags) as the lowest topology level that contains cores.
* Flexible nesting of cluster allows it to correspond to any level
between the x86 package and core.
* In Linux kernel, x86's cluster only represents the L2 cache domain
but QEMU's smp.clusters is the CPU topology level. Linux kernel will
also expose module level topology information in sysfs for x86. To
avoid cluster ambiguity and keep a consistent CPU topology naming
style with the Linux kernel, we introduce module level for x86.
The module is, in existing hardware practice, the lowest layer that
contains the core, while the cluster is able to have a higher
topological scope than the module due to its nesting.
Therefore, place the module between the cluster and the core:
drawer/book/socket/die/cluster/module/core/thread
With the above topological hierarchy order, introduce module level
support in MachineState and MachineClass.
[1]: https://lore.kernel.org/qemu-devel/c3d68005-54e0-b8fe-8dc1-5989fe3c7e69@huawei.com/
[2]: https://www.kernel.org/doc/Documentation/devicetree/bindings/cpu/cpu-topology.txt
Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-2-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename "hw/elf_ops.h" as "hw/elf_ops.h.inc".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-2-philmd@linaro.org>
A9MPCORE, ARM11MPCORE and A15MPCORE are defined twice, once in
hw/cpu/Kconfig and once in hw/arm/Kconfig. This is only possible
by accident, since hw/cpu/Kconfig is never included from hw/Kconfig.
Fix it by declaring the switches only in hw/cpu/Kconfig (since the
related files reside in the hw/cpu/ folder) and by making sure that
the file hw/cpu/Kconfig is now properly included from hw/Kconfig.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240415065655.130099-2-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Have applesmc_find_key() return a const pointer.
Since the returned buffers are not modified in
applesmc_io_data_write(), it is pointless to
delete and re-add the keys in the DeviceReset
handler. Add them once in DeviceRealize, and
discard them in the DeviceUnrealize handler.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240410180819.92332-1-philmd@linaro.org>
Add the basic infrastructure (register read/write, type...)
to implement the STM32L4x5 USART.
Also create different types for the USART, UART and LPUART
of the STM32L4x5 to deduplicate code and enable the
implementation of different behaviors depending on the type.
Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240329174402.60382-2-arnaud.minier@telecom-paris.fr
[PMM: update to new reset hold method signature;
fixed a few checkpatch nits]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some devices and machines need to handle the reset before a vmsave
snapshot is loaded differently -- the main user is the handling of
RNG seed information, which does not want to put a new RNG seed into
a ROM blob when we are doing a snapshot load.
Currently this kind of reset handling is supported only for:
* TYPE_MACHINE reset methods, which take a ShutdownCause argument
* reset functions registered with qemu_register_reset_nosnapshotload
To allow a three-phase-reset device to also distinguish "snapshot
load" reset from the normal kind, add a new ResetType
RESET_TYPE_SNAPSHOT_LOAD. All our existing reset methods ignore
the reset type, so we don't need to update any device code.
Add the enum type, and make qemu_devices_reset() use the
right reset type for the ShutdownCause it is passed. This
allows us to get rid of the device_reset_reason global we
were using to implement qemu_register_reset_nosnapshotload().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-7-peter.maydell@linaro.org
We pass a ResetType argument to the Resettable class enter
phase method, but we don't pass it to hold and exit, even though
the callsites have it readily available. This means that if
a device cared about the ResetType it would need to record it
in the enter phase method to use later on. Pass the type to
all three of the phase methods to avoid having to do that.
Commit created with
for dir in hw target include; do \
spatch --macro-file scripts/cocci-macro-file.h \
--sp-file scripts/coccinelle/reset-type.cocci \
--keep-comments --smpl-spacing --in-place \
--include-headers --dir $dir; done
and no manual edits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-5-peter.maydell@linaro.org
We pass a ResetType argument to the Resettable class enter phase
method, but we don't pass it to hold and exit, even though the
callsites have it readily available. This means that if a device
cared about the ResetType it would need to record it in the enter
phase method to use later on. We should pass the type to all three
of the phase methods to avoid having to do that.
This coccinelle script adds the ResetType argument to the hold and
exit phases of the Resettable interface.
The first part of the script (rules holdfn_assigned, holdfn_defined,
exitfn_assigned, exitfn_defined) update implementations of the
interface within device models, both to change the signature of their
method implementations and to pass on the reset type when they invoke
reset on some other device.
The second part of the script is various special cases:
* method callsites in resettable_phase_hold(), resettable_phase_exit()
and device_phases_reset()
* updating the typedefs for the methods
* isl_pmbus_vr.c has some code where one device's reset method directly
calls the implementation of a different device's method
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-4-peter.maydell@linaro.org
The npcm7xx_clk and npcm7xx_gcr device reset methods look at
the ResetType argument and only handle RESET_TYPE_COLD,
producing a warning if another reset type is passed. This
is different from how every other three-phase-reset method
we have works, and makes it difficult to add new reset types.
A better pattern is "assume that any reset type you don't know
about should be handled like RESET_TYPE_COLD"; switch these
devices to do that. Then adding a new reset type will only
need to touch those devices where its behaviour really needs
to be different from the standard cold reset.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Message-id: 20240412160809.1260625-2-peter.maydell@linaro.org
Ever since the bFLT format support was added in 2006, there has been
a chunk of code in the file guarded by CONFIG_BINFMT_SHARED_FLAT
which is supposedly for shared library support. This is not enabled
and it's not possible to enable it, because if you do you'll run into
the "#error needs checking" in the calc_reloc() function.
Similarly, CONFIG_BINFMT_ZFLAT exists but can't be enabled because of
an "#error code needs checking" in load_flat_file().
This code is obviously unfinished and has never been used; nobody in
the intervening 18 years has complained about this or fixed it, so
just delete the dead code. If anybody ever wants the feature they
can always pull it out of git, or (perhaps better) write it from
scratch based on the current Linux bFLT loader rather than the one of
18 years ago.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240411115313.680433-1-peter.maydell@linaro.org
In soc_dma_set_request() we try to set a bit in a uint64_t, but we
do it with "1 << ch->num", which can't set any bits past 31;
any use for a channel number of 32 or more would fail due to
integer overflow.
This doesn't happen in practice for our current use of this code,
because the worst case is when we call soc_dma_init() with an
argument of 32 for the number of channels, and QEMU builds with
-fwrapv so the shift into the sign bit is well-defined. However,
it's obviously not the intended behaviour of the code.
Add casts to force the shift to be done as 64-bit arithmetic,
allowing up to 64 channels.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: afbb5194d4 ("Handle on-chip DMA controllers in one place, convert OMAP DMA to use it.")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Message-id: 20240409115301.21829-1-abelova@astralinux.ru
[PMM: Edit commit message to clarify that this doesn't actually
bite us in our current usage of this code.]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the CPU implements FEAT_NMI, then turn on the NMI support in the
GICv3 too. It's permitted to have a configuration with FEAT_NMI in
the CPU (and thus NMI support in the CPU interfaces too) but no NMI
support in the distributor and redistributor, but this isn't a very
useful setup as it's close to having no NMI support at all.
We don't need to gate the enabling of NMI in the GIC behind a
machine version property, because none of our current CPUs
implement FEAT_NMI, and '-cpu max' is not something we maintain
migration compatibility across versions for. So we can always
enable the GIC NMI support when the CPU has it.
Neither hvf nor KVM support NMI in the GIC yet, so we don't enable
it unless we're using TCG.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240407081733.3231820-25-ruanjinjie@huawei.com
[PMM: Update comment and commit message]
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If GICD_CTLR_DS bit is zero and the NMI is non-secure, the NMI priority is
higher than 0x80, otherwise it is higher than 0x0. And save the interrupt
non-maskable property in hppi.nmi to deliver NMI exception. Since both GICR
and GICD can deliver NMI, it is both necessary to check whether the pending
irq is NMI in gicv3_redist_update_noirqset and gicv3_update_noirqset.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-21-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for
ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit.
If FEAT_GICv3_NMI is supported, ich_ap_write() should consider ICV_AP1R_EL1.NMI
bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit
should be set or clear according to the Non-maskable property. And the RPR
priority should also update the NMI bit according to the APR priority NMI bit.
By the way, add gicv3_icv_nmiar1_read trace event.
If the hpp irq is a NMI, the icv iar read should return 1022 and trap for
NMI again
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: use cs->nmi_support instead of cs->gic->nmi_support]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-20-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the NMIAR CPU interface registers which deal with acknowledging NMI.
When introduce NMI interrupt, there are some updates to the semantics for the
register ICC_IAR1_EL1 and ICC_HPPIR1_EL1. For ICC_IAR1_EL1 register, it
should return 1022 if the intid has non-maskable property. And for
ICC_NMIAR1_EL1 register, it should return 1023 if the intid do not have
non-maskable property. Howerever, these are not necessary for ICC_HPPIR1_EL1
register.
And the APR and RPR has NMI bits which should be handled correctly.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: Separate out whether cpuif supports NMI from whether the
GIC proper (IRI) supports NMI]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-19-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to Arm GIC section 4.6.3 Interrupt superpriority, the interrupt
with superpriority is always IRQ, never FIQ, so the NMI exception trap entry
behave like IRQ. And VINMI(vIRQ with Superpriority) can be raised from the
GIC or come from the hcrx_el2.HCRX_VINMI bit, VFNMI(vFIQ with Superpriority)
come from the hcrx_el2.HCRX_VFNMI bit.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-13-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When PSTATE.ALLINT is set, an IRQ or FIQ interrupt that is targeted to
ELx, with or without superpriority is masked. As Richard suggested, place
ALLINT bit in PSTATE in env->pstate.
In the pseudocode, AArch64.ExceptionReturn() calls SetPSTATEFromPSR(), which
treats PSTATE.ALLINT as one of the bits which are reinstated from SPSR to
PSTATE regardless of whether this is an illegal exception return or not. So
handle PSTATE.ALLINT the same way as PSTATE.DAIF in the illegal_return exit
path of the exception_return helper. With the change, exception entry and
return are automatically handled.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240407081733.3231820-3-ruanjinjie@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This script runs QEMU to obtain compat_props of machines and default
values of different types of drivers to produce comparison table. This
table can be used to compare machine types to choose the most suitable
machine or compare binaries to be sure that migration to the newer version
will save all device properties. Also the json or csv format of this
table can be used to check does a new machine affect the previous ones by
comparing tables with and without the new machine.
Default values (that will be used without machine compat_props) of
properties are needed to fill "holes" in the table (one machine has
the property but another machine not. For instance, 2.12 machine has
`{ "EPYC-" TYPE_X86_CPU, "xlevel", "0x8000000a" }`, but compat_pros of
3.1 machine doesn't have it. Thus, to compare these machines we need to
get unknown value of "EPYC-x86_64-cpu-xlevel" for 3.1 machine. These
unknown values in the table are called "holes". To get values for these
"holes" the script uses list of appropriate methods.)
Notes:
* Some init values from the devices can't be available like properties
from virtio-9p when configure has --disable-virtfs. This situations will
be seen in the table as "unavailable driver".
* Default values can be obtained in an unobvious way, like x86 features.
If the script doesn't know how to get property default value to compare
one machine with another it fills "holes" with "unavailable method". This
is done because script uses whitelist model to get default values of
different types. It means that the method that can't be applied to a new
type that can crash this script. It is better to get an "unavailable
driver" when creating a new machine with new compatible properties than
to break this script. So it turns out a more stable and generic script.
* If the default value can't be obtained because this property doesn't
exist or because this property can't have default value, appropriate
"hole" will be filled by "unknown property" or "no default value"
* If the property is applied to the abstract class, the script collects
default values from all child classes and prints all these classes
* Raw table (--raw flag) should be used with json/csv parameters for
scripts and etc. Human-readable (default) format contains transformed
and simplified values and it doesn't contain lines with the same values
in columns
Example:
./scripts/compare-machine-types.py --mt pc-q35-6.2 pc-q35-7.1
╒══════════════════╤══════════════════════════╤════════════════════════════╤════════════════════════════╕
│ Driver │ Property │ build/qemu-system-x86_64 │ build/qemu-system-x86_64 │
│ │ │ pc-q35-6.2 │ pc-q35-7.1 │
╞══════════════════╪══════════════════════════╪════════════════════════════╪════════════════════════════╡
│ PIIX4_PM │ x-not-migrate-acpi-index │ True │ False │
├──────────────────┼──────────────────────────┼────────────────────────────┼────────────────────────────┤
│ arm-gicv3-common │ force-8-bit-prio │ True │ unavailable driver │
├──────────────────┼──────────────────────────┼────────────────────────────┼────────────────────────────┤
│ nvme-ns │ eui64-default │ True │ False │
├──────────────────┼──────────────────────────┼────────────────────────────┼────────────────────────────┤
│ virtio-mem │ unplugged-inaccessible │ False │ auto │
╘══════════════════╧══════════════════════════╧════════════════════════════╧════════════════════════════╛
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240318213550.155573-5-davydov-max@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
To control that creating new machine type doesn't affect the previous
types (their compat_props) and to check complex compat_props inheritance
we need qmp command to print machine type compatibility properties.
This patch adds the ability to get list of all the compat_props of the
corresponding supported machines for their comparison via new optional
argument of "query-machines" command. Since information on compatibility
properties can increase the command output by a factor of 40, add an
argument to enable it, default off.
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240318213550.155573-3-davydov-max@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
qmp_qom_list_properties can print default values if they are available
as qmp_device_list_properties does, because both of them use the
ObjectPropertyInfo structure with default_value field. This can be useful
when working with "not device" types (e.g. memory-backend).
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240318213550.155573-2-davydov-max@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
RHEL 9 (and thus also the derivatives) have been available since two
years now, so according to QEMU's support policy, we can drop the active
support for the previous major version 8 now.
Another reason for doing this is that Centos Stream 8 will go EOL soon:
https://blog.centos.org/2023/04/end-dates-are-coming-for-centos-stream-8-and-centos-linux-7/
"After May 31, 2024, CentOS Stream 8 will be archived
and no further updates will be provided."
Thus upgrade our CentOS Stream container to major version 9 now.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-5-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This update adds the removing of the EXTERNALLY-MANAGED marker files
that has been added to the lcitool recently.
Quoting Daniel:
"For those who don't know, python now commonly blocks the ability to
run 'pip install' outside of a venv. This generally makes sense for
a precious installation environment. Our containers are disposable
though, so a venv has no benefit. Removing the 'EXTERNALLY-MANAGED'
allows the historical arbitrary use of 'pip' outside a venv.
lcitool just does this unconditionally given the containers are
not precious."
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since Ubuntu 22.04 has now been available for more than two years, we
can stop actively supporting the previous LTS version of Ubuntu now.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240418101056.302103-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are doing a lot of cross-compilation tests with GCC in the gitlab-CI
already, so we could get some more test coverage by using Clang in the
Travis-CI instead. Thus let's switch two additional jobs to use Clang
for compilation.
Message-ID: <20240320104144.823425-7-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
According to our support policy, we'll soon drop our official support
for Ubuntu 20.04 ("Focal Fossa") in QEMU. Thus we should update the
Travis jobs now to a newer release (Ubuntu 22.04 - "Jammy Jellyfish")
for future testing. Since all jobs are using this release now, we
can drop the entries from the individual jobs and use the global
setting again.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This variable was used to allow jobs to fail without spoiling the
overall result. But the required "allow_failures:" hunk has been
accidentally removed in commit 9d03f5abed ("travis.yml: Remove the
"Release tarball" job"), and it was anyway only useful while we
still had the x86 jobs here around that were our main CI jobs.
Thus let's simply remove this useless variable now.
Message-ID: <20240320104144.823425-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit c1073e44b4.
The Avocado tests have been removed from Travis a long time ago with
commit c5008c76ee ("gitlab: add acceptance testing to system builds"),
so we don't need to cache the avocado files here anymore.
Message-ID: <20240320104144.823425-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
meson: Make DEBUG_REMAP a meson option
target/m68k: Support semihosting on non-ColdFire targets
linux-user: do_setsockopt cleanups
linux-user: Add FITRIM ioctl
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmYpjHcdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+a/Af7BHmDB27U61b9i8et
# cObewYH9y9M+iaCrIflNZPAaoguHDRKOuvw+PFT/dIo5FL2D509vYOuxUow1qLsy
# q6b6kdvXROq9WU2NiuB86Abl/4mwwzxRhFah+Eh+OYSA2/pQnkcULkouLqxjFfF0
# xTBzZtHtYdTbCTVRbpd6XrwLo7Qrs85ovl4wVD1r+T2T8FkvrryoNOA/VjUWxyeh
# 3b1X1I0wtOTnEA7JSr17JCXWZGENCmTO35r6WSYzJy5U/C59PjjgaaeMi3R3lQTJ
# gg21EH0hlU1nTiPLg2ypj3l9NbIGAincAdDF/jufee+R75YSPdpKoDH8tUlUGsnM
# CRx5Xg==
# =J+5K
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Apr 2024 03:49:27 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240424' of https://gitlab.com/rth7680/qemu:
target/m68k: Support semihosting on non-ColdFire targets
target/m68k: Perform the semihosting test during translate
target/m68k: Pass semihosting arg to exit
linux-user: Add FITRIM ioctl
linux-user: do_setsockopt: eliminate goto in switch for SO_SNDTIMEO
linux-user: do_setsockopt: make ip_mreq_source local to the place where it is used
linux-user: do_setsockopt: make ip_mreq local to the place it is used and inline target_to_host_ip_mreq()
linux-user: do_setsockopt: fix SOL_ALG.ALG_SET_KEY
meson: Make DEBUG_REMAP a meson option
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
According to the m68k semihosting spec:
"The instruction used to trigger a semihosting request depends on the
m68k processor variant. On ColdFire, "halt" is used; on other processors
(which don't implement "halt"), "bkpt #0" may be used."
Add support for non-CodeFire processors by matching BKPT #0 instructions.
Signed-off-by: Keith Packard <keithp@keithp.com>
[rth: Use semihosting_test()]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace EXCP_HALT_INSN by EXCP_SEMIHOSTING. Perform the pre-
and post-insn tests during translate, leaving only the actual
semihosting operation for the exception.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There's identical code for SO_SNDTIMEO and SO_RCVTIMEO, currently
implemented using an ugly goto into another switch case. Eliminate
that using arithmetic if, making code flow more natural.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-5-mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
ip_mreq is declared at the beginning of do_setsockopt(), while
it is used in only one place. Move its declaration to that very
place and replace pointer to alloca()-allocated memory with the
structure itself.
target_to_host_ip_mreq() is used only once, inline it.
This change also properly handles TARGET_EFAULT when the address
is wrong.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-3-mjt@tls.msk.ru>
[rth: Fix braces, adjust optlen to match host structure size]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently DEBUG_REMAP is a macro that needs to be manually #defined to
be activated, which makes it hard to have separate build directories
dedicated to testing the code with it. Promote it to a meson option.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240312002402.14344-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
QAPI patches patches for 2024-04-24
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmYovX4SHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZT0RQP/1R1oOSdfWmiSR5gdW+IDE88VrjNY+Md
# lZJ7dTtbIDbwABP6s7/wGIxlc4bf6lwIwhWNnfOa+y71jJ/u9aX2Q6D8wHY5lQnu
# b8jhzoP3UyB+LQV5TTX1nTqTaO72Bj0pxw/0/DeKCzg7kgjys06a1Ln9P0oyWs6x
# li/Cs133wPtZ5rISqif5yOmssber0h2D584k5MN2VK7eaGidLioQQIRmSDikPE6Z
# TpnOEqySInIFhPJmm77il19ZDpCrgdCoD7lXoqX1C6XScYz7dU+m/TaToFk1lECw
# VstR2SJT39TzbOLdis1O5/vsLP0QfciMRQbUktSrf4jQHumrkSa/OE6xKwJC3x82
# axJWc+BygcosylKc5CYVzwSlHugHw6Lf39qui//yzi5CXakzO6owKX7Q8AcH3PPy
# 6thncy0dvw1ggq/BZGYhjyG+6MXBCWipPGVXFp9Gf2cAayTALhwyNtPvDNX57fzT
# UZA/fa+/Wc9xZv1YAnxLaKyo4o65YWXVnq1eHXV17ny08BzZNYOC2ZuXim+rMThM
# lxzcrTkrMLgsGXetp59uhZw9JRnVvaxLqNXfC2bwpoRzXzuifvnPithl6nkSm1YB
# TvHGZZdO3B498kOW6947KrMVFh3t4aNWkdbOyetAMECy71H3Q4CTHYGFa5TzXT9O
# Rjz31NYdVNBE
# =uyfg
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Apr 2024 01:06:22 AM PDT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-qapi-2024-04-24' of https://repo.or.cz/qemu/armbru: (25 commits)
qapi: Dumb down QAPISchema.lookup_entity()
qapi: Tighten check whether implicit object type already exists
qapi/schema: remove unnecessary asserts
qapi/schema: turn on mypy strictness
qapi/schema: add type hints
qapi/parser.py: assert member.info is present in connect_member
qapi/parser: demote QAPIExpression to Dict[str, Any]
qapi/schema: assert inner type of QAPISchemaVariants in check_clash()
qapi/schema: fix typing for QAPISchemaVariants.tag_member
qapi/schema: Don't initialize "members" with `None`
qapi/schema: add _check_complete flag
qapi/schema: assert info is present when necessary
qapi/schema: fix QAPISchemaArrayType.check's call to resolve_type
qapi: Assert built-in types exist
qapi/schema: assert resolve_type has 'info' and 'what' args on error
qapi/schema: add type narrowing to lookup_type()
qapi/schema: adjust type narrowing for mypy's benefit
qapi/schema: make c_type() and json_type() abstract methods
qapi/schema: declare type for QAPISchemaArrayType.element_type
qapi/schema: declare type for QAPISchemaObjectTypeMember.type
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
GlusterFS+RDMA has been deprecated 8 years ago in commit
0552ff2465 ("block/gluster: deprecate rdma support"):
gluster volfile server fetch happens through unix and/or tcp,
it doesn't support volfile fetch over rdma. The rdma code may
actually mislead, so to make sure things do not break, for now
we fallback to tcp when requested for rdma, with a warning.
If you are wondering how this worked all these days, its the
gluster libgfapi code which handles anything other than unix
transport as socket/tcp, sad but true.
Besides, the whole RDMA subsystem was deprecated in commit
e9a54265f5 ("hw/rdma: Deprecate the pvrdma device and the rdma
subsystem") released in v8.2.
Cc: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240328130255.52257-4-philmd@linaro.org>
The Nios II target is deprecated since v8.2 in commit 9997771bc1
("target/nios2: Deprecate the Nios II architecture").
Remove:
- Buildsys / CI infra
- User emulation
- System emulation (10m50-ghrd & nios2-generic-nommu machines)
- Tests
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Marek Vasut <marex@denx.de>
Message-Id: <20240327144806.11319-3-philmd@linaro.org>
QAPISchema.lookup_entity() takes an optional type argument, a subtype
of QAPISchemaDefinition, and returns that type or None. Callers can
use this to save themselves an isinstance() test.
The only remaining user of this convenience feature is .lookup_type().
But we don't actually save anything anymore there: we still need the
isinstance() to help mypy over the hump.
Drop the .lookup_entity() argument, and adjust .lookup_type().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-26-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[Commit message typo fixed]
Entities with names starting with q_obj_ are implicit object types.
Therefore, QAPISchema._make_implicit_object_type()'s .lookup_entity()
can only return a QAPISchemaObjectType. Assert that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-25-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: John Snow <jsnow@redhat.com>
This patch only adds type hints, which aren't utilized at runtime and
don't change the behavior of this module in any way.
In a scant few locations, type hints are removed where no longer
necessary due to inference power from typing all of the rest of
creation; and any type hints that no longer need string quotes are
changed.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-22-armbru@redhat.com>
Dict[str, object] is a stricter type, but with the way that code is
currently arranged, it is infeasible to enforce this strictness.
In particular, although expr.py's entire raison d'être is normalization
and type-checking of QAPI Expressions, that type information is not
"remembered" in any meaningful way by mypy because each individual
expression is not downcast to a specific expression type that holds all
the details of each expression's unique form.
As a result, all of the code in schema.py that deals with actually
creating type-safe specialized structures has no guarantee (myopically)
that the data it is being passed is correct.
There are two ways to solve this:
(1) Re-assert that the incoming data is in the shape we expect it to be, or
(2) Disable type checking for this data.
(1) is appealing to my sense of strictness, but I gotta concede that it
is asinine to re-check the shape of a QAPIExpression in schema.py when
expr.py has just completed that work at length. The duplication of code
and the nightmare thought of needing to update both locations if and
when we change the shape of these structures makes me extremely
reluctant to go down this route.
(2) allows us the chance to miss updating types in the case that types
are updated in expr.py, but it *is* an awful lot simpler and,
importantly, gets us closer to type checking schema.py *at
all*. Something is better than nothing, I'd argue.
So, do the simpler dumber thing and worry about future strictness
improvements later.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-20-armbru@redhat.com>
QAPISchemaVariant's "variants" field is typed as
List[QAPISchemaVariant], where the typing for QAPISchemaVariant allows
its type field to be any QAPISchemaType.
However, QAPISchemaVariant expects that all of its variants contain the
narrower QAPISchemaObjectType. This relationship is enforced at runtime
in QAPISchemaVariants.check(). This relationship is not embedded in the
type system though, so QAPISchemaVariants.check_clash() needs to
re-assert this property in order to call
QAPISchemaVariant.type.check_clash().
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-19-armbru@redhat.com>
There are two related changes here:
(1) We need to perform type narrowing for resolving the type of
tag_member during check(), and
(2) tag_member is a delayed initialization field, but we can hide it
behind a property that raises an Exception if it's called too
early. This simplifies the typing in quite a few places and avoids
needing to assert that the "tag_member is not None" at a dozen
callsites, which can be confusing and suggest the wrong thing to a
drive-by contributor.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-18-armbru@redhat.com>
Declare, but don't initialize the "members" field with type
List[QAPISchemaObjectTypeMember].
This simplifies the typing from what would otherwise be
Optional[List[T]] to merely List[T]. This removes the need to add
assertions to several callsites that this value is not None - which it
never will be after the delayed initialization in check() anyway.
The type declaration without initialization trick will cause accidental
uses of this field prior to full initialization to raise an
AttributeError.
(Note that it is valid to have an empty members list, see the internal
q_empty object as an example. For this reason, we cannot use the empty
list as a replacement test for full initialization and instead rely on
the _checked/_check_complete fields.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-17-armbru@redhat.com>
Instead of using the None value for the members field, use a dedicated
flag to detect recursive misconfigurations.
This is intended to assist with subsequent patches that seek to remove
the "None" value from the members field (which can never hold that value
after the final call to check()) in order to simplify the static typing
of that field; avoiding the need of assertions littered at many
callsites to eliminate the possibility of the None value.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-16-armbru@redhat.com>
QAPISchemaInfo arguments can often be None because built-in definitions
don't have such information. The type hint can only be
Optional[QAPISchemaInfo] then. But, mypy gets upset about all the
places where we exploit that it can't actually be None there. Add
assertions that will help mypy over the hump, to enable adding type
hints in a forthcoming commit.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-15-armbru@redhat.com>
Adjust the expression at the callsite to work around mypy's weak type
introspection that believes this expression can resolve to
QAPISourceInfo; it cannot.
(Fundamentally: self.info only resolves to false in a boolean expression
when it is None; therefore this expression may only ever produce
Optional[str]. mypy does not know that 'info', when it is a
QAPISourceInfo object, cannot ever be false.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-14-armbru@redhat.com>
QAPISchema.lookup_type('FOO') returns a QAPISchemaType when type 'FOO'
exists, else None. It won't return None for built-in types like
'int'.
Since mypy can't see that, it'll complain that we assign the
Optional[QAPISchemaType] returned by .lookup_type() to QAPISchemaType
variables.
Add assertions to help it over the hump.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-13-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
resolve_type() is generally used to resolve configuration-provided type
names into type objects, and generally requires valid 'info' and 'what'
parameters.
In some cases, such as with QAPISchemaArrayType.check(), resolve_type
may be used to resolve built-in types and as such will not have an
'info' argument, but also must not fail in this scenario.
Use an assertion to sate mypy that we will indeed have 'info' and 'what'
parameters for the error pathway in resolve_type.
Note: there are only three callsites to resolve_type at present where
"info" is perceived by mypy to be possibly None:
1) QAPISchemaArrayType.check()
2) QAPISchemaObjectTypeMember.check()
3) QAPISchemaEvent.check()
Of those three, only the first actually ever passes None; the other two
are limited by their base class initializers which accept info=None, but
neither subclass actually use a None value in practice, currently.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-12-armbru@redhat.com>
We already take care to perform some type narrowing for arg_type and
ret_type, but not in a way where mypy can utilize the result once we add
type hints, e.g.:
qapi/schema.py:833: error: Incompatible types in assignment (expression
has type "QAPISchemaType", variable has type
"Optional[QAPISchemaObjectType]") [assignment]
qapi/schema.py:893: error: Incompatible types in assignment (expression
has type "QAPISchemaType", variable has type
"Optional[QAPISchemaObjectType]") [assignment]
A simple change to use a temporary variable helps the medicine go down.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-10-armbru@redhat.com>
These methods should always return a str, it's only the default abstract
implementation that doesn't. They can be marked "abstract", which
requires subclasses to override the method with the proper return type.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-9-armbru@redhat.com>
A QAPISchemaArrayType's element type gets resolved only during .check().
We have QAPISchemaArrayType.__init__() initialize self.element_type =
None, and .check() assign the actual type. Using .element_type before
.check() is wrong, and hopefully crashes due to the value being None.
Works.
However, it makes for awkward typing. With .element_type:
Optional[QAPISchemaType], mypy is of course unable to see that it's None
before .check(), and a QAPISchemaType after. To help it over the hump,
we'd have to assert self.element_type is not None before all the (valid)
uses. The assertion catches invalid uses, but only at run time; mypy
can't flag them.
Instead, declare .element_type in .__init__() as QAPISchemaType
*without* initializing it. Using .element_type before .check() now
certainly crashes, which is an improvement. Mypy still can't flag
invalid uses, but that's okay.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-8-armbru@redhat.com>
A QAPISchemaObjectTypeMember's type gets resolved only during .check().
We have QAPISchemaObjectTypeMember.__init__() initialize self.type =
None, and .check() assign the actual type. Using .type before .check()
is wrong, and hopefully crashes due to the value being None. Works.
However, it makes for awkward typing. With .type:
Optional[QAPISchemaType], mypy is of course unable to see that it's None
before .check(), and a QAPISchemaType after. To help it over the hump,
we'd have to assert self.type is not None before all the (valid) uses.
The assertion catches invalid uses, but only at run time; mypy can't
flag them.
Instead, declare .type in .__init__() as QAPISchemaType *without*
initializing it. Using .type before .check() now certainly crashes,
which is an improvement. Mypy still can't flag invalid uses, but that's
okay.
Addresses typing errors such as these:
qapi/schema.py:657: error: "None" has no attribute "alternate_qtype" [attr-defined]
qapi/schema.py:662: error: "None" has no attribute "describe" [attr-defined]
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-7-armbru@redhat.com>
Include entities don't have names, but we generally expect "entities" to
have names. Reclassify all entities with names as *definitions*, leaving
the nameless include entities as QAPISchemaEntity instances.
This is primarily to help simplify typing around expectations of what
callers expect for properties of an "entity".
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240315152301.3621858-6-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Manual change. Remove the definition in
include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-11-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, manually
removing the definition in include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-10-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[Straightforward conflict with commit aeaafb1e59 (migration: export
migration_is_running) resolved]
QERR_INVALID_PARAMETER_VALUE is defined as:
#define QERR_INVALID_PARAMETER_VALUE \
"Parameter '%s' expects %s"
The current error is formatted as:
"Parameter 'vcpu_dirty_limit' expects is invalid, it must greater then 1 MB/s"
Replace by:
"Parameter 'vcpu_dirty_limit' must be greater than 1 MB/s"
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-9-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[New error message corrected, commit message updated accordingly]
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Manual changes (escaping the format in qapi/visit.py).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-8-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using the following
coccinelle semantic patch:
@match@
expression errp;
expression param;
constant value;
@@
error_setg(errp, QERR_INVALID_PARAMETER_TYPE, param, value);
@script:python strformat depends on match@
value << match.value;
fixedfmt; // new var
@@
fixedfmt = f'"Invalid parameter type for \'%s\', expected: {value[1:-1]}"'
coccinelle.fixedfmt = cocci.make_ident(fixedfmt)
@replace@
expression match.errp;
expression match.param;
constant match.value;
identifier strformat.fixedfmt;
@@
- error_setg(errp, QERR_INVALID_PARAMETER_TYPE, param, value);
+ error_setg(errp, fixedfmt, param);
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-7-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using:
$ sed -i -e "s/QERR_INVALID_PARAMETER,/\"Invalid parameter '%s'\",/" \
$(git grep -lw QERR_INVALID_PARAMETER)
Manually simplify qemu_opts_create(), and remove the macro definition
in include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-6-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, and manual cleanup.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-5-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, and manual cleanup.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-4-armbru@redhat.com>
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, and manual cleanup.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-3-armbru@redhat.com>
Migration pull for 9.1
- Het's new test cases for "channels"
- Het's fix for a typo for vsock parsing
- Cedric's VFIO error report series
- Cedric's one more patch for dirty-bitmap error reports
- Zhijian's rdma deprecation patch
- Yuan's zeropage optimization to fix double faults on anon mem
- Zhijian's COLO fix on a crash
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZig4HxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbQiwD/V5nSJzSuAG4Ra1Fjo+LRG2TT6qk8eNCi
# fIytehSw6cYA/0wqarxOF0tr7ikeyhtG3w4xFf44kk6KcPkoVSl1tqoL
# =pJmQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Apr 2024 03:37:19 PM PDT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
# gpg: aka "Peter Xu <peterx@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240423-pull-request' of https://gitlab.com/peterx/qemu: (26 commits)
migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
migration/multifd: solve zero page causing multiple page faults
migration: Add Error** argument to add_bitmaps_to_list()
migration: Modify ram_init_bitmaps() to report dirty tracking errors
migration: Add Error** argument to xbzrle_init()
migration: Add Error** argument to ram_state_init()
memory: Add Error** argument to the global_dirty_log routines
migration: Introduce ram_bitmaps_destroy()
memory: Add Error** argument to .log_global_start() handler
migration: Add Error** argument to .load_setup() handler
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to qemu_savevm_state_setup()
migration: Add Error** argument to vmstate_save()
migration: Always report an error in ram_save_setup()
migration: Always report an error in block_save_setup()
vfio: Always report an error in vfio_save_setup()
s390/stattrib: Add Error** argument to set_migrationmode() handler
tests/qtest/migration: Fix typo for vsock in SocketAddress_to_str
tests/qtest/migration: Add negative tests to validate migration QAPIs
tests/qtest/migration: Add multifd_tcp_plain test using list of channels instead of uri
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* cleanups for stubs
* do not link pixman automatically into all targets
* optimize computation of VGA dirty memory region
* kvm: use configs/ definition to conditionalize debug support
* hw: Add compat machines for 9.1
* target/i386: add guest-phys-bits cpu property
* target/i386: Introduce Icelake-Server-v7 and SierraForest models
* target/i386: Export RFDS bit to guests
* q35: SMM ranges cleanups
* target/i386: basic support for confidential guests
* linux-headers: update headers
* target/i386: SEV: use KVM_SEV_INIT2 if possible
* kvm: Introduce support for memory_attributes
* RAMBlock: Add support of KVM private guest memfd
* Consolidate use of warn_report_once()
* pythondeps.toml: warn about updates needed to docs/requirements.txt
* target/i386: always write 32-bits for SGDT and SIDT
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmYn1UkUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroO1nwgAhRQhkYcdtFc649WJWTNvJCNzmek0
# Sg7trH2NKlwA75zG8Qv4TR3E71UrXoY9oItwYstc4Erz+tdf73WyaHMF3cEk1p82
# xx3LcBYhP7jGSjabxTkZsFU8+MM1raOjRN/tHvfcjYLaJOqJZplnkaVhMbNPsVuM
# IPJ5bVQohxpmHKPbeFNpF4QJ9wGyZAYOfJOFCk09xQtHnA8CtFjS9to33QPAR/Se
# OVZwRCigVjf0KNmCnHC8tJHoW8pG/cdQAr3qqd397XbM1vVELv9fiXiMoGF78UsY
# trO4K2yg6N5Sly4Qv/++zZ0OZNkL3BREGp3wf4eTSvLXxqSGvfi8iLpFGA==
# =lwSL
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Apr 2024 08:35:37 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (63 commits)
target/i386/translate.c: always write 32-bits for SGDT and SIDT
pythondeps.toml: warn about updates needed to docs/requirements.txt
accel/tcg/icount-common: Consolidate the use of warn_report_once()
target/i386/cpu: Merge the warning and error messages for AMD HT check
target/i386/cpu: Consolidate the use of warn_report_once()
target/i386/host-cpu: Consolidate the use of warn_report_once()
kvm/tdx: Ignore memory conversion to shared of unassigned region
kvm/tdx: Don't complain when converting vMMIO region to shared
kvm: handle KVM_EXIT_MEMORY_FAULT
physmem: Introduce ram_block_discard_guest_memfd_range()
RAMBlock: make guest_memfd require uncoordinated discard
HostMem: Add mechanism to opt in kvm guest memfd via MachineState
kvm/memory: Make memory type private by default if it has guest memfd backend
kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
RAMBlock: Add support of KVM private guest memfd
kvm: Introduce support for memory_attributes
trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
hw/i386/sev: Use legacy SEV VM types for older machine types
i386/sev: Add 'legacy-vm-type' parameter for SEV guest objects
target/i386: SEV: use KVM_SEV_INIT2 if possible
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
bdrv_activate_all() should not be called from the coroutine context, move
it to the QEMU thread colo_process_incoming_thread() with the bql_lock
protected.
The backtrace is as follows:
#4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
#5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
#6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
#7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
#8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
#9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
#10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
Cc: qemu-stable@nongnu.org
Cc: Fabiano Rosas <farosas@suse.de>
Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Tested-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240417025634.1014582-1-lizhijian@fujitsu.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Implemented recvbitmap tracking of received pages in multifd.
If the zero page appears for the first time in the recvbitmap, this
page is not checked and set.
If the zero page has already appeared in the recvbitmap, there is no
need to check the data but directly set the data to 0, because it is
unlikely that the zero page will be migrated multiple times.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240401154110.2028453-2-yuan1.liu@intel.com
[peterx: touch up the comment, as the bitmap is used outside postcopy now]
Signed-off-by: Peter Xu <peterx@redhat.com>
The .save_setup() handler has now an Error** argument that we can use
to propagate errors reported by the .log_global_start() handler. Do
that for the RAM. The caller qemu_savevm_state_setup() will store the
error under the migration stream for later detection in the migration
sequence.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320064911.545001-15-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Now that the log_global*() handlers take an Error** parameter and
return a bool, do the same for memory_global_dirty_log_start() and
memory_global_dirty_log_stop(). The error is reported in the callers
for now and it will be propagated in the call stack in the next
changes.
To be noted a functional change in ram_init_bitmaps(), if the dirty
pages logger fails to start, there is no need to synchronize the dirty
pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
similar way.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-12-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier setting the migration state from
MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
In qemu_savevm_state(), move the cleanup to preserve the error
reported by .save_setup() handlers.
Since the previous behavior was to ignore errors at this step of
migration, this change should be examined closely to check that
cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-7-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Refactor migrate_get_socket_address to internally utilize 'socket-address'
parameter, reducing redundancy in the function definition.
migrate_get_socket_address implicitly converts SocketAddress into str.
Move migrate_get_socket_address inside migrate_get_connect_uri which
should return the uri string instead.
Signed-off-by: Het Gala <het.gala@nutanix.com>
Suggested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240312202634.63349-4-het.gala@nutanix.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits
or 32-bits depending upon the operand size, but this is incorrect. Not only do
the Intel CPU manuals give contradictory information between processor
revisions, but this information doesn't even match real-life behaviour.
In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT
and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with
Win32s to function correctly. Remove the masking applied due to the operand size
for SGDT and SIDT so that the TCG behaviour matches the behaviour on real
hardware.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
--
MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this
patch fixes the issue in WFW 3.11 with Win32s. For more technical information I
highly recommend the excellent write-up at
https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/.
Message-ID: <20240419195147.434894-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
docs/requirements.txt is expected by readthedocs and should be in sync
with pythondeps.toml. Add a comment to both.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, the difference between warn_report_once() and
error_report_once() is the former has the "warning:" prefix, while the
latter does not have a similar level prefix.
At the meantime, considering that there is no error handling logic here,
and the purpose of error_report_once() is only to prompt the user with
an abnormal message, there is no need to use an error-level message here,
and instead we can just use a warning.
Therefore, downgrade the message in error_report_once() to warning, and
merge it into the previous warn_report_once().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240327103951.3853425-4-zhao1.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The difference between error_printf() and error_report() is the latter
may contain more information, such as the name of the program
("qemu-system-x86_64").
Thus its variant error_report_once() and warn_report()'s variant
warn_report_once() can be used here to print the information only once
without a static local variable "ht_warned".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240327103951.3853425-3-zhao1.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region
which kvm memslot isn't assigned to (except in-kernel emulation).
qemu has the memory region for vMMIO at each device level.
While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO
region, qemu doesn't find corresponding vMMIO region because it's before
PCI device allocation and memory_region_find() finds the device region, not
PCI bus region. It's safe to ignore MapGPA(to-shared) because when guest
accesses those region they use GPA with shared bit set for vMMIO. Ignore
memory conversion request of non-assigned region to shared and return
success. Otherwise OVMF is confused and panics there.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240229063726.610065-35-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Upon an KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory
conversion on the RAMBlock to turn the memory into desired attribute,
switching between private and shared.
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when
KVM_EXIT_MEMORY_FAULT happens.
Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has
guest_memfd memory backend.
Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is
added.
When page is converted from shared to private, the original shared
memory can be discarded via ram_block_discard_range(). Note, shared
memory can be discarded only when it's not back'ed by hugetlb because
hugetlb is supposed to be pre-allocated and no need for discarding.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240320083945.991426-13-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some subsystems like VFIO might disable ram block discard, but guest_memfd
uses discard operations to implement conversions between private and
shared memory. Because of this, sequences like the following can result
in stale IOMMU mappings:
1. allocate shared page
2. convert page shared->private
3. discard shared page
4. convert page private->shared
5. allocate shared page
6. issue DMA operations against that shared page
This is not a use-after-free, because after step 3 VFIO is still pinning
the page. However, DMA operations in step 6 will hit the old mapping
that was allocated in step 1.
Address this by taking ram_block_discard_is_enabled() into account when
deciding whether or not to discard pages.
Since kvm_convert_memory()/guest_memfd doesn't implement a
RamDiscardManager handler to convey and replay discard operations,
this is a case of uncoordinated discard, which is blocked/released
by ram_block_discard_require(). Interestingly, this function had
no use so far.
Alternative approaches would be to block discard of shared pages, but
this would cause guests to consume twice the memory if they use VFIO;
or to implement a RamDiscardManager and only block uncoordinated
discard, i.e. use ram_block_coordinated_discard_require().
[Commit message mostly by Michael Roth <michael.roth@amd.com>]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new member "guest_memfd" to memory backends. When it's set
to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
guest_memfd will be allocated during RAMBlock allocation.
Memory backend's @guest_memfd is wired with @require_guest_memfd
field of MachineState. It avoid looking up the machine in phymem.c.
MachineState::require_guest_memfd is supposed to be set by any VMs
that requires KVM guest memfd as private memory, e.g., TDX VM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-8-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM side leaves the memory to shared by default, which may incur the
overhead of paging conversion on the first visit of each page. Because
the expectation is that page is likely to private for the VMs that
require private memory (has guest memfd).
Explicitly set the memory to private when memory region has valid
guest memfd backend.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240320083945.991426-16-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add KVM guest_memfd support to RAMBlock so both normal hva based memory
and kvm guest memfd based private memory can be associated in one RAMBlock.
Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to
create private guest_memfd during RAMBlock setup.
Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
is more flexible and extensible than simply relying on the VM type because
in the future we may have the case that not all the memory of a VM need
guest memfd. As a benefit, it also avoid getting MachineState in memory
subsystem.
Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as TDX VM. How and when to set it for memory
backends will be implemented in the following patches.
Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
KVM guest_memfd allocated.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-7-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce the helper functions to set the attributes of a range of
memory to private or shared.
This is necessary to notify KVM the private/shared attribute of each gpa
range. KVM needs the information to decide the GPA needs to be mapped at
hva-based shared memory or guest_memfd based private memory.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240320083945.991426-11-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer 9.1 machine types will default to using the KVM_SEV_INIT2 API for
creating SEV/SEV-ES going forward. However, this API results in guest
measurement changes which are generally not expected for users of these
older guest types and can cause disruption if they switch to a newer
QEMU/kernel version. Avoid this by continuing to use the older
KVM_SEV_INIT/KVM_SEV_ES_INIT APIs for older machine types.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240409230743.962513-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU will currently automatically make use of the KVM_SEV_INIT2 API for
initializing SEV and SEV-ES guests verses the older
KVM_SEV_INIT/KVM_SEV_ES_INIT interfaces.
However, the older interfaces will silently avoid sync'ing FPU/XSAVE
state to the VMSA prior to encryption, thus relying on behavior and
measurements that assume the related fields to be allow zero.
With KVM_SEV_INIT2, this state is now synced into the VMSA, resulting in
measurements changes and, theoretically, behaviorial changes, though the
latter are unlikely to be seen in practice.
To allow a smooth transition to the newer interface, while still
providing a mechanism to maintain backward compatibility with VMs
created using the older interfaces, provide a new command-line
parameter:
-object sev-guest,legacy-vm-type=true,...
and have it default to false.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240409230743.962513-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement support for the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM virtual
machine types, and the KVM_SEV_INIT2 function of KVM_MEMORY_ENCRYPT_OP.
These replace the KVM_SEV_INIT and KVM_SEV_ES_INIT functions, and have
several advantages:
- sharing the initialization sequence with SEV-SNP and TDX
- allowing arguments including the set of desired VMSA features
- protection against invalid use of KVM_GET/SET_* ioctls for guests
with encrypted state
If the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types are not supported,
fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT (which use the
default x86 VM type).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM is introducing a new API to create confidential guests, which
will be used by TDX and SEV-SNP but is also available for SEV and
SEV-ES. The API uses the VM type argument to KVM_CREATE_VM to
identify which confidential computing technology to use.
Since there are no other expected uses of VM types, delegate
mc->kvm_type() for x86 boards to the confidential-guest-support
object pointed to by ms->cgs.
For example, if a sev-guest object is specified to confidential-guest-support,
like,
qemu -machine ...,confidential-guest-support=sev0 \
-object sev-guest,id=sev0,...
it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
is supported, and if so use them together with the KVM_SEV_INIT2
function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to
KVM_SEV_INIT and KVM_SEV_ES_INIT.
This is a preparatory work towards TDX and SEV-SNP support, but it
will also enable support for VMSA features such as DebugSwap, which
are only available via KVM_SEV_INIT2.
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a common superclass for x86 confidential guest implementations.
It will extend ConfidentialGuestSupportClass with a method that provides
the VM type to be passed to KVM_CREATE_VM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Board reset requires writing a fresh CPU state. As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
guest state is encrypted, in which case they do nothing. For the new
API using VM types, instead, the ioctls will fail which is a safer and
more robust approach.
The new API will be the only one available for SEV-SNP and TDX, but it
is also usable for SEV and SEV-ES. In preparation for that, require
architecture-specific KVM code to communicate the point at which guest
state is protected (which must be after kvm_cpu_synchronize_post_init(),
though that might change in the future in order to suppor migration).
From that point, skip reading registers so that cpu->vcpu_dirty is
never true: if it ever becomes true, kvm_arch_put_registers() will
fail miserably.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now, the system reset is concluded by a call to
cpu_synchronize_all_post_reset() in order to sync any changes
that the machine reset callback applied to the CPU state.
However, for VMs with encrypted state such as SEV-ES guests (currently
the only case of guests with non-resettable CPUs) this cannot be done,
because guest state has already been finalized by machine-init-done notifiers.
cpu_synchronize_all_post_reset() does nothing on these guests, and actually
we would like to make it fail if called once guest has been encrypted.
So, assume that boards that support non-resettable CPUs do not touch
CPU state and that all such setup is done before, at the time of
cpu_synchronize_all_post_init().
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Data structures like struct setup_data have been moved to a separate
setup_data.h header which bootparam.h relies on. Add setup_data.h to
the cp_portable() list and sync it along with the other header files.
Note that currently struct setup_data is stripped away as part of
generating bootparam.h, but that handling is no currently needed for
setup_data.h since it doesn't pull in many external
headers/dependencies. However, QEMU currently redefines struct
setup_data in hw/i386/x86.c, so that will need to be removed as part of
any header update that pulls in the new setup_data.h to avoid build
bisect breakage.
Because <asm/setup_data.h> is the first architecture specific #include
in include/standard-headers/, add a new sed substitution to rewrite
asm/ include to the standard-headers/asm-* subdirectory for the current
architecture.
And while at it, remove asm-generic/kvm_para.h from the list of
allowed includes: it does not have a matching substitution, and therefore
it would not be possible to use it on non-Linux systems where there is
no /usr/include/asm-generic/ directory.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the unified interface to call confidential guest related kvm_init()
and kvm_reset(), to avoid exposing pef specific functions.
As a bonus, pef.h goes away since there is no direct call from sPAPR
board code to PEF code anymore.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use confidential_guest_kvm_init() instead of calling SEV
specific sev_kvm_init(). This allows the introduction of multiple
confidential-guest-support subclasses for different x86 vendors.
As a bonus, stubs are not needed anymore since there is no
direct call from target/i386/kvm/kvm.c to SEV code.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20240229060038.606591-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Different confidential VMs in different architectures all have the same
needs to do their specific initialization (and maybe resetting) stuffs
with KVM. Currently each of them exposes individual *_kvm_init()
functions and let machine code or kvm code to call it.
To facilitate the introduction of confidential guest technology from
different x86 vendors, add two virtual functions, kvm_init() and kvm_reset()
in ConfidentialGuestSupportClass, and expose two helpers functions for
invodking them.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20240229060038.606591-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A value 1 of PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system
also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. When PIC
is not enabled (pic=off) for x86 machine, the PCAT_COMPAT bit needs to
be cleared. The PIC probe should then print:
[ 0.155970] Using NULL legacy PIC
However, no such log printed in guest kernel unless PCAT_COMPAT is
cleared.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240403145953.3082491-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to table 1-2 in Intel Architecture Instruction Set Extensions and
Future Features (rev 051) [1], SierraForest has the following new features
which have already been virtualized:
- CMPCCXADD CPUID.(EAX=7,ECX=1):EAX[bit 7]
- AVX-IFMA CPUID.(EAX=7,ECX=1):EAX[bit 23]
- AVX-VNNI-INT8 CPUID.(EAX=7,ECX=1):EDX[bit 4]
- AVX-NE-CONVERT CPUID.(EAX=7,ECX=1):EDX[bit 5]
Add above features to new CPU model SierraForest. Comparing with GraniteRapids
CPU model, SierraForest bare-metal removes the following features:
- HLE CPUID.(EAX=7,ECX=0):EBX[bit 4]
- RTM CPUID.(EAX=7,ECX=0):EBX[bit 11]
- AVX512F CPUID.(EAX=7,ECX=0):EBX[bit 16]
- AVX512DQ CPUID.(EAX=7,ECX=0):EBX[bit 17]
- AVX512_IFMA CPUID.(EAX=7,ECX=0):EBX[bit 21]
- AVX512CD CPUID.(EAX=7,ECX=0):EBX[bit 28]
- AVX512BW CPUID.(EAX=7,ECX=0):EBX[bit 30]
- AVX512VL CPUID.(EAX=7,ECX=0):EBX[bit 31]
- AVX512_VBMI CPUID.(EAX=7,ECX=0):ECX[bit 1]
- AVX512_VBMI2 CPUID.(EAX=7,ECX=0):ECX[bit 6]
- AVX512_VNNI CPUID.(EAX=7,ECX=0):ECX[bit 11]
- AVX512_BITALG CPUID.(EAX=7,ECX=0):ECX[bit 12]
- AVX512_VPOPCNTDQ CPUID.(EAX=7,ECX=0):ECX[bit 14]
- LA57 CPUID.(EAX=7,ECX=0):ECX[bit 16]
- TSXLDTRK CPUID.(EAX=7,ECX=0):EDX[bit 16]
- AMX-BF16 CPUID.(EAX=7,ECX=0):EDX[bit 22]
- AVX512_FP16 CPUID.(EAX=7,ECX=0):EDX[bit 23]
- AMX-TILE CPUID.(EAX=7,ECX=0):EDX[bit 24]
- AMX-INT8 CPUID.(EAX=7,ECX=0):EDX[bit 25]
- AVX512_BF16 CPUID.(EAX=7,ECX=1):EAX[bit 5]
- fast zero-length MOVSB CPUID.(EAX=7,ECX=1):EAX[bit 10]
- fast short CMPSB, SCASB CPUID.(EAX=7,ECX=1):EAX[bit 12]
- AMX-FP16 CPUID.(EAX=7,ECX=1):EAX[bit 21]
- PREFETCHI CPUID.(EAX=7,ECX=1):EDX[bit 14]
- XFD CPUID.(EAX=0xD,ECX=1):EAX[bit 4]
- EPT_PAGE_WALK_LENGTH_5 VMX_EPT_VPID_CAP(0x48c)[bit 7]
Add all features of GraniteRapids CPU model except above features to
SierraForest CPU model.
SierraForest doesn’t support TSX and RTM but supports TAA_NO. When RTM is
not enabled in host, KVM will not report TAA_NO. So, just don't include
TAA_NO in SierraForest CPU model.
[1] https://cdrdv2.intel.com/v1/dl/getContent/671368
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240320021044.508263-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When start L2 guest with both L1/L2 using Icelake-Server-v3 or above,
QEMU reports below warning:
"warning: host doesn't support requested feature: MSR(10AH).taa-no [bit 8]"
Reason is QEMU Icelake-Server-v3 has TSX feature disabled but enables taa-no
bit. It's meaningless that TSX isn't supported but still claim TSX is secure.
So L1 KVM doesn't expose taa-no to L2 if TSX is unsupported, then starting L2
triggers the warning.
Fix it by introducing a new version Icelake-Server-v7 which has both TSX
and taa-no features. Then guest can use TSX securely when it see taa-no.
This matches the production Icelake which supports TSX and isn't susceptible
to TSX Async Abort (TAA) vulnerabilities, a.k.a, taa-no.
Ideally, TSX should have being enabled together with taa-no since v3, but for
compatibility, we'd better to add v7 to enable it.
Fixes: d965dc3559 ("target/i386: Add ARCH_CAPABILITIES related bits into Icelake-Server CPU model")
Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-ID: <20240320093138.80267-2-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.
For now this is just a cleanup, so keep the function static.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240229063726.610065-23-xiaoyao.li@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Query kvm for supported guest physical address bits, in cpuid
function 80000008, eax[23:16]. Usually this is identical to host
physical address bits. With NPT or EPT being used this might be
restricted to 48 (max 4-level paging address space size) even if
the host cpu supports more physical address bits.
When set pass this to the guest, using cpuid too. Guest firmware
can use this to figure how big the usable guest physical address
space is, so PCI bar mapping are actually reachable.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240318155336.156197-2-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If an architecture adds support for KVM_CAP_SET_GUEST_DEBUG but QEMU does not
have the necessary code, QEMU will fail to build after updating kernel headers.
Avoid this by using a #define in config-target.h instead of KVM_CAP_SET_GUEST_DEBUG.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Take into account split screen mode close to wrap around, which is the
other special case for dirty memory region computation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The depth == 0 and depth == 15 have to be special cased because
width * depth / 8 does not provide the correct scanline length.
However, thanks to the recent reorganization of vga_draw_graphic()
the correct value of VRAM bits per pixel is available in "bits".
Use it (via the same "bwidth" computation that is used later in
the function), thus restricting the slow path to the wraparound case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently it is not documented anywhere why some functions need to
be stubbed.
Group the files in stubs/meson.build according to who needs them, both
to reduce the size of the compilation and to clarify the use of stubs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-18-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
replay.c symbols are only needed by user mode emulation, with the
exception of replay_mode that is needed by both user mode emulation
(by way of qemu_guest_getrandom) and block layer tools (by way of
util/qemu-timer.c).
Since it is needed by libqemuutil rather than specific files that
are part of the tools and emulators, split the replay_mode stub
into its own file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-17-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the semihosting stubs are needed exactly when the Kconfig symbols
are not needed, move them to semihosting/ and conditionalize them
on CONFIG_SEMIHOSTING and/or CONFIG_SYSTEM_ONLY.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only the files in hwcore_ss[] are required to link a user emulation
binary.
Have meson process the hw/ sub-directories if system emulation is
selected, otherwise directly process hw/core/ to get hwcore_ss[], which
is the only set required by user emulation.
This removes about 10% from the time needed to run
"../configure --disable-system --disable-tools --disable-guest-agent".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240404194757.9343-8-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Try not to test code that is not used by user mode emulation, or by the
block layer, unless they are being compiled; and fix test-timed-average
which was not compiled with --disable-system --enable-tools.
This is by no means complete, it only touches the more blatantly
wrong cases.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240408155330.522792-5-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 30896374 started to pass the full BlockConf from usb-storage to
scsi-disk, while previously only a few select properties would be
forwarded. This enables the user to set more properties, e.g. the block
size, that are actually taking effect.
However, now the calls to blkconf_apply_backend_options() and
blkconf_blocksizes() in usb_msd_storage_realize() that modify some of
these properties take effect, too, instead of being silently ignored.
This means at least that the block sizes get an unconditional default of
512 bytes before the configuration is passed to scsi-disk.
Before commit 30896374, the property wouldn't be set for scsi-disk and
therefore the device dependent defaults would apply - 512 for scsi-hd,
but 2048 for scsi-cd. The latter default has now become 512, too, which
makes at least Windows 11 installation fail when installing from
usb-storage.
Fix this by simply not calling these functions any more in usb-storage
and passing BlockConf on unmodified (except for the BlockBackend). The
same functions are called by the SCSI code anyway and it sets the right
defaults for the actual media type.
Fixes: 3089637461 ('scsi: Don't ignore most usb-storage properties')
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2260
Reported-by: Jonas Svensson
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-id: 20240412144202.13786-1-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move calculation of mask after the switch which sets the function
number for PIRQ/PINT pins to make sure the state of these pins are
kept track of separately and IRQ is raised if any of them is active.
Cc: qemu-stable@nongnu.org
Fixes: 7e01bd80c1 hw/isa/vt82c686: Bring back via_isa_set_irq()
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240410222543.0EA534E6005@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
During the booting process of the non-standard image, the behavior of the
called function in qemu is as follows:
1. vhost_net_stop() was triggered by guest image. This will call the function
virtio_pci_set_guest_notifiers() with assgin= false,
virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
2. virtio_reset() was triggered, this will set configure vector to VIRTIO_NO_VECTOR
3.vhost_net_start() was called (at this time, the configure vector is
still VIRTIO_NO_VECTOR) and then call virtio_pci_set_guest_notifiers() with
assgin=true, so the irqfd for vector 0 is still not "init" during this process
4. The system continues to boot and sets the vector back to 0. After that
msix_fire_vector_notifier() was triggered to unmask the vector 0 and meet the crash
To fix the issue, we need to support changing the vector after VIRTIO_CONFIG_S_DRIVER_OK is set.
(gdb) bt
0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
at pthread_kill.c:44
1 0x00007fc87148ec53 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
2 0x00007fc87143e956 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
3 0x00007fc8714287f4 in __GI_abort () at abort.c:79
4 0x00007fc87142871b in __assert_fail_base
(fmt=0x7fc8715bbde0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5606413efd53 "ret == 0", file=0x5606413ef87d "../accel/kvm/kvm-all.c", line=1837, function=<optimized out>) at assert.c:92
5 0x00007fc871437536 in __GI___assert_fail
(assertion=0x5606413efd53 "ret == 0", file=0x5606413ef87d "../accel/kvm/kvm-all.c", line=1837, function=0x5606413f06f0 <__PRETTY_FUNCTION__.19> "kvm_irqchip_commit_routes") at assert.c:101
6 0x0000560640f884b5 in kvm_irqchip_commit_routes (s=0x560642cae1f0) at ../accel/kvm/kvm-all.c:1837
7 0x0000560640c98f8e in virtio_pci_one_vector_unmask
(proxy=0x560643c65f00, queue_no=4294967295, vector=0, msg=..., n=0x560643c6e4c8)
at ../hw/virtio/virtio-pci.c:1005
8 0x0000560640c99201 in virtio_pci_vector_unmask (dev=0x560643c65f00, vector=0, msg=...)
at ../hw/virtio/virtio-pci.c:1070
9 0x0000560640bc402e in msix_fire_vector_notifier (dev=0x560643c65f00, vector=0, is_masked=false)
at ../hw/pci/msix.c:120
10 0x0000560640bc40f1 in msix_handle_mask_update (dev=0x560643c65f00, vector=0, was_masked=true)
at ../hw/pci/msix.c:140
11 0x0000560640bc4503 in msix_table_mmio_write (opaque=0x560643c65f00, addr=12, val=0, size=4)
at ../hw/pci/msix.c:231
12 0x0000560640f26d83 in memory_region_write_accessor
(mr=0x560643c66540, addr=12, value=0x7fc86b7bc628, size=4, shift=0, mask=4294967295, attrs=...)
at ../system/memory.c:497
13 0x0000560640f270a6 in access_with_adjusted_size
(addr=12, value=0x7fc86b7bc628, size=4, access_size_min=1, access_size_max=4, access_fn=0x560640f26c8d <memory_region_write_accessor>, mr=0x560643c66540, attrs=...) at ../system/memory.c:573
14 0x0000560640f2a2b5 in memory_region_dispatch_write (mr=0x560643c66540, addr=12, data=0, op=MO_32, attrs=...)
at ../system/memory.c:1521
15 0x0000560640f37bac in flatview_write_continue
(fv=0x7fc65805e0b0, addr=4273803276, attrs=..., ptr=0x7fc871e9c028, len=4, addr1=12, l=4, mr=0x560643c66540)
at ../system/physmem.c:2714
16 0x0000560640f37d0f in flatview_write
(fv=0x7fc65805e0b0, addr=4273803276, attrs=..., buf=0x7fc871e9c028, len=4) at ../system/physmem.c:2756
17 0x0000560640f380bf in address_space_write
(as=0x560642161ae0 <address_space_memory>, addr=4273803276, attrs=..., buf=0x7fc871e9c028, len=4)
at ../system/physmem.c:2863
18 0x0000560640f3812c in address_space_rw
(as=0x560642161ae0 <address_space_memory>, addr=4273803276, attrs=..., buf=0x7fc871e9c028, len=4, is_write=true) at ../system/physmem.c:2873
--Type <RET> for more, q to quit, c to continue without paging--
19 0x0000560640f8aa55 in kvm_cpu_exec (cpu=0x560642f205e0) at ../accel/kvm/kvm-all.c:2915
20 0x0000560640f8d731 in kvm_vcpu_thread_fn (arg=0x560642f205e0) at ../accel/kvm/kvm-accel-ops.c:51
21 0x00005606411949f4 in qemu_thread_start (args=0x560642f292b0) at ../util/qemu-thread-posix.c:541
22 0x00007fc87148cdcd in start_thread (arg=<optimized out>) at pthread_create.c:442
23 0x00007fc871512630 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)
MST: coding style and typo fixups
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-ID: <2321ade5f601367efe7380c04e3f61379c59b48f.1713173550.git.mst@redhat.com>
Cc: Lei Yang <leiyang@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cindy Lu <lulu@redhat.com>
Our Makefile massages the given make arguments to invoke ninja
accordingly. One key difference is that ninja will parallelize by
default, whereas make only does so with -j<n> or -j. The make man page
says that "if the -j option is given without an argument, make will not
limit the number of jobs that can run simultaneously". We use to support
that by replacing -j with "" (empty string) when calling ninja, so that
it would do its auto-parallelization based on the number of CPU cores.
This was accidentally broken at d1ce2cc95b (Makefile: preserve
--jobserver-auth argument when calling ninja, 2024-04-02),
causing `make -j` to fail:
$ make -j V=1
/usr/bin/ninja -v -j -d keepdepfile all | cat
make -C contrib/plugins/ V="1" TARGET_DIR="contrib/plugins/" all
ninja: fatal: invalid -j parameter
make: *** [Makefile:161: run-ninja] Error
Let's fix that and indent the touched code for better readability.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Fixes: d1ce2cc95b ("Makefile: preserve --jobserver-auth argument when calling ninja", 2024-04-02)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Per "SD Host Controller Standard Specification Version 3.00":
* 2.2.5 Transfer Mode Register (Offset 00Ch)
Writes to this register shall be ignored when the Command
Inhibit (DAT) in the Present State register is 1.
Do not update the TRNMOD register when Command Inhibit (DAT)
bit is set to avoid the present-status register going out of
sync, leading to malicious guest using DMA mode and overflowing
the FIFO buffer:
$ cat << EOF | qemu-system-i386 \
-display none -nographic -nodefaults \
-machine accel=qtest -m 512M \
-device sdhci-pci,sd-spec-version=3 \
-device sd-card,drive=mydrive \
-drive if=none,index=0,file=null-co://,format=raw,id=mydrive \
-qtest stdio
outl 0xcf8 0x80001013
outl 0xcfc 0x91
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0x9100002c 0x1 0x05
write 0x91000058 0x1 0x16
write 0x91000005 0x1 0x04
write 0x91000028 0x1 0x08
write 0x16 0x1 0x21
write 0x19 0x1 0x20
write 0x9100000c 0x1 0x01
write 0x9100000e 0x1 0x20
write 0x9100000f 0x1 0x00
write 0x9100000c 0x1 0x00
write 0x91000020 0x1 0x00
EOF
Stack trace (part):
=================================================================
==89993==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x615000029900 at pc 0x55d5f885700d bp 0x7ffc1e1e9470 sp 0x7ffc1e1e9468
WRITE of size 1 at 0x615000029900 thread T0
#0 0x55d5f885700c in sdhci_write_dataport hw/sd/sdhci.c:564:39
#1 0x55d5f8849150 in sdhci_write hw/sd/sdhci.c:1223:13
#2 0x55d5fa01db63 in memory_region_write_accessor system/memory.c:497:5
#3 0x55d5fa01d245 in access_with_adjusted_size system/memory.c:573:18
#4 0x55d5fa01b1a9 in memory_region_dispatch_write system/memory.c:1521:16
#5 0x55d5fa09f5c9 in flatview_write_continue system/physmem.c:2711:23
#6 0x55d5fa08f78b in flatview_write system/physmem.c:2753:12
#7 0x55d5fa08f258 in address_space_write system/physmem.c:2860:18
...
0x615000029900 is located 0 bytes to the right of 512-byte region
[0x615000029700,0x615000029900) allocated by thread T0 here:
#0 0x55d5f7237b27 in __interceptor_calloc
#1 0x7f9e36dd4c50 in g_malloc0
#2 0x55d5f88672f7 in sdhci_pci_realize hw/sd/sdhci-pci.c:36:5
#3 0x55d5f844b582 in pci_qdev_realize hw/pci/pci.c:2092:9
#4 0x55d5fa2ee74b in device_set_realized hw/core/qdev.c:510:13
#5 0x55d5fa325bfb in property_set_bool qom/object.c:2358:5
#6 0x55d5fa31ea45 in object_property_set qom/object.c:1472:5
#7 0x55d5fa332509 in object_property_set_qobject om/qom-qobject.c:28:10
#8 0x55d5fa31f6ed in object_property_set_bool qom/object.c:1541:15
#9 0x55d5fa2e2948 in qdev_realize hw/core/qdev.c:292:12
#10 0x55d5f8eed3f1 in qdev_device_add_from_qdict system/qdev-monitor.c:719:10
#11 0x55d5f8eef7ff in qdev_device_add system/qdev-monitor.c:738:11
#12 0x55d5f8f211f0 in device_init_func system/vl.c:1200:11
#13 0x55d5fad0877d in qemu_opts_foreach util/qemu-option.c:1135:14
#14 0x55d5f8f0df9c in qemu_create_cli_devices system/vl.c:2638:5
#15 0x55d5f8f0db24 in qmp_x_exit_preconfig system/vl.c:2706:5
#16 0x55d5f8f14dc0 in qemu_init system/vl.c:3737:9
...
SUMMARY: AddressSanitizer: heap-buffer-overflow hw/sd/sdhci.c:564:39
in sdhci_write_dataport
Add assertions to ensure the fifo_buffer[] is not overflowed by
malicious accesses to the Buffer Data Port register.
Fixes: CVE-2024-3447
Cc: qemu-stable@nongnu.org
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58813
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <CAFEAcA9iLiv1XGTGKeopgMa8Y9+8kvptvsb8z2OBeuy+5=NUfg@mail.gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409145524.27913-1-philmd@linaro.org>
When the MAC Interface Layer (MIL) transmit FIFO is full,
truncate the packet, and raise the Transmitter Error (TXE)
flag.
Broken since model introduction in commit 2a42499017
("LAN9118 emulation").
When using the reproducer from
https://gitlab.com/qemu-project/qemu/-/issues/2267 we get:
hw/net/lan9118.c:798:17: runtime error:
index 2048 out of bounds for type 'uint8_t[2048]' (aka 'unsigned char[2048]')
#0 0x563ec9a057b1 in tx_fifo_push hw/net/lan9118.c:798:43
#1 0x563ec99fbb28 in lan9118_writel hw/net/lan9118.c:1042:9
#2 0x563ec99f2de2 in lan9118_16bit_mode_write hw/net/lan9118.c:1205:9
#3 0x563ecbf78013 in memory_region_write_accessor system/memory.c:497:5
#4 0x563ecbf776f5 in access_with_adjusted_size system/memory.c:573:18
#5 0x563ecbf75643 in memory_region_dispatch_write system/memory.c:1521:16
#6 0x563ecc01bade in flatview_write_continue_step system/physmem.c:2713:18
#7 0x563ecc01b374 in flatview_write_continue system/physmem.c:2743:19
#8 0x563ecbff1c9b in flatview_write system/physmem.c:2774:12
#9 0x563ecbff1768 in address_space_write system/physmem.c:2894:18
...
[*] LAN9118 DS00002266B.pdf, Table 5.3.3 "INTERRUPT STATUS REGISTER"
Cc: qemu-stable@nongnu.org
Reported-by: Will Lester
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2267
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240409133801.23503-3-philmd@linaro.org>
The magic 2048 is explained in the LAN9211 datasheet (DS00002414A)
in chapter 1.4, "10/100 Ethernet MAC":
The MAC Interface Layer (MIL), within the MAC, contains a
2K Byte transmit and a 128 Byte receive FIFO which is separate
from the TX and RX FIFOs. [...]
Note, the use of the constant in lan9118_receive() reveals that
our implementation is using the same buffer for both tx and rx.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240409133801.23503-2-philmd@linaro.org>
nand_command() and nand_getio() don't check @offset points
into the block, nor the available data length (s->iolen) is
not negative.
In order to fix:
- check the offset is in range in nand_blk_load_NAND_PAGE_SIZE(),
- do not set @iolen if blk_load() failed.
Reproducer:
$ cat << EOF | qemu-system-arm -machine tosa \
-monitor none -serial none \
-display none -qtest stdio
write 0x10000111 0x1 0xca
write 0x10000104 0x1 0x47
write 0x1000ca04 0x1 0xd7
write 0x1000ca01 0x1 0xe0
write 0x1000ca04 0x1 0x71
write 0x1000ca00 0x1 0x50
write 0x1000ca04 0x1 0xd7
read 0x1000ca02 0x1
write 0x1000ca01 0x1 0x10
EOF
=================================================================
==15750==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61f000000de0
at pc 0x560e61557210 bp 0x7ffcfc4a59f0 sp 0x7ffcfc4a59e8
READ of size 1 at 0x61f000000de0 thread T0
#0 0x560e6155720f in mem_and hw/block/nand.c:101:20
#1 0x560e6155ac9c in nand_blk_write_512 hw/block/nand.c:663:9
#2 0x560e61544200 in nand_command hw/block/nand.c:293:13
#3 0x560e6153cc83 in nand_setio hw/block/nand.c:520:13
#4 0x560e61a0a69e in tc6393xb_nand_writeb hw/display/tc6393xb.c:380:13
#5 0x560e619f9bf7 in tc6393xb_writeb hw/display/tc6393xb.c:524:9
#6 0x560e647c7d03 in memory_region_write_accessor softmmu/memory.c:492:5
#7 0x560e647c7641 in access_with_adjusted_size softmmu/memory.c:554:18
#8 0x560e647c5f66 in memory_region_dispatch_write softmmu/memory.c:1514:16
#9 0x560e6485409e in flatview_write_continue softmmu/physmem.c:2825:23
#10 0x560e648421eb in flatview_write softmmu/physmem.c:2867:12
#11 0x560e64841ca8 in address_space_write softmmu/physmem.c:2963:18
#12 0x560e61170162 in qemu_writeb tests/qtest/videzzo/videzzo_qemu.c:1080:5
#13 0x560e6116eef7 in dispatch_mmio_write tests/qtest/videzzo/videzzo_qemu.c:1227:28
0x61f000000de0 is located 0 bytes to the right of 3424-byte region [0x61f000000080,0x61f000000de0)
allocated by thread T0 here:
#0 0x560e611276cf in malloc /root/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x7f7959a87e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98)
#2 0x560e64b98871 in object_new qom/object.c:749:12
#3 0x560e64b5d1a1 in qdev_new hw/core/qdev.c:153:19
#4 0x560e61547ea5 in nand_init hw/block/nand.c:639:11
#5 0x560e619f8772 in tc6393xb_init hw/display/tc6393xb.c:558:16
#6 0x560e6390bad2 in tosa_init hw/arm/tosa.c:250:12
SUMMARY: AddressSanitizer: heap-buffer-overflow hw/block/nand.c:101:20 in mem_and
==15750==ABORTING
Broken since introduction in commit 3e3d5815cb ("NAND Flash memory
emulation and ECC calculation helpers for use by NAND controllers").
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1445
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1446
Reported-by: Qiang Liu <cyruscyliu@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409135944.24997-4-philmd@linaro.org>
Introduce virtio_bh_new_guarded(), similar to qemu_bh_new_guarded()
but using the transport memory guard, instead of the device one
(there can only be one virtio device per virtio bus).
Inspired-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240409105537.18308-2-philmd@linaro.org>
Passing the tswapped structure to strace means that
our internal si_type is also gone, which then aborts
in print_siginfo.
Fixes: 4d6d8a05a0 ("linux-user: Move tswap_siginfo out of target code")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already attempted to set and clear can_do_io before the first
and last insns, but only used the initial value of max_insns and
the call to translator_io_start to find those insns.
Now that we track insn_start in DisasContextBase, and now that
we have emit_before_op, we can wait until we have finished
translation to identify the true first and last insns and emit
the sets of can_do_io at that time.
This fixes the case of a translation block which crossed a page
boundary, and for which the second page turned out to be mmio.
In this case we truncate the block, and the previous logic for
can_do_io could leave a block with a single insn with can_do_io
set to false, which would fail an assertion in cpu_io_recompile.
Reported-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To keep the multiple update check, replace insn_start
with insn_start_updated.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To keep the multiple update check, replace insn_start
with insn_start_updated.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To keep the multiple update check, replace insn_start
with insn_start_updated.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
CHECK_NOT_DELAY_SLOT is correctly applied to the branch-related
instructions, but not to the PC-relative mov* instructions.
I verified the existence of an illegal slot exception on a SH7091 when
any of these instructions are attempted inside a delay slot.
This also matches the behavior described in the SH-4 ISA manual.
Signed-off-by: Zack Buhman <zack@buhman.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240407150705.5965-1-zack@buhman.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewd-by: Yoshinori Sato <ysato@users.sourceforge.jp>
The contents of IIAOQ depend on PSW_W.
Follow the text in "Interruption Instruction Address Queues",
pages 2-13 through 2-15.
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Reported-by: Sven Schnelle <svens@stackframe.org>
Fixes: b10700d826 ("target/hppa: Update IIAOQ, IIASQ for pa2.0")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Turned out hard-coding version and date in the Makefile wasn't a bright
idea. Updating it on edk2 updates is easily forgotten. Fetch the info
from git instead. Store in edk2-version, so this can be committed to
the repo and is present in tarballs too.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240327102448.61877-2-kraxel@redhat.com>
Let's not care about what was changed and update the whole config,
reasons:
1. config->geometry should be updated together with capacity, so we fix
a bug.
2. Vhost-user protocol doesn't say anything about config change
limitation. Silent ignore of changes doesn't seem to be correct.
3. vhost-user-vsock reads the whole config
4. on realize we don't do any checks on retrieved config, so no reason
to care here
Comment "valid for resize only" exists since introduction the whole
hw/block/vhost-user-blk.c in commit
00343e4b54
"vhost-user-blk: introduce a new vhost-user-blk host device",
seems it was just an extra limitation.
Also, let's notify guest unconditionally:
1. So does vhost-user-vsock
2. We are going to reuse the functionality in new cases when we do want
to notify the guest unconditionally. So, no reason to create extra
branches in the logic.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20240329183758.3360733-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The set_config callback function vhost_vdpa_device_get_config in
vdpa-dev does not fetch the current device status from the hardware
device, causing the guest os to not receive the latest device status
information.
The hardware updates the config status of the vdpa device and then
notifies the os. The guest os receives an interrupt notification,
triggering a get_config access in the kernel, which then enters qemu
internally. Ultimately, the vhost_vdpa_device_get_config function of
vdpa-dev is called
One scenario encountered is when the device needs to bring down the
vdpa net device. After modifying the status field of virtio_net_config
in the hardware, it sends an interrupt notification. However, the guest
os always receives the STATUS field as VIRTIO_NET_S_LINK_UP.
Signed-off-by: Yuxue Liu <yuxue.liu@jaguarmicro.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20240408020003.1979-1-yuxue.liu@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the event of writing many chains of descriptors, the device must
write just the id of the last buffer in the descriptor chain, skip
forward the number of descriptors in the chain, and then repeat the
operations for the rest of chains.
Current QEMU code writes all the buffer ids consecutively, and then
skips all the buffers altogether. This is a bug, and can be reproduced
with a VirtIONet device with _F_MRG_RXBUB and without
_F_INDIRECT_DESC:
If a virtio-net device has the VIRTIO_NET_F_MRG_RXBUF feature
but not the VIRTIO_RING_F_INDIRECT_DESC feature,
'VirtIONetQueue->rx_vq' will use the merge feature
to store data in multiple 'elems'.
The 'num_buffers' in the virtio header indicates how many elements are merged.
If the value of 'num_buffers' is greater than 1,
all the merged elements will be filled into the descriptor ring.
The 'idx' of the elements should be the value of 'vq->used_idx' plus 'ndescs'.
Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Wafer <wafer@jaguarmicro.com>
Message-Id: <20240407015451.5228-2-wafer@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current handling of invalid virtqueue elements inside the TX/RX virt
queue handlers is wrong.
They are added in a per-stream invalid queue to be processed after the
handler is done examining each message, but the invalid message might
not be specifying any stream_id; which means it's invalid to add it to
any stream->invalid queue since stream could be NULL at this point.
This commit moves the invalid queue to the VirtIOSound struct which
guarantees there will always be a valid temporary place to store them
inside the tx/rx handlers. The queue will be emptied before the handler
returns, so the queue must be empty at any other point of the device's
lifetime.
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <virtio-snd-rewrite-invalid-tx-rx-message-handling-v1.manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch improves error handling in virtio_snd_handle_tx_xfer()
and virtio_snd_handle_rx_xfer() in the VirtIO sound driver. Previously,
'goto' statements were used for error paths, leading to unnecessary
processing and potential null pointer dereferences. Now, 'continue' is
used to skip the rest of the current loop iteration for errors such as
message size discrepancies or null streams, reducing crash risks.
ASAN log illustrating the issue addressed:
ERROR: AddressSanitizer: SEGV on unknown address 0x0000000000b4
#0 0x57cea39967b8 in qemu_mutex_lock_impl qemu/util/qemu-thread-posix.c:92:5
#1 0x57cea128c462 in qemu_mutex_lock qemu/include/qemu/thread.h:122:5
#2 0x57cea128d72f in qemu_lockable_lock qemu/include/qemu/lockable.h:95:5
#3 0x57cea128c294 in qemu_lockable_auto_lock qemu/include/qemu/lockable.h:105:5
#4 0x57cea1285eb2 in virtio_snd_handle_rx_xfer qemu/hw/audio/virtio-snd.c:1026:9
#5 0x57cea2caebbc in virtio_queue_notify_vq qemu/hw/virtio/virtio.c:2268:9
#6 0x57cea2cae412 in virtio_queue_host_notifier_read qemu/hw/virtio/virtio.c:3671:9
#7 0x57cea39822f1 in aio_dispatch_handler qemu/util/aio-posix.c:372:9
#8 0x57cea3979385 in aio_dispatch_handlers qemu/util/aio-posix.c:414:20
#9 0x57cea3978eb1 in aio_dispatch qemu/util/aio-posix.c:424:5
#10 0x57cea3a1eede in aio_ctx_dispatch qemu/util/async.c:360:5
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20240322110827.568412-1-zheyuma97@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
subj is calling kvm_add_routing_entry() which simply extends
KVMState::irq_routes::entries[]
but doesn't check if number of routes goes beyond limit the kernel
is willing to accept. Which later leads toi the assert
qemu-kvm: ../accel/kvm/kvm-all.c:1833: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
typically it happens during guest boot for large enough guest
Reproduced with:
./qemu --enable-kvm -m 8G -smp 64 -machine pc \
`for b in {1..2}; do echo -n "-device pci-bridge,id=pci$b,chassis_nr=$b ";
for i in {0..31}; do touch /tmp/vblk$b$i;
echo -n "-drive file=/tmp/vblk$b$i,if=none,id=drive$b$i,format=raw
-device virtio-blk-pci,drive=drive$b$i,bus=pci$b ";
done; done`
While crash at boot time is bad, the same might happen at hotplug time
which is unacceptable.
So instead calling kvm_add_routing_entry() unconditionally, check first
that number of routes won't exceed KVM_CAP_IRQ_ROUTING. This way virtio
device insteads killin qemu, will gracefully fail to initialize device
as expected with following warnings on console:
virtio-blk failed to set guest notifier (-28), ensure -accel kvm is set.
virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <20240408110956.451558-1-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
GCC 14 shows -Wshadow=local warnings if an enum conflicts with a local
variable (including a parameter). To avoid this, move the problematic
enum and all of its dependencies after the hundreds of functions that
have a parameter named "instruction".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migration pull for 9.0-rc3
- Wei/Lei's fix on a rare postcopy race that can hang the channel (since 8.0)
- Avihai's fix on maintainers file, points to the right doc links
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZhLpJBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wa87AEAhvXqJyLxYYdlQ5fqp4hVV6O/3N1vNHMu
# kT3d9tmM0jsBAJ5KxK176iGDp+ej5MEyYSm1gG7ivj3y3v3wlPnSmJMJ
# =T1lk
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 07 Apr 2024 19:42:44 BST
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240407-pull-request' of https://gitlab.com/peterx/qemu:
MAINTAINERS: Adjust migration documentation files
migration/postcopy: ensure preempt channel is ready before loading states
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we do an AT address translation operation, the page table walk
is supposed to be performed in the context of the EL we're doing the
walk for, so for instance an AT S1E2R walk is done for EL2. In the
pseudocode an EL is passed to AArch64.AT(), which calls
SecurityStateAtEL() to find the security state that we should be
doing the walk with.
In ats_write64() we get this wrong, instead using the current
security space always. This is fine for AT operations performed from
EL1 and EL2, because there the current security state and the
security state for the lower EL are the same. But for AT operations
performed from EL3, the current security state is always either
Secure or Root, whereas we want to use the security state defined by
SCR_EL3.{NS,NSE} for the walk. This affects not just guests using
FEAT_RME but also ones where EL3 is Secure state and the EL3 code
is trying to do an AT for a NonSecure EL2 or EL1.
Use arm_security_space_below_el3() to get the SecuritySpace to
pass to do_ats_write() for all AT operations except the
AT S1E3* operations.
Cc: qemu-stable@nongnu.org
Fixes: e1ee56ec23 ("target/arm: Pass security space rather than flag for AT instructions")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2250
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240405180232.3570066-1-peter.maydell@linaro.org
Qemu wraps its call to ninja in a Makefile. Since ninja, as opposed to
make, utilizes all CPU cores by default, the qemu Makefile translates
the absense of a `-jN` argument into `-j1`. This breaks jobserver
functionality, so update the -jN mangling to take the --jobserver-auth
argument into considerationa too.
Signed-off-by: Martin Hundebøll <martin@geanix.com>
Message-Id: <20240402081738.1051560-1-martin@geanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
EL2 accesses to CNTPOFF_EL2 should only ever trap to EL3 if EL3 is
present, as described by the reference manual (for MRS):
/* ... */
elsif PSTATE.EL == EL2 then
if Halted() && HaveEL(EL3) && /*...*/ then
UNDEFINED;
elsif HaveEL(EL3) && SCR_EL3.ECVEn == '0' then
/* ... */
else
X[t, 64] = CNTPOFF_EL2;
However, the existing implementation of gt_cntpoff_access() always
returns CP_ACCESS_TRAP_EL3 for EL2 accesses with SCR_EL3.ECVEn unset. In
pseudo-code terminology, this corresponds to assuming that HaveEL(EL3)
is always true, which is wrong. As a result, QEMU panics in
access_check_cp_reg() when started without EL3 and running EL2 code
accessing the register (e.g. any recent KVM booting a guest).
Therefore, add the HaveEL(EL3) check to gt_cntpoff_access().
Fixes: 2808d3b38a ("target/arm: Implement FEAT_ECV CNTPOFF_EL2 handling")
Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Message-id: m3al6amhdkmsiy2f62w72ufth6dzn45xg5cz6xljceyibphnf4@ezmmpwk4tnhl
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu-sparc queue
# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmYOtvEeHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIf+5oIAJtRPiTP5aUmN4nU
# s72NBtgARBJ+5hHl0fqFFlCrG9elO28F1vhT9DwwBOLwihZCnfIXf+SCoE+pvqDw
# c+AMN/RnDu+1F4LF93W0ZIr305yGDfVlU+S3vKGtB9G4rcLeBDmNlhui2d0Bqx9R
# jwX1y57vcPclObE0KL6AVOfSDPYiVEVQSiTr3j4oW8TqAs2bduEZMRh6esb3XMIA
# hmj8mhZAszfh1YvX8ufbxtPQsnNuFMM+Fxgxp0pux8QaI0addDHwVNObRUYlTUZ1
# o4xCw7TRXXotaHde/OqZApFECs+md3R7rC2wj7s3ae0ynohHHDFfaB5t1f4pm+kA
# /6UN/Jc=
# =XwaI
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 04 Apr 2024 15:19:29 BST
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* tag 'qemu-sparc-20240404' of https://github.com/mcayland/qemu:
esp.c: remove explicit setting of DRQ within ESP state machine
esp.c: ensure esp_pdma_write() always calls esp_fifo_push()
esp.c: update esp_fifo_{push, pop}() to call esp_update_drq()
esp.c: introduce esp_update_drq() and update esp_fifo_{push, pop}_buf() to use it
esp.c: move esp_set_phase() and esp_get_phase() towards the beginning of the file
esp.c: prevent cmdfifo overflow in esp_cdb_ready()
esp.c: rework esp_cdb_length() into esp_cdb_ready()
esp.c: don't assert() if FIFO empty when executing non-DMA SELATNS
esp.c: introduce esp_fifo_push_buf() function for pushing to the FIFO
esp.c: change esp_fifo_pop_buf() to take ESPState
esp.c: use esp_fifo_push() instead of fifo8_push()
esp.c: change esp_fifo_pop() to take ESPState
esp.c: change esp_fifo_push() to take ESPState
esp.c: replace cmdfifo use of esp_fifo_pop() in do_message_phase()
esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_message_phase()
esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_command_phase()
esp.c: move esp_fifo_pop_buf() internals to new esp_fifo8_pop_buf() function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During normal use the cmdfifo will never wrap internally and cmdfifo_cdb_offset
will always indicate the start of the SCSI CDB. However it is possible that a
malicious guest could issue an invalid ESP command sequence such that cmdfifo
wraps internally and cmdfifo_cdb_offset could point beyond the end of the FIFO
data buffer.
Add an extra check to fifo8_peek_buf() to ensure that if the cmdfifo has wrapped
internally then esp_cdb_ready() will exit rather than allow scsi_cdb_length() to
access data outside the cmdfifo data buffer.
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240324191707.623175-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This modification ensures that in scenarios where the buffer size is
insufficient for a zone report, the function will now properly set an
error status and proceed to a cleanup label, instead of merely
returning.
The following ASAN log reveals it:
==1767400==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 312 byte(s) in 1 object(s) allocated from:
#0 0x64ac7b3280cd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x735b02fb9738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x64ac7d23be96 in virtqueue_split_pop hw/virtio/virtio.c:1612:12
#3 0x64ac7d23728a in virtqueue_pop hw/virtio/virtio.c:1783:16
#4 0x64ac7cfcaacd in virtio_blk_get_request hw/block/virtio-blk.c:228:27
#5 0x64ac7cfca7c7 in virtio_blk_handle_vq hw/block/virtio-blk.c:1123:23
#6 0x64ac7cfecb95 in virtio_blk_handle_output hw/block/virtio-blk.c:1157:5
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Message-id: 20240404120040.1951466-1-zheyuma97@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If no bytes are there to process in the message in phase,
the input data latch (s->sidl) is set to s->msg[-1]. Just
do nothing since no DMA is performed.
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Horizontal pel panning bit 3 is only used in text mode. In graphics
mode, it can be treated as if it was zero, thus not extending the
dirty memory region.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When pel panning is active, one more byte is read from each of the VGA
memory planes. This has to be accounted in the computation of region_end,
otherwise vga_draw_graphic() fails an assertion:
qemu-system-i386: ../system/physmem.c:946: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed.
Reported-by: Helge Konetzka <hk@zapateado.de>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2244
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the computation of region_start and region_end after the value of
"bits" is known. This makes it possible to distinguish modes that
support horizontal pel panning from modes that do not.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two sets of conditionals using the shift control bits: one to
verify the palette and adjust disp_width, one to compute the "v" and
"bits" variables. Merge them into one, with the extra benefit that
we now have the "bits" value available early and can use it to
compute region_end.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When vhost-user or vhost-kernel is handling virtio net datapath,
QEMU should not touch used ring.
But with vhost-user socket reconnect scenario, in a very rare case
(has pending kick event). VRING_USED_F_NO_NOTIFY is set by QEMU in
following code path:
#0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511
#1 0x0000559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576
#2 0x0000559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801
#3 0x0000559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248
#4 0x0000559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525
#5 0x0000559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321
#6 0x0000559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774
#7 0x0000559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259
#8 0x0000559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199
#9 0x0000559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783
#10 0x0000559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592
#11 0x0000559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266
#12 0x0000559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412
#13 0x0000559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311
#14 0x0000559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392
#15 0x0000559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455
#16 0x0000559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459
#17 0x0000559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301
#18 0x0000559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62
#19 0x0000559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82
This issue causes guest kernel stop kicking device and traffic stop.
Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong
VRING_USED_F_NO_NOTIFY set.
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240402045109.97729-1-yajunw@nvidia.com>
[PMD: Use unlikely()]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
../hw/nvme/ctrl.c:6081:21: error: ‘result’ may be used uninitialized [-Werror=maybe-uninitialized]
It's not obvious that 'result' is set in all code paths. When &result is
a returned argument, it's even less clear.
Looking at various assignments, 0 seems to be a suitable default value.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-ID: <20240328102052.3499331-18-marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity complains that the check introduced in commit 3f934817 suggests
that qiov could be NULL and we dereference it before reaching the check.
In fact, all of the callers pass a non-NULL pointer, so just remove the
misleading check.
Resolves: Coverity CID 1542668
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240327192750.204197-1-kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The commit 15002f60f7 ("util: rename qemu-error.c to match its header
name") renamed util/qemu-error.c to util/error-report.c but missed to
change the corresponding entry.
To avoid get_maintainer.pl failing, update the error-report.c entry.
Fixes: 15002f60f7 ("util: rename qemu-error.c to match its header name")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240327115539.3860270-1-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Similarly to commit 9de9fa5cf2 ("hw/arm/smmu-common: Avoid using
inlined functions with external linkage"):
None of our code base require / use inlined functions with external
linkage. Some places use internal inlining in the hot path. These
two functions are certainly not in any hot path and don't justify
any inlining, so these are likely oversights rather than intentional.
Fix:
C compiler for the host machine: clang (clang 15.0.0 "Apple clang version 15.0.0 (clang-1500.3.9.4)")
...
hw/arm/smmu-common.c:203:43: error: static function 'smmu_hash_remove_by_vmid' is
used in an inline function with external linkage [-Werror,-Wstatic-in-inline]
g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid);
^
include/hw/arm/smmu-common.h:197:1: note: use 'static' to give inline function 'smmu_iotlb_inv_vmid' internal linkage
void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid);
^
static
hw/arm/smmu-common.c:139:17: note: 'smmu_hash_remove_by_vmid' declared here
static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value,
^
Fixes: ccc3ee3871 ("hw/arm/smmuv3: Add CMDs related to stage-2")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240313184954.42513-2-philmd@linaro.org>
The test mangles the GPIO address and the pin number in the
qtest_add_data_func data parameter. Doing so, it assumes that the host
pointer size is always 64-bit, which breaks on 32-bit :
../tests/qtest/stm32l4x5_gpio-test.c: In function ‘test_gpio_output_mode’:
../tests/qtest/stm32l4x5_gpio-test.c:272:25: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
272 | unsigned int pin = ((uint64_t)data) & 0xF;
| ^
../tests/qtest/stm32l4x5_gpio-test.c:273:22: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
273 | uint32_t gpio = ((uint64_t)data) >> 32;
| ^
To fix, improve the mangling of the GPIO address and pin number fields
by using GPIO_SIZE so that the resulting value fits in a 32-bit pointer.
While at it, include some helpers to hide the details.
Cc: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Cc: Inès Varhol <ines.varhol@telecom-paris.fr>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-id: 20240329092747.298259-1-clg@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the group of the highest priority pending interrupt is disabled
via ICC_IGRPEN*, the ICC_HPPIR* registers should return
INTID_SPURIOUS, not the interrupt ID. (See the GIC architecture
specification pseudocode functions ICC_HPPIR1_EL1[] and
HighestPriorityPendingInterrupt().)
Make HPPIR reads honour the group disable, the way we already do
when determining whether to preempt in icc_hppi_can_preempt().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240328153333.2522667-1-peter.maydell@linaro.org
Hardware of sbsa-ref board is nowadays defined by both BSA and SBSA
specifications. Then BBR defines firmware interface.
Added note about DeviceTree data passed from QEMU to firmware. It is
very minimal and provides only data we use in firmware.
Added NUMA information to list of things reported by DeviceTree.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240328163851.1386176-1-marcin.juszkiewicz@linaro.org
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The HSTR_EL2 register allows the hypervisor to trap AArch32 EL1 and
EL0 accesses to cp15 registers. We incorrectly implemented this so
they trap to EL1 when we detect the need for a HSTR trap at code
generation time. (The check in access_check_cp_reg() which we do at
runtime to catch traps from EL0 is correctly routing them to EL2.)
Use the correct target EL when generating the code to take the trap.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2226
Fixes: 049edada5e ("target/arm: Make HSTR_EL2 traps take priority over UNDEF-at-EL1")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240325133116.2075362-1-peter.maydell@linaro.org
TILE-Gx has been removed during the v6.0 release (see
commit 2cc1a90166 "Remove deprecated target tilegx"),
no need to mention it in the list of "supported targets".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This fixes the invalid bInterfaceProtocol value 0x04 in the USB audio
AudioControl descriptors. It should be zero. While Linux and Windows
forgive this error, macOS 14 Sonoma does not. The usb-audio device does
not appear in macOS sound settings even though the device is recognized
and shows up in USB system information. According to the USB audio class
specs 1.0-4.0, valid values are 0x00, 0x20, 0x30 and 0x40. (Note also
that Linux prints the warning "unknown interface protocol 0x4, assuming
v1", but then proceeds as if the value was zero.)
This also fixes the invalid wTotalLength value in the multi-channel
setup AudioControl interface header descriptor (used when multi=on
and out.mixing-engine off). The combined length of all the descriptors
there add up to 0x37, not 0x38. In Linux, "lsusb -D ..." displays
incomplete descriptor information when this length is incorrect.
Signed-off-by: Joonas Kankaala <joonas.a.kankaala@gmail.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Migration pull for 9.0-rc2
- Avihai's two fixes on error paths
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZgmsOxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1waYKQD9G/B4c5u94Puhkr4o+K4M3FZ3J1pSpYRd
# nMAlrCWYLHQBAKV5q8DvgXbRNzT/Q+1UX7psxIsjyaqljxyJoZ+dIgAD
# =hucV
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 31 Mar 2024 19:32:27 BST
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240331-pull-request' of https://gitlab.com/peterx/qemu:
migration/postcopy: Ensure postcopy_start() sets errp if it fails
migration: Set migration error in migration_completion()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
After commit 9425ef3f99 ("migration: Use migrate_has_error() in
close_return_path_on_source()"), close_return_path_on_source() assumes
that migration error is set if an error occurs during migration.
This may not be true if migration errors in migration_completion(). For
example, if qemu_savevm_state_complete_precopy() errors, migration error
will not be set.
This in turn, will cause a migration hang bug, similar to the bug that
was fixed by commit 22b04245f0 ("migration: Join the return path
thread before releasing to_dst_file"), as shutdown() will not be issued
for the return-path channel.
Fix it by ensuring migration error is set in case of error in
migration_completion().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: 9425ef3f99 ("migration: Use migrate_has_error() in close_return_path_on_source()")
Acked-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Various fixes for recent regressions and new code.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmYJEQMACgkQZ7MCdqhi
# HK6l0BAAkVf/BXKJxMu3jLvCpK/fBYGytvfHBR9PdWeBwIirqsk3L8eI/Fb5qkMZ
# NMrfECyHR9LTcWb6/Pi/PGciNNWeyleN6IuVBeWfraIFyfHcxpwEKH8P+cXr5EWq
# WDg+1GUt9+FHuAC9UdGZ81UzX7qeI9VfD3wHceqJ/XRU3qjj67DPZjTpsvxuP64+
# N7MhdEM69F34uiIAn1aNCceXiS00dvtu6lDl3+18TzT8sNc6S3qdyxVcqfRhTJfY
# FMZIN3j2hQrVOElEQE9vAOeJyjAQCM+U0y3XZIZHFUw/GTwKV0tm08RFnnxprteG
# 67vR5uXrDEELnU/1PA1YeyaBMA3Z3Nc36XbGf8zTD6rKkS2z0lWMcs72pPIxbMXj
# c4FdnHaE+Q5ngy5s1p6bm5xM7WOEhrsJkgIu2N0weRroe0nAxywDWw3uQlMoV8Oc
# Xet/xM2IKdc0PLzTvFO7xKnW3oqavJ4CX/6XgrGBoMDZKO1JRqaMixGtYKmoH/1h
# 96+jdRbPTZAY8aoiFWW7t065lvdWt74A6QITcn2Kqm04j3MGJfyWMU6dakBzwuri
# PhOkf40o8qn8KN0JNfSO+IXhYVRRotLO/s9H7TEyQiXm25qrGMIF9FErnbDseZil
# rGR4eL0lcwJboYH9RSRWg0NNqpUekvqBzdnS+G0Ad3J+qaMYoik=
# =7UPB
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 31 Mar 2024 08:30:11 BST
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.0-3-20240331' of https://gitlab.com/npiggin/qemu:
tests/avocado: ppc_hv_tests.py set alpine time before setup-alpine
tests/avocado: Fix ppc_hv_tests.py xorriso dependency guard
target/ppc: Do not clear MSR[ME] on MCE interrupts to supervisor
target/ppc: Fix GDB register indexing on secondary CPUs
target/ppc: Restore [H]DEXCR to 64-bits
target/ppc/mmu-radix64: Use correct string format in walk_tree()
hw/ppc/spapr: Include missing 'sysemu/tcg.h' header
spapr: nested: use bitwise NOT operator for flags check
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the time is wrong, setup-alpine SSL certificate checks can fail.
setup-alpine is used to bring up the network, but it doesn't seem
to to set NTP time before the failing SSL checks. This test has
recently started failing presumably because the default time has
now fallen too far behind.
Fix this by setting time from the host time before running setup-alpine.
Fixes: c9cb496710 ("tests/avocado: ppc add hypervisor tests")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
For some reason the skipIf missing_deps() check fails to skip the test
if it comes after the skipUnless lines, causing an error running on
systems without xorriso.
Avocado implements skipUnless is just an inverted skipIf, so it's not
clear what the bug is or why this fixes it. For now it's enough to
get things working.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2246
Fixes: c9cb496710 ("tests/avocado: ppc add hypervisor tests")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Hardware clears the MSR[ME] bit when delivering a machine check
interrupt, so that is what QEMU does.
The spapr environment runs in supervisor mode though, and receives
machine check interrupts after they are processed by the hypervisor,
and MSR[ME] must always be enabled in supervisor mode (otherwise it
could checkstop the system). So MSR[ME] must not be cleared when
delivering machine checks to the supervisor.
The fix to prevent supervisor mode from modifying MSR[ME] also
prevented it from re-enabling the incorrectly cleared MSR[ME] bit
when returning from handling the interrupt. Before that fix, the
problem was not very noticable with well-behaved code. So the
Fixes tag is not strictly correct, but practically they go together.
Found by kvm-unit-tests machine check tests (not yet upstream).
Fixes: 678b6f1af7 ("target/ppc: Prevent supervisor from modifying MSR[ME]")
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The GDB server protocol assigns an arbitrary numbering of the SPRs.
We track this correspondence on each SPR with gdb_id, using it to
resolve any SPR requests GDB makes.
Early on we generate an XML representation of the SPRs to give GDB,
including this numbering. However the XML is cached globally, and we
skip setting the SPR gdb_id values on subsequent threads if we detect
it is cached. This causes QEMU to fail to resolve SPR requests against
secondary CPUs because it cannot find the matching gdb_id value on that
thread's SPRs.
This is a minimal fix to first assign the gdb_id values, then return
early if the XML is cached. Otherwise we generate the XML using the
now already initialised gdb_id values.
Fixes: 1b53948ff8 ("target/ppc: Use GDBFeature for dynamic XML")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The DEXCR emulation was recently changed to a 32-bit register, possibly
because it does have a 32-bit read-only view. It is a full 64-bit
SPR though, so use the corresponding 64-bit write functions.
Fixes: fbda88f7ab ("target/ppc: Fix width of some 32-bit SPRs")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
'mask', 'nlb' and 'base_addr' are all uin64_t types.
Use the corresponding PRIx64 format.
Fixes: d2066bc50d ("target/ppc: Check page dir/table base alignment")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
"sysemu/tcg.h" declares tcg_enabled(), and is implicitly included.
Include it explicitly to avoid the following error when refactoring
headers:
hw/ppc/spapr.c:2612:9: error: call to undeclared function 'tcg_enabled'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
if (tcg_enabled()) {
^
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Check for flag bit in H_GUEST_GETSET_STATE_FLAG_GUEST_WIDE need to use
bitwise NOT operator to ensure no other flag bits are set.
Resolves: Coverity CID 1540008
Resolves: Coverity CID 1540009
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Using log_pc produces the pc at the beginning of TB,
not the actual pc installed by cpu_restore_state_from_tb,
which could be any of the guest instructions within TB.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The 32-bit PA-7300LC (PCX-L2) CPU and the 64-bit PA8700 (PCX-W2) CPU
use different diag instructions to save or restore the CPU registers
to/from the shadow registers.
Implement those per-CPU architecture diag instructions to fix those
parts of the HP ODE testcases (L2DIAG and WDIAG, section 1) which test
the shadow registers.
Signed-off-by: Helge Deller <deller@gmx.de>
[rth: Use decodetree to distinguish cases]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
This reverts commit 46d4d36d0b.
The reverted commit changed to emit warnings instead of errors when
vhost is requested but vhost initialization fails if vhostforce option
is not set.
However, vhostforce is not meant to ignore vhost errors. It was once
introduced as an option to commit 5430a28fe4 ("vhost: force vhost off
for non-MSI guests") to force enabling vhost for non-MSI guests, which
will have worse performance with vhost. The option was deprecated with
commit 1e7398a140 ("vhost: enable vhost without without MSI-X") and
changed to behave identical with the vhost option for compatibility.
Worse, commit bf769f742c ("virtio: del net client if net_init_tap_one
failed") changed to delete the client when vhost fails even when the
failure only results in a warning. The leads to an assertion failure
for the -netdev command line option.
The reverted commit was intended to avoid that the vhost initialization
failure won't result in a corrupted netdev. This problem should have
been fixed by deleting netdev when the initialization fails instead of
ignoring the failure with an arbitrary option. Fortunately, commit
bf769f742c ("virtio: del net client if net_init_tap_one failed"),
mentioned earlier, implements this behavior.
Restore the correct semantics and fix the assertion failure for the
-netdev command line option by reverting the problematic commit.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Some of them are only necessary for POSIX systems. The others are
assigned to function pointers in NetClientInfo that can actually be
NULL.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It is incorrect to have the VIRTIO_NET_HDR_F_NEEDS_CSUM set when
checksum offloading is disabled so clear the bit.
TCP/UDP checksum is usually offloaded when the peer requires virtio
headers because they can instruct the peer to compute checksum. However,
igb disables TX checksum offloading when a VF is enabled whether the
peer requires virtio headers because a transmitted packet can be routed
to it and it expects the packet has a proper checksum. Therefore, it
is necessary to have a correct virtio header even when checksum
offloading is disabled.
A real TCP/UDP checksum will be computed and saved in the buffer when
checksum offloading is disabled. The virtio specification requires to
set the packet checksum stored in the buffer to the TCP/UDP pseudo
header when the VIRTIO_NET_HDR_F_NEEDS_CSUM bit is set so the bit must
be cleared in that case.
Fixes: ffbd2dbd8e ("e1000e: Perform software segmentation for loopback")
Buglink: https://issues.redhat.com/browse/RHEL-23067
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
virtio_net_guest_notifier_pending() and virtio_net_guest_notifier_mask()
checked VIRTIO_NET_F_MQ to know there are multiple queues, but
VIRTIO_NET_F_RSS also enables multiple queues. Refer to n->multiqueue,
which is set to true either of VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is
enabled.
Fixes: 68b0a6395f ("virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The local 9p driver in virtio-9p-test.c its temporary dir right at the
start of qos-test (via virtio_9p_create_local_test_dir()) and only
deletes it after qos-test is finished (via
virtio_9p_remove_local_test_dir()).
This means that any qos-test machine that ends up running virtio-9p-test
local tests more than once will end up re-using the same temp dir. This
is what's happening in [1] after we introduced the riscv machine nodes:
if we enable slow tests with the '-m slow' flag using
qemu-system-riscv64, this is what happens:
- a temp dir is created;
- virtio-9p-device tests will run virtio-9p-test successfully;
- virtio-9p-pci tests will run virtio-9p-test, and fail right at the
first slow test at fs_create_dir() because the "01" file was already
created by fs_create_dir() test when running with the virtio-9p-device.
The root cause is that we're creating a single temporary dir, via the
construct/destruct callbacks, and this temp dir is kept for the entire
qos-test run.
We can change each test to clean after themselves. This approach would
make the 'create' tests obsolete since we would need to create and
delete dirs/files/symlinks for the cleanup, turning them into the
'unlinkat' tests that comes right after.
We chose a different approach that handles the root cause: do not use
constructor/destructor to create the temp dir. Create one temp dir for
each test, and remove it after the test is complete. This is the
approach taken for other qtests like vhost-user-test.c where each test
requires a setup() and a subsequent cleanup(), all of those instantiated
in the .before callback.
[1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg05807.html
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20240327142011.805728-2-dbarboza@ventanamicro.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Overflow indicator should include the effect of the shift step.
We had previously left ??? comments about the issue.
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prepare for proper indication of shladd unsigned overflow.
The UV indicator will be zero/not-zero instead of a single bit.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The cond_need_ext predicate was created while we still had a
32-bit compilation mode. It now makes more sense to treat D
as an absolute indicator of a 64-bit operation.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split do_unit_cond to do_unit_zero_cond to only handle conditions
versus zero. These are the only ones that are legal for UXOR.
Simplify trans_uxor accordingly.
Rename do_unit to do_unit_addsub, since xor has been split.
Properly compute carry-out bits for add and subtract, mirroring
the code in do_add and do_sub.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Fixes: b2167459ae ("target-hppa: Implement basic arithmetic")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With r1 as zero is by far the most common usage of UADDCM, as the
easiest way to invert a register. The compiler does occasionally
use the addition step as well, and we can simplify that to avoid
a temp and write directly into the destination.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The carry bits for each nibble N are located in bit (N+1)*4,
so the shift by 3 was off by one. Furthermore, the carry bit
for the most significant carry bit is indeed located in bit 64,
which is located in a different storage word.
Use a double-word shift-right to reassemble into a single word
and place them all at bit 0 of their respective nibbles.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Fixes: b2167459ae ("target-hppa: Implement basic arithmetic")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Call translator_io_start before write to EIRR.
Move evaluation of EIRR vs EIEM to hppa_cpu_exec_interrupt.
Exit TB after write to EIEM, but otherwise use a straight store.
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The call to gen_helper_read_interval_timer is
identical on both sides of the IF.
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not clobber the high bits of the address by using a 32-bit deposit.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The return address comes from IA*Q_Next, and IASQ_Next
is always equal to IASQ_Back, not IASQ_Front.
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the h != g && shmaddr == NULL && !reserved_va case, target_shmat()
incorrectly mmap()s the initial anonymous range with
MAP_FIXED_NOREPLACE, even though the earlier mmap_find_vma() has
already reserved the respective address range.
Fix by using MAP_FIXED when "mapped", which is set after
mmap_find_vma(), is true.
Fixes: 78bc8ed9a8 ("linux-user: Rewrite target_shmat")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240325192436.561154-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The indices of arguments used with semctl() are all off-by-1, because
arg1 is the ipc() command. Fix them. While at it, reuse print_semctl().
New output (for a small test program):
3540333 semctl(999,888,SEM_INFO,0x00007fe5051ee9a0) = -1 errno=14 (Bad address)
Fixes: 7ccfb2eb5f ("Fix warnings that would be caused by gcc flag -Wwrite-strings")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240325192436.561154-2-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
I observed [NSTrackingArea rect] becomes de-synchronized with the view
frame with some unknown condition, and fails to track mouse movement on
some area of the view. Specify NSTrackingInVisibleRect option to let
Cocoa automatically update NSTrackingArea, which also saves code for
synchronization.
Fixes: 91aa508d02 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240323-fixes-v2-3-18651a2b0394@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[NSWindow setContentAspectRatio:] does not trigger window resize itself,
so the wrong aspect ratio will persist if nothing resizes the window.
Call [NSWindow setContentSize:] in such a case.
Fixes: 91aa508d02 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240323-fixes-v2-1-18651a2b0394@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
QEMU build fails with
hw/i386/fw_cfg.c:74: undefined reference to `smbios_get_table_legacy'
when it's built with only 'microvm' enabled i.e. with config patch
+++ b/configs/devices/i386-softmmu/default.mak
@@ -26,7 +26,7 @@
# Boards:
#
-CONFIG_ISAPC=y
-CONFIG_I440FX=y
-CONFIG_Q35=y
+CONFIG_ISAPC=n
+CONFIG_I440FX=n
+CONFIG_Q35=n
It happens because I've fogotten/lost smbios_get_table_legacy() stub.
Fix it by adding missing stub as Philippe suggested.
Fixes: b42b0e4daa "smbios: build legacy mode code only for 'pc' machine"
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <20240326122630.85989-1-imammedo@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
1. The g_pattern_match_string() is deprecated when glib2 version >= 2.70.
Use g_pattern_spec_match_string() instead to avoid this problem.
2. The type of second parameter in g_ptr_array_add() is
'gpointer' {aka 'void *'}, but the type of reg->name is 'const char*'.
Cast the type of reg->name to 'gpointer' to avoid this problem.
compiler warning message:
contrib/plugins/execlog.c:330:17: warning: ‘g_pattern_match_string’
is deprecated: Use 'g_pattern_spec_match_string' instead [-Wdeprecated-declarations]
330 | if (g_pattern_match_string(pat, rd->name) ||
| ^~
In file included from /usr/include/glib-2.0/glib.h:67,
from contrib/plugins/execlog.c:9:
/usr/include/glib-2.0/glib/gpattern.h:57:15: note: declared here
57 | gboolean g_pattern_match_string (GPatternSpec *pspec,
| ^~~~~~~~~~~~~~~~~~~~~~
contrib/plugins/execlog.c:331:21: warning: ‘g_pattern_match_string’
is deprecated: Use 'g_pattern_spec_match_string' instead [-Wdeprecated-declarations]
331 | g_pattern_match_string(pat, rd_lower)) {
| ^~~~~~~~~~~~~~~~~~~~~~
/usr/include/glib-2.0/glib/gpattern.h:57:15: note: declared here
57 | gboolean g_pattern_match_string (GPatternSpec *pspec,
| ^~~~~~~~~~~~~~~~~~~~~~
contrib/plugins/execlog.c:339:63: warning: passing argument 2 of
‘g_ptr_array_add’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
339 | g_ptr_array_add(all_reg_names, reg->name);
| ~~~^~~~~~
In file included from /usr/include/glib-2.0/glib.h:33:
/usr/include/glib-2.0/glib/garray.h:198:62: note: expected
‘gpointer’ {aka ‘void *’} but argument is of type ‘const char *’
198 | gpointer data);
| ~~~~~~~~~~~~~~~~~~^~~~
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2210
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Message-ID: <20240326015257.21516-1-yaoxt.fnst@fujitsu.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Let clock_set_mul_div() return a boolean value whether the
clock has been updated or not, similarly to clock_set().
Return early when clock_set_mul_div() is called with
same mul/div values the clock has.
Acked-by: Luc Michel <luc@lmichel.fr>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20240325152827.73817-2-philmd@linaro.org>
In qemu monitor mode, when we use gpa2hva command to print the host
virtual address corresponding to a guest physical address, if the gpa is
not in RAM, the error message is below:
(qemu) gpa2hva 0x750000000
Memory at address 0x750000000is not RAM
A space is missed between '0x750000000' and 'is'.
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Fixes: e9628441df ("hmp: gpa2hva and gpa2hpa hostaddr command")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org>
Message-ID: <20240319021610.2423844-1-ruansy.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The io_timeout property, introduced in c9b6609 (part of 6.0) is
silently overwritten by the hardcoded default value of 30 seconds
(DEFAULT_IO_TIMEOUT) in scsi_generic_realize because that function is
being called after the properties have already been applied.
The property definition already has a default value which is applied
correctly when no value is explicitly set, so we can just remove the
code which overrides the io_timeout completely.
This has been tested by stracing SG_IO operations with the io_timeout
property set and unset and now sets the timeout field in the ioctl
request to the proper value.
Fixes: c9b6609b69 ("scsi: make io_timeout configurable")
Signed-off-by: Lorenz Brun <lorenz@brun.one>
Message-ID: <20240315145831.2531695-1-lorenz@brun.one>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
CXL emulation of interleave requires read and write hooks due to
requirement for subpage granularity. The Linux kernel stack now enables
using this memory as conventional memory in a separate NUMA node. If a
process is deliberately forced to run from that node
$ numactl --membind=1 ls
the page table walk on i386 fails.
Useful part of backtrace:
(cpu=cpu@entry=0x555556fd9000, fmt=fmt@entry=0x555555fe3378 "cpu_io_recompile: could not find TB for pc=%p")
at ../../cpu-target.c:359
(retaddr=0, addr=19595792376, attrs=..., xlat=<optimized out>, cpu=0x555556fd9000, out_offset=<synthetic pointer>)
at ../../accel/tcg/cputlb.c:1339
(cpu=0x555556fd9000, full=0x7fffee0d96e0, ret_be=ret_be@entry=0, addr=19595792376, size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at ../../accel/tcg/cputlb.c:2030
(cpu=cpu@entry=0x555556fd9000, p=p@entry=0x7ffff56fddc0, mmu_idx=<optimized out>, type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:2356
(cpu=cpu@entry=0x555556fd9000, addr=addr@entry=19595792376, oi=oi@entry=52, ra=ra@entry=0, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2439
at ../../accel/tcg/ldst_common.c.inc:301
at ../../target/i386/tcg/sysemu/excp_helper.c:173
(err=0x7ffff56fdf80, out=0x7ffff56fdf70, mmu_idx=0, access_type=MMU_INST_FETCH, addr=18446744072116178925, env=0x555556fdb7c0)
at ../../target/i386/tcg/sysemu/excp_helper.c:578
(cs=0x555556fd9000, addr=18446744072116178925, size=<optimized out>, access_type=MMU_INST_FETCH, mmu_idx=0, probe=<optimized out>, retaddr=0) at ../../target/i386/tcg/sysemu/excp_helper.c:604
Avoid this by plumbing the address all the way down from
x86_cpu_tlb_fill() where is available as retaddr to the actual accessors
which provide it to probe_access_full() which already handles MMIO accesses.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2180
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2220
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-ID: <20240307155304.31241-2-Jonathan.Cameron@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Previously, bdrv_pad_request() could not deal with a NULL qiov when
a read needed to be aligned. During prefetch, a stream job will pass a
NULL qiov. Add a test case to cover this scenario.
By accident, also covers a previous race during shutdown, where block
graph changes during iteration in bdrv_flush_all() could lead to
unreferencing the wrong block driver state and an assertion failure
later.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240322095009.346989-5-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Same rationale as for commit "block-backend: fix edge case in
bdrv_next() where BDS associated to BB changes". The block graph might
change between the bdrv_next() call and the bdrv_next_cleanup() call,
so it could be that the associated BDS is not the same that was
referenced previously anymore. Instead, rely on bdrv_next() to set
it->bs to the BDS it referenced and unreference that one in any case.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240322095009.346989-4-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some operations, e.g. block-stream, perform reads while discarding the
results (only copy-on-read matters). In this case, they will pass NULL
as the target QEMUIOVector, which will however trip bdrv_pad_request,
since it wants to extend its passed vector. In particular, this is the
case for the blk_co_preadv() call in stream_populate().
If there is no qiov, no operation can be done with it, but the bytes
and offset still need to be updated, so the subsequent aligned read
will actually be aligned and not run into an assertion failure.
In particular, this can happen when the request alignment of the top
node is larger than the allocated part of the bottom node, in which
case padding becomes necessary. For example:
> ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M -o cluster_size=32768
> ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2
> ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "compress", "file": "node0", "node-name": "node1" } }
> {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node1" } }
> EOF
Originally-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: do update bytes and offset in any case
add reproducer to commit message]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240322095009.346989-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VDUSE requires that virtqueues are first enabled before the DRIVER_OK
status flag is set; with the current API of the kernel module, it is
impossible to enable the opposite order in our block export code because
userspace is not notified when a virtqueue is enabled.
This requirement also mathces the normal initialisation order as done by
the generic vhost code in QEMU. However, commit 6c482547 accidentally
changed the order for vdpa-dev and broke access to VDUSE devices with
this.
This changes vdpa-dev to use the normal order again and use the standard
vhost callback .vhost_set_vring_enable for this. VDUSE devices can be
used with vdpa-dev again after this fix.
vhost_net intentionally avoided enabling the vrings for vdpa and does
this manually later while it does enable them for other vhost backends.
Reflect this in the vhost_net code and return early for vdpa, so that
the behaviour doesn't change for this device.
Cc: qemu-stable@nongnu.org
Fixes: 6c4825476a ('vdpa: move vhost_vdpa_set_vring_ready to the caller')
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240315155949.86066-1-kwolf@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tests 157 and 227 use the virtio-blk device, so we have to mark these
tests accordingly to be skipped if this devices is not available (e.g.
when running the tests with qemu-system-avr only).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240325154737.1305063-1-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QAPI patches patches for 2024-03-26
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmYCXs0SHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTRMkP/RR9ipyCh1P6KKuQCyouxM8nXZ945b33
# L0OBFY4cTZsMY98ZTU9i80f/Tjh58y5bYzSdr21jxBSdRlvb6w/Ys8xUsI2Hi/kZ
# czdKZbWT5IeSNrLw04CjhW2FKq8XXR7epHbPms2nBa1Xc5Jf8aIC2UhZKADeVCyE
# FCIUU8Ctkyvre3XGVOrIBhhVPs9hi33uG+HXiveQjjWwGZZ0v915sAZ4WvNEoiBS
# E0NwhJxtbQu8FwfBquE84o0TMzYb19MlU4zq3G5dv7PhLoXX9Lf097/tPEtoP4et
# H37eGD8MzL1hywf+E1vAw0HC/v7Ci5Kx1jpUUL03Hfvfa6Lktb6yYke8qAD1HopK
# o4btXKpfseuyTJuqTVaV4Z9tLqmpTddnUupBFkfKviMPdidAN9SM1qXx0VCM9mUo
# cwf5di6zOPTNYlsKuP6ofSGGaD5so9PGLGAZoGeZ1Cb7TLsgiqqD0GrPP0//1VLn
# GzOOsOA4Su9F2/qjkNu2HZ1jv6J105LbQnaPvqjVfkjzzmoOANYmViX3KmDwdBxX
# vSfy5/UpCP9TBv++N/ljRtrVzW9i4a2rIGeHWC0x/JJglIh45f2MzeyqJ5haV2n5
# MVY82yY7EY7U+YXUUUKSgGVWw7+/wj/R5wQ5a9Gjk9OlmTsTHxJIKIzhcw4lX9da
# ZmRLCSIlVu4Z
# =li36
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 26 Mar 2024 05:36:13 GMT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-qapi-2024-03-26' of https://repo.or.cz/qemu/armbru:
qapi: document parameters of query-cpu-model-* QAPI commands
qapi/block-core: improve Qcow2OverlapCheckFlags documentation
qapi: document leftover members in qapi/stats.json
qapi: document leftover members in qapi/run-state.json
qapi: document InputMultiTouchType
qga/qapi-schema: Refill doc comments to conform to current conventions
qapi: Correct documentation indentation and whitespace
qapi: Refill doc comments to conform to current conventions
qapi: Don't repeat member type in its documentation text
qapi: Start sentences with a capital letter, end them with a period
qapi: Fix abbreviation punctuation in doc comments
qapi: Fix typo in request-ebpf documentation
qapi: Fix argument markup in drive-mirror documentation
qapi: Tidy up indentation of add_client's example
qapi: Tidy up block-latency-histogram-set documentation some more
qapi: Expand a few awkward abbreviations in documentation
qapi: Drop stray Arguments: line from qmp_capabilities docs
qapi: Fix bogus documentation of query-migrationthreads
qapi: Resync MigrationParameter and MigrateSetParameters
qapi: Improve migration TLS documentation
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Most of fields have no description at all. Let's fix that. Still, no
reason to place here more detailed descriptions of what these
structures are, as we have public Qcow2 format specification.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240325120054.2693236-1-vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
[Capitalize "QEMU", update qapi/pragma.json]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
For legibility, wrap text paragraphs so every line is at most 70
characters long.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the refilled
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-13-armbru@redhat.com>
For legibility, wrap text paragraphs so every line is at most 70
characters long.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the refilled
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-11-armbru@redhat.com>
Documentation generated for the arguments of MEMORY_FAILURE looks like
"recipient": "MemoryFailureRecipient"
recipient is defined as "MemoryFailureRecipient".
"action": "MemoryFailureAction"
action that has been taken. action is defined as
"MemoryFailureAction".
"flags": "MemoryFailureFlags"
flags for MemoryFailureAction. action is defined as
"MemoryFailureFlags".
The "action is defined as ..." are redundant. Drop.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-10-armbru@redhat.com>
Commit d23055b8db (qapi: Require descriptions and tagged sections to
be indented) indented add_client's example too much. Revert that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-5-armbru@redhat.com>
[Move a stray hunk to the later patch it belongs to]
The doc comment documents an argument that doesn't exist. Would
fail compilation if it was marked up correctly. Delete.
The Returns: section fails to refer to the data type, leaving the user
to guess. Fix that.
The command name violates QAPI naming rules: it should be
query-migration-threads. Too late to fix.
Reported-by: John Snow <jsnow@redhat.com>
Fixes: 671326201d (migration: Introduce interface query-migrationthreads)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-4-armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Enum MigrationParameter mirrors the members of struct
MigrateSetParameters. Differences to MigrateSetParameters's member
documentation are pointless. Clean them up:
* @compress-level, @compress-threads, @decompress-threads, and
x-checkpoint-delay are more thoroughly documented for
MigrationParameter, so use that version for both.
* @max-cpu-throttle is almost the same. Use MigrationParameter's
version for both.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-3-armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
MigrateSetParameters is about setting parameters, and
MigrationParameters is about querying them. Their documentation of
@tls-creds and @tls-hostname has residual damage from a failed attempt
at de-duplicating them (see commit de63ab6124 "migrate: Share common
MigrationParameters struct" and commit 1bda8b3c69 "migration: Unshare
MigrationParameters struct for now").
MigrateSetParameters documentation issues:
* It claims plain text mode "was reported by omitting tls-creds"
before 2.9. MigrateSetParameters is not used for reporting, so this
is misleading. Delete.
* It similarly claims hostname defaulting to migration URI "was
reported by omitting tls-hostname" before 2.9. Delete as well.
Rephrase the remaining @tls-hostname contents for clarity.
Enum MigrationParameter mirrors the members of struct
MigrateSetParameters. Differences to MigrateSetParameters's member
documentation are pointless. Copy the new text to MigrationParameter.
MigrationParameters documentation issues:
* @tls-creds runs the two last sentences together without punctuation.
Fix that.
* Much of the contents on @tls-hostname only applies to setting
parameters, resulting in confusion. Replace by a suitable abridged
version of the new MigrateSetParameters text, and a note on
@tls-hostname omission in 2.8.
Additional damage is due to flawed doc fix commit
66fcb9d651 (qapi/migration: Add missing tls-authz documentation):
since it copied the missing MigrateSetParameters text from
MigrationParameters instead of MigrationParameter, the part on
recreating @tls-authz on the fly is missing. Copy that, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-2-armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[Some typos corrected]
* Fix timeouts in Travis-CI jobs
* Mark devices with user_creatable = false that can crash QEMU otherwise
* Fix s390x TEST-AND-SET TCG instruction emulation
* Move pc955* devices to hw/gpio/
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmYBhdgRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVcfA/9FulEN4HrjD3ObyboA+WfibXURwChui98
# 8LvL/fAGe3BZXQtspuNmPyrKRtIOrIwHJyFuxf2N5+8BuvGhEHQIuvIhQIj/rvfy
# X14KlldmQ3w3HlI3Ud4YiebLyK3AAFC2ApIywzFsnN+HoaHJR2EyDIb+T7OsGJZf
# ZLE/Z7qANxoNeZ+a3+rQR3SVpijyS3fXxDSaILrq2uW4kCCs/55O8Rt3Qb+PFSVd
# fF+OlpG6o+z73ACZc1u9Io4IO1ZZc/NdkmDTNz4HknkvJLTLF6kOECAxLl0ytgAG
# YRzBGKes29Zpa9wn/9rc75/OYNS0Ks+B19sQnijWUNX0zq5FkReXNXiyVcbT7d4p
# 6jFzlFnjj4ifB8uQkZTGcx/lL4s4VkPzF+f7fgHq9CKNrNsx8uca0TyQ8s4y+NGb
# C98kJdHd+QhCcuNnAbifCwuFaxQ8C4BdgzxVbU/sGDKNkINNkiTp+uue4TxnRKvV
# MfhqdnWRvqgZ0Rl4TxqcNfODK72Z1YNv3933OKE/mRJYS1Q529TIq4vfp8WIMsWQ
# 7+ipo4WKXhkiSOJZD6AkCoFum1W8yaDzUDJTw2Xt2bPBL3+FXcQyKkKVUMfzIJ8M
# KLe0Bb9W/pYU1ToTciTP0dkQF/02tG0Vik273445wPgH0x8OyHJkPF/ny1a7lKFO
# 5jreYdMxWdc=
# =lfZM
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 25 Mar 2024 14:10:32 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2024-03-25' of https://gitlab.com/thuth/qemu:
tests/tcg/s390x: Test TEST AND SET
target/s390x: Use mutable temporary value for op_ts
libqos/virtio.c: Correct 'flags' reading in qvirtqueue_kick
misc/pca955*: Move models under hw/gpio
aspeed: Make the ast1030-a1 SoC not user creatable
aspeed: Make the ast2600-a3 SoC not user creatable
hw/microblaze: Do not allow xlnx-zynqmp-pmu-soc to be created by the user
.travis.yml: Remove the unused xfslib-dev package
.travis.yml: Shorten the runtime of the problematic jobs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull for 9.0-rc1
- Fabiano's patch to revert fd: support on mapped-ram
- Peter's fix on postcopy regression on unnecessary dirty syncs
- Fabiano's fix on mapped-ram rare corrupt on zero page handling
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZf2uIxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1waqTgD/RjaWrcUYlHcfFcWlEQGrYqikCtZYI+oW
# YYdbLcCBOlQBAL/ecCbsFyaWyPnB1Eg3YFcj5g8AgogDHdg37HSxydgL
# =aWGi
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 22 Mar 2024 16:13:23 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240322-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Fix clearing of mapped-ram zero pages
migration/postcopy: Fix high frequency sync
migration: Revert mapped-ram multifd support to fd: URI
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity points out that g_setenv() can fail and we don't
check for this in qtest_inproc_init(). In practice this will
only fail if a memory allocation failed in setenv() or if
the caller passed an invalid architecture name (e.g. one
with an '=' in it), so rather than requiring the callsite
to check for failure, make g_setenv() failure fatal here,
similarly to what we did in commit aca68d95c5.
Resolves: Coverity CID 1497485
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240312183810.557768-8-peter.maydell@linaro.org
In test_compute_wait() we do
double units = bkt.max / 10;
which does an integer division and then assigns it to a double variable,
and similarly later on in the expression for an assertion.
Use 10.0 so that we do a floating point division and calculate the
exact value, rather than doing an integer division.
Spotted by Coverity.
Resolves: Coverity CID 1432564
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240312183810.557768-7-peter.maydell@linaro.org
In qvirtqueue_kick(), the 'flags' were previously being incorrectly read from
vq->avail instead of the correct vq->used location. This update ensures 'flags'
are read from the correct location as per the virtio standard.
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240320090442.267525-1-zheyuma97@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In pca9554_get_pin() and pca9554_set_pin(), we try to detect an
incorrect pin value, but we get the condition wrong, using ">"
when ">=" was intended.
This has no actual effect, because in pca9554_initfn() we
use the correct test when creating the properties and so
we'll never be called with an out of range value. However,
Coverity complains about the mismatch between the check and
the later use of the pin value in a shift operation.
Use the correct condition.
Resolves: Coverity CID 1534917
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240312183810.557768-5-peter.maydell@linaro.org
In net_init_af_xdp() we parse the arguments and allocate
a buffer of ints into sock_fds. However, although we
free this in the error exit path, we don't ever free it
in the successful return path. Coverity spots this leak.
Switch to g_autofree so we don't need to manually free the
array.
Resolves: Coverity CID 1534906
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240312183810.557768-4-peter.maydell@linaro.org
In socket_check_afunix_support() we call socket(PF_UNIX, SOCK_STREAM, 0)
to see if it works, but we call close() on the result whether it
worked or not. Only close the fd if the socket() call succeeded.
Spotted by Coverity.
Resolves: Coverity CID 1497481
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240312183810.557768-3-peter.maydell@linaro.org
Using xlnx-zynqmp-pmu-soc on the command line causes QEMU to crash:
./qemu-system-microblazeel -M petalogix-ml605 -device xlnx-zynqmp-pmu-soc
**
ERROR:tcg/tcg.c:813:tcg_register_thread: assertion failed: (n < tcg_max_ctxs)
Bail out!
Aborted (core dumped)
Mark the device with "user_creatable = false" to avoid that this can happen.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2229
Message-ID: <20240322183153.1023359-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The "[s390x] GCC (other-system)" and the "[s390x] GCC check-tcg"
jobs are hitting the 50 minutes timeout in Travis quite frequently
since a while.
To fix it, we've got to drop a lot of the targets from the target
list in the jobs to make them work again.
With regards to the "check-tcg" test, we can move the check with
"s390x-linux-user" to the "user" job instead which also builds
the s390x-linux-user target.
And while we're at it, remove the "--enable-fdt=system" configure
switch (since this is not required nowadays anymore).
Message-ID: <20240320104144.823425-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When the zero page detection is done in the multifd threads, we need
to iterate the second part of the pages->offset array and clear the
file bitmap for each zero page. The piece of code we merged to do that
is wrong.
The reason this has passed all the tests is because the bitmap is
initialized with zeroes already, so clearing the bits only really has
an effect during live migration and when a data page goes from having
data to no data.
Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240321201242.6009-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
With current code base I can observe extremely high sync count during
precopy, as long as one enables postcopy-ram=on before switchover to
postcopy.
To provide some context of when QEMU decides to do a full sync: it checks
must_precopy (which implies "data must be sent during precopy phase"), and
as long as it is lower than the threshold size we calculated (out of
bandwidth and expected downtime) QEMU will kick off the slow/exact sync.
However, when postcopy is enabled (even if still during precopy phase), RAM
only reports all pages as can_postcopy, and report must_precopy==0. Then
"must_precopy <= threshold_size" mostly always triggers and enforces a slow
sync for every call to migration_iteration_run() when postcopy is enabled
even if not used. That is insane.
It turns out it was a regress bug introduced in the previous refactoring in
8.0 as reported by Nina [1]:
(a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")
Then a workaround patch is applied at the end of release (8.0-rc4) to fix it:
(b) 28ef5339c3 ("migration: fix ram_state_pending_exact()")
However that "workaround" was overlooked when during the cleanup in this
9.0 release in this commit..
(c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Then the issue was re-exposed as reported by Nina [1].
The problem with (b) is that it only fixed the case for RAM, rather than
all the rest of iterators. Here a slow sync should only be required if all
dirty data (precopy+postcopy) is less than the threshold_size that QEMU
calculated. It is even debatable whether a sync is needed when switched to
postcopy. Currently ram_state_pending_exact() will be mostly noop if
switched to postcopy, and that logic seems to apply too for all the rest of
iterators, as sync dirty bitmap during a postcopy doesn't make much sense.
However let's leave such change for later, as we're in rc phase.
So rather than reusing commit (b), this patch provides the complete fix for
all iterators. When at it, cleanup a little bit on the lines around.
[1] https://gitlab.com/qemu-project/qemu/-/issues/1565
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320214453.584374-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This reverts commit decdc76772 in full
and also the relevant migration-tests from
7a09f09283.
After the addition of the new QAPI-based migration address API in 8.2
we've been converting an "fd:" URI into a SocketAddress, missing the
fact that the "fd:" syntax could also be used for a plain file instead
of a socket. This is a problem because the SocketAddress is part of
the API, so we're effectively asking users to create a "socket"
channel to pass in a plain file.
The easiest way to fix this situation is to deprecate the usage of
both SocketAddress and "fd:" when used with a plain file for
migration. Since this has been possible since 8.2, we can wait until
9.1 to deprecate it.
For 9.0, however, we should avoid adding further support to migration
to a plain file using the old "fd:" syntax or the new SocketAddress
API, and instead require the usage of either the old-style "file:" URI
or the FileMigrationArgs::filename field of the new API with the
"/dev/fdset/NN" syntax, both of which are already supported.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240319210941.1907-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
pull-loongarch-20240322
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZf1WZgAKCRBAov/yOSY+
# 35zZBADDPLM3130Q/2zsGhol1C538i4+hYRbrX+OsLnlaldyE3NqCPcgaKwVE3xS
# T9aOln91rDyQedz4DVYYSx+Oa1JpRjGko957REmopL50SJOYi6n7YhHJksaUirjJ
# tMDZdPClOegieOpCu8LgJAVhaxTpZvfLedJVPt7O6Fl/uP3pLg==
# =XLqh
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 22 Mar 2024 09:59:02 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240322' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int'
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
RISC-V PR for 9.0
* Do not enable all named features by default
* A range of Vector fixes
* Update APLIC IDC after claiming iforce register
* Remove the dependency of Zvfbfmin to Zfbfmin
* Fix mode in riscv_tlb_fill
* Fix timebase-frequency when using KVM acceleration
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmX9RscACgkQr3yVEwxT
# gBNaRg/+KUSF6AuY25pS7GawbufBbwWWaWN9G/inPVoCnLbeYrkB3uZw3nBd3iV8
# KiD9Azabl6TLBFC/f7eP9alNDIoSrq5EliayrlFEZIncYvig2Y3CkWUeK6oJqDp2
# Dz1Vah4IB96bU2/M9icyHkh3tnSnbhq0JrbgoAYwWutZy4ERYugTHulOGPxBj64I
# JIfb8wYqaak3Uak+g0mz/YBNHegLEDxIzIRhO4oWPE0MWKSO3t79G9qVAYi7pkFB
# ZQQasZy0h9ZpwKvVajiO8yjwh7COI0IPU+4vZNkNXue0SXQvAvcKA4DdaTwmMTio
# 9UM9HRB371F5LtJLdvAT2TR8FfW26Y7xBe458jheFOnPHKwxEFtUFCQ39UJB3bDN
# k7CYvU3GIqUJHD7PtYZfzTdYkdnIDpr9yKTPP2/nCN53FzXuJs/XTyySphJ6mZ2m
# dsr1bnJn/ncZP7W2vdWGfgQEKt2CHfE5qWM++RwhmQc+IKn2ImMA0hBsg6Gl2imB
# 9WANt3UX784VDmcwcFVgDgr6nftDs7gjVCtHAaRV7Oq2f9hcr17pRxg66mSXs0BX
# fMhcqHBe01LpZQRbaGQ0ImTQksEFyH2KTvt0kjF4SfpVzMfVOi/Zmy9goYNq4iYd
# tfucBbXVhpzbJ/9HeOzKAJQ2Wt0NyLiyDIOkWXj61WquS/0Mr9g=
# =8vP1
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 22 Mar 2024 08:52:23 GMT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240322' of https://github.com/alistair23/qemu:
target/riscv/kvm: fix timebase-frequency when using KVM acceleration
target/riscv: Fix mode in riscv_tlb_fill
target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin
hw/intc: Update APLIC IDC after claiming iforce register
target/riscv/vector_helper.c: optimize loops in ldst helpers
target/riscv: enable 'vstart_eq_zero' in the end of insns
trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
target/riscv: remove 'over' brconds from vector trans
target/riscv/vector_helpers: do early exit when vstart >= vl
target/riscv: always clear vstart for ldst_whole insns
target/riscv: always clear vstart in whole vec move insns
target/riscv/vector_helper.c: fix 'vmvr_v' memcpy endianess
trans_rvv.c.inc: set vstart = 0 in int scalar move insns
target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
target/riscv: do not enable all named features by default
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, QEMU only sets the iforce register to 0 and returns early
when claiming the iforce register. However, this may leave mip.meip
remains at 1 if a spurious external interrupt triggered by iforce
register is the only pending interrupt to be claimed, and the interrupt
cannot be lowered as expected.
This commit fixes this issue by calling riscv_aplic_idc_update() to
update the IDC status after the iforce register is claimed.
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240321104951.12104-1-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The vstart_eq_zero flag is updated at the beginning of the translation
phase from the env->vstart variable. During the execution phase all
functions will set env->vstart = 0 after a successful execution, but the
vstart_eq_zero flag remains the same as at the start of the block. This
will wrongly cause SIGILLs in translations that requires env->vstart = 0
and might be reading vstart_eq_zero = false.
This patch adds a new finalize_rvv_inst() helper that is called at the
end of each vector instruction that will both update vstart_eq_zero and
do a mark_vs_dirty().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1976
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
All helpers that rely on vstart >= vl are now doing early exits using
the VSTART_CHECK_EARLY_EXIT() macro. This macro will not only exit the
helper but also clear vstart.
We're still left with brconds that are skipping the helper, which is the
only place where we're clearing vstart. The pattern goes like this:
tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
(... calls helper that clears vstart ...)
gen_set_label(over);
return true;
This means that every time we jump to 'over' we're not clearing vstart,
which is an oversight that we're doing across the board.
Instead of setting vstart = 0 manually after each 'over' jump, remove
those brconds that are skipping helpers. The exception will be
trans_vmv_s_x() and trans_vfmv_s_f(): they don't use a helper and are
already clearing vstart manually in the 'over' label.
While we're at it, remove the (vl == 0) brconds from trans_rvbf16.c.inc
too since they're unneeded.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're going to make changes that will required each helper to be
responsible for the 'vstart' management, i.e. we will relieve the
'vstart < vl' assumption that helpers have today.
Helpers are usually able to deal with vstart >= vl, i.e. doing nothing
aside from setting vstart = 0 at the end, but the tail update functions
will update the tail regardless of vstart being valid or not. Unifying
the tail update process in a single function that would handle the
vstart >= vl case isn't trivial (see [1] for more info).
This patch takes a blunt approach: do an early exit in every single
vector helper if vstart >= vl, unless the helper is guarded with
vstart_eq_zero in the translation. For those cases the helper is ready
to deal with cases where vl might be zero, i.e. throwing exceptions
based on it like vcpop_m() and first_m().
Helpers that weren't changed:
- vcpop_m(), vfirst_m(), vmsetm(), GEN_VEXT_VIOTA_M(): these are guarded
directly with vstart_eq_zero;
- GEN_VEXT_VCOMPRESS_VM(): guarded with vcompress_vm_check() that checks
vstart_eq_zero;
- GEN_VEXT_RED(): guarded with either reduction_check() or
reduction_widen_check(), both check vstart_eq_zero;
- GEN_VEXT_FRED(): guarded with either freduction_check() or
freduction_widen_check(), both check vstart_eq_zero.
Another exception is vext_ldst_whole(), who operates on effective vector
length regardless of the current settings in vtype and vl.
[1] https://lore.kernel.org/qemu-riscv/1590234b-0291-432a-a0fa-c5a6876097bc@linux.alibaba.com/
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 8ff8ac6329 added a conditional to guard the vext_ldst_whole()
helper if vstart >= evl. But by skipping the helper we're also not
setting vstart = 0 at the end of the insns, which is incorrect.
We'll move the conditional to vext_ldst_whole(), following in line with
the removal of all brconds vstart >= vl that the next patch will do. The
idea is to make the helpers responsible for their own vstart management.
Fix ldst_whole isns by:
- remove the brcond that skips the helper if vstart is >= evl;
- vext_ldst_whole() now does an early exit with the same check, where
evl = (vlenb * nf) >> log2_esz, but the early exit will also clear
vstart.
The 'width' param is now unneeded in ldst_whole_trans() and is also
removed. It was used for the evl calculation for the brcond and has no
other use now. The 'width' is reflected in vext_ldst_whole() via
log2_esz, which is encoded by GEN_VEXT_LD_WHOLE() as
"ctzl(sizeof(ETYPE))".
Suggested-by: Max Chou <max.chou@sifive.com>
Fixes: 8ff8ac6329 ("target/riscv: rvv: Add missing early exit condition for whole register load/store")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Message-ID: <20240314175704.478276-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
These insns have 2 paths: we'll either have vstart already cleared if
vstart_eq_zero or we'll do a brcond to check if vstart >= maxsz to call
the 'vmvr_v' helper. The helper will clear vstart if it executes until
the end, or if vstart >= vl.
For starters, the check itself is wrong: we're checking vstart >= maxsz,
when in fact we should use vstart in bytes, or 'startb' like 'vmvr_v' is
calling, to do the comparison. But even after fixing the comparison we'll
still need to clear vstart in the end, which isn't happening too.
We want to make the helpers responsible to manage vstart, including
these corner cases, precisely to avoid these situations:
- remove the wrong vstart >= maxsz cond from the translation;
- add a 'startb >= maxsz' cond in 'vmvr_v', and clear vstart if that
happens.
This way we're now sure that vstart is being cleared in the end of the
execution, regardless of the path taken.
Fixes: f714361ed7 ("target/riscv: rvv-1.0: implement vstart CSR")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240314175704.478276-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
trans_vmv_x_s, trans_vmv_s_x, trans_vfmv_f_s and trans_vfmv_s_f aren't
setting vstart = 0 after execution. This is usually done by a helper in
vector_helper.c but these functions don't use helpers.
We'll set vstart after any potential 'over' brconds, and that will also
mandate a mark_vs_dirty() too.
Fixes: dedc53cbc9 ("target/riscv: rvv-1.0: integer scalar move instructions")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240314175704.478276-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 3b8022269c added the capability of named features/profile
extensions to be added in riscv,isa. To do that we had to assign priv
versions for each one of them in isa_edata_arr[]. But this resulted in a
side-effect: vendor CPUs that aren't running priv_version_latest started
to experience warnings for these profile extensions [1]:
| $ qemu-system-riscv32 -M sifive_e
| qemu-system-riscv32: warning: disabling zic64b extension for hart
0x00000000 because privilege spec version does not match
| qemu-system-riscv32: warning: disabling ziccamoa extension for
hart 0x00000000 because privilege spec version does not match
This is benign as far as the CPU behavior is concerned since disabling
both extensions is a no-op (aside from riscv,isa). But the warnings are
unpleasant to deal with, especially because we're sending user warnings
for extensions that users can't enable/disable.
Instead of enabling all named features all the time, separate them by
priv version. During finalize() time, after we decided which
priv_version the CPU is running, enable/disable all the named extensions
based on the priv spec chosen. This will be enough for a bug fix, but as
a future work we should look into how we can name these extensions in a
way that we don't need an explicit ext_name => priv_ver as we're doing
here.
The named extensions being added in isa_edata_arr[] that will be
enabled/disabled based solely on priv version can be removed from
riscv_cpu_named_features[]. 'zic64b' is an extension that can be
disabled based on block sizes so it'll retain its own flag and entry.
[1] https://lists.gnu.org/archive/html/qemu-devel/2024-03/msg02592.html
Reported-by: Clément Chigot <chigot@adacore.com>
Fixes: 3b8022269c ("target/riscv: add riscv,isa to named features")
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Clément Chigot <chigot@adacore.com>
Message-ID: <20240312203214.350980-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel P. Berrangé <berrange@redhat.com> pointed out that the coroutine
pool size heuristic is very conservative. Instead of halving
max_map_count, he suggested reserving 5,000 mappings for non-coroutine
users based on observations of guests he has access to.
Fixes: 86a637e481 ("coroutine: cap per-thread local pool size")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240320181232.1464819-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Use EPERM for seccomp filter instead of killing QEMU when
an attempt to spawn child process is made
* Reduce priority of POLLHUP handling for socket chardevs
to increase likelihood of pending data being processed
* Fix chardev I/O main loop integration when TLS is enabled
* Fix broken crypto test suite when distro disables
SM4 algorithm
* Improve diagnosis of failed crypto tests
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmX585EACgkQvobrtBUQ
# T98TIg//ekc/f0JrRs68hjmo/vfcHWGHDMbZagj48zZNIn8DhJmQdt+qrCjMrMGW
# 353nTawFuF3EO9ju/eRLO54T+p1+a3zX8TyO4tL1W+RY9HARPeqssmFemDPfkMfQ
# IFGv0M0vaxGZpBna7jlXfDK/hCbJexKoChyT4eSF9H1Tp9o6T2J9AWvB5WTYLoQ2
# GzusDqBLKTkKhxMTCqevkFD/yCkgIQKlX8mG188PoJnGMqpGzQLTyw9lo5Npi1nE
# nhXa2MrrSfusk0rtwEzT14sQ58U+MF4fLQxUC+knNX81FSv8Q6QDu4Stfhwc+az7
# ynO4b/3IzK+VCICb2QM1ZNoTZNLcLfw1jdFTIAt8wiE+BMSySNQtdneURZOynydy
# Qd0alPNb4zfVRIGVjoOj38HiOmIKp5riIsUsI03jjBAgJu47tYRi60Tq2t6KxVoP
# rpDd5Vmsd0AR+7acO29rp0aLB+x2/ANDY+1N1Xi4tQdblmKIziHPZzx6H49wbwev
# 8Jdghg10RpbdqIGOfZ9fn13iCDO+1/gy6g/jTe2tMZrZsyov904tDqyUCDCzAbTz
# B8lvnr0LfSX2DYBryGEHIa/eMN2TxPuzpvZP0JFO1QxJnOs9w3aHr1T6A1sCV4a3
# JjTu71LsomNMXj3t3ImBHzMlgQZoL5Bxoh7b7jbLO4cvnhRbiJk=
# =4HKW
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Mar 2024 20:20:33 GMT
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu:
crypto: report which ciphers are being skipped during tests
crypto: use error_abort for unexpected failures
crypto: query gcrypt for cipher availability
crypto: factor out conversion of QAPI to gcrypt constants
Revert "chardev: use a child source for qio input source"
Revert "chardev/char-socket: Fix TLS io channels sending too much data to the backend"
chardev: lower priority of the HUP GSource in socket chardev
seccomp: report EPERM instead of killing process for spawn set
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The "link_depends" key has not been used since commit c46f76d158
("meson: specify fuzz linker script as a project arg", 2020-09-08),
and even before that it was only used for fork-fuzzing which we
removed in commit d2e6f9272d ("fuzz: remove fork-fuzzing scaffolding",
2023-02-16).
So, remove it for a very small simplification of meson.build.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids fetching blobs and tree references for branches we are not
going to worry about. Also skip tag references which are similarly not
useful and keep the default --prune. This keeps the .git data to
around 100M rather than the ~400M even a shallow clone takes.
So we can check the savings we also run a quick du while setting up
the build.
We also have to have special settings of GIT_FETCH_EXTRA_FLAGS for the
Windows build, the migration legacy test and the custom runners. In
the case of the custom runners we also move the free floating variable
to the runner template.
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240312170011.1688444-1-alex.bennee@linaro.org>
rec->count.score is inside rec, which is freed before rec->count.score is.
Reorder the instructions
Reported by Coverity as CID 1539967.
Cc: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce()
may have a parameter with NULL monitor pointer. Revert monitor_puts() in
do_inject_x86_mce() to fix, then the fact that we send the same message to
monitor and log is again more obvious.
Fixes: bf0c50d4aa (monitor: expose monitor_puts to rest of code)
Reviwed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240320083640.523287-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Building dbus-display1.c explicitly as a static library drops -fPIC by
default, which may not be correct if it ends up linked to a shared
library.
Let the target decide how to build the unit, with or without -fPIC. This
makes commit 186acfbaf7 ("tests/qtest: Depend on dbus_display1_dep") no
longer relevant, as dbus-display1.c will be recompiled.
Fixes: c172136ea3 ("meson: ensure dbus-display generated code is built
before other units")
Reported-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
ui/cocoa needs to update the UI info and reset the keyboard state
tracker when switching the console, or the new console will see the
stale UI info or keyboard state. Previously, updating the UI info was
done with cocoa_switch(), but it is meant to be called when the surface
is being replaced, and may be called even when not switching the
console. ui/cocoa never reset the keyboard state, which resulted in
stuck keys.
Add ui/cocoa's own implementation of console_select(), which updates the
UI info and resets the keyboard state tracker.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240319-console-v2-3-3fd6feef321a@daynix.com>
console_select() is shared by other displays and a console_select() call
from one of them triggers console switching also in ui/curses,
circumventing key state reinitialization that needs to be performed in
preparation and resulting in stuck keys.
Use its internal state to track the current active console to prevent
such a surprise console switch.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240319-console-v2-2-3fd6feef321a@daynix.com>
A chardev-vc used to inherit the size of a graphic console when its
size not explicitly specified, but it often did not make sense. If a
chardev-vc is instantiated during the startup, the active graphic
console has no content at the time, so it will have the size of graphic
console placeholder, which contains no useful information. It's better
to have the standard size of text console instead.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240319-console-v2-1-3fd6feef321a@daynix.com>
When we use qemu tcg simulation, the page size of bios is 4KB.
When using the level 2 super huge page (page size is 1G) to create the page table,
it is found that the content of the corresponding address space is abnormal,
resulting in the bios can not start the operating system and graphical interface normally.
The lddir and ldpte instruction emulation has
a problem with the use of super huge page processing above level 2.
The page size is not correctly calculated,
resulting in the wrong page size of the table entry found by tlb.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240318070332.1273939-1-lixianglai@loongson.cn>
Since the ciphers can be dynamically disabled at runtime, when running
unit tests it is helpful to report which ciphers we can skipped for
testing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This improves the error diagnosis from the unit test when a cipher
is unexpected not available from
ERROR:../tests/unit/test-crypto-cipher.c:683:test_cipher: assertion failed: (err == NULL)
Bail out! ERROR:../tests/unit/test-crypto-cipher.c:683:test_cipher: assertion failed: (err == NULL)
Aborted (core dumped)
to
Unexpected error in qcrypto_cipher_ctx_new() at ../crypto/cipher-gcrypt.c.inc:262:
./build//tests/unit/test-crypto-cipher: Cannot initialize cipher: Invalid cipher algorithm
Aborted (core dumped)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Just because a cipher is defined in the gcrypt header file, does not
imply that it can be used. Distros can filter the list of ciphers when
building gcrypt. For example, RHEL-9 disables the SM4 cipher. It is
also possible that running in FIPS mode might dynamically change what
ciphers are available at runtime.
qcrypto_cipher_supports must therefore query gcrypt directly to check
for cipher availability.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The conversion of cipher mode will shortly be required in more
than one place.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This reverts commit a7077b8e35,
and add comments to explain why child sources cannot be used.
When a GSource is added as a child of another GSource, if its
'prepare' function indicates readiness, then the parent's
'prepare' function will never be run. The io_watch_poll_prepare
absolutely *must* be run on every iteration of the main loop,
to ensure that the chardev backend doesn't feed data to the
frontend that it is unable to consume.
At the time a7077b8e35 was made,
all the child GSource impls were relying on poll'ing an FD,
so their 'prepare' functions would never indicate readiness
ahead of poll() being invoked. So the buggy behaviour was
not noticed and lay dormant.
Relatively recently the QIOChannelTLS impl introduced a
level 2 child GSource, which checks with GNUTLS whether it
has cached any data that was decoded but not yet consumed:
commit ffda5db65a
Author: Antoine Damhet <antoine.damhet@shadow.tech>
Date: Tue Nov 15 15:23:29 2022 +0100
io/channel-tls: fix handling of bigger read buffers
Since the TLS backend can read more data from the underlying QIOChannel
we introduce a minimal child GSource to notify if we still have more
data available to be read.
Signed-off-by: Antoine Damhet <antoine.damhet@shadow.tech>
Signed-off-by: Charles Frey <charles.frey@shadow.tech>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
With this, it is now quite common for the 'prepare' function
on a QIOChannelTLS GSource to indicate immediate readiness,
bypassing the parent GSource 'prepare' function. IOW, the
critical 'io_watch_poll_prepare' is being skipped on some
iterations of the main loop. As a result chardev frontend
asserts are now being triggered as they are fed data they
are not ready to consume.
A reproducer is as follows:
* In terminal 1 run a GNUTLS *echo* server
$ gnutls-serv --echo \
--x509cafile ca-cert.pem \
--x509keyfile server-key.pem \
--x509certfile server-cert.pem \
-p 9000
* In terminal 2 run a QEMU guest
$ qemu-system-s390x \
-nodefaults \
-display none \
-object tls-creds-x509,id=tls0,dir=$PWD,endpoint=client \
-chardev socket,id=con0,host=localhost,port=9000,tls-creds=tls0 \
-device sclpconsole,chardev=con0 \
-hda Fedora-Cloud-Base-39-1.5.s390x.qcow2
After the previous patch revert, but before this patch revert,
this scenario will crash:
qemu-system-s390x: ../hw/char/sclpconsole.c:73: chr_read: Assertion
`size <= SIZE_BUFFER_VT220 - scon->iov_data_len' failed.
This assert indicates that 'tcp_chr_read' was called without
'tcp_chr_read_poll' having first been checked for ability to
receive more data
QEMU's use of a 'prepare' function to create/delete another
GSource is rather a hack and not normally the kind of thing that
is expected to be done by a GSource. There is no mechanism to
force GLib to always run the 'prepare' function of a parent
GSource. The best option is to simply not use the child source
concept, and go back to the functional approach previously
relied on.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This commit results in unexpected termination of the TLS connection.
When 'fd_can_read' returns 0, the code goes on to pass a zero length
buffer to qio_channel_read. The TLS impl calls into gnutls_recv()
with this zero length buffer, at which point GNUTLS returns an error
GNUTLS_E_INVALID_REQUEST. This is treated as fatal by QEMU's TLS code
resulting in the connection being torn down by the chardev.
Simply skipping the qio_channel_read when the buffer length is zero
is also not satisfactory, as it results in a high CPU burn busy loop
massively slowing QEMU's functionality.
The proper solution is to avoid tcp_chr_read being called at all
unless the frontend is able to accept more data. This will be done
in a followup commit.
This reverts commit 462945cd22
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The socket chardev often has 2 GSource object registered against the
same FD. One is registered all the time and is just intended to handle
POLLHUP events, while the other gets registered & unregistered on the
fly as the frontend is ready to receive more data or not.
It is very common for poll() to signal a POLLHUP event at the same time
as there is pending incoming data from the disconnected client. It is
therefore essential to process incoming data prior to processing HUP.
The problem with having 2 GSource on the same FD is that there is no
guaranteed ordering of execution between them, so the chardev code may
process HUP first and thus discard data.
This failure scenario is non-deterministic but can be seen fairly
reliably by reverting a7077b8e35, and
then running 'tests/unit/test-char', which will sometimes fail with
missing data.
Ideally QEMU would only have 1 GSource, but that's a complex code
refactoring job. The next best solution is to try to ensure ordering
between the 2 GSource objects. This can be achieved by lowering the
priority of the HUP GSource, so that it is never dispatched if the
main GSource is also ready to dispatch. Counter-intuitively, lowering
the priority of a GSource is done by raising its priority number.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When something tries to run one of the spawn syscalls (eg clone),
our seccomp deny filter is set to cause a fatal trap which kills
the process.
This is found to be unhelpful when QEMU has loaded the nvidia
GL library. This tries to spawn a process to modprobe the nvidia
kmod. This is a dubious thing to do, but at the same time, the
code will gracefully continue if this fails. Our seccomp filter
rightly blocks the spawning, but prevent the graceful continue.
Switching to reporting EPERM will make QEMU behave more gracefully
without impacting the level of protect we have.
https://gitlab.com/qemu-project/qemu/-/issues/2116
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Pull request
This fix solves the "failed to set up stack guard page" error that has been
reported on Linux hosts where the QEMU coroutine pool exceeds the
vm.max_map_count limit.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmX5qq0ACgkQnKSrs4Gr
# c8ginQf8DRKzA7K8OivEegKpf0TgGcAcw9/xKc6zJH3X0/GXi1my61tzz+XUkbNy
# /R9HRrjBUb4MhSmJzP9kxuPFcBD5fZeipg4eTqtJCdi+DQ57+YypShVpsDrD7eNv
# X5dxeeONdWwP+k9JiOj9NtSOMmTKExn/Q/w45G2eeBlJh4yRA+56XN/dDXTFlidm
# NEpOGrKbyFKuAf/ZwYmeBr4aqIGTN3UgOVco/rqkGPYPTYpKlCoE5rSTEnQrbR7/
# C9KojlrGawJXlKjxfu/6i7yGHrv0eJ2N1VauvR/DHhQvdRhojVVt3NFGG/WJi+cL
# CMbxNyYeQJLNFtfPWzokjKEudxkshg==
# =lznr
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Mar 2024 15:09:33 GMT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
coroutine: cap per-thread local pool size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The coroutine pool implementation can hit the Linux vm.max_map_count
limit, causing QEMU to abort with "failed to allocate memory for stack"
or "failed to set up stack guard page" during coroutine creation.
This happens because per-thread pools can grow to tens of thousands of
coroutines. Each coroutine causes 2 virtual memory areas to be created.
Eventually vm.max_map_count is reached and memory-related syscalls fail.
The per-thread pool sizes are non-uniform and depend on past coroutine
usage in each thread, so it's possible for one thread to have a large
pool while another thread's pool is empty.
Switch to a new coroutine pool implementation with a global pool that
grows to a maximum number of coroutines and per-thread local pools that
are capped at hardcoded small number of coroutines.
This approach does not leave large numbers of coroutines pooled in a
thread that may not use them again. In order to perform well it
amortizes the cost of global pool accesses by working in batches of
coroutines instead of individual coroutines.
The global pool is a list. Threads donate batches of coroutines to when
they have too many and take batches from when they have too few:
.-----------------------------------.
| Batch 1 | Batch 2 | Batch 3 | ... | global_pool
`-----------------------------------'
Each thread has up to 2 batches of coroutines:
.-------------------.
| Batch 1 | Batch 2 | per-thread local_pool (maximum 2 batches)
`-------------------'
The goal of this change is to reduce the excessive number of pooled
coroutines that cause QEMU to abort when vm.max_map_count is reached
without losing the performance of an adequately sized coroutine pool.
Here are virtio-blk disk I/O benchmark results:
RW BLKSIZE IODEPTH OLD NEW CHANGE
randread 4k 1 113725 117451 +3.3%
randread 4k 8 192968 198510 +2.9%
randread 4k 16 207138 209429 +1.1%
randread 4k 32 212399 215145 +1.3%
randread 4k 64 218319 221277 +1.4%
randread 128k 1 17587 17535 -0.3%
randread 128k 8 17614 17616 +0.0%
randread 128k 16 17608 17609 +0.0%
randread 128k 32 17552 17553 +0.0%
randread 128k 64 17484 17484 +0.0%
See files/{fio.sh,test.xml.j2} for the benchmark configuration:
https://gitlab.com/stefanha/virt-playbooks/-/tree/coroutine-pool-fix-sizing
Buglink: https://issues.redhat.com/browse/RHEL-28947
Reported-by: Sanjay Rao <srao@redhat.com>
Reported-by: Boaz Ben Shabat <bbenshab@redhat.com>
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240318183429.1039340-1-stefanha@redhat.com>
aspeed, pnv, vfio queue:
* user device fixes for Aspeed and PowerNV machines
* coverity fix for iommufd
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmX5mm0ACgkQUaNDx8/7
# 7KE/MQ/9GeX4yNBxY2iTATdmPXwjMw8AtKyfIQb605nIO0ch1Z98ywl5VMwCNohn
# ppY9L5bFpEASgRlFVm73X4DGxKyRGpRPqylsvINh0hKciRpmRkELHY3llhnXsd7P
# Q197pDtFr54FeX8j4+hSAu4paT97fPENlKn0J6lto2I1cXGcD1LYNDFhysoXdGme
# brJgo7KjQJZPZ560ZewskL5FWf3G9EkRjpqd8y0G5OtNmAPgAaahOMHhDCXan182
# J89I9CHI5xN45MRfAs8JamSaj/GyNsr4h04WhPa0+VZQ5vsaeW2Ekt4ypj+oAV+p
# wykhYzQk4ALZcmmph2flSAtLa7uheI+imyqubMthQCDj3G8onSQBMd5/4WRK6O49
# 0oE1DpPDEfhlJEQYxaYhOeqeA9iaP+w6V+yE+L5oGlMO66cR7GZsPu0x7kXailbH
# IoHw9mO+vMkpuyeP7M3hA8WRFCdFpf1Nn1Ao5Jz3KoiTyJWlIvX5VSaj12sjddQ2
# fU9SKu2Q5QqS5uQGakkY64EyUy7RkGIX6zY2NIscVe2lfAfKf3mZwu7OIuLjEy5O
# lRn35vWV8fOdRooKoDPTNcdBCaNPi+RApin8chOv5P+F+ie7+Twf9sb1AgH/pIcv
# HptvTXbvSFNbbdb+OE8a5qsqTvnrN8d31IXzrWRYsJB07x2IyoA=
# =zR3v
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Mar 2024 14:00:13 GMT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-for-9.0-20240319' of https://github.com/legoater/qemu:
aspeed/smc: Only wire flash devices at reset
ppc/pnv: I2C controller is not user creatable
vfio/iommufd: Fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Aspeed machines have many Static Memory Controllers (SMC), up to
8, which can only drive flash memory devices. Commit 27a2c66c92
("aspeed/smc: Wire CS lines at reset") tried to ease the definitions
of these devices by allowing flash devices from the command line to be
attached to a SSI bus. For that, the wiring of the CS lines of the
Aspeed SMC controller was moved at reset. Two assumptions are made
though, first that the device has a SSI_GPIO_CS GPIO line, which is
not always the case, and second that it is a flash device.
Correct this problem by ensuring that the devices attached to the bus
are of the correct flash type. This fixes a QEMU abort when devices
without a CS line, such as the max111x, are passed on the command
line.
While at it, export TYPE_M25P80 used in the Xilinx Versal Virtual
machine.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2228
Fixes: 27a2c66c92 ("aspeed/smc: Wire CS lines at reset")
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
[ clg: minor fixes in the commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The I2C controller is a subunit of the processor. Make it so and avoid
QEMU crashes.
$ build/qemu-system-ppc64 -S -machine powernv9 -device pnv-i2c
qemu-system-ppc64: ../hw/ppc/pnv_i2c.c:521: pnv_i2c_realize: Assertion `i2c->chip' failed.
Aborted (core dumped)
Fixes: 263b81ee15 ("ppc/pnv: Add an I2C controller model")
Cc: Glenn Miles <milesg@linux.vnet.ibm.com>
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Coverity reported a memory leak on variable 'contents' in routine
iommufd_cdev_getfd(). Use g_autofree variables to simplify the exit
path and get rid of g_free() calls.
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Yi Liu <yi.l.liu@intel.com>
Fixes: CID 1540007
Fixes: 5ee3dc7af7 ("vfio/iommufd: Implement the iommufd backend")
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
virtio,pc,pci: bugfixes
Some minor fixes plus a big patchset from Igor fixing
a regression with windows.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmX4NzsPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpkp0H/1foAaDYrApMiIkji4aI94bq/fwTnu5CshNP
# +YEzwJCS4qbl67/Ix2Z+xVz7twjQbgGdLd6hb9ZypAQfclUk5tDoKyCmqHtQMakX
# T080FayOvWmUEostAw7MXvuz0HpJlgnJaJBn29l1hHjA/XXahKqcc705cup+W8hv
# F7xb6AoFcbdETMzNaoqekNaHiiYyQPITY9p/UYPLzj2zyLsspR9kBebIeA1yhtXw
# Tmc3+FMquoM2fMNxpwfhCBswg662MlOXhLN3dmyLqeJRl09x1GvaeJIGMY2MbefM
# RMMv0/jqwAyii5HXew2rPIbLdULGq+hSjZo2NOlx3EOjTCaOkXc=
# =XGMp
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 18 Mar 2024 12:44:43 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (24 commits)
smbios: add extra comments to smbios_get_table_legacy()
tests: acpi: update expected SSDT.dimmpxm blob
pc/q35: set SMBIOS entry point type to 'auto' by default
tests: acpi/smbios: whitelist expected blobs
smbios: error out when building type 4 table is not possible
smbios: in case of entry point is 'auto' try to build v2 tables 1st
smbios: extend smbios-entry-point-type with 'auto' value
smbios: clear smbios_type4_count before building tables
smbios: get rid of global smbios_ep_type
smbios: handle errors consistently
smbios: build legacy mode code only for 'pc' machine
smbios: rename/expose structures/bitmaps used by both legacy and modern code
smbios: add smbios_add_usr_blob_size() helper
smbios: don't check type4 structures in legacy mode
smbios: avoid mangling user provided tables
smbios: get rid of smbios_legacy global
smbios: get rid of smbios_smp_sockets global
smbios: cleanup smbios_get_tables() from legacy handling
tests: smbios: add test for legacy mode CLI options
tests: smbios: add test for -smbios type=11 option
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The low bit of MMU indices for x86 TCG indicates whether the processor is
in 32-bit mode and therefore linear addresses have to be masked to 32 bits.
However, the index was computed incorrectly, leading to possible conflicts
in the TLB for any address above 4G.
Analyzed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: b1661801c1 ("target/i386: Fix physical address truncation", 2024-02-28)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches
- mirror: Fix deadlock
- nbd/server: Fix race in draining the export
- qemu-img snapshot: Fix formatting with large values
- Fix blockdev-snapshot-sync error reporting for no medium
- iotests fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmX4OG8RHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9YdiQ//faXfGmbK6rBW4AkpwfrRM8SDHvm6hz7L
# 043ujAi3ziSXXoiec2/RK5wZ27nMJkfIrRHXpH41hgQvC6/3a4eIW6KSTaFV1PdG
# JtHCeopmVmgu7TZQ+kt/J6eLUTTLovoO94HgEfmxpr4CGZfx9RJftf2kCKILcYkh
# 9r04zSZLByVd4FJ5ZrqsFulWif5mXoGKdT/YisY3tKiCwFRWQDOoTymvJA012VtO
# MVmID593zwem3O3qtlGiGlK9qodBR4yof66xa/0gaYP98BZgv+LWnwLKha+OzSpX
# bQlxT26LY4JnSQkTdjF0QYnQiH4Q1kveUcNRZrGpA4iZxVDq1aks5DisThDwqoGG
# rhaPOWyJwJsonM1Enzim5Jd60JqvGdpTLjSA5oSyTjw62lAulnYihInERYSAFyyz
# UhQaO7qSog1//RpPEXEsiVkJBq8BE9l5I+L7+l5SCBhNr/UwZAOer/4m4X6d0SKN
# GEPRx0kH1voikzx7gIQs+Oldqvb0sg+zAvOynBxzpd+Ac6s8bFtWe+eSyWYL/ZGr
# Jg9+PL1xir/Uh7KmOnzt/iVBAmfSRpAo1O72xQXvHFYYtIP7hTkPO/vzqF206WMc
# WQFHHjfp5gVcMZ5AYg6txw+Bbtzu8g0AfB054lgnhihuShpf0E923TTDQFdV755s
# NUlrzuGu2fs=
# =+JIK
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 18 Mar 2024 12:49:51 GMT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin:
iotests: adapt to output change for recently introduced 'detached header' field
tests/qemu-iotests: Restrict tests using "--blockdev file" to the file protocol
tests/qemu-iotests: Fix some tests that use --image-opts for other protocols
tests/qemu-iotests: Restrict tests that use --image-opts to the 'file' protocol
tests/qemu-iotests: Restrict test 156 to the 'file' protocol
tests/qemu-iotests: Restrict test 134 and 158 to the 'file' protocol
tests/qemu-iotests: Restrict test 130 to the 'file' protocol
tests/qemu-iotests: Restrict test 114 to the 'file' protocol
tests/qemu-iotests: Restrict test 066 to the 'file' protocol
tests/qemu-iotests: Fix test 033 for running with non-file protocols
qemu-img: Fix Column Width and Improve Formatting in snapshot list
blockdev: Fix blockdev-snapshot-sync error reporting for no medium
iotests: Add test for reset/AioContext switches with NBD exports
nbd/server: Fix race in draining the export
mirror: Don't call job_pause_point() under graph lock
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull for 9.0-rc0
- Nicholas/Phil's fix on migration corruption / inconsistent for tcg
- Cedric's fix on block migration over n_sectors==0
- Steve's CPR reboot documentation page
- Fabiano's misc fixes on mapped-ram (IOC leak, dup() errors, fd checks, fd
use race, etc.)
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZfdZEhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wa+1AEA0+f7nCssvsILvCY9KifYO+OUJsLodUuQ
# JW0JBz+1iPMA+wSiyIVl2Xg78Q97nJxv71UJf+1cDJENA5EMmXMnxmYK
# =SLnA
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 17 Mar 2024 20:56:50 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240317-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Duplicate the fd for the outgoing_args
migration/multifd: Ensure we're not given a socket for file migration
migration: Fix iocs leaks during file and fd migration
migration: cpr-reboot documentation
migration: Skip only empty block devices
physmem: Fix migration dirty bitmap coherency with TCG memory access
physmem: Factor cpu_physical_memory_dirty_bits_cleared() out
physmem: Expose tlb_reset_dirty_range_all()
migration: Fix error handling after dup in file migration
io: Introduce qio_channel_file_new_dupfd
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove the unnecessary "Sparc" at the beginning of the line and
put the chip information into parentheses so that it is clearer
which part of the line have to be passed to "-cpu" to specify a
different CPU.
Message-ID: <20240307174334.130407-4-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
some users were confused by this message showing under TCG:
Selected CPU generation is too new. Maximum supported model
in the configuration: 'xyz'
Clarify that the maximum can depend on the accel, and add a
hint to try a different one.
Also add a hint for features mismatch to suggest trying
different accel, QEMU and kernel versions.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-ID: <20240314213746.27163-1-cfontana@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
address shift is caused by switch to 32-bit SMBIOS entry point
which has slightly different size from 64-bit one and happens
to trigger a bit different memory layout.
Expected diff:
- Name (MEMA, 0x07FFE000)
+ Name (MEMA, 0x07FFF000)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-21-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use smbios-entry-point-type='auto' for newer machine types as a workaround
for Windows not detecting SMBIOS tables. Which makes QEMU pick SMBIOS tables
based on configuration (with 2.x preferred and fallback to 3.x if the former
isn't compatible with configuration)
Default compat setting of smbios-entry-point-type after series
for pc/q35 machines:
* 9.0-newer: 'auto'
* 8.1-8.2: '64'
* 8.0-older: '32'
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2008
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-20-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If SMBIOS v2 version is requested but number of cores/threads
are more than it's possible to describe with v2, error out
instead of silently ignoring the fact and filling core/thread
count with bogus values.
This will help caller to decide if it should fallback to
SMBIOSv3 when smbios-entry-point-type='auto'
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU for some time now uses SMBIOS 3.0 for PC/Q35 machines by
default, however Windows has a bug in locating SMBIOS 3.0
entrypoint and fails to find tables when booted on SeaBIOS
(on UEFI SMBIOS 3.0 tables work fine since firmware hands
over tables in another way)
Missing SMBIOS tables may lead to some issues for guest
though (worst are: possible reactiveation, inability to
get virtio drivers from 'Windows Update')
It's unclear at this point if MS will fix the issue on their
side. So instead of it (or rather in addition) this patch
will try to workaround the issue.
aka, use smbios-entry-point-type=auto to make QEMU try
generating conservative SMBIOS 2.0 tables and if that
fails (due to limits/requested configuration) fallback
to SMBIOS 3.0 tables.
With this in place majority of users will use SMBIOS 2.0
tables which work fine with (Windows + legacy BIOS).
The configurations that is not to possible to describe
with SMBIOS 2.0 will switch automatically to SMBIOS 3.0
(which will trigger Windows bug but there is nothing
QEMU can do here, so go and aks Microsoft to real fix).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-17-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Current code uses mix of error_report()+exit(1)
and error_setg() to handle errors.
Use newer error_setg() everywhere, beside consistency
it will allow to detect error condition without killing
QEMU and attempt switch-over to SMBIOS3.x tables/entrypoint
in follow up patch.
while at it, clear smbios_tables pointer after freeing.
that will avoid double free if smbios_get_tables() is called
multiple times.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-13-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
basically moving code around without functional change.
And exposing some symbols so that they could be shared
between smbbios.c and new smbios_legacy.c
plus some meson magic to build smbios_legacy.c only
for 'pc' machine and otherwise replace it with stub
if not selected.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-12-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As a preparation to move legacy handling into a separate file,
add prefix 'smbios_' to type0/type1/have_binfile_bitmap/have_fields_bitmap
and expose them in smbios.h so that they can be reused in
legacy and modern code.
Doing it as a separate patch to avoid rename cluttering follow-up
patch which will move legacy code into a separate file.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-11-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
legacy mode doesn't support structures of type 2 and more,
and CLI has a check for '-smbios type' option, however it's
still possible to sneak in type4 as a blob with '-smbios file'
option. However doing the later makes SMBIOS tables broken
since SeaBIOS doesn't expect that.
Rather than trying to add support for type4 to legacy code
(both QEMU and SeaBIOS), simplify smbios_get_table_legacy()
by dropping not relevant check in legacy code and error out
on type4 blob.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-9-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
currently smbios_entry_add() preserves internally '-smbios type='
options but tables provided with '-smbios file=' are stored directly
into blob that eventually will be exposed to VM. And then later
QEMU adds default/'-smbios type' entries on top into the same blob.
It makes impossible to generate tables more than once, hence
'immutable' guard was used.
Make it possible to regenerate final blob by storing user provided
blobs into a dedicated area (usr_blobs) and then copy it when
composing final blob. Which also makes handling of -smbios
options consistent.
As side effect of this and previous commits there is no need to
generate legacy smbios_entries at the time options are parsed.
Instead compose smbios_entries on demand from usr_blobs like
it is done for non-legacy SMBIOS tables.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240314152302.2324164-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
smbios_get_tables() bails out right away if leagacy mode is enabled
and won't generate any SMBIOS tables. At the same time x86 specific
fw_cfg_build_smbios() will genarate legacy tables and then proceed
to preparing temporary mem_array for useless call to
smbios_get_tables() and then discard it.
Drop legacy related check in smbios_get_tables() and return from
fw_cfg_build_smbios() early if legacy tables where built without
proceeding to non legacy part of the function.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-5-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Unfortunately having 2.0 machine type deprecated is not enough
to get rid of legacy SMBIOS handling since 'isapc' also uses
that and it's staying around.
Hence add test for CLI options handling to be sure that it
ain't broken during SMBIOS code refactoring.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cureently it not possible to run SMBIOS test without ACPI one,
which gets into the way when testing ACPI-less configs.
Extract SMBIOS testing into separate routines that could also
be run without ACPI dependency and use that for testing SMBIOS.
As the 1st user add "acpi/piix4/smbios-options" test case.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240314152302.2324164-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tests 263, 284 and detect-zeroes-registered-buf use qemu-io
with --image-opts so we have to enforce IMGOPTSSYNTAX=true here
to get $TEST_IMG in shape for other protocols than "file".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-9-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These tests 188, 189 and 198 use qemu-io with --image-opts with additional
hard-coded parameters for the file protocol, so they cannot work for other
protocols. Thus we have to limit these tests to the file protocol only.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-8-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The test fails completely when you try to use it with a different
protocol, e.g. with "./check -ssh -qcow2 156".
The test uses some hand-crafted JSON statements which cannot work with other
protocols, thus let's change this test to only support the 'file' protocol.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-7-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit b25b387fa5 updated the iotests 134 and 158 to use the --image-opts
parameter for qemu-io with file protocol related options, but forgot to
update the _supported_proto line accordingly. So let's do that now.
Fixes: b25b387fa5 ("qcow2: convert QCow2 to use QCryptoBlock for encryption")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-6-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iotest 114 uses "truncate" and the qcow2.py script on the destination file,
which both cannot deal with URIs. Thus this test needs the "file" protocol,
otherwise it fails with an error message like this:
truncate: cannot open 'ssh://127.0.0.1/tmp/qemu-build/tests/qemu-iotests/scratch/qcow2-ssh-114/t.qcow2.orig'
for writing: No such file or directory
Thus mark this test for "file protocol only" accordingly.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-4-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When running iotest 033 with the ssh protocol, it fails with:
033 fail [14:48:31] [14:48:41] 10.2s output mismatch
--- /.../tests/qemu-iotests/033.out
+++ /.../tests/qemu-iotests/scratch/qcow2-ssh-033/033.out.bad
@@ -174,6 +174,7 @@
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
wrote 512/512 bytes at offset 2097152
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-io: warning: Failed to truncate the tail of the image: ssh driver does not support shrinking files
read 512/512 bytes at offset 0
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
We already check for the qcow2 format here, so let's simply also
add a check for the protocol here, too, to only test the truncation
with the file protocol.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-2-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When external_snapshot_abort() rejects a BlockDriverState without a
medium, it creates an error like this:
error_setg(errp, "Device '%s' has no medium", device);
Trouble is @device can be null. My system formats null as "(null)",
but other systems might crash. Reproducer:
1. Create a block device without a medium
-> {"execute": "blockdev-add", "arguments": {"driver": "host_cdrom", "node-name": "blk0", "filename": "/dev/sr0"}}
<- {"return": {}}
3. Attempt to snapshot it
-> {"execute":"blockdev-snapshot-sync", "arguments": { "node-name": "blk0", "snapshot-file":"/tmp/foo.qcow2","format":"qcow2"}}
<- {"error": {"class": "GenericError", "desc": "Device '(null)' has no medium"}}
Broken when commit 0901f67ecd made @device optional.
Use bdrv_get_device_or_node_name() instead. Now it fails as it
should:
<- {"error": {"class": "GenericError", "desc": "Device 'blk0' has no medium"}}
Fixes: 0901f67ecd ("qmp: Allow to take external snapshots on bs graphs node.")
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240306142831.2514431-1-armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This replicates the scenario in which the bug was reported.
Unfortunately this relies on actually executing a guest (so that the
firmware initialises the virtio-blk device and moves it to its
configured iothread), so this can't make use of the qtest accelerator
like most other test cases. I tried to find a different easy way to
trigger the bug, but couldn't find one.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240314165825.40261-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When draining an NBD export, nbd_drained_begin() first sets
client->quiescing so that nbd_client_receive_next_request() won't start
any new request coroutines. Then nbd_drained_poll() tries to makes sure
that we wait for any existing request coroutines by checking that
client->nb_requests has become 0.
However, there is a small window between creating a new request
coroutine and increasing client->nb_requests. If a coroutine is in this
state, it won't be waited for and drain returns too early.
In the context of switching to a different AioContext, this means that
blk_aio_attached() will see client->recv_coroutine != NULL and fail its
assertion.
Fix this by increasing client->nb_requests immediately when starting the
coroutine. Doing this after the checks if we should create a new
coroutine is okay because client->lock is held.
Cc: qemu-stable@nongnu.org
Fixes: fd6afc501a ("nbd/server: Use drained block ops to quiesce the server")
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240314165825.40261-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Calling job_pause_point() while holding the graph reader lock
potentially results in a deadlock: bdrv_graph_wrlock() first drains
everything, including the mirror job, which pauses it. The job is only
unpaused at the end of the drain section, which is when the graph writer
lock has been successfully taken. However, if the job happens to be
paused at a pause point where it still holds the reader lock, the writer
lock can't be taken as long as the job is still paused.
Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-28125
Fixes: 004915a96a ("block: Protect bs->backing with graph_lock")
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240313153000.33121-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We currently store the file descriptor used during the main outgoing
channel creation to use it again when creating the multifd
channels.
Since this fd is used for the first iochannel, there's risk that the
QIOChannel gets freed and the fd closed while outgoing_args.fd still
has it available. This could lead to an fd-reuse bug.
Duplicate the outgoing_args fd to avoid this issue.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240315032040.7974-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
When doing migration using the fd: URI, QEMU will fetch the file
descriptor passed in via the monitor at
fd_start_outgoing|incoming_migration(), which means the checks at
migration_channels_and_transport_compatible() happen too soon and we
don't know at that point whether the FD refers to a plain file or a
socket.
For this reason, we've been allowing a migration channel of type
SOCKET_ADDRESS_TYPE_FD to pass the initial verifications in scenarios
where the socket migration is not supported, such as with fd + multifd.
The commit decdc76772 ("migration/multifd: Add mapped-ram support to
fd: URI") was supposed to add a second check prior to starting
migration to make sure a socket fd is not passed instead of a file fd,
but failed to do so.
Add the missing verification and update the comment explaining this
situation which is currently incorrect.
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240315032040.7974-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
At least for now cpu-topology is implemented only for KVM.
We already say this, but this tries to be more explicit,
and also show it in the examples.
This adds a new reference in the introduction that we can point to,
whenever we need to reference accelerators and how to select them.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-ID: <20240314172218.16478-1-cfontana@suse.de>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Tested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The memory for the io channels is being leaked in three different ways
during file migration:
1) if the offset check fails we never drop the ioc reference;
2) we allocate an extra channel for no reason;
3) if multifd is enabled but channel creation fails when calling
dup(), we leave the previous channels around along with the glib
polling;
Fix all issues by restructuring the code to first allocate the
channels and only register the watches when all channels have been
created.
For multifd, the file and fd migrations can share code because both
are backed by a QIOChannelFile. For the non-multifd case, the fd needs
to be separate because it is backed by a QIOChannelSocket.
Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Reported-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240313212824.16974-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
virtio,pc,pci: features, cleanups, fixes
more memslots support in libvhost-user
support PCIe Gen5/Gen6 link speeds in pcie
more traces in vdpa
network simulation devices support in vdpa
SMBIOS type 9 descriptor implementation
Bump max_cpus to 4096 vcpus in q35
aw-bits and granule options in VIRTIO-IOMMU
Support report NUMA nodes for device memory using GI in acpi
Beginning of shutdown event support in pvpanic
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/
# 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt
# +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ
# uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr
# 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G
# 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg=
# =C0UU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits)
docs/specs/pvpanic: document shutdown event
hw/cxl: Fix missing reserved data in CXL Device DVSEC
hmat acpi: Fix out of bounds access due to missing use of indirection
hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory.
qemu-options.hx: Document the virtio-iommu-pci aw-bits option
hw/arm/virt: Set virtio-iommu aw-bits default value to 48
hw/i386/q35: Set virtio-iommu aw-bits default value to 39
virtio-iommu: Add an option to define the input range width
virtio-iommu: Trace domain range limits as unsigned int
qemu-options.hx: Document the virtio-iommu-pci granule option
virtio-iommu: Change the default granule to the host page size
virtio-iommu: Add a granule property
hw/i386/acpi-build: Add support for SRAT Generic Initiator structures
hw/acpi: Implement the SRAT GI affinity structure
qom: new object to associate device to NUMA node
hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it
hw/i386/pc: Set "normal" boot device order in pc_basic_device_init()
hw/i386/pc: Avoid one use of the current_machine global
hw/i386/pc: Remove "rtc_state" link again
Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/core/machine.c
* PAPR nested hypervisor host implementation for spapr TCG
* excp_helper.c code cleanups and improvements
* Move more ops to decodetree
* Deprecate pseries-2.12 machines and P9 and P10 DD1.0 CPUs
* Document running Linux on AmigaNG
* Update dt feature advertising POWER CPUs.
* Add P10 PMU SPRs
* Improve pnv topology calculation for SMT8 CPUs.
* Various bug fixes.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmXwiT8ACgkQZ7MCdqhi
# HK7C/w//XxEO2bQTFPLFDTrP/voq7pcX8XeQNVyXCkXYjvsbu05oQow50k+Y5UAE
# US4MFjt8jFz0vuIKuKyoA3kG41zDSOzoX4TQXMM+tyTWbuFF3KAyfizb1xE6SYAN
# xJEGvmiXv/EgoSBD7BTKQp1tMPdIGZLwSdYiA0lmOo7YaMCgYAXaujW5hnNjQecT
# 873sN+10pHtQY++mINtD9Nfb6AcDGMWw0b+bykqIXhNRkI8IGOS4WF4vAuMBrwfe
# UM00wDnNRb86Dk14bv2XVNDr6/i0VRtUMwM4yiptrQ1TQx18LZaPSQFYjQfPaan7
# LwN4QkMFnBX54yJ7Npvjvu8BCBF47kwOVu4CIAFJ4sIm0WfTmozDpPttwcZ5w7Ve
# iXDOB9ECAB4pQ2rCgbSNG8MYUZgoHHOuThqolOP0Vh9NHRRJxpdw6CyAbmCGftc0
# lvRDPFiKp8xmCNJ/j3XzoUdHoG7NMwpUmHv9ruGU18SdQ8hyJN9AcQGWYrB4v0RV
# /hs2RAbwntG7ahkcwd8uy5aFw88Wph/uGXPXc49EWj7i49vHeIV2y5+gtthMywje
# qqjFXkistXuF+JHVnyoYmqqCyXaHX5CEwtawMv4EQeaJs76bLhMeMTKKl9rRp8qB
# DtbIZphO8iMsocrBnje48sA5HR0PM+H4HTjw10i8R0fLlWitaIY=
# =XnY5
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 16:56:31 GMT
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu: (38 commits)
spapr: nested: Introduce cap-nested-papr for Nested PAPR API
spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
spapr: nested: Use correct source for parttbl info for nested PAPR API.
spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
spapr: nested: Initialize the GSB elements lookup table.
spapr: nested: Extend nested_ppc_state for nested PAPR API
spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
spapr: nested: Document Nested PAPR API
spapr: nested: keep nested-hv related code restricted to its API.
spapr: nested: Introduce SpaprMachineStateNested to store related info.
spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
spapr: nested: register nested-hv api hcalls only for cap-nested-hv
target/ppc: Remove interrupt handler wrapper functions
target/ppc: Clean up ifdefs in excp_helper.c, part 3
target/ppc: Clean up ifdefs in excp_helper.c, part 2
target/ppc: Clean up ifdefs in excp_helper.c, part 1
target/ppc: Add gen_exception_err_nip() function
target/ppc: Readability improvements in exception handlers
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When the terminal GDB_FORK_ENABLED state is reached, the coordination
socket is not needed anymore and is therefore closed. However, if there
is a communication error between QEMU gdbstub and GDB, the generic
error handling code attempts to close it again.
Fix by closing it later - before returning - instead.
Fixes: Coverity CID 1539966
Fixes: d547e711a8 ("gdbstub: Implement follow-fork-mode child")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240312001813.13720-1-iii@linux.ibm.com>
Add stub to handle Xfer:siginfo:read packet query that requests the
machine's siginfo data.
This is used when GDB user executes 'print $_siginfo' and when the
machine stops due to a signal, for instance, on SIGSEGV. The information
in siginfo allows GDB to determiner further details on the signal, like
the fault address/insn when the SIGSEGV is caught.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-Id: <20240309030901.1726211-5-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The "check" target by itself is not enough to ensure we build the user
mode binaries. While we can't test them with check-tcg we can at least
include them in the build.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Gustavo Romero <gustavo.romero@linaro.org>
The block .save_setup() handler calls a helper routine
init_blk_migration() which builds a list of block devices to take into
account for migration. When one device is found to be empty (sectors
== 0), the loop exits and all the remaining devices are ignored. This
is a regression introduced when bdrv_iterate() was removed.
Change that by skipping only empty devices.
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Fixes: fea68bb6e9 ("block: Eliminate bdrv_iterate(), use bdrv_next()")
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Link: https://lore.kernel.org/r/20240312120431.550054-1-clg@redhat.com
[peterx: fix "Suggested-by:"]
Signed-off-by: Peter Xu <peterx@redhat.com>
Shutdown requests are normally hardware dependent.
By extending pvpanic to also handle shutdown requests, guests can
submit such requests with an easily implementable and cross-platform
mechanism.
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240310-pvpanic-shutdown-spec-v1-1-b258e182ce55@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The r3.1 specification introduced a new 2 byte field, but
to maintain DWORD alignment, a additional 2 reserved bytes
were added. Forgot those in updating the structure definition
but did include them in the size define leading to a buffer
overrun.
Also use the define so that we don't duplicate the value.
Fixes: Coverity ID 1534095 buffer overrun
Fixes: 8700ee15de ("hw/cxl: Standardize all references on CXL r3.1 and minor updates")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240308143831.6256-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With a numa set up such as
-numa nodeid=0,cpus=0 \
-numa nodeid=1,memdev=mem \
-numa nodeid=2,cpus=1
and appropriate hmat_lb entries the initiator list is correctly
computed and writen to HMAT as 0,2 but then the LB data is accessed
using the node id (here 2), landing outside the entry_list array.
Stash the reverse lookup when writing the initiator list and use
it to get the correct array index index.
Fixes: 4586a2cb83 ("hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240307160326.31570-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the default input range can extend to 64 bits. On x86,
when the virtio-iommu protects vfio devices, the physical iommu
may support only 39 bits. Let's set the default to 39, as done
for the intel-iommu.
We use hw_compat_8_2 to handle the compatibility for machines
before 9.0 which used to have a virtio-iommu default input range
of 64 bits.
Of course if aw-bits is set from the command line, the default
is overriden.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240307134445.92296-8-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
aw-bits is a new option that allows to set the bit width of
the input address range. This value will be used as a default for
the device config input_range.end. By default it is set to 64 bits
which is the current value.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240307134445.92296-7-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We used to set the default granule to 4KB but with VFIO assignment
it makes more sense to use the actual host page size.
Indeed when hotplugging a VFIO device protected by a virtio-iommu
on a 64kB/64kB host/guest config, we current get a qemu crash:
"vfio: DMA mapping failed, unable to continue"
This is due to the hot-attached VFIO device calling
memory_region_iommu_set_page_size_mask() with 64kB granule
whereas the virtio-iommu granule was already frozen to 4KB on
machine init done.
Set the granule property to "host" and introduce a new compat.
The page size mask used before 9.0 was qemu_target_page_mask().
Since the virtio-iommu currently only supports x86_64 and aarch64,
this matched a 4KB granule.
Note that the new default will prevent 4kB guest on 64kB host
because the granule will be set to 64kB which would be larger
than the guest page size. In that situation, the virtio-iommu
driver fails on viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".
In that case the workaround is to request 4K granule.
The current limitation of global granule in the virtio-iommu
should be removed and turned into per domain granule. But
until we get this upgraded, this new default is probably
better because I don't think anyone is currently interested in
running a 4KB page size guest with virtio-iommu on a 64KB host.
However supporting 64kB guest on 64kB host with virtio-iommu and
VFIO looks a more important feature.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-4-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows to choose which granule will be used by
default by the virtio-iommu. Current page size mask
default is qemu_target_page_mask so this translates
into a 4k granule on ARM and x86_64 where virtio-iommu
is supported.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-3-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ACPI spec provides a scheme to associate "Generic Initiators" [1]
(e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines GPUs) with Proximity Domains. This is
achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
node for each unique PXM ID encountered. Qemu currently do not implement
these structures while building SRAT.
Add GI structures while building VM ACPI SRAT. The association between
device and node are stored using acpi-generic-initiator object. Lookup
presence of all such objects and use them to build these structures.
The structure needs a PCI device handle [2] that consists of the device BDF.
The vfio-pci device corresponding to the acpi-generic-initiator object is
located to determine the BDF.
[1] ACPI Spec 6.3, Section 5.2.16.6
[2] ACPI Spec 6.3, Table 5.80
Cc: Jonathan Cameron <qemu-devel@nongnu.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cedric Le Goater <clg@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-3-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
NVIDIA GPU's support MIG (Mult-Instance GPUs) feature [1], which allows
partitioning of the GPU device resources (including device memory) into
several (upto 8) isolated instances. Each of the partitioned memory needs
a dedicated NUMA node to operate. The partitions are not fixed and they
can be created/deleted at runtime.
Unfortunately Linux OS does not provide a means to dynamically create/destroy
NUMA nodes and such feature implementation is not expected to be trivial. The
nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
we utilize the Generic Initiator (GI) Affinity structures that allows
association between nodes and devices. Multiple GI structures per BDF is
possible, allowing creation of multiple nodes by exposing unique PXM in each
of these structures.
Implement the mechanism to build the GI affinity structures as Qemu currently
does not. Introduce a new acpi-generic-initiator object to allow host admin
link a device with an associated NUMA node. Qemu maintains this association
and use this object to build the requisite GI Affinity Structure.
When multiple NUMA nodes are associated with a device, it is required to
create those many number of acpi-generic-initiator objects, each representing
a unique device:node association.
Following is one of a decoded GI affinity structure in VM ACPI SRAT.
[0C8h 0200 1] Subtable Type : 05 [Generic Initiator Affinity]
[0C9h 0201 1] Length : 20
[0CAh 0202 1] Reserved1 : 00
[0CBh 0203 1] Device Handle Type : 01
[0CCh 0204 4] Proximity Domain : 00000007
[0D0h 0208 16] Device Handle : 00 00 20 00 00 00 00 00 00 00 00
00 00 00 00 00
[0E0h 0224 4] Flags (decoded below) : 00000001
Enabled : 1
[0E4h 0228 4] Reserved2 : 00000000
[0E8h 0232 1] Subtable Type : 05 [Generic Initiator Affinity]
[0E9h 0233 1] Length : 20
An admin can provide a range of acpi-generic-initiator objects, each
associating a device (by providing the id through pci-dev argument)
to the desired NUMA node (using the node argument). Currently, only PCI
device is supported.
For the grace hopper system, create a range of 8 nodes and associate that
with the device using the acpi-generic-initiator object. While a configuration
of less than 8 nodes per device is allowed, such configuration will prevent
utilization of the feature to the fullest. The following sample creates 8
nodes per PCI device for a VM with 2 PCI devices and link them to the
respecitve PCI device using acpi-generic-initiator objects:
-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
-object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
-object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
-object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
-object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
-numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
-numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
-numa node,nodeid=16 -numa node,nodeid=17 \
-device vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
-object acpi-generic-initiator,id=gi8,pci-dev=dev1,node=10 \
-object acpi-generic-initiator,id=gi9,pci-dev=dev1,node=11 \
-object acpi-generic-initiator,id=gi10,pci-dev=dev1,node=12 \
-object acpi-generic-initiator,id=gi11,pci-dev=dev1,node=13 \
-object acpi-generic-initiator,id=gi12,pci-dev=dev1,node=14 \
-object acpi-generic-initiator,id=gi13,pci-dev=dev1,node=15 \
-object acpi-generic-initiator,id=gi14,pci-dev=dev1,node=16 \
-object acpi-generic-initiator,id=gi15,pci-dev=dev1,node=17 \
Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu [1]
Cc: Jonathan Cameron <qemu-devel@nongnu.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-2-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that pc_cmos_init() doesn't populate the X86MachineState::rtc attribute any
longer, its duties can be merged into pc_cmos_init_late() which is called within
machine_done notifier. This frees pc_piix and pc_q35 from explicit CMOS
initialization.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The boot device order may change during the lifetime of a VM. Usually, the
"normal" order is set once during machine init(). However, if a user specifies
`-boot once=...`, the "normal" order is overwritten by the "once" order just
before machine_done, and a reset handler is registered which restores the
"normal" order during the next reset.
In the next patch, pc_cmos_init() will be inlined into pc_cmos_init_late() which
runs during machine_done. This means that the "once" boot order would be
overwritten again with the "normal" boot order -- which renders the user's
choice ineffective. Fix this by setting the "normal" boot order in
pc_basic_device_init() which already registers the boot_set() handler.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The RTC can be accessed through the X86 machine instance, so rather than passing
the RTC it's possible to pass the machine state instead. This avoids
pc_boot_set() from having to access the current_machine global.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Commit 99e1c1137b "hw/i386/pc: Populate RTC attribute directly" made linking
the "rtc_state" property unnecessary and removed it. Commit 84e945aad2 "vl,
pc: turn -no-fd-bootchk into a machine property" accidently reintroduced the
link. Remove it again since it is not needed.
Fixes: 84e945aad2 "vl, pc: turn -no-fd-bootchk into a machine property"
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Commit 6f6ad2b245 "hw/i386/pc: Confine system flash handling to pc_sysfw"
causes a regression when specifying the property `-M pflash0` in the PCI PC
machines:
qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
In order to revert the commit, the commit below must be reverted first.
This reverts commit cb05cc1602.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-2-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since commit f10a570b093e6 ("KVM: x86: Add CONFIG_KVM_MAX_NR_VCPUS to allow up to 4096 vCPUs")
Linux kernel can support upto a maximum number of 4096 vcpus when MAXSMP is
enabled in the kernel. At present, QEMU has been tested to correctly boot a
linux guest with 4096 vcpus using the current edk2 upstream master branch that
has the fixes corresponding to the following two PRs:
https://github.com/tianocore/edk2/pull/5410https://github.com/tianocore/edk2/pull/5418
The changes merged into edk2 with the above PRs will be in the upcoming 2024-05
release. With current seabios firmware, it boots fine with 4096 vcpus already.
So bump up the value max_cpus to 4096 for q35 machines versions 9 and newer.
Q35 machines versions 8.2 and older continue to support 1024 maximum vcpus
as before for compatibility reasons.
If KVM is not able to support the specified number of vcpus, QEMU would
return the following error messages:
$ ./qemu-system-x86_64 -cpu host -accel kvm -machine q35 -smp 1728
qemu-system-x86_64: -accel kvm: warning: Number of SMP cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
qemu-system-x86_64: -accel kvm: warning: Number of hotpluggable cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
Number of SMP cpus requested (1728) exceeds the maximum cpus supported by KVM (1024)
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Julia Suvorova <jusual@redhat.com>
Cc: kraxel@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240228143351.3967-1-anisinha@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The spec does not NumVFs is reset after disabling VFs except when
resetting the PF. Clearing it is guest visible and out of spec, even
though Linux doesn't rely on this value being preserved, so we never
noticed.
Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-4-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie_sriov_pf_disable_vfs() is called when resetting the PF, but it only
disables VFs and does not reset SR-IOV extended capability, leaking the
state and making the VF Enable register inconsistent with the actual
state.
Replace pcie_sriov_pf_disable_vfs() with pcie_sriov_pf_reset(), which
does not only disable VFs but also resets the capability.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-3-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
configurations to know the number of VFs being disabled due to SR-IOV
configuration writes, but the logic was flawed and resulted in
out-of-bound memory access.
It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
VFs, but it actually doesn't in the following cases:
- PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
- PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
- VFs were only partially enabled because of realization failure.
It is a responsibility of pcie_sriov to interpret SR-IOV configurations
and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
provides, to get the number of enabled VFs before and after SR-IOV
configuration writes.
Cc: qemu-stable@nongnu.org
Fixes: CVE-2024-26328
Fixes: 11871f53ef ("hw/nvme: Add support for the Virtualization Management command")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-1-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IOAPICCommonClass implements its own private realize(), and this private
realize() allows error.
Since IOAPICCommonClass.realize() returns void, to check the error,
dereference @errp with ERRP_GUARD().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240223085653.1255438-8-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in iommufd_cdev_getfd(), @errp is dereferenced without ERRP_GUARD():
if (*errp) {
error_prepend(errp, VFIO_MSG_PREFIX, path);
}
Currently, since vfio_attach_device() - the caller of
iommufd_cdev_getfd() - is always called in DeviceClass.realize() context
and doesn't get the NULL @errp parameter, iommufd_cdev_getfd()
hasn't triggered the bug that dereferencing the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
iommufd_cdev_getfd().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-7-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_usp_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_cap;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And since
cxl_usp_realize() - as a PCIDeviceClass.realize() method - doesn't get
the NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_usp_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-6-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in trng_prop_fault_event_set, @errp is dereferenced without
ERRP_GUARD():
visit_type_uint32(v, name, events, errp);
if (*errp) {
return;
}
Currently, since trng_prop_fault_event_set() doesn't get the NULL @errp
parameter as a "set" method of object property, it hasn't triggered the
bug that dereferencing the NULL @errp.
And since visit_type_uint32() returns bool, check the returned bool
directly instead of dereferencing @errp, then we needn't the add missing
ERRP_GUARD().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240223085653.1255438-5-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in ct3_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_free_special_ops;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And
ct3_realize() - as a PCIDeviceClass.realize() method - doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
ct3_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in macfb_nubus_realize(), @errp is dereferenced without
ERRP_GUARD():
ndc->parent_realize(dev, errp);
if (*errp) {
return;
}
Here we check *errp, because the ndc->parent_realize(), as a
DeviceClass.realize() callback, returns void. And since
macfb_nubus_realize(), also as a DeviceClass.realize(), doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing the
NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
macfb_nubus_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-3-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_fixed_memory_window_config(), @errp is dereferenced in 2
places without ERRP_GUARD():
fw->enc_int_ways = cxl_interleave_ways_enc(fw->num_targets, errp);
if (*errp) {
return;
}
and
fw->enc_int_gran =
cxl_interleave_granularity_enc(object->interleave_granularity,
errp);
if (*errp) {
return;
}
For the above 2 places, we check "*errp", because neither function
returns a suitable error code. And since machine_set_cfmw() - the caller
of cxl_fixed_memory_window_config() - doesn't get the NULL @errp
parameter as the "set" method of object property,
cxl_fixed_memory_window_config() hasn't triggered the bug that
dereferencing the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_fixed_memory_window_config().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
This patch adds support for VDPA network simulation devices.
The device is developed based on virtio-net and tap backend,
and supports hardware live migration function.
For more details, please refer to "docs/system/devices/vdpa-net.rst"
Signed-off-by: Hao Chen <chenh@yusur.tech>
Message-Id: <20240221073802.2888022-1-chenh@yusur.tech>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Shared objects lack spoofing protection.
For VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE messages
received by the vhost-user interface, any backend was
allowed to remove entries from the shared table just
by knowing the UUID. Only the owner of the entry
shall be allowed to removed their resources
from the table.
To fix that, add a check for all
*SHARED_OBJECT_REMOVE messages received.
A vhost device can only remove TYPE_VHOST_DEV
entries that are owned by them, otherwise skip
the removal, and inform the device that the entry
has not been removed in the answer.
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20240219143423.272012-2-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The payload size returned by command VIRTIO_SND_R_PCM_INFO is
wrong. The code in process_cmd() assumes that all commands
return only a virtio_snd_hdr payload, but some commands like
VIRTIO_SND_R_PCM_INFO may return an additional payload.
Add a zero initialized payload_size variable to struct
virtio_snd_ctrl_command to allow for additional payloads.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240218083351.8524-1-vr_qemu@t-online.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sometimes, certain parts are not being skipped in
vhost_vdpa_listener_region_del, but they are skipped in
vhost_vdpa_listener_region_add, or vice versa. The vhost-vdpa code
expects all parts to maintain their properties, so we're adding a trace
to help with debugging when any part is skipped.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240215103616.330518-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We already use MADV_NORESERVE to deal with sparse memory regions. Let's
also set madvise(MADV_DONTDUMP), otherwise a crash of the process can
result in us allocating all memory in the mmap'ed region for dumping
purposes.
This change implies that the mmap'ed rings won't be included in a
coredump. If ever required for debugging purposes, we could mark only
the mapped rings MADV_DODUMP.
Ignore errors during madvise() for now.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-15-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, we try to remap all rings whenever we add a single new memory
region. That doesn't quite make sense, because we already map rings when
setting the ring address, and panic if that goes wrong. Likely, that
handling was simply copied from set_mem_table code, where we actually
have to remap all rings.
Remapping all rings might require us to walk quite a lot of memory
regions to perform the address translations. Ideally, we'd simply remove
that remapping.
However, let's be a bit careful. There might be some weird corner cases
where we might temporarily remove a single memory region (e.g., resize
it), that would have worked for now. Further, a ring might be located on
hotplugged memory, and as the VM reboots, we might unplug that memory, to
hotplug memory before resetting the ring addresses.
So let's unmap affected rings as we remove a memory region, and try
dynamically mapping the ring again when required.
Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-14-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the past, QEMU would create memory regions that could partially cover
hugetlb pages, making mmap() fail if we would use the mmap_offset as an
fd_offset. For that reason, we never used the mmap_offset as an offset into
the fd and instead always mapped the fd from the very start.
However, that can easily result in us mmap'ing a lot of unnecessary
parts of an fd, possibly repeatedly.
QEMU nowadays does not create memory regions that partially cover huge
pages -- it never really worked with postcopy. QEMU handles merging of
regions that partially cover huge pages (due to holes in boot memory) since
2018 in c1ece84e7c ("vhost: Huge page align and merge").
Let's be a bit careful and not unconditionally convert the
mmap_offset into an fd_offset. Instead, let's simply detect the hugetlb
size and pass as much as we can as fd_offset, making sure that we call
mmap() with a properly aligned offset.
With QEMU and a virtio-mem device that is fully plugged (50GiB using 50
memslots) the qemu-storage daemon process consumes in the VA space
1281GiB before this change and 58GiB after this change.
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags: 0x9
Size: 40
Fds: 59
Adding region 4
guest_phys_addr: 0x0000000200000000
memory_size: 0x0000000040000000
userspace_addr: 0x00007fb73bffe000
old mmap_offset: 0x0000000080000000
fd_offset: 0x0000000080000000
new mmap_offset: 0x0000000000000000
mmap_addr: 0x00007f02f1bdc000
Successfully added new region
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags: 0x9
Size: 40
Fds: 59
Adding region 5
guest_phys_addr: 0x0000000240000000
memory_size: 0x0000000040000000
userspace_addr: 0x00007fb77bffe000
old mmap_offset: 0x00000000c0000000
fd_offset: 0x00000000c0000000
new mmap_offset: 0x0000000000000000
mmap_addr: 0x00007f0284000000
Successfully added new region
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-12-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's speed up GPA to memory region / virtual address lookup. Store the
memory regions ordered by guest physical addresses, and use binary
search for address translation, as well as when adding/removing memory
regions.
Most importantly, this will speed up GPA->VA address translation when we
have many memslots.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-11-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Memory regions cannot overlap, and if we ever hit that case something
would be really flawed.
For example, when vhost code in QEMU decides to increase the size of memory
regions to cover full huge pages, it makes sure to never create overlaps,
and if there would be overlaps, it would bail out.
QEMU commits 48d7c97577 ("vhost: Merge sections added to temporary
list"), c1ece84e7c ("vhost: Huge page align and merge") and
e7b94a84b6 ("vhost: Allow adjoining regions") added and clarified that
handling and how overlaps are impossible.
Consequently, each GPA can belong to at most one memory region, and
everything else doesn't make sense. Let's factor out our search to prepare
for further changes.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-10-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's factor it out, reducing quite some code duplication and perparing
for further changes.
If we fail to mmap a region and panic, we now simply don't add that
(broken) region.
Note that we now increment dev->nregions as we are successfully
adding memory regions, and don't increment dev->nregions if anything went
wrong.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-6-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's support up to 509 mem slots, just like vhost in the kernel usually
does and the rust vhost-user implementation recently [1] started doing.
This is required to properly support memory hotplug, either using
multiple DIMMs (ACPI supports up to 256) or using virtio-mem.
The 509 used to be the KVM limit, it supported 512, but 3 were
used for internal purposes. Currently, KVM supports more than 512, but
it usually doesn't make use of more than ~260 (i.e., 256 DIMMs + boot
memory), except when other memory devices like PCI devices with BARs are
used. So, 509 seems to work well for vhost in the kernel.
Details can be found in the QEMU change that made virtio-mem consume
up to 256 mem slots across all virtio-mem devices. [2]
509 mem slots implies 509 VMAs/mappings in the worst case (even though,
in practice with virtio-mem we won't be seeing more than ~260 in most
setups).
With max_map_count under Linux defaulting to 64k, 509 mem slots
still correspond to less than 1% of the maximum number of mappings.
There are plenty left for the application to consume.
[1] https://github.com/rust-vmm/vhost/pull/224
[2] https://lore.kernel.org/all/20230926185738.277351-1-david@redhat.com/
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-3-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's prepare for increasing VHOST_USER_MAX_RAM_SLOTS by dynamically
allocating dev->regions. We don't have any ABI guarantees (not
dynamically linked), so we can simply change the layout of VuDev.
Let's zero out the memory, just as we used to do.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-2-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix an issue where cancellation of ongoing migration ends up
with no network connectivity.
When canceling migration, SVQ will be switched back to the
passthrough mode, but the right call fd is not programed to
the device and the svq's own call fd is still used. At the
point of this transitioning period, the shadow_vqs_enabled
hadn't been set back to false yet, causing the installation
of call fd inadvertently bypassed.
Message-Id: <1707910082-10243-13-git-send-email-si-wei.liu@oracle.com>
Fixes: a8ac88585d ("vhost: Add Shadow VirtQueue call forwarding capabilities")
Cc: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Will be used in following patches.
DISABLING(-1) means SVQ is being switched off to passthrough
mode.
ENABLING(1) means passthrough VQs are being switched to SVQ.
DONE(0) means SVQ switching is completed.
Message-Id: <1707910082-10243-11-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The fastpath in cpu_physical_memory_sync_dirty_bitmap() to test large
aligned ranges forgot to bring the TCG TLB up to date after clearing
some of the dirty memory bitmap bits. This can result in stores though
the TCG TLB not setting the dirty memory bitmap and ultimately causes
memory corruption / lost updates during migration from a TCG host.
Fix this by calling cpu_physical_memory_dirty_bits_cleared() when
dirty bits have been cleared.
Fixes: aa8dc04477 ("migration: synchronize memory bitmap 64bits at a time")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240219061731.232570-1-npiggin@gmail.com>
[PMD: Split patch in 2: part 2/2, slightly adapt description]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240312201458.79532-4-philmd@linaro.org
Signed-off-by: Peter Xu <peterx@redhat.com>
Xen queue:
* In Xen PCI passthrough, emulate multifunction bit.
* Fix in Xen mapcache.
* Improve performance of kernel+initrd loading in an Xen HVM Direct
Kernel Boot scenario.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEE+AwAYwjiLP2KkueYDPVXL9f7Va8FAmXwZbwACgkQDPVXL9f7
# Va+PhQgAusZBhy3b0hOCCoqC/1ffCE5J2JxUTnN3zN/2FSOe8/kqQYqt4Zk3vi2e
# Eq8FbGupU357eoJSz0gTEPKQ8y+FVBCmFKEHM1PS54TW1yUZchQg4RmlII6+Psoj
# 7u+qC1RqZu/ZQ9f1QZd8YDJ5oVOkfAZYwq5BkWVS6h5gJiQTSkekAXlMNOQBZxz4
# 48fzpokatiJBbyaBGEm6YKEOwkYG76eHhxB4SC0Rgx6zW+EDQpX0s/Lg19SXnj2C
# UOueiPod1GkE+iH6dQFJUSbsnrkAtJZf253bs3BQnoChGiqQLuXn4jC79ffjPzHI
# AKP2+u+bSJ+8C1SdPuoJN6sJIZmOfA==
# =FZ2n
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 14:25:00 GMT
# gpg: using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [marginal]
# gpg: aka "Anthony PERARD <anthony.perard@citrix.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8
# Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF
* tag 'pull-xen-20240312' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm:
i386: load kernel on xen using DMA
xen: Drop out of coroutine context xen_invalidate_map_cache_entry
xen/pt: Emulate multifunction bit in header type
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The file migration code was allowing a possible -1 from a failed call
to dup() to propagate into the new QIOFileChannel::fd before checking
for validity. Coverity doesn't like that, possibly due to the the
lseek(-1, ...) call that would ensue before returning from the channel
creation routine.
Use the newly introduced qio_channel_file_dupfd() to properly check
the return of dup() before proceeding.
Fixes: CID 1539961
Fixes: CID 1539965
Fixes: CID 1539960
Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Link: https://lore.kernel.org/r/20240311233335.17299-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The --target-type and --target-name args are used to construct
the default probe prefix if '--probe-prefix' is not given. The
meson.build will always pass '--probe-prefix', so the other args
are effectively redundant.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20240108171356.1037059-2-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Add missing ERRP_GUARD() statements in functions that need it
* Prefer fast cpu_env() over slower CPU QOM cast macro
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXwPhYRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWHvBAAgKx5LHFjz3xREVA+LkDTQ49mz0lK3s32
# SGvNlIHjiaDGVttVYhVC4sinBWUruG4Lyv/2QN72OJBzn6WUsEUQE3KPH1d7Y3/s
# wS9X7mj70n4kugWJqeIJP5AXSRasHmWoQ4QJLVQRJd6+Eb9jqwep0x7bYkI1de6D
# bL1Q7bIfkFeNQBXaiPWAm2i+hqmT4C1r8HEAGZIjAsMFrjy/hzBEjNV+pnh6ZSq9
# Vp8BsPWRfLU2XHm4WX0o8d89WUMAfUGbVkddEl/XjIHDrUD+Zbd1HAhLyfhsmrnE
# jXIwSzm+ML1KX4MoF5ilGtg8Oo0gQDEBy9/xck6G0HCm9lIoLKlgTxK9glr2vdT8
# yxZmrM9Hder7F9hKKxmb127xgU6AmL7rYmVqsoQMNAq22D6Xr4UDpgFRXNk2/wO6
# zZZBkfZ4H4MpZXbd/KJpXvYH5mQA4IpkOy8LJdE+dbcHX7Szy9ksZdPA+Z10hqqf
# zqS13qTs3abxymy2Q/tO3hPKSJCk1+vCGUkN60Wm+9VoLWGoU43qMc7gnY/pCS7m
# 0rFKtvfwFHhokX1orK0lP/ppVzPv/5oFIeK8YDY9if+N+dU2LCwVZHIuf2/VJPRq
# wmgH2vAn3JDoRKPxTGX9ly6AMxuZaeP92qBTOPap0gDhihYzIpaCq9ecEBoTakI7
# tdFhV0iRr08=
# =NiP4
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 11:35:50 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu: (55 commits)
user: Prefer fast cpu_env() over slower CPU QOM cast macro
target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast macro
target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
target/i386/hvf: Use CPUState typedef
target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
bulk: Call in place single use cpu_env()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce a SPAPR capability cap-nested-papr which enables nested PAPR
API for nested guests. This new API is to enable support for KVM on PowerVM
and the support in Linux kernel has already merged upstream.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The H_GUEST_RUN_VCPU hcall is used to start execution of a Guest VCPU.
The Hypervisor will update the state of the Guest VCPU based on the
input buffer, restore the saved Guest VCPU state, and start its
execution.
The Guest VCPU can stop running for numerous reasons including HCALLs,
hypervisor exceptions, or an outstanding Host Partition Interrupt.
The reason that the Guest VCPU stopped running is communicated through
R4 and the output buffer will be filled in with any relevant state.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
For nested PAPR API, we use SpaprMachineStateNestedGuest struct to store
partition table info, use the same in spapr_get_pate_nested() via
helper.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcalls:
- H_GUEST_GET_STATE which is used to get state of a nested guest or
a guest VCPU. The value field for each element in the request is
destination to be updated to reflect current state on success.
- H_GUEST_SET_STATE which is used to modify the state of a guest or
a guest VCPU. On success, guest (or its VCPU) state shall be
updated as per the value field for the requested element(s).
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Nested PAPR API provides a standard Guest State Buffer (GSB) format
with unique IDs for each guest state element for which get/set state is
supported by the API. Some of the elements are read-only and/or guest-wide.
Introducing additional required GSB elements and helper routines for state
exchange of each of the nested guest state elements for which get/set state
should be supported by the API.
[amachhiw: set the PCR whenever logical PVR is set]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Currently, nested_ppc_state stores a certain set of registers and works
with nested_[load|save]_state() for state transfer as reqd for nested-hv API.
Extending these with additional registers state as reqd for nested PAPR API.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcall H_GUEST_CREATE_VCPU which is used to
create and initialize the specified VCPU resource for the previously
created guest. Each guest can have multiple VCPUs upto max 2048.
All VCPUs for a guest gets deallocated on guest delete.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcalls:
- H_GUEST_CREATE which is used to create and allocate resources for
nested guest being created.
- H_GUEST_DELETE which is used to delete and deallocate resources
for the nested guest being deleted. It also supports deleting all nested
guests at once using a deleteAll flag.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Introduce the nested PAPR hcalls:
- H_GUEST_GET_CAPABILITIES which is used to query the capabilities
of the API and the L2 guests it provides.
- H_GUEST_SET_CAPABILITIES which is used to set the Guest API
capabilities that the Host Partition supports and may use.
[amachhiw: support for p9 compat mode and return register bug fixes]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Adding initial documentation about Nested PAPR API to describe the set
of APIs and its usage. Also talks about the Guest State Buffer elements
and it's format which is used between L0/L1 to communicate L2 state.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
spapr_exit_nested and spapr_get_pate_nested_hv contains code which
is specific to nested-hv API. Isolating code flows based on API
helps extending it to be used with different API as well.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Currently, nested_ptcr is being used by existing nested-hv API to store
nested guest related info. This need to be organised to extend support
for the nested PAPR API which would need to store additional info
related to nested guests in next series of patches.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Most of the nested code has already been moved to spapr_nested.c
This logic inside spapr_get_pate is related to nested guests and
better suited for spapr_nested.c, hence moving there.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Since cap-nested-hv is an optional capability, it makes sense to register
api specfic hcalls only when respective capability is enabled. This
requires to introduce a new API to unregister hypercalls to maintain
sanity across guest reboot since caps are re-applied across reboots and
re-registeration of hypercalls would hit assert otherwise.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
These wrappers call out to handle POWER7 and newer in separate
functions but reduce to the generic case when TARGET_PPC64 is not
defined. It is easy enough to include the switch in the beginning of
the generic functions to branch out to the specific functions and get
rid of these wrappers. This avoids one indirection and entirely
compiles out the switch without TARGET_PPC64.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Concatenate #if blocks that are ending then beginning on the next line
again.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Remove check for !defined(CONFIG_USER_ONLY) as this is already within
an #ifndef CONFIG_USER_ONLY block.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use #ifdef, #ifndef for brevity and add comments to #endif that are
more than a few lines apart for clarity.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add gen_exception_err_nip() that does the same as gen_exception_err()
but takes the nip as a parameter to allow specifying it instead of
using the current instruction address then change gen_exception_err()
to use it.
The gen_exception() and gen_exception_nip() functions are similar so
remove code duplication from those too while at it.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Improve readability by shortening some long comments, removing
comments that state the obvious and dropping some empty lines so they
don't distract when reading the code.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Use the env_cpu function to get the CPUState for cpu_abort. These are
only needed in case of fatal errors so this allows to avoid casting
and storing CPUState in a local variable wnen not needed.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Big (SMT8) cores have a complicated function to map the core, thread ID
to pervasive topology (PIR). Fix this for power8, power9, and power10.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Caleb Schlossin <calebs@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Currently in tcg mode, when reading from power10 pmu spr like MMCR3,
qemu logs this message (when starting qemu with -d guest_errors)
Trying to read invalid spr 754 (0x2f2) at 0000000030056bb0
This is becuase, no read/write call-backs are registered for
these SPRs. Add support to register generic read/write
functions to these power10 pmu sprs to fix it.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This patch moves the below instructions to decodetree specification:
{add, subf}[c,e,me,ze][o][.] : XO-form
addic[.], subfic : D-form
addex : Z23-form
This patch introduces XO form instructions into decode tree
specification, for which all the four variations([o][.]) have been
handled with a single pattern. The changes were verified by validating
that the tcg ops generated by those instructions remain the same, which
were captured with the '-d in_asm,op' flag.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Documentation on how to run Linux on the amigaone, pegasos2 and
sam460ex machines is currently buried in the depths of the qemu-devel
mailing list and in the source code. Let's collect the information in
the QEMU handbook for a one stop solution.
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Co-authored-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
pSeries machines before 3.0 have complex migration back
compatibility code we'd like to get ride of. The last
one is 2.12, which is 6 years old. We just deprecated up
to the 2.11 machine in commit 1392617d35 ("spapr: Tag
pseries-2.1 - 2.11 machines as deprecated").
Take to opportunity to also deprecate the 2.12 machines.
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
PPC maintainership has been a side activity for the last 2 years and
it is time to let go some of it now that Nick has taken over.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Copy the pa-features arrays from spapr, adjusting slightly as
described in comments.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This allows different pa-features for powernv8/9/10.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Add POWER10 pa-features entry.
Notably DEXCR and [P]HASHST/[P]HASHCHK instruction support is
advertised. Each DEXCR aspect is allocated a bit in the device tree,
using the 68--71 byte range (inclusive). The functionality of the
[P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
bit 0 (BE).
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
[npiggin: reword title and changelog, adjust a few bits]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
"MMR" and "SPR SO" are not implemented in POWER9, so clear those bits.
HTM is not set by default, and only later if the cap is set, so remove
the comment that suggests otherwise.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
TCG does not support copy/paste instructions. Remove it from
ibm,pa-features. This has never been implemented under TCG or
practically usable under KVM, so it won't be missed.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
SAO is a page table attribute that strengthens the memory ordering of
accesses. QEMU with MTTCG does not implement this, so clear it in
ibm,pa-features. This is an obscure feature that has been removed from
POWER10 ISA v3.1, there isn't much concern with removing it.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER10 hardware implements a degenerate transactional memory facility
in POWER8/9 PCR compatibility modes to permit migration from older
CPUs, but POWER10 / ISA v3.1 mode does not support it so the CPU model
should not support it.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of
any use in QEMU. Remove them.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The initial MSR state for the OpenFirmware binding specifies
MSR[ME] and MSR[FP] are set.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Prevent guest state modifying the MSR[ME] bit. Per ISA:
An attempt to modify MSR[ME] in privileged but non-hypervisor state
is ignored (i.e., the bit is not changed).
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Commit 1901b4967c ("hw/block/nvme: move msix table and pba to BAR 0")
moved the MSI-X table and PBA to BAR 0 to make room for enabling CMR and
PMR at the same time. As reported by Julien Grall in #2184, this breaks
migration through system hibernation.
Add a machine compatibility parameter and set it on machines pre 6.0 to
enable the old behavior automatically, restoring the hibernation
migration support.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2184
Fixes: 1901b4967c ("hw/block/nvme: move msix table and pba to BAR 0")
Reported-by: Julien Grall julien@xen.org
Tested-by: Julien Grall julien@xen.org
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Generalize the mbar size helper such that it can handle cases where the
MSI-X table and PBA are expected to be in an exclusive bar.
Cc: qemu-stable@nongnu.org
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
This patch adds a way to specify an NGUID for a given NVMe Namespace using a
string of hexadecimal digits with an optional '-' separator to group bytes. For
instance:
-device nvme-ns,nguid="e9accd3b83904e13167cf0593437f57d"
If provided, the NGUID will be part of the Namespace Identification Descriptor
list and the Identify Namespace data.
Signed-off-by: Roque Arcudia Hernandez <roqueh@google.com>
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
My colleague, Jesper, will be assiting with hw/nvme related reviews. Add
him with R: so he gets automatically bugged going forward.
Cc: Jesper Devantier <foss@defmacro.it>
Acked-by: Jesper Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The number of logical blocks within a source range is converted into a
1s based number at the time of parsing. However, when verifying the copy
length we add one again, causing the check against MCL to fail in error.
Cc: qemu-stable@nongnu.org
Fixes: 381ab99d85 ("hw/nvme: check maximum copy length (MCL) for COPY")
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Currently, when a VF is created, it uses the 'params' object of the PF
as it is. In other words, the 'params.serial' string memory area is also
shared. In this situation, if the VF is removed from the system, the
PF's 'params.serial' object is released with object_finalize() followed
by object_property_del_all() which release the memory for 'serial'
property. If that happens, the next VF created will inherit a serial
from a corrupted memory area.
If this happens, an error will occur when comparing subsys->serial and
n->params.serial in the nvme_subsys_register_ctrl() function.
Cc: qemu-stable@nongnu.org
Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV")
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
xen_invalidate_map_cache_entry is not expected to run in a
coroutine. Without this, there is crash:
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
threadid=<optimized out>) at pthread_kill.c:78
at /usr/src/debug/glibc/2.38+git-r0/sysdeps/posix/raise.c:26
fmt=0xffff9e1ca8a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0xaaaae0d25740 "!qemu_in_coroutine()",
file=file@entry=0xaaaae0d301a8 "../qemu-xen-dir-remote/block/graph-lock.c", line=line@entry=260,
function=function@entry=0xaaaae0e522c0 <__PRETTY_FUNCTION__.3> "bdrv_graph_rdlock_main_loop") at assert.c:92
assertion=assertion@entry=0xaaaae0d25740 "!qemu_in_coroutine()",
file=file@entry=0xaaaae0d301a8 "../qemu-xen-dir-remote/block/graph-lock.c", line=line@entry=260,
function=function@entry=0xaaaae0e522c0 <__PRETTY_FUNCTION__.3> "bdrv_graph_rdlock_main_loop") at assert.c:101
at ../qemu-xen-dir-remote/block/graph-lock.c:260
at /home/Freenix/work/sw-stash/xen/upstream/tools/qemu-xen-dir-remote/include/block/graph-lock.h:259
host=host@entry=0xffff742c8000, size=size@entry=2097152)
at ../qemu-xen-dir-remote/block/io.c:3362
host=0xffff742c8000, size=2097152)
at ../qemu-xen-dir-remote/block/block-backend.c:2859
host=<optimized out>, size=<optimized out>, max_size=<optimized out>)
at ../qemu-xen-dir-remote/block/block-ram-registrar.c:33
size=2097152, max_size=2097152)
at ../qemu-xen-dir-remote/hw/core/numa.c:883
buffer=buffer@entry=0xffff743c5000 "")
at ../qemu-xen-dir-remote/hw/xen/xen-mapcache.c:475
buffer=buffer@entry=0xffff743c5000 "")
at ../qemu-xen-dir-remote/hw/xen/xen-mapcache.c:487
as=as@entry=0xaaaae1ca3ae8 <address_space_memory>, buffer=0xffff743c5000,
len=<optimized out>, is_write=is_write@entry=true,
access_len=access_len@entry=32768)
at ../qemu-xen-dir-remote/system/physmem.c:3199
dir=DMA_DIRECTION_FROM_DEVICE, len=<optimized out>,
buffer=<optimized out>, as=0xaaaae1ca3ae8 <address_space_memory>)
at /home/Freenix/work/sw-stash/xen/upstream/tools/qemu-xen-dir-remote/include/sysemu/dma.h:236
elem=elem@entry=0xaaaaf620aa30, len=len@entry=32769)
at ../qemu-xen-dir-remote/hw/virtio/virtio.c:758
elem=elem@entry=0xaaaaf620aa30, len=len@entry=32769, idx=idx@entry=0)
at ../qemu-xen-dir-remote/hw/virtio/virtio.c:919
elem=elem@entry=0xaaaaf620aa30, len=32769)
at ../qemu-xen-dir-remote/hw/virtio/virtio.c:994
req=req@entry=0xaaaaf620aa30, status=status@entry=0 '\000')
at ../qemu-xen-dir-remote/hw/block/virtio-blk.c:67
ret=0) at ../qemu-xen-dir-remote/hw/block/virtio-blk.c:136
at ../qemu-xen-dir-remote/block/block-backend.c:1559
--Type <RET> for more, q to quit, c to continue without paging--
at ../qemu-xen-dir-remote/block/block-backend.c:1614
i1=<optimized out>) at ../qemu-xen-dir-remote/util/coroutine-ucontext.c:177
at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20240124021450.21656-1-peng.fan@oss.nxp.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
The intention of the code appears to have been to unconditionally set
the multifunction bit but since the emulation mask is 0x00 it has no
effect. Instead, emulate the bit and set it based on the multifunction
property of the PCIDevice (which can be set using QAPI).
This allows making passthrough devices appear as functions in a Xen
guest.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20231103172601.1319375-1-ross.lagerwall@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
When converting test vs UINT32_MAX to compare vs 0, we need to
adjust the condition to match.
Fixes: 34aff3c2e0 ("tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass the type to tcg_out_logicali; remove the assert, duplicated
at the start of tcg_out_logicali.
Fixes: 339adf2f38 ("tcg/aarch64: Support TCG_COND_TST{EQ,NE}")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The current post-loading code for scanout has a FIXME: it doesn't take
the resource region/rect into account. But there is more, when adding
blob migration support in commit f66767f75c, I didn't realize that blob
resources could be used for scanouts. This situationn leads to a crash
during post-load, as they don't have an associated res->image.
virtio_gpu_do_set_scanout() handle all cases, but requires the
associated virtio_gpu_framebuffer, which is currently not saved during
migration.
Add a v2 of "virtio-gpu-one-scanout" with the framebuffer fields, so we
can restore blob scanouts, as well as fixing the existing FIXME.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Sebastian Ott <sebott@redhat.com>
The "Listener" connection, being private and under the control of the
qemu display, allows for the optimization of discarding pending
intermediary messages when queuing a new scanout. This ensures that the
client receives only the latest scanout update, improving communication
efficiency.
While the current implementation does not provide a mechanism for
clients who may wish to receive all updates, making this behavior
optional could be considered in the future. For now, adopting this new
default behavior accelerates the communication process without a
guarantee of delivering all updates.
The filter is removed when the connection is dropped.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmXwPUAACgkQ7wSWWzmN
# YhFnIwgAgctDniJwlRxXB01eVlzXz7IulHnpSby07XEJxENSpGB8ufaeE4eK5gJy
# NVK6C2+1EU2vRxm4oIdcvtN4C4/jtRbYYjiSTx7eE4FmSkqshSnR5XCV72LDqG3i
# WbzInjMvYfysmcMXLfrWgxOnVew9WqEzlpEWlc7FfNKnkzBVf+JDztfqCUx0XM7H
# qefw4ImjqQw993QxJpipXC7aEGUyouB0RIBB71FkCa9ihlh9x7W68evbOI/jTn5q
# HWuStgS02sKHjRFliMbdbMY77FNUz4Yroo/GKSvGt64atxkQSJqPNAV+/9n18LNy
# QAH5eK6cXFPOIAaYpADU5kHDVVAFiw==
# =iBdx
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 11:32:16 GMT
# gpg: using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
ebpf: Updated eBPF program and skeleton.
qmp: Added new command to retrieve eBPF blob.
virtio-net: Added property to load eBPF RSS with fds.
ebpf: Added eBPF initialization by fds.
ebpf: Added eBPF map update through mmap.
Avoid unaligned fetch in ladr_match()
e1000e: fix link state on resume
igb: fix link state on resume
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Misc HW patch queue
- Rename hw/ide/ahci-internal.h for consistency (Zoltan)
- More convenient PCI hotplug trace events (Vladimir)
- Short CLI option to add drives for sam460ex machine (Zoltan)
- More missing ERRP_GUARD() macros (Zhao)
- Avoid faulting when unmapped I/O BAR is accessed on SPARC EBUS (Mark)
- Remove unused includes in hw/core/ (Zhao)
- New PCF8574 GPIO over I2C model (Dmitriy)
- Require ObjC on Darwin macOS by default (Peter)
- Corrected "-smp parameter=1" placement in docs/ (Zhao)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmXwEJkACgkQ4+MsLN6t
# wN4A3hAAngVu7VmyrfqYF6jfDMUuRGYaKf4D73/KF6R1PsU+nJdN7UAkECLj8o7g
# mkcAQu1U3fKCUssF6MJ2a3kU+rD1OkkA/ZcitzgWwEjCK8KVjtMt2HzEqX+B/X+e
# RUVjXMOMkyV48MF0+yLhJz+lQiDpEBFVxIgssPBNUz1Pw9IfoXp29Bfz+bYBThS4
# ywAdvCefNzSira0Nt6RWTnvgBHB/1+aLy1uMSt0Xu926zcqoxQJ0b//0flYL8vAf
# JuSSZuiXPw+oAc3qG3d6aPl3g8DrFn3pvPD471KlFQAnB0dlhEZZqNBPvraySpHl
# h04Y8teHYj9XfxPtaWfaEdgQCazdkKFR/q7E5c9GU00Rf469BJeuo9Pzkm4kWfbU
# sbCl8em5biVZ5DpBIOMT3/D0JOyGf7/CM8y5c3Jc92hapx2NdSszkvCicrDE1+i0
# zEr4N0P/F2x5KFVFkQ3Xzv2Jtzw+iXj6kSE5a5/64GMK29Mqu/EPaSkvwGDQOs3N
# QJ9mpa4gg47g310a0/nH0i5eVbvGVuzcCMP6VXOBVr18cJ7JFQFFiYcvoTDXNQ2m
# sq5xUelRimnWfKpawomJXkS+/j0usH61/aQBuDKfj45i8/XFRejCIk0gMWQ9hjyD
# no1HqDN8CVXtiPNSinC7ctNHU5ClS0xO/BRl0h3PGC7Bl+A2eVY=
# =JQg1
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 08:21:45 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'hw-misc-20240312' of https://github.com/philmd/qemu:
docs/about/deprecated.rst: Move SMP configurations item to system emulator section
meson.build: Always require an objc compiler on macos hosts
hw/gpio: introduce pcf8574 driver
hw/core: Cleanup unused included headers in numa.c
hw/core: Cleanup unused included header in machine-qmp-cmds.c
hw/core: Cleanup unused included headers in cpu-common.c
sun4u: remap ebus BAR0 to use unassigned_io_ops instead of alias to PCI IO space
hw/misc/ivshmem: Fix missing ERRP_GUARD() for error_prepend()
hw/core/qdev-properties-system: Fix missing ERRP_GUARD() for error_prepend()
hw/core/loader-fit: Fix missing ERRP_GUARD() for error_prepend()
hw/ppc/sam460ex: Support short options for adding drives
hw/pci: add some convenient trace-events for pcie and shpc hotplug
hw/ide/ahci: Rename ahci_internal.h to ahci-internal.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
query-cpu-model-expansion takes a CpuModelInfo argument. The
loongarch version of the command silently ignores the argument's
member @props. For instance,
{"execute": "query-cpu-model-expansion", "arguments": {"type": "static", "model": {"name": "la464", "props": null}}}
and
{"execute": "query-cpu-model-expansion", "arguments": {"type": "static", "model": {"name": "la464", "props": {"prop": null}}}}
succeed.
Add skeleton code for property processing that recognizes no
properties. Now the two commands fail as they should:
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'model.props', expected: object"}}
and
{"error": {"class": "GenericError", "desc": "Parameter 'model.props.prop' is unexpected"}}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-5-armbru@redhat.com>
[Drop #include now superfluous]
query-cpu-model-comparison, query-cpu-model-baseline, and
query-cpu-model-expansion take CpuModelInfo arguments. Errors in
@props members of these arguments are reported for 'props', without
further context. For instance, s390x rejects
{"execute": "query-cpu-model-comparison", "arguments": {"modela": {"name": "z13", "props": {}}, "modelb": {"name": "z14", "props": []}}}
with
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'props', expected: object"}}
This is unusual; the common QAPI unmarshaling machinery would complain
about 'modelb.props'. Our hand-written code to visit the @props
member neglects to provide the context.
Tweak it so it provides it. The command above now fails with
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'modelb.props', expected: dict"}}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-4-armbru@redhat.com>
Acked-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
CpuModelInfo member @props is semantically a mapping from name to
value, and syntactically a JSON object on the wire. This translates
to QDict in C. Since the QAPI schema language lacks the means to
express 'object', we use 'any' instead. This is QObject in C.
Commands taking a CpuModelInfo argument need to check the QObject is a
QDict.
The i386 version of qmp_query_cpu_model_expansion() fails to check.
Instead, @props is silently ignored when it's not an object. For
instance,
{"execute": "query-cpu-model-expansion", "arguments": {"type": "full", "model": {"name": "qemu64", "props": null}}}
succeeds.
Fix by refactoring the code to match the other targets. Now the
command fails as it should:
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 'props', expected: object"}}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-3-armbru@redhat.com>
CpuModelInfo member @props is semantically a mapping from name to
value, and syntactically a JSON object on the wire. This translates
to QDict in C. Since the QAPI schema language lacks the means to
express 'object', we use 'any' instead. This is QObject in C.
Commands taking a CpuModelInfo argument need to check the QObject is a
QDict.
For arm, riscv, and s390x, the code checks right before passing the
QObject to visit_start_struct(). visit_start_struct() then checks
again.
Delete the first check.
The error message for @props that are not an object changes slightly
to the the message we get for this kind of type error in other
contexts. Minor improvement.
Additionally, error messages about members of @props now refer to
'props.prop-name' instead of just 'prop-name'. Another minor
improvement.
Both changes are visible in tests/qtest/arm-cpu-features.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240305145919.2186971-2-armbru@redhat.com>
Acked-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[Drop #include now superfluous]
Migration pull request
- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on additional check on precopy load error
- Fabiano's fix on fdatasync() hang in mapped-ram
- Jonathan's fix on vring cached access over MMIO regions
- Cedric's cleanup patches 1-4 out of his error report series
- Yu's fix for RDMA migration (which used to be broken even for 8.2)
- Anthony's small cleanup/fix on err message
- Steve's patches on privatize migration.h
- Xiang's patchset to enable zero page detections in multifd threads
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv
# 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB
# =xavb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
migration/multifd: Add new migration test cases for legacy zero page checking.
migration/multifd: Enable multifd zero page checking by default.
migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.
migration/multifd: Implement zero page transmission on the multifd thread.
migration/multifd: Add new migration option zero-page-detection.
migration/multifd: Allow clearing of the file_bmap from multifd
migration/multifd: Allow zero pages in file migration
migration: purge MigrationState from public interface
migration: delete unused accessors
migration: privatize colo interfaces
migration: migration_file_set_error
migration: migration_is_device
migration: migration_thread_is_self
migration: export vcpu_dirty_limit_period
migration: export migration_is_running
migration: export migration_is_active
migration: export migration_is_setup_or_active
migration: remove migration.h references
migration: export fewer options
migration: Fix format in error message
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Updated section name, so libbpf should init/gues proper
program type without specifications during open/load.
Also, added map_flags with explicitly declared BPF_F_MMAPABLE.
Added check for BPF_F_MMAPABLE flag to meson script and
requirements to libbpf version.
Also changed fragmentation flag check - some TCP/UDP packets
may be considered fragmented if DF flag is set.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now, the binary objects may be retrieved by id.
It would require for future qmp commands that may require specific
eBPF blob.
Added command "request-ebpf". This command returns
eBPF program encoded base64. The program taken from the
skeleton and essentially is an ELF object that can be
loaded in the future with libbpf.
The reason to use the command to provide the eBPF object
instead of a separate artifact was to avoid issues related
to finding the eBPF itself. eBPF object is an ELF binary
that contains the eBPF program and eBPF map description(BTF).
Overall, eBPF object should contain the program and enough
metadata to create/load eBPF with libbpf. As the eBPF
maps/program should correspond to QEMU, the eBPF can't
be used from different QEMU build.
The first solution was a helper that comes with QEMU
and loads appropriate eBPF objects. And the issue is
to find a proper helper if the system has several
different QEMUs installed and/or built from the source,
which helpers may not be compatible.
Another issue is QEMU updating while there is a running
QEMU instance. With an updated helper, it may not be
possible to hotplug virtio-net device to the already
running QEMU. Overall, requesting the eBPF object from
QEMU itself solves possible failures with acceptable effort.
Links:
[PATCH 3/5] qmp: Added the helper stamp check.
https://lore.kernel.org/all/20230219162100.174318-4-andrew@daynix.com/
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
eBPF RSS program and maps may now be passed during initialization.
Initially was implemented for libvirt to launch qemu without permissions,
and initialized eBPF program through the helper.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It allows using file descriptors of eBPF provided
outside of QEMU.
QEMU may be run without capabilities for eBPF and run
RSS program provided by management tool(g.e. libvirt).
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Changed eBPF map updates through mmaped array.
Mmaped arrays provide direct access to map data.
It should omit using bpf_map_update_elem() call,
which may require capabilities that are not present.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
that sets link_down to false, and thus activates the link even
if we have disabled it.
The problem can be reproduced starting qemu in paused state (-S) and
then set the link to down. When we resume the machine the link appears
to be up.
Reproducer:
# qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
{"execute": "qmp_capabilities" }
{"execute": "set_link", "arguments": {"name": "net0", "up": false}}
{"execute": "cont" }
To fix the problem, merge the content of e1000e_vm_state_change()
into e1000e_core_post_load() as e1000 does.
Buglink: https://issues.redhat.com/browse/RHEL-21867
Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
On resume igb_vm_state_change() always calls igb_autoneg_resume()
that sets link_down to false, and thus activates the link even
if we have disabled it.
The problem can be reproduced starting qemu in paused state (-S) and
then set the link to down. When we resume the machine the link appears
to be up.
Reproducer:
# qemu-system-x86_64 ... -device igb,netdev=netdev0,id=net0 -S
{"execute": "qmp_capabilities" }
{"execute": "set_link", "arguments": {"name": "net0", "up": false}}
{"execute": "cont" }
To fix the problem, merge the content of igb_vm_state_change()
into igb_core_post_load() as e1000 does.
Buglink: https://issues.redhat.com/browse/RHEL-21867
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Cc: akihiko.odaki@daynix.com
Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since CPU() macro is a simple cast, the following are equivalent:
Object *obj;
CPUState *cs = CPU(obj)
In order to ease static analysis when running
scripts/coccinelle/cpu_env.cocci from the previous commit,
replace:
- CPU_GET_CLASS(cpu);
+ CPU_GET_CLASS(obj);
Most code use the 'cs' variable name for CPUState handle.
Replace few 's' -> 'cs' to unify cpu_reset_hold() style.
No logical change in this patch.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240129164514.73104-7-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Avoid CPUArchState local variable when cpu_env() is used once.
Mechanical patch using the following Coccinelle spatch script:
@@
type CPUArchState;
identifier env;
expression cs;
@@
{
- CPUArchState *env = cpu_env(cs);
... when != env
- env
+ cpu_env(cs)
... when != env
}
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240129164514.73104-5-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When a variable is initialized to &struct->field, use it
in place. Rationale: while this makes the code more concise,
this also helps static analyzers.
Mechanical change using the following Coccinelle spatch script:
@@
type S, F;
identifier s, m, v;
@@
S *s;
...
F *v = &s->m;
<+...
- &s->m
+ v
...+>
Inspired-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240129164514.73104-2-philmd@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
[thuth: Dropped hunks that need a rebase, and fixed sizeof() in pmu_realize()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since the commit 05e385d2a9 ("error: Move ERRP_GUARD() to the beginning
of the function"), there are new codes that don't put ERRP_GUARD() at
the beginning of the functions.
As stated in the commit 05e385d2a9: "include/qapi/error.h advises to put
ERRP_GUARD() right at the beginning of the function, because only then
can it guard the whole function.", so clean up the few spots
disregarding the advice.
Inspired-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312060337.3240965-1-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In target/s390x/cpu_models.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- check_compatibility()
- s390_realize_cpu_model()
Though both their @errp parameters point to their callers' local @err
virables and don't cause the issue as [1] said, to follow the
requirement of @errp, also add missing ERRP_GUARD() at their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: David Hildenbrand <david@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-30-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The net_init_vhost_vdpa() passes @errp to error_prepend(), and as a
member of net_client_init_fun[], it's called in net_client_init1() and
gets @errp from this caller.
But because netdev_init_modern() passes &error_fatal to
net_client_init1(), then @errp parameter of net_init_vhost_vdpa() would
point to @error_fatal. This causes the error message in error_prepend()
to be lost because of the above issue.
To fix this, add missing ERRP_GUARD() at the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240311033822.3142585-29-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The migrate_params_check() passes @errp to error_prepend() without
ERRP_GUARD(), and it could be called from migration_object_init(),
where the passed @errp points to @error_fatal.
Therefore, the error message echoed in error_prepend() will be lost
because of the above issue.
To fix this, add missing ERRP_GUARD() at the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Peter Xu <peterx@redhat.com>
Cc: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
Message-ID: <20240311033822.3142585-28-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/virtio/vhost.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vhost_save_backend_state()
- vhost_load_backend_state()
Their @errp both points to callers' @local_err. However, as the APIs
defined in include/hw/virtio/vhost.h, it is necessary to protect their
@errp with ERRP_GUARD().
To follow the requirement of @errp, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-27-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vhost_vsock_device_realize() passes @errp to error_prepend(), and as
a VirtioDeviceClass.realize method, its @errp is from
DeviceClass.realize so that there is no guarantee that the @errp won't
point to @error_fatal.
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-26-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vfio_platform_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-25-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/vfio/pci.c, there are 2 functions passing @errp to error_prepend()
without ERRP_GUARD():
- vfio_add_std_cap()
- vfio_realize()
The @errp of vfio_add_std_cap() is also from vfio_realize(). And
vfio_realize(), as a PCIDeviceClass.realize method, its @errp is from
DeviceClass.realize so that there is no guarantee that the @errp won't
point to @error_fatal.
To avoid the issue like [1] said, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-24-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/vfio/pci-quirks.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vfio_add_nv_gpudirect_cap()
- vfio_add_vmd_shadow_cap()
There are too many possible callers to check the impact of this defect;
it may or may not be harmless. Thus it is necessary to protect their
@errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-23-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The iommufd_cdev_getfd() passes @errp to error_prepend(). Its @errp is
from vfio_attach_device(), and there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-22-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/vfio/helpers.c, there are 3 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vfio_set_irq_signaling()
- vfio_device_get_name()
- vfio_device_set_fd()
There are too many possible callers to check the impact of this defect;
it may or may not be harmless. Thus it is necessary to protect their
@errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-21-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vfio_get_group() passes @errp to error_prepend(). Its @errp is
from vfio_attach_device(), and there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-20-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vfio_ap_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Herne <jjherne@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-19-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vhost_scsi_realize() passes @errp to error_prepend(), and as a
VirtioDeviceClass.realize method, its @errp is from DeviceClass.realize
so that there is no guarantee that the @errp won't point to
@error_fatal.
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-18-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The virtio_blk_vq_aio_context_init() passes @errp to error_prepend().
Though its @errp points its caller's local @err variable, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of
virtio_blk_vq_aio_context_init().
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: "Michael S. Tsirkin" <mst@redhat.com>
Message-ID: <20240311033822.3142585-14-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vmdk_parse_extents() passes @errp to error_prepend(), and its @errp
is from vmdk_open().
Though, vmdk_open(), as a BlockDriver.bdrv_open(), gets the @errp
parameter which is pointer of its caller's local_err, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Fam Zheng <fam@euphon.net>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-13-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The vdi_co_do_create() passes @errp to error_prepend() without
ERRP_GUARD(), and its @errp parameter is so widely sourced that it is
necessary to protect it with ERRP_GUARD().
To avoid the potential issues as [1] said, add missing ERRP_GUARD() at
the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Weil <sw@weilnetz.de>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-12-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In block/snapshot.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- bdrv_all_delete_snapshot()
- bdrv_all_goto_snapshot()
As the APIs exposed in include/block/snapshot.h, they could be called
by other modules.
To avoid potential issues as [1] said, add missing ERRP_GUARD() at the
beginning of these 2 functions.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-11-zhao1.liu@linux.intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The bdrv_qed_co_invalidate_cache() passes @errp to error_prepend()
without ERRP_GUARD().
Though it is a BlockDriver.bdrv_co_invalidate_cache() method, and
currently its @errp parameter only points to callers' local_err, to
follow the requirement of @errp, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240311033822.3142585-10-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In block/qcow2.c, there are 2 functions passing @errp to error_prepend()
without ERRP_GUARD():
- qcow2_co_create()
- qcow2_co_truncate()
There are too many possible callers to check the impact of the defect;
it may or may not be harmless. Thus it is necessary to protect @errp with
ERRP_GUARD().
Therefore, to avoid the issue like [1] said, add missing ERRP_GUARD() at
their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240311033822.3142585-9-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The qcow2_co_can_store_new_dirty_bitmap() passes @errp to
error_prepend(). As a BlockDriver.bdrv_co_can_store_new_dirty_bitmap
method, it's called by bdrv_co_can_store_new_dirty_bitmap().
Its caller is not being called anywhere, but as the API in
include/block/block-io.h, we can't ensure what kind of @errp future
users will pass in.
To avoid potential issues as [1] said, add missing ERRP_GUARD() at the
beginning of qcow2_co_can_store_new_dirty_bitmap().
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240311033822.3142585-8-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In nvme.c, there are 3 functions passing @errp to error_prepend()
without ERRP_GUARD():
- nvme_init_queue()
- nvme_create_queue_pair()
- nvme_identify()
All these 3 functions take their @errp parameters from the
nvme_file_open(), which is a BlockDriver.bdrv_nvme() method and its
@errp points to its caller's local_err.
Though these 3 cases haven't trigger the issue like [1] said, to
follow the requirement of @errp, add missing ERRP_GUARD() at their
beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240311033822.3142585-7-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The nbd_co_do_receive_one_chunk() passes @errp to error_prepend()
without ERRP_GUARD(), and though its @errp parameter points to its
caller's local_err, to follow the requirement of @errp, add missing
ERRP_GUARD() at the beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240311033822.3142585-6-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The cbw_open() passes @errp to error_prepend() without ERRP_GUARD().
Though it is the BlockDriver.bdrv_open() method, and currently its
@errp parameter only points to callers' local_err, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: John Snow <jsnow@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20240311033822.3142585-5-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In block.c, there are 4 functions passing @errp to error_prepend()
without ERRP_GUARD():
- bdrv_co_create_opts_simple()
- parse_json_filename()
- bdrv_open_backing_file()
- bdrv_append_temp_snapshot()
bdrv_co_create_opts_simple(), is an implementation of
BlockDriver.bdrv_co_create_opts(). There are too many possible callers
to check the impact of this defect; it may or may not be harmless. Thus
it is necessary to protect @errp with ERRP_GUARD().
Though the @errp parameters passed to parse_json_filename(),
bdrv_open_backing_file() and bdrv_append_temp_snapshot() points to their
callers' local_err, to follow the requirement of @errp, also add missing
ERRP_GUARD() at their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240311033822.3142585-4-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The iommufd_backend_set_fd() passes @errp to error_prepend(), to avoid
the above issue, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Yi Liu <yi.l.liu@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-ID: <20240311033822.3142585-3-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The error_vprepend() should use ERRP_GUARD() just as the documentation
of ERRP_GUARD() says:
> It must be used when the function dereferences @errp or passes
> @errp to error_prepend(), error_vprepend(), or error_append_hint().
Considering that error_vprepend() is also an API provided in error.h,
it is necessary to add it to the description of the rules for using
ERRP_GUARD().
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <michael.roth@amd.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240311033822.3142585-2-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
IOAPICCommonClass implements its own private realize(), and this private
realize() allows error.
Since IOAPICCommonClass.realize() returns void, to check the error,
dereference @errp with ERRP_GUARD().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223085653.1255438-8-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_usp_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_cap;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And since
cxl_usp_realize() - as a PCIDeviceClass.realize() method - doesn't get
the NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_usp_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240223085653.1255438-6-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in trng_prop_fault_event_set, @errp is dereferenced without
ERRP_GUARD():
visit_type_uint32(v, name, events, errp);
if (*errp) {
return;
}
Currently, since trng_prop_fault_event_set() doesn't get the NULL @errp
parameter as a "set" method of object property, it hasn't triggered the
bug that dereferencing the NULL @errp.
And since visit_type_uint32() returns bool, check the returned bool
directly instead of dereferencing @errp, then we needn't the add missing
ERRP_GUARD().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223085653.1255438-5-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in ct3_realize(), @errp is dereferenced without ERRP_GUARD():
cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
goto err_free_special_ops;
}
Here we check *errp, because cxl_doe_cdat_init() returns void. And
ct3_realize() - as a PCIDeviceClass.realize() method - doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing
the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
ct3_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240223085653.1255438-4-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in macfb_nubus_realize(), @errp is dereferenced without
ERRP_GUARD():
ndc->parent_realize(dev, errp);
if (*errp) {
return;
}
Here we check *errp, because the ndc->parent_realize(), as a
DeviceClass.realize() callback, returns void. And since
macfb_nubus_realize(), also as a DeviceClass.realize(), doesn't get the
NULL @errp parameter, it hasn't triggered the bug that dereferencing the
NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
macfb_nubus_realize().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240223085653.1255438-3-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As the comment in qapi/error, dereferencing @errp requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.
But in cxl_fixed_memory_window_config(), @errp is dereferenced in 2
places without ERRP_GUARD():
fw->enc_int_ways = cxl_interleave_ways_enc(fw->num_targets, errp);
if (*errp) {
return;
}
and
fw->enc_int_gran =
cxl_interleave_granularity_enc(object->interleave_granularity,
errp);
if (*errp) {
return;
}
For the above 2 places, we check "*errp", because neither function
returns a suitable error code. And since machine_set_cfmw() - the caller
of cxl_fixed_memory_window_config() - doesn't get the NULL @errp
parameter as the "set" method of object property,
cxl_fixed_memory_window_config() hasn't triggered the bug that
dereferencing the NULL @errp.
To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_fixed_memory_window_config().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240223085653.1255438-2-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In the commit 54c4ea8f3a ("hw/core/machine-smp: Deprecate unsupported
'parameter=1' SMP configurations"), the SMP related item is put under
the section "User-mode emulator command line arguments" instead of
"System emulator command line arguments".
-smp is a system emulator command, so move SMP configurations item to
system emulator section.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240312071512.3283513-1-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
We currently only insist that an ObjectiveC compiler is present on
macos hosts if we're building the Cocoa UI. However, since then
we've added some other parts of QEMU which are also written in ObjC:
the coreaudio audio backend, and the vmnet net backend. This means
that if you try to configure QEMU on macos with --disable-cocoa the
build will fail:
../meson.build:3741:13: ERROR: No host machine compiler for 'audio/coreaudio.m'
Since in practice any macos host will have an ObjC compiler
available, rather than trying to gate the compiler detection on an
increasingly complicated list of every bit of QEMU that uses ObjC,
just require it unconditionally on macos hosts.
Resolves https://gitlab.com/qemu-project/qemu/-/issues/2138
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240311133334.3991537-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
NXP PCF8574 and compatible ICs are simple I2C GPIO expanders.
PCF8574 incorporates quasi-bidirectional IO, and simple
communication protocol, when IO read is I2C byte read, and
IO write is I2C byte write. User can think of it as
open-drain port, when line high state is input and line low
state is output.
Signed-off-by: Dmitrii Sharikhin <d.sharikhin@yadro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <f1552d822276e878d84c01eba2cf2c7c9ebdde00.camel@yadro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Remove unused header in numa.c:
* qemu/bitmap.h
* migration/vmstate.h
Note: Though parse_numa_hmat_lb() has the variable named "bitmap_copy",
it doesn't use the normal bitmap ops so that it's safe to exclude
qemu/bitmap.h header.
Tested by "./configure" and then "make".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240311075621.3224684-4-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
During kernel startup OpenBSD accesses addresses mapped by BAR0 of the ebus device
but at offsets where no IO devices exist. Before commit 4aa07e8649 ("hw/sparc64/ebus:
Access memory regions via pci_address_space_io()") BAR0 was mapped to legacy IO
space which allows accesses to unmapped devices to succeed, but afterwards these
accesses to unmapped PCI IO space cause a memory fault which prevents OpenBSD from
booting.
Since no devices are mapped at the addresses accessed by OpenBSD, change ebus BAR0
from a PCI IO space alias to an IO memory region using unassigned_io_ops which allows
these accesses to succeed and so allows OpenBSD to boot once again.
Fixes: 4aa07e8649 ("hw/sparc64/ebus: Access memory regions via pci_address_space_io()")
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240311064345.2531197-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The ivshmem_common_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Juan Quintela <quintela@trasno.org>
Cc: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Cc: Michael Galaxy <mgalaxy@akamai.com>
Cc: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-17-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The set_chr() passes @errp to error_prepend() without ERRP_GUARD().
As a PropertyInfo.set method, there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com
Cc: Eduardo Habkost <eduardo@habkost.net>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240311033822.3142585-16-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
In hw/core/loader-fit.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- fit_load_kernel()
- fit_load_fdt()
Their @errp parameters are both the pointers of the local @err virable
in load_fit().
Though they don't cause the issue like [1] said, to follow the
requirement of @errp, add missing ERRP_GUARD() at their beginning.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Paul Burton <paulburton@kernel.org>
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240311033822.3142585-15-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Having to use -drive if=none,... and -device ide-[cd,hd] is
inconvenient. Add support for shorter convenience options such as
-cdrom and -drive media=disk. Also adjust two nearby comments for code
style.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-ID: <20240305225721.E9A404E6005@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Now that zero page checking is done on the multifd sender threads by
default, we still provide an option for backward compatibility. This
change adds a qtest migration test case to set the zero-page-detection
option to "legacy" and run multifd migration with zero page checking on the
migration main thread.
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240311180015.3359271-8-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
1. Set default "zero-page-detection" option to "multifd". Now
zero page checking can be done in the multifd threads and this
becomes the default configuration.
2. Handle migration QEMU9.0 -> QEMU8.2 compatibility. We provide
backward compatibility where zero page checking is done from the
migration main thread.
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240311180015.3359271-7-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
1. Add zero_pages field in MultiFDPacket_t.
2. Implements the zero page detection and handling on the multifd
threads for non-compression, zlib and zstd compression backends.
3. Added a new value 'multifd' in ZeroPageDetection enumeration.
4. Adds zero page counters and updates multifd send/receive tracing
format to track the newly added counters.
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240311180015.3359271-5-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
We currently only need to clear the mapped-ram file bitmap from the
migration thread during save_zero_page.
We're about to add support for zero page detection on the multifd
thread, so allow ramblock_set_file_bmap_atomic() to also clear the
bits.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240311180015.3359271-3-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently, it's an error to have no data pages in the multifd file
migration because zero page detection is done in the migration thread
and zero pages don't reach multifd. This is enforced with the
pages->num assert.
We're about to add zero page detection on the multifd thread. Fix the
file_write_ramblock_iov() to stop considering p->iovs_num=0 an error.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240311180015.3359271-2-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
A small number of migration options are accessed by migration clients,
but to see them clients must include all of options.h, which is mostly
for migration core code. migrate_mode() in particular will be needed by
multiple clients.
Refactor the option declarations so clients can see the necessary few via
misc.h, which already exports a portion of the client API.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179319-294320-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
qga-pull-2024-03-11-2
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmXvMF4ACgkQ711egWG6
# hOd7GQ/9H11bXH5U2HZHAyEv68rCuGxHt57yjy9GfrSGAE7kqkLJ0bHwxdgoj09A
# mmaTOakOEhM5tyrkFYsROde3ta0fAwdFQXyhqpWDHG0ZwDDCAlsNsEuDd541KXbg
# qFTue26BM4EJEwTYy94nEEhOHD+2GgEAuPIsUCF6QDhrq+sBqUts1pH3uUM+E3Sg
# 7HSXaF9O/RbgYR6J8FVA53tvNOP4WgOYZ/SZoFwKYzIOAblcaRgJvVbm600OqQDb
# DUZ96s6HUBVBazKx4t24WwPOkgcvYgp7b8QYj2VVfjRU2IHnFumv9A47KuEdJEUl
# meZlo8TsgfL+uSiYWiRvps1Eo1uoS2M4s4FYGbuOeABYviLr3XhJM7VarNyPp7nf
# lupuyeqydXCic5LFbjSg5gkpaIhFYQ9gANYr6JZjPPjX5G8hmgCeB6YsZT98Ygmi
# yG1np2luapvFGsDCPd8kgNpfTkKhOpKghUnTC3UESReV8mjBGhJHRQTd+lMV7YWJ
# reMRxlqrqNq69lDUnlNuSXOqPJX1qHHGkWuV9pdz641tFvw7vigE07qhvsCr8Y7b
# tAg6oDcw+1uuq2t3nHmfkzic6WTxb4lFmSpShsDu2sRA7MqWE3lCefNoS75R+xaI
# FebdrkjkFxdbazyl6oKXwxkOkAR7rFoTvVKKbG79DVMIBDiu4BE=
# =Xt/n
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Mar 2024 16:25:02 GMT
# gpg: using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423 EB84 EF5D 5E81 61BA 84E7
* tag 'qga-pull-2024-03-11-2' of https://github.com/kostyanf14/qemu:
qga-win: Add support of Windows Server 2025 in get-osinfo command
qga/commands-win32: Do not set matrix_lookup_t/win_10_0_t arrays size
qga/commands-win32: Declare const qualifier before type
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the access is bigger than the MemoryRegion supports,
flatview_read/write_continue() will attempt to update the Memory Region.
but the address passed to flatview_translate() is relative to the cache, not
to the FlatView.
On arm/virt with interleaved CXL memory emulation and virtio-blk-pci this
lead to the first part of descriptor being read from the CXL memory and the
second part from PA 0x8 which happens to be a blank region
of a flash chip and all ffs on this particular configuration.
Note this test requires the out of tree ARM support for CXL, but
the problem is more general.
Avoid this by adding new address_space_read_continue_cached()
and address_space_write_continue_cached() which share all the logic
with the flatview versions except for the MemoryRegion lookup which
is unnecessary as the MemoryRegionCache only covers one MemoryRegion.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240307153710.30907-5-Jonathan.Cameron@huawei.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The calls to flatview_read/write[_continue]() have parameters addr and
addr1 but the names give no indication of what they are addresses of.
Rename addr1 to mr_addr to reflect that it is the translated address
offset within the MemoryRegion returned by flatview_translate().
Similarly rename the parameter in address_space_read/write_cached_slow()
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20240307153710.30907-2-Jonathan.Cameron@huawei.com
Signed-off-by: Peter Xu <peterx@redhat.com>
In commit 3fa9642ff7 change was made to convert the RDMA backend to
accept MigrateAddress struct. However, the assignment of "host" leads
to data corruption on the target host and the failure of migration.
isock->host = rdma->host;
By allocating the memory explicitly for it with g_strdup_printf(), the
issue is fixed and the migration doesn't fail any more.
Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
Cc: qemu-stable <qemu-stable@nongnu.org>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/CAHEcVy4L_D6tuhJ8h=xLR4WaPaprJE3nnxZAEyUnoTrxQ6CF5w@mail.gmail.com
Signed-off-by: Yu Zhang <yu.zhang@ionos.com>
[peterx: use g_strdup() instead of g_strdup_printf(), per Zhijian]
Signed-off-by: Peter Xu <peterx@redhat.com>
Commit bc38feddeb ("io: fsync before closing a file channel") added a
fsync/fdatasync at the closing point of the QIOChannelFile to ensure
integrity of the migration stream in case of QEMU crash.
The decision to do the sync at qio_channel_close() was not the best
since that function runs in the main thread and the fsync can cause
QEMU to hang for several minutes, depending on the migration size and
disk speed.
To fix the hang, remove the fsync from qio_channel_file_close().
At this moment, the migration code is the only user of the fsync and
we're taking the tradeoff of not having a sync at all, leaving the
responsibility to the upper layers.
Fixes: bc38feddeb ("io: fsync before closing a file channel")
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240305195629.9922-1-farosas@suse.de
Link: https://lore.kernel.org/r/20240305174332.2553-1-farosas@suse.de
[peterx: add more comment to the qio_channel_close()]
Signed-off-by: Peter Xu <peterx@redhat.com>
When commit bd2270608f ("migration/ram.c: add a notifier chain for
precopy") added PRECOPY_NOTIFY_SETUP notifiers at the end of
qemu_savevm_state_setup(), it didn't take into account a possible
error in the loop calling vmstate_save() or .save_setup() handlers.
Check ret value before calling the notifiers.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240304122844.1888308-10-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
VFIO migration buffer size is currently limited to 1MB. Therefore, there
is no need to check if migration rate exceeded, as in the worst case it
will exceed by only 1MB.
However, if the buffer size is later changed to a bigger value,
vfio_save_iterate() should enforce migration rate (similar to migration
RAM code).
Add a note about this in vfio_save_iterate() to serve as a reminder.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-4-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently, vfio_save_state() returns 1 regardless of whether there is
more data to send or not. This was done to prevent a fast changing VFIO
device from potentially blocking other devices from sending their data,
as qemu_savevm_state_iterate() serialized devices.
Now that qemu_savevm_state_iterate() no longer serializes devices, there
is no need for that.
Refactor vfio_save_state() to return 0 if more data is available and 1
if no more data is available.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Commit 90697be889 ("live migration: Serialize vmstate saving in stage
2") introduced device serialization in qemu_savevm_state_iterate(). The
rationale behind it was to first complete migration of slower changing
block devices and only then migrate the RAM, to avoid sending fast
changing RAM pages over and over.
This commit was added a long time ago, and while it was useful back
then, it is not the case anymore:
1. Block migration is deprecated, see commit 66db46ca83 ("migration:
Deprecate block migration").
2. Today there are other iterative devices besides RAM and block, such
as VFIO, which are registered for migration after RAM. With current
serialization behavior, a fast changing device can block other
devices from sending their data, which may prevent migration from
converging in some cases.
The issue described in item 2 was observed in several VFIO migration
scenarios with switchover-ack capability enabled, where some workload on
the VM prevented RAM from ever reaching a hard zero, thus blocking VFIO
initial pre-copy data from being sent. Hence, destination could not ack
switchover and migration could not converge.
Fix that by not serializing iterative devices in
qemu_savevm_state_iterate().
Note that this still doesn't fully prevent device starvation. As
correctly pointed out by Peter [1], a fast changing device might
constantly consume all allocated bandwidth and block the following
devices. However, this scenario is more likely to happen only if
max-bandwidth is low.
[1] https://lore.kernel.org/qemu-devel/Zd6iw9dBhW6wKNxx@x1n/
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
QEMU includes some models of old Arm machine types which are
a bit problematic for us because:
* they're written in a very old way that uses numerous APIs that we
would like to get away from (eg they don't use qdev, they use
qemu_system_reset_request(), they use vmstate_register(), etc)
* they've been that way for a decade plus and nobody particularly has
stepped up to try to modernise the code (beyond some occasional
work here and there)
* we often don't have test cases for them, which means that if we
do try to do the necessary refactoring work on them we have no
idea if they even still work at all afterwards
All these machine types are also of hardware that has largely passed
away into history and where I would not be surprised to find that
e.g. the Linux kernel support was never tested on real hardware
any more.
After some consultation with the Linux kernel developers, we
are going to deprecate:
All PXA2xx machines:
akita Sharp SL-C1000 (Akita) PDA (PXA270)
borzoi Sharp SL-C3100 (Borzoi) PDA (PXA270)
connex Gumstix Connex (PXA255)
mainstone Mainstone II (PXA27x)
spitz Sharp SL-C3000 (Spitz) PDA (PXA270)
terrier Sharp SL-C3200 (Terrier) PDA (PXA270)
tosa Sharp SL-6000 (Tosa) PDA (PXA255)
verdex Gumstix Verdex Pro XL6P COMs (PXA270)
z2 Zipit Z2 (PXA27x)
All OMAP2 machines:
n800 Nokia N800 tablet aka. RX-34 (OMAP2420)
n810 Nokia N810 tablet aka. RX-44 (OMAP2420)
One of the OMAP1 machines:
cheetah Palm Tungsten|E aka. Cheetah PDA (OMAP310)
Rationale:
* for QEMU dropping individual machines is much less beneficial
than if we can drop support for an entire SoC
* the OMAP2 QEMU code in particular is large, old and unmaintained,
and none of the OMAP2 kernel maintainers said they were using
QEMU in any of their testing/development
* although there is a setup that is booting test kernels on some
of the PXA2xx machines, nobody seemed to be using them as part
of their active kernel development and my impression from the
email thread is that PXA is the closest of all these SoC families
to being dropped from the kernel soon
* nobody said they were using cheetah, so it's entirely
untested and quite probably broken
* on the other hand the OMAP1 sx1 model does seem to be being
used as part of kernel development, and there was interest
in keeping collie around
In particular, the mainstone, tosa and z2 machine types have
already been dropped from Linux.
Mark all these machine types as deprecated.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240308171621.3749894-1-peter.maydell@linaro.org
ga_get_win_name() iterates over all elements in the arrays by
checking the 'version' field is non-NULL. Since the arrays are
guarded by a NULL terminating element, we don't need to specify
their size:
static char *ga_get_win_name(...)
{
...
const ga_matrix_lookup_t *table = WIN_VERSION_MATRIX[tbl_idx];
const ga_win_10_0_t *win_10_0_table = ...
...
while (table->version != NULL) {
^^^^^^^^^^^^^^^
while (win_10_0_table->version != NULL) {
^^^^^^^^^^^^^^^
This will simplify maintenance when adding new entries to these
arrays.
Split WIN_VERSION_MATRIX into WIN_CLIENT_VERSION_MATRIX and
WIN_SERVER_VERSION_MATRIX because multidimensional array must
have bounds for all dimensions except the first.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240222152835.72095-3-philmd@linaro.org>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Reviewed-by: Yan Vugenfirer <yvugenfi@redhat.com>
Link: https://lore.kernel.org/r/20240304134532.28506-3-kkostiuk@redhat.com
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Misc HW patch queue
- hmp: Shorter 'info qtree' output (Zoltan)
- qdev: Add a granule_mode property (Eric)
- Some ERRP_GUARD() fixes (Zhao)
- Doc & style fixes in docs/interop/firmware.json (Thomas)
- hw/xen: Housekeeping (Phil)
- hw/ppc/mac99: Change timebase frequency 25 -> 100 MHz (Mark)
- hw/intc/apic: Memory leak fix (Paolo)
- hw/intc/grlib_irqmp: Ensure ncpus value is in range (Clément)
- hw/m68k/mcf5208: Add support for reset (Angelo)
- hw/i386/pc: Housekeeping (Phil)
- hw/core/smp: Remove/deprecate parameter=0,1 adapting test-smp-parse (Zhao)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmXstpMACgkQ4+MsLN6t
# wN6XBw//dNItFhf1YX+Au4cNoQVDgHE9RtzEIGOnwcL1CgrA9rAQQfLRE5KWM6sN
# 1qiPh+T6SPxtiQ2rw4AIpsI7TXjO72b/RDWpUUSwnfH39eC77pijkxIK+i9mYI9r
# p0sPjuP6OfolUFYeSbYX+DmNZh1ONPf27JATJQEf0st8dyswn7lTQvJEaQ97kwxv
# UKA0JD5l9LZV8Zr92cgCzlrfLcbVblJGux9GYIL09yN78yqBuvTm77GBC/rvC+5Q
# fQC5PQswJZ0+v32AXIfSysMp2R6veo4By7VH9Lp51E/u9jpc4ZbcDzxzaJWE6zOR
# fZ01nFzou1qtUfZi+MxNiDR96LP6YoT9xFdGYfNS6AowZn8kymCs3eo7M9uvb+rN
# A2Sgis9rXcjsR4e+w1YPBXwpalJnLwB0QYhEOStR8wo1ceg7GBG6zHUJV89OGzsA
# KS8X0aV1Ulkdm/2H6goEhzrcC6FWLg8pBJpfKK8JFWxXNrj661xM0AAFVL9we356
# +ymthS2x/RTABSI+1Lfsoo6/SyXoimFXJJWA82q9Yzoaoq2gGMWnfwqxfix6JrrA
# PuMnNP5WNvh04iWcNz380P0psLVteHWcVfTRN3JvcJ9iJ2bpjcU1mQMJtvSF9wBn
# Y8kiJTUmZCu3br2e5EfxmypM/h8y29VD/1mxPk8Dtcq3gjx9AU4=
# =juZH
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 09 Mar 2024 19:20:51 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'hw-misc-20240309' of https://github.com/philmd/qemu: (43 commits)
hw/m68k/mcf5208: add support for reset
tests/unit/test-smp-parse: Test "parameter=0" SMP configurations
tests/unit/test-smp-parse: Test smp_props.has_clusters
tests/unit/test-smp-parse: Test the full 7-levels topology hierarchy
tests/unit/test-smp-parse: Test "drawers" and "books" combination case
tests/unit/test-smp-parse: Test "drawers" parameter in -smp
tests/unit/test-smp-parse: Test "books" parameter in -smp
tests/unit/test-smp-parse: Make test cases aware of the book/drawer
tests/unit/test-smp-parse: Bump max_cpus to 4096
tests/unit/test-smp-parse: Use CPU number macros in invalid topology case
tests/unit/test-smp-parse: Drop the unsupported "dies=1" case
hw/core/machine-smp: Calculate total CPUs once in machine_parse_smp_config()
hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations
hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations
docs/interop/firmware.json: Fix doc for FirmwareFlashMode
docs/interop/firmware.json: Align examples
hw/intc/grlib_irqmp: abort realize when ncpus value is out of range
mac_newworld: change timebase frequency from 100MHz to 25MHz for mac99 machine
hmp: Add option to info qtree to omit details
qdev: Add a granule_mode property
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
trivial patches for 2024-03-09
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmXshtIPHG1qdEB0bHMu
# bXNrLnJ1AAoJEHAbT2saaT5ZMFUIAKTd1rYzRs6x4wXitaWbYIIs2d6UB/HLbzz2
# BVHZwoYqsW3TuNFJp4njHhexZ76nFlT8xMuOKB5tAm4KOmqOdxS/mfThuSGsWGP7
# CAk35ENOMQbii/jp6tqawop+H0rVMSJjBrkU4vLRAtQ7g1ISnX6tJi3wiyS+FtHq
# 9eIfgJgM77tvq6RLPZTUrUBevMWQfjMcvXmMnYqL4Z1dnibIb5/R3RKAnEc4CUoS
# hMw94wBcq+ZOQNPnY7d+WioKq7JcSWX7UW5NuHo+C+G83nq1/5vE8Oe2kNwzFyDL
# 9sIqL8bz6v8iiqcVMIBykSAZhYH9QEuVRJso18UE5w0B8k4CQcM=
# =dIAF
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 09 Mar 2024 15:57:06 GMT
# gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg: issuer "mjt@tls.msk.ru"
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@debian.org>" [full]
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59
* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
docs/acpi/bits: add some clarity and details while also improving formating
hw/mem/cxl_type3: Fix problem with g_steal_pointer()
hw/pci-bridge/cxl_upstream: Fix problem with g_steal_pointer()
hw/cxl/cxl-cdat: Fix type of buf in ct3_load_cdat()
qerror: QERR_DEVICE_IN_USE is no longer used, drop
blockdev: Fix block_resize error reporting for op blockers
char: Slightly better error reporting when chardev is in use
make-release: switch to .xz format by default
hw/scsi/lsi53c895a: Fix typo in comment
hw/vfio/pci.c: Make some structure static
replay: Improve error messages about configuration conflicts
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The smp_props.has_clusters in MachineClass is not a user configured
field, and it indicates if user specifies "clusters" in -smp.
After -smp parsing, other module could aware if the cluster level
is configured by user. This is used when the machine has only 1 cluster
since there's only 1 cluster by default.
Add the check to cover this field.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xiaoling Song <xiaoling.song@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240308160148.3130837-13-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Currently, -smp supports up to 7-levels topology hierarchy:
-drawers/books/sockets/dies/clusters/cores/threads.
Though no machine supports all these 7 levels yet, these 7 levels have
the strict containment relationship and together form the generic CPU
topology representation of QEMU.
Also, note that the maxcpus is calculated by multiplying all 7 levels:
maxcpus = drawers * books * sockets * dies * clusters *
cores * threads.
To cover this code path, it is necessary to test the full topology case
(with all 7 levels). This also helps to avoid introducing new issues by
further expanding the CPU topology in the future.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xiaoling Song <xiaoling.song@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240308160148.3130837-12-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Currently, -smp supports 2 more new levels: book and drawer.
It is necessary to consider the effects of book and drawer in the test
cases to ensure that the calculations are correct. This is also the
preparation to add new book and drawer test cases.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xiaoling Song <xiaoling.song@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240308160148.3130837-8-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In machine_parse_smp_config(), the number of total CPUs is calculated
by:
drawers * books * sockets * dies * clusters * cores * threads
To avoid missing the future new topology level, use a local variable to
cache the calculation result so that total CPUs are only calculated
once.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240308160148.3130837-4-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Currently, it was allowed for users to specify the unsupported
topology parameter as "1". For example, x86 PC machine doesn't
support drawer/book/cluster topology levels, but user could specify
"-smp drawers=1,books=1,clusters=1".
This is meaningless and confusing, so that the support for this kind of
configurations is marked deprecated since 9.0. And report warning
message for such case like:
qemu-system-x86_64: warning: Deprecated CPU topology (considered invalid):
Unsupported clusters parameter mustn't be specified as 1
qemu-system-x86_64: warning: Deprecated CPU topology (considered invalid):
Unsupported books parameter mustn't be specified as 1
qemu-system-x86_64: warning: Deprecated CPU topology (considered invalid):
Unsupported drawers parameter mustn't be specified as 1
Users have to ensure that all the topology members described with -smp
are supported by the target machine.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240308160148.3130837-3-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The "parameter=0" SMP configurations have been marked as deprecated
since v6.2.
For these cases, -smp currently returns the warning and adjusts the
zeroed parameters to 1 by default.
Remove the above compatibility logic in v9.0, and return error directly
if any -smp parameter is set as 0.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Message-ID: <20240308160148.3130837-2-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
MacOS X uses multiple techniques for calibrating timers depending upon the detected
hardware. One of these calibration routines compares the change in the timebase
against the KeyLargo timer and uses this to recalculate the clock frequency,
timebase frequency and bus frequency if the calibration exceeds certain limits.
This recalibration occurs despite the correct values being passed via the device
tree, and is likely due to buggy firmware on some hardware.
The timebase frequency of 100MHz was set way back in 2005 by commit fa296b0fb4
("PIC fix - changed back TB frequency to 100 MHz") and with this value on a
mac99,via=pmu machine the OSX 10.2 timer calibration incorrectly calculates the
bus frequency as 400MHz instead of 100MHz. The most noticeable side-effect is
the UI appears sluggish and not very responsive for normal use.
Change the timebase frequency from 100MHz to 25MHz which matches that of a real
G4 AGP machine (the closest match to QEMU's mac99 machine) and allows OSX 10.2
to correctly detect all of the clock frequency, timebase frequency and bus
frequency.
Tested on various MacOS images from OS 9.2 through to OSX 10.4, along with Linux
and NetBSD and I was unable to find any regressions from this change.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240304073548.2098806-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Introduce a new enum type property allowing to set an
IOMMU granule. Values are 4k, 8k, 16k, 64k and host.
This latter indicates the vIOMMU granule will match
the host page size.
A subsequent patch will add such a property to the
virtio-iommu device.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240227165730.14099-2-eric.auger@redhat.com>
deliver_bitmask is allocated on the heap in apic_deliver(), but there
are many paths in the function that return before the corresponding
g_free() is reached. Fix this by switching to g_autofree and, while at
it, also switch to g_new. Do the same in apic_deliver_irq() as well
for consistency.
Fixes: b5ee0468e9 ("apic: add support for x2APIC mode", 2024-02-14)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240304224133.267640-1-pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The "isapc" machine only provides an ISA bus, not a PCI one,
and doesn't instanciate any i440FX south bridge.
Its machine class defines PCMachineClass::pci_enabled = false,
and pc_init1() only uses the pci_type argument when pci_enabled
is true. Since for this machine the argument is not used,
passing NULL makes more sense.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240301185936.95175-5-philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is the pointer of
error_fatal, the user can't see this additional information, because
exit() happens in error_setg earlier than information is added [1].
The sev_inject_launch_secret() passes @errp to error_prepend(), and as
an APIs defined in target/i386/sev.h, it is necessary to protect its
@errp with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240229143914.1977550-17-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is the pointer of
error_fatal, the user can't see this additional information, because
exit() happens in error_setg earlier than information is added [1].
The remote_object_set_fd() passes @errp to error_prepend(), and as a
PropertyInfo.set method, its @errp is so widely sourced that it is
necessary to protect it with ERRP_GUARD().
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Cc: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240229143914.1977550-4-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is the pointer of
error_fatal, the user can't see this additional information, because
exit() happens in error_setg earlier than information is added [1].
The xen_netdev_connect() passes @errp to error_prepend(), and its @errp
parameter is from xen_device_frontend_changed().
Though its @errp points to @local_err of xen_device_frontend_changed(),
to follow the requirement of @errp, add missing ERRP_GUARD() at the
beginning of this function.
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240229143914.1977550-3-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is the pointer of
error_fatal, the user can't see this additional information, because
exit() happens in error_setg earlier than information is added [1].
The xen_console_connect() passes @errp to error_prepend() without
ERRP_GUARD().
There're 2 places will call xen_console_connect():
- xen_console_realize(): the @errp is from DeviceClass.realize()'s
parameter.
- xen_console_frontend_changed(): the @errp points its caller's
@local_err.
To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of xen_console_connect().
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-ID: <20240228163723.1775791-15-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In order to build this file once for all targets, replace:
TARGET_PAGE_BITS -> qemu_target_page_bits()
TARGET_PAGE_SIZE -> qemu_target_page_size()
TARGET_PAGE_MASK -> -qemu_target_page_size()
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20231114163123.74888-4-philmd@linaro.org>
We are going to replace TARGET_PAGE_MASK by a
runtime variable. In order to reduce code duplication,
propagate TARGET_PAGE_MASK to get_physmapping() and
xen_phys_offset_to_gaddr().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20231114163123.74888-3-philmd@linaro.org>
"hw/xen/xen_pt.h" requires "hw/xen/xen_native.h" which is target
specific. It also declares IGD methods, which are not target
specific.
Target-agnostic code can use IGD methods. To allow that, extract
these methos into a new "hw/xen/xen_igd.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20231114143816.71079-18-philmd@linaro.org>
Similarly to the restriction in hw/pci/msix.c (see commit
e1e4bf2252 "msix: fix msix_vector_masked"), restrict the
xen_is_pirq_msi() call in msi_is_masked() to Xen.
No functional change intended.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20231114143816.71079-7-philmd@linaro.org>
"sysemu/xen.h" defines CONFIG_XEN_IS_POSSIBLE as a target-agnostic
version of CONFIG_XEN accelerator.
Use it in order to use "sysemu/xen-mapcache.h" in target-agnostic files.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20231114143816.71079-4-philmd@linaro.org>
vAPIC isn't KVM specific, so having its name prefixed 'kvm'
is misleading. Rename it simply 'vapic'. Rename the single
function prefixed 'kvm'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230905145159.7898-1-philmd@linaro.org>
Update bios-bits docs to add more details on why a pre-OS environment for
testing bioses is useful. Add author's FOSDEM talk link. Also improve the
formating of the document while at it.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When setting GLIB_VERSION_MAX_ALLOWED to GLIB_VERSION_2_58 or higher,
glib adds type safety checks to the g_steal_pointer() macro. This
triggers errors in the ct3_build_cdat_entries_for_mr() function which
uses the g_steal_pointer() for type-casting from one pointer type to
the other (which also looks quite weird since the local pointers have
all been declared with g_autofree though they are never freed here).
Fix it by using a proper typecast instead. For making this possible, we
have to remove the QEMU_PACKED attribute from some structs since GCC
otherwise complains that the source and destination pointer might
have different alignment restrictions. Removing the QEMU_PACKED should
be fine here since the structs are already naturally aligned. Anyway,
add some QEMU_BUILD_BUG_ON() statements to make sure that we've got
the right sizes (without padding in the structs).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When setting GLIB_VERSION_MAX_ALLOWED to GLIB_VERSION_2_58 or higher,
glib adds type safety checks to the g_steal_pointer() macro. This
triggers errors in the build_cdat_table() function which uses the
g_steal_pointer() for type-casting from one pointer type to the other
(which also looks quite weird since the local pointers have all been
declared with g_autofree though they are never freed here). Let's fix
it by using a proper typecast instead. For making this possible, we
have to remove the QEMU_PACKED attribute from some structs since GCC
otherwise complains that the source and destination pointer might
have different alignment restrictions. Removing the QEMU_PACKED should
be fine here since the structs are already naturally aligned. Anyway,
add some QEMU_BUILD_BUG_ON() statements to make sure that we've got
the right sizes (without padding in the structs).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When setting GLIB_VERSION_MAX_ALLOWED to GLIB_VERSION_2_58 or higher
(which we'll certainly do in the not too distant future), glib adds
type safety checks to the g_steal_pointer() macro. This trigger an
error in the ct3_load_cdat() function: The local char *buf variable is
assigned to uint8_t *buf in CDATObject, i.e. a pointer of a different
type. Change the local variable to the same type as buf in CDATObject
to avoid the error.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When block_resize() runs into an op blocker, it creates an error like
this:
error_setg(errp, "Device '%s' is in use", device);
Trouble is @device can be null. My system formats null as "(null)",
but other systems might crash. Reproducer:
1. Create two block devices
-> {"execute": "blockdev-add", "arguments": {"driver": "file", "node-name": "blk0", "filename": "64k.img"}}
<- {"return": {}}
-> {"execute": "blockdev-add", "arguments": {"driver": "file", "node-name": "blk1", "filename": "m.img"}}
<- {"return": {}}
2. Put a blocker on one them
-> {"execute": "blockdev-mirror", "arguments": {"job-id": "job0", "device": "blk0", "target": "blk1", "sync": "full"}}
{"return": {}}
-> {"execute": "job-pause", "arguments": {"id": "job0"}}
{"return": {}}
-> {"execute": "job-complete", "arguments": {"id": "job0"}}
{"return": {}}
Note: job events elided for brevity.
3. Attempt to resize
-> {"execute": "block_resize", "arguments": {"node-name": "blk1", "size":32768}}
<- {"error": {"class": "GenericError", "desc": "Device '(null)' is in use"}}
Broken when commit 3b1dbd11a6 made @device optional. Fixed in commit
ed3d2ec98a (block: Add errp to b{lk,drv}_truncate()), except for this
one instance.
Fix it by using the error message provided by the op blocker instead,
so it fails like this:
<- {"error": {"class": "GenericError", "desc": "Node 'blk1' is busy: block device is in use by block job: mirror"}}
Fixes: 3b1dbd11a6 (qmp: Allow block_resize to manipulate bs graph nodes.)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Both
$ qemu-system-x86_64 -chardev null,id=chr0,mux=on -mon chardev=chr0 -mon chardev=chr0 -mon chardev=chr0 -mon chardev=chr0 -mon chardev=chr0
and
$ qemu-system-x86_64 -chardev null,id=chr0 -mon chardev=chr0 -mon chardev=chr0
fail with
qemu-system-x86_64: -mon chardev=chr0: Device 'chr0' is in use
Improve to
qemu-system-x86_64: -mon chardev=chr0: too many uses of multiplexed chardev 'chr0' (maximum is 4)
and
qemu-system-x86_64: -mon chardev=chr0: chardev 'chr0' is already in use
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For a long time, we provide two compression formats in the
download area, .bz2 and .xz. There's absolutely no reason
to provide two in parallel, .xz compresses better, and all
the links we use points to .xz. Downstream distributions
mostly use .xz too.
For the release maintenance providing two formats is definitely
extra burden too.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Improve
Record/replay feature is not supported for '-rtc base=localtime'
Record/replay feature is not supported for 'smp'
Record/replay feature is not supported for '-snapshot'
to
Record/replay is not supported with -rtc base=localtime
Record/replay is not supported with multiple CPUs
Record/replay is not supported with -snapshot
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Hyper-V Dynamic Memory and VMBus misc small patches
This pull request contains two small patches to hv-balloon:
the first one replacing alloca() usage with g_malloc0() + g_autofree
and the second one adding additional declaration of a protocol message
struct with an optional field explicitly defined to avoid a Coverity
warning.
Also included is a VMBus patch to print a warning when it is enabled
without the recommended set of Hyper-V features (enlightenments) since
some Windows versions crash at boot in this case.
# -----BEGIN PGP SIGNATURE-----
#
# iQGzBAABCAAdFiEE4ndqq6COJv9aG0oJUrHW6VHQzgcFAmXrQeMACgkQUrHW6VHQ
# zgcvWwv9GUCDnidnDka8WGF2wgBEaPPdC2JXcqRFFLADISBAn/3fhsOERO6FwYuN
# pouhVEJnHpp9ueNAx+et51ySRzGCaL+VdOGGeReQllIOZGsnOnB8JfM58UE4lX4Z
# prCr72bxFsunxRqlqxssejrc8fBhgEQRPo5lQabl73rxftpXkNTHY0CGTwlvnaY1
# CzEBTBuowzkZJbQYDL8Qim2HrYqrSnOaend6bbrj9P6P+UFw9wLJU5tkfYCiHUjg
# Ux2Fjjx+5+qD9yE7khtxSHqjwWYkR7xA9di1yv4Znqg18gzdbuqnlrKR7F0v98yh
# sWFy+fyfVRDg+G2yh2F+vAUjmAJUrfw5+GL3uZTWIevoQUoSHBQfgUEJrlIKvykZ
# WP1XuAZRH3m2akDOXOWZVcDhkb3zPKtPJYZ2WncBZk+DLCs/vg94Taq0FcZefBTn
# 6qsFjs2lHz96uOSzgqICfU34ghcxfU5xgzmvKxKAiriOItmRMHgIYOXLHRfaIJhV
# MT/9OMuW
# =kVny
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 08 Mar 2024 16:50:43 GMT
# gpg: using RSA key E2776AABA08E26FF5A1B4A0952B1D6E951D0CE07
# gpg: Good signature from "Maciej S. Szmigiero <mail@maciej.szmigiero.name>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 727A 0D4D DB9E D9F6 039B ECEF 847F 5E37 90CE 0977
# Subkey fingerprint: E277 6AAB A08E 26FF 5A1B 4A09 52B1 D6E9 51D0 CE07
* tag 'pull-hv-balloon-20240308' of https://github.com/maciejsszmigiero/qemu:
vmbus: Print a warning when enabled without the recommended set of features
hv-balloon: define dm_hot_add_with_region to avoid Coverity warning
hv-balloon: avoid alloca() usage
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
guest drivers' suspend methods flush outstanding requests and re-initialize
the devices, and thus there is no device state to save and restore. The
user is responsible for suspending the guest before initiating cpr, such as
by issuing guest-suspend-ram to the qemu guest agent.
Relax the vfio blocker so it does not apply to cpr, and add a notifier that
verifies the guest is suspended.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Define entry points to perform per-container cpr-specific initialization
and teardown.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Add a job that can be run, either manually or on a schedule, to upload
a build to Coverity Scan. The job uses the run-coverity-scan script
in multiple phases of check, download tools and upload, in order to
avoid both wasting time (skip everything if you are above the upload
quota) and avoid filling the log with the progress of downloading
the tools.
The job is intended to run on a scheduled pipeline run, and scheduled
runs will not get any other job. It requires two variables to be in
GitLab CI, COVERITY_TOKEN and COVERITY_EMAIL. Those are already set up
in qemu-project's configuration as protected and masked variables.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add an option to check if upload is permitted without actually
attempting a build. This can be useful to add a third outcome
beyond success and failure---namely, a CI job can self-cancel
if the uploading quota has been reached.
There is a small change here in that a failure to do the upload
check changes the exit code from 1 to 99. 99 was chosen because
it is what Autotools and Meson use to represent a problem in the
setup (as opposed to a failure in the test).
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add new "select" and "imply" directives if needed. The resulting
config-devices.mak files are the same as before.
Builds without default devices will become much smaller
than before, and qtests fail (as expected, though suboptimal)
for mips64-softmmu because most tests do not use -nodefaults,
so remove it from build-without-defaults
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
touch_all_pages() can return early, before creating threads. In this case,
however, it leaks the MemsetContext that it has allocated at the
beginning of the function.
Reported by Coverity as CID 1534922.
Fixes: 04accf43df ("oslib-posix: initialize backend memory objects in parallel", 2024-02-06)
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
deliver_bitmask is allocated on the heap in apic_deliver(), but there
are many paths in the function that return before the corresponding
g_free() is reached. Fix this by switching to g_autofree and, while at
it, also switch to g_new. Do the same in apic_deliver_irq() as well
for consistency.
Fixes: b5ee0468e9 ("apic: add support for x2APIC mode", 2024-02-14)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Netbsd isn't happy with qemu lsi53c895a emulation:
cd0(esiop0:0:2:0): command with tag id 0 reset
esiop0: autoconfiguration error: phase mismatch without command
esiop0: autoconfiguration error: unhandled scsi interrupt, sist=0x80 sstat1=0x0 DSA=0x23a64b1 DSP=0x50
This is because lsi_bad_phase() triggers a phase mismatch, which
stops SCRIPT processing. However, after returning to
lsi_command_complete(), SCRIPT is restarted with lsi_resume_script().
Fix this by adding a return value to lsi_bad_phase(), and only resume
script processing when lsi_bad_phase() didn't trigger a host interrupt.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Message-ID: <20240302214453.2071388-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
--warn-common ldflag causes warnings for multiple definitions of
___asan_globals_registered when enabling AddressSanitizer with clang.
The warning is somewhat obsolete so just remove it.
The common block is used to allow duplicate definitions of uninitialized
global variables. In the past, GCC and clang used to place such
variables in a common block by default, which prevented programmers for
noticing accidental duplicate definitions. Commit 49237acdb7 ("Enable
ld flag --warn-common") added --warn-common ldflag so that ld warns in
such a case.
Today, both of GCC and clang don't use common blocks by default[1][2] so
any remaining use of common blocks should be intentional. Remove
--warn-common ldflag to suppress warnings for intentional use of
common blocks.
[1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678
[2]: https://reviews.llvm.org/D75056
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240304-common-v1-1-1a2005d1f350@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Original goal of addition of drain_call_rcu to qmp_device_add was to cover
the failure case of qdev_device_add. It seems call of drain_call_rcu was
misplaced in 7bed89958b what led to waiting for pending RCU callbacks
under happy path too. What led to overall performance degradation of
qmp_device_add.
In this patch call of drain_call_rcu moved under handling of failure of
qdev_device_add.
Signed-off-by: Dmitrii Gavrilov <ds-gavr@yandex-team.ru>
Message-ID: <20231103105602.90475-1-ds-gavr@yandex-team.ru>
Fixes: 7bed89958b ("device_core: use drain_call_rcu in in qmp_device_add", 2020-10-12)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, start a timer and exit the loop. Until the
timer fires, the CPU can process instructions which might changes the
memory location.
The limit of instructions is also reduced because scripts running on
the SCSI processor are usually very short. This keeps the time until
the loop is exit short.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240229204407.1699260-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some Windows versions crash at boot or fail to enable the VMBus device if
they don't see the expected set of Hyper-V features (enlightenments).
Since this provides poor user experience let's warn user if the VMBus
device is enabled without the recommended set of Hyper-V features.
The recommended set is the minimum set of Hyper-V features required to make
the VMBus device work properly in Windows Server versions 2016, 2019 and
2022.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Since the presence of a hot add memory region is optional in hot add
request message it wasn't part of this message declaration
(struct dm_hot_add).
Instead, the code allocated such enlarged message by simply adding the
necessary size for this extra field to the size of basic hot add message
struct.
However, Coverity considers accessing this extra member to be
an out-of-bounds access, even thought the memory is actually there.
Fix this by adding an extended variant of this message that explicitly has
an additional union dm_mem_page_range at its end.
CID: #1523903
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
G-stage translation should be considered to be user-level access in
riscv_cpu_get_phys_page_debug(), as already done in riscv_cpu_tlb_fill().
This fixes a bug that prevents gdb from reading memory while the VM is
running in VS-mode.
Signed-off-by: Hiroaki Yamamoto <hrak1529@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240228081028.35081-1-hrak1529@gmail.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The reads to in_clrip[x] registers return rectified input values of the
interrupt sources.
A rectified input value of an interrupt source is defined by the section
"4.5.2 Source configurations (sourcecfg[1]–sourcecfg[1023])" of the RISC-V
AIA specification as:
"rectified input value = (incoming wire value) XOR (source is inverted)"
Update the riscv_aplic_read_input_word() implementation to match the above.
Fixes: e8f79343cf ("hw/intc: Add RISC-V AIA APLIC device emulation")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240306095722.463296-3-apatel@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The writes to setipnum_le register in APLIC MSI-mode have special
consideration for level-triggered interrupts as-per section "4.9.2
Special consideration for level-sensitive interrupt sources" of the
RISC-V AIA specification.
Particularly, the below text from the RISC-V specification defines
the behaviour of writes to setipnum_le for level-triggered interrupts:
"A second option is for the interrupt service routine to write the
APLIC’s source identity number for the interrupt to the domain’s
setipnum register just before exiting. This will cause the interrupt’s
pending bit to be set to one again if the source is still asserting
an interrupt, but not if the source is not asserting an interrupt."
Fix setipnum_le write emulation for APLIC MSI-mode by implementing
the above behaviour in riscv_aplic_set_pending() function.
Fixes: e8f79343cf ("hw/intc: Add RISC-V AIA APLIC device emulation")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240306095722.463296-2-apatel@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
While discussing a problem with how we're (not) setting vstart_eq_zero
Richard had the following to say w.r.t the conditional mark_vs_dirty()
calls on load/store functions [1]:
"I think it's required to have stores set dirty unconditionally, before
the operation.
Consider a store that traps on the 2nd element, leaving vstart = 2, and
exiting to the main loop via exception. The exception enters the kernel
page fault handler. The kernel may need to fault in the page for the
process, and in the meantime task switch.
If vs dirty is not already set, the kernel won't know to save vector
state on task switch."
Do a mark_vs_dirty() before both loads and stores.
[1] https://lore.kernel.org/qemu-riscv/72c7503b-0f43-44b8-aa82-fbafed2aac0c@linaro.org/
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240306171932.549549-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The Ztso extension is already ratified, this adds it as a CPU property
and adds various fences throughout the port in order to allow TSO
targets to function on weaker hosts. We need no fences for AMOs as
they're already SC, the places we need barriers are described.
These fences are placed in the RISC-V backend rather than TCG as is
planned for x86-on-arm64 because RISC-V allows heterogeneous (and
likely soon dynamic) hart memory models.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Message-ID: <20240207122256.902627-2-christoph.muellner@vrull.eu>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Add a RISC-V 'virt' machine to the graph. This implementation is a
modified copy of the existing arm machine in arm-virt-machine.c
It contains a virtio-mmio and a generic-pcihost controller. The
generic-pcihost controller hardcodes assumptions from the ARM 'virt'
machine, like ecam and pio_base addresses, so we'll add an extra step to
set its parameters after creating it.
Our command line is incremented with 'aclint' parameters to allow the
machine to run MSI tests.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240217192607.32565-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The 'virt' machine makes assumptions on the Advanced Core-Local
Interruptor, or aclint, based on 'tcg_enabled()' conditionals. This
will impact MSI related tests support when adding a RISC-V 'virt' libqos
machine. The accelerator used in that case, 'qtest', isn't being
accounted for and we'll error out if we try to enable aclint.
Create a new virt_aclint_allowed() helper to gate the aclint code
considering both TCG and 'qtest' accelerators. The error message is
left untouched, mentioning TCG only, because we don't expect the
regular user to be aware of 'qtest'.
We want to add 'qtest' support for aclint only, leaving the TCG specific
bits out of it. This is done by changing the current format we use
today:
if (tcg_enabled()) {
if (s->have_aclint) { - aclint logic - }
else { - non-aclint, TCG logic - }
}
into:
if (virt_aclint_allowed() && s->have_aclint) {
- aclint logic -
} else if (tcg_enabled()) {
- non-aclint, TCG logic -
}
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240217192607.32565-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The SATP register is an SXLEN-bit read/write WARL register. It means that CSR fields are only defined
for a subset of bit encodings, but allow any value to be written while guaranteeing to return a legal
value whenever read (See riscv-privileged-20211203, SATP CSR).
For example on rv64 we are trying to write to SATP CSR val = 0x1000000000000000 (SATP_MODE = 1 - Reserved for standard use)
and after that we are trying to read SATP_CSR. We read from the SATP CSR value = 0x1000000000000000, which is not a correct
operation (return illegal value).
Signed-off-by: Irina Ryapolova <irina.ryapolova@syntacore.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240109145923.37893-1-irina.ryapolova@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Named features are extensions which don't make sense for users to
control and are therefore not exposed on the command line. However,
svade is an extension which makes sense for users to control, so treat
it like a "normal" extension. The default is false, even for the max
cpu type, since QEMU has always implemented hardware A/D PTE bit
updating, so users must opt into svade (or get it from a CPU type
which enables it by default).
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240215223955.969568-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Gate hardware A/D PTE bit updating on {m,h}envcfg.ADUE and only
enable menvcfg.ADUE on reset if svade has not been selected. Now
that we also consider svade, we have four possible configurations:
1) !svade && !svadu
use hardware updating and there's no way to disable it
(the default, which maintains past behavior. Maintaining
the default, even with !svadu is a change that fixes [1])
2) !svade && svadu
use hardware updating, but also provide {m,h}envcfg.ADUE,
allowing software to switch to exception mode
(being able to switch is a change which fixes [1])
3) svade && !svadu
use exception mode and there's no way to switch to hardware
updating
(this behavior change fixes [2])
4) svade && svadu
use exception mode, but also provide {m,h}envcfg.ADUE,
allowing software to switch to hardware updating
(this behavior change fixes [2])
Fixes: 0af3f115e6 ("target/riscv: Add *envcfg.HADE related check in address translation") [1]
Fixes: 48531f5adb ("target/riscv: implement svade") [2]
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240215223955.969568-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The RVA22U64 and RVA22S64 profiles mandates certain extensions that,
until now, we were implying that they were available.
We can't do this anymore since named features also has a riscv,isa
entry. Let's add them to riscv_cpu_named_features[].
Instead of adding one bool for each named feature that we'll always
implement, i.e. can't be turned off, add a 'ext_always_enabled' bool in
cpu->cfg. This bool will be set to 'true' in TCG accel init, and all
named features will point to it. This also means that KVM won't see
these features as always enable, which is our intention.
If any accelerator adds support to disable one of these features, we'll
have to promote them to regular extensions and allow users to disable it
via command line.
After this patch, here's the riscv,isa from a buildroot using the
'rva22s64' CPU:
# cat /proc/device-tree/cpus/cpu@0/riscv,isa
rv64imafdc_zic64b_zicbom_zicbop_zicboz_ziccamoa_ziccif_zicclsm_ziccrse_
zicntr_zicsr_zifencei_zihintpause_zihpm_za64rs_zfhmin_zca_zcd_zba_zbb_
zbs_zkt_ssccptr_sscounterenw_sstvala_sstvecd_svade_svinval_svpbmt#
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20240215223955.969568-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Further discussions after the introduction of rva22 support in QEMU
revealed that what we've been calling 'named features' are actually
regular extensions, with their respective riscv,isa DTs. This is
clarified in [1]. [2] is a bug tracker asking for the profile spec to be
less cryptic about it.
As far as QEMU goes we understand extensions as something that the user
can enable/disable in the command line. This isn't the case for named
features, so we'll have to reach a middle ground.
We'll keep our existing nomenclature 'named features' to refer to any
extension that the user can't control in the command line. We'll also do
the following:
- 'svade' and 'zic64b' flags are renamed to 'ext_svade' and
'ext_zic64b'. 'ext_svade' and 'ext_zic64b' now have riscv,isa strings and
priv_spec versions;
- skip name feature check in cpu_bump_multi_ext_priv_ver(). Now that
named features have a riscv,isa and an entry in isa_edata_arr[] we
don't need to gate the call to cpu_cfg_ext_get_min_version() anymore.
[1] https://github.com/riscv/riscv-profiles/issues/121
[2] https://github.com/riscv/riscv-profiles/issues/142
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240215223955.969568-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Recent changes in options handling removed the 'mmu' default the bare
CPUs had, meaning that we must enable 'mmu' by hand when using the
rva22s64 profile CPU.
Given that this profile is setting a satp mode, it already implies that
we need a 'mmu'. Enable the 'mmu' in this case.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240215223955.969568-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently, the initrd is placed at 128MB, which overlaps with the kernel
when it is large (for example syzbot kernels are). From the kernel side,
there is no reason we could not push the initrd further away in memory
to accommodate large kernels, so move the initrd at 512MB when possible.
The ideal solution would have been to place the initrd based on the
kernel size but we actually can't since the bss size is not known when
the image is loaded by load_image_targphys_as() and the initrd would
then overlap with this section.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240206154042.514698-1-alexghiti@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The original implementation sets $pc to the address read from the jump
vector table first and links $ra with the address of the next instruction
after the updated $pc. After jumping to the updated $pc and executing the
next ret instruction, the program jumps to $ra, which is in the same
function currently executing, which results in an infinite loop.
This commit stores the jump address in a temporary, updates $ra with the
current $pc, and copies the temporary to $pc.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240207081820.28559-1-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The testcase contains :
- `test_idr_reset_value()` :
Checks the reset values of MODER, OTYPER, PUPDR, ODR and IDR.
- `test_gpio_output_mode()` :
Checks that writing a bit in register ODR results in the corresponding
pin rising or lowering, if this pin is configured in output mode.
- `test_gpio_input_mode()` :
Checks that a input pin set high or low externally results
in the pin rising and lowering.
- `test_pull_up_pull_down()` :
Checks that a floating pin in pull-up/down mode is actually high/down.
- `test_push_pull()` :
Checks that a pin set externally is disconnected when configured in
push-pull output mode, and can't be set externally while in this mode.
- `test_open_drain()` :
Checks that a pin set externally high is disconnected when configured
in open-drain output mode, and can't be set high while in this mode.
- `test_bsrr_brr()` :
Checks that writing to BSRR and BRR has the desired result in ODR.
- `test_clock_enable()` :
Checks that GPIO clock is at the right frequency after enabling it.
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240305210444.310665-4-ines.varhol@telecom-paris.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Features supported :
- the 8 STM32L4x5 GPIOs are initialized with their reset values
(except IDR, see below)
- input mode : setting a pin in input mode "externally" (using input
irqs) results in an out irq (transmitted to SYSCFG)
- output mode : setting a bit in ODR sets the corresponding out irq
(if this line is configured in output mode)
- pull-up, pull-down
- push-pull, open-drain
Difference with the real GPIOs :
- Alternate Function and Analog mode aren't implemented :
pins in AF/Analog behave like pins in input mode
- floating pins stay at their last value
- register IDR reset values differ from the real one :
values are coherent with the other registers reset values
and the fact that AF/Analog modes aren't implemented
- setting I/O output speed isn't supported
- locking port bits isn't supported
- ADC function isn't supported
- GPIOH has 16 pins instead of 2 pins
- writing to registers LCKR, AFRL, AFRH and ASCR is ineffective
Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240305210444.310665-2-ines.varhol@telecom-paris.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When ID_AA64MMFR0_EL1.ECV is 0b0010, a new register CNTPOFF_EL2 is
implemented. This is similar to the existing CNTVOFF_EL2, except
that it controls a hypervisor-adjustable offset made to the physical
counter and timer.
Implement the handling for this register, which includes control/trap
bits in SCR_EL3 and CNTHCTL_EL2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240301183219.2424889-8-peter.maydell@linaro.org
For FEAT_ECV, new registers CNTPCTSS_EL0 and CNTVCTSS_EL0 are
defined, which are "self-synchronized" views of the physical and
virtual counts as seen in the CNTPCT_EL0 and CNTVCT_EL0 registers
(meaning that no barriers are needed around accesses to them to
ensure that reads of them do not occur speculatively and out-of-order
with other instructions).
For QEMU, all our system registers are self-synchronized, so we can
simply copy the existing implementation of CNTPCT_EL0 and CNTVCT_EL0
to the new register encodings.
This means we now implement all the functionality required for
ID_AA64MMFR0_EL1.ECV == 0b0001.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240301183219.2424889-7-peter.maydell@linaro.org
The functionality defined by ID_AA64MMFR0_EL1.ECV == 1 is:
* four new trap bits for various counter and timer registers
* the CNTHCTL_EL2.EVNTIS and CNTKCTL_EL1.EVNTIS bits which control
scaling of the event stream. This is a no-op for us, because we don't
implement the event stream (our WFE is a NOP): all we need to do is
allow CNTHCTL_EL2.ENVTIS to be read and written.
* extensions to PMSCR_EL1.PCT, PMSCR_EL2.PCT, TRFCR_EL1.TS and
TRFCR_EL2.TS: these are all no-ops for us, because we don't implement
FEAT_SPE or FEAT_TRF.
* new registers CNTPCTSS_EL0 and NCTVCTSS_EL0 which are
"self-sychronizing" views of the CNTPCT_EL0 and CNTVCT_EL0, meaning
that no barriers are needed around their accesses. For us these
are just the same as the normal views, because all our sysregs are
inherently self-sychronizing.
In this commit we implement the trap handling and permit the new
CNTHCTL_EL2 bits to be written.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240301183219.2424889-6-peter.maydell@linaro.org
Don't allow the guest to write CNTHCTL_EL2 bits which don't exist.
This is not strictly architecturally required, but it is how we've
tended to implement registers more recently.
In particular, bits [19:18] are only present with FEAT_RME,
and bits [17:12] will only be present with FEAT_ECV.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240301183219.2424889-5-peter.maydell@linaro.org
maintainer updates (tests, gdbstub, plugins):
- expand QOS_PATH_MAX_ELEMENT_SIZE to avoid LTO issues
- support fork-follow-mode in gdbstub
- new thread-safe scoreboard API for TCG plugins
- suppress showing opcodes in plugin disassembly
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmXoY7oACgkQ+9DbCVqe
# KkTdTwf8D8nUB+Ee6LuglW36vtd1ETdMfUmfRis7RIBsXZZ0Tg4+8LyfKkNi1vCL
# UMdWQTkSW79RfXr21QEtETokwLZ0CWQMdxDAWfOiz4S+uDgQyBE+lwUsy0mHBmd7
# +J4SQb3adoZ+//9KMJhRU1wL9j3ygpEoKHVJonDObU6K5XuhE18JuBE44q7FqkWl
# 0VhoLDgNxrf2PqT+LLP/O3MFLDXPVKbzrZYQF0IoqBTlcqShCoaykhSwiwCZ4Sqq
# NO9hVwZIOFOcOF4F6ZqRXaZrwERldoBwG+BeIx1ah20vKFVT12y02dQqdP/oKwe+
# /PXFXDdzs4yMOghb4Go6SiKlKT5g4A==
# =s1lF
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 06 Mar 2024 12:38:18 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-maintainer-updates-060324-1' of https://gitlab.com/stsquad/qemu: (29 commits)
target/riscv: honour show_opcodes when disassembling
target/loongarch: honour show_opcodes when disassembling
disas/hppa: honour show_opcodes
disas: introduce show_opcodes
plugins: cleanup codepath for previous inline operation
plugins: remove non per_vcpu inline operation from API
contrib/plugins/howvec: migrate to new per_vcpu API
contrib/plugins/hotblocks: migrate to new per_vcpu API
tests/plugin/bb: migrate to new per_vcpu API
tests/plugin/insn: migrate to new per_vcpu API
tests/plugin/mem: migrate to new per_vcpu API
tests/plugin: add test plugin for inline operations
plugins: add inline operation per vcpu
plugins: implement inline operation relative to cpu_index
plugins: define qemu_plugin_u64
plugins: scoreboard API
tests/tcg: Add two follow-fork-mode tests
gdbstub: Implement follow-fork-mode child
gdbstub: Introduce gdb_handle_detach_user()
gdbstub: Introduce gdb_handle_set_thread_user()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Additionally to the scoreboard, we define a qemu_plugin_u64, which is a
simple struct holding a pointer to a scoreboard, and a given offset.
This allows to have a scoreboard containing structs, without having to
bring offset to operate on a specific field.
Since most of the plugins are simply collecting a sum of per-cpu values,
qemu_plugin_u64 directly support this operation as well.
All inline operations defined later will use a qemu_plugin_u64 as input.
New functions:
- qemu_plugin_u64_add
- qemu_plugin_u64_get
- qemu_plugin_u64_set
- qemu_plugin_u64_sum
New macros:
- qemu_plugin_scoreboard_u64
- qemu_plugin_scoreboard_u64_in_struct
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240304130036.124418-3-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240305121005.3528075-16-alex.bennee@linaro.org>
We introduce a cpu local storage, automatically managed (and extended)
by QEMU itself. Plugin allocate a scoreboard, and don't have to deal
with how many cpus are launched.
This API will be used by new inline functions but callbacks can benefit
from this as well. This way, they can operate without a global lock for
simple operations.
At any point during execution, any scoreboard will be dimensioned with
at least qemu_plugin_num_vcpus entries.
New functions:
- qemu_plugin_scoreboard_find
- qemu_plugin_scoreboard_free
- qemu_plugin_scoreboard_new
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240304130036.124418-2-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240305121005.3528075-15-alex.bennee@linaro.org>
Currently it's not possible to use gdbstub for debugging linux-user
code that runs in a forked child, which is normally done using the `set
follow-fork-mode child` GDB command. Purely on the protocol level, the
missing piece is the fork-events feature.
However, a deeper problem is supporting $Hg switching between different
processes - right now it can do only threads. Implementing this for the
general case would be quite complicated, but, fortunately, for the
follow-fork-mode case there are a few factors that greatly simplify
things: fork() happens in the exclusive section, there are only two
processes involved, and before one of them is resumed, the second one
is detached.
This makes it possible to implement a simplified scheme: the parent and
the child share the gdbserver socket, it's used only by one of them at
any given time, which is coordinated through a separate socketpair. The
processes can read from the gdbserver socket only one byte at a time,
which is not great for performance, but, fortunately, the
follow-fork-mode handling involves only a few messages.
Advertise the fork-events support, and remember whether GDB has it
as well. Implement the state machine that is initialized on fork(),
decides the current owner of the gdbserver socket, and is terminated
when one of the two processes is detached. The logic for the parent and
the child is the same, only the initial state is different.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240219141628.246823-12-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240305121005.3528075-13-alex.bennee@linaro.org>
We "fixed" a bug with LTO builds with 100c459f19 (tests/qtest: bump
up QOS_PATH_MAX_ELEMENT_SIZE) but it seems it has triggered again.
The array is sized according to the maximum anticipated length of a
path on the graph. However, the worst case for a depth-first search is
to push all nodes on the graph. So it's not really LTO, it depends on
the ordering of the constructors.
Lets be more assertive raising QOS_PATH_MAX_ELEMENT_SIZE to make it go
away again.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1186 (again)
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240305121005.3528075-2-alex.bennee@linaro.org>
Before v2.12, the implementation of serial ports was limited to
a value of MAX_SERIAL_PORTS = 4. We now dynamically allocate
the data structures for serial ports, so this limit is no longer
present, but the documentation for the -serial options still reads:
"This option can be used several times to simulate up to 4 serial ports."
Update to "This option can be used several times to simulate
multiple serial ports." to avoid misleading.
Signed-off-by: Steven Shen <steven.shen@jaguarmicro.com>
Message-id: 20240305013016.2268-1-steven.shen@jaguarmicro.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The qatomic_cmpxchg() and qatomic_cmpxchg__nocheck() macros have
a comment that reads:
Returns the eventual value, failed or not
This is somewhere between cryptic and wrong, since the value actually
returned is the value that was in memory before the cmpxchg. Reword
to match how we describe these macros in atomics.rst.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-id: 20240223182035.1048541-1-peter.maydell@linaro.org
Instantiate the whole clock tree and using the Clock multiplexers and
the PLLs defined in the previous commits. This allows to statically
define the clock tree and easily follow the clock signal from one end to
another.
Also handle three-phase reset now that we have defined a known base
state for every object.
(Reset handling based on hw/misc/zynq_sclr.c)
Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240303140643.81957-5-arnaud.minier@telecom-paris.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds loopback for sent characters, sent BREAK,
and modem-control signals.
Loopback of send and modem-control is often used for uart
self tests in real hardware but missing from current pl011
model, resulting in self-test failures when running in QEMU.
This implementation matches what is observed in real pl011
hardware placed in loopback mode:
1. Input characters and BREAK events from serial backend
are ignored, but
2. Both TX characters and BREAK events are still sent to
serial backend, in addition to be looped back to RX.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Message-id: 20240227054855.44204-1-tong.ho@amd.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A few deficiencies in the current device model need to be noted.
1. FIFOs are not used. All sends and receives are done directly.
2. Repeated starts are not emulated. Repeated starts can be triggered in real
hardware by sending a new read transfer request in the window time between
transfer active set of write transfer request and done bit set of the same.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240224191038.2409945-2-rayhan.faizel@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migartion pull request for 20240304
- Bryan's fix on multifd compression level API
- Fabiano's mapped-ram series (base + multifd only)
- Steve's amend on cpr document in qapi/
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZeUjKhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbv5QD/ZexBUsmZA5qyxgGvZ2yvlUBEGNOvtmKY
# kRdiYPU7khMA/0N43rn4LcqKCoq4+T+EAnYizGjIyhH/7BRUyn4DUxgO
# =AeEn
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Mar 2024 01:26:02 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (27 commits)
migration/multifd: Document two places for mapped-ram
tests/qtest/migration: Add a multifd + mapped-ram migration test
migration/multifd: Add mapped-ram support to fd: URI
migration/multifd: Support incoming mapped-ram stream format
migration/multifd: Support outgoing mapped-ram stream format
migration/multifd: Prepare multifd sync for mapped-ram migration
migration/multifd: Add incoming QIOChannelFile support
migration/multifd: Add outgoing QIOChannelFile support
migration/multifd: Add a wrapper for channels_created
migration/multifd: Allow receiving pages without packets
migration/multifd: Allow multifd without packets
migration/multifd: Decouple recv method from pages
migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data
tests/qtest/migration: Add tests for mapped-ram file-based migration
migration/ram: Add incoming 'mapped-ram' migration
migration/ram: Add outgoing 'mapped-ram' migration
migration: Add mapped-ram URI compatibility check
migration/ram: Introduce 'mapped-ram' migration capability
migration/qemu-file: add utility methods for working with seekable channels
io: fsync before closing a file channel
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# migration/ram.c
Currently [-QemuCocoaView handleEventLocked:] parses the passed event,
stores operations to be done to variables, and perform them according
to the variables. This construct will be cluttered with variables and
hard to read when we need more different operations for different
events.
Split the methods so that we can call appropriate methods depending on
events instead of relying on variables.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Rene Engel <ReneEngel80@emailn.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240224-cocoa-v12-1-e89f70bdda71@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
macOS Sonoma changes the NSView.clipsToBounds to false by default
where it was true in earlier version of macOS. This causes the window
contents to be occluded by the frame at the top of the window. This
fixes the issue by conditionally compiling the clipping on Sonoma to
true. NSView only exposes the clipToBounds in macOS 14 and so has
to be fixed via conditional compilation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1994
Signed-off-by: David Parsons <dave@daveparsons.net>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240224140620.39200-1-dave@daveparsons.net>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Provides a new display option, zoom-interpolation, that enables
interpolation of the scaled display when zoom-to-fit is enabled.
Also provides a corresponding view menu item to allow this to be toggled
as required.
Signed-off-by: Carwyn Ellis <carwynellis@gmail.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20231110161729.36822-2-carwynellis@gmail.com>
[PMD: QAPI @zoom-interpolation since 9.0]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The macOS jobs in our CI recently started failing, complaining that
the distutils module is not available anymore. And indeed, according to
https://peps.python.org/pep-0632/ it's been deprecated since a while
and now likely got removed in recent Python versions.
Fortunately, we only use it for a version check via LooseVersion here
which we don't really need anymore - according to Repology.org, these
are the versions of sphinx-rtd-theme that are currently used by the
various distros:
centos_stream_8: 0.3.1
centos_stream_9: 0.5.1
fedora_38: 1.1.1
fedora_39: 1.2.2
freebsd: 1.0.0
haikuports_master: 1.2.1
openbsd: 1.2.2
opensuse_leap_15_5: 0.5.1
pkgsrc_current: 2.0.0
debian_11: 0.5.1
debian_12: 1.2.0
ubuntu_20_04: 0.4.3
ubuntu_22_04: 1.0.0
ubuntu_24_04: 2.0.0
So except for CentOS 8, all distros are using a newer version of
sphinx-rtd-theme, and for CentOS 8 we don't support compiling with
the Sphinx of the distro anymore anyway, since it's based on the
Python 3.6 interpreter there. For compiling on CentOS 8, you have
to use the alternative Python 3.8 interpreter which comes without
Sphinx, so that needs the Sphinx installed via pip in the venv
instead, and that is using a newer version, too, according to our
pythondeps.toml file.
Thus we can simply drop the version check now to get rid of the
distutils dependency here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20240304130403.129543-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Simplify the exec migration code by using list utility functions.
As a side effect, this also fixes a minor memory leak. On function return,
"g_auto(GStrv) argv" frees argv and each element, which is wrong, because
the function does not own the individual elements. To compensate, the code
uses g_steal_pointer which NULLs argv and prevents the destructor from
running, but argv is leaked.
Fixes: cbab4face5 ("migration: convert exec backend ...")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20240227153321.467343-4-armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Avoid "JSON" when talking about the QAPI schema syntax. Capitalize
QEMU. Don't claim all HMP commands live in monitor/hmp-cmds.c (this
was never true). Fix punctuation and drop inappropriate "the" here
and there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240227115617.237875-3-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The tutorial doesn't match reality since at least 2013. Repairing it
involves fixing the following issues:
* Update for commit 6d32717155 (aio / timers: Remove alarm timers):
replace the broken examples. Instead of having one for returning a
struct and another for returning a list of structs, do just one for
the latter. This resolves the FIXME added in commit
e218052f92 (aio / timers: De-document -clock) back in 2014.
* Update for commit 895a2a80e0 (qapi: Use 'struct' instead of 'type'
in schema).
* Update for commit 3313b6124b (qapi: add qapi2texi script): add
required documentation to the schema snippets, and drop section
"Command Documentation".
* Update for commit a3c45b3e62 (qapi: New special feature flag
"unstable"): supply the required feature, deemphasize the x- prefix.
* Update for commit dd98234c05 (qapi: introduce x-query-roms QMP
command): rephrase from "add new command" to "examine existing
command".
* Update for commit 9492718b7c (qapi misc: Elide redundant has_FOO in
generated C): hello-world's message argument no longer comes with a
has_message, add a second argument that does.
* Update for moved and renamed files.
While there, update QMP version output to current output.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240227115617.237875-2-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Whitespace tidied up, typo fixed]
Documentation claims the command can "return NULL". "NULL" doesn't
exist in JSON. "null" does, but the command returns lists, and null
isn't. Correct documentation to "return an empty list".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240227113921.236097-13-armbru@redhat.com>
"Returns:" sections of guest-fsfreeze-freeze and
guest-fsfreeze-freeze-list describe both command behavior and success
response. Move behavior out, so "Returns:" is only about success
response.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240227113921.236097-12-armbru@redhat.com>
We use section "Returns" for documenting both success and error
response of commands.
I intend to generate better command success response documentation.
Easier when "Returns" documents just he success response.
Create new section tag "Errors". The next two commits will move error
response documentation from "Returns" sections to "Errors" sections.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240227113921.236097-4-armbru@redhat.com>
Add the missing 64-bit hppa firmware blob so that it gets installed.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 7c0dfcf939 ("target/hppa: Update SeaBIOS-hppa to version 16")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
When calculating the IOR for the exception handlers, the current
unwind_breg value is needed on 64-bit hppa machines.
Restore that value by calling cpu_restore_state() earlier, which in turn
calls hppa_restore_state_to_opc() which restores the unwind_breg for the
current instruction.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 3824e0d643 ("target/hppa: Export function hppa_set_ior_and_isr()")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Unaligned 64-bit accesses were found in Linux to clobber carry bits,
resulting in bad results if an arithmetic operation involving a
carry bit was executed after an unaligned 64-bit operation.
hppa 2.0 defines additional carry bits in PSW register bits 32..39.
When restoring PSW after executing an unaligned instruction trap, those
bits were not cleared and ended up to be active all the time. Since there
are no bits other than the upper carry bits needed in the upper 32 bit of
env->psw and since those are stored in env->psw_cb, just clear the entire
upper 32 bit when storing psw to solve the problem unconditionally.
Fixes: 931adff314 ("target/hppa: Update cpu_hppa_get/put_psw for hppa64")
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Add a regression test for a recently fixed issue, where shmat()
desynced the guest and the host view of the address space and caused
open("/proc/self/maps") to SEGV.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <jwyuvao4apydvykmsnvacwshdgy3ixv7qvkh4dbxm3jkwgnttw@k4wpaayou7oq>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
pull-loongarch-20240229
V2: fix build error on mipsel
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZeBrwAAKCRBAov/yOSY+
# 33YXA/4+A5Bpe/3+mSAWZSUlluGTqUi0ILBYRMyX1RXovMx4uCRGr7PXzAf03yKS
# MZzlVzTuOK69WmTm/iTdYWOxkXisC3gzxL/wm8hP4lzh4c0dHrHRsKHqq6gR3+t2
# ojdZn7TefeflnNqIhxXxgxb1OETofhBNnBJ74pvqxO7XV5SWnA==
# =J2Kb
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 29 Feb 2024 11:34:24 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240229' of https://gitlab.com/gaosong/qemu:
loongarch: Change the UEFI loading mode to loongarch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If we receive a file descriptor that points to a regular file, there's
nothing stopping us from doing multifd migration with mapped-ram to
that file.
Enable the fd: URI to work with multifd + mapped-ram.
Note that the fds passed into multifd are duplicated because we want
to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The
original fd doesn't need to be duplicated because monitor_get_fd()
transfers ownership to the caller.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240229153017.2221-23-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
For the incoming mapped-ram migration we need to read the ramblock
headers, get the pages bitmap and send the host address of each
non-zero page to the multifd channel thread for writing.
Usage on HMP is:
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability mapped-ram on
(qemu) migrate_incoming file:migfile
(the ram.h include needs to move because we've been previously relying
on it being included from migration.c. Now file.h will start including
multifd.h before migration.o is processed)
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-22-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The new mapped-ram stream format uses a file transport and puts ram
pages in the migration file at their respective offsets and can be
done in parallel by using the pwritev system call which takes iovecs
and an offset.
Add support to enabling the new format along with multifd to make use
of the threading and page handling already in place.
This requires multifd to stop sending headers and leaving the stream
format to the mapped-ram code. When it comes time to write the data, we
need to call a version of qio_channel_write that can take an offset.
Usage on HMP is:
(qemu) stop
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability mapped-ram on
(qemu) migrate_set_parameter max-bandwidth 0
(qemu) migrate_set_parameter multifd-channels 8
(qemu) migrate file:migfile
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-21-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The mapped-ram migration can be performed live or non-live, but it is
always asynchronous, i.e. the source machine and the destination
machine are not migrating at the same time. We only need some pieces
of the multifd sync operations.
multifd_send_sync_main()
------------------------
Issued by the ram migration code on the migration thread, causes the
multifd send channels to synchronize with the migration thread and
makes the sending side emit a packet with the MULTIFD_FLUSH flag.
With mapped-ram we want to maintain the sync on the sending side
because that provides ordering between the rounds of dirty pages when
migrating live.
MULTIFD_FLUSH
-------------
On the receiving side, the presence of the MULTIFD_FLUSH flag on a
packet causes the receiving channels to start synchronizing with the
main thread.
We're not using packets with mapped-ram, so there's no MULTIFD_FLUSH
flag and therefore no channel sync on the receiving side.
multifd_recv_sync_main()
------------------------
Issued by the migration thread when the ram migration flag
RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
on the receiving side to start synchronizing with the recv
channels. Due to compatibility, this is also issued when
RAM_SAVE_FLAG_EOS is received.
For mapped-ram we only need to synchronize the channels at the end of
migration to avoid doing cleanup before the channels have finished
their IO.
Make sure the multifd syncs are only issued at the appropriate times.
Note that due to pre-existing backward compatibility issues, we have
the multifd_flush_after_each_section property that can cause a sync to
happen at EOS. Since the EOS flag is needed on the stream, allow
mapped-ram to just ignore it.
Also emit an error if any other unexpected flags are found on the
stream.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240229153017.2221-20-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Allow multifd to open file-backed channels. This will be used when
enabling the mapped-ram migration stream format which expects a
seekable transport.
The QIOChannel read and write methods will use the preadv/pwritev
versions which don't update the file offset at each call so we can
reuse the fd without re-opening for every channel.
Contrary to the socket migration, the file migration doesn't need an
asynchronous channel creation process, so expose
multifd_channel_connect() and call it directly.
Note that this is just setup code and multifd cannot yet make use of
the file channels.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240229153017.2221-18-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently multifd does not need to have knowledge of pages on the
receiving side because all the information needed is within the
packets that come in the stream.
We're about to add support to mapped-ram migration, which cannot use
packets because it expects the ramblock section in the migration file
to contain only the guest pages data.
Add a data structure to transfer pages between the ram migration code
and the multifd receiving threads.
We don't want to reuse MultiFDPages_t for two reasons:
a) multifd threads don't really need to know about the data they're
receiving.
b) the receiving side has to be stopped to load the pages, which means
we can experiment with larger granularities than page size when
transferring data.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-16-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
For the upcoming support to the new 'mapped-ram' migration stream
format, we cannot use multifd packets because each write into the
ramblock section in the migration file is expected to contain only the
guest pages. They are written at their respective offsets relative to
the ramblock section header.
There is no space for the packet information and the expected gains
from the new approach come partly from being able to write the pages
sequentially without extraneous data in between.
The new format also simply doesn't need the packets and all necessary
information can be taken from the standard migration headers with some
(future) changes to multifd code.
Use the presence of the mapped-ram capability to decide whether to
send packets.
This only moves code under multifd_use_packets(), it has no effect for
now as mapped-ram cannot yet be enabled with multifd.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-15-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Implement the outgoing migration side for the 'mapped-ram' capability.
A bitmap is introduced to track which pages have been written in the
migration file. Pages are written at a fixed location for every
ramblock. Zero pages are ignored as they'd be zero in the destination
migration as well.
The migration stream is altered to put the dirty pages for a ramblock
after its header instead of having a sequential stream of pages that
follow the ramblock headers.
Without mapped-ram (current): With mapped-ram (new):
--------------------- --------------------------------
| ramblock 1 header | | ramblock 1 header |
--------------------- --------------------------------
| ramblock 2 header | | ramblock 1 mapped-ram header |
--------------------- --------------------------------
| ... | | padding to next 1MB boundary |
--------------------- | ... |
| ramblock n header | --------------------------------
--------------------- | ramblock 1 pages |
| RAM_SAVE_FLAG_EOS | | ... |
--------------------- --------------------------------
| stream of pages | | ramblock 2 header |
| (iter 1) | --------------------------------
| ... | | ramblock 2 mapped-ram header |
--------------------- --------------------------------
| RAM_SAVE_FLAG_EOS | | padding to next 1MB boundary |
--------------------- | ... |
| stream of pages | --------------------------------
| (iter 2) | | ramblock 2 pages |
| ... | | ... |
--------------------- --------------------------------
| ... | | ... |
--------------------- --------------------------------
| RAM_SAVE_FLAG_EOS |
--------------------------------
| ... |
--------------------------------
where:
- ramblock header: the generic information for a ramblock, such as
idstr, used_len, etc.
- ramblock mapped-ram header: the new information added by this
feature: bitmap of pages written, bitmap size and offset of pages
in the migration file.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-10-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Add a new migration capability 'mapped-ram'.
The core of the feature is to ensure that RAM pages are mapped
directly to offsets in the resulting migration file instead of being
streamed at arbitrary points.
The reasons why we'd want such behavior are:
- The resulting file will have a bounded size, since pages which are
dirtied multiple times will always go to a fixed location in the
file, rather than constantly being added to a sequential
stream. This eliminates cases where a VM with, say, 1G of RAM can
result in a migration file that's 10s of GBs, provided that the
workload constantly redirties memory.
- It paves the way to implement O_DIRECT-enabled save/restore of the
migration stream as the pages are ensured to be written at aligned
offsets.
- It allows the usage of multifd so we can write RAM pages to the
migration file in parallel.
For now, enabling the capability has no effect. The next couple of
patches implement the core functionality.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-8-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Make sure the data is flushed to disk before closing file
channels. This is to ensure data is on disk and not lost in the event
of a host crash.
This is currently being implemented to affect the migration code when
migrating to a file, but all QIOChannelFile users should benefit from
the change.
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Acked-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-6-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Some minor cleanups and documentation for multifd_recv_sync_main.
Use thread_count as done in other parts of the code. Remove p->id from
the multifd_recv_state sync, since that is global and not tied to a
channel. Add documentation for the sync steps.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Commit ffda5db65a ("io/channel-tls: fix handling of bigger read buffers")
changed the behavior of the TLS io channels to schedule a second reading
attempt if there is still incoming data pending. This caused a regression
with backends like the sclpconsole that check in their read function that
the sender does not try to write more bytes to it than the device can
currently handle.
The problem can be reproduced like this:
1) In one terminal, do this:
mkdir qemu-pki
cd qemu-pki
openssl genrsa 2048 > ca-key.pem
openssl req -new -x509 -nodes -days 365000 -key ca-key.pem -out ca-cert.pem
# enter some dummy value for the cert
openssl genrsa 2048 > server-key.pem
openssl req -new -x509 -nodes -days 365000 -key server-key.pem \
-out server-cert.pem
# enter some other dummy values for the cert
gnutls-serv --echo --x509cafile ca-cert.pem --x509keyfile server-key.pem \
--x509certfile server-cert.pem -p 8338
2) In another terminal, do this:
wget https://download.fedoraproject.org/pub/fedora-secondary/releases/39/Cloud/s390x/images/Fedora-Cloud-Base-39-1.5.s390x.qcow2
qemu-system-s390x -nographic -nodefaults \
-hda Fedora-Cloud-Base-39-1.5.s390x.qcow2 \
-object tls-creds-x509,id=tls0,endpoint=client,verify-peer=false,dir=$PWD/qemu-pki \
-chardev socket,id=tls_chardev,host=localhost,port=8338,tls-creds=tls0 \
-device sclpconsole,chardev=tls_chardev,id=tls_serial
QEMU then aborts after a second or two with:
qemu-system-s390x: ../hw/char/sclpconsole.c:73: chr_read: Assertion
`size <= SIZE_BUFFER_VT220 - scon->iov_data_len' failed.
Aborted (core dumped)
It looks like the second read does not trigger the chr_can_read() function
to be called before the second read, which should normally always be done
before sending bytes to a character device to see how much it can handle,
so the s->max_size in tcp_chr_read() still contains the old value from the
previous read. Let's make sure that we use the up-to-date value by calling
tcp_chr_read_poll() again here.
Fixes: ffda5db65a ("io/channel-tls: fix handling of bigger read buffers")
Buglink: https://issues.redhat.com/browse/RHEL-24614
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240229104339.42574-1-thuth@redhat.com>
Reviewed-by: Antoine Damhet <antoine.damhet@blade-group.com>
Tested-by: Antoine Damhet <antoine.damhet@blade-group.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since Windows text files use CRLFs for all \n, the Windows version of QEMU
inserts a CR in the PCAP stream when a LF is encountered when using USB PCAP
files. This is due to the fact that the PCAP file is opened as TEXT instead
of BINARY.
To show an example, when using a very common protocol to USB disks, the BBB
protocol uses a 10-byte command packet. For example, the READ_CAPACITY(10)
command will have a command block length of 10 (0xA). When this 10-byte
command (part of the 31-byte CBW) is placed into the PCAP file, the Windows
file manager inserts a 0xD before the 0xA, turning the 31-byte CBW into a
32-byte CBW.
Actual CBW:
0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0a 25 USBC...........%
0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...............
PCAP CBW
0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0d 0a USBC............
0050 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 %..............
I believe simply opening the PCAP file as BINARY instead of TEXT will fix
this issue.
Resolves: https://bugs.launchpad.net/qemu/+bug/2054889
Signed-off-by: Benjamin David Lunt <benlunt@fysnet.net>
Message-ID: <000101da6823$ce1bbf80$6a533e80$@fysnet.net>
[thuth: Break long line to avoid checkpatch.pl error]
Signed-off-by: Thomas Huth <thuth@redhat.com>
When using "--without-default-devices", the ARM_GICV3_TCG and ARM_GIC_KVM
settings currently get disabled, though the arm virt machine is only of
very limited use in that case. This also causes the migration-test to
fail in such builds. Let's make sure that we always keep the GIC switches
enabled in the --without-default-devices builds, too.
Message-ID: <20240221110059.152665-1-thuth@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In qvring_init() we're writing vq->used->avail_event at "vq->used + 2 +
array_size". The struct pointed by vq->used is, from virtio_ring.h
Linux header):
* // A ring of used descriptor heads with free-running index.
* __virtio16 used_flags;
* __virtio16 used_idx;
* struct vring_used_elem used[num];
* __virtio16 avail_event_idx;
So 'flags' is the word right at vq->used. 'idx' is vq->used + 2. We need
to skip 'used_idx' by adding + 2 bytes, and then sum the vector size, to
reach avail_event_idx. An example on how to properly access this field
can be found in qvirtqueue_kick():
avail_event = qvirtio_readw(d, qts, vq->used + 4 +
sizeof(struct vring_used_elem) * vq->size);
This error was detected when enabling the RISC-V 'virt' libqos machine.
The 'idx' test from vhost-user-blk-test.c errors out with a timeout in
qvirtio_wait_used_elem(). The timeout happens because when processing
the first element, 'avail_event' is read in qvirtqueue_kick() as non-zero
because we didn't initialize it properly (and the memory at that point
happened to be non-zero). 'idx' is 0.
All of this makes this condition fail because "idx - avail_event" will
overflow and be non-zero:
/* < 1 because we add elements to avail queue one by one */
if ((flags & VRING_USED_F_NO_NOTIFY) == 0 &&
(!vq->event || (uint16_t)(idx-avail_event) < 1)) {
d->bus->virtqueue_kick(d, vq);
}
As a result the virtqueue is never kicked and we'll timeout waiting for it.
Fixes: 1053587c3f ("libqos: Added EVENT_IDX support")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240217192607.32565-3-dbarboza@ventanamicro.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The loop isn't setting the values for the last element. Every other
element is being initialized with addr = 0, flags = VRING_DESC_F_NEXT
and next = i + 1. The last elem is never touched.
This became a problem when enabling a RISC-V 'virt' libqos machine in
the 'indirect' test of virti-blk-test.c. The 'flags' for the last
element will end up being an odd number (since we didn't touch it).
Being an odd number it will be mistaken by VRING_DESC_F_NEXT, which
happens to be 1.
Deep into hw/virt/virtio.c, in virtqueue_split_pop(), into
virtqueue_split_read_next_desc(), a check for VRING_DESC_F_NEXT will be
made to see if we're supposed to chain. The code will keep up chaining
in the last element because the uninitialized value happens to be odd.
We'll error out right after that because desc->next (which is also
uninitialized) will be >= max. A VIRTQUEUE_READ_DESC_ERROR will be
returned, with an error message like this in the stderr:
qemu-system-riscv64: Desc next is 49391
Since we never returned, we'll end up timing out at qvirtio_wait_used_elem():
ERROR:../tests/qtest/libqos/virtio.c:236:qvirtio_wait_used_elem:
assertion failed: (g_get_monotonic_time() - start_time <= timeout_us)
The root cause is using uninitialized values from guest_alloc() in
qvring_indirect_desc_setup(). There's no guarantee that the memory pages
retrieved will be zeroed, so we can't make assumptions. In fact, commit
5b4f72f5e8 ("tests/qtest: properly initialise the vring used idx") fixed a
similar problem stating "It is probably not wise to assume guest memory
is zeroed anyway". I concur.
Initialize all elems in qvring_indirect_desc_setup().
Fixes: f294b029aa ("libqos: Added indirect descriptor support to virtio implementation")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240217192607.32565-2-dbarboza@ventanamicro.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The kernel abi was changed with
commit d23b77953f5a4fbf94c05157b186aac2a247ae32
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Wed Jan 17 12:43:08 2024 +0800
LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE
during the v6.8 cycle.
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is the only case in which we expect to have no host memory backing
for a guest memory page, because in general linux user processes cannot
map any pages in the top half of the 64-bit address space.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2170
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The variables uext_opc and sext_opc are used without initialization if
TCG_TARGET_extract_i{32,64}_valid returns false. The result, depending
on the compiler, might be the generation of extract and sextract opcodes
with invalid offset and count, or just random data in the TCG opcode
stream.
Fixes: ceb9ee06b7 ("tcg/optimize: Handle TCG_COND_TST{EQ,NE}", 2024-02-03)
Cc: Richard Henderson <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240228110641.287205-1-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The assertion was never correct, because the alignment is a composite
of the image alignment and SHMLBA. Even if the image alignment didn't
match the image address, an assertion would not be correct -- more
appropriate would be an error message about an ill formed image. But
the image cannot be held to SHMLBA under any circumstances.
Fixes: ee94743034 ("linux-user: completely re-write init_guest_space")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2157
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reported-by: Alexey Sheplyakov <asheplyakov@yandex.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This option controls the host page size. From the mis-usage in
our own testsuite, this is easily confused with guest page size.
The only thing that occurs when changing the host page size is
that stuff breaks, because one cannot actually change the host
page size. Therefore reject all but the no-op setting as part
of the deprecation process.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Helge Deller <deller@gmx.de>
Message-Id: <20240102015808.132373-27-richard.henderson@linaro.org>
For the cases for which the host mmap succeeds, but does
not yield the desired address, use do_munmap to restore
the reserved_va memory reservation.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This removes a hidden use of qemu_host_page_size, hoisting
two uses of qemu_real_host_page_size to a local variable.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Helge Deller <deller@gmx.de>
On i386, after fixing the page walking code to work with pages in
MMIO memory (specifically CXL emulated interleaved memory),
a crash was seen in an interrupt handling path.
Useful part of backtrace
7 0x0000555555ab1929 in bql_lock_impl (file=0x555556049122 "../../accel/tcg/cputlb.c", line=2033) at ../../system/cpus.c:524
8 bql_lock_impl (file=file@entry=0x555556049122 "../../accel/tcg/cputlb.c", line=line@entry=2033) at ../../system/cpus.c:520
9 0x0000555555c9f7d6 in do_ld_mmio_beN (cpu=0x5555578e0cb0, full=0x7ffe88012950, ret_be=ret_be@entry=0, addr=19595792376, size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at ../../accel/tcg/cputlb.c:2033
10 0x0000555555ca0fbd in do_ld_8 (cpu=cpu@entry=0x5555578e0cb0, p=p@entry=0x7ffff4efd1d0, mmu_idx=<optimized out>, type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:2356
11 0x0000555555ca341f in do_ld8_mmu (cpu=cpu@entry=0x5555578e0cb0, addr=addr@entry=19595792376, oi=oi@entry=52, ra=0, ra@entry=52, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2439
12 0x0000555555ca5f59 in cpu_ldq_mmu (ra=52, oi=52, addr=19595792376, env=0x5555578e3470) at ../../accel/tcg/ldst_common.c.inc:169
13 cpu_ldq_le_mmuidx_ra (env=0x5555578e3470, addr=19595792376, mmu_idx=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/ldst_common.c.inc:301
14 0x0000555555b4b5fc in ptw_ldq (ra=0, in=0x7ffff4efd320) at ../../target/i386/tcg/sysemu/excp_helper.c:98
15 ptw_ldq (ra=0, in=0x7ffff4efd320) at ../../target/i386/tcg/sysemu/excp_helper.c:93
16 mmu_translate (env=env@entry=0x5555578e3470, in=0x7ffff4efd3e0, out=0x7ffff4efd3b0, err=err@entry=0x7ffff4efd3c0, ra=ra@entry=0) at ../../target/i386/tcg/sysemu/excp_helper.c:174
17 0x0000555555b4c4b3 in get_physical_address (ra=0, err=0x7ffff4efd3c0, out=0x7ffff4efd3b0, mmu_idx=0, access_type=MMU_DATA_LOAD, addr=18446741874686299840, env=0x5555578e3470) at ../../target/i386/tcg/sysemu/excp_helper.c:580
18 x86_cpu_tlb_fill (cs=0x5555578e0cb0, addr=18446741874686299840, size=<optimized out>, access_type=MMU_DATA_LOAD, mmu_idx=0, probe=<optimized out>, retaddr=0) at ../../target/i386/tcg/sysemu/excp_helper.c:606
19 0x0000555555ca0ee9 in tlb_fill (retaddr=0, mmu_idx=0, access_type=MMU_DATA_LOAD, size=<optimized out>, addr=18446741874686299840, cpu=0x7ffff4efd540) at ../../accel/tcg/cputlb.c:1315
20 mmu_lookup1 (cpu=cpu@entry=0x5555578e0cb0, data=data@entry=0x7ffff4efd540, mmu_idx=0, access_type=access_type@entry=MMU_DATA_LOAD, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:1713
21 0x0000555555ca2c61 in mmu_lookup (cpu=cpu@entry=0x5555578e0cb0, addr=addr@entry=18446741874686299840, oi=oi@entry=32, ra=ra@entry=0, type=type@entry=MMU_DATA_LOAD, l=l@entry=0x7ffff4efd540) at ../../accel/tcg/cputlb.c:1803
22 0x0000555555ca3165 in do_ld4_mmu (cpu=cpu@entry=0x5555578e0cb0, addr=addr@entry=18446741874686299840, oi=oi@entry=32, ra=ra@entry=0, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2416
23 0x0000555555ca5ef9 in cpu_ldl_mmu (ra=0, oi=32, addr=18446741874686299840, env=0x5555578e3470) at ../../accel/tcg/ldst_common.c.inc:158
24 cpu_ldl_le_mmuidx_ra (env=env@entry=0x5555578e3470, addr=addr@entry=18446741874686299840, mmu_idx=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/ldst_common.c.inc:294
25 0x0000555555bb6cdd in do_interrupt64 (is_hw=1, next_eip=18446744072399775809, error_code=0, is_int=0, intno=236, env=0x5555578e3470) at ../../target/i386/tcg/seg_helper.c:889
26 do_interrupt_all (cpu=cpu@entry=0x5555578e0cb0, intno=236, is_int=is_int@entry=0, error_code=error_code@entry=0, next_eip=next_eip@entry=0, is_hw=is_hw@entry=1) at ../../target/i386/tcg/seg_helper.c:1130
27 0x0000555555bb87da in do_interrupt_x86_hardirq (env=env@entry=0x5555578e3470, intno=<optimized out>, is_hw=is_hw@entry=1) at ../../target/i386/tcg/seg_helper.c:1162
28 0x0000555555b5039c in x86_cpu_exec_interrupt (cs=0x5555578e0cb0, interrupt_request=<optimized out>) at ../../target/i386/tcg/sysemu/seg_helper.c:197
29 0x0000555555c94480 in cpu_handle_interrupt (last_tb=<synthetic pointer>, cpu=0x5555578e0cb0) at ../../accel/tcg/cpu-exec.c:844
Peter identified this as being due to the BQL already being
held when the page table walker encounters MMIO memory and attempts
to take the lock again. There are other examples of similar paths
TCG, so this follows the approach taken in those of simply checking
if the lock is already held and if it is, don't take it again.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240219173153.12114-4-Jonathan.Cameron@huawei.com>
[rth: Use BQL_LOCK_GUARD]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
By unprotecting regions, we re-instate writability and
unify regions that have been split, which may reduce
the total number of regions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than creating new data structures for vma,
rely on the IntervalTree used by walk_memory_regions.
Use PAGE_* constants, per the page table api, rather
than PROT_* constants, per the mmap api.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the flags that we've already saved in order to test
accessibility. Use g2h_untagged and compare guest memory
directly instead of copy_from_user.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We do not need to copy pages from guest memory before writing
them out. Because vmas are contiguous in host memory, we can
write them in one go.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes a bug in which write_note() wrote namesz_rounded
and datasz_rounded bytes, even though name and data
pointers contain only the unrounded number of bytes.
Instead of many small writes, allocate a block to contain all
of the elf headers and all of the notes. Copy the data into the
block piecemeal and the write it to the file as a chunk.
This also avoids the need to lseek forward for alignment.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Verify the size of the corefile vs the rlimit before
opening and creating the core file at all.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Swap the ordering of vma_init and open. This will be necessary
for further changes, and adjusts the error cleanup path. Narrow
the scope of corefile, as the variable can be freed immediately
after use in open().
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On the off-chance that one of the cleanup functions changes
errno, latch the errno that we want to return beforehand.
Flush errno to 0 upon success, rather than at the beginning.
No need to avoid negation of 0.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Ignoring the fact that g_malloc cannot fail, the structure
is quite small and might as well be allocated locally.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In fill_note_info, there were unnecessary checks for
success of g_new/g_malloc. But these structures do not
need to be dyamically allocated at all, and can in fact
be statically allocated within the parent structure.
This removes all error paths from fill_note_info, so
change the return type to void.
Change type of signr to match both caller (elf_core_dump)
and callee (fill_prstatus), which both use int for signr.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not dump core at all if getrlimit fails; this ensures
that dumpsize is valid throughout the function, not just
for the initial test vs rlim_cur.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The UEFI loading mode in loongarch is very different
from that in other architectures:loongarch's UEFI code
is in rom, while other architectures' UEFI code is in flash.
loongarch UEFI can be loaded as follows:
-machine virt,pflash=pflash0-format
-bios ./QEMU_EFI.fd
Other architectures load UEFI using the following methods:
-machine virt,pflash0=pflash0-format,pflash1=pflash1-format
loongarch's UEFI loading method makes qemu and libvirt incompatible
when using NVRAM, and the cost of loongarch's current loading method
far outweighs the benefits, so we decided to use the same UEFI loading
scheme as other architectures.
Cc: Andrea Bolognani <abologna@redhat.com>
Cc: maobibo@loongson.cn
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Song Gao <gaosong@loongson.cn>
Cc: zhaotianrui@loongson.cn
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <0bd892aa9b88e0f4cc904cb70efd0251fc1cde29.1708336919.git.lixianglai@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Migration pull request
- Fabiano's fixed-ram patches (1-5 only)
- Peter's cleanups on multifd tls IOC referencing
- Steve's cpr patches for vfio (migration patches only)
- Fabiano's fix on mbps stats racing with COMPLETE state
- Fabiano's fix on return path thread hang
# -----BEGIN PGP SIGNATURE-----
#
# iIcEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZd7AbhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbg0gDyA3Vg3pIqCJ+u+hLZ+QKxY/pnu8Y5kF+E
# HK2IdslQUQD+OX4ATUnl+CGMiVX9fjs1fKx0Z0Qetq8gC1YJF13yuA0=
# =P2QF
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Feb 2024 05:11:10 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (25 commits)
migration: Use migrate_has_error() in close_return_path_on_source()
migration: Join the return path thread before releasing to_dst_file
migration: Fix qmp_query_migrate mbps value
migration: options incompatible with cpr
migration: update cpr-reboot description
migration: stop vm for cpr
migration: notifier error checking
migration: refactor migrate_fd_connect failures
migration: per-mode notifiers
migration: MigrationNotifyFunc
migration: remove postcopy_after_devices
migration: MigrationEvent for notifiers
migration: convert to NotifierWithReturn
migration: remove error from notifier data
notify: pass error to notifier with return
migration/multifd: Drop unnecessary helper to destroy IOC
migration/multifd: Cleanup outgoing_args in state destroy
migration/multifd: Make multifd_channel_connect() return void
migration/multifd: Drop registered_yank
migration/multifd: Cleanup TLS iochannel referencing
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Testing, gdbstub and plugin updates:
- fix some test/tcg license headers to GPLv2+
- bump up check-tcg timeout to 120s
- avoid re-building VM images too often
- update OpenBSD to 7.4
- use GDBFeature to build gdbstub XML
- unify plugin vcpu count under qemu_plugin_num_vcpus
- avoid spurious idle/resume callbacks on new vCPUs
- ensure nios2-linux-user processes async work
- call vcpu_init plugin callback through async work
- define plugin helpers when registers being read
- add plugin API for reading register values
- add support for register tracking to execlog
- update plugin docs with assumptions
- mention plugins can trigger tb_flush in mttcg design doc
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmXfAv0ACgkQ+9DbCVqe
# KkQyogf/X6T5lWsdZGb22FOYzaTLf5gfCPXArIVN+GsjSae3dU6qy/qVM1VRJQPw
# mH8kvMY7QO5V9M2tL33WtZZg6hqWypXYU+Hit6sMmveKYMKS9ESEX28x3yybgt8Y
# fyDywNODX7bs8Wb6NQjVkZvTmM2llrHEtQXPffaXaPyxOAzlGTV9Mf3Sop9rk4nG
# 8IchzLmOOQ7XnVst/KRyq+29oOYsbyUtj13tNeWBZ5iXFDT6Q/nGwPQ12U2Ztn9N
# FZvyzGG707dFaEDxIr4pl7n+lHJto29LMlSXlocANwG6wFNP3nfkSw/dXw3nkZZK
# pOfrQKvnnunJKBd7495LYZxTDe505Q==
# =/k97
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Feb 2024 09:55:09 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-maintainer-updates-280224-1' of https://gitlab.com/stsquad/qemu: (29 commits)
docs/devel: plugins can trigger a tb flush
docs/devel: document some plugin assumptions
docs/devel: lift example and plugin API sections up
contrib/plugins: extend execlog to track register changes
contrib/plugins: fix imatch
tests/tcg: expand insn test case to exercise register API
plugins: add an API to read registers
plugins: create CPUPluginState and migrate plugin_mask
gdbstub: expose api to find registers
plugins: Use different helpers when reading registers
cpu: call plugin init hook asynchronously
linux-user: ensure nios2 processes queued work
plugins: fix order of init/idle/resume callback
plugins: add qemu_plugin_num_vcpus function
plugins: remove previous n_vcpus functions from API
gdbstub: Add members to identify registers to GDBFeature
hw/core/cpu: Remove gdb_get_dynamic_xml member
gdbstub: Infer number of core registers from XML
gdbstub: Simplify XML lookup
gdbstub: Change gdb_get_reg_cb and gdb_set_reg_cb
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
With the new plugin register API we can now track changes to register
values. Currently the implementation is fairly dumb which will slow
down if a large number of register values are being tracked. This
could be improved by only instrumenting instructions which mention
registers we are interested in tracking.
Example usage:
./qemu-aarch64 -D plugin.log -d plugin \
-cpu max,sve256=on \
-plugin contrib/plugins/libexeclog.so,reg=sp,reg=z\* \
./tests/tcg/aarch64-linux-user/sha512-sve
will display in the execlog any changes to the stack pointer (sp) and
the SVE Z registers.
As testing registers every instruction will be quite a heavy operation
there is an additional flag which attempts to optimise the register
tracking by only instrumenting instructions which are likely to change
its value. This relies on the QEMU disassembler showing up the register
names in disassembly so is an explicit opt-in.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Based-On: <20231025093128.33116-19-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-27-alex.bennee@linaro.org>
While async processes are rare for linux-user we do use them from time
to time. The most obvious one is tb_flush when we run out of
translation space. We will also need this when we move plugin
vcpu_init to an async task.
Fix nios2 to follow its older, wiser and more stable siblings.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-19-alex.bennee@linaro.org>
We found that vcpu_init_hook was called *after* idle callback.
vcpu_init is called from cpu_realize_fn, while idle/resume cb are called
from qemu_wait_io_event (in vcpu thread).
This change ensures we only call idle and resume cb only once a plugin
was init for a given vcpu.
Next change in the series will run vcpu_init asynchronously, which will
make it run *after* resume callback as well. So we fix this now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240213094009.150349-4-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-18-alex.bennee@linaro.org>
The main problem is that "check-venv" is a .PHONY target will always
evaluate and trigger a full re-build of the VM images. While its
tempting to drop it from the dependencies that does introduce a
breakage on freshly configured builds.
Fortunately we do have the otherwise redundant --force flag for the
script which up until now was always on. If we make the usage of
--force conditional on dependencies other than check-venv triggering
the update we can avoid the costly rebuild and still run cleanly on a
fresh checkout.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2118
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-4-alex.bennee@linaro.org>
My default header template is GPLv3 but for QEMU code we really should
stick to GPLv2-or-later (allowing others to up-license it if they
wish). While this is test code we should still be consistent on the
source distribution.
I wrote all of this code so its not a problem. However there remains
one GPLv3 file left which is the crt0-tc2x.S for TriCore.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-2-alex.bennee@linaro.org>
close_return_path_on_source() retrieves the migration error from the
the QEMUFile '->to_dst_file' to know if a shutdown is required. This
shutdown is required to exit the return-path thread.
Avoid relying on '->to_dst_file' and use migrate_has_error() instead.
(using to_dst_file is a heuristic to infer whether
rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic
method for detecting errors is more reliable. We also want to reduce
dependency on QEMUFile::last_error)
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[added some words about the motivation for this patch]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The return path thread might hang at a blocking system call. Before
joining the thread we might need to issue a shutdown() on the socket
file descriptor to release it. To determine whether the shutdown() is
necessary we look at the QEMUFile error.
Make sure we only clean up the QEMUFile after the return path has been
waited for.
This fixes a hang when qemu_savevm_state_setup() produced an error
that was detected by migration_detect_error(). That skips
migration_completion() so close_return_path_on_source() would get
stuck waiting for the RP thread to terminate.
Reported-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The QMP command query_migrate might see incorrect throughput numbers
if it runs after we've set the migration completion status but before
migration_calculate_complete() has updated s->total_time and s->mbps.
The migration status would show COMPLETED, but the throughput value
would be the one from the last iteration and not the one from the
whole migration. This will usually be a larger value due to the time
period being smaller (one iteration).
Move migration_calculate_complete() earlier so that the status
MIGRATION_STATUS_COMPLETED is only emitted after the final counters
update. Keep everything under the BQL so the QMP thread sees the
updates as atomic.
Rename migration_calculate_complete to migration_completion_end to
reflect its new purpose of also updating s->state.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226143335.14282-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Both socket_send_channel_destroy() and multifd_send_channel_destroy() are
unnecessary wrappers to destroy an IOC, as the only thing to do is to
release the final IOC reference. We have plenty of code that destroys an
IOC using direct unref() already; keep that style.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
outgoing_args is a global cache of socket address to be reused in multifd.
Freeing the cache in per-channel destructor is more or less a hack. Move
it to multifd_send_cleanup_state() so it only get checked once. Use a
small helper to do so because it's internal of socket.c.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
With a clear definition of p->c protocol, where we only set it up if the
channel is fully established (TLS or non-TLS), registered_yank boolean will
have equal meaning of "p->c != NULL".
Drop registered_yank by checking p->c instead.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Commit a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to
blocking handshake") introduced a thread for TLS channels, which will
resolve the issue on blocking the main thread. However in the same commit
p->c is slightly abused just to be able to pass over the pointer "p" into
the thread.
That's the major reason we'll need to conditionally free the io channel in
the fault paths.
To clean it up, using a separate structure to pass over both "p" and "tioc"
in the tls handshake thread. Then we can make it a rule that p->c will
never be set until the channel is completely setup. With that, we can drop
the tricky conditional unref of the io channel in the error path.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
All calls to ide_init_drive comes from ide_dev_initfn. Just pass down the
IDEDevice (IDEState is kinda obsolete and should be merged into IDEDevice).
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The A20 mask is only applied to the final memory access. Nested
page tables are always walked with the raw guest-physical address.
Unlike the previous patch, in this one the masking must be kept, but
it was done too early.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already
applied in get_physical_address(), which is called via probe_access_full()
and x86_cpu_tlb_fill().
If ptw_translate() on the other hand does a MMU_NESTED_IDX access,
the A20 mask must not be applied to the address that is looked up in
the nested page tables; it must be applied only to the addresses that
hold the NPT entries (which is achieved via MMU_PHYS_IDX, per the
previous paragraph).
Therefore, we can remove A20 masking from the computation of the page
table entry's address, and let get_physical_address() or mmu_translate()
apply it when they know they are returning a host-physical address.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).
Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.
The truncation code however is not completely redundant. Even though the
maximum address size for any executed instruction is 32 bits, helpers for
operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
to translate an address outside of the 32-bit range, if invoked with an
argument that is close to the 4G boundary. Likewise for processor
accesses, for example TSS or IDT accesses, when EFER.LMA==0.
So, move the address truncation in get_physical_address() so that it
applies to 32-bit MMU indexes, but not to MMU_PHYS_IDX and MMU_NESTED_IDX.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Cc: qemu-stable@nongnu.org
Co-developed-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Accesses from a 32-bit environment (32-bit code segment for instruction
accesses, EFER.LMA==0 for processor accesses) have to mask away the
upper 32 bits of the address. While a bit wasteful, the easiest way
to do so is to use separate MMU indexes. These days, QEMU anyway is
compiled with a fixed value for NB_MMU_MODES. Split MMU_USER_IDX,
MMU_KSMAP_IDX and MMU_KNOSMAP_IDX in two.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove knowledge of specific MMU indexes (other than MMU_NESTED_IDX and
MMU_PHYS_IDX) from mmu_translate(). This will make it possible to split
32-bit and 64-bit MMU indexes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the
maximum physical address width of the processor. Setting them to
1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual).
The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE,
even though the manual is not clear on that.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
paging or PAE paging). Do this in mmu_translate() to remove
the last where get_physical_address() meaningfully drops the high
bits of the address.
Cc: qemu-stable@nongnu.org
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a fd-bootchk property to PC machine types, so that -no-fd-bootchk
returns an error if the machine does not support booting from floppies
and checking for boot signatures therein.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
aspeed queue:
* Add support for UART0, in preparation of AST2700 models
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmXd2nMACgkQUaNDx8/7
# 7KErPBAAjKRmJQF9aMEgf7uqsPnJojAVumFe63NE9Gqnvy4MzgoZWfdSnLl2Ddba
# im5IfR7MYv0tzJtqCVtz7o4JwXhhDwesWALQZBM/ms48aacPSNP+7Gn141yLuCCS
# Vr8NBSIz156lSsnFGnRUArcQTDKjDp/1TLRiGcS8SDm/S4Nn++nur+T054EZgbKR
# CMWDeavgzZRb9HPepvWDwqb9qs11hq5/onCqC886dVNznxEKAVYcd0FVbSn3OfDF
# 2EPvKh+fxHlW37wcctlGPnbJK5rRvFi78yZf5utSt+mlVhyiEXjQJ6p8zBIh2w5A
# NlsmUo/UYv1F41yC/vCFRR8KJ2wO5VW7zL6UCGMV6I9hxhu/Qw+FYqWdBbAZWsOO
# GFOkFbe8zbJFXTr/W7P5upBlA7U1/B9VbRj71eu01dqT+n8OGsk8yfnWVs1SjpoD
# 89ZIhpb7lSolQmjPPxrVyfUe3/8ncTx64+CZuAZjxPh/9HA8wDXwVRPtAbIvvGaZ
# YPQ4Qmd4m6nAANAvTg2ufj19WT64XKwrQ6O3IkmGcn0BzHl08GFjru8IUp6rbduG
# m6WqulL1Ej1PrYaiw5ktpJ4Fkoy6iEFXJOWfl3oTLp2KWE5VAohyRKI00AFnHiAC
# frK+cxT4bqDtJR8QbNyJy5d3ZGZV1R6ZA0XjQ1jtb8ty2qISysw=
# =gFeX
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 27 Feb 2024 12:49:55 GMT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20240227' of https://github.com/legoater/qemu:
aspeed: fix hardcode boot address 0
aspeed: introduce a new UART0 device name
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move the reset of the sysbus (and thus all devices and buses anywhere
on the qbus tree) from qemu_register_reset() to qemu_register_resettable().
This is a behaviour change: because qemu_register_resettable() is
aware of three-phase reset, this now means that:
* 'enter' phase reset methods of devices and buses are called
before any legacy reset callbacks registered with qemu_register_reset()
* 'exit' phase reset methods of devices and buses are called
after any legacy qemu_register_reset() callbacks
Put another way, a qemu_register_reset() callback is now correctly
ordered in the 'hold' phase along with any other 'hold' phase methods.
The motivation for doing this is that we will now be able to resolve
some reset-ordering issues using the three-phase mechanism, because
the 'exit' phase is always after the 'hold' phase, even when the
'hold' phase function was registered with qemu_register_reset().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240220160622.114437-10-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reimplement qemu_register_reset() via qemu_register_resettable().
We define a new LegacyReset object which implements Resettable and
defines its reset hold phase method to call a QEMUResetHandler
function. When qemu_register_reset() is called, we create a new
LegacyReset object and add it to the simulation_reset
ResettableContainer. When qemu_unregister_reset() is called, we find
the LegacyReset object in the container and remove it.
This implementation of qemu_unregister_reset() means we'll end up
scanning the ResetContainer's list of child objects twice, once
to find the LegacyReset object, and once in g_ptr_array_remove().
In theory we could avoid this by having the ResettableContainer
interface include a resettable_container_remove_with_equal_func()
that took a callback method so that we could use
g_ptr_array_find_with_equal_func() and g_ptr_array_remove_index().
But we don't expect qemu_unregister_reset() to be called frequently
or in hot paths, and we expect the simulation_reset container to
usually not have many children.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240220160622.114437-9-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Implement new functions qemu_register_resettable() and
qemu_unregister_resettable(). These are intended to be
three-phase-reset aware equivalents of the old qemu_register_reset()
and qemu_unregister_reset(). Instead of passing in a function
pointer and opaque, you register any QOM object that implements the
Resettable interface.
The implementation is simple: we have a single global instance of a
ResettableContainer, which we reset in qemu_devices_reset(), and
the Resettable objects passed to qemu_register_resettable() are
added to it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240220160622.114437-8-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Implement a ResetContainer. This is a subclass of Object, and it
implements the Resettable interface. The container holds a list of
arbitrary other objects which implement Resettable, and when the
container is reset, all the objects it contains are also reset.
This will allow us to have a 3-phase-reset equivalent of the old
qemu_register_reset() API: we will have a single "simulation reset"
top level ResetContainer, and objects in it are the equivalent of the
old QEMUResetHandler functions.
The qemu_register_reset() API manages its list of callbacks using a
QTAILQ, but here we use a GPtrArray for our list of Resettable
children: we expect the "remove" operation (which will need to do an
iteration through the list) to be fairly uncommon, and we get simpler
code with fewer memory allocations.
Since there is currently no listed owner in MAINTAINERS for the
existing reset-related source files, create a new section for
them, and add these new files there also.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240220160622.114437-7-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
We have an OBJECT_DEFINE_TYPE_EXTENDED macro, plus several variations
on it, which emits the boilerplate for the TypeInfo and ensures it is
registered with the type system. However, all the existing macros
insist that the type being defined has its own FooClass struct, so
they aren't useful for the common case of a simple leaf class which
doesn't have any new methods or any other need for its own class
struct (that is, for the kind of type that OBJECT_DECLARE_SIMPLE_TYPE
declares).
Pull the actual implementation of OBJECT_DEFINE_TYPE_EXTENDED out
into a new DO_OBJECT_DEFINE_TYPE_EXTENDED which parameterizes the
value we use for the class_size field. This lets us add a new
OBJECT_DEFINE_SIMPLE_TYPE which does the same job as the various
existing OBJECT_DEFINE_*_TYPE_* family macros for this kind of simple
type, and the variant OBJECT_DEFINE_SIMPLE_TYPE_WITH_INTERFACES for
when the type will implement some interfaces.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240220160622.114437-5-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Currently the qemu_register_reset() API permits the reset handler functions
registered with it to remove themselves from within the callback function.
This is fine with our current implementation, but is a bit odd, because
generally reset is supposed to be idempotent, and doesn't fit well in a
three-phase-reset world where a resettable object will get multiple
callbacks as the system is reset.
We now have only one user of qemu_register_reset() which makes use of
the ability to unregister itself within the callback:
restore_boot_order(). We want to change our implementation of
qemu_register_reset() to something where it would be awkward to
maintain the "can self-unregister" feature. Rather than making that
reimplementation complicated, change restore_boot_order() so that it
doesn't unregister itself but instead returns doing nothing for any
calls after it has done the "restore the boot order" work.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240220160622.114437-4-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
I'm far from confident this handling here is correct. Hence
RFC. In particular not sure on what locks I should hold for this
to be even moderately safe.
The function already appears to be inconsistent in what it returns
as the CONFIG_ATOMIC64 block returns the endian converted 'eventual'
value of the cmpxchg whereas the TCG_OVERSIZED_GUEST case returns
the previous value.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-id: 20240219161229.11776-1-Jonathan.Cameron@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cortex-A53 r0p4 revision that QEMU emulates is affected by a CatA
erratum #843419 (i.e., the most severe), which requires workarounds in
the toolchain as well as the OS.
Since the emulation is obviously not affected in the same way, we can
indicate this via REVIDR bit #8, which on r0p4 has the meaning that no
workarounds for erratum #843419 are needed.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240215160202.2803452-1-ardb+git@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the previous design of ASPEED SOCs QEMU model, it set the boot
address at "0" which was the hardcode setting for ast10x0, ast2600,
ast2500 and ast2400.
According to the design of ast2700, it has a bootmcu(riscv-32) which
is used for executing SPL and initialize DRAM and copy u-boot image
from SPI/Flash to DRAM at address 0x400000000 at SPL boot stage.
Then, CPUs(cortex-a35) execute u-boot, kernel and rofs.
Currently, qemu not support emulate two CPU architectures
at the same machine. Therefore, qemu will only support
to emulate CPU(cortex-a35) side for ast2700 and the boot
address is "0x4 00000000".
Fixed hardcode boot address "0" for future models using
a different mapping address.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The Aspeed datasheet refers to the UART controllers
as UART1 - UART13 for the ast10x0, ast2600, ast2500
and ast2400 SoCs and the Aspeed ast2700 introduces an UART0
and the UART controllers as UART0 - UART12.
To keep the naming in the QEMU models
in sync with the datasheet, let's introduce a new UART0 device name
and do the required adjustements.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - Kept original assert() in aspeed_soc_uart_set_chr()
- Fixed 'i' range in connect_serial_hds_to_uarts() loop ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Rename "internal.h" as "ide-internal.h", and include
it via its relative local path, instead of absolute
to the project root path.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240226080632.9596-4-philmd@linaro.org>
Remove last two includes of hw/ide/intarnal.h outside of hw/ide and
replace them with newly added public header to allow moving internal.h
into hw/ide to really stop exposing it.
Fixes: a11f439a0e (hw/ide: Stop exposing internal.h to non-IDE files)
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240223142633.933694E6004@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Use ahci_ide_create_devs() instead of open-coding it.
Not accessing AHCIDevice internals anymore allows to
remove "hw/ide/ahci_internal.h" (which isn't really a
public header).
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240226080632.9596-2-philmd@linaro.org>
Both the piix and the q35 machines introduce an rtc_state variable and defer the
initialization of the X86MachineState::rtc attribute to pc_cmos_init(). Resolve
this complication which makes pc_cmos_init() do what it says on the tin.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240224135851.100361-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
PCMachineClass introduces the attribute into the class hierarchy and sets it to
true. There is no sub class overriding the attribute. Commit 30d2a17b46
"hw/i386: Remove the deprecated machines 0.12 up to 0.15" removed the last
overrides of this attribute. The attribute is now unneeded and can be removed.
Fixes: 30d2a17b46 "hw/i386: Remove the deprecated machines 0.12 up to 0.15"
Cc: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240224135851.100361-5-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
There is no advantage in having these local variables which 1/ needlessly have
different identifiers in both machines and 2/ which are redundant to pcms->bus
which is almost as short.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240224135851.100361-4-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
"hw/acpi/acpi.h" is implicitly included. Include it
explicitly to avoid the following error when refactoring
headers:
hw/i386/pc_q35.c:209:43: error: use of undeclared identifier 'ACPI_PM_PROP_ACPI_PCIHP_BRIDGE'
ACPI_PM_PROP_ACPI_PCIHP_BRIDGE,
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240226090600.31952-3-philmd@linaro.org>
Split the sysbus version to a separate file so that it is not
included in PCI-only machines, and adjust Kconfig for machines
that do need sysbus-ohci. The copyrights are based on the
time and employer of balrog and Paul Brook's contributions.
While adjusting the SM501 dependency, move it to the right place
instead of keeping it in the R4D machine.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223124406.234509-10-pbonzini@redhat.com>
[PMD: Rename some functions using 'ohci_sysbus_' prefix]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
With --without-default-devices it is possible to build a binary that
does not include any USB host controller and therefore that does not
include the code guarded by CONFIG_USB. While the simpler creation
functions such as usb_create_simple can be inlined, this is not true
of usb_bus_find(). Remove it, replacing it with a search of the single
USB bus on the machine.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223124406.234509-8-pbonzini@redhat.com>
[PMD: Fixed style]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
With --without-default-devices it should not be required to have
devices in the binary that are removed by -nodefaults. It should be
therefore possible to build a binary that does not include any USB
host controller or any of the code guarded by CONFIG_USB. While the
simpler creation functions such as usb_create_simple can be inlined,
this is not true of usb_bus_find(). Remove it, replacing it with a
search of the single USB bus on the machine.
With this change, it is possible to change "select USB_OHCI_PCI" into
an "imply" directive.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223124406.234509-7-pbonzini@redhat.com>
[PMD: Fixed style]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
With --without-default-devices it is possible to build a binary that
does not include any USB host controller and therefore that does not
include the code guarded by CONFIG_USB. While the simpler creation
functions such as usb_create_simple can be inlined, this is not true
of usb_bus_find(). Remove it, replacing it with a search of the single
USB bus on the machine.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223124406.234509-6-pbonzini@redhat.com>
[PMD: Fixed style]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Once the Kconfig for hw/mips is cleaned up, it will be possible to build a
binary that does not include any USB host controller and therefore that
does not include the code guarded by CONFIG_USB. While the simpler
creation functions such as usb_create_simple can be inlined, this is not
true of usb_bus_find(). Remove it, replacing it with a search of the
single USB bus created by loongson3_virt_devices_init().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240223124406.234509-5-pbonzini@redhat.com>
[PMD: Fixed style]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
object_resolve_type_unambiguous provides a useful functionality, that
is currently emulated for example by usb_bus_find(). Move it to core
code and add error reporting for increased generality.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240223124406.234509-2-pbonzini@redhat.com>
[PMD: Fixed style]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The nubus-virtio-mmio device is a Nubus card that contains a set of 32 virtio-mmio
devices and a goldfish PIC similar to the m68k virt machine that can be plugged
into the m68k q800 machine.
There are currently a number of drivers under development that can be used in
conjunction with this device to provide accelerated and/or additional hypervisor
services to 68k Classic MacOS.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20240111102954.449462-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Whilst 128k is more than enough for a typical Declaration ROM, a C compiler
configured to produce an unstripped debug binary can generate a ROM image that
exceeds this limit. Increase the maximum size to 1Mb to help make life easier
for developers.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20240111102954.449462-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Declaration ROM binary images can be any arbitrary size, however if a host ROM
memory region is not aligned to qemu_target_page_size() then we fail the
"assert(!(iotlb & ~TARGET_PAGE_MASK))" check in tlb_set_page_full().
Ensure that the host ROM memory region is aligned to qemu_target_page_size()
and adjust the offset at which the Declaration ROM image is loaded, since Nubus
ROM images are unusual in that they are aligned to the end of the slot address
space.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20240111102954.449462-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
sysbus_address_space(...) is a simple wrapper to
get_system_memory(). Use it in place, since KVM
VAPIC doesn't distinct address spaces.
Rename the 'as' variable as 'mr' since it is a
MemoryRegion type, not an AddressSpace one.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240216153517.49422-6-philmd@linaro.org>
We want to set another qdev property (a link) for the FIMD
device, we can not use sysbus_create_varargs() which only
passes sysbus base address and IRQs as arguments. Inline
it so we can set the link property in the next commit.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240216153517.49422-4-philmd@linaro.org>
We want to set another qdev property (a link) for the pl110
and pl111 devices, we can not use sysbus_create_simple() which
only passes sysbus base address and IRQs as arguments. Inline
it so we can set the link property in the next commit.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240226173805.289-2-philmd@linaro.org>
QAPIDoc stores a reference to QAPIParser just to pass it to
QAPIParseError. The resulting error position depends on the state of
the parser. It happens to be the current comment line. Servicable,
but action at a distance.
The commit before previous moved most uses of QAPIParseError from
QAPIDoc to QAPIParser. There are just three left. Convert them to
QAPISemError. This involves passing info to a few methods. Then drop
the reference to QAPIParser.
The three errors lose the column number. Not really interesting here:
it's the comment line's indentation.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-17-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The parser recognizes only the first "Features:" line. Any subsequent
ones are treated as ordinary text, as visible in test case
doc-duplicate-features. Recognize "Features:" lines anywhere. A
second one is an error.
A 'Features:' line without any features is useless, but not an error.
Make it an error. This makes detecting a second "Features:" line
easier.
qapi/run-state.json actually has an instance of this since commit
fe17522d85 (qapi: Remove deprecated 'singlestep' member of
StatusInfo). Clean it up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-16-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
QAPISchemaParser is a conventional recursive descent parser. Except
QAPISchemaParser.get_doc() delegates most of the doc comment parsing
work to a state machine in QAPIDoc. The state machine doesn't get
tokens like a recursive descent parser, it is fed tokens.
I find this state machine rather opaque and hard to maintain.
Replace it by a conventional parser, all in QAPISchemaParser. Less
code, and (at least in my opinion) easier to understand.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-15-armbru@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
The parser mostly doesn't create adjacent untagged sections, and
merging the ones it does create is hardly worth the bother. I'm doing
it to avoid behavioral change in the next commit.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-14-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We currently call QAPIDoc.check() only for definition documentation.
Calling it for free-form documentation as well is simpler. No change,
because it doesn't actually do anything there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-13-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
By convention, we indent the second and subsequent lines of
descriptions and tagged sections, except for examples.
Turn this into a hard rule, and apply it to examples, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-11-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[Straightforward conflicts in qapi/migration.json resolved]
docs/devel/qapi-code-gen.txt claims "A heading line must be the first
line of the documentation comment block" since commit
55ec69f8b1 (docs/devel/qapi-code-gen.txt: Update to new rST backend
conventions). Not true, we have code to make it work anywhere in a
free-form doc comment: commit dcdc07a97c (qapi: Make section headings
start a new doc comment block).
Make it true, for simplicity's sake.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-10-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Since the previous commit, QAPIDoc.Section.name is either
None (untagged section) or the section's tag string ('Returns',
'@name', ...). Rename it to .tag.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-9-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Improve the message for an empty tagged section from
empty doc section 'Note'
to
text required after 'Note:'
and the message for an empty argument or feature description from
empty doc section 'foo'
to
text required after '@foo:'
Improve the error position to refer to the beginning of the empty
section instead of its end.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-8-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When something other than a command has a "Returns" section, the error
message points to the beginning of the definition comment. Point to
the "Returns" section instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-7-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When documented arguments don't exist, the error message points to the
beginning of the definition comment. Point to the first bogus
argument description instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-6-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Commit 4e99f4b12c (qapi: Drop simple unions) eliminated implicitly
defined union branch types, except for the empty object type
'q_empty'. QAPISchemaGenRSTVisitor._nodes_for_members() still has
code to generate documentation for implicitly defined union branch
types. It does nothing for 'q_empty'. Simplify.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-5-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
A 'Features:' line without any features is useless, but not an error
now. However, a later commit will make it one, because that makes
rejecting duplicate 'Features:' easier.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-4-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We don't actually recognize the second 'Features:' line. Instead, we
treat it as an untagged section.
If it was followed by feature description, we'd reject that like
"description of '@feat2:' follows a section". Less than clear.
To be improved shortly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-3-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The test compares Sphinx plain-text output against a golden reference.
To work on Windows hosts, it filters out carriage returns in both
files. Unfortunately, the filter doesn't work: it creates an empty
file. Comparing empty files always succeeds.
Fix the filter, and update the golden reference to current Sphinx
output.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-2-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
* Avocado tests for ppc64 to boot FreeBSD, run guests with emulated
or nested hypervisor facilities, among other things.
* Update ppc64 CPU defaults to Power10.
* Add a new powernv10-rainier machine to better capture differences
between the different Power10 systems.
* Implement more device models for powernv.
* 4xx TLB flushing performance and correctness improvements.
* Correct gdb implementation to access some important SPRs.
* Misc cleanups and bug fixes.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmXYuX0ACgkQZ7MCdqhi
# HK6t1Q/9Hxw+MseFUa/6sbWX6mhv/8emrFFOwI9qxapxDoMyic+SjIhR5PPCYh6t
# TLE1vJiV54XYB3286hz3eQfDxfHNjkgsF7PYp9SEd6D1rMT9ESxeu5NkifenEfP0
# UoTFXJyfg/OF1h+JQRrVv1m+D4mqGGNCQB4QiU3DYTmRhrhp7H3mKfUX/KvkEwiX
# EqZibmrqb9SVSjT66LBQzY328mEH4nipF33QtYKfYjb6kMe8ACSznL2VYP0NmacU
# T+3eHJeLtOLeRlHwYfADx2ekRHlsJuE9/fMMHJHb2qxJkHSQ7yGBqSLESAe6kNP8
# TnKJ9x4433K7IjFqaoiDONrMVJbVZDh/DUh1WWdY14iiUOYEy7uLkLtmThmNSyUB
# 622Rd5Ch09JWzA/tg1aC9mR2f9boe9/Z1VeHeN8j+sVj1e6MEh8un8SER3X+9TDz
# myGLsmPXQnu1yjebycuE+9RAPbR9npOAkQpE5ZfDwjUM7y4s4jzZUKUoIhtCXeEF
# eIykVnaGbPlEBGpuf+E+w2ZxhZUIfxRUhuunK8Ib4TE8khJn/Ir4BxoLweSnqtKM
# O4xiFvHm72RUVK232Kox5HWbFJ8XSLBUb3ABNGbXXynzAMD+THB4ImFBbysOmIkR
# xcF1tWQ+xoMMcCxbx73b0PhO5AR/PgYc2ctug9rAc9fh4ypJLEs=
# =LZzb
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 23 Feb 2024 15:27:57 GMT
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.0-20240224' of https://gitlab.com/npiggin/qemu: (47 commits)
target/ppc: optimise ppcemb_tlb_t flushing
target/ppc: 440 optimise tlbwe TLB flushing
target/ppc: 4xx optimise tlbwe_lo TLB flushing
target/ppc: 4xx don't flush TLB for a newly written software TLB entry
target/ppc: Factor out 4xx ppcemb_tlb_t flushing
target/ppc: Fix 440 tlbwe TLB invalidation gaps
target/ppc: Add SMT support to time facilities
target/ppc: Implement core timebase state machine and TFMR
ppc/pnv: Implement the ChipTOD to Core transfer
ppc/pnv: Wire ChipTOD model to powernv9 and powernv10 machines
ppc/pnv: Add POWER9/10 chiptod model
target/ppc: Fix move-to timebase SPR access permissions
target/ppc: Improve timebase register defines naming
target/ppc: Rename TBL to TB on 64-bit
target/ppc: Update gdbstub to read SPR's CFAR, DEC, HDEC, TB-L/U
hw/ppc: N1 chiplet wiring
hw/ppc: Add N1 chiplet model
hw/ppc: Add pnv nest pervasive common chiplet model
ppc/pnv: Test pnv i2c master and connected devices
ppc/pnv: Add a pca9554 I2C device to powernv10-rainier
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Filter TLB flushing by PID and mmuidx.
Zoltan reports that, together with the previous TLB flush changes,
performance of a sam460ex machine running 'lame' to convert a wav to
mp3 is improved nearly 10%:
CPU time TLB partial flushes TLB elided flushes
Before 37s 508238 7680722
After 34s 73 1143
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Have 440 tlbwe flush only the range corresponding to the addresses
covered by the software TLB entry being modified rather than the
entire TLB. This matches what 4xx does.
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Rather than tlbwe_lo always flushing all TCG TLBs, have it flush just
those corresponding to the old software TLB, and only if it was valid.
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
BookE software TLB is implemented by flushing old translations from the
relevant TCG TLB whenever software TLB entries change. This means a new
software TLB entry should not have any corresponding cached TCG TLB
translations, so there is nothing to flush. The exception is multiple
software TLBs that cover the same address and address space, but that is
a programming error and results in undefined behaviour, and flushing
does not give an obviously better outcome in that case either.
Remove the unnecessary flush of a newly written software TLB entry.
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Flushing the TCG TLB pages that cache a software TLB is a common
operation, factor it into its own function.
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The 440 tlbwe (write entry) instruction misses several cases that must
flush the TCG TLB:
- If the new size is smaller than the existing size, the EA no longer
covered should be flushed. This looks like an inverted inequality
test.
- If the TLB PID changes.
- If the TLB attr bit 0 (translation address space) changes.
- If low prot (access control) bits change.
Fix this by removing tricks to avoid TLB flushes, and just invalidate
the TLB if any valid entry is being changed, similarly to 4xx.
Optimisations will be introduced in subsequent changes.
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The TB, VTB, PURR, HDEC SPRs are per-LPAR registers, and the TFMR is a
per-core register. Add the necessary SMT synchronisation and value
sharing.
The TFMR can only drive the timebase state machine via thread 0 of the
core, which is almost certainly not right, but it is enough for skiboot
and certain other proprietary firmware.
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This implements the core timebase state machine, which is the core side
of the time-of-day system in POWER processors. This facility is operated
by control fields in the TFMR register, which also contains status
fields.
The core timebase interacts with the chiptod hardware, primarily to
receive TOD updates, to synchronise timebase with other cores. This
model does not actually update TB values with TOD or updates received
from the chiptod, as timebases are always synchronised. It does step
through the states required to perform the update.
There are several asynchronous state transitions. These are modelled
using using mfTFMR to drive state changes, because it is expected that
firmware poll the register to wait for those states. This is good enough
to test basic firmware behaviour without adding real timers. The values
chosen are arbitrary.
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
One of the functions of the ChipTOD is to transfer TOD to the Core
(aka PC - Pervasive Core) timebase facility.
The ChipTOD can be programmed with a target address to send the TOD
value to. The hardware implementation seems to perform this by
sending the TOD value to a SCOM address.
This implementation grabs the core directly and manipulates the
timebase facility state in the core. This is a hack, but it works
enough for now. A better implementation would implement the transfer
to the PnvCore xscom register and drive the timebase state machine
from there.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Wire the ChipTOD model to powernv9 and powernv10 machines.
Suggested-by-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ChipTOD (for Time-Of-Day) is a chip pervasive facility in IBM POWER
(powernv) processors that keeps a time of day clock.
In particular for this model are facilities that initialise and start
the time of day clock, and that synchronise that clock to cores on the
chip, and to other chips. In this way, all cores on all chips can
synchronise timebase (TB).
This model implements functionality sufficient to run the skiboot
chiptod synchronisation procedure (with the following core timebase
state machine implementation). It does not modify the TB in the cores
where the real hardware would, because the QEMU ppc timebase
implementation is always synchronised acros all cores.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The move-to timebase registers TBU and TBL can not be read, and they
can not be written in supervisor mode on hypervisor-capable CPUs.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The timebase in ppc started out with the mftb instruction which is like
mfspr but addressed timebase registers (TBRs) rather than SPRs. These
instructions could be used to read TB and TBU at 268 and 269. Timebase
could be written via the TBL and TBU SPRs at 284 and 285.
The ISA changed around v2.03 to bring TB and TBU reads into the SPR
space at 268 and 269 (access via mftb TBR-space is still supported
but will be phased out). Later, VTB was added which is an entirely
different register.
The SPR number defines in QEMU are understandably inconsistently named.
Change SPR 268, 269, 284, 285 to TBL, TBU, WR_TBL, WR_TBU, respectively.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
From the earliest PowerPC ISA, TBR (later SPR) 268 has been called TB
and accessed with mftb instruction. The problem is that TB is the name
of the 64-bit register, and 32-bit implementations can only read the
lower half with one instruction, so 268 has also been called TBL and
it does only read TBL on 32-bit.
Change SPR 268 to be called TB on 64-bit implementations.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
SPR's CFAR, DEC, HDEC, TB-L/U are not implemented as part of CPUPPCState.
Hence, gdbstub is not able to access them using (CPUPPCState *)env->spr[] array.
Update gdb_get_spr_reg() method to handle these SPR's specifically.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Saif Abrar <saif.abrar@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This part of the patchset connects the nest1 chiplet model to p10 chip.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The N1 chiplet handle the high speed i/o traffic over PCIe and others.
The N1 chiplet consists of PowerBus Fabric controller,
nest Memory Management Unit, chiplet control unit and more.
This commit creates a N1 chiplet model and initialize and realize the
pervasive chiplet model where chiplet control registers are implemented.
This commit also implement the read/write method for the powerbus scom
registers
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
A POWER10 chip is divided into logical units called chiplets. Chiplets
are broadly divided into "core chiplets" (with the processor cores) and
"nest chiplets" (with everything else). Each chiplet has an attachment
to the pervasive bus (PIB) and with chiplet-specific registers. All nest
chiplets have a common basic set of registers and This model will provide
the registers functionality for common registers of nest chiplet (Pervasive
Chiplet, PB Chiplet, PCI Chiplets, MC Chiplet, PAU Chiplets)
This commit implement the read/write functions of chiplet control registers.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tests the following for both P9 and P10:
- I2C master POR status
- I2C master status after immediate reset
Tests the following for powernv10-ranier only:
- Config pca9552 hotplug device pins as inputs then
Read the INPUT0/1 registers to verify all pins are high
- Connected GPIO pin tests of P10 PCA9552 device. Tests
output of pins 0-4 affect input of pins 5-9 respectively.
- PCA9554 GPIO pins test. Tests input and ouput functionality.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
For powernv10-rainier, the Power Hypervisor code expects to see a
pca9554 device connected to the 3rd PNV I2C engine on port 1 at I2C
address 0x25 (or left-justified address of 0x4A). This is used by
the hypervisor code to detect if a "Cable Card" is present.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The QEMU I2C buses and devices use the resettable
interface for resetting while the PNV I2C controller
and parent buses and devices have not yet transitioned
to this new interface and use the old reset strategy.
This was preventing the I2C buses and devices wired
to the PNV I2C controller from being reset.
The short term fix for this is to have the PNV I2C
Controller's reset function explicitly call the resettable
interface function, bus_cold_reset(), on all child
I2C buses.
The long term fix should be to transition all PNV parent
devices and buses to use the resettable interface so that
all child buses and devices are automatically reset.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
For power10-rainier, a pca9552 device is used for PCIe slot hotplug
power control by the Power Hypervisor code. The code expects that
some time after it enables power to a PCIe slot by asserting one of
the pca9552 GPIO pins 0-4, it should see a "power good" signal asserted
on one of pca9552 GPIO pins 5-9.
To simulate this behavior, we simply connect the GPIO outputs for
pins 0-4 to the GPIO inputs for pins 5-9.
Each PCIe slot is assigned 3 GPIO pins on the pca9552 device, for
control of up to 5 PCIe slots. The per-slot signal names are:
SLOTx_EN.......PHYP uses this as an output to enable
slot power. We connect this to the
SLOTx_PG pin to simulate a PGOOD signal.
SLOTx_PG.......PHYP uses this as in input to detect
PGOOD for the slot. For our purposes
we just connect this to the SLOTx_EN
output.
SLOTx_Control..PHYP uses this as an output to prevent
a race condition in the real hotplug
circuitry, but we can ignore this output
for simulation.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The Power Hypervisor code expects to see a pca9552 device connected
to the 3rd PNV I2C engine on port 1 at I2C address 0x63 (or left-
justified address of 0xC6). This is used by hypervisor code to
control PCIe slot power during hotplug events.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Create a new powernv machine type, powernv10-rainier, that
will contain rainier-specific devices.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Allow external devices to drive pca9552 input pins by adding
input GPIO's to the model. This allows a device to connect
its output GPIO's to the pca9552 input GPIO's.
In order for an external device to set the state of a pca9552
pin, the pin must first be configured for high impedance (LED
is off). If the pca9552 pin is configured to drive the pin low
(LED is on), then external input will be ignored.
Here is a table describing the logical state of a pca9552 pin
given the state being driven by the pca9552 and an external device:
PCA9552
Configured
State
| Hi-Z | Low |
------+------+-----+
External Hi-Z | Hi | Low |
Device ------+------+-----+
State Low | Low | Low |
------+------+-----+
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The pca9552 INPUT0 and INPUT1 registers are supposed to
hold the logical values of the LED pins. A logical 0
should be seen in the INPUT0/1 registers for a pin when
its corresponding LSn bits are set to 0, which is also
the state needed for turning on an LED in a typical
usage scenario. Existing code was doing the opposite
and setting INPUT0/1 bit to a 1 when the LSn bit was
set to 0, so this commit fixes that.
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER10 is the latest IBM Power machine. Although it is not offered in
"OPAL mode" (i.e., powernv configuration), so there is a case that it
should remain at powernv9, most of the development work is going into
powernv10 at the moment.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
pseries machines before version 2.11 have undergone many changes to
correct issues, mostly regarding migration compatibility. This is
obfuscating the code uselessly and makes maintenance more difficult.
Remove them and only keep the last version of the 2.x series, 2.12,
still in use by old distros.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Initialize the machine specific max_cpus limit as per the maximum range
of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
assert in tcg_region_init or spapr_xive_claim_irq.
Logs:
Without patch fix:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: IRQ 4096 is not free
[root@host build]#
On LPAR:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
**
ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Aborted (core dumped)
[root@host build]#
On x86:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
Assertion `lisn < xive->nr_irqs' failed.
Aborted (core dumped)
[root@host build]#
With patch fix:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
machine 'pseries-8.2' is 4096
[root@host build]#
Reported-by: Kowshik Jois <kowsjois@linux.ibm.com>
Tested-by: Kowshik Jois <kowsjois@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
spapr_irq_init currently uses existing macro SPAPR_XIRQ_BASE to refer to
the range of CPU IPIs during initialization of nr-irqs property.
It is more appropriate to have its own define which can be further
reused as appropriate for correct interpretation.
Suggested-by: Cedric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Kowshik Jois <kowsjois@linux.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
To reduce the use of the term 'softmmu', rename spapr_softmmu.c
to spapr_vhyp_mmu.c.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[np: change name]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Since 'softmmu' is quite a loaded term in QEMU, rename the vhyp MMU
facilities to use the vhyp_mmu_ prefix rather than softmmu_.
vhyp_mmu_ is chosen because the code that manipulates the hash table
via guest software hypercalls is QEMU's implementation of the PAPR
hypervisor interface, called vhyp.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[npiggin: Pick a different name, explain it in changelog.]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Check tcg_enabled() before calling softmmu_resize_hpt_prepare()
and softmmu_resize_hpt_commit() to allow the compiler to elide
their calls. The stubs are then unnecessary, remove them.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Commit 9fdf0c2995 ("Start implementing pSeries logical partition
machine") added hw/ppc/spapr_hcall.c, then commit 962104f044
("hw/ppc: moved hcalls that depend on softmmu") extracted the
system code to hw/ppc/spapr_softmmu.c. Take the license and
copyrights from the original spapr_hcall.c at commit 9fdf0c2995.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[npiggin: Update file description.]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Several registers have names that don't match the ISA (or convention
with other QEMU PPC registers), making them unintuitive to use with
GDB.
Fortunately most of these registers are obscure and/or have not been
correctly implemented in the gdb server (e.g., DEC, TB, CFAR), so risk
of breaking users should be low.
QEMU should follow the ISA for register name convention (where there is
no established GDB name).
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This includes a number of improvements and fixes. Importantly there
is a change for QEMU platforms to permit the ChipTOD to be initialised
if it is present in the device tree. This will facilitate ChipTOD
enablement in pnv.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The powernv and pseries machines both provide hypervisor facilities
that are supported by KVM. This is a large and complicated set of
features that don't get much system-level testing in ppc tests.
Add a new test case for these which runs QEMU KVM inside the target.
This downloads an Alpine VM image, boots it and downloads and installs
the qemu package, then boots a virtual machine under it, re-using the
original Alpine VM image.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
ppc has no avocado tests for the KVM backend. Add a KVM boot_linux.py
test for pseries.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
POWER CPUs support hash and radix MMU modes. Linux supports running in
either mode, but defaults to radix. To keep up testing of QEMU's hash
MMU implementation, add some Linux hash boot tests.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The expected MTD partition detection output does not always appear on
the console, despite the test reaching the boot loader and the string
appearing in dmesg. Possibly due to an init script that quietens the
console output. Using an earlier log message improves reliability.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The ppc64 and s390x tests were first marked skipIf GITLAB_CI by commit
c0c8687ef0 ("tests/avocado: disable BootLinuxPPC64 test in CI"), and
commit 0f26d94ec9 ("tests/acceptance: skip s390x_ccw_vrtio_tcg on
GitLab") due to being very heavy-weight for gitlab CI.
Commit 9b45cc9931 ("docs/devel: rationalise unstable gitlab tests under
FLAKY_TESTS") changed this to being flaky but it isn't really, it just
had a long runtime.
So take the SPEED=slow variable from qtests and introduce it to avocado,
and make these tests require it.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
is_prefix_insn_excp() loads the first word of the instruction address
which caused an exception, to determine whether or not it was prefixed
so the prefix bit can be set in [H]SRR1.
This works if the instruction image can be loaded, but if the exception
was caused by an ifetch, this load could fail and cause a recursive
exception and crash. Machine checks caused by ifetch are not excluded
from the prefix check and can crash (see issue 2108 for an example).
Fix this by excluding machine checks caused by ifetch from the prefix
check.
Cc: qemu-stable@nongnu.org
Acked-by: Cédric Le Goater <clg@kaod.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2108
Fixes: 55a7fa34f8 ("target/ppc: Machine check on invalid real address access on POWER9/10")
Fixes: 5a5d3b23cb ("target/ppc: Add SRR1 prefix indication to interrupt handlers")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The move to decodetree flipped the inequality test for the VEC / VSX
MSR facility check.
This caused application crashes under Linux, where these facility
unavailable interrupts are used for lazy-switching of VEC/VSX register
sets. Getting the incorrect interrupt would result in wrong registers
being loaded, potentially overwriting live values and/or exposing
stale ones.
Cc: qemu-stable@nongnu.org
Reported-by: Joel Stanley <joel@jms.id.au>
Fixes: 70426b5bb7 ("target/ppc: moved stxvx and lxvx from legacy to decodtree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1769
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Tested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
The processor tracing features in cpu_x86_cpuid() are hardcoded to a set
that should be safe on all processor that support PT virtualization.
But as an additional check, x86_cpu_filter_features() also checks
that the accelerator supports that safe subset, and if not it marks
CPUID_7_0_EBX_INTEL_PT as unavailable.
This check fails on accelerators other than KVM, but it is actually
unnecessary to do it because KVM is the only accelerator that uses the
safe subset. Everything else just provides nonzero values for CPUID
leaf 0x14 (TCG/HVF because processor tracing is not supported; qtest
because nothing is able to read CPUID anyway). Restricting the check
to KVM fixes a warning with the qtest accelerator:
$ qemu-system-x86_64 -display none -cpu max,mmx=off -accel qtest
qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.07H:EBX.intel-pt [bit 25]
The warning also happens in the test-x86-cpuid-compat qtest.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2096
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240221162910.101327-1-pbonzini@redhat.com>
Fixes: d047402436 ("target/i386: Call accel-agnostic x86_cpu_get_supported_cpuid()")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU has historically used variable length arrays only very rarely.
Variable length arrays are a potential security issue where an
on-stack dynamic allocation isn't correctly size-checked, especially
when the size comes from the guest. (An example problem of this kind
from the past is CVE-2021-3527). Forbidding them entirely is a
defensive measure against further bugs of this kind.
Enable -Wvla to prevent any new uses from sneaking into the codebase.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240125173211.1786196-3-peter.maydell@linaro.org>
[thuth: rebased to current master branch]
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240221162636.173136-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
To be able to compile QEMU with -Wvla (to prevent potential security
issues), we need to get rid of the variable length array in the
kvmppc_save_htab() function. Replace it with a heap allocation instead.
Message-ID: <20240221162636.173136-2-thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When compiling with "configure --without-default-devices", the
dbus-display-test fails since it implicitly assumes that the
machine comes with a default console.
There doesn't seem to be an easy way to figure this during build time,
so skip the tests requiring the Console interface at runtime.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240221073759.171443-1-marcandre.lureau@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If "configure" has been run with "--without-default-devices", there is
no e1000 device in the binaries, so the boot-serial-test currently fails
in that case since it tries to use the e1000 with the sam460ex machine.
Since we're testing the serial output here, and not the NIC, let's
simply switch to the "pci-bridge" device here instead, which should
always be there for PCI-based machines like the sam460ex.
Message-ID: <20240219111030.384158-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The cdrom test skips to execute on LoongArch system with command
"make check", this patch enables cdrom test for LoongArch virt
machine platform.
With this patch, cdrom test passes to run on LoongArch virt
machine type.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240217100230.134042-1-maobibo@loongson.cn>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
From the 68010 a word with the frame format and exception vector
are placed on the stack before the PC and SR.
M68K_FEATURE_QUAD_MULDIV is currently checked to workout if to do
this or not for the configured CPU but that flag isn't set for
68010 so currently the exception stack when 68010 is configured
is incorrect.
It seems like checking M68K_FEATURE_MOVEFROMSR_PRIV would do but
adding a new flag that shows exactly what is going on here is
maybe clearer.
Add a new flag for the behaviour, M68K_FEATURE_EXCEPTION_FORMAT_VEC,
and set it for 68010 and above, and then use it to control if the
format and vector word are pushed/pop during exception entry/exit.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2164
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Message-ID: <20240115101643.2165387-1-daniel@0x0f.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Misc HW patch queue
- Remove sysbus_add_io (Phil)
- Build PPC 4xx PCI host bridges once (Phil)
- Display QOM path while debugging SMBus targets (Joe)
- Simplify x86 PC code (Bernhard)
- Remove qemu_[un]register_reset() calls in x86 PC CMOS (Peter)
- Fix wiring of ICH9 LPC interrupts (Bernhard)
- Split core IDE as device / bus / dma (Thomas)
- Prefer QDev API over QOM for devices (Phil)
- Fix invalid use of DO_UPCAST() in Leon3 (Thomas)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmXXQ1IACgkQ4+MsLN6t
# wN4e2xAAig55EJh/JwpdGx55rFUab3Ay22jgXrExmBir8hzhyzssY+RUj2ALRa5e
# T26kxCEqiuT549FtWm/ci6kVax0QD6bqz/6/j451XB9469Z/3BDOV5rhsqF6zlr5
# BMbyC8PKnMUluG8v1ZuRjC3m2lK3ZvkVnZtj7SZUR50ssEnR32fVIziN14/OYkts
# 2B24sLrnLBfvyatMRsuFqGWrcbtMdnwNpjenGfDPOTF33W1sxTQ8GSvx1RV32l69
# Yr/iCVoCl+rGxbLLP1TwqtOwzk32p8RsbIt6rWMqVMv/p5F6ezFeiOk7VHnnEJRH
# e7TPxt4XeLGPARMQLT3gQh0MGIIodanSHePRBkczuNmKYTJrz+5jMu2Qg4MmMUE/
# TV0fKgdjh/edhAOHzJgZqLmNV71icl8WBjfsw2qT4ZwgJzWq7YM2/XZKkeWhk2nQ
# whLxfgiU4PNJ6vHhebJNjOovCYQTK2FbXR+PvVn5FEbH4CuFr8mqkYc+vNYM9dLA
# b7uMk1H8kcb5+kqfPPU2lVd1wO7uqhxYOYU2O9nYq8aw7ioLoLeEdj2IicLtrA/H
# GMtyA5cYeabeRzSXF30tM2AR1uQ/e4Z7oNxW6z3GVK1NrQtKilqPgMKut8uWYvva
# crJLpRQhGiY3sDrIkkCcAHzv256dZaJNLR1KPViaHOyVPZV+x2s=
# =+h2O
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 22 Feb 2024 12:51:30 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'hw-misc-20240222' of https://github.com/philmd/qemu: (32 commits)
hw/sparc/leon3: Fix wrong usage of DO_UPCAST macro
hw/ide: Stop exposing internal.h to non-IDE files
hw/ide: Remove the include/hw/ide.h legacy file
hw/ide: Move IDE bus related definitions to a new header ide-bus.h
hw/ide: Move IDE device related definitions to ide-dev.h
hw/ide: Move IDE DMA related definitions to a separate header ide-dma.h
hw/ide: Split qdev.c into ide-bus.c and ide-dev.c
hw/ide: Add the possibility to disable the CompactFlash device in the build
hw/acpi/ich9_tco: Include missing 'migration/vmstate.h' header
hw/acpi/cpu: Use CPUState typedef
hw/acpi: Include missing 'qapi/qapi-types-acpi.h' generated header
hw/isa/meson.build: Sort alphabetically
hw/i386/pc_q35: Populate interrupt handlers before realizing LPC PCI function
hw/i386/pc_sysfw: Use qdev_is_realized() instead of QOM API
hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove it
hw/i386/pc: Confine system flash handling to pc_sysfw
hw/i386/pc: Defer smbios_set_defaults() to machine_done
hw/i386/pc: Merge pc_guest_info_init() into pc_machine_initfn()
hw/i386/x86: Turn apic_xrupt_override into class attribute
hw/i386/pc: Do pc_cmos_init_late() from pc_machine_done()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# include/hw/i386/pc.h
Python is transitioning to a world where you're not allowed to use 'pip
install' outside of a virutal env by default. The rationale is to stop
use of pip clashing with distro provided python packages, which creates
a major headache on distro upgrades.
All our CI environments, however, are 100% disposable so the upgrade
headaches don't exist. Thus we can undo the python defaults to allow
pip to work.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240222114038.2348718-1-berrange@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
MSYS2 is dropping support for 32-bit Windows. This shows up for us
as various packages we were using in our CI job no longer being
available to install, which causes the job to fail. In commit
8e31b744fd we dropped the dependency on libusb and spice, but the
dtc package has also now been removed.
For us as QEMU upstream, "32 bit x86 hosts for system emulation" have
already been deprecated as of QEMU 8.0, so we are ready to drop them
anyway.
Drop the msys2-32bit CI job, as the first step in doing this.
This is cc'd to stable, because this job will also be broken for CI
on the stable branches. We can't drop 32-bit support entirely there,
but we will still be covering at least compilation for 32-bit Windows
via the cross-win32-system job.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240220165602.135695-1-peter.maydell@linaro.org
leon3.c currently fails to compile with some compilers when the -Wvla
option has been enabled:
../hw/sparc/leon3.c: In function ‘leon3_cpu_reset’:
../hw/sparc/leon3.c:153:5: error: ISO C90 forbids variable length array
‘offset_must_be_zero’ [-Werror=vla]
153 | ResetData *s = (ResetData *)DO_UPCAST(ResetData, info[id], info);
| ^~~~~~~~~
cc1: all warnings being treated as errors
Looking at this code, the DO_UPCAST macro is indeed used in a wrong way
here: DO_UPCAST is supposed to check that the second parameter is the
first entry of the struct that the first parameter indicates, but since
we use and index into the info[] array, this of course cannot work.
The intention here was likely rather to use the container_of() macro
instead, so switch the code accordingly.
Fixes: d65aba8286 ("hw/sparc/leon3: implement multiprocessor")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240221180751.190489-1-thuth@redhat.com>
Tested-by: Clément Chigot <chigot@adacore.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
include/hw/ide/internal.h is currently included by include/hw/ide/pci.h
and thus exposed to a lot of files that are not part of the IDE subsystem.
Stop including internal.h there and use the appropriate new headers
ide-bus.h and ide-dma.h instead.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240220085505.30255-8-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
qdev.c is a mixture between IDE bus specific functions and IDE device
functions. Let's split it up to make it more obvious which part is
related to bus handling and which part is related to device handling.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240220085505.30255-3-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
For distros like downstream RHEL, it would be helpful to allow to disable
the CompactFlash device. For making this possible, we need a separate
Kconfig switch for this device, and the code should reside in a separate
file. Let's also introduce a new header ide-dev.h which can be used to
collect definitions related to IDE devices.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240220085505.30255-2-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
ACPIOSTInfo is a QAPI generated structure:
$ git grep -w ACPIOSTInfo
qapi/acpi.json:81:# @ACPIOSTInfo:
qapi/acpi.json:99:{ 'struct': 'ACPIOSTInfo',
qapi/acpi.json:109:# Return a list of ACPIOSTInfo for devices that support status
Include the "qapi/qapi-types-acpi.h" header to avoid the following
errors when including "hw/acpi/cpu.h" or "hw/acpi/memory_hotplug.h"
elsewhere:
include/hw/acpi/cpu.h:67:52: error: unknown type name 'ACPIOSTInfoList'
void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list);
^
include/hw/acpi/memory_hotplug.h:51:55: error: unknown type name 'ACPIOSTInfoList'
void acpi_memory_ospm_status(MemHotplugState *mem_st, ACPIOSTInfoList ***list);
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240219141412.71418-2-philmd@linaro.org>
The interrupt handlers need to be populated before the device is realized since
internal devices such as the RTC are wired during realize(). If the interrupt
handlers aren't populated, devices such as the RTC will be wired with a NULL
interrupt handler, i.e. MC146818RtcState::irq is NULL.
Fixes: fc11ca08bc "hw/i386/q35: Realize LPC PCI function before accessing it"
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240217104644.19755-1-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
pc_system_flash_create() checked for pcmc->pci_enabled which is redundant since
its caller already checked it. The method can be turned into just two lines, so
inline and remove it.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-8-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Rather than distributing PC system flash handling across three files, let's
confine it to one. Now, pc_system_firmware_init() creates, configures and cleans
up the system flash which makes the code easier to understand. It also avoids
the extra call to pc_system_flash_cleanup_unused() in the Xen case.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-7-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Handling most of smbios data generation in the machine_done notifier is similar
to how the ARM virt machine handles it which also calls smbios_set_defaults()
there. The result is that all pc machines are freed from explicitly worrying
about smbios setup.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The attribute isn't user-changeable and only true for pc-based machines. Turn it
into a class attribute which allows for inlining pc_guest_info_init() into
pc_machine_initfn().
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-4-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the i386 PC machine, we want to run the pc_cmos_init_late()
function only once the IDE and floppy drive devices have been set up.
We currently do this using qemu_register_reset(), and then have the
function call qemu_unregister_reset() on itself, so it runs exactly
once.
This was an expedient way to do it back in 2010 when we first added
this (in commit c0897e0cb9), but now we have a more obvious point
to do "machine initialization that has to happen after generic device
init": the machine-init-done hook.
Do the pc_cmos_init_late() work from our existing PC machine init
done hook function, so we can drop the use of qemu_register_reset()
and qemu_unregister_reset().
Because the pointers to the devices we need (the IDE buses and the
RTC) are now all in the machine state, we don't need the
pc_cmos_init_late_arg struct and can just pass the PCMachineState
pointer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240220160622.114437-3-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
ppc440_pcix.c is moved from the target specific ppc_ss[] meson
source set to pci_ss[] which is common to all targets: the
object is built once.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240215105017.57748-5-philmd@linaro.org>
ppc4xx_pci.c is moved from the target specific ppc_ss[] meson
source set to pci_ss[] which is common to all targets: the
object is built once.
Declare PPC4XX_PCI selector in pci-host/Kconfig.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240215105017.57748-4-philmd@linaro.org>
sysbus_add_io(...) is a simple wrapper to
memory_region_add_subregion(get_system_io(), ...).
It is used in 3 places; inline it directly.
Rationale: we want to move to an explicit I/O bus,
rather that an implicit one. Besides in heterogeneous
setup we can have more than one I/O bus.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240216150441.45681-1-philmd@linaro.org>
[PMD: Include missing "exec/address-spaces.h" header]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.
Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.
Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When running "configure" with "--without-default-devices", building
of qemu-system-hppa currently fails with:
/usr/bin/ld: libqemu-hppa-softmmu.fa.p/hw_hppa_machine.c.o: in function `machine_HP_common_init_tail':
hw/hppa/machine.c:399: undefined reference to `usb_bus_find'
/usr/bin/ld: hw/hppa/machine.c:399: undefined reference to `usb_create_simple'
/usr/bin/ld: hw/hppa/machine.c:400: undefined reference to `usb_bus_find'
/usr/bin/ld: hw/hppa/machine.c:400: undefined reference to `usb_create_simple'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
make: *** [Makefile:162: run-ninja] Error 1
And after fixing this, the qemu-system-hppa binary refuses to run
due to the missing 'pci-ohci' and 'pci-serial' devices. Let's add
the right config switches to fix these problems.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Arguments `ram_block` are reassigned to local declarations `block`
without further use. Remove re-assignment to reduce noise.
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
TYPE_PORT92 inherits TYPE_ISA_DEVICE, so need to include
"hw/isa/isa.h" to get its declarations (currently we
indirectly include this header via "hw/i386/pc.h").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since pc_madt_cpu_entry() is only used by:
- hw/i386/acpi-build.c // single call
- hw/i386/acpi-common.c // definition
there is no need to expose it outside of hw/i386/.
Declare it in "acpi-common.h".
acpi-build.c doesn't need "hw/i386/pc.h" anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit c461f3e382 ("hw/acpi/acpi_dev_interface: Remove now unused
madt_cpu virtual method") removed the need for "hw/i386/pc.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
HPET_INTCAP is specific to TYPE_HPET, so define it there.
hpet.c doesn't need to include "hw/i386/pc.h" anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Rename NB_PORTS as EHCI_PORTS to avoid definition clash
with UHCI equivalent:
hw/usb/hcd-ehci.h:40:9: error: 'NB_PORTS' macro redefined [-Werror,-Wmacro-redefined]
#define NB_PORTS 6 /* Max. Number of downstream ports */
^
hw/usb/hcd-uhci.h:38:9: note: previous definition is here
#define NB_PORTS 2
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Rename NB_PORTS as UHCI_PORTS to avoid definition clash
with EHCI equivalent:
hw/usb/hcd-uhci.h:38:9: error: 'NB_PORTS' macro redefined [-Werror,-Wmacro-redefined]
#define NB_PORTS 2
^
hw/usb/hcd-ehci.h:40:9: note: previous definition is here
#define NB_PORTS 6 /* Max. Number of downstream ports */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We are going to modify these lines, fix their style
in order to avoid checkpatch.pl warning.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In `qemu_console_resize()`, the old surface of the console is keeped if the new
console size is the same as the old one. If the old surface is a placeholder,
and the new size of console is the same as the placeholder surface (640*480),
the surface won't be replace.
In this situation, the surface's `QEMU_PLACEHOLDER_FLAG` flag is still set, so
the console won't be displayed in SDL display mode.
This patch fixes this problem by forcing a new surface if the old one is a
placeholder.
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240207172024.8-1-bobby825@126.com>
With VNC, a client can send a non-extended VNC_MSG_CLIENT_CUT_TEXT
message with len=0. In qemu_clipboard_set_data(), the clipboard info
will be updated setting data to NULL (because g_memdup(data, size)
returns NULL when size is 0). If the client does not set the
VNC_ENCODING_CLIPBOARD_EXT feature when setting up the encodings, then
the 'request' callback for the clipboard peer is not initialized.
Later, because data is NULL, qemu_clipboard_request() can be reached
via vdagent_chr_write() and vdagent_clipboard_recv_request() and
there, the clipboard owner's 'request' callback will be attempted to
be called, but that is a NULL pointer.
In particular, this can happen when using the KRDC (22.12.3) VNC
client.
Another scenario leading to the same issue is with two clients (say
noVNC and KRDC):
The noVNC client sets the extension VNC_FEATURE_CLIPBOARD_EXT and
initializes its cbpeer.
The KRDC client does not, but triggers a vnc_client_cut_text() (note
it's not the _ext variant)). There, a new clipboard info with it as
the 'owner' is created and via qemu_clipboard_set_data() is called,
which in turn calls qemu_clipboard_update() with that info.
In qemu_clipboard_update(), the notifier for the noVNC client will be
called, i.e. vnc_clipboard_notify() and also set vs->cbinfo for the
noVNC client. The 'owner' in that clipboard info is the clipboard peer
for the KRDC client, which did not initialize the 'request' function.
That sounds correct to me, it is the owner of that clipboard info.
Then when noVNC sends a VNC_MSG_CLIENT_CUT_TEXT message (it did set
the VNC_FEATURE_CLIPBOARD_EXT feature correctly, so a check for it
passes), that clipboard info is passed to qemu_clipboard_request() and
the original segfault still happens.
Fix the issue by handling updates with size 0 differently. In
particular, mark in the clipboard info that the type is not available.
While at it, switch to g_memdup2(), because g_memdup() is deprecated.
Cc: qemu-stable@nongnu.org
Fixes: CVE-2023-6683
Reported-by: Markus Frank <m.frank@proxmox.com>
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Markus Frank <m.frank@proxmox.com>
Message-ID: <20240124105749.204610-1-f.ebner@proxmox.com>
The extended clipboard message protocol requires that the client
activate the extension by requesting a psuedo encoding. If this
is not done, then any extended clipboard messages from the client
should be considered invalid and the client dropped.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240115095119.654271-1-berrange@redhat.com>
The build-previous-qemu job is now trying to fetch from the upstream
repository, but the tag is only fetched into FETCH_HEAD:
$ git remote add upstream https://gitlab.com/qemu-project/qemu 00:00
$ git fetch upstream $QEMU_PREV_VERSION 00:02
warning: redirecting to https://gitlab.com/qemu-project/qemu.git/
From https://gitlab.com/qemu-project/qemu
* tag v8.2.0 -> FETCH_HEAD
$ git checkout $QEMU_PREV_VERSION 00:02
error: pathspec v8.2.0 did not match any file(s) known to git
Fix by fetching the tag into the checkout itself.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow boards to use the device creation functions even if USB itself
is not available; of course the functions will fail inexorably, but
this can be okay if the calls are conditional on the existence of
some USB host controller device. This is for example the case for
hw/mips/loongson3_virt.c.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With more than three years since Meson was introduced in the build system, people
have had quite some time to move away from the foo-softmmu/qemu-system-* and
foo-linux-user/qemu-* symbolic links. Remove them, and with them another
instance of the "softmmu" name for system emulators.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Calls to is_enabled are bounded to indices that actually exist in
the SuperIO device. Therefore, the is_enabled functions in
smc37c669 are not doing anything and they can be removed.
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ensure that the value is valid; it can only be zero or one.
And never create a floppy disk controller if it is zero.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no use of ptimer functions in mips_cps.c or any other related
code.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID leaf 7 was grouped together with SGX leaf 0x12 by commit
b9edbadefb ("i386: Propagate SGX CPUID sub-leafs to KVM") by mistake.
SGX leaf 0x12 has its specific logic to check if subleaf (starting from 2)
is valid or not by checking the bit 0:3 of corresponding EAX is 1 or
not.
Leaf 7 follows the logic that EAX of subleaf 0 enumerates the maximum
valid subleaf.
Fixes: b9edbadefb ("i386: Propagate SGX CPUID sub-leafs to KVM")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240125024016.2521244-4-xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pc_machine_kvm_type() was introduced by commit e21be724ea ("i386/xen:
add pc_machine_kvm_type to initialize XEN_EMULATE mode") to do Xen
specific initialization by utilizing kvm_type method.
commit eeedfe6c63 ("hw/xen: Simplify emulated Xen platform init")
moves the Xen specific initialization to pc_basic_device_init().
There is no need to keep the PC specific kvm_type() implementation
anymore. So we'll fallback to kvm_arch_get_default_type(), which
simply returns 0.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20231007065819.27498-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When msys2 updated their libusb packages to libusb 1.0.27, they
dropped support for building them for mingw32, leaving only mingw64
packages. This broke our CI job, as the 'pacman' package install now
fails with:
error: target not found: mingw-w64-i686-libusb
error: target not found: mingw-w64-i686-usbredir
(both these binary packages are from the libusb source package).
Similarly, spice is now 64-bit only:
error: target not found: mingw-w64-i686-spice
Fix this by dropping these packages from the list we install for our
msys2-32bit build. We do this with a simple mechanism for the
msys2-64bit and msys2-32bit jobs to specify a list of extra packages
to install on top of the common ones we install for both jobs.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2160
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20240215155009.2422335-1-peter.maydell@linaro.org
Since commit effd60c8 changed how QMP commands are processed, the order
of the block-commit return value and job events in iotests 144 wasn't
fixed and more and caused the test to fail intermittently.
Change the test to cache events first and then print them in a
predefined order.
Waiting three times for JOB_STATUS_CHANGE is a bit uglier than just
waiting for the JOB_STATUS_CHANGE that has "status": "ready", but the
tooling we have doesn't seem to allow the latter easily.
Fixes: effd60c878
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2126
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20240209173103.239994-1-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
AHCIState::ports should be unsigned. Besides, we never
check it for negative value. It is unlikely it was ever
used with more than INT32_MAX ports, so it is safe to
convert it to unsigned.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240213081201.78951-7-philmd@linaro.org>
Since commit b04d989054 ("SPARC: Emulation of Leon3") the
main_cpu_reset() handler sets both pc/npc when the CPU is
reset, after the machine is realized. It is pointless to
set it in leon3_generic_hw_init().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Clément Chigot <chigot@adacore.com>
Message-Id: <20240130113102.6732-3-philmd@linaro.org>
CPUSPARCState::irq_manager holds a pointer to a QDev,
so declare it as DeviceState instead of void.
Move the comment about Leon3 fields.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Clément Chigot <chigot@adacore.com>
Message-Id: <20240130113102.6732-3-philmd@linaro.org>
Right now all subclasses of TYPE_ISA_SUPERIO have to specify an instance_size,
because the ISASuperIODevice struct adds fields to ISADevice but the type does
not include the increased instance size. Failure to do so results in an access
past the bounds of struct ISADevice as soon as isa_superio_realize is called.
Fix this by specifying the instance_size already in the superclass.
Fixes: 4c3119a6e3 ("hw/isa/superio: Factor out the parallel code from pc87312.c")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240213155005.109954-6-pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
ISA_SUPERIO does not provide an ISA bus, so it should not select the symbol:
instead it requires one. Among its users, VT82C686 is the only one that
is a PCI-ISA bridge and does not already select ISA_BUS.
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240213155005.109954-5-pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This board has a lot of UARTs: there is one UART per CPU in the
per-CPU peripheral part of the address map, whose interrupts are
connected as per-CPU interrupt lines. Then there are 4 UARTs in the
normal part of the peripheral space, whose interrupts are shared
peripheral interrupts.
Connect and wire them all up; this involves some OR gates where
multiple overflow interrupts are wired into one GIC input.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240206132931.38376-11-peter.maydell@linaro.org
The AN536 is another FPGA image for the MPS3 development board. Unlike
the existing FPGA images we already model, this board uses a Cortex-R
family CPU, and it does not use any equivalent to the M-profile
"Subsystem for Embedded" SoC-equivalent that we model in hw/arm/armsse.c.
It's therefore more convenient for us to model it as a completely
separate C file.
This commit adds the basic skeleton of the board model, and the
code to create all the RAM and ROM. We assume that we're probably
going to want to add more images in future, so use the same
base class/subclass setup that mps2-tz.c uses, even though at
the moment there's only a single subclass.
Following commits will add the CPUs and the peripherals.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240206132931.38376-9-peter.maydell@linaro.org
The MPS2 SCC device is broadly the same for all FPGA images, but has
minor differences in the behaviour of the CFG registers depending on
the image. In many cases we don't really care about the functionality
controlled by these registers and a reads-as-written or similar
behaviour is sufficient for the moment.
For the AN536 the required behaviour is:
* A_CFG0 has CPU reset and halt bits
- implement as reads-as-written for the moment
* A_CFG1 has flash or ATCM address 0 remap handling
- QEMU doesn't model this; implement as reads-as-written
* A_CFG2 has QSPI select (like AN524)
- implemented (no behaviour, as with AN524)
* A_CFG3 is MCC_MSB_ADDR "additional MCC addressing bits"
- QEMU doesn't care about these, so use the existing
RAZ behaviour for convenience
* A_CFG4 is board rev (like all other images)
- no change needed
* A_CFG5 is ACLK frq in hz (like AN524)
- implemented as reads-as-written, as for other boards
* A_CFG6 is core 0 vector table base address
- implemented as reads-as-written for the moment
* A_CFG7 is core 1 vector table base address
- implemented as reads-as-written for the moment
Make the changes necessary for this; leave TODO comments where
appropriate to indicate where we might want to come back and
implement things like CPU reset.
The other aspects of the device specific to this FPGA image (like the
values of the board ID and similar registers) will be set via the
device's qdev properties.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240206132931.38376-8-peter.maydell@linaro.org
The MPS SCC device has a lot of different flavours for the various
different MPS FPGA images, which look mostly similar but have
differences in how particular registers are handled. Currently we
deal with this with a lot of open-coded checks on scc_partno(), but
as we add more board types this is getting a bit hard to read.
Factor out the conditions into some functions which we can
give more descriptive names to.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240206132931.38376-7-peter.maydell@linaro.org
We currently guard the CFG3 register read with
(scc_partno(s) == 0x524 && scc_partno(s) == 0x547)
which is clearly wrong as it is never true.
This register is present on all board types except AN524
and AN527; correct the condition.
Fixes: 6ac8081894 ("hw/misc/mps2-scc: Implement changes for AN547")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240206132931.38376-6-peter.maydell@linaro.org
Architecturally, the AArch32 MSR/MRS to/from banked register
instructions are UNPREDICTABLE for attempts to access a banked
register that the guest could access in a more direct way (e.g.
using this insn to access r8_fiq when already in FIQ mode). QEMU has
chosen to UNDEF on all of these.
However, for the case of accessing SPSR_hyp from hyp mode, it turns
out that real hardware permits this, with the same effect as if the
guest had directly written to SPSR. Further, there is some
guest code out there that assumes it can do this, because it
happens to work on hardware: an example Cortex-R52 startup code
fragment uses this, and it got copied into various other places,
including Zephyr. Zephyr was fixed to not use this:
https://github.com/zephyrproject-rtos/zephyr/issues/47330
but other examples are still out there, like the selftest
binary for the MPS3-AN536.
For convenience of being able to run guest code, permit
this UNPREDICTABLE access instead of UNDEFing it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240206132931.38376-5-peter.maydell@linaro.org
The Cortex-R52 implements the Configuration Base Address Register
(CBAR), as a read-only register. Add ARM_FEATURE_CBAR_RO to this CPU
type, so that our implementation provides the register and the
associated qdev property.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240206132931.38376-3-peter.maydell@linaro.org
We support two different encodings for the AArch32 IMPDEF
CBAR register -- older cores like the Cortex A9, A7, A15
have this at 4, c15, c0, 0; newer cores like the
Cortex A35, A53, A57 and A72 have it at 1 c15 c0 0.
When we implemented this we picked which encoding to
use based on whether the CPU set ARM_FEATURE_AARCH64.
However this isn't right for three cases:
* the qemu-system-arm 'max' CPU, which is supposed to be
a variant on a Cortex-A57; it ought to use the same
encoding the A57 does and which the AArch64 'max'
exposes to AArch32 guest code
* the Cortex-R52, which is AArch32-only but has the CBAR
at the newer encoding (and where we incorrectly are
not yet setting ARM_FEATURE_CBAR_RO anyway)
* any possible future support for other v8 AArch32
only CPUs, or for supporting "boot the CPU into
AArch32 mode" on our existing cores like the A57 etc
Make the decision of the encoding be based on whether
the CPU implements the ARM_FEATURE_V8 flag instead.
This changes the behaviour only for the qemu-system-arm
'-cpu max'. We don't expect anybody to be relying on the
old behaviour because:
* it's not what the real hardware Cortex-A57 does
(and that's what our ID register claims we are)
* we don't implement the memory-mapped GICv3 support
which is the only thing that exists at the peripheral
base address pointed to by the register
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240206132931.38376-2-peter.maydell@linaro.org
QDev objects created with qdev_new() need to manually add
their parent relationship with object_property_add_child().
This commit plug the devices which aren't part of the SoC;
they will be plugged into a SoC container in the next one.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240213155214.13619-4-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It doesn't make sense to read the value of MDCR_EL2 on a non-A-profile
CPU, and in fact if you try to do it we will assert:
#6 0x00007ffff4b95e96 in __GI___assert_fail
(assertion=0x5555565a8c70 "!arm_feature(env, ARM_FEATURE_M)", file=0x5555565a6e5c "../../target/arm/helper.c", line=12600, function=0x5555565a9560 <__PRETTY_FUNCTION__.0> "arm_security_space_below_el3") at ./assert/assert.c:101
#7 0x0000555555ebf412 in arm_security_space_below_el3 (env=0x555557bc8190) at ../../target/arm/helper.c:12600
#8 0x0000555555ea6f89 in arm_is_el2_enabled (env=0x555557bc8190) at ../../target/arm/cpu.h:2595
#9 0x0000555555ea942f in arm_mdcr_el2_eff (env=0x555557bc8190) at ../../target/arm/internals.h:1512
We might call pmu_counter_enabled() on an M-profile CPU (for example
from the migration pre/post hooks in machine.c); this should always
return false because these CPUs don't set ARM_FEATURE_PMU.
Avoid the assertion by not calling arm_mdcr_el2_eff() before we
have done the early return for "PMU not present".
This fixes an assertion failure if you try to do a loadvm or
savevm for an M-profile board.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2155
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240208153346.970021-1-peter.maydell@linaro.org
Currently QEMU will warn if there is a NIC on the board that
is not connected to a backend. By default the '-nic user' will
get used for all NICs, but if you manually connect a specific
NIC to a specific backend, then the other NICs on the board
have no backend and will be warned about:
qemu-system-arm: warning: nic npcm7xx-emc.1 has no peer
qemu-system-arm: warning: nic npcm-gmac.0 has no peer
qemu-system-arm: warning: nic npcm-gmac.1 has no peer
So suppress those warnings by manually connecting every NIC
on the board to some backend.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240206171231.396392-3-peter.maydell@linaro.org
The patchset adding the GMAC ethernet to this SoC crossed in the
mail with the patchset cleaning up the NIC handling. When we
create the GMAC modules we must call qemu_configure_nic_device()
so that the user has the opportunity to use the -nic commandline
option to create a network backend and connect it to the GMACs.
Add the missing call.
Fixes: 21e5326a7c ("hw/arm: Add GMAC devices to NPCM7XX SoC")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-id: 20240206171231.396392-2-peter.maydell@linaro.org
Armv8.1+ CPUs have the Virtual Host Extension (VHE) which adds a
non-secure EL2 virtual timer. We implemented the timer itself in the
CPU model, but never wired up its IRQ line to the GIC.
Wire up the IRQ line (this is always safe whether the CPU has the
interrupt or not, since it always creates the outbound IRQ line).
Report it to the guest via dtb and ACPI if the CPU has the feature.
The DTB binding is documented in the kernel's
Documentation/devicetree/bindings/timer/arm\,arch_timer.yaml
and the ACPI table entries are documented in the ACPI specification
version 6.3 or later.
Because the IRQ line ACPI binding is new in 6.3, we need to bump the
FADT table rev to show that we might be using 6.3 features.
Note that exposing this IRQ in the DTB will trigger a bug in EDK2
versions prior to edk2-stable202311, for users who use the virt board
with 'virtualization=on' to enable EL2 emulation and are booting an
EDK2 guest BIOS, if that EDK2 has assertions enabled. The effect is
that EDK2 will assert on bootup:
ASSERT [ArmTimerDxe] /home/kraxel/projects/qemu/roms/edk2/ArmVirtPkg/Library/ArmVirtTimerFdtClientLib/ArmVirtTimerFdtClientLib.c(72): PropSize == 36 || PropSize == 48
If you see that assertion you should do one of:
* update your EDK2 binaries to edk2-stable202311 or newer
* use the 'virt-8.2' versioned machine type
* not use 'virtualization=on'
(The versions shipped with QEMU itself have the fix.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Message-id: 20240122143537.233498-3-peter.maydell@linaro.org
We deliberately don't include qtests_npcm7xx in qtests_aarch64,
because we already get the coverage of those tests via qtests_arm,
and we don't want to use extra CI minutes testing them twice.
In commit 327b680877 we added it to qtests_aarch64; revert
that change.
Fixes: 327b680877 ("tests/qtest: Creating qtest for GMAC Module")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240206163043.315535-1-peter.maydell@linaro.org
The raven_io_ops MemoryRegionOps is the only one in the source tree
which sets .valid.unaligned to indicate that it should support
unaligned accesses and which does not also set .impl.unaligned to
indicate that its read and write functions can do the unaligned
handling themselves. This is a problem, because at the moment the
core memory system does not implement the support for handling
unaligned accesses by doing a series of aligned accesses and
combining them (system/memory.c:access_with_adjusted_size() has a
TODO comment noting this).
Fortunately raven_io_read() and raven_io_write() will correctly deal
with the case of being passed an unaligned address, so we can fix the
missing unaligned access support by setting .impl.unaligned in the
MemoryRegionOps struct.
Fixes: 9a1839164c ("raven: Implement non-contiguous I/O region")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-id: 20240112134640.1775041-1-peter.maydell@linaro.org
When the Rutabaga GPU device frees resources, it calls
rutabaga_resource_unref for that resource_id. However, when the generic
VirtIOGPU functions destroys resources, it only removes the
virtio_gpu_simple_resource from the device's VirtIOGPU->reslist list.
The rutabaga resource associated with that resource_id is then leaked.
This commit overrides the resource_destroy class method introduced in
the previous commit to fix this.
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <e3778e44c98a35839de2f4938e5355449fa3aa14.1706626470.git.manos.pitsidianakis@linaro.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Previously not all references mentioned any spec version at all.
Given r3.1 is the current specification available for evaluation at
www.computeexpresslink.org update references to refer to that.
Hopefully this won't become a never ending job.
A few structure definitions have been updated to add new fields.
Defaults of 0 and read only are valid choices for these new DVSEC
registers so go with that for now.
There are additional error codes and some of the 'questions' in
the comments are resolved now.
Update documentation reference to point to the CXL r3.1 specification
with naming closer to what is on the cover.
For cases where there are structure version numbers, add defines
so they can be found next to the register definitions.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126121636.24611-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Found whilst testing a series for the linux kernel that actually
bothers to check if enabled is set. 0xB is the option used
for vast majority of DSDT entries in QEMU.
It is a little odd for a device that doesn't really exist and
is simply a hook to tell the OS there is a CEDT table but 0xB
seems a reasonable choice and avoids need to special case
this device in the OS.
Means:
* Device present.
* Device enabled and decoding it's resources.
* Not shown in UI
* Functioning properly
* No battery (on this device!)
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-12-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The _STA value returned currently indicates the ACPI0017 device
is not enabled. Whilst this isn't a real device, setting _STA
like this may prevent an OS from enumerating it correctly and
hence from parsing the CEDT table.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-11-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the current mdev_reg_read() implementation, it consistently returns
that the Media Status is Ready (01b). This was fine until commit
25a52959f9 ("hw/cxl: Add support for device sanitation") because the
media was presumed to be ready.
However, as per the CXL 3.0 spec "8.2.9.8.5.1 Sanitize (Opcode 4400h)",
during sanitation, the Media State should be set to Disabled (11b). The
mentioned commit correctly sets it to Disabled, but mdev_reg_read()
still returns Media Status as Ready.
To address this, update mdev_reg_read() to read register values instead
of returning dummy values.
Note that __toggle_media() managed to not only write something
that no one read, it did it to the wrong register storage and
so changed the reported mailbox size which was definitely not
the intent. That gets fixed as a side effect of allocating
separate state storage for this register.
Fixes: commit 25a52959f9 ("hw/cxl: Add support for device sanitation")
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
s->iommu_pcibus_by_bus_num is a IOMMUPciBus pointer cache indexed
by bus number, bus number may not always be a fixed value,
i.e., guest reboot to different kernel which set bus number with
different algorithm.
This could lead to endpoint binding to wrong iommu MR in
virtio_iommu_get_endpoint(), then vfio device setup wrong
mapping from other device.
Remove the memset in virtio_iommu_device_realize() to avoid
redundancy with memset in system reset.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240125073706.339369-2-zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Due to my own limitation on bandwidth, I noticed that unfortunately I won't
have time to review VT-d patches at least in the near future. Meanwhile I
expect a lot of possibilities could actually happen in this area in the
near future.
To reflect that reality, I decided to drop myself from the VT-d role. It
shouldn't affect much since we still have Jason around like usual, and
Michael on top. But I assume it'll always be good if anyone would like to
fill this role up.
I'll still work on QEMU. So I suppose anyone can still copy me if one
thinks essential.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20240118091035.48178-1-peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
The VIA south bridges are able to relocate and toggle (enable or disable) their
SuperI/O functions. So far this is hardcoded such that all functions are always
enabled and are located at fixed addresses.
Some PC BIOSes seem to probe for I/O occupancy before activating such a function
and issue an error in case of a conflict. Since the functions are currently
enabled on reset, conflicts are always detected. Prevent that by implementing
relocation and toggling of the SuperI/O functions.
Note that all SuperI/O functions are now deactivated upon reset (except for
VT82C686B's serial ports where Fuloong 2e's rescue-yl seems to expect them to be
enabled by default). Rely on firmware to configure the functions accordingly.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20240114123911.4877-12-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is a preparation for implementing relocation and toggling of SuperI/O
functions in the VT8231 device model. Upon reset, all SuperI/O functions will be
deactivated, so in case if no -bios is given, let the machine configure those
functions the same way Pegasos II firmware would do.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20240114123911.4877-11-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The real SuperI/O chips emulated by QEMU allow for relocating and enabling or
disabling their SuperI/O functions via software. So far this is not implemented.
Prepare for that by adding isa_parallel_set_{enabled,iobase}.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-10-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The real SuperI/O chips emulated by QEMU allow for relocating and enabling or
disabling their SuperI/O functions via software. So far this is not implemented.
Prepare for that by adding isa_serial_set_{enabled,iobase}.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-9-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The real SuperI/O chips emulated by QEMU allow for relocating and enabling or
disabling their SuperI/O functions via software. So far this is not implemented.
Prepare for that by adding isa_fdc_set_{enabled,iobase}.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-8-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
allow to enable or disable their SuperI/O functions. Add a convenience function
for implementing this in the VIA south bridges.
The naming of the functions is inspired by its memory_region_set_enabled()
pendant.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
are able to relocate their SuperI/O functions. Add a convenience function for
implementing this in the VIA south bridges.
This convenience function relies on previous simplifications in exec/ioport
which avoids some duplicate synchronization of I/O port base addresses. The
naming of the function is inspired by its memory_region_set_address() pendant.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
portio_list_add_1() creates a MemoryRegionPortioList instance which holds a
MemoryRegion `mr` and an array of MemoryRegionPortio elements named `ports`.
Each element in the array gets assigned the same value for its .base attribute.
The same value also ends up as the .addr attribute of `mr` due to the
memory_region_add_subregion() call. This means that all .base attributes are
the same as `mr.addr`.
The only usages of MemoryRegionPortio::base were in portio_read() and
portio_write(). Both functions get above MemoryRegionPortioList as their
opaque parameter. In both cases find_portio() can only return one of the
MemoryRegionPortio elements of the `ports` array. Due to above observation any
element will have the same .base value equal to `mr.addr` which is also
accessible.
Hence, `mrpio->mr.addr` is equivalent to `mrp->base` and
MemoryRegionPortio::base is redundant and can be removed.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU populates the apic_state attribute of x86 CPUs if supported by real
hardware or if SMP is active. When handling interrupts, it just checks whether
apic_state is populated to route the interrupt to the PIC or to the APIC.
However, chapter 10.4.3 of [1] requires that:
When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an
IA-32 processor without an on-chip APIC.
This means that when apic_state is populated, QEMU needs to check for the
MSR_IA32_APICBASE_ENABLE flag in addition. Implement this which fixes some
real-world BIOSes.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Vol. 3A:
System Programming Guide, Part 1
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240106132546.21248-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The if statement currently uses double negation when executing the else branch.
So swap the branches and simplify the condition to make the code more
comprehensible.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240106132546.21248-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit adds XTSup configuration to let user choose to whether enable
this feature or not. When XTSup is enabled, additional bytes in IRTE with
enabled guest virtual VAPIC are used to support 32-bit destination id.
Additionally, this commit exports IVHD type 0x11 besides the old IVHD type
0x10 in ACPI table. IVHD type 0x10 does not report full set of IOMMU
features only the legacy ones, so operating system (e.g. Linux) may only
detects x2APIC support if IVHD type 0x11 is available. The IVHD type 0x10
is kept so that old operating system that only parses type 0x10 can detect
the IOMMU device.
Besides, an amd_iommu-stub.c file is created to provide the definition for
amdvi_extended_feature_register when CONFIG_AMD_IOMMU=n. This function is
used by acpi-build.c to get the extended feature register value for
building the ACPI table. When CONFIG_AMD_IOMMU=y, this function is defined
in amd_iommu.c.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-7-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit adds support for x2APIC transitions when writing to
MSR_IA32_APICBASE register and finally adds CPUID_EXT_X2APIC to
TCG_EXT_FEATURES.
The set_base in APICCommonClass now returns an integer to indicate error in
execution. apic_set_base return -1 on invalid APIC state transition,
accelerator can use this to raise appropriate exception.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-4-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit extends the APIC ID to 32-bit long and remove the 255 max APIC
ID limit in userspace APIC. The array that manages local APICs is now
dynamically allocated based on the max APIC ID of created x86 machine.
Also, new x2APIC IPI destination determination scheme, self IPI and x2APIC
mode register access are supported.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-3-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit creates apic_register_read/write which are used by both
apic_mem_read/write for MMIO access and apic_msr_read/write for MSR access.
The apic_msr_read/write returns -1 on error, accelerator can use this to
raise the appropriate exception.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-2-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Lets keep a cleaner split between the base class and the derived
vhost-user-device which we can use for generic vhost-user stubs. This
includes an update to introduce the vq_size property so the number of
entries in a virtq can be defined.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240104210945.1223134-2-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Character backends are actually QOM types. When a backend's
compile-time conditional QOM type is not compiled in, creation fails
with "'FOO' is not a valid char driver name". Okay, except
introspecting chardev-add with query-qmp-schema doesn't work then: the
backend type is there even though the QOM type isn't.
A management application can work around this issue by using
qom-list-types instead.
Fix the issue anyway: add the conditionals to the QAPI schema.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240203080228.2766159-4-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The __linux__ version of qemu_chr_open_pp_fd() tries to claim the
parport device with a PPCLAIM ioctl(). On success, it stores the file
descriptor in the chardev object, and returns success. On failure, it
closes the file descriptor, and returns failure.
chardev_new() then passes the Chardev to object_unref(). This duly
calls char_parallel_finalize(), which closes the file descriptor
stored in the chardev object. Since qemu_chr_open_pp_fd() didn't
store it, it's still zero, so this closes standard input. Ooopsie.
To demonstate, add a unit test. With the bug above unfixed, running
this test closes standard input. char_hotswap_test() happens to run
next. It opens a socket, duly gets file descriptor 0, and since it
tests for success with > 0 instead of >= 0, it fails.
The new unit test needs to be conditional exactly like the chardev it
tests. Since the condition is rather complicated, steal the solution
from the serial chardev: define HAVE_CHARDEV_PARALLEL in qemu/osdep.h.
This also permits simplifying chardev/meson.build a bit.
The bug fix is easy enough: store the file descriptor, and leave
closing it to char_parallel_finalize().
The next commit will fix char_hotswap_test()'s test for success.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240203080228.2766159-2-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Test fixed up for BSDs, indentation fixed up, commit message improved]
qemu-sparc queue
# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmXLxQweHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfn3UH/2blaWblrlMBQlGQ
# fkQOI2IGCJ5yRuh70roTY2aPnUyfc70IvZMvYtHElRD0UqYaQgxSjBbnmsqdS+9c
# IKJG3qlDbnu0GBKKpxw9pmtHJ5NsaAl9E9jLZEX6ISu2rWrBHt4XisZhz8U5cVuc
# dmlM4onk2F3+UcfGh4ACPNwtbYqQHEfWwsLuYPdyDdI647Vs6fEgIjeixBi3BcpN
# lzyzquu/AB5SMXRnKaP5CUHC01TM/US2HuZfZ4PzyA0CmIi1od4RHE1iEN7JNWyC
# ki/dasFoELfeoEU/6JrfPOx65v+91hhkBzN+oC4eV3r5COQkmW7PTmlqS269sH5w
# SZsOWcM=
# =T2mw
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 13 Feb 2024 19:37:48 GMT
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* tag 'qemu-sparc-20240213' of https://github.com/mcayland/qemu: (88 commits)
esp.c: add my copyright to the file
esp.c: switch TypeInfo registration to use DEFINE_TYPES() macro
esp.c: keep track of the DRQ state during DMA
esp.c: rename irq_data IRQ to drq_irq
esp.c: implement DMA Transfer Pad command for DATA phases
esp.c: replace n variable with len in esp_do_nodma()
esp.c: consolidate DMA and PDMA logic in STATUS and MESSAGE IN phases
esp.c: remove redundant n variable in PDMA COMMAND phase
esp.c: consolidate DMA and PDMA logic in MESSAGE OUT phase
esp.c: consolidate DMA and PDMA logic in DATA IN phase
esp.c: consolidate DMA and PDMA logic in DATA OUT phase
esp.c: only transfer non-DMA MESSAGE OUT phase data for specific commands
esp.c: only transfer non-DMA COMMAND phase data for specific commands
esp.c: improve ESP_RSEQ logic consolidation
esp.c: handle non-DMA FIFO writes used to terminate DMA commands
esp.c: remove restriction on FIFO read access when DMA memory routines defined
esp.c: handle TC underflow for DMA SCSI requests
esp.c: don't clear the SCSI phase when reading ESP_RINTR
esp.c: ensure that STAT_INT is cleared when reading ESP_RINTR
esp.c: consolidate end of command sequence after ICCS command
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently the DRQ IRQ is updated every time DMA data is sent/received which
is both inefficient and causes excessive logging of the DRQ state. Add a
new drq_state bool that only updates the DRQ IRQ if its state changes.
This commit adds the new drq_state bool to the migration state: since the
version number has already been increased earlier in the series, there is
no need to repeat it again here. The DRQ IRQ is (currently) only used for
PDMA transfers which already have a migration break in this series so
there are no problems setting its value post-load.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-87-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The Transfer Pad command is used to either drop incoming FIFO data during the
DATA IN phase or generate a series of zero bytes in the FIFO during the DATA
OUT phase.
Implement the DMA Transfer Pad command for the DATA phases which is used by
the NeXTCube firmware in the DATA IN phase to ignore part of the incoming SCSI
data as it is copied into memory.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-85-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The contents of the FIFO should only be copied to cmdfifo for ESP commands that
are sending data to the SCSI bus, which are the SEL_* commands and the TI
command. Otherwise any incoming data should be held in the FIFO as normal.
This fixes booting of NetBSD m68k under the Q800 machine once again.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-78-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The contents of the FIFO should only be copied to cmdfifo for ESP commands that
are sending data to the SCSI bus, which are the SEL_* commands and the TI
command. Otherwise any incoming data should be held in the FIFO as normal.
This fixes booting of really old 32-bit SPARC Linux kernels such as Aurelien's
debian_etch_sparc_small.qcow2 test image.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-77-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The ESP_RSEQ logic is scattered in a few places throughout the ESP state machine
which is mainly because the ESP_RSEQ register isn't always reset when executing
an ESP select command. Once this is done, the ESP_RSEQ register only needs to be
updated at the point where the sequencer command completes.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-76-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Certain versions of MacOS send the first 5 bytes of the CDB using DMA and then
send the last byte of the CDB by writing to the FIFO. Update the non-DMA state
machine to detect the end of the CDB and execute the SCSI command using similar
logic as that which already exists for transferring the remainder of the CDB
using the ESP TI command.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-75-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The latest state machines can handle mixing DMA and non-DMA FIFO access for all
SCSI phases except DATA IN and DATA OUT. For DATA IN and DATA OUT phases, the
transfer is complete when TC == 0 and the updated logic will now handle TC
underflow correctly, which makes it just about impossible to manually manipulate
the FIFO during a DMA transfer.
Remove the restriction on FIFO read access when DMA memory routines are defined
which also allows the NeXTCube machine to pass its self-test.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-74-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Detect the case where the guest underflows TC by requesting a DMA transfer which
is larger than the available data. If this case is detected, immediately
complete the SCSI request and handle any remaining FIFO accesses in the STATUS
phase by raising INTR_BS once the FIFO is below the threshold.
Note that handling the premature SCSI bus phase change in the case of TC
underflow fixes booting EMILE on m68k once again.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-73-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Both esp_raise_irq() and esp_lower_irq() check the STAT_INT bit in ESP_RSTAT
to ensure that the IRQ is raised or lowered if its state changes. When reading
ESP_RINTR, esp_lower_irq() was being called *after* ESP_RSTAT had been
cleared meaning that STAT_INT was already clear, and so if STAT_INT was
asserted beforehand then the esp_lower_irq() would have no effect.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-71-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The end of command sequences for the ICCS command are currently different
between the DMA and non-DMA versions, and also different from the description
in the datasheet.
Update the sequence so that only INTR_FC is asserted in both cases, and keep
all the logic in esp_do_dma() and esp_do_nodma() rather than having some of
it within esp_run_cmd().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-70-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Currently any write to the ESP FIFO in the MESSAGE OUT or COMMAND phases will
manually raise the bus service interrupt. Instead of duplicating the interrupt
logic in esp_reg_write(), update esp_do_nodma() to correctly process incoming
FIFO data during the MESSAGE OUT and COMMAND phases. Part of this change is to
call esp_nodma_ti_dataout() from handle_ti() to ensure that the DATA OUT phase
FIFO transfer only occurs when executing a non-DMA TI command instead of for
each byte entering the FIFO.
One slight complication is that NextSTEP uses multiple TI commands to transfer
the CDB one byte at a time (as opposed to loading the FIFO and using a single
TI command), so it is necessary to determine the expected length of the SCSI
CDB being received. This is handled by the introduction of a new
esp_cdb_length() function which returns the expected SCSI CDB length based
upon the first command byte.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-67-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
In the case where a SCSI command with a DATA IN phase has been issued, the host
may preload the FIFO with unaligned bytes before issuing the main DMA transfer.
When accumulating data in the FIFO don't raise the INTR_BS interrupt until the
TI command is issued, otherwise the unexpected interrupt can confuse the host.
In particular this is needed to prevent the MacOS Disk Utility from failing
when switching non-DMA transfers to use esp_do_nodma().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-65-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
According to the datasheet the previous ESP command remains in the ESP_CMD
register, which caused a problem when consecutive TI commands were issued as
it becomes impossible for the state machine to know when the first TI
command finishes.
This was the original reason for introducing the ti_cmd field which kept
track of the last written command for this purpose. However closer reading
of the datasheet shows that a TI command that terminates due to a change of
SCSI target phase resets the ESP_CMD register to zero which solves this
problem.
Now that this has been fixed in the previous commit, remove the unneeded
ti_cmd field and access the ESP_CMD register directly instead. Bump the
vmstate_esp version to indicate that the ti_cmd field is no longer included.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-64-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
In the PDMA case the last transfer from the device to the FIFO has occurred
(async_len is zero) but esp_do_dma() is still being called to drain the
remaining FIFO contents.
The additional non-zero transfer check ensures that we still defer the SCSI
layer in the case where we are waiting for data for a TI command or a DMA
enable signal.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-35-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
There are two cases here: the first is when the TI command underflows, in which
case we raise INTR_BS to indicate an early change of phase, and the second is
when the TI command overflows because the host requested a transfer for more
data than is available. In the latter case force TC to zero so that the TI
completion logic executes correctly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-30-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Ensure that the async_len checks for requesting data from the SCSI layer and
the TC == 0 checks to detect the end of the DMA transfer are consistent in both
do_dma_pdma_cb() and esp_do_dma(). In particular this involves adding the check
to see if the FIFO is at its low threshold since PDMA and mixed DMA and non-DMA
requests can leave data remaining in the FIFO.
At the same time update all the comments so that they are also consistent between
all similar code paths.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-29-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The fifo8_pop_buf() function returns a pointer to the FIFO buffer up to the
specified length. Since the FIFO buffer is modelled as an array then once
the FIFO wraps around, only the continuous portion of the buffer can be
returned.
In future the use of continuous and unaligned accesses will advance the
internal FIFO head pointer, so modify esp_fifo_pop_buf() to ensure that
any wraparound content is also returned up to the requested length.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240112125420.514425-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
target/hppa: Enhancements and fixes
Some enhancements and fixes for the hppa target.
The major change is, that this patchset adds a new SeaBIOS-hppa firmware
which is built as 32- and 64-bit firmware.
The new 64-bit firmware is necessary to fully support 64-bit operating systems
(HP-UX, Linux, NetBSD,...).
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZcquAQAKCRD3ErUQojoP
# X9pjAQCVsWyuYlGCW2paIGVWKV0vsOpwetUrbhRtFUZGqZxb4AD9FbMsXRcCN/oq
# CotBPY/a8MEzIQcwYl5QbcI5nNW4ygs=
# =RA0B
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 12 Feb 2024 23:47:13 GMT
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'hppa64-pull-request' of https://github.com/hdeller/qemu-hppa:
hw/hppa/machine: Load 64-bit firmware on 64-bit machines
target/hppa: Update SeaBIOS-hppa to version 16
hw/net/tulip: add chip status register values
target/hppa: PDC_BTLB_INFO uses 32-bit ints
target/hppa: Allow read-access to PSW with rsm 0,reg instruction
lasi: Add reset I/O ports for LASI audio and FDC
target/hppa: Implement do_transaction_failed handler for I/O errors
lasi: allow access to LAN MAC address registers
hw/pci-host/astro: Implement Hard Fail and Soft Fail mode
hw/pci-host/astro: Avoid aborting on access failure
target/hppa: Add "diag 0x101" for console output support
disas/hppa: Add disassembly for qemu specific instructions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Include "exec/memory.h" in order to avoid:
monitor/hmp-cmds-target.c:263:10: error: call to undeclared function 'memory_region_is_ram';
ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
if (!memory_region_is_ram(mrs.mr) && !memory_region_is_romd(mrs.mr)) {
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Include "exec/memory.h" in order to avoid:
cpu-target.c:201:50: error: use of undeclared identifier 'TYPE_MEMORY_REGION'
DEFINE_PROP_LINK("memory", CPUState, memory, TYPE_MEMORY_REGION,
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The detailed help lists zoom-to-fit as valid option but it is missing
from the short option summary. Add it there too.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
'a == b ? false : true' is a rather convoluted way of writing 'a != b'.
Use the more obvious way to write it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit aa09b3d5f8 (stats: Move QMP commands from monitor/ to stats/)
created section Stats, but neglected to add qapi/stats.json to it.
Fix that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu_smbios_type8_opts did not have the list terminator and that
resulted in out-of-bound memory access. It also needs to have an element
for the type option.
Cc: qemu-stable@nongnu.org
Fixes: fd8caa253c ("hw/smbios: support for type 8 (port connector)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu_smbios_type11_opts did not have the list terminator and that
resulted in out-of-bound memory access. It also needs to have an element
for the type option.
Cc: qemu-stable@nongnu.org
Fixes: 2d6dcbf93f ("smbios: support setting OEM strings table")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use device_class_set_parent_realize() to set parent realize() directly.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use device_class_set_parent_realize() to set parent realize() directly.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use device_class_set_parent_realize() to set parent realize() directly.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use device_class_set_parent_realize() to set parent realize() directly.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Load the 64-bit SeaBIOS-hppa firmware by default when running on a 64-bit
machine. This will enable us to later support more than 4GB of RAM and is
required that the OS (or PALO bootloader) will start or install a 64-bit kernel
instead of a 32-bit kernel.
Note that SeaBIOS-hppa v16 provides the "-fw_cfg opt/OS64,string=3" option with
which the user can control what the firmware shall report back to the OS:
Support of 32-bit OS, support of a 64-bit OS, or support for both (default).
Wrap firmware loading inside !qtest_enabled() to avoid this warning with
qtest: "qemu-system-hppa: no firmware provided".
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
- LUKS support for detached headers
- Update x86 CPU model docs and script
- Add missing close of chardev QIOChannel
- More trace events o nTKS handshake
- Drop unsafe VNC constants
- Increase NOFILE limit during startup
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmXGMNUACgkQvobrtBUQ
# T998JQ//SqQ3L/AZmhE5cIwZ1XipSMMZ/yEoVIyniA3tL41S7Oimj3O9XvY68TEG
# nnj9Oh+zOlVLxauTHAczveJ7z+XfonQZS3HrbGRUTHU+ezGVjyM618e/h9pSQtYI
# +CCkrjtey1NoT42/um4D/bKg/B2XQeulS+pD12Z9l5zbqEZiw0R9+UwVIJ52G811
# 5UQgIjJ7GNFzalxqiMCkGc0nTyU8keEXQJcdZ4droo42DnU4pZeQWGDimzP61JnW
# 1Crm6aZSuUriUbVmxJde+2eEdPSR4rr/yQ4Pw06hoi1QJALSgGYtOTo8+qsyumHd
# us/2ouMrxOMdsIk4ViAkSTiaje9agPj84VE1Z229Y/uqZcEAuX572n730/kkzqUv
# ZDKxMz0v3rzpkjFmsgj5D4yqJaQp4zn1zYm98ld7HWJVIOf3GSvpaNg9J6jwN7Gi
# HKKkvYns9pxg3OSx++gqnM32HV6nnMDFiddipl/hTiUsnNlnWyTDSvJoNxIUU5+l
# /uEbbdt8xnxx1JP0LiOhgmz6N6FU7oOpaPuJ5CD8xO2RO8D1uBRvmpFcdOTDAfv0
# uYdjhKBI+quKjE64p7gNWYCoqZtipRIJ6AY2VaPU8XHx8GvGFwBLX64oLYiYtrBG
# gkv3NTHRkMhQw9cGQcZIgZ+OLU+1eNF+m9EV7LUjuKl0HWC3Vjs=
# =61zI
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 09 Feb 2024 14:04:05 GMT
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu:
tests: Add case for LUKS volume with detached header
crypto: Introduce 'detached-header' field in QCryptoBlockInfoLUKS
block: Support detached LUKS header creation using qemu-img
block: Support detached LUKS header creation using blockdev-create
crypto: Modify the qcrypto_block_create to support creation flags
qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKS
crypto: Support LUKS volume with detached header
io: add trace event when cancelling TLS handshake
chardev: close QIOChannel before unref'ing
docs: re-generate x86_64 ABI compatibility CSV
docs: fix highlighting of CPU ABI header rows
scripts: drop comment about autogenerated CPU API file
softmmu: remove obsolete comment about libvirt timeouts
ui: drop VNC feature _MASK constants
qemu_init: increase NOFILE soft limit on POSIX
crypto: Introduce SM4 symmetric cipher algorithm
meson: sort C warning flags alphabetically
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use of String is problematic, because it results in awkward interface
documentation. The previous commit cleaned up one instance.
Move String out of common.json next to its remaining users in net.json
to discourage reuse elsewhere.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-15-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
SocketAddress branch @fd is documented in enum SocketAddressType,
unlike the other branches. That's because the branch's type is String
from common.json.
Use a local copy of String, so we can put the documentation in the
usual place.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-14-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The conversion of simple to flat unions left the @data members
undocumented. Add documentation where it's trivial. Copy verbatim
from the wrapped type's description where possible.
Leftovers: String (to be taken care of in the next commit), and
TransActionAction (left for another day).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-13-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add missing return member documentation of guest-get-disks,
guest-get-devices, guest-get-diskstats, and guest-get-cpustats.
The NVMe SMART information returned by guest-getdisks remains
undocumented. Add a TODO there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-10-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The QAPI generator forces you to document your stuff. Except for
command arguments, event data, and members of enum and object types:
these the generator silently "documents" as "Not documented".
We can't require proper documentation there without first fixing all
the offenders. We've always had too many offenders to pull that off.
Right now, we have more than 500. Worse, we seem to fix old ones no
faster than we add new ones: in the past year, we fixed 22 ones, but
added 26 new ones.
To help arrest the backsliding, make missing documentation an error
unless the command, type, or event is in listed in new pragma
documentation-exceptions.
List all the current offenders: 117 commands and types in qapi/, and 9
in qga/.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-7-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
QAPISchemaGenRSTVisitor._nodes_for_members() has a special case to
auto-generate documentation for a union tag member of implicit (enum)
type that lacks documentation.
This was useful for simple unions, where the tag member's type was
implicitly. The only implicit enum type left today is 'QType'. Not
worth a special case. Drop. No change to generated documentation.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-6-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
docs/devel/qapi-code-gen demands that the "second and subsequent lines
of sections other than "Example"/"Examples" should be indented".
Commit a937b6aa739q (qapi: Reformat doc comments to conform to current
conventions) missed a few instances, and messed up a few others.
Clean that up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-5-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The description of @bins ends with a literal block:
# @bins: list of io request counts corresponding to histogram
# intervals, one more element than @boundaries has. For the
# example above, @bins may be something like [3, 1, 5, 2], and
# corresponding histogram looks like:
#
# ::
#
# 5| *
Except it actually ends *before* the block: the unindented '::' line
starts a new section. Makes no sense.
We could fix this by indenting the '::' line. Instead, double the
colon at the end of the preceding paragraph, and drop the '::' line.
This shifts the box for the literal block right in generated
documentation, so it lines up with the description.
Fixes: commit a0fcff383b (qapi: Use rST markup for literal blocks)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-4-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
SeaBIOS-hppa version 16 news & enhancements:
- Initial 64-bit firmware release
- Added fault handler to catch and report firmware bugs
- Use Qemu's builtin_console_out() via diag 0x101
- parisc-qemu-install Makefile target to install firmware in qemu
- Added -fw_cfg opt/OS64,string=3 option
Fixes:
- Avoid crash when booting without SCSI controller
- Avoid possible crashes while detecting LASI LAN & graphics
- Don't check layers in PDC_MEM_MAP_HPA, fixes NetBSD
- Ensure cache definition does not trigger endless loops
- Mark B160L as 32-bit machine in inventory
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Netbsd isn't able to detect a link on the emulated tulip card. That's
because netbsd reads the Chip Status Register of the Phy (address
0x14). The default phy data in the qemu tulip driver is all zero,
which means no link is established and autonegotation isn't complete.
Therefore set the register to 0x3b40, which means:
Link is up, Autonegotation complete, Full Duplex, 100MBit/s Link
speed.
Also clear the mask because this register is read only.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
The BTLB helper function stores the BTLB info (four 32-bit ints) into
the memory of the guest. They are only available when emulating a 32-bit
CPU in the guest, so use "uint32_t" instead of "target_ulong" here.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
HP-UX 11 and HP ODE tools use the "rsm 0,%reg" instruction in not priviledged
code paths to get the current PSW flags. The constant 0 means that no bits of
the PSW shall be reset, so this is effectively a read-only access to the PSW.
Allow this read-only access even for not privileged code.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Linux writes zeroes at bootup into the default ports for LASI audio and
LASI floppy controller to reset those devices. Allow writing to those
registers to avoid HPMCs.
Signed-off-by: Helge Deller <deller@gmx.de>
Add the do_transaction_failed() handler to tigger a HPMC to the CPU
in case of I/O transaction errors.
This is a preparation commit.
We still lack implementation for some registers, so do not yet enable sending
HPMCs. Having this hunk here now nevertheless helps for the further
development, so that it can easily be enabled later on.
Signed-off-by: Helge Deller <deller@gmx.de>
Firmware and qemu reads and writes the MAC address for the LASI LAN via
registers in LASI. Allow those accesses and return zero even if LASI
LAN isn't enabled to avoid HPMCs (=crashes).
Signed-off-by: Helge Deller <deller@gmx.de>
The Astro/Elroy chip can work in either Hard-Fail or Soft-Fail mode.
Hard fail means the system bus will send an HPMC (=crash) to the
processor, soft fail means the system bus will ignore timeouts of
MMIO-reads or MMIO-writes and return -1ULL.
The HF mode is controlled by a bit in the status register and is usually
programmed by the OS. Return the corresponing values based on the current
value of that bit.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Instead of stopping the emulation, report a MEMTX_DECODE_ERROR if the OS
tries to access non-existent registers.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
For debugging purposes at the early stage of the bootup process,
the SeaBIOS-hppa firmware sometimes needs to output characters to the
serial console. Note that the serial console is the default output
method for parisc machines.
At this stage PCI busses and other devices haven't been initialized
yet. So, SeaBIOS-hppa will not be able to find the correct I/O ports
for the serial ports yet.
Instead, add an emulation for the "diag 0x101" opcode to assist here.
Without any other dependencies, SeaBIOS-hppa can then load the character
to be printed in register %r26 and issue the diag assembly instruction.
The qemu diag_console_output() helper function will then print
that character to the first serial port.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This regressed qemu-system-xtensa:
TEST test_load_store on xtensa
qemu-system-xtensa: Some ROM regions are overlapping
These ROM regions might have been loaded by direct user request or by default.
They could be BIOS/firmware images, a guest kernel, initrd or some other file loaded into guest memory.
Check whether you intended to load all this guest code, and whether it has been built to load to the correct addresses.
The following two regions overlap (in the memory address space):
test_load_store ELF program header segment 1 (addresses 0x0000000000001000 - 0x0000000000001f26)
test_load_store ELF program header segment 2 (addresses 0x0000000000001ab8 - 0x0000000000001ab8)
make[1]: *** [Makefile:187: run-test_load_store] Error 1
This reverts commit 62570f1434.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-5-alex.bennee@linaro.org>
This might be premature but while streamlining the avocado tests I
realised the only tests we have are "check-tcg" ones. The ageing
fedora-cris-cross image works well enough for developers but can't be
used in CI as we need supported build platforms to build QEMU.
Does this mean the writing is on the wall for this architecture?
Cc: Rabin Vincent <rabinv@axis.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-3-alex.bennee@linaro.org>
Avocado needs sqlite3:
Failed to load plugin from module "avocado.plugins.journal":
ImportError("Module 'sqlite3' is not installed.
Use: sudo zypper install python311 to install it")
>From 'zypper info python311':
"This package supplies rich command line features provided by
readline, and sqlite3 support for the interpreter core, thus forming
a so called "extended" runtime."
Include the appropriate package in the lcitool mappings which will
guarantee the dockerfile gets properly updated when lcitool is
run. Also include the updated dockerfile.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Suggested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240117164227.32143-1-farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-2-alex.bennee@linaro.org>
RISC-V PR for 9.0
* Check for 'A' extension on all atomic instructions
* Add support for 'B' extension
* Internally deprecate riscv_cpu_options
* Implement optional CSR mcontext of debug Sdtrig extension
* Internally add cpu->cfg.vlenb and remove cpu->cfg.vlen
* Support vlenb and vregs[] in KVM
* RISC-V gdbstub and TCG plugin improvements
* Remove vxrm and vxsat from FCSR
* Use RISCVException as return type for all csr ops
* Use g_autofree more and fix a memory leak
* Add support for Zaamo and Zalrsc
* Support new isa extension detection devicetree properties
* SMBIOS support for RISC-V virt machine
* Enable xtheadsync under user mode
* Add rv32i,rv32e and rv64e CPUs
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmXGBRAACgkQr3yVEwxT
# gBPqVA//etMiwP8+lQb2E4pw+QwBIzpm3qFyBlqgSCFrekj1u2kYNd4CH3CKurWE
# ysoQ6OAMeb0MUbRHdjrejjzD/wOg7JNA9h7ynM1VbupveBrJY3GWC6qQWSG+A1j/
# LSgmr/dDya74chDxjxa+7ld3xqloHi5OtdGaeORfdPXl7mjCCKKCoSKYCex1ykup
# uuB7bsjeWeWEbuUsntmeuHJLZJuhpnbuZJmp17tEo+3vWXqjxV00Lik+XMwh3gua
# KOLiAqHjGr2NEhA3Mg1JLcQ+6JLTDM9ugZpQeNGQwMkfuB/RAU7jO/1Di3flbadF
# 8l2xOHu3mydDbfdxTGZNJjcIrMTX/YEewAYZLRYpNsyPOMntgq8HEegwCdWGvK7C
# M5Tc59MNSuBt+zkZkHd21qLYusa2ThP4YT/schh7IA+2F1TSKdhlptEzi2oebIc7
# ilLSgZ9Of72QlAH2OPJNSAL9Nbc06MHEM0JiHIJa5u+XdcVRhZus5h1YIOKXisqF
# YPP22RnI5Jj5d5csa/0ONAZGFh5SRMTJtpjKoKSkzoYJWDjCQ2MiUAOmLscchMZd
# wbK0vjeRf6kRG4U4z7nTmHS9kzH8RXUZDecVcOITuMpKih9LhUiCZ+xPunFYPycJ
# WNFa9/pENcCXJweXvtk4NHwx933rX56678lF6KY2hwUwwaiBOv4=
# =yuRM
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 09 Feb 2024 10:57:20 GMT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240209' of https://github.com/alistair23/qemu: (61 commits)
target/riscv: add rv32i, rv32e and rv64e CPUs
target/riscv/cpu.c: add riscv_bare_cpu_init()
target/riscv: Enable xtheadsync under user mode
qemu-options: enable -smbios option on RISC-V
target/riscv: SMBIOS support for RISC-V virt machine
smbios: function to set default processor family
smbios: add processor-family option
target/riscv: support new isa extension detection devicetree properties
target/riscv: use misa_mxl_max to populate isa string rather than TARGET_LONG_BITS
target/riscv: Expose Zaamo and Zalrsc extensions
target/riscv: Check 'A' and split extensions for atomic instructions
target/riscv: Add Zaamo and Zalrsc extension infrastructure
hw/riscv/virt.c: use g_autofree in create_fdt_*
hw/riscv/virt.c: use g_autofree in virt_machine_init()
hw/riscv/virt.c: use g_autofree in create_fdt_virtio()
hw/riscv/virt.c: use g_autofree in create_fdt_sockets()
hw/riscv/virt.c: use g_autofree in create_fdt_socket_cpus()
hw/riscv/numa.c: use g_autofree in socket_fdt_write_distance_matrix()
hw/riscv/virt-acpi-build.c: fix leak in build_rhct()
target/riscv: Use RISCVException as return type for all csr ops
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Also, add a section to the MAINTAINERS file for detached
LUKS header, it only has a test case in it currently.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When querying the LUKS disk with the qemu-img tool or other APIs,
add information about whether the LUKS header is detached.
Additionally, update the test case with the appropriate
modification.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Even though a LUKS header might be created with cryptsetup,
qemu-img should be enhanced to accommodate it as well.
Add the 'detached-header' option to specify the creation of
a detached LUKS header. This is how it is used:
$ qemu-img create --object secret,id=sec0,data=abc123 -f luks
> -o cipher-alg=aes-256,cipher-mode=xts -o key-secret=sec0
> -o detached-header=true header.luks
Using qemu-img or cryptsetup tools to query information of
an LUKS header image as follows:
Assume a detached LUKS header image has been created by:
$ dd if=/dev/zero of=test-header.img bs=1M count=32
$ dd if=/dev/zero of=test-payload.img bs=1M count=1000
$ cryptsetup luksFormat --header test-header.img test-payload.img
> --force-password --type luks1
Header image information could be queried using cryptsetup:
$ cryptsetup luksDump test-header.img
or qemu-img:
$ qemu-img info 'json:{"driver":"luks","file":{"filename":
> "test-payload.img"},"header":{"filename":"test-header.img"}}'
When using qemu-img, keep in mind that the entire disk
information specified by the JSON-format string above must be
supplied on the commandline; if not, an overlay check will reveal
a problem with the LUKS volume check logic.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[changed to pass 'cflags' to block_crypto_co_create_generic]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Firstly, enable the ability to choose the block device containing
a detachable LUKS header by adding the 'header' parameter to
BlockdevCreateOptionsLUKS.
Secondly, when formatting the LUKS volume with a detachable header,
truncate the payload volume to length without a header size.
Using the qmp blockdev command, create the LUKS volume with a
detachable header as follows:
1. add the secret to lock/unlock the cipher stored in the
detached LUKS header
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type": "secret", "id": "sec0", "data": "foo"}}'
2. create a header img with 0 size
$ virsh qemu-monitor-command vm '{"execute":"blockdev-create",
> "arguments":{"job-id":"job0", "options":{"driver":"file",
> "filename":"/path/to/detached_luks_header.img", "size":0 }}}'
3. add protocol blockdev node for header
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments": {"driver":"file", "filename":
> "/path/to/detached_luks_header.img", "node-name":
> "detached-luks-header-storage"}}'
4. create a payload img with 0 size
$ virsh qemu-monitor-command vm '{"execute":"blockdev-create",
> "arguments":{"job-id":"job1", "options":{"driver":"file",
> "filename":"/path/to/detached_luks_payload_raw.img", "size":0}}}'
5. add protocol blockdev node for payload
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments": {"driver":"file", "filename":
> "/path/to/detached_luks_payload_raw.img", "node-name":
> "luks-payload-raw-storage"}}'
6. do the formatting with 128M size
$ virsh qemu-monitor-command c81_node1 '{"execute":"blockdev-create",
> "arguments":{"job-id":"job2", "options":{"driver":"luks", "header":
> "detached-luks-header-storage", "file":"luks-payload-raw-storage",
> "size":134217728, "preallocation":"full", "key-secret":"sec0" }}}'
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Expand the signature of qcrypto_block_create to enable the
formation of LUKS volumes with detachable headers. To accomplish
that, introduce QCryptoBlockCreateFlags to instruct the creation
process to set the payload_offset_sector to 0.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
To support detached LUKS header creation, make the existing 'file'
field in BlockdevCreateOptionsLUKS optional.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
By enhancing the LUKS driver, it is possible to implement
the LUKS volume with a detached header.
Normally a LUKS volume has a layout:
disk: | header | key material | disk payload data |
With a detached LUKS header, you need 2 disks so getting:
disk1: | header | key material |
disk2: | disk payload data |
There are a variety of benefits to doing this:
* Secrecy - the disk2 cannot be identified as containing LUKS
volume since there's no header
* Control - if access to the disk1 is restricted, then even
if someone has access to disk2 they can't unlock
it. Might be useful if you have disks on NFS but
want to restrict which host can launch a VM
instance from it, by dynamically providing access
to the header to a designated host
* Flexibility - your application data volume may be a given
size and it is inconvenient to resize it to
add encryption.You can store the LUKS header
separately and use the existing storage
volume for payload
* Recovery - corruption of a bit in the header may make the
entire payload inaccessible. It might be
convenient to take backups of the header. If
your primary disk header becomes corrupt, you
can unlock the data still by pointing to the
backup detached header
Take the raw-format image as an example to introduce the usage
of the LUKS volume with a detached header:
1. prepare detached LUKS header images
$ dd if=/dev/zero of=test-header.img bs=1M count=32
$ dd if=/dev/zero of=test-payload.img bs=1M count=1000
$ cryptsetup luksFormat --header test-header.img test-payload.img
> --force-password --type luks1
2. block-add a protocol blockdev node of payload image
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-storage", "driver":"file",
> "filename":"test-payload.img"}}'
3. block-add a protocol blockdev node of LUKS header as above.
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-storage", "driver":"file",
> "filename": "test-header.img" }}'
4. object-add the secret for decrypting the cipher stored in
LUKS header above
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type":"secret", "id":
> "libvirt-2-storage-secret0", "data":"abc123"}}'
5. block-add the raw-drived blockdev format node
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-format", "driver":"raw",
> "file":"libvirt-1-storage"}}'
6. block-add the luks-drived blockdev to link the raw disk
with the LUKS header by specifying the field "header"
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-format", "driver":"luks",
> "file":"libvirt-1-format", "header":"libvirt-2-storage",
> "key-secret":"libvirt-2-format-secret0"}}'
7. hot-plug the virtio-blk device finally
$ virsh qemu-monitor-command vm '{"execute":"device_add",
> "arguments": {"num-queues":"1", "driver":"virtio-blk-pci",
> "drive": "libvirt-2-format", "id":"virtio-disk2"}}'
Starting a VM with a LUKS volume with detached header is
somewhat similar to hot-plug in that both maintaining the
same json command while the starting VM changes the
"blockdev-add/device_add" parameters to "blockdev/device".
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The chardev socket backend will unref the QIOChannel object while
it is still potentially open. When using TLS there could be a
pending TLS handshake taking place. If the channel is left open
then when the TLS handshake callback runs, it can end up accessing
free'd memory in the tcp_chr_tls_handshake method.
Closing the QIOChannel will unregister any pending handshake
source.
Reported-by: jiangyegen <jiangyegen@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This picks up the new EPYC-Genoa, SapphireRapids & GraniteRapids CPUs,
removes the now deleted Icelake-Client CPU, and adds the newer versions
of many existing CPUs.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The 'header-rows' directive indicates how many rows in the generated
table are to be highlighted as headers. We only have one such row in
the CSV file included. This removes the accident bold highlighting
of the 'i486' CPU model.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The RST doc include can't be made to skip the comment indicating the CPU
CSV file is auto-generated when importing it. This comment line was
previously manually removed from the generated output that was committed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
For a long time now, libvirt has pre-created the monitor connection
socket and passed the pre-opened FD into QEMU during startup. Thus
libvirt does not have any timeouts waiting for the monitor socket
to appear, it is immediately connected.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Each VNC feature enum entry has a corresponding _MASK constant
which is the bit-shifted value. It is very easy for contributors
to accidentally use the _MASK constant, instead of the non-_MASK
constant, or the reverse. No compiler warning is possible and
it'll just silently do the wrong thing at runtime.
By introducing the vnc_set_feature helper method, we can drop
all the _MASK constants and thus prevent any future accidents.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In many configurations, e.g. multiple vNICs with multiple queues or
with many Ceph OSDs, the default soft limit of 1024 is not enough.
QEMU is supposed to work fine with file descriptors >= 1024 and does
not use select() on POSIX. Bump the soft limit to the allowed hard
limit to avoid issues with the aforementioned configurations.
Of course the limit could be raised from the outside, but the man page
of systemd.exec states about 'LimitNOFILE=':
> Don't use.
> [...]
> Typically applications should increase their soft limit to the hard
> limit on their own, if they are OK with working with file
> descriptors above 1023,
If the soft limit is already the same as the hard limit, avoid the
superfluous setrlimit call. This can avoid a warning with a strict
seccomp filter blocking setrlimit if NOFILE was already raised before
executing QEMU.
Buglink: https://bugzilla.proxmox.com/show_bug.cgi?id=4507
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016).
SM4 (GBT.32907-2016) is a cryptographic standard issued by the
Organization of State Commercial Administration of China (OSCCA)
as an authorized cryptographic algorithms for the use within China.
Detect the SM4 cipher algorithms and enable the feature silently
if it is available.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When scanning the list of warning flags to see if one is present, it is
helpful if they are in alphabetical order. It is further helpful to
separate out the 'no-' prefixed warnings.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Migration pull
- William's fix on hwpoison migration which used to crash QEMU
- Peter's multifd cleanup + bugfix + optimizations
- Avihai's fix on multifd crash over non-socket channels
- Fabiano's multifd thread-race fix
- Peter's CI fix series
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZcREtRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wacrwEAl2aeQkh51h/e+OKX7MG4/4Y6Edf6Oz7o
# IJLk/cyrUFQA/2exo2lOdv5zHNOJKwAYj8HYDraezrC/MK1eED4Wji0M
# =k53l
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 08 Feb 2024 03:04:21 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-staging-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
ci: Update comment for migration-compat-aarch64
ci: Remove tag dependency for build-previous-qemu
tests/migration-test: Stick with gicv3 in aarch64 test
migration/multifd: Add a synchronization point for channel creation
migration/multifd: Unify multifd and TLS connection paths
migration/multifd: Move multifd_send_setup into migration thread
migration/multifd: Move multifd_send_setup error handling in to the function
migration/multifd: Remove p->running
migration/multifd: Join the TLS thread
migration: Fix logic of channels and transport compatibility check
migration/multifd: Optimize sender side to be lockless
migration/multifd: Fix MultiFDSendParams.packet_num race
migration/multifd: Stick with send/recv on function names
migration/multifd: Cleanup multifd_load_cleanup()
migration/multifd: Cleanup multifd_save_cleanup()
migration/multifd: Rewrite multifd_queue_page()
migration/multifd: Change retval of multifd_send_pages()
migration/multifd: Change retval of multifd_queue_page()
migration/multifd: Split multifd_send_terminate_threads()
migration/multifd: Forbid spurious wakeups
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A bare bones 32 bit RVI CPU, rv32i, will make users lives easier when a
full customized 32 bit CPU is desired, and users won't need to disable
defaults by hand as they would with the rv32 CPU. [1] has an example of
a situation that would be avoided with rv32i.
In fact, add bare bones CPUs for RVE as well. Trying to use RVE in QEMU
requires one to disable every single default extension, including RVI,
and then add the desirable extension set. Adding rv32e/rv64e makes it
more pleasant to use embedded CPUs in QEMU.
[1] https://lore.kernel.org/qemu-riscv/258be47f-97be-4308-bed5-dc34ef7ff954@Spark/
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122123348.973288-3-dbarboza@ventanamicro.com>
[ Changes by AF:
- Rebase on latest changes
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Next patch will add more bare CPUs. Their cpu_init() functions would be
glorified copy/pastes of rv64i_bare_cpu_init(), differing only by a
riscv_cpu_set_misa() call.
Add a new .instance_init for the TYPE_RISCV_BARE_CPU typ to avoid this
code repetition. While we're at it, add a better explanation on why
we're disabling the timing extensions for bare CPUs.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122123348.973288-2-dbarboza@ventanamicro.com>
[ Changes by AF:
- Rebase on latest changes
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Generate SMBIOS tables for the RISC-V mach-virt.
Add CONFIG_SMBIOS=y to the RISC-V default config.
Set the default processor family in the type 4 table.
The implementation is based on the corresponding ARM and Loongson code.
With the patch the following firmware tables are provided:
etc/smbios/smbios-anchor
etc/smbios/smbios-tables
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20240123184229.10415-4-heinrich.schuchardt@canonical.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
For RISC-V the SMBIOS standard requires specific values of the processor
family value depending on the bitness of the CPU.
Add a processor-family option for SMBIOS table 4.
The value of processor-family may exceed 255 and therefore must be provided
in the Processor Family 2 field. Set the Processor Family field to 0xFE
which signals that the Processor Family 2 is used.
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20240123184229.10415-2-heinrich.schuchardt@canonical.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
A few months ago I submitted a patch to various lists, deprecating
"riscv,isa" with a lengthy commit message [0] that is now commit
aeb71e42caae ("dt-bindings: riscv: deprecate riscv,isa") in the Linux
kernel tree. Primarily, the goal was to replace "riscv,isa" with a new
set of properties that allowed for strictly defining the meaning of
various extensions, where "riscv,isa" was tied to whatever definitions
inflicted upon us by the ISA manual, which have seen some variance over
time.
Two new properties were introduced: "riscv,isa-base" and
"riscv,isa-extensions". The former is a simple string to communicate the
base ISA implemented by a hart and the latter an array of strings used
to communicate the set of ISA extensions supported, per the definitions
of each substring in extensions.yaml [1]. A beneficial side effect was
also the ability to define vendor extensions in a more "official" way,
as the ISA manual and other RVI specifications only covered the format
for vendor extensions in the ISA string, but not the meaning of vendor
extensions, for obvious reasons.
Add support for setting these two new properties in the devicetrees for
the various devicetree platforms supported by QEMU for RISC-V. The Linux
kernel already supports parsing ISA extensions from these new
properties, and documenting them in the dt-binding is a requirement for
new extension detection being added to the kernel.
A side effect of the implementation is that the meaning for elements in
"riscv,isa" and in "riscv,isa-extensions" are now tied together as they
are constructed from the same source. The same applies to the ISA string
provided in ACPI tables, but there does not appear to be any strict
definitions of meanings in ACPI land either.
Link: https://lore.kernel.org/qemu-riscv/20230702-eats-scorebook-c951f170d29f@spud/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/riscv/extensions.yaml [1]
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240124-unvarying-foothold-9dde2aaf95d4@spud>
[ Changes by AF:
- Rebase on recent changes
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We have a lot of cases where a char or an uint32_t pointer is used once
to alloc a string/array, read/written during the function, and then
g_free() at the end. There's no pointer re-use - a single alloc, a
single g_free().
Use 'g_autofree' to avoid the g_free() calls.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122221529.86562-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The 'isa' char pointer isn't being freed after use.
Issue detected by Valgrind:
==38752== 128 bytes in 1 blocks are definitely lost in loss record 3,190 of 3,884
==38752== at 0x484280F: malloc (vg_replace_malloc.c:442)
==38752== by 0x5189619: g_malloc (gmem.c:130)
==38752== by 0x51A5BF2: g_strconcat (gstrfuncs.c:628)
==38752== by 0x6C1E3E: riscv_isa_string_ext (cpu.c:2321)
==38752== by 0x6C1E3E: riscv_isa_string (cpu.c:2343)
==38752== by 0x6BD2EA: build_rhct (virt-acpi-build.c:232)
==38752== by 0x6BD2EA: virt_acpi_build (virt-acpi-build.c:556)
==38752== by 0x6BDC86: virt_acpi_setup (virt-acpi-build.c:662)
==38752== by 0x9C8DC6: notifier_list_notify (notify.c:39)
==38752== by 0x4A595A: qdev_machine_creation_done (machine.c:1589)
==38752== by 0x61E052: qemu_machine_creation_done (vl.c:2680)
==38752== by 0x61E052: qmp_x_exit_preconfig.part.0 (vl.c:2709)
==38752== by 0x6220C6: qmp_x_exit_preconfig (vl.c:2702)
==38752== by 0x6220C6: qemu_init (vl.c:3758)
==38752== by 0x425858: main (main.c:47)
Fixes: ebfd392893 ("hw/riscv/virt: virt-acpi-build.c: Add RHCT Table")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122221529.86562-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
vregs[] have variable size that depends on the current vlenb set by the
host, meaning we can't use our regular kvm_riscv_reg_id() to retrieve
it.
Create a generic kvm_encode_reg_size_id() helper to encode any given
size in bytes into a given kvm reg id. kvm_riscv_vector_reg_id() will
use it to encode vlenb into a given vreg ID.
kvm_riscv_(get|set)_vector() can then get/set all 32 vregs.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240123161714.160149-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
KVM will check for the correct 'reg_size' when accessing the vector
registers, erroring with EINVAL if we encode the wrong size in reg ID.
Vector registers varies in size with the vector length in bytes, or
'vlenb'. This means that we need the current 'vlenb' being used by the
host, otherwise we won't be able to fetch all vector regs.
We'll deal with 'vlenb' first. Its support was added in Linux 6.8 as a
get-reg-list register. We'll read 'vlenb' via get-reg-list and mark the
register as 'supported'. All 'vlenb' ops via kvm_arch_get_registers()
and kvm_arch_put_registers() will only be done if the reg is supported,
i.e. we fetched it in get-reg-list during init.
If the user sets a new vlenb value using the 'vlen' property, throw an
error if the user value differs from the host.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240123161714.160149-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We'll re-use the logic froim vext_get_vlmax() in 2 other occurrences in
the next patch, but first we need to make it independent of both 'cpu'
and 'vtype'. To do that, add 'vlenb', 'vsew' and 'lmul' as parameters
instead.
Adapt the two existing callers. In cpu_get_tb_cpu_state(), rename 'sew'
to 'vsew' to be less ambiguous about what we're encoding into *pflags.
In HELPER(vsetvl) the following changes were made:
- add a 'vsew' var to store vsew. Use it in the shift to get 'sew';
- the existing 'lmul' var was renamed to 'vlmul';
- add a new 'lmul' var to store 'lmul' encoded like DisasContext:lmul.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122161107.26737-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Calculate the maximum vector size possible, 'max_sz', which is the size
in bytes 'vlenb' multiplied by the max value of LMUL (LMUL = 8, when
s->lmul = 3).
'max_sz' is then shifted right by 'scale', expressed as '3 - s->lmul',
which is clearer than doing 'scale = lmul - 3' and then using '-scale'
in the shift right.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122161107.26737-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Our usage of 'vlenb' is overwhelming superior than the use of 'vlen'.
We're using 'vlenb' most of the time, having to do 'vlen >> 3' or
'vlen / 8' in every instance.
In hindsight we would be better if the 'vlenb' property was introduced
instead of 'vlen'. That's not what happened, and now we can't easily get
rid of it due to user scripts all around. What we can do, however, is to
change our internal representation to use 'vlenb'.
Add a 'vlenb' field in cpu->cfg. It'll be set via the existing 'vlen'
property, i.e. setting 'vlen' will also set 'vlenb'.
We'll replace all 'vlen >> 3' code to use 'vlenb' directly. Start with
the single instance we have in target/riscv/cpu.c.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122161107.26737-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
After adding a KVM finalize() implementation, turn cbom_blocksize into a
class property. Follow the same design we used with 'vlen' and 'elen'.
The duplicated 'cbom_blocksize' KVM property can be removed from
kvm_riscv_add_cpu_user_properties().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
tested-by tags added, rebased with Alistair's riscv-to-apply.next.
Message-ID: <20240112140201.127083-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
To turn cbom_blocksize and cboz_blocksize into class properties we need
KVM specific changes.
KVM is creating its own version of these options with a customized
setter() that prevents users from picking an invalid value during init()
time. This comes at the cost of duplicating each option that KVM
supports. This will keep happening for each new shared option KVM
implements in the future.
We can avoid that by using the same property TCG uses and adding
specific KVM handling during finalize() time, like TCG already does with
riscv_tcg_cpu_finalize_features(). To do that, the common CPU property
offers a way of knowing if an option was user set or not, sparing us
from doing unneeded syscalls.
riscv_kvm_cpu_finalize_features() is then created using the same
KVMScratch CPU we already use during init() time, since finalize() time
is still too early to use the official KVM CPU for it. cbom_blocksize
and cboz_blocksize are then handled during finalize() in the same way
they're handled by their KVM specific setter.
With this change we can proceed with the blocksize changes in the common
code without breaking the KVM driver.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
tested-by tags added, rebased with Alistair's riscv-to-apply.next.
Message-ID: <20240112140201.127083-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Turning 'vlen' into a class property will allow its default value to be
overwritten by cpu_init() later on, solving the issue we have now where
CPU specific settings are getting overwritten by the default.
Common validation bits are moved from riscv_cpu_validate_v() to
prop_vlen_set() to be shared with KVM.
And, as done with every option we migrated to riscv_cpu_properties[],
vendor CPUs can't have their 'vlen' value changed.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Message-ID: <20240105230546.265053-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
'priv_spec' and 'vext_spec' are two string options used as a fancy way
of setting integers in the CPU state (cpu->env.priv_ver and
cpu->env.vext_ver). It requires us to deal with string parsing and to
store them in cpu_cfg.
We must support these string options, but we don't need to store them.
We have a precedence for this kind of arrangement in target/ppc/compat.c,
ppc_compat_prop_get|set, getters and setters used for the
'max-cpu-compat' class property of the pseries ppc64 machine. We'll do
the same with both 'priv_spec' and 'vext_spec'.
For 'priv_spec', the validation from riscv_cpu_validate_priv_spec() will
be done by the prop_priv_spec_set() setter, while also preventing it to
be changed for vendor CPUs. Add two helpers that converts env->priv_ver
back and forth to its string representation. These helpers allow us to
get a string and set 'env->priv_ver' and return a string giving the
current env->priv_ver value. In other words, make the cpu->cfg.priv_spec
string obsolete.
Last but not the least, move the reworked 'priv_spec' option to
riscv_cpu_properties[].
After all said and done, we don't need to store the 'priv_spec' string in
the CPU state, and we're now protecting vendor CPUs from priv_ver
changes:
$ ./build/qemu-system-riscv64 -M virt -cpu sifive-e51,priv_spec="v1.12.0"
qemu-system-riscv64: can't apply global sifive-e51-riscv-cpu.priv_spec=v1.12.0:
CPU 'sifive-e51' does not allow changing the value of 'priv_spec'
Current 'priv_spec' val: v1.10.0
$
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Message-ID: <20240105230546.265053-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 7f0bdfb5bf ("target/riscv/cpu.c: remove cfg setup from
riscv_cpu_init()") already did some of the work by making some
cpu_init() functions to explictly enable their own 'mmu' default.
The generic CPUs didn't get update by that commit, so they are still
relying on the defaults set by the 'mmu' option. But having 'mmu' and
'pmp' being default=true will force CPUs that doesn't implement these
options to set them to 'false' in their cpu_init(), which isn't ideal.
We'll move 'mmu' to riscv_cpu_properties[] without any defaults, i.e.
the default will be 'false'. Compensate it by manually setting 'mmu =
true' to the generic CPUs that requires it.
Implement a setter for it to forbid the 'mmu' setting to be changed for
vendor CPUs. This will allow the option to exist for all CPUs and, at
the same time, protect vendor CPUs from undesired changes:
$ ./build/qemu-system-riscv64 -M virt -cpu sifive-e51,mmu=true
qemu-system-riscv64: can't apply global sifive-e51-riscv-cpu.mmu=true:
CPU 'sifive-e51' does not allow changing the value of 'mmu'
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Message-ID: <20240105230546.265053-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Every property in riscv_cpu_options[] will be migrated to
riscv_cpu_properties[]. This will make their default values init
earlier, allowing cpu_init() functions to overwrite them. We'll also
implement common getters and setters that both accelerators will use,
allowing them to share validations that TCG is doing.
At the same time, some options (namely 'vlen', 'elen' and the cache
blocksizes) need a way of tracking if the user set a value for them.
This is benign for TCG since the cost of always validating these values
are small, but for KVM we need syscalls to read the host values to make
the validations, thus knowing whether the user didn't touch the values
makes a difference.
We'll track user setting for these properties using a hash, like we do
in the TCG driver. All riscv cpu options will update this hash in case
the user sets it. The KVM driver will use this hash to minimize the
amount of syscalls done.
For now, both 'pmu-mask' and 'pmu-num' shouldn't be changed for vendor
CPUs. The existing setter for 'pmu-num' is changed to add this
restriction. New getters and setters are required for 'pmu-mask'
While we're at it, add a 'static' modifier to 'prop_pmu_num' since we're
not exporting it.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Message-ID: <20240105230546.265053-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
tcg: Introduce TCG_COND_TST{EQ,NE}
target/alpha: Use TCG_COND_TST{EQ,NE}
target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond
target/sparc: Use TCG_COND_TSTEQ in gen_op_mulscc
target/s390x: Use TCG_COND_TSTNE for CC_OP_{TM,ICM}
target/s390x: Improve general case of disas_jcc
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmXBpTAdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/p6gf9HAasTSRECk2cvjW9
# /mcJy0AIaespnI50fG8fm48OoFl0847CdrsJycpZ1spw3W3Wb0cVbMbq/teNMjXZ
# 0SGQJFk9Baq7wMhW7VzhSzJ96pcorpQprp7XBMdheLXqpT4zsM/EuwEAepBk8RUG
# 3kCeo38dswXE681ZafZkd/8pPzII19sQK8eiMpceeYkBsbbep+DDcnE18Ee4kISS
# u0SbuslKVahxd86LKuzrcz0pNFcmFuR5jRP9hmbQ0MfeAn0Pxlndi+ayZNghfgPf
# 3hDjskiionFwxb/OoRj45BssTWfDiluWl7IUsHfegPXCQ2Y+woT5Vq6TVGZn0GqS
# c6RLQQ==
# =TMiE
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 06 Feb 2024 03:19:12 GMT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-tcg-20240205-2' of https://gitlab.com/rth7680/qemu: (39 commits)
tcg/tci: Support TCG_COND_TST{EQ,NE}
tcg/s390x: Support TCG_COND_TST{EQ,NE}
tcg/s390x: Add TCG_CT_CONST_CMP
tcg/s390x: Split constraint A into J+U
tcg/ppc: Support TCG_COND_TST{EQ,NE}
tcg/ppc: Add TCG_CT_CONST_CMP
tcg/ppc: Tidy up tcg_target_const_match
tcg/ppc: Use cr0 in tcg_to_bc and tcg_to_isel
tcg/ppc: Sink tcg_to_bc usage into tcg_out_bc
tcg/sparc64: Support TCG_COND_TST{EQ,NE}
tcg/sparc64: Pass TCGCond to tcg_out_cmp
tcg/sparc64: Hoist read of tcg_cond_to_rcond
tcg/i386: Use TEST r,r to test 8/16/32 bits
tcg/i386: Improve TSTNE/TESTEQ vs powers of two
tcg/i386: Support TCG_COND_TST{EQ,NE}
tcg/i386: Move tcg_cond_to_jcc[] into tcg_out_cmp
tcg/i386: Pass x86 condition codes to tcg_out_cmov
tcg/arm: Support TCG_COND_TST{EQ,NE}
tcg/arm: Split out tcg_out_cmp()
tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Requests that complete in an IOThread use irqfd to notify the guest
while requests that complete in the main loop thread use the traditional
qdev irq code path. The reason for this conditional is that the irq code
path requires the BQL:
if (s->ioeventfd_started && !s->ioeventfd_disabled) {
virtio_notify_irqfd(vdev, req->vq);
} else {
virtio_notify(vdev, req->vq);
}
There is a corner case where the conditional invokes the irq code path
instead of the irqfd code path:
static void virtio_blk_stop_ioeventfd(VirtIODevice *vdev)
{
...
/*
* Set ->ioeventfd_started to false before draining so that host notifiers
* are not detached/attached anymore.
*/
s->ioeventfd_started = false;
/* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
blk_drain(s->conf.conf.blk);
During blk_drain() the conditional produces the wrong result because
ioeventfd_started is false.
Use qemu_in_iothread() instead of checking the ioeventfd state.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-15394
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240122172625.415386-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit d3f6f294ae ("virtio-blk: always set
ioeventfd during startup") has made virtio_blk_start_ioeventfd() always
kick the virtqueue (set the ioeventfd), regardless of whether the BB is
drained. That is no longer necessary, because attaching the host
notifier will now set the ioeventfd, too; this happens either
immediately right here in virtio_blk_start_ioeventfd(), or later when
the drain ends, in virtio_blk_ioeventfd_attach().
With event_notifier_set() removed, the code becomes the same as the one
in virtio_blk_ioeventfd_attach(), so we can reuse that function.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-4-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
During drain, we do not care about virtqueue notifications, which is why
we remove the handlers on it. When removing those handlers, whether vq
notifications are enabled or not depends on whether we were in polling
mode or not; if not, they are enabled (by default); if so, they have
been disabled by the io_poll_start callback.
Because we do not care about those notifications after removing the
handlers, this is fine. However, we have to explicitly ensure they are
enabled when re-attaching the handlers, so we will resume receiving
notifications. We do this in virtio_queue_aio_attach_host_notifier*().
If such a function is called while we are in a polling section,
attaching the notifiers will then invoke the io_poll_start callback,
re-disabling notifications.
Because we will always miss virtqueue updates in the drained section, we
also need to poll the virtqueue once after attaching the notifiers.
Buglink: https://issues.redhat.com/browse/RHEL-3934
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-3-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As of commit 38738f7dbb ("virtio-scsi:
don't waste CPU polling the event virtqueue"), we only attach an io_read
notifier for the virtio-scsi event virtqueue instead, and no polling
notifiers. During operation, the event virtqueue is typically
non-empty, but none of the buffers are intended to be used immediately.
Instead, they only get used when certain events occur. Therefore, it
makes no sense to continuously poll it when non-empty, because it is
supposed to be and stay non-empty.
We do this by using virtio_queue_aio_attach_host_notifier_no_poll()
instead of virtio_queue_aio_attach_host_notifier() for the event
virtqueue.
Commit 766aa2de0f ("virtio-scsi: implement
BlockDevOps->drained_begin()") however has virtio_scsi_drained_end() use
virtio_queue_aio_attach_host_notifier() for all virtqueues, including
the event virtqueue. This can lead to it being polled again, undoing
the benefit of commit 38738f7dbb.
Fix it by using virtio_queue_aio_attach_host_notifier_no_poll() for the
event virtqueue.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Fixes: 766aa2de0f
("virtio-scsi: implement BlockDevOps->drained_begin()")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-2-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blkio_alloc_mem_region() requires that the requested buffer size is a
multiple of the memory-alignment property. If it isn't, the allocation
fails with a return value of -EINVAL.
Fix the call in blkio_resize_bounce_pool() to make sure the requested
size is properly aligned.
I observed this problem with vhost-vdpa, which requires page aligned
memory. As the virtio-blk device behind it still had 512 byte blocks, we
got bs->bl.request_alignment = 512, but actually any request that needed
a bounce buffer and was not aligned to 4k would fail without this fix.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240131173140.42398-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
usb-storage is for the most part just a wrapper around an internally
created scsi-disk device. It uses DEFINE_BLOCK_PROPERTIES() to offer all
of the usual block device properties to the user, but then only forwards
a few select properties to the internal device while the rest is
silently ignored.
This changes scsi_bus_legacy_add_drive() to accept a whole BlockConf
instead of some individual values inside of it so that usb-storage can
now pass the whole configuration to the internal scsi-disk. This enables
the remaining block device properties, e.g. logical/physical_block_size
or discard_granularity.
Buglink: https://issues.redhat.com/browse/RHEL-22375
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240131130607.24117-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Creating an instance of the 'TestEnv' class will create a temporary
directory. This dir is only deleted, however, in the __exit__ handler
invoked by a context manager.
In dry-run mode, we don't use the TestEnv via a context manager, so
were leaking the temporary directory. Since meson invokes 'check'
5 times on each configure run, developers /tmp was filling up with
empty temporary directories.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240205154019.1841037-1-berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
scsi_device_for_each_req_async() currently does not provide any way to
be awaited. One of its callers is scsi_device_purge_requests(), which
therefore currently does not guarantee that all requests are fully
settled when it returns.
We want all requests to be settled, because scsi_device_purge_requests()
is called through the unrealize path, including the one invoked by
virtio_scsi_hotunplug() through qdev_simple_device_unplug_cb(), which
most likely assumes that all SCSI requests are done then.
In fact, scsi_device_purge_requests() already contains a blk_drain(),
but this will not fully await scsi_device_for_each_req_async(), only the
I/O requests it potentially cancels (not the non-I/O requests).
However, we can have scsi_device_for_each_req_async() increment the BB
in-flight counter, and have scsi_device_for_each_req_async_bh()
decrement it when it is done. This way, the blk_drain() will fully
await all SCSI requests to be purged.
This also removes the need for scsi_device_for_each_req_async_bh() to
double-check the current context and potentially re-schedule itself,
should it now differ from the BB's context: Changing a BB's AioContext
with a root node is done through bdrv_try_change_aio_context(), which
creates a drained section. With this patch, we keep the BB in-flight
counter elevated throughout, so we know the BB's context cannot change.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202144755.671354-3-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since AioContext locks have been removed, a BlockBackend's AioContext
may really change at any time (only exception is that it is often
confined to a drained section, as noted in this patch). Therefore,
blk_get_aio_context() cannot rely on its root node's context always
matching that of the BlockBackend.
In practice, whether they match does not matter anymore anyway: Requests
can be sent to BDSs from any context, so anyone who requests the BB's
context should have no reason to require the root node to have the same
context. Therefore, we can and should remove the assertion to that
effect.
In addition, because the context can be set and queried from different
threads concurrently, it has to be accessed with atomic operations.
Buglink: https://issues.redhat.com/browse/RHEL-19381
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202144755.671354-2-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The aio_co_reschedule_self() API is designed to avoid the race
condition between scheduling the coroutine in another AioContext and
yielding.
The QMP dispatch code uses the open-coded version that appears
susceptible to the race condition at first glance:
aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
qemu_coroutine_yield();
The code is actually safe because the iohandler and qemu_aio_context
AioContext run under the Big QEMU Lock. Nevertheless, set a good example
and use aio_co_reschedule_self() so it's obvious that there is no race.
Suggested-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240206190610.107963-6-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hanna Czenczek <hreitz@redhat.com> noted that the array index in
virtio_blk_dma_restart_cb() is not bounds-checked:
g_autofree VirtIOBlockReq **vq_rq = g_new0(VirtIOBlockReq *, num_queues);
...
while (rq) {
VirtIOBlockReq *next = rq->next;
uint16_t idx = virtio_get_queue_index(rq->vq);
rq->next = vq_rq[idx];
^^^^^^^^^^
The code is correct because both rq->vq and vq_rq[] depend on
num_queues, but this is indirect and not 100% obvious. Add an assertion.
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240206190610.107963-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is not possible to instantiate a virtio-blk device with 0 virtqueues.
The following check is located in ->realize():
if (!conf->num_queues) {
error_setg(errp, "num-queues property must be larger than 0");
return;
}
Later on we access s->vq_aio_context[0] under the assumption that there
is as least one virtqueue. Hanna Czenczek <hreitz@redhat.com> noted that
it would help to show that the array index is already valid.
Add an assertion to document that s->vq_aio_context[0] is always
safe...and catch future code changes that break this assumption.
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240206190610.107963-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hanna Czenczek <hreitz@redhat.com> noticed that the safety of
`vq_aio_context[vq->value] = ctx;` with user-defined vq->value inputs is
not obvious.
The code is structured in validate() + apply() steps so input validation
is there, but it happens way earlier and there is nothing that
guarantees apply() can only be called with validated inputs.
This patch moves the validate() call inside the apply() function so
validation is guaranteed. I also added the bounds checking assertion
that Hanna suggested.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240206190610.107963-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Recently we introduced cross-binary migration test. It's always wanted
that migration-test uses stable guest ABI for both QEMU binaries in this
case, so that both QEMU binaries will be compatible on the migration
stream with the cmdline specified.
Switch to a static gic version "3" rather than using version "max", so that
GIC should be stable now across any future QEMU binaries for migration-test.
Here the version can actually be anything as long as the ABI is stable. We
choose "3" because it's the majority of what we already use in QEMU while
still new enough: "git grep gic-version=3" shows 6 hit, while version 4 has
no direct user yet besides "max".
Note that even with this change, aarch64 won't be able to work yet with
migration cross binary test, but then the only missing piece will be the
stable CPU model.
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Link: https://lore.kernel.org/r/20240207005403.242235-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
It is possible that one of the multifd channels fails to be created at
multifd_new_send_channel_async() while the rest of the channel
creation tasks are still in flight.
This could lead to multifd_save_cleanup() executing the
qemu_thread_join() loop too early and not waiting for the threads
which haven't been created yet, leading to the freeing of resources
that the newly created threads will try to access and crash.
Add a synchronization point after which there will be no attempts at
thread creation and therefore calling multifd_save_cleanup() past that
point will ensure it properly waits for the threads.
A note about performance: Prior to this patch, if a channel took too
long to be established, other channels could finish connecting first
and already start taking load. Now we're bounded by the
slowest-connecting channel.
Reported-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-7-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
During multifd channel creation (multifd_send_new_channel_async) when
TLS is enabled, the multifd_channel_connect function is called twice,
once to create the TLS handshake thread and another time after the
asynchrounous TLS handshake has finished.
This creates a slightly confusing call stack where
multifd_channel_connect() is called more times than the number of
channels. It also splits error handling between the two callers of
multifd_channel_connect() causing some code duplication. Lastly, it
gets in the way of having a single point to determine whether all
channel creation tasks have been initiated.
Refactor the code to move the reentrancy one level up at the
multifd_new_send_channel_async() level, de-duplicating the error
handling and allowing for the next patch to introduce a
synchronization point common to all the multifd channel creation,
regardless of TLS.
Note that the previous code would never fail once p->c had been set.
This patch changes this assumption, which affects refcounting, so add
comments around object_unref to explain the situation.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-6-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
We currently have an unfavorable situation around multifd channels
creation and the migration thread execution.
We create the multifd channels with qio_channel_socket_connect_async
-> qio_task_run_in_thread, but only connect them at the
multifd_new_send_channel_async callback, called from
qio_task_complete, which is registered as a glib event.
So at multifd_send_setup() we create the channels, but they will only
be actually usable after the whole multifd_send_setup() calling stack
returns back to the main loop. Which means that the migration thread
is already up and running without any possibility for the multifd
channels to be ready on time.
We currently rely on the channels-ready semaphore blocking
multifd_send_sync_main() until channels start to come up and release
it. However there have been bugs recently found when a channel's
creation fails and multifd_send_cleanup() is allowed to run while
other channels are still being created.
Let's start to organize this situation by moving the
multifd_send_setup() call into the migration thread. That way we
unblock the main-loop to dispatch the completion callbacks and
actually have a chance of getting the multifd channels ready for when
the migration thread needs them.
The next patches will deal with the synchronization aspects.
Note that this takes multifd_send_setup() out of the BQL.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-5-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
We currently only need p->running to avoid calling qemu_thread_join()
on a non existent thread if the thread has never been created.
However, there are at least two bugs in this logic:
1) On the sending side, p->running is set too early and
qemu_thread_create() can be skipped due to an error during TLS
handshake, leaving the flag set and leading to a crash when
multifd_send_cleanup() calls qemu_thread_join().
2) During exit, the multifd thread clears the flag while holding the
channel lock. The counterpart at multifd_send_cleanup() reads the flag
outside of the lock and might free the mutex while the multifd thread
still has it locked.
Fix the first issue by setting the flag right before creating the
thread. Rename it from p->running to p->thread_created to clarify its
usage.
Fix the second issue by not clearing the flag at the multifd thread
exit. We don't have any use for that.
Note that these bugs are straight-forward logic issues and not race
conditions. There is still a gap for races to affect this code due to
multifd_send_cleanup() being allowed to run concurrently with the
thread creation loop. This issue is solved in the next patches.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: 2964714015 ("migration/tls: add support for multifd tls-handshake")
Reported-by: Avihai Horon <avihaih@nvidia.com>
Reported-by: chenyuhui5@huawei.com
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The commit in the fixes line mistakenly modified the channels and
transport compatibility check logic so it now checks multi-channel
support only for socket transport type.
Thus, running multifd migration using a transport other than socket that
is incompatible with multi-channels (such as "exec") would lead to a
segmentation fault instead of an error message.
For example:
(qemu) migrate_set_capability multifd on
(qemu) migrate -d "exec:cat > /tmp/vm_state"
Segmentation fault (core dumped)
Fix it by checking multi-channel compatibility for all transport types.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: d95533e1cd ("migration: modify migration_channels_and_uri_compatible() for new QAPI syntax")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240125162528.7552-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
make vm-build-freebsd fails with:
ld: error: undefined symbol: inotify_init1
>>> referenced by filemonitor-inotify.c:183 (../src/util/filemonitor-inotify.c:183)
>>> util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in archive libqemuutil.a
On FreeBSD the inotify functions are defined in libinotify.so. Add it
to the dependencies.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240206002344.12372-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Unlike on Linux, on FreeBSD renaming a file when the destination
already exists results in an IN_DELETE event for that existing file:
$ FILEMONITOR_DEBUG=1 build/tests/unit/test-util-filemonitor
Rename /tmp/test-util-filemonitor-K13LI2/fish/one.txt -> /tmp/test-util-filemonitor-K13LI2/two.txt
Event id=200000000 event=2 file=one.txt
Queue event id 200000000 event 2 file one.txt
Queue event id 100000000 event 2 file two.txt
Queue event id 100000002 event 2 file two.txt
Queue event id 100000000 event 0 file two.txt
Queue event id 100000002 event 0 file two.txt
Event id=100000000 event=0 file=two.txt
Expected event 0 but got 2
This difference in behavior is not expected to break the real users, so
teach the test to accept it.
Suggested-by: "Daniel P. Berrange" <berrange@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240206002344.12372-4-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
After console_sshd_config(), the SSH server needs to be nudged to pick
up the new configs. The scripts for the other BSD flavors already do
this with a reboot, but a simple reload is sufficient.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240206002344.12372-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
make vm-build-freebsd sometimes fails with "Connection timed out during
banner exchange". The client strace shows:
13:59:30 write(3, "SSH-2.0-OpenSSH_9.3\r\n", 21) = 21
13:59:30 getpid() = 252655
13:59:30 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
13:59:32 read(3, "S", 1) = 1
13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, revents=POLLIN}])
13:59:32 read(3, "S", 1) = 1
13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, revents=POLLIN}])
13:59:32 read(3, "H", 1) = 1
There is a 2s delay during connection, and ConnectTimeout is set to 1.
Raising it makes the issue go away, but we can do better. The server
truss shows:
888: 27.811414714 socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0) = 5 (0x5)
888: 27.811765030 connect(5,{ AF_INET 10.0.2.3:53 },16) = 0 (0x0)
888: 27.812166941 sendto(5,"\^Z/\^A\0\0\^A\0\0\0\0\0\0\^A2"...,39,0,NULL,0) = 39 (0x27)
888: 29.363970743 poll({ 5/POLLRDNORM },1,5000) = 1 (0x1)
So the delay is due to a DNS query. Disable DNS queries in the server
config.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240206002344.12372-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU initializes preallocated backend memory as the objects are parsed from
the command line. This is not optimal in some cases (e.g. memory spanning
multiple NUMA nodes) because the memory objects are initialized in series.
Allow the initialization to occur in parallel (asynchronously). In order to
ensure optimal thread placement, asynchronous initialization requires prealloc
context threads to be in use.
Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Message-ID: <20240131165327.3154970-2-mark.kanda@oracle.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
We used to check that the memory region size is multiples of the overall
requested address alignment for the device memory address.
We removed that check, because there are cases (i.e., hv-balloon) where
devices unconditionally request an address alignment that has a very large
alignment (i.e., 32 GiB), but the actual memory device size might not be
multiples of that alignment.
However, this change:
(a) allows for some practically impossible DIMM sizes, like "1GB+1 byte".
(b) allows for DIMMs that partially cover hugetlb pages, previously
reported in [1].
Both scenarios don't make any sense: we might even waste memory.
So let's reintroduce that check, but only check that the
memory region size is multiples of the memory region alignment (i.e.,
page size, huge page size), but not any additional memory device
requirements communicated using md->get_min_alignment().
The following examples now fail again as expected:
(a) 1M with 2M THP
qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
-object memory-backend-ram,id=mem1,size=1M \
-device pc-dimm,id=dimm1,memdev=mem1
-> backend memory size must be multiple of 0x200000
(b) 1G+1byte
qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
-object memory-backend-ram,id=mem1,size=1073741825B \
-device pc-dimm,id=dimm1,memdev=mem1
-> backend memory size must be multiple of 0x200000
(c) Unliagned hugetlb size (2M)
qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
-object memory-backend-file,id=mem1,mem-path=/dev/hugepages/tmp,size=511M \
-device pc-dimm,id=dimm1,memdev=mem1
backend memory size must be multiple of 0x200000
(d) Unliagned hugetlb size (1G)
qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
-object memory-backend-file,id=mem1,mem-path=/dev/hugepages1G/tmp,size=2047M \
-device pc-dimm,id=dimm1,memdev=mem1
-> backend memory size must be multiple of 0x40000000
Note that this fix depends on a hv-balloon change to communicate its
additional alignment requirements using get_min_alignment() instead of
through the memory region.
[1] https://lkml.kernel.org/r/f77d641d500324525ac036fe1827b3070de75fc1.1701088320.git.mprivozn@redhat.com
Message-ID: <20240117135554.787344-3-david@redhat.com>
Reported-by: Zhenyu Zhang <zhenyzha@redhat.com>
Reported-by: Michal Privoznik <mprivozn@redhat.com>
Fixes: eb1b7c4bd4 ("memory-device: Drop size alignment check")
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
When reviewing my attempt to refactor send_prepare(), Fabiano suggested we
try out with dropping the mutex in multifd code [1].
I thought about that before but I never tried to change the code. Now
maybe it's time to give it a stab. This only optimizes the sender side.
The trick here is multifd has a clear provider/consumer model, that the
migration main thread publishes requests (either pending_job/pending_sync),
while the multifd sender threads are consumers. Here we don't have a lot
of complicated data sharing, and the jobs can logically be submitted
lockless.
Arm the code with atomic weapons. Two things worth mentioning:
- For multifd_send_pages(): we can use qatomic_load_acquire() when trying
to find a free channel, but that's expensive if we attach one ACQUIRE per
channel. Instead, keep the qatomic_read() on reading the pending_job
flag as we do already, meanwhile use one smp_mb_acquire() after the loop
to guarantee the memory ordering.
- For pending_sync: it doesn't have any extra data required since now
p->flags are never touched, it should be safe to not use memory barrier.
That's different from pending_job.
Provide rich comments for all the lockless operations to state how they are
paired. With that, we can remove the mutex.
[1] https://lore.kernel.org/r/87o7d1jlu5.fsf@suse.de
Suggested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-24-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The character "+" is now forbidden in QOM device names (see commit
b447378e12 - "Limit type names to alphanumerical and some few special
characters"). For the "power5+" and "power7+" CPU names, there is
currently a hack in type_name_is_valid() to still allow them for
compatibility reasons. However, there is a much nicer solution for this:
Simply use aliases! This way we can still support the old names without
the need for the ugly hack in type_name_is_valid().
Message-ID: <20240117141054.73841-2-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.
This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.
Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240128202214.2644768-1-svens@stackframe.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As reported correctly by Fabiano [1] (while per Fabiano, it sourced back to
Elena's initial report in Oct 2023), MultiFDSendParams.packet_num is buggy
to be assigned and stored. Consider two consequent operations of: (1)
queue a job into multifd send thread X, then (2) queue another sync request
to the same send thread X. Then the MultiFDSendParams.packet_num will be
assigned twice, and the first assignment can get lost already.
To avoid that, we move the packet_num assignment from p->packet_num into
where the thread will fill in the packet. Use atomic operations to protect
the field, making sure there's no race.
Note that atomic fetch_add() may not be good for scaling purposes, however
multifd should be fine as number of threads should normally not go beyond
16 threads. Let's leave that concern for later but fix the issue first.
There's also a trick on how to make it always work even on 32 bit hosts for
uint64_t packet number. Switching to uintptr_t as of now to simply the
case. It will cause packet number to overflow easier on 32 bit, but that
shouldn't be a major concern for now as 32 bit systems is not the major
audience for any performance concerns like what multifd wants to address.
We also need to move multifd_send_state definition upper, so that
multifd_send_fill_packet() can reference it.
[1] https://lore.kernel.org/r/87o7d1jlu5.fsf@suse.de
Reported-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-23-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Most of the multifd code uses send/recv to represent the two sides, but
some rare cases use save/load.
Since send/recv is the majority, replacing the save/load use cases to use
send/recv globally. Now we reach a consensus on the naming.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-22-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Shrink the function by moving relevant works into helpers: move the thread
join()s into multifd_send_terminate_threads(), then create two more helpers
to cover channel/state cleanups.
Add a TODO entry for the thread terminate process because p->running is
still buggy. We need to fix it at some point but not yet covered.
Suggested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-20-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The current multifd_queue_page() is not easy to read and follow. It is not
good with a few reasons:
- No helper at all to show what exactly does a condition mean; in short,
readability is low.
- Rely on pages->ramblock being cleared to detect an empty queue. It's
slightly an overload of the ramblock pointer, per Fabiano [1], which I
also agree.
- Contains a self recursion, even if not necessary..
Rewrite this function. We add some comments to make it even clearer on
what it does.
[1] https://lore.kernel.org/r/87wmrpjzew.fsf@suse.de
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-19-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Split multifd_send_terminate_threads() into two functions:
- multifd_send_set_error(): used when an error happened on the sender
side, set error and quit state only
- multifd_send_terminate_threads(): used only by the main thread to kick
all multifd send threads out of sleep, for the last recycling.
Use multifd_send_set_error() in the three old call sites where only the
error will be set.
Use multifd_send_terminate_threads() in the last one where the main thread
will kick the multifd threads at last in multifd_save_cleanup().
Both helpers will need to set quitting=1.
Suggested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-16-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Now multifd's logic is designed to have no spurious wakeup. I still
remember a talk to Juan and he seems to agree we should drop it now, and if
my memory was right it was there because multifd used to hit that when
still debugging.
Let's drop it and see what can explode; as long as it's not reaching
soft-freeze.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-15-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This patch redefines the interfacing of ->send_prepare(). It further
simplifies multifd_send_thread() especially on zero copy.
Now with the new interface, we require the hook to do all the work for
preparing the IOVs to send. After it's completed, the IOVs should be ready
to be dumped into the specific multifd QIOChannel later.
So now the API looks like:
p->pages -----------> send_prepare() -------------> IOVs
This also prepares for the case where the input can be extended to even not
any p->pages. But that's for later.
This patch will achieve similar goal of what Fabiano used to propose here:
https://lore.kernel.org/r/20240126221943.26628-1-farosas@suse.de
However the send() interface may not be necessary. I'm boldly attaching a
"Co-developed-by" for Fabiano.
Co-developed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-14-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Introduce a helper multifd_send_prepare_header() to setup the header packet
for multifd sender.
It's fine to setup the IOV[0] _before_ send_prepare() because the packet
buffer is already ready, even if the content is to be filled in.
With this helper, we can already slightly clean up the zero copy path.
Note that I explicitly put it into multifd.h, because I want it inlined
directly into multifd*.c where necessary later.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-13-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This field, no matter whether on src or dest, is only used for debugging
purpose.
They can even be removed already, unless it still more or less provide some
accounting on "how many packets are sent/recved for this thread". The
other more important one is called packet_num, which is embeded in the
multifd packet headers (MultiFDPacket_t).
So let's keep them for now, but make them much easier to understand, by
doing below:
- Rename both of them to packets_sent / packets_recved, the old
name (num_packets) are waaay too confusing when we already have
MultiFDPacket_t.packets_num.
- Avoid worrying on the "initial packet": we know we will send it, that's
good enough. The accounting won't matter a great deal to start with 0 or
with 1.
- Move them to where we send/recv the packets. They're:
- multifd_send_fill_packet() for senders.
- multifd_recv_unfill_packet() for receivers.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-10-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The sender thread will yield the p->mutex before IO starts, trying to not
block the requester thread. This may be unnecessary lock optimizations,
because the requester can already read pending_job safely even without the
lock, because the requester is currently the only one who can assign a
task.
Drop that lock complication on both sides:
(1) in the sender thread, always take the mutex until job done
(2) in the requester thread, check pending_job clear lockless
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-8-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Multifd provide a threaded model for processing jobs. On sender side,
there can be two kinds of job: (1) a list of pages to send, or (2) a sync
request.
The sync request is a very special kind of job. It never contains a page
array, but only a multifd packet telling the dest side to synchronize with
sent pages.
Before this patch, both requests use the pending_job field, no matter what
the request is, it will boost pending_job, while multifd sender thread will
decrement it after it finishes one job.
However this should be racy, because SYNC is special in that it needs to
set p->flags with MULTIFD_FLAG_SYNC, showing that this is a sync request.
Consider a sequence of operations where:
- migration thread enqueue a job to send some pages, pending_job++ (0->1)
- [...before the selected multifd sender thread wakes up...]
- migration thread enqueue another job to sync, pending_job++ (1->2),
setup p->flags=MULTIFD_FLAG_SYNC
- multifd sender thread wakes up, found pending_job==2
- send the 1st packet with MULTIFD_FLAG_SYNC and list of pages
- send the 2nd packet with flags==0 and no pages
This is not expected, because MULTIFD_FLAG_SYNC should hopefully be done
after all the pages are received. Meanwhile, the 2nd packet will be
completely useless, which contains zero information.
I didn't verify above, but I think this issue is still benign in that at
least on the recv side we always receive pages before handling
MULTIFD_FLAG_SYNC. However that's not always guaranteed and just tricky.
One other reason I want to separate it is using p->flags to communicate
between the two threads is also not clearly defined, it's very hard to read
and understand why accessing p->flags is always safe; see the current impl
of multifd_send_thread() where we tried to cache only p->flags. It doesn't
need to be that complicated.
This patch introduces pending_sync, a separate flag just to show that the
requester needs a sync. Alongside, we remove the tricky caching of
p->flags now because after this patch p->flags should only be used by
multifd sender thread now, which will be crystal clear. So it is always
thread safe to access p->flags.
With that, we can also safely convert the pending_job into a boolean,
because we don't support >1 pending jobs anyway.
Always use atomic ops to access both flags to make sure no cache effect.
When at it, drop the initial setting of "pending_job = 0" because it's
always allocated using g_new0().
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-7-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This array is redundant when p->pages exists. Now we extended the life of
p->pages to the whole period where pending_job is set, it should be safe to
always use p->pages->offset[] rather than p->normal[]. Drop the array.
Alongside, the normal_num is also redundant, which is the same to
p->pages->num.
This doesn't apply to recv side, because there's no extra buffering on recv
side, so p->normal[] array is still needed.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Now we reset MultiFDPages_t object in the multifd sender thread in the
middle of the sending job. That's not necessary, because the "*pages"
struct will not be reused anyway until pending_job is cleared.
Move that to the end after the job is completed, provide a helper to reset
a "*pages" object. Use that same helper when free the object too.
This prepares us to keep using p->pages in the follow up patches, where we
may drop p->normal[].
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Multifd send side has two fields to indicate error quits:
- MultiFDSendParams.quit
- &multifd_send_state->exiting
Merge them into the global one. The replacement is done by changing all
p->quit checks into the global var check. The global check doesn't need
any lock.
A few more things done on top of this altogether:
- multifd_send_terminate_threads()
Moving the xchg() of &multifd_send_state->exiting upper, so as to cover
the tracepoint, migrate_set_error() and migrate_set_state().
- multifd_send_sync_main()
In the 2nd loop, add one more check over the global var to make sure we
don't keep the looping if QEMU already decided to quit.
- multifd_tls_outgoing_handshake()
Use multifd_send_terminate_threads() to set the error state. That has
a benefit of updating MigrationState.error to that error too, so we can
persist that 1st error we hit in that specific channel.
- multifd_new_send_channel_async()
Take similar approach like above, drop the migrate_set_error() because
multifd_send_terminate_threads() already covers that. Unwrap the helper
multifd_new_send_channel_cleanup() along the way; not really needed.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
A memory page poisoned from the hypervisor level is no longer readable.
The migration of a VM will crash Qemu when it tries to read the
memory address space and stumbles on the poisoned page with a similar
stack trace:
Program terminated with signal SIGBUS, Bus error.
#0 _mm256_loadu_si256
#1 buffer_zero_avx2
#2 select_accel_fn
#3 buffer_is_zero
#4 save_zero_page
#5 ram_save_target_page_legacy
#6 ram_save_host_page
#7 ram_find_and_save_block
#8 ram_save_iterate
#9 qemu_savevm_state_iterate
#10 migration_iteration_run
#11 migration_thread
#12 qemu_thread_start
To avoid this VM crash during the migration, prevent the migration
when a known hardware poison exists on the VM.
Signed-off-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20240130190640.139364-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Let's implement the get_min_alignment() callback for memory devices, and
copy for the device memory region the alignment of the host memory
region. This mimics what virtio-mem does, and allows for re-introducing
proper alignment checks for the memory region size (where we don't care
about additional device requirements) in memory device core.
Message-ID: <20240117135554.787344-2-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Better constraint for tcg_out_cmp, based on the comparison.
We can't yet remove the fallback to load constants into a
scratch because of tcg_out_cmp2, but that path should not
be as frequent.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using cr0 means we could choose to use rc=1 to compute the condition.
Adjust the tables and tcg_out_cmp that feeds them.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename the current tcg_out_bc function to tcg_out_bc_lab, and
create a new function that takes an integer displacement + link.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use a non-zero value here (an illegal encoding) as a better
condition than is_unsigned_cond for when MOVR/BPR is usable.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_testi into tcg_out_cmp and adjust the two uses.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill the new argument from any condition within the opcode.
Not yet used within any backend.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Avoid code duplication by handling 7 of the 14 cases
by inverting the test for the other 7 cases.
Use TCG_COND_TSTNE for cc in {1,3}.
Use (cc - 1) <= 1 for cc in {1,2}.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
After having performed other simplifications, lower any
remaining test comparisons with AND.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fold constant comparisons.
Canonicalize "tst x,x" to equality vs zero.
Canonicalize "tst x,sign" to sign test vs zero.
Fold double-word comparisons with zero parts.
Fold setcond of "tst x,pow2" to a bit extract.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mirror the new do_constant_folding_cond1 by doing all
argument and condition adjustment within one helper.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Handle modifications to the arguments and condition
in a single place.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add the enumerators, adjust the helpers to match, and dump.
Not supported anywhere else just yet.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Documentation of commands guest-ssh-get-authorized-keys,
guest-ssh-add-authorized-keys, and guest-ssh-remove-authorized-keys
describes the command's purpose after its arguments. Everywhere else,
we do it the other way round. Move it for consistency.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240129115008.674248-6-armbru@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Documentation of type BlockdevOptionsIscsi describes the type's
purpose after its members. Everywhere else, we do it the other way
round. Move it for consistency.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240129115008.674248-5-armbru@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Documentation of BlockExportRemoveMode has
Potential additional modes to be added in the future:
hide: Just hide export from new clients, leave existing connections
as is. Remove export after all clients are disconnected.
soft: Hide export from new clients, answer with ESHUTDOWN for all
further requests from existing clients.
I think this is useful only for developers. Elide it from generated
documentation by turning it into a TODO section.
This effectively reverts my own commit b71fd73cc4 (Revert "qapi:
BlockExportRemoveMode: move comments to TODO"). At the time, I was
about to elide TODO sections from the generated manual, I wasn't sure
about this one, and decided to avoid change. And now I've made up my
mind.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240129115008.674248-4-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Documentation generated for dump-skeys contains
This command is only supported on s390 architecture.
and
If
~~
"TARGET_S390X"
The former became redundant in commit 901a34a400 (qapi: add 'If:'
section to generated documentation) added the latter. Drop the
former.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240129115008.674248-3-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Documentation generated for SchemaInfo looks like
The members of "SchemaInfoBuiltin" when "meta-type" is ""builtin""
The members of "SchemaInfoEnum" when "meta-type" is ""enum""
The members of "SchemaInfoArray" when "meta-type" is ""array""
The members of "SchemaInfoObject" when "meta-type" is ""object""
The members of "SchemaInfoAlternate" when "meta-type" is ""alternate""
The members of "SchemaInfoCommand" when "meta-type" is ""command""
The members of "SchemaInfoEvent" when "meta-type" is ""event""
Additional members depend on the value of "meta-type".
The last line became redundant when commit 88f63467c5 (qapi2texi:
Generate reference to base type members) added the lines preceding it.
Drop it.
BlockdevOptions has the same issue. Drop
Remaining options are determined by the block driver.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240129115008.674248-2-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
If an exception is to be raised, the destination fp register
should be unmodified. The current implementation is incorrect,
in that double results will be written back before calling
gen_helper_check_ieee_exceptions, despite the placement of
gen_store_fpr_D, since gen_dest_fpr_D returns cpu_fpr[].
We can simplify the entire implementation by having each
FPOp helper call check_ieee_exceptions. For the moment this
requires that all FPop helpers write to the TCG global cpu_fsr,
so remove TCG_CALL_NO_WG from the DEF_HELPER_FLAGS_*.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20231103173841.33651-19-richard.henderson@linaro.org>
The `if not probe_proc_self_mem` check never passes, because
probe_proc_self_mem is a function object, which is a truthy value.
Add parentheses in order to perform a function call.
Fixes: dc84d50a7f9b ("tests/tcg: Add the PROT_NONE gdbstub test")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240131220245.235993-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For user-only mode, use MMU_USER_IDX.
For system mode, use CPUClass.mmu_index.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than adjust env->hflags so that the value computed
by cpu_mmu_index() changes, compute the mmu_idx that we
want directly and pass it down.
Introduce symbolic constants for MMU_{KERNEL,ERL}_IDX.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The expected form is MMU_FOO_IDX, not MMU_IDX_FOO.
Rename to match generic code.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rework matching of network devices to -nic options (v2)
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEMUsIrNDeSBEzpfKGm+mA/QrAFUQFAmW9F9oSHGR3bXdAYW1h
# em9uLmNvLnVrAAoJEJvpgP0KwBVEVwsQAIIDTYb3R/vxpf6w9n+n6FWCbFt/ihPC
# RbQ/Zrnoj6K3dCj6U3zJDpa5qpJ27/AiFfVv/gU13d+ELf72uHKE50GkQa2r/Fl8
# cPoW1LRinGFGxQS+WY5OnRYJ2mBaVx6THUd5DCgb5wpkBgVe21XsZLr6pfAapNCG
# c22HBaIb8sHPeIV2wf1xZKEswNGlkXuylBnS4wayncRKa2vOYPAAO7P4PvwNuMnb
# j0pLyLfD6Zx+6D53ema4zpcDh7d1Qn5eDGHQmy55Ml5AleC05gsDzrCEeiT4vU9T
# 9fj6w8NlyLkPYLqTodAEeaUpUCFhMO312VPSM163iYOUDtjqz10bBZncgbRrsR5I
# 30bKqQvEQ8PAQZWILNhfyHrYw4/O2Y88sUf/lE8lGmHvVYda+yqq5lgEyPFHbJwh
# ZCEJQalc6tRATIWUqI/Lw+X7hqnJ29c14hkEVG8L0KW0fIB/cqXUStzcUt87VkA2
# wwQI4aAGWZE1pvFvhmeM2rTDXfg1uD8SoFDTj4ORJl/7PEemf1yraKUYb8YdRE0z
# dQWfLmSnl1JkTa0yVF5MtnoTJUP8PX+hhJROfdwvfd1sU5s98O5pivYf7arUybVl
# j4g4qwm8IUBiAznZzbhdp38Q91RFvBKjjLsx/+Ts9avZTL0xCUcCvt21wzqWhbkc
# X7KdrU/XxVry
# =4PuR
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 02 Feb 2024 16:27:06 GMT
# gpg: using RSA key 314B08ACD0DE481133A5F2869BE980FD0AC01544
# gpg: issuer "dwmw@amazon.co.uk"
# gpg: Good signature from "David Woodhouse <dwmw@amazon.co.uk>" [unknown]
# gpg: aka "David Woodhouse <dwmw@amazon.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 314B 08AC D0DE 4811 33A5 F286 9BE9 80FD 0AC0 1544
* tag 'pull-nic-config-2-20240202' of git://git.infradead.org/users/dwmw2/qemu: (47 commits)
net: make nb_nics and nd_table[] static in net/net.c
net: remove qemu_show_nic_models(), qemu_find_nic_model()
hw/pci: remove pci_nic_init_nofail()
net: remove qemu_check_nic_model()
hw/xtensa/xtfpga: use qemu_create_nic_device()
hw/sparc/sun4m: use qemu_find_nic_info()
hw/s390x/s390-virtio-ccw: use qemu_create_nic_device()
hw/riscv: use qemu_configure_nic_device()
hw/openrisc/openrisc_sim: use qemu_create_nic_device()
hw/net/lasi_i82596: use qemu_create_nic_device()
hw/net/lasi_i82596: Re-enable build
hw/mips/jazz: use qemu_find_nic_info()
hw/mips/mipssim: use qemu_create_nic_device()
hw/microblaze: use qemu_configure_nic_device()
hw/m68k/q800: use qemu_find_nic_info()
hw/m68k/mcf5208: use qemu_create_nic_device()
hw/net/etraxfs-eth: use qemu_configure_nic_device()
hw/arm: use qemu_configure_nic_device()
hw/arm/stellaris: use qemu_find_nic_info()
hw/arm/npcm7xx: use qemu_configure_nic_device, allow emc0/emc1 as aliases
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Also remove the stale declaration of host_net_devices; the actual
definition was removed long ago in commit 7cc28cb061 ("net: Remove
the deprecated 'host_net_add' and 'host_net_remove' HMP commands")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
These old functions can be removed now too. Let net_param_nic() print
the full set of network devices directly, and also make it note that a
list more specific to this platform/config will be available by using
'-nic model=help' instead.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
This function is no longer used, as all its callers have been converted
to use pci_init_nic_devices() instead.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
There are no callers of this function any more, as they have all been
converted to qemu_{create,configure}_nic_device().
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Obtain the MAC address from the NIC configuration if there is one, or
generate one explicitly so that it can be placed in the PROM.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Create the device only if there is a corresponding NIC config for it.
Remove the explicit check on nd_table[0].used from hw/hppa/machine.c
which (since commit d8a3220005) tries to do the same thing.
The lasi_82596 support has been disabled since it was first introduced,
since enable_lasi_lan() has always been zero. This allows the user to
enable it by explicitly requesting a NIC model 'lasi_82596' or just
using the alias 'lasi'. Otherwise, it defaults to a PCI NIC as before.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
When converting to the shiny build-system-du-jour, a typo prevented the
last_i82596 driver from being built. Correct the config option name to
re-enable the build. And include "sysemu/sysemu.h" so it actually builds.
Fixes: b1419fa665 ("meson: convert hw/net")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2144
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The MIPS SIM platform instantiates its NIC only if a corresponding
configuration exists for it. Use qemu_create_nic_device() function for
that.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
If a corresponding NIC configuration was found, it will have a MAC address
already assigned, so use that. Else, generate and assign a default one.
Using qemu_find_nic_info() is simpler than the alternative of using
qemu_configure_nic_device() and then having to fetch the "mac" property
as a string and convert it.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Rather than just using qemu_configure_nic_device(), populate the MAC
address in the system-registers device by peeking at the NICInfo before
it's assigned to the device.
Generate the MAC address early, if there is no matching -nic option.
Otherwise the MAC address wouldn't be generated until net_client_init1()
runs.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Also update the test to specify which device to attach the test socket
to, and remove the comment lamenting the fact that we can't do so.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Some callers instantiate the device unconditionally, others will do so only
if there is a NICInfo to go with it. This appears to be fairly random, but
preseve the existing behaviour for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Some callers instantiate the device unconditionally, others will do so only
if there is a NICInfo to go with it. This appears to be fairly random, but
preserve the existing behaviour of each caller for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The first sunhme NIC gets placed a function 1 on slot 1 of PCI bus A,
and the rest are dynamically assigned on PCI bus B.
Previously, any PCI NIC would get the special treatment purely by
virtue of being first in the list.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Previously, the first PCI NIC would be assigned to slot 2 even if the
user override the model and made it something other than an rtl8139
which is the default. Everything else would be dynamically assigned.
Now, the first rtl8139 gets slot 2 and everything else is dynamic.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Avoid directly referencing nd_table[] by first instantiating any
spapr-vlan devices using a qemu_get_nic_info() loop, then calling
pci_init_nic_devices() to do the rest.
No functional change intended.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Previously, the first PCI NIC would be placed in PCI slot 3 and the rest
would be dynamically assigned. Even if the user overrode the default NIC
type and made it something other than PCNet.
Now, the first PCNet NIC (that is, anything not explicitly specified
to be anything different) will go to slot 3 even if it isn't the first
NIC specified on the command line. And anything else will be dynamically
assigned.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The Malta board setup code would previously place the first NIC into PCI
slot 11 if was a PCNet card, and the rest (including the first if it was
anything other than a PCNet card) would be dynamically assigned.
Now it will place any PCNet NIC into slot 11, and then anything else will
be dynamically assigned.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The previous behaviour was: *if* the first NIC specified on the command
line was an RTL8139 (or unspecified model) then it gets assigned to PCI
slot 7, which is where the Fuloong board had an RTL8139. All other
devices (including the first, if it was specified as anything other than
an rtl8319) get dynamically assigned on the bus.
The new behaviour is subtly different: If the first NIC was given a
specific model *other* than rtl8139, and a subsequent NIC was not,
then the rtl8139 (or unspecified) NIC will go to slot 7 and the rest
will be dynamically assigned.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
When instantiating XenBus itself, for each NIC which is configured with
either the model unspecified, or set to to "xen" or "xen-net-device",
create a corresponding xen-net-device for it.
Now we can revert the previous more hackish version which relied on the
platform code explicitly registering the NICs on its own XenBus, having
returned the BusState* from xen_bus_init() itself.
This also fixes the setup for Xen PV guests, which was previously broken
in various ways and never actually managed to peer with the netdev.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Eliminate direct access to nd_table[] and nb_nics by processing the the
Xen and ISA NICs first and then calling pci_init_nic_devices() for the
rest.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The loop over nd_table[] to add PCI NICs is repeated in quite a few
places. Add a helper function to do it.
Some platforms also try to instantiate a specific model in a specific
slot, to match the real hardware. Add pci_init_nic_in_slot() for that
purpose.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This will instantiate any NICs which live on a given bus type. Each bus
is allowed *one* substitution (for PCI it's virtio → virtio-net-pci, for
Xen it's xen → xen-net-device; no point in overengineering it unless we
actually want more).
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
By noting the models for which a configuration was requested, we can give
the user an accurate list of which NIC models were actually available on
the platform/configuration that was otherwise chosen.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Most code which directly accesses nd_table[] and nb_nics uses them for
one of two things. Either "I have created a NIC device and I'd like a
configuration for it", or "I will create a NIC device *if* there is a
configuration for it". With some variants on the theme around whether
they actually *check* if the model specified in the configuration is
the right one.
Provide functions which perform both of those, allowing platforms to
be a little more consistent and as a step towards making nd_table[]
and nb_nics private to the net code.
One might argue that platforms ought to be consistent about whether
they create the unconfigured devices or not, but making significant
user-visible changes is explicitly *not* the intent right now.
The new functions leave the 'model' field of the NICInfo as NULL after
using it for the default NIC model, unlike the qemu_check_nic_model()
function which does set nd->model to match default_model explicitly.
This is acceptable because there is no code which consumes nd->model
except this NIC-matching code in net/net.c, and no reasonable excuse
for any code wanting to use nd->model in future.
Also export the qemu_find_nic_info() helper, as some platforms have
special cases they need to handle.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
This patch will allow the SPI controller to be accessible from BCM2835 based
boards as SPI0. SPI driver is usually disabled by default and config.txt does
not work.
Instead, dtmerge can be used to apply spi=on on a bcm2835 dtb file.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Message-id: 20240129221807.2983148-3-rayhan.faizel@gmail.com
[PMM: indent tweak]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Implementation of Transmit function for packets
- Implementation for reading and writing from and to descriptors in
memory for Tx
Added relevant trace-events
NOTE: This function implements the steps detailed in the datasheet for
transmitting messages from the GMAC.
Change-Id: Icf14f9fcc6cc7808a41acd872bca67c9832087e6
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Message-id: 20240131002800.989285-6-nabihestefan@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Implementation of Receive function for packets
- Implementation for reading and writing from and to descriptors in
memory for Rx
When RX starts, we need to flush the queued packets so that they
can be received by the GMAC device. Without this it won't work
with TAP NIC device.
When RX descriptor list is full, it returns a DMA_STATUS for
software to handle it. But there's no way to indicate the software has
handled all RX descriptors and the whole pipeline stalls.
We do something similar to NPCM7XX EMC to handle this case.
1. Return packet size when RX descriptor is full, effectively dropping
these packets in such a case.
2. When software clears RX descriptor full bit, continue receiving
further packets by flushing QEMU packet queue.
Added relevant trace-events
Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Message-id: 20240131002800.989285-5-nabihestefan@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch implements the basic registers of GMAC device and sets
registers for networking functionalities.
Squashed IRQ Implementation patch into this one for compliation.
Tested:
The following message shows up with the change:
Broadcom BCM54612E stmmac-0:00: attached PHY driver [Broadcom BCM54612E] (mii_bus:phy_addr=stmmac-0:00, irq=POLL)
stmmaceth f0802000.eth eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Change-Id: If71c6d486b95edcccba109ba454870714d7e0940
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Nabih Estefan Diaz <nabihestefan@google.com>
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Message-id: 20240131002800.989285-2-nabihestefan@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to the QEMU Coding Style document:
> Do not use printf(), fprintf() or monitor_printf(). Instead, use
> error_report() or error_vreport() from error-report.h. This ensures the
> error is reported in the right place (current monitor or stderr), and in
> a uniform format.
> Use error_printf() & friends to print additional information.
This commit changes fprintfs that report warnings and errors to the
appropriate report functions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 42a8953553cf68e8bacada966f93af4fbce45919.1706544115.git.manos.pitsidianakis@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The latest version of qemu (v8.2.0-869-g7a1dc45af5) crashes when booting
the mcimx7d-sabre emulation with Linux v5.11 and later.
qemu-system-arm: ../system/memory.c:2750: memory_region_set_alias_offset: Assertion `mr->alias' failed.
Problem is that the Designware PCIe emulation accepts the full value range
for the iATU Viewport Register. However, both hardware and emulation only
support four inbound and four outbound viewports.
The Linux kernel determines the number of supported viewports by writing
0xff into the viewport register and reading the value back. The expected
value when reading the register is the highest supported viewport index.
Match that code by masking the supported viewport value range when the
register is written. With this change, the Linux kernel reports
imx6q-pcie 33800000.pcie: iATU: unroll F, 4 ob, 4 ib, align 0K, limit 4G
as expected and supported.
Fixes: d64e5eabc4 ("pci: Add support for Designware IP block")
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Nikita Ostrenkov <n.ostrenkov@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20240129060055.2616989-1-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The npcm7xx Soc is created with a Cortex-A9 core, see in
hw/arm/npcm7xx.c:
static void npcm7xx_init(Object *obj)
{
NPCM7xxState *s = NPCM7XX(obj);
for (int i = 0; i < NPCM7XX_MAX_NUM_CPUS; i++) {
object_initialize_child(obj, "cpu[*]", &s->cpu[i],
ARM_CPU_TYPE_NAME("cortex-a9"));
}
The MachineClass::default_cpu_type field is ignored: delete it.
Use the common code introduced in commit c9cf636d48 ("machine: Add
a valid_cpu_types property") to check for valid CPU type at the
board level.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240129151828.59544-8-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Musca boards use the embedded subsystems (SSE) tied to a specific
Cortex core. Our models only use the Cortex-M33.
Use the common code introduced in commit c9cf636d48 ("machine: Add
a valid_cpu_types property") to check for valid CPU type at the
board level.
Remove the now unused MachineClass::default_cpu_type field.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240129151828.59544-7-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The M2Sxxx SoC family can only be used with Cortex-M3.
Propagating the CPU type from the board level is pointless.
Hard-code the CPU type at the SoC level.
Remove the now ignored MachineClass::default_cpu_type field.
Use the common code introduced in commit c9cf636d48 ("machine: Add
a valid_cpu_types property") to check for valid CPU type at the
board level.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240129151828.59544-6-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Restrict MachineClass::valid_cpu_types[] to the single
valid CPU types.
Instead of ignoring invalid CPU type requested by the user:
$ qemu-system-arm -M midway -cpu cortex-a7 -S -monitor stdio
QEMU 8.2.50 monitor - type 'help' for more information
(qemu) info qom-tree
/machine (midway-machine)
/cpu[0] (cortex-a15-arm-cpu)
...
we now display an error:
$ qemu-system-arm -M midway -cpu cortex-a7
qemu-system-arm: Invalid CPU model: cortex-a7
The only valid type is: cortex-a15
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-id: 20240129151828.59544-5-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Restrict MachineClass::valid_cpu_types[] to the single
valid CPU type.
Instead of ignoring invalid CPU type requested by the user:
$ qemu-system-arm -M nuri -cpu cortex-a7 -S -monitor stdio
QEMU 8.2.50 monitor - type 'help' for more information
(qemu) info qom-tree
/machine (nuri-machine)
/soc (exynos4210)
/cpu[0] (cortex-a9-arm-cpu)
...
We now display an error:
$ qemu-system-arm -M nuri -cpu cortex-a7
qemu-system-arm: Invalid CPU model: cortex-a7
The only valid type is: cortex-a9
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-id: 20240129151828.59544-3-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can't just embed labels directly into files like qemu-options.hx which
are included from multiple top-level rST files, because Sphinx sees the
labels as duplicate: https://github.com/sphinx-doc/sphinx/issues/9707
So add an optional argument to the SRST directive which causes a label
of the form '.. _DOCNAME-HXFILE-LABEL:' to be emitted, where 'DOCNAME'
is the name of the top level rST file, 'HXFILE' is the filename of the
.hx file, and 'LABEL' is the text provided within the 'SRST()' directive.
Using the DOCNAME of the top-level rST document means that it is unique
even when the .hx file is included from two different documents, as is
the case for qemu-options.hx
Now where the Xen PV documentation refers to the documentation for the
-initrd command line option, it can emit a link directly to it as
'<system/invocation-qemu-options-initrd>'.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240130190348.682912-1-dwmw2@infradead.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This test program is the last use of any variable length array in the
codebase. If we can get rid of all uses of VLAs we can make the
compiler error on new additions. This is a defensive measure against
security bugs where an on-stack dynamic allocation isn't correctly
size-checked (e.g. CVE-2021-3527).
In this case the test code didn't even want a variable-sized
array, it was just accidentally using syntax that gave it one.
(The array size for C has to be an actual constant expression,
not just something that happens to be known to be constant...)
Remove the VLA usage.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-id: 20240125173211.1786196-2-peter.maydell@linaro.org
In kernel commit 5d5b4e8c2d9ec ("arm64/sve: Report FEAT_SVE_B16B16 to
userspace") Linux added ID_AA64ZFR0_el1.B16B16 to the set of ID
register fields which it exposes to userspace. Update our
exported_bits mask to include this.
(This doesn't yet change any behaviour for us, because we don't yet
have any CPUs that implement this feature, which is part of SVE2.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240125134304.1470404-1-peter.maydell@linaro.org
The -serial option documentation is a bit brief about '-serial none'
and '-serial null'. In particular it's not very clear about the
difference between them, and it doesn't mention that it's up to
the machine model whether '-serial none' means "don't create the
serial port" or "don't wire the serial port up to anything".
Expand on these points.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240122163607.459769-3-peter.maydell@linaro.org
Currently if the user passes multiple -serial options on the command
line, we mostly treat those as applying to the different serial
devices in order, so that for example
-serial stdio -serial file:filename
will connect the first serial port to stdio and the second to the
named file.
The exception to this is the '-serial none' serial device type. This
means "don't allocate this serial device", but a bug means that
following -serial options are not correctly handled, so that
-serial none -serial stdio
has the unexpected effect that stdio is connected to the first serial
port, not the second.
This is a very long-standing bug that dates back at least as far as
commit 998bbd74b9 from 2009.
Make the 'none' serial type move forward in the indexing of serial
devices like all the other serial types, so that any subsequent
-serial options are correctly handled.
Note that if your commandline mistakenly had a '-serial none' that
was being overridden by a following '-serial something' option, you
should delete the unnecessary '-serial none'. This will give you the
same behaviour as before, on QEMU versions both with and without this
bug fix.
Cc: qemu-stable@nongnu.org
Reported-by: Bohdan Kostiv <bohdan.kostiv@tii.ae>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240122163607.459769-2-peter.maydell@linaro.org
Fixes: 998bbd74b9 ("default devices: core code & serial lines")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Switch the PCI bus from using BusClass::reset to the Resettable
interface.
This has no behavioural change, because the BusClass code to support
subclasses that use the legacy BusClass::reset will call that method
in the hold phase of 3-phase reset.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-id: 20240119163512.3810301-2-peter.maydell@linaro.org
aspeed queue:
* Update of buildroot images to 2023.11 (6.6.3 kernel)
* Check of the valid CPU type supported by aspeed machines
* Simplified models for the IBM's FSI bus and the Aspeed
controller bridge
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmW7Sa8ACgkQUaNDx8/7
# 7KG7mw/8DbMJY6aqgq5YANszzem1ktJphPCNxq081cbczCOUpCNX4aL+0/ANvxxD
# lbJQB+SZeIRmuFbxYPhq68rtzB4vG7tsQpns4H33EPKT4vuzF70lq4fgptMiun3q
# 1ZJ2LF3jonvQWdhbC17wzAQz0FFb4F7XOxz++UL4okPsgzsYItnd+TWs8q7+erRb
# 84UwN+eBTBAl/FiNk679/tBTqAfCVGgQ7dzotr4f3tg5POvrGOrlEjAn0O+dGGDj
# wgILmpEBsTsilRB1tz8Kw0j/v/VkHz1DJu45lRAV9CIrN22iKcjMilNGgNDT8kcI
# yAlxAw3iN+hVFqDov8wFPjDYd/Qw2oRAPy2Kd14hW9xL8zBOTms1JK5L0PS2+Feo
# ZjMJ2cOJq3t4Wt1ZXRhgHfF4ANwP0OZ/y9bHCy3CkBljEeiTQbikHP9gVV4qHXZH
# 4Q0HnDZQwAgobw3CmZ8jVx1dQueqy3ycuvkhCyv3S0l/tdbtXDtr5pNNu3dAP/PJ
# 3nifLdRImhDvxxO9GKaCdUVLzELzMJl0GrgAsVJPKVnKHA4IiVKmB+XcW9IUbfy/
# 3zA2wHJLrEF+MF6MsuNcEYCCqUvyNLm7rUrXk1wNLXpCJ35bbW5IYy7Ty/8E2GHb
# D5Cv/EPNhMBiNA4+HqQlMOTC13Ozv2qwCuWYCh2Ik8mnzaEiyTo=
# =0C5S
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Feb 2024 07:35:11 GMT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20240201' of https://github.com/legoater/qemu:
hw/fsi: Update MAINTAINER list
hw/fsi: Added FSI documentation
hw/fsi: Added qtest
hw/arm: Hook up FSI module in AST2600
hw/fsi: Aspeed APB2OPB & On-chip peripheral bus
hw/fsi: Introduce IBM's FSI master
hw/fsi: Introduce IBM's cfam
hw/fsi: Introduce IBM's fsi-slave model
hw/fsi: Introduce IBM's FSI Bus
hw/fsi: Introduce IBM's scratchpad device
hw/fsi: Introduce IBM's Local bus
hw/arm/aspeed: Check for CPU types in machine_run_board_init()
hw/arm/aspeed: Introduce aspeed_soc_cpu_type() helper
hw/arm/aspeed: Init CPU defaults in a common helper
hw/arm/aspeed: Set default CPU count using aspeed_soc_num_cpus()
hw/arm/aspeed: Remove dead code
tests/avocado/machine_aspeed.py: Update buildroot images to 2023.11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pull-loongarch-20240201
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZbtI0AAKCRBAov/yOSY+
# 35J4A/9ehl15MIrkByqq8QNQnG4PbKWOTcTev6P0sEAhzPUtBpcGPesoHnt+DFuk
# f3WluWsfnFY9pCRgB9VBSpy89O6Efj187bpekT/PD5svPOXoUmETafv1p2vsVrpj
# huMNijt7I/LAXUCb2gA5pnNH9vrrJ8E5kQlgmYfh35cEhHObMQ==
# =Q9Gn
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Feb 2024 07:31:28 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240201' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix qtest test-hmp error when KVM-only build
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add maintainer for IBM FSI model
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - slight change in commit log
- fixed file list ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Added basic qtests for FSI model.
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
[ clg: aspeed-fsi-test.c -> aspeed_fsi-test.c to match other filenames ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This patchset introduces IBM's Flexible Service Interface(FSI).
Time for some fun with inter-processor buses. FSI allows a service
processor access to the internal buses of a host POWER processor to
perform configuration or debugging.
FSI has long existed in POWER processes and so comes with some baggage,
including how it has been integrated into the ASPEED SoC.
Working backwards from the POWER processor, the fundamental pieces of
interest for the implementation are:
1. The Common FRU Access Macro (CFAM), an address space containing
various "engines" that drive accesses on buses internal and external
to the POWER chip. Examples include the SBEFIFO and I2C masters. The
engines hang off of an internal Local Bus (LBUS) which is described
by the CFAM configuration block.
2. The FSI slave: The slave is the terminal point of the FSI bus for
FSI symbols addressed to it. Slaves can be cascaded off of one
another. The slave's configuration registers appear in address space
of the CFAM to which it is attached.
3. The FSI master: A controller in the platform service processor (e.g.
BMC) driving CFAM engine accesses into the POWER chip. At the
hardware level FSI is a bit-based protocol supporting synchronous and
DMA-driven accesses of engines in a CFAM.
4. The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
POWER processors. This now makes an appearance in the ASPEED SoC due
to tight integration of the FSI master IP with the OPB, mainly the
existence of an MMIO-mapping of the CFAM address straight onto a
sub-region of the OPB address space.
5. An APB-to-OPB bridge enabling access to the OPB from the ARM core in
the AST2600. Hardware limitations prevent the OPB from being directly
mapped into APB, so all accesses are indirect through the bridge.
The implementation appears as following in the qemu device tree:
(qemu) info qtree
bus: main-system-bus
type System
...
dev: aspeed.apb2opb, id ""
gpio-out "sysbus-irq" 1
mmio 000000001e79b000/0000000000001000
bus: opb.1
type opb
dev: fsi.master, id ""
bus: fsi.bus.1
type fsi.bus
dev: cfam.config, id ""
dev: cfam, id ""
bus: fsi.lbus.1
type lbus
dev: scratchpad, id ""
address = 0 (0x0)
bus: opb.0
type opb
dev: fsi.master, id ""
bus: fsi.bus.0
type fsi.bus
dev: cfam.config, id ""
dev: cfam, id ""
bus: fsi.lbus.0
type lbus
dev: scratchpad, id ""
address = 0 (0x0)
The LBUS is modelled to maintain the qdev bus hierarchy and to take
advantage of the object model to automatically generate the CFAM
configuration block. The configuration block presents engines in the
order they are attached to the CFAM's LBUS. Engine implementations
should subclass the LBusDevice and set the 'config' member of
LBusDeviceClass to match the engine's type.
CFAM designs offer a lot of flexibility, for instance it is possible for
a CFAM to be simultaneously driven from multiple FSI links. The modeling
is not so complete; it's assumed that each CFAM is attached to a single
FSI slave (as a consequence the CFAM subclasses the FSI slave).
As for FSI, its symbols and wire-protocol are not modelled at all. This
is not necessary to get FSI off the ground thanks to the mapping of the
CFAM address space onto the OPB address space - the models follow this
directly and map the CFAM memory region into the OPB's memory region.
Future work includes supporting more advanced accesses that drive the
FSI master directly rather than indirectly via the CFAM mapping, which
will require implementing the FSI state machine and methods for each of
the FSI symbols on the slave. Further down the track we can also look at
supporting the bitbanged SoftFSI drivers in Linux by extending the FSI
slave model to resolve sequences of GPIO IRQs into FSI symbols, and
calling the associated symbol method on the slave to map the access onto
the CFAM.
Testing:
Tested by reading cfam config address 0 on rainier machine type.
root@p10bmc:~# pdbg -a getcfam 0x0
p0: 0x0 = 0xc0022d15
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where IBM's Flexible Service Interface is
introduced.
An APB-to-OPB bridge enabling access to the OPB from the ARM core in
the AST2600. Hardware limitations prevent the OPB from being directly
mapped into APB, so all accesses are indirect through the bridge.
The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
POWER processors. This now makes an appearance in the ASPEED SoC due
to tight integration of the FSI master IP with the OPB, mainly the
existence of an MMIO-mapping of the CFAM address straight onto a
sub-region of the OPB address space.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - moved FSIMasterState under AspeedAPB2OPBState
- modified fsi_opb_fsi_master_address() and
fsi_opb_opb2fsi_address()
- instroduced fsi_aspeed_apb2opb_init()
- reworked fsi_aspeed_apb2opb_realize()
- removed FSIMasterState object and fsi_opb_realize()
- simplified OPBus
- introduced fsi_aspeed_apb2opb_rw to fix endianness issue ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where IBM's Flexible Service Interface is
introduced.
This commit models the FSI master. CFAM is hanging out of FSI master which is a bus controller.
The FSI master: A controller in the platform service processor (e.g.
BMC) driving CFAM engine accesses into the POWER chip. At the
hardware level FSI is a bit-based protocol supporting synchronous and
DMA-driven accesses of engines in a CFAM.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - move FSICFAMState object under FSIMasterState
- introduced fsi_master_init()
- reworked fsi_master_realize()
- dropped FSIBus definition ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where IBM's Flexible Service Interface is
introduced.
The Common FRU Access Macro (CFAM), an address space containing
various "engines" that drive accesses on busses internal and external
to the POWER chip. Examples include the SBEFIFO and I2C masters. The
engines hang off of an internal Local Bus (LBUS) which is described
by the CFAM configuration block.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - moved object FSIScratchPad under FSICFAMState
- moved FSIScratchPad code under cfam.c
- introduced fsi_cfam_instance_init()
- reworked fsi_cfam_realize() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where IBM's Flexible Service Interface is
introduced.
The FSI slave: The slave is the terminal point of the FSI bus for
FSI symbols addressed to it. Slaves can be cascaded off of one
another. The slave's configuration registers appear in address space
of the CFAM to which it is attached.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where FSI bus is introduced.
The FSI bus is a simple bus where FSI master is attached.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - removed include/hw/fsi/engine-scratchpad.h and
hw/fsi/engine-scratchpad.c
- dropped FSI_SCRATCHPAD
- included FSIBus definition
- dropped hw/fsi/trace-events changes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where IBM's Flexible Service Interface is
introduced.
The scratchpad provides a set of non-functional registers. The firmware
is free to use them, hardware does not support any special management
support. The scratchpad registers can be read or written from LBUS
slave. The scratch pad is managed under FSI CFAM state.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - moved object FSIScratchPad under FSICFAMState
- moved FSIScratchPad code under cfam.c ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is a part of patchset where IBM's Flexible Service Interface is
introduced.
The LBUS is modelled to maintain mapped memory for the devices. The
memory is mapped after CFAM config, peek table and FSI slave registers.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: - removed lbus_add_device() bc unused
- removed lbus_create_device() bc used only once
- removed "address" property
- updated meson.build to build fsi dir
- included an empty hw/fsi/trace-events ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Aspeed SoCs use a single CPU type (set as AspeedSoCClass::cpu_type).
Convert it to a NULL-terminated array (of a single non-NULL element).
Set MachineClass::valid_cpu_types[] to use the common machine code
to provide hints when the requested CPU is invalid (see commit
e702cbc19e ("machine: Improve is_cpu_type_supported()").
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
In order to alter AspeedSoCClass::cpu_type in the next
commit, introduce the aspeed_soc_cpu_type() helper to
retrieve the per-SoC CPU type from AspeedSoCClass.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Rework aspeed_soc_num_cpus() as a new init_cpus_defaults()
helper to reduce code duplication.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Since commit b7f1a0cb76 ("arm/aspeed: Compute the number
of CPUs from the SoC definition") Aspeed machines use the
aspeed_soc_num_cpus() helper to set the number of CPUs.
Use it for the ast1030-evb (commit 356b230ed1 "aspeed/soc:
Add AST1030 support") and supermicrox11-bmc (commit 40a38df55e
"hw/arm/aspeed: Add board model for Supermicro X11 BMC") machines.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
trivial patches for 2024-01-31
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmW6NScPHG1qdEB0bHMu
# bXNrLnJ1AAoJEHAbT2saaT5ZdQYH/2fhfhZotH0V2qAcMxlOoHbAE9UhZNRsSYtf
# QFP0GXFYFAMm7LHkPUbvKgO7LylKWAOMn/zKZqgj1Vf1EpoKQ2FwLtR/buDz86Ec
# pi2OrDPRA7Ay5c3ow3YZZkUOhQTTcR5rNjYctPtt/J4j8ol/z5vre7weJIg2bCJe
# zI7vIVg7iFFzbkXY20KHngJ5nDC+aEm7WaGlxAP8kfkvy324Wy9O2k8qu2J5zbLT
# HGvh3rwEDvRTYe4CaKFFHWNV0m4092HAr/dJBobugI5VZ6QQpK6Tgy8N+4ZrCHD2
# SjUKeym85VTOYGuY8b18fk5MQK2SzsfBUJ4x8VGC75W4mJ8agdc=
# =HImO
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 31 Jan 2024 11:55:19 GMT
# gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg: issuer "mjt@tls.msk.ru"
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@debian.org>" [full]
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59
* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu: (21 commits)
hw/hyperv: Include missing headers
hw/intc/xics: Include missing 'cpu.h' header
hw/arm: Add `\n` to hint message
hw/loongarch: Add `\n` to hint message
hw/i386: Add `\n` to hint message
backends/hostmem: Fix block comments style (checkpatch.pl warnings)
misc: Clean up includes
riscv: Clean up includes
cxl: Clean up includes
include: Clean up includes
m68k: Clean up includes
acpi: Clean up includes
aspeed: Clean up includes
disas/riscv: Clean up includes
hyperv: Clean up includes
scripts/clean-includes: Update exclude list
mailmap: Fix Stefan Weil email
qemu-docs: Update options for graphical frontends
qapi/migration.json: Fix the member name for MigrationCapability
colo: examples: remove mentions of script= and (wrong) downscript=
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The following expression is incorrect because blk_pread_nonzeroes()
deals in units of bytes, not sectors:
bytes = MIN(size - offset, BDRV_REQUEST_MAX_SECTORS)
^^^^^^^
BDRV_REQUEST_MAX_BYTES is the appropriate constant.
Fixes: a4b15a8b9e ("pflash: Only read non-zero parts of backend image")
Cc: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240130002712.257815-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With GCC 14 the code failed to compile on i686 (and was wrong for any
version of GCC):
../block/blkio.c: In function ‘blkio_file_open’:
../block/blkio.c:857:28: error: passing argument 3 of ‘blkio_get_uint64’ from incompatible pointer type [-Wincompatible-pointer-types]
857 | &s->mem_region_alignment);
| ^~~~~~~~~~~~~~~~~~~~~~~~
| |
| size_t * {aka unsigned int *}
In file included from ../block/blkio.c:12:
/usr/include/blkio.h:49:67: note: expected ‘uint64_t *’ {aka ‘long long unsigned int *’} but argument is of type ‘size_t *’ {aka ‘unsigned int *’}
49 | int blkio_get_uint64(struct blkio *b, const char *name, uint64_t *value);
| ~~~~~~~~~~^~~~~
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20240130122006.2977938-1-rjones@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The man page for io_uring_queue_init states:
> io_uring_queue_init(3) returns 0 on success and -errno on failure.
and the man page for io_uring_setup (which is one of the functions
where the return value of io_uring_queue_init() can come from) states:
> On error, a negative error code is returned. The caller should not
> rely on errno variable.
Tested using 'sysctl kernel.io_uring_disabled=2'. Output before this
change:
> failed to init linux io_uring ring
Output after this change:
> failed to init linux io_uring ring: Operation not permitted
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240123135044.204985-1-f.ebner@proxmox.com>
Include missing headers in order to avoid when refactoring
unrelated headers:
hw/hyperv/hyperv.c:33:18: error: field ‘msg_page_mr’ has incomplete type
33 | MemoryRegion msg_page_mr;
| ^~~~~~~~~~~
hw/hyperv/hyperv.c: In function ‘synic_update’:
hw/hyperv/hyperv.c:64:13: error: implicit declaration of function ‘memory_region_del_subregion’ [-Werror=implicit-function-declaration]
64 | memory_region_del_subregion(get_system_memory(),
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/hyperv/hyperv.c: In function ‘hyperv_hcall_signal_event’:
hw/hyperv/hyperv.c:683:17: error: implicit declaration of function ‘ldq_phys’; did you mean ‘ldub_phys’? [-Werror=implicit-function-declaration]
683 | param = ldq_phys(&address_space_memory, addr);
| ^~~~~~~~
| ldub_phys
hw/hyperv/hyperv.c:683:17: error: nested extern declaration of ‘ldq_phys’ [-Werror=nested-externs]
hw/hyperv/hyperv.c: In function ‘hyperv_hcall_retreive_dbg_data’:
hw/hyperv/hyperv.c:792:24: error: ‘TARGET_PAGE_SIZE’ undeclared (first use in this function); did you mean ‘TARGET_PAGE_BITS’?
792 | msg.u.recv.count = TARGET_PAGE_SIZE - sizeof(*debug_data_out);
| ^~~~~~~~~~~~~~~~
| TARGET_PAGE_BITS
hw/hyperv/hyperv.c: In function ‘hyperv_syndbg_send’:
hw/hyperv/hyperv.c:885:16: error: ‘HV_SYNDBG_STATUS_INVALID’ undeclared (first use in this function)
885 | return HV_SYNDBG_STATUS_INVALID;
| ^~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Include missing headers in order to avoid when refactoring
unrelated headers:
hw/intc/xics.c: In function 'icp_realize':
hw/intc/xics.c:304:5: error: unknown type name 'PowerPCCPU'
304 | PowerPCCPU *cpu;
| ^~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
While re-indenting code in host_memory_backend_memory_complete(),
we triggered various "Block comments use a leading /* on a separate
line" warnings from checkpatch.pl. Correct the comments style.
Fixes: e199f7ad4d ("backends: Simplify host_memory_backend_memory_complete()")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git misc net/af-xdp.c plugins/*.c audio/pwaudio.c util/userfaultfd.c
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git riscv target/riscv/*.[ch]
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes.
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git include include/*/*.h include/*/*/*.h
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git m68k include/hw/audio/asc.h include/hw/m68k/*.h
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git acpi include/hw/*/*acpi.h hw/*/*acpi.c
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes.
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git disas/riscv disas/riscv*[ch]
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit was created with scripts/clean-includes:
./scripts/clean-includes --git hyperv hw/hyperv/*.[ch]
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Update the exclude list to exclude some more files which don't follow our
standard #include policy.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 5204b499a6 ("mailmap: Fix Stefan Weil author email")
corrected authorship for patch received at qemu-devel@nongnu.org,
correct now for patch received at qemu-trivial@nongnu.org.
Update other authorship email for Stefan's commits.
Suggested-by: Stefan Weil <sw@weilnetz.de>
Fixes: d819fc9516 ("virtio-blk: Fix potential nullptr read access")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The command line options `-ctrl-grab` and `-alt-grab` have been removed
in QEMU 7.1. Instead, use the `-display sdl,grab-mod=<modifiers>` option
to specify the grab modifiers.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2103
Signed-off-by: Yihuan Pan <xun794@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There's no need to repeat script=/etc/qemu-ifup in examples,
as it is already in there. More, all examples uses incorrect
"down script=" (which should be "downscript=").
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
-z without -R has no effect: the dump format remains @elf. Fix the
logic error so it becomes @kdump-zlib.
Fixes: e6549197f7 (dump: Add command interface for kdump-raw formats)
Fixes: CID 1523841
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qga-pull-2024-01-30
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmW409sACgkQ711egWG6
# hOc0PxAAj0Lhgt70OpAGQd5ERBgdRbh2/pJkKvdWBZvUAkVyFEp6Wn1UEC9aRpdi
# g8KCfYWVpp4LKMm5Ok1/8MKzAKT0jJFKatX0iZy/sx5M0A7x2/cJWgTv69L41Ld3
# FoJM/a/te8VUSk/jPKMqf4G7BzqYyHDnAlcnLjnvUsUabnocjmpSSckCNWLtQ/Sz
# RCGOwaTJ4HKW9O7yuw4gcXeYf2G21UemkITLh7Kdv/6fqqlg3k2gRSQeQ11Xh9Tc
# Jst7511vs+9AvDVM67o84z6UQwlp+23IJb3PBG2Td+Odr7BDAturWE4yyqPtjcpf
# Kcy6uMO6ZNYSW9CAspIF271fObLBPdvvMCHabar8x3LwF/6SxdnVRnWbqVsY/LSZ
# j76VB7TashwxzgqXEqmSrJmCSiCU4L7k8+KU0XxqIg4howlHzoRlejuMZzVK8Hyt
# Cu5tqa6yq8+twxzP4qQhkvazGgYgWo/xNaZ8D3h6xxxZ4NE/hN24YodtyFV/NGWE
# VSvUpzTKettECcyPo3OGezOqcr0vW5Z776MQOsMQWAjwm3bjM8hMzuKglnJe7Lt4
# DAbreHQ5s+pELQnq/cv7AVAr3zy2AIpPq4xDPJUHunAh+kMFY3cHCNaKytiRqmf+
# kZtd4nSN9j5+0oE5jSiuPn9C96y3shGL95XR6WW0nN1ImWOVgx4=
# =Tl7c
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Jan 2024 10:47:55 GMT
# gpg: using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423 EB84 EF5D 5E81 61BA 84E7
* tag 'qga-pull-2024-01-30' of https://github.com/kostyanf14/qemu:
qga: Solaris has net/if_arp.h and netinet/if_ether.h but not ETHER_ADDR_LEN
qga-win: Fix guest-get-fsinfo multi-disks collection
tests/unit/test-qga: do not qualify executable paths
guest-agent: improve help for --allow-rpcs and --block-rpcs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Solaris has net/if_arp.h and netinet/if_ether.h rather than net/ethernet.h,
but does not define ETHER_ADDR_LEN, instead providing ETHERADDRL.
Signed-off-by: Nick Briggs <nicholas.h.briggs@gmail.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
guest-exec invocation does not need the full path of the executable to
execute. Using only the command names ensures correct execution of the
test on systems not adhering to the FHS.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
The function is now trivial, and with inlining we can
re-use the calling function's tcg_ops variable.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Migration Pull
[dropped fabiano's patch on modifying cpu model for arm migration tests for
now]
- Fabiano's patchset to fix migration state references in BHs
- Fabiano's new 'n-1' migration test for CI
- Het's fix on making "uri" optional in QMP migrate cmd
- Markus's HMP leak fix reported by Coverity
- Paolo's cleanup on uffd to replace u64 usage
- Peter's small migration cleanup series all over the places
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZbcVeBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wYHjgD9F2Fnrf4EuPNC/gF3yUvHVz1mgHqevb/g
# pw/ThcJF31wBALuWmwuUaNWm+VNtRc10YH6bY7HZW8oa1RefRN6QZn0L
# =JGTX
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 29 Jan 2024 03:03:20 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240126-pull-request' of https://gitlab.com/peterx/qemu:
Make 'uri' optional for migrate QAPI
migration: Centralize BH creation and dispatch
migration: Add a wrapper to qemu_bh_schedule
migration: Reference migration state around loadvm_postcopy_handle_run_bh
migration: Take reference to migration state around bg_migration_vm_start_bh
migration: Fix use-after-free of migration state object
migration/yank: Use channel features
ci: Disable migration compatibility tests for aarch64
ci: Add a migration compatibility test job
analyze-migration.py: Remove trick on parsing ramblocks
migration: Drop unnecessary check in ram's pending_exact()
migration: Make threshold_size an uint64_t
migration: Plug memory leak on HMP migrate error path
userfaultfd: use 1ULL to build ioctl masks
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Both the report() function as well as the initial gdbstub test sequence
are copy-pasted into ~10 files with slight modifications. This
indicates that they are indeed generic, so factor them out. While
at it, add a few newlines to make the formatting closer to PEP-8.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240129093410.3151-3-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
gdbserver ignores page protection by virtue of using /proc/$pid/mem.
Teach qemu gdbstub to do this too. This will not work if /proc is not
mounted; accept this limitation.
One alternative is to temporarily grant the missing PROT_* bit, but
this is inherently racy. Another alternative is self-debugging with
ptrace(POKE), which will break if QEMU itself is being debugged - a
much more severe limitation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20240129093410.3151-2-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When doing device assignment of a physical device, MSI-X can be
enabled with no vectors enabled and this sets the IRQ index to
VFIO_PCI_MSIX_IRQ_INDEX. However, when MSI-X is disabled, the IRQ
index is left untouched if no vectors are in use. Then, when INTx
is enabled, the IRQ index value is considered incompatible (set to
MSI-X) and VFIO_DEVICE_SET_IRQS fails. QEMU complains with :
qemu-system-x86_64: vfio 0000:08:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Invalid argument
To avoid that, unconditionaly clear the IRQ index when MSI-X is
disabled.
Buglink: https://issues.redhat.com/browse/RHEL-21293
Fixes: 5ebffa4e87 ("vfio/pci: use an invalid fd to enable MSI-X")
Cc: Jing Liu <jing2.liu@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Do not use uint64_t for the type of the declaration and __u64 when
computing the number of elements in the array.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Now that the migration state reference counting is correct, further
wrap the bottom half dispatch process to avoid future issues.
Move BH creation and scheduling together and wrap the dispatch with an
intermediary function that will ensure we always keep the ref/unref
balanced.
Also move the responsibility of deleting the BH into the wrapper and
remove the now unnecessary pointers.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-6-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
We need to hold a reference to the current_migration object around
async calls to avoid it been freed while still in use. Even on this
load-side function, we might still use the MigrationState, e.g to
check for capabilities.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-4-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
We're currently allowing the process_incoming_migration_bh bottom-half
to run without holding a reference to the 'current_migration' object,
which leads to a segmentation fault if the BH is still live after
migration_shutdown() has dropped the last reference to
current_migration.
In my system the bug manifests as migrate_multifd() returning true
when it shouldn't and multifd_load_shutdown() calling
multifd_recv_terminate_threads() which crashes due to an uninitialized
multifd_recv_state.
Fix the issue by holding a reference to the object when scheduling the
BH and dropping it before returning from the BH. The same is already
done for the cleanup_bh at migrate_fd_cleanup_schedule().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1969
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Stop using outside knowledge about the io channels when registering
yank functions. Query for features instead.
The yank method for all channels used with migration code currently is
to call the qio_channel_shutdown() function, so query for
QIO_CHANNEL_FEATURE_SHUTDOWN. We could add a separate feature in the
future for indicating whether a channel supports yanking, but that
seems overkill at the moment.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20230911171320.24372-9-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Until 9.0 is out, we need to keep the aarch64 job disabled because the
tests always use the n-1 version of migration-test. That happens to be
broken for aarch64 in 8.2. Once 9.0 is out, it will become the n-1
version and it will bring the fixed tests.
We can revert this patch when 9.0 releases.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240118164951.30350-4-farosas@suse.de
[peterx: use _SKIPPED rather than _OPTIONAL]
Signed-off-by: Peter Xu <peterx@redhat.com>
The migration tests have support for being passed two QEMU binaries to
test migration compatibility.
Add a CI job that builds the lastest release of QEMU and another job
that uses that version plus an already present build of the current
version and run the migration tests with the two, both as source and
destination. I.e.:
old QEMU (n-1) -> current QEMU (development tree)
current QEMU (development tree) -> old QEMU (n-1)
The purpose of this CI job is to ensure the code we're about to merge
will not cause a migration compatibility problem when migrating the
next release (which will contain that code) to/from the previous
release.
The version of migration-test used will be the one matching the older
QEMU. That way we can avoid special-casing new tests that wouldn't be
compatible with the older QEMU.
Note: for user forks, the version tags need to be pushed to gitlab
otherwise it won't be able to checkout a different version.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240118164951.30350-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
When the migration frameworks fetches the exact pending sizes, it means
this check:
remaining_size < s->threshold_size
Must have been done already, actually at migration_iteration_run():
if (must_precopy <= s->threshold_size) {
qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
That should be after one round of ram_state_pending_estimate(). It makes
the 2nd check meaningless and can be dropped.
To say it in another way, when reaching ->state_pending_exact(), we
unconditionally sync dirty bits for precopy.
Then we can drop migrate_get_current() there too.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240117075848.139045-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Always include fake_user_interrupt in user-only build, despite
only being used for i386. This will enable cpu-exec.c to be
compiled only once.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-ID: <20240119144024.14289-18-anjo@rev.ng>
[rth: Split out of a larger patch; remove TARGET_I386 conditional.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ifdef out of which it is moved is not quite right: do_interrupt is
only needed for system mode. Move it to the top of a different ifdef
block, which preserves its position within the structure for that case.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240119144024.14289-18-anjo@rev.ng>
[rth: Split from a larger patch and simplified.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Needed to work around circular includes. vaddr is currently defined in
cpu-common.h and needed by hw/core/cpu.h, but cpu-common.h also need
cpu.h to know the size of the CPUState.
[Maybe we can instead move parts of cpu-common.h w. hw/core/cpu.h to
sort out the circular inclusion.]
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240119144024.14289-7-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[rth: Add include of vaddr.h into cpu-common.h]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Unless I'm missing something egregious, the jmp cache is only every
populated with a valid entry by the same thread that reads the cache.
Therefore, the contents of any valid entry are always consistent and
there is no need for any acquire/release magic.
Indeed ->tb has to be accessed with atomics, because concurrent
invalidations would otherwise cause data races. But ->pc is only ever
accessed by one thread, and accesses to ->tb and ->pc within tb_lookup
can never race with another tb_lookup. While the TranslationBlock
(especially the flags) could be modified by a concurrent invalidation,
store-release and load-acquire operations on the cache entry would
not add any additional ordering beyond what you get from performing
the accesses within a single thread.
Because of this, there is really nothing to win in splitting the CF_PCREL
and !CF_PCREL paths. It is easier to just always use the ->pc field in
the jump cache.
I noticed this while working on splitting commit 8ed558ec0c
("accel/tcg: Introduce TARGET_TB_PCREL", 2022-10-04) into multiple
pieces, for the sake of finding a more fine-grained bisection
result for https://gitlab.com/qemu-project/qemu/-/issues/2092.
It does not (and does not intend to) fix that issue; therefore
it may make sense to not commit it until the root cause
of issue #2092 is found.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240122153409.351959-1-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Block layer patches
- virtio-blk: Multiqueue fixes and cleanups
- blklogwrites: Fixes for write_zeroes and superblock update races
- commit/stream: Allow users to request only format driver names in
backing file format
- monitor: only run coroutine commands in qemu_aio_context
- Some iotest fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmWzpOwRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9ZNzg//W1+C7HxLft4Jc4O1BcOoOLlGCg4Esupt
# z0/XLZ9+xVQUtjQ82pFzf9XaWQs8CuNT3FBUKi+ngdwZ0JBThIv0aGiMZBcAeQjD
# qshPFgDM1lGL4ICIaT73/qfUzQgO3oruZj9F+ShBBzoasNWVoRzqqVDR3pinLwTp
# D4TU+3A6LkdhlYGT60SYfRq/UKNmCA1s2wysdjqXxS6KOEURNF2VBnz0Nu76qrVb
# 3P/a55GPiJIn+VVsdQ0J4vyyzn23m7I7WZOJ7Sjm1EfSJ6SvcDbhWsZTUonaV2rU
# qZ3WI/jggqxXRV8F2AaA4suS/Cc8RkX2KfcN8fB6wDC2eI5USSatjh6xfw5xH9Ll
# NRKUO4vFFR3Lf8wN9apg0Bwxqi0GOm9kvBJT5QqjQ16R1dvqBLqbZqcx6ZXqWFXe
# /Iy243Tz19mWTFVUj0EgCKQpNz9F4SyXxV83HtSR1lJ5mhthnLxkvUOe7jsFPE4d
# 1Z3uBNWnx2mKFkhlwocMTKayYqxPuKQ+YjqrRoplLW1GZoBeoalKRGf8/RHa6kQx
# gh4cguihlb71AH1AO1QuYpiZt9G4RJR2RZlIoCPJY5TaKJedcxMVn8H+8/F0PnQd
# gPysZf7hTU1xCUV6TClDd+f2fuvqZYwXdwHJ9iiohNkbFq4HFQUp4nk4/eEPGSe/
# uv8oE813E30=
# =KQJl
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 26 Jan 2024 12:26:20 GMT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin:
iotests/277: Use iotests.sock_dir for socket creation
iotests/iothreads-stream: Use the right TimeoutError
tests/unit: Bump test-replication timeout to 60 seconds
iotests/264: Use iotests.sock_dir for socket creation
block/blklogwrites: Protect mutable driver state with a mutex.
virtio-blk: always set ioeventfd during startup
virtio-blk: tolerate failure to set BlockBackend AioContext
virtio-blk: restart s->rq reqs in vq AioContexts
virtio-blk: rename dataplane to ioeventfd
virtio-blk: rename dataplane create/destroy functions
virtio-blk: move dataplane code into virtio-blk.c
monitor: only run coroutine commands in qemu_aio_context
iotests: port 141 to Python for reliable QMP testing
iotests: add filter_qmp_generated_node_ids()
stream: Allow users to request only format driver names in backing file format
commit: Allow users to request only format driver names in backing file format
string-output-visitor: Fix (pseudo) struct handling
block/blklogwrites: Fix a bug when logging "write zeroes" operations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The const_le64() macro introduced in commit 845d80a8c7 turns out
to have a bug which means that on big-endian systems the compiler
complains if the argument isn't already a 64-bit type. This hasn't
caused a problem yet, because there are no in-tree uses, but it
means it's not possible for anybody to add one without it failing CI.
This example is from an attempted use of it with the argument '0',
from the s390 CI runner's gcc:
../block/blklogwrites.c: In function ‘blk_log_writes_co_do_log’:
../include/qemu/bswap.h:148:36: error: left shift count >= width of
type [-Werror=shift-count-overflow]
148 | ((((_x) & 0x00000000000000ffU) << 56) | \
| ^~
../block/blklogwrites.c:409:27: note: in expansion of macro ‘const_le64’
409 | .nr_entries = const_le64(0),
| ^~~~~~~~~~
../include/qemu/bswap.h:149:36: error: left shift count >= width of
type [-Werror=shift-count-overflow]
149 | (((_x) & 0x000000000000ff00U) << 40) | \
| ^~
../block/blklogwrites.c:409:27: note: in expansion of macro ‘const_le64’
409 | .nr_entries = const_le64(0),
| ^~~~~~~~~~
cc1: all warnings being treated as errors
Fix this by making all the constants in the macro have the ULL
suffix. This will cause them all to be 64-bit integers, which means
the result of the logical & will also be an unsigned 64-bit type,
even if the input to the macro is a smaller type, and so the shifts
will be in range.
Fixes: 845d80a8c7 ("qemu/bswap: Add const_le64()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Message-id: 20240122173735.472951-1-peter.maydell@linaro.org
In commit 1b7bc9b5c8 we changed handle_vec_simd_sqshrn() so
that instead of starting with a 0 value and depositing in each new
element from the narrowing operation, it instead started with the raw
result of the narrowing operation of the first element.
This is fine in the vector case, because the deposit operations for
the second and subsequent elements will always overwrite any higher
bits that might have been in the first element's result value in
tcg_rd. However in the scalar case we only go through this loop
once. The effect is that for a signed narrowing operation, if the
result is negative then we will now return a value where the bits
above the first element are incorrectly 1 (because the narrowfn
returns a sign-extended result, not one that is truncated to the
element size).
Fix this by using an extract operation to get exactly the correct
bits of the output of the narrowfn for element 1, instead of a
plain move.
Cc: qemu-stable@nongnu.org
Fixes: 1b7bc9b5c8 ("target/arm: Avoid tcg_const_ptr in handle_vec_simd_sqshrn")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2089
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240123153416.877308-1-peter.maydell@linaro.org
This patch implements a 32 half word FIFO as per imx serial device
specifications. If a non empty FIFO is below the trigger level, an
ageing timer will tick for a duration of 8 characters. On expiry,
AGTIM will be set triggering an interrupt. AGTIM timer resets when
there is activity in the receive FIFO.
Otherwise, RRDY is set when trigger level is exceeded. The receive
trigger level is 8 in newer kernel versions and 1 in older ones.
This change will break migration compatibility for the imx boards.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Message-id: 20240125151931.83494-1-rayhan.faizel@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: commit message tidyups]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a note on CPU features that are off by default in `virt` machines.
Some CPU features will remain off even if a CPU-capable CPU (e.g.,
`-cpu max`) is selected because they require support in both the CPU
itself and in the wider system. Therefore, the user, besides selecting a
CPU that supports such features, must also turn on the feature using a
machine option.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240122211215.95073-1-gustavo.romero@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add MMDC, OCOTP, SQPI, CAAM, and USBMISC as unimplemented devices.
This allows operating systems such as Linux to run emulations such as
mcimx6ul-evk.
Before commit 0cd4926b85 ("Refactor i.MX6UL processor code"), the affected
memory ranges were covered by the unimplemented DAP device. The commit
reduced the DAP address range from 0x100000 to 4kB, and the emulation
thus no longer covered the various unimplemented devices in the affected
address range.
Fixes: 0cd4926b85 ("Refactor i.MX6UL processor code")
Cc: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240120005356.2599547-1-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Various files in hw/arm/ don't require "cpu.h" anymore.
Except virt-acpi-build.c, all of them don't require any
ARM specific knowledge anymore and can be build once as
target agnostic units. Update meson accordingly.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-21-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ARM_CPU_IRQ/FIQ definitions are used to index the GPIO
IRQ created calling qdev_init_gpio_in() in ARMCPU instance_init()
handler. To allow non-ARM code to raise interrupt on ARM cores,
move they to 'target/arm/cpu-qom.h' which is non-ARM specific and
can be included by any hw/ file.
File list to include the new header generated using:
$ git grep -wEl 'ARM_CPU_(\w*IRQ|FIQ)'
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-18-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now than we can access the M-profile bank index
definitions from the target-agnostic "cpu-qom.h"
header, we don't need the huge "cpu.h" anymore
(except in hw/arm/armv7m.c). Reduce its inclusion
to the source unit.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-17-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ARMv7M QDev container accesses the QDev SysTickState
by its secure/non-secure bank index. In order to make
the "hw/intc/armv7m_nvic.h" header target-agnostic in
the next commit, first move the M-profile bank index
definitions to "target/arm/cpu-qom.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-16-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
"target/arm/cpu.h" is target specific, any file including it
becomes target specific too, thus this is the same for any file
including "hw/misc/xlnx-versal-crl.h".
"hw/misc/xlnx-versal-crl.h" doesn't require any target specific
definition however, only the target-agnostic QOM definitions
from "target/arm/cpu-qom.h". Include the latter header to avoid
tainting unnecessary objects as target-specific.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-14-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Declare arm_cpu_mp_affinity() prototype in the new
"target/arm/multiprocessing.h" header so units in
hw/arm/ can use it without having to include the huge
target-specific "cpu.h".
File list to include the new header generated using:
$ git grep -lw arm_cpu_mp_affinity
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-11-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target/arm/cpregs.h uses the CP_REG_ARCH_* definitions
from "target/arm/kvm-consts.h". Include it in order to
avoid when refactoring unrelated headers:
target/arm/cpregs.h:191:18: error: use of undeclared identifier 'CP_REG_ARCH_MASK'
if ((kvmid & CP_REG_ARCH_MASK) == CP_REG_ARM64) {
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-8-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target/arm/cpregs.h uses the FIELD() macro defined in
"hw/registerfields.h". Include it in order to avoid when
refactoring unrelated headers:
target/arm/cpregs.h:347:30: error: expected identifier
FIELD(HFGRTR_EL2, AFSR0_EL1, 0, 1)
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-7-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target/arm/cpu-features.h uses the FIELD_EX32() macro
defined in "hw/registerfields.h". Include it in order
to avoid when refactoring unrelated headers:
target/arm/cpu-features.h:44:12: error: call to undeclared function 'FIELD_EX32';
ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-6-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
include/hw/arm/xlnx-versal.h uses the ARMCPU structure which
is defined in the "target/arm/cpu.h" header. Include it in
order to avoid when refactoring unrelated headers:
In file included from hw/arm/xlnx-versal-virt.c:20:
include/hw/arm/xlnx-versal.h:62:23: error: array has incomplete element type 'ARMCPU' (aka 'struct ArchCPU')
ARMCPU cpu[XLNX_VERSAL_NR_ACPUS];
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-5-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/arm/smmuv3-internal.h uses the REG32() and FIELD()
macros defined in "hw/registerfields.h". Include it in
order to avoid when refactoring unrelated headers:
In file included from ../../hw/arm/smmuv3.c:34:
hw/arm/smmuv3-internal.h:36:28: error: expected identifier
REG32(IDR0, 0x0)
^
hw/arm/smmuv3-internal.h:37:5: error: expected function body after function declarator
FIELD(IDR0, S2P, 0 , 1)
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-4-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/arm/xilinx_zynq.c calls tswap32() which is declared
in "exec/tswap.h". Include it in order to avoid when
refactoring unrelated headers:
hw/arm/xilinx_zynq.c:103:31: error: call to undeclared function 'tswap32';
ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
board_setup_blob[n] = tswap32(board_setup_blob[n]);
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-3-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/arm/exynos4210.c calls tswap32() which is declared
in "exec/tswap.h". Include it in order to avoid when
refactoring unrelated headers:
hw/arm/exynos4210.c:499:22: error: call to undeclared function 'tswap32';
ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
smpboot[n] = tswap32(smpboot[n]);
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240118200643.29037-2-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allwinner R40 supports two USB host ports shared between a USB 2.0 EHCI
host controller and a USB 1.1 OHCI host controller. Add support for both
of them.
If machine USB support is not enabled, create unimplemented devices
for the USB memory ranges to avoid crashes when booting Linux.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240115182757.1095012-2-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The TUSB6010 USB controller is soldered on the N800 and N810
tablets, thus is always present.
This is a migration compatibility break for the n800/n810
machines started with the '-usb none' option.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240119215106.45776-3-philmd@linaro.org
[PMM: fixed commit message typo]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Convert the musicpal key input device to use
qemu_add_kbd_event_handler(). This lets us simplify it because we no
longer need to track whether we're in the middle of a PS/2 multibyte
key sequence.
In the conversion we move the keyboard handler registration from init
to realize, because devices shouldn't disturb the state of the
simulation by doing things like registering input handlers until
they're realized, so that device objects can be introspected
safely.
The behaviour where key-repeat is permitted for the arrow-keys only
is intentional (added in commit 7c6ce4baed), so we retain it,
and add a comment to that effect.
This is a migration compatibility break for musicpal.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20231103182750.855577-1-peter.maydell@linaro.org
error_report() strings should not include trailing newlines; remove
the newline from the error we print when devices won't fit into the
address space of the CPU.
This commit also fixes the accidental hardcoded tabs that were in
this line, since we have to touch the line anyway.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240118131649.2726375-1-peter.maydell@linaro.org
In arm_deliver_fault() we check for whether the fault is caused
by a data abort due to an access to a FEAT_NV2 sysreg in the
memory pointed to by the VNCR. Unfortunately part of the
condition checks the wrong argument to the function, meaning
that it would spuriously trigger, resulting in some instruction
aborts being taken to the wrong EL and reported incorrectly.
Use the right variable in the condition.
Fixes: 674e534527 ("target/arm: Report VNCR_EL2 based faults correctly")
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-id: 20240116165605.2523055-1-peter.maydell@linaro.org
r[id]tlb[01], [iw][id]tlb opcodes use TLB way index passed in a register
by the guest. The host uses 3 bits of the index for ITLB indexing and 4
bits for DTLB, but there's only 7 entries in the ITLB array and 10 in
the DTLB array, so a malicious guest may trigger out-of-bound access to
these arrays.
Change split_tlb_entry_spec return type to bool to indicate whether TLB
way passed to it is valid. Change get_tlb_entry to return NULL in case
invalid TLB way is requested. Add assertion to xtensa_tlb_get_entry that
requested TLB way and entry indices are valid. Add checks to the
[rwi]tlb helpers that requested TLB way is valid and return 0 or do
nothing when it's not.
Cc: qemu-stable@nongnu.org
Fixes: b67ea0cd74 ("target-xtensa: implement memory protection options")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20231215120307.545381-1-jcmvbkbc@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If socket path is too long (longer than 108 bytes), socket can't be
opened. This might lead to failure when test dir path is long enough.
Make sure socket is created in iotests.sock_dir to avoid such a case.
This commit basically aligns iotests/277 with the rest of iotests.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20240124162257.168325-1-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since Python 3.11 asyncio.TimeoutError is an alias for TimeoutError, but
in older versions it's not. We really have to catch asyncio.TimeoutError
here, otherwise a slow test run will fail (as has happened multiple
times on CI recently).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240125152150.42389-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We're seeing timeouts for this test on CI runs (specifically for
ubuntu-20.04-s390x-all). It doesn't fail consistently, but even the
successful runs take about 27 or 28 seconds, which is not very far from
the 30 seconds timeout.
Bump the timeout a bit to make failure less likely even on this CI host.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240125165803.48373-1-kwolf@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmWyBtIACgkQ7wSWWzmN
# YhHWDgf+P9Jnlt8tCmOV6oKYrhKBbNZ5mZGmd83LpgFpn0YTdBQrauje2DziQ6u8
# KSVO6VGK/yzFLe8+xIIZXT0pFTbr8KuGhpKwqU8hq33dZtkRPUM6psirGgh2Z94K
# zWvBt/gL8DaO4ywShqwTZxhNBke1WduZpwzd/2XehmfT2SM/krpWeI2CjistQTBe
# IVbD7QioVuolh4Vq3W8On14NhwMp85Z/POh0kIAYHq5eDp2U6uYfK+1O8KHsRV4j
# Ae0Comul3YvNj9t3WPB6i1fLAzHvSfc1vO18CHKnznRONBLuhfnm9HKU7PtT/BC0
# JY59tU1lGYaQ9Ok3fDtxkaU41gkBWQ==
# =FHOd
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Jan 2024 06:59:30 GMT
# gpg: using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
virtio-net: correctly copy vnet header when flushing TX
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pull-loongarch-20240125
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZbINEAAKCRBAov/yOSY+
# 3yVsBACz0E5gVPc5Fp5hgQsAiiZPga/Pr565BOypIw8iAPs0RNxMMnywinFsOi1w
# A6euynZTEW9lxx5cq/O5j7yaXUmgfChcJ1OkS/IEZaUtiG25ksOIqvoeYvuROfuV
# nYrM0nuOMNwJzkOJy+qZAwGaUbyWdiqUTkP369V2xxngTneDkw==
# =1YQg
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 Jan 2024 07:26:08 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240125' of https://gitlab.com/gaosong/qemu:
target/loongarch/kvm: Enable LSX/LASX extension
target/loongarch: Set cpuid CSR register only once with kvm mode
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If socket path is too long (longer than 108 bytes), socket can't be
opened. This might lead to failure when test dir path is long enough.
Make sure socket is created in iotests.sock_dir to avoid such a case.
This commit basically aligns iotests/264 with the rest of iotests.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20240125135237.189493-1-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
During the review of a fix for a concurrency issue in blklogwrites,
it was found that the driver needs an additional fix when enabling
multiqueue, which is a new feature introduced in QEMU 9.0, as the
driver state may be read and written by multiple threads at the same
time, which was not the case when the driver was originally written.
Fix the multi-threaded scenario by introducing a mutex to protect the
mutable fields in the driver state, and always having the mutex locked
by the current thread when accessing them. Also use the mutex and a
CoQueue to ensure that the super block is not being written to by
multiple threads concurrently and updates are properly serialized.
Additionally, add the const qualifier to a few BDRVBlkLogWritesState
pointer targets in contexts where the driver state is not written to.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
Message-ID: <20240119162913.2620245-1-ari@tuxera.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When starting ioeventfd it is common practice to set the event notifier
so that the ioeventfd handler is triggered to run immediately. There may
be no requests waiting to be processed, but the idea is that if a
request snuck in then we guarantee that it will be detected.
One scenario where self-triggering the ioeventfd is necessary is when
virtio_blk_handle_output() is called from a vCPU thread before the
VIRTIO Device Status transitions to DRIVER_OK. In that case we need to
self-trigger the ioeventfd so that the kick handled by the vCPU thread
causes the vq AioContext thread to take over handling the request(s).
Fixes: b6948ab01d ("virtio-blk: add iothread-vq-mapping parameter")
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-7-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We no longer rely on setting the AioContext since the block layer
IO_CODE APIs can be called from any thread. Now it's just a hint to help
block jobs and other operations co-locate themselves in a thread with
the guest I/O requests. Keep going if setting the AioContext fails.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-6-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A virtio-blk device with the iothread-vq-mapping parameter has
per-virtqueue AioContexts. It is not thread-safe to process s->rq
requests in the BlockBackend AioContext since that may be different from
the virtqueue's AioContext to which this request belongs. The code
currently races and could crash.
Adapt virtio_blk_dma_restart_cb() to first split s->rq into per-vq lists
and then schedule a BH each vq's AioContext as necessary. This way
requests are safely processed in their vq's AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-5-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The dataplane code is really about using ioeventfd. It's used both for
IOThreads (what we think of as dataplane) and for the core virtio-pci
code's ioeventfd feature (which is enabled by default and used when no
IOThread has been specified). Rename the code to reflect this.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_blk_data_plane_create() and virtio_blk_data_plane_destroy() are
actually about s->vq_aio_context[] rather than managing
dataplane-specific state.
As a prerequisite to using s->vq_aio_context[] in all code paths (even
when dataplane is not used), rename these functions to reflect that they
just manage s->vq_aio_context and call them regardless of whether or not
dataplane is in use.
Note that virtio-blk supports running with -device
virtio-blk-pci,ioevent=off where the vCPU thread enters the device
emulation code. In this mode ioeventfd is not used for virtqueue
processing. However, we still want to initialize s->vq_aio_context[] to
qemu_aio_context in that case since I/O completion callbacks will be
invoked in the main loop thread.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The dataplane code used to be significantly different from the
non-dataplane code and therefore had a separate source file.
Over time the difference has gotten smaller because the I/O code paths
were unified. Nowadays the distinction between the VirtIOBlock and
VirtIOBlockDataPlane structs is more of an inconvenience that hinders
code simplification.
Move hw/block/dataplane/virtio-blk.c into hw/block/virtio-blk.c, merging
VirtIOBlockDataPlane's fields into VirtIOBlock.
hw/block/virtio-blk.c used VirtIOBlock->dataplane to check if
virtio_blk_data_plane_create() was successful. This is not necessary
because ->dataplane_started and ->dataplane_disabled can be used
instead. This patch makes those changes in order to drop
VirtIOBlock->dataplane.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not
polled during nested event loops. The coroutine currently reschedules
itself in the main loop's qemu_aio_context AioContext, which is polled
during nested event loops. One known problem is that QMP device-add
calls drain_call_rcu(), which temporarily drops the BQL, leading to all
sorts of havoc like other vCPU threads re-entering device emulation code
while another vCPU thread is waiting in device emulation code with
aio_poll().
Paolo Bonzini suggested running non-coroutine QMP handlers in the
iohandler AioContext. This avoids trouble with nested event loops. His
original idea was to move coroutine rescheduling to
monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch()
because we don't know if the QMP handler needs to run in coroutine
context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have
been nicer since it's associated with the monitor implementation and not
as general as qmp_dispatch(), which is also used by qemu-ga.
A number of qemu-iotests need updated .out files because the order of
QMP events vs QMP responses has changed.
Solves Issue #1933.
Cc: qemu-stable@nongnu.org
Fixes: 7bed89958b ("device_core: use drain_call_rcu in in qmp_device_add")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985
Buglink: https://issues.redhat.com/browse/RHEL-17369
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240118144823.1497953-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The common.qemu bash functions allow tests to interact with the QMP
monitor of a QEMU process. I spent two days trying to update 141 when
the order of the test output changed, but found it would still fail
occassionally because printf() and QMP events race with synchronous QMP
communication.
I gave up and ported 141 to the existing Python API for QMP tests. The
Python API is less affected by the order in which QEMU prints output
because it does not print all QMP traffic by default.
The next commit changes the order in which QMP messages are received.
Make 141 reliable first.
Cc: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240118144823.1497953-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a filter function for QMP responses that contain QEMU's
automatically generated node ids. The ids change between runs and must
be masked in the reference output.
The next commit will use this new function.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240118144823.1497953-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduce a new flag 'backing-mask-protocol' for the block-stream QMP
command which instructs the internals to use 'raw' instead of the
protocol driver in case when a image is used without a dummy 'raw'
wrapper.
The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.
The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <bbee9a0a59748a8893289bf8249f568f0d587e62.1701796348.git.pkrempa@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduce a new flag 'backing-mask-protocol' for the block-commit QMP
command which instructs the internals to use 'raw' instead of the
protocol driver in case when a image is used without a dummy 'raw'
wrapper.
The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.
The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <2cb46e37093ce793ea1604abc8bbb90f4c8e434b.1701796348.git.pkrempa@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit ff32bb53 tried to get minimal struct support into the string
output visitor by just making it return "<omitted>". Unfortunately, it
forgot that the caller will still make more visitor calls for the
content of the struct.
If the struct is contained in a list, such as IOThreadVirtQueueMapping,
in the better case its fields show up as separate list entries. In the
worse case, it contains another list, and the string output visitor
doesn't support nested lists and asserts that this doesn't happen. So as
soon as the optional "vqs" field in IOThreadVirtQueueMapping is
specified, we get a crash.
This can be reproduced with the following command line:
echo "info qtree" | ./qemu-system-x86_64 \
-object iothread,id=t0 \
-blockdev null-co,node-name=disk \
-device '{"driver": "virtio-blk-pci", "drive": "disk",
"iothread-vq-mapping": [{"iothread": "t0", "vqs": [0]}]}' \
-monitor stdio
Fix the problem by counting the nesting level of structs and ignoring
any visitor calls for values (apart from start/end_struct) while we're
not on the top level.
Lists nested directly within lists remain unimplemented, as we don't
currently have a use case for them.
Fixes: ff32bb5347
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2069
Reported-by: Aihua Liang <aliang@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240109181717.42493-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is a bug in the blklogwrites driver pertaining to logging "write
zeroes" operations, causing log corruption. This can be easily observed
by setting detect-zeroes to something other than "off" for the driver.
The issue is caused by a concurrency bug pertaining to the fact that
"write zeroes" operations have to be logged in two parts: first the log
entry metadata, then the zeroed-out region. While the log entry
metadata is being written by bdrv_co_pwritev(), another operation may
begin in the meanwhile and modify the state of the blklogwrites driver.
This is as intended by the coroutine-driven I/O model in QEMU, of
course.
Unfortunately, this specific scenario is mishandled. A short example:
1. Initially, in the current operation (#1), the current log sector
number in the driver state is only incremented by the number of sectors
taken by the log entry metadata, after which the log entry metadata is
written. The current operation yields.
2. Another operation (#2) may start while the log entry metadata is
being written. It uses the current log position as the start offset for
its log entry. This is in the sector right after the operation #1 log
entry metadata, which is bad!
3. After bdrv_co_pwritev() returns (#1), the current log sector
number is reread from the driver state in order to find out the start
offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the
offset will be the sector right after the (misplaced) operation #2 log
entry, which means that the zeroed-out region begins at the wrong
offset.
4. As a result of the above, the log is corrupt.
Fix this by only reading the driver metadata once, computing the
offsets and sizes in one go (including the optional zeroed-out region)
and setting the log sector number to the appropriate value for the next
operation in line.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
Cc: qemu-stable@nongnu.org
Message-ID: <20240109184646.1128475-1-megari@gmx.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
"Since X.Y" is not recognized as a tagged section, and therefore not
formatted as such in generated documentation. Fix by adding the
required colon.
Previously fixed in commit 433a4fdc42 (qapi: Fix malformed "Since:"
section tags)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
docs/devel/qapi-code-gen demands that the "second and subsequent lines
of sections other than "Example"/"Examples" should be indented".
Commit a937b6aa73 (qapi: Reformat doc comments to conform to current
conventions) missed a few instances, and a few more have crept in
since. Indent them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-7-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commit e050e42678 (qapi: Use explicit bulleted lists) added list
markup to correct bad rendering:
A JSON block comment like this:
Returns: nothing on success
If @node is not a valid block device, DeviceNotFound
If @name is not found, GenericError with an explanation
renders like this:
Returns: nothing on success If node is not a valid block device,
DeviceNotFound If name is not found, GenericError with an explanation
because whitespace is not significant.
Use an actual bulleted list, so that the formatting is correct.
It missed a few instances. Commit a937b6aa73 (qapi: Reformat doc
comments to conform to current conventions) then reflowed them.
Revert the reflowing, and add list markup.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-6-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
docs/interop/bitmaps.rst uses references like
`qemu-qmp-ref <qemu-qmp-ref.html>`_
`query-block <qemu-qmp-ref.html#index-query_002dblock>`_
to refer to and into docs/interop/qemu-qmp-ref.rst.
Clean up the former: use :doc:`qemu-qmp-ref`.
I don't know how to clean up the latter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-5-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Conversion of docs/devel/qapi-code-gen.txt to ReST left several
dangling references behind. Fix them to point to
docs/devel/qapi-code-gen.rst.
Fixes: f7aa076dbd (docs: convert qapi-code-gen.txt to ReST)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-4-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Deletion of docs/interop/qmp-intro.txt left two dangling references
behind. Replace them by references to docs/interop/qmp-spec.rst.
Fixes: 0ec4468f23 (docs/interop: Delete qmp-intro.txt)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-3-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We reserved type names ending with 'Kind' because a simple union
'SomeSimpleUnion' generated both a struct type SomeSimpleUnion and an
enum type SomeSimpleUnionKind. Gone since commit 4e99f4b12c (qapi:
Drop simple unions). The commit neglected to update the documentation
not to reserve type names ending with 'Kind'. Do that now.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231221145727.835905-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CSR cpuid register is used for routing irq to different vcpus, its
value is kept unchanged since poweron. So it is not necessary to
set CSR cpuid register after system resets, and it is only set at
vm creation stage.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240115085121.180524-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
When HASH_REPORT is negotiated, the guest_hdr_len might be larger than
the size of the mergeable rx buffer header. Using
virtio_net_hdr_mrg_rxbuf during the header swap might lead a stack
overflow in this case. Fixing this by using virtio_net_hdr_v1_hash
instead.
Reported-by: Xiao Lei <leixiao.nop@zju.edu.cn>
Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: qemu-stable@nongnu.org
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Fixes: CVE-2023-6693
Fixes: e22f0603fb ("virtio-net: reference implementation of hash report")
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
These rather complex functions have never been used since they've been
introduced in 2012, so looks like they are not really useful for QEMU.
And since the static normalize_uri_path() function is also only used by
uri_resolve(), we can remove that function now, too.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240123182247.432642-3-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We're still seeing timeouts in qtests that use a TCG payload with TCI
on a slow k8s runner:
https://gitlab.com/qemu-project/qemu/-/jobs/5990992722
So we should bump the timeout of cdrom-test to see whether that
fixes the issue.
Now, cdrom-test, as bios-tables-test, pxe-test and vmgenid-test use
the boot_sector_test() function for running a TCG payload. That
function already uses an internal timeout of 600 seconds with
the remark that the test could be slow with TCI.
Thus from the outer meson test runner side, we should not use less
than 600 seconds as timeout values for these tests. Let's bump them
on the meson side to 610 seconds so that the tests themselves can
run with their internal 600 seconds timeout and have some additional
seconds on top for reporting the outcome.
Message-ID: <20240124084412.465638-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The test-iov code uses usleep() with small values (<= 30) in some
nested loops with many iterations. This causes a small delay on OSes
like Linux that have a precise sleeping mechanism, but on systems
like NetBSD and OpenBSD, each usleep() call takes multiple microseconds,
which then sum up in a total test time of multiple minutes!
Looking at the code, the usleep() does not really seem to be necessary
here - if not enough data could be send, we should simply always use
select() to wait 'til we can send more. Thus remove the usleep() and
re-arrange the code a little bit to make it more clear what is going
on here.
Suggested-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240122153347.71654-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
A typo in sizeof_reg put the registers at the wrong offset.
Simplify the expressions to use positive addresses from the
start of uc_mcontext instead of negative addresses from the
end of uc_mcontext.
Reported-by: Vineet Gupta <vineetg@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using fleecing backup like in [0] on a qcow2 image (with metadata
preallocation) can lead to the following assertion failure:
> bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed.
In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag
will be set by the qcow2 driver, so the caller will recursively check
the file child. Then the BDRV_BLOCK_ZERO set too. Later up the call
chain, in bdrv_co_do_block_status() for the snapshot-access driver,
the assertion failure will happen, because both flags are set.
To fix it, clear the recurse flag after the recursive check was done.
In detail:
> #0 qcow2_co_block_status
Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
BDRV_BLOCK_OFFSET_VALID.
> #1 bdrv_co_do_block_status
Because of the data flag, bdrv_co_do_block_status() will now also set
BDRV_BLOCK_ALLOCATED. Because of the recurse flag,
bdrv_co_do_block_status() for the bdrv_file child will be called,
which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID |
BDRV_BLOCK_ZERO. Now the return value inherits the zero flag.
Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO.
> #2 bdrv_co_common_block_status_above
> #3 bdrv_co_block_status_above
> #4 bdrv_co_block_status
> #5 cbw_co_snapshot_block_status
> #6 bdrv_co_snapshot_block_status
> #7 snapshot_access_co_block_status
> #8 bdrv_co_do_block_status
Return value is propagated all the way up to here, where the assertion
failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are
both set.
> #9 bdrv_co_common_block_status_above
> #10 bdrv_co_block_status_above
> #11 block_copy_block_status
> #12 block_copy_dirty_clusters
> #13 block_copy_common
> #14 block_copy_async_co_entry
> #15 coroutine_trampoline
[0]:
> #!/bin/bash
> rm /tmp/disk.qcow2
> ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G
> ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G
> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \
> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } }
> EOF
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20240116154839.401030-1-f.ebner@proxmox.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Coroutine may be pooled even after COROUTINE_TERMINATE if
CONFIG_COROUTINE_POOL is enabled and fake stack should be saved in
such a case to keep AddressSanitizerUseAfterReturn working. Even worse,
I'm seeing stack corruption without fake stack being saved.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240117-asan-v2-1-26f9e1ea6e72@daynix.com>
Section 10.3 of the Hexagon V73 Programmer's Reference Manual
A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00.
The sub-instructions that comprise a duplex are encoded as 13-bit fields
in the duplex.
Create a decoder for each subinstruction class (a, l1, l2, s1, s2).
Extend gen_trans_funcs.py to handle all instructions rather than
filter by instruction class.
There is a g_assert_not_reached() in decode_insns() in decode.c to
verify we never try to use the old decoder on 16-bit instructions.
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240115221443.365287-3-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
The Decodetree Specification can be found here
https://www.qemu.org/docs/master/devel/decodetree.html
Covers all 32-bit instructions, including HVX
We generate separate decoders for each instruction class. The reason
will be more apparent in the next patch in this series.
We add 2 new scripts
gen_decodetree.py Generate the input to decodetree.py
gen_trans_funcs.py Generate the trans_* functions used by the
output of decodetree.py
Since the functions generated by decodetree.py take DisasContext * as an
argument, we add the argument to a couple of functions that didn't need
it previously. We also set the insn field in DisasContext during decode
because it is used by the trans_* functions.
There is a g_assert_not_reached() in decode_insns() in decode.c to
verify we never try to use the old decoder on 32-bit instructions
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240115221443.365287-2-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
The generators are generally a bunch of Python if-then-else
statements based on the regtype and regid. Encapsulate regtype/regid
into a class hierarchy. Clients lookup the register and invoke
methods.
This has several advantages for making the code easier to read,
understand, and maintain
- The class name makes it more clear what the operand does
- All the methods for a given type of operand are together
- Don't need hex_common.bad_register
If a regtype/regid is missing, the lookup in hex_common.get_register
will fail
- We can remove the functions in hex_common that use regtype/regid
(e.g., is_read)
This patch creates the class hierarchy in hex_common and converts
gen_tcg_funcs.py. The other scripts will be converted in subsequent
patches in this series.
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20231210220712.491494-3-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
Currently, the register number (MuN) for modifier registers is the
modifier register number rather than the index into hex_gpr. This
patch changes MuN to the hex_gpr index, which is consistent with
the handling of control registers.
Note that HELPER(fcircadd) needs the CS register corresponding to the
modifier register specified in the instruction. We create a TCGv
variable "CS" to hold the value to pass to the helper.
Reviewed-by: Brian Cain <bcain@quicinc.com>
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Message-Id: <20231210220712.491494-2-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
When compiling qemu with system KVM mode for LoongArch, header files
in directory linux-headers/asm-loongarch should be used firstly.
Otherwise it fails to find kvm.h on system with old glibc, since
latest kernel header files are not installed.
This patch adds linux_arch definition for LoongArch system so that
header files in directory linux-headers/asm-loongarch can be included.
Fixes: 714b03c125 ("target/loongarch: Add loongarch kvm into meson build")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240116013952.264474-1-maobibo@loongson.cn>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Some ELF files really do have segments of zero size, e.g.:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
RISCV_ATTRIBUT 0x00000000000025b8 0x0000000000000000 0x0000000000000000
0x000000000000003e 0x0000000000000000 R 0x1
LOAD 0x0000000000001000 0x0000000080200000 0x0000000080200000
0x00000000000001d1 0x00000000000001d1 R E 0x1000
LOAD 0x00000000000011d1 0x00000000802001d1 0x00000000802001d1
0x0000000000000e37 0x0000000000000e37 RW 0x1000
LOAD 0x0000000000000120 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x1000
The current logic does not check for this condition, resulting in
the incorrect assignment of 'lowaddr' as zero.
There is already a piece of codes inside the segment traversal loop
that checks for zero-sized loadable segments for not creating empty
ROM blobs. Let's move this check to the beginning of the loop to
cover both scenarios.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240116155049.390301-1-bmeng@tinylab.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Even though the BLAST command isn't fully implemented in QEMU, the DMA_STAT_BCMBLT
bit should be set after the command has been issued to indicate that the command
has completed.
This fixes an issue with the DC390 DOS driver which issues the BLAST command as
part of its normal error recovery routine at startup, and otherwise sits in a
tight loop waiting for DMA_STAT_BCMBLT to be set before continuing.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The setting of DMA_STAT_DONE at the end of a DMA transfer can be configured to
generate an interrupt, however the Linux driver manually checks for DMA_STAT_DONE
being set and if it is, considers that a DMA transfer has completed.
If DMA_STAT_DONE is set but the ESP device isn't indicating an interrupt then
the Linux driver considers this to be a spurious interrupt. However this can
occur in QEMU as there is a delay between the end of DMA transfer where
DMA_STAT_DONE is set, and the ESP device raising its completion interrupt.
This appears to be an incorrect assumption in the Linux driver as the ESP and
PCI DMA interrupt sources are separate (and may not be raised exactly
together), however we can work around this by synchronising the setting of
DMA_STAT_DONE at the end of a DMA transfer with the ESP completion interrupt.
In conjunction with the previous commit Linux is now able to correctly boot
from an am53c974 PCI SCSI device on the hppa C3700 machine without emitting
"iget: checksum invalid" and "Spurious irq, sreg=10" errors.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The am53c974/dc390 PCI interrupt has two separate sources: the first is from the
internal ESP device, and the second is from the PCI DMA transfer logic.
Update the ESP interrupt handler so that it sets DMA_STAT_SCSIINT rather than
driving the PCI IRQ directly, and introduce a new esp_pci_update_irq() function
to generate the correct PCI IRQ level. In particular this fixes spurious interrupts
being generated by setting DMA_STAT_DONE at the end of a transfer if DMA_CMD_INTE_D
isn't set in the DMA_CMD register.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The current code in esp_pci_dma_memory_rw() sets the DMA address to the value
of the DMA_SPA (Starting Physical Address) register which is incorrect: this
means that for each callback from the SCSI layer the DMA address is set back
to the starting address.
In the case where only a single SCSI callback occurs (currently for transfer
lengths < 128kB) this works fine, however for larger transfers the DMA address
wraps back to the initial starting address, corrupting the buffer holding the
data transferred to the guest.
Fix esp_pci_dma_memory_rw() to use the DMA_WAC (Working Address Counter) for
the DMA address which is correctly incremented across multiple SCSI layer
transfers.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The tcg_cpu_FOO() names are x86 specific, so rename
them as x86_tcg_cpu_FOO() (as other names in this file)
to ease navigating the code.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-ID: <20240111120221.35072-5-philmd@linaro.org>
Add an update buffer where all block updates are staged.
Flush or discard updates properly, so we should never see
half-completed block writes in pflash storage.
Drop a bunch of FIXME comments ;)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-4-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Move the offset calculation, do it once at the start of the function and
let the 'p' variable point directly to the memory location which should
be updated. This makes it simpler to update other buffers than
pfl->storage in an upcoming patch. No functional change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-2-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This is a follow-up on commit 89965db43c "hw/isa/piix3: Avoid Xen-specific
variant of piix3_write_config()" which introduced
piix_intx_routing_notifier_xen(). This function is implemented in board code but
accesses the PCI configuration space of the PIIX ISA function to determine the
PCI interrupt routes. Avoid this by reusing pci_device_route_intx_to_irq() which
makes piix_intx_routing_notifier_xen() more device-agnostic.
One remaining improvement would be making piix_intx_routing_notifier_xen()
agnostic towards the number of PCI interrupt routes and move it to xen-hvm.
This might be useful for possible Q35 Xen efforts but remains a future exercise
for now.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240107231623.5282-1-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The 16MiB flash device is only used by the deprecated shix machine.
Its code it old and unmaintained, and has never been adapted to the
QOM architecture. It still contains debug statements and uses global
variables. It is time to deprecate it.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240109083053.2581588-3-sam@rfc1149.net>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The shix machine has been designed and used at Télécom Paris from 2003
to 2010. It had been added to QEMU in 2005 and has not been maintained
since. Since nobody is using the physical board anymore nor interested
in maintaining the QEMU port, it is time to deprecate it.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240109083053.2581588-2-sam@rfc1149.net>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
pmu_init() register its event checking the pm_event::supported()
handler. For INST_RETIRED, the event is only registered and the
bit enabled in the PMU Common Event Identification register when
icount is enabled as ICOUNT_PRECISE.
PMU events are TCG-only, hardware accelerators handle them
directly. Unfortunately we register the events in non-TCG builds,
leading to linking error such:
ld: Undefined symbols:
_icount_to_ns, referenced from:
_instructions_ns_per in target_arm_helper.c.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)
As a kludge, give a hint to the compiler by asserting the
pm_event::get_count() and pm_event::ns_per_count() handler will
only be called under this icount mode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231208113529.74067-5-philmd@linaro.org>
Except helper_load_pcc(), all helpers from sys_helper.c
are system-emulation specific. In preparation of restricting
sys_helper.c to system emulation, extract helper_load_pcc()
to clk_helper.c.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231207105426.49339-2-philmd@linaro.org>
Since previous commit, tb_invalidate_phys_page() is not used
anymore in system emulation. Make it static for user emulation
and remove its public declaration in "exec/translate-all.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231130205600.35727-1-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Commit e3f7c801f1 introduced the TCGCPUOps::debug_check_breakpoint()
handler, and commit 10c37828b2 "moved breakpoint recognition outside
of translation", so "we no longer need to flush any TBs when changing
BPs".
The last target using tb_invalidate_phys_addr() was converted to the
debug_check_breakpoint(), so this function is now unused. Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231130203241.31099-1-philmd@linaro.org>
Don't embed ibreak exception generation into TB and don't invalidate TB
on ibreak address change. Add CPUBreakpoint pointers to xtensa
CPUArchState, use cpu_breakpoint_insert/cpu_breakpoint_remove_by_ref to
manage ibreak breakpoints and provide TCGCPUOps::debug_check_breakpoint
callback that recognizes valid instruction breakpoints.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231130171920.3798954-2-jcmvbkbc@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
'can_do_io' is specific to TCG. It was added to other
accelerators in 626cf8f4c6 ("icount: set can_do_io outside
TB execution"), then likely copy/pasted in commit c97d6d2cdf
("i386: hvf: add code base from Google's QEMU repository").
Having it set in non-TCG code is confusing, so remove it from
QTest / HVF / KVM.
Fixes: 626cf8f4c6 ("icount: set can_do_io outside TB execution")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231129205037.16849-1-philmd@linaro.org>
Both cryptodev_backend_set_throttle() and CryptoDevBackendClass::init()
can set their Error** argument. Do not ignore them, return early
on failure. Without that, running into another failure trips
error_setv()'s assertion. Use the ERRP_GUARD() macro as suggested
in commit ae7c80a7bd ("error: New macro ERRP_GUARD()").
Cc: qemu-stable@nongnu.org
Fixes: e7a775fd9f ("cryptodev: Account statistics")
Fixes: 2580b452ff ("cryptodev: support QoS")
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231120150418.93443-1-philmd@linaro.org>
This conversion is pretty straight-forward. Standardized some formatting
so the +0 and +4 offset cases can recycle the same message.
Signed-off-by: Daniel Hoffman <dhoff749@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231118231129.2840388-1-dhoff749@gmail.com>
[PMD: Fixed few string formats]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since the pkgsrc-2023Q3 release [*], the py-expat package has been
merged into the base 'python' package:
- Several packages have been folded into base packages. While the
result is simpler, those updating may need to force-remove the
secondary packages, depending on the update method. When doing
make replace, one has to pkg_delete -f the secondary packages.
pkgin handles at least the python packages correctly, removing the
split package when updating python. Specific packages and the
former packages now included:
* cairo: cairo-gobject
* python: py-cElementTree py-curses py-cursespanel py-expat
py-readline py-sqlite3
Remove py311-expat from the package list in order to avoid:
### Installing packages ...
processing remote summary (http://cdn.NetBSD.org/pub/pkgsrc/packages/NetBSD/amd64/9.3/All)...
database for http://cdn.NetBSD.org/pub/pkgsrc/packages/NetBSD/amd64/9.3/All is up-to-date
py311-expat is not available in the repository
...
calculating dependencies.../py311-expat is not available in the repository
pkg_install error log can be found in /var/db/pkgin/pkg_install-err.log
[*] https://mail-index.netbsd.org/netbsd-announce/2024/01/01/msg000360.html
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2109
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240117140746.23511-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
ISM devices are sensitive to manipulation of the IOMMU, so the ISM device
needs to be reset before the vfio-pci device is reset (triggering a full
UNMAP). In order to ensure this occurs, trigger ISM device resets from
subsystem_reset before triggering the PCI bus reset (which will also
trigger vfio-pci reset). This only needs to be done for ISM devices
which were enabled for use by the guest.
Further, ensure that AIF is disabled as part of the reset event.
Fixes: ef1535901a ("s390x: do a subsystem reset before the unprotect on reboot")
Fixes: 03451953c7 ("s390x/pci: reset ISM passthrough devices on shutdown and system reset")
Reported-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-ID: <20240118185151.265329-4-mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Typically we refresh the host fh during CLP enable, however it's possible
that the device goes through multiple reset events before the guest
performs another CLP enable. Let's handle this for now by refreshing the
host handle from vfio before disabling aif.
Fixes: 03451953c7 ("s390x/pci: reset ISM passthrough devices on shutdown and system reset")
Reported-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-ID: <20240118185151.265329-3-mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Use a flag to keep track of whether AIF is currently enabled. This can be
used to avoid enabling/disabling AIF multiple times as well as to determine
whether or not it should be disabled during reset processing.
Fixes: d0bc7091c2 ("s390x/pci: enable adapter event notification for interpreted devices")
Reported-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-ID: <20240118185151.265329-2-mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
By default, the timeout to receive any specified event from the QEMU VM is 60
seconds set by the python avocado test framework. Please see event_wait() and
events_wait() in python/qemu/machine/machine.py. If the matching event is not
triggered within that interval, an asyncio.TimeoutError is generated. Since the
timeout for the bits avocado test is 200 secs, we need to make event_wait()
timeout of the same value as well so that an early timeout is not triggered by
the avocado framework.
CC: peter.maydell@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2077
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240117042556.3360190-1-anisinha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
j is used while loading an ELF file to byteswap segments'
data. If data is larger than 2GB an overflow may happen.
So j should be elf_word.
This commit fixes a minor bug: it's unlikely anybody is trying to
load ELF files with 2GB+ segments for wrong-endianness targets,
but if they did, it wouldn't work correctly.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: 7ef295ea5b ("loader: Add data swap option to load-elf")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It's found that some of the CPU type names in the array of valid
CPU types are invalid because their corresponding classes aren't
registered, as reported by Peter Maydell.
[gshan@gshan build]$ ./qemu-system-arm -machine virt -cpu cortex-a9
qemu-system-arm: Invalid CPU model: cortex-a9
The valid models are: cortex-a7, cortex-a15, (null), (null), (null),
(null), (null), (null), (null), (null), (null), (null), (null), max
Fix it by consolidating the array of valid CPU types. After it's
applied, we have the following output when TCG is enabled.
[gshan@gshan build]$ ./qemu-system-arm -machine virt -cpu cortex-a9
qemu-system-arm: Invalid CPU model: cortex-a9
The valid models are: cortex-a7, cortex-a15, max
[gshan@gshan build]$ ./qemu-system-aarch64 -machine virt -cpu cortex-a9
qemu-system-aarch64: Invalid CPU model: cortex-a9
The valid models are: cortex-a7, cortex-a15, cortex-a35, cortex-a55,
cortex-a72, cortex-a76, cortex-a710, a64fx, neoverse-n1, neoverse-v1,
neoverse-n2, cortex-a53, cortex-a57, max
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2084
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20240111051054.83304-1-gshan@redhat.com
Fixes: fa8c617791 ("hw/arm/virt: Check CPU type in machine_run_board_init()")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
make check-tcg fails on Fedora with:
vtimer.c:9:10: fatal error: inttypes.h: No such file or directory
Fedora has a minimal aarch64 cross-compiler, which satisfies the
configure checks, so it's chosen instead of the dockerized one.
There is no cross-version of inttypes.h, however.
Fix by using stdint.h instead. The test does not require anything
from inttypes.h anyway.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240108125030.58569-1-iii@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On LoongArch kvm mode if transparent huge page wants to be enabled, base
address and size of memslot from both HVA and GPA view. And LoongArch
supports both 4K and 16K page size with Linux kernel, so transparent huge
page size is calculated from real page size rather than hardcoded size.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240115073244.174155-1-maobibo@loongson.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems,
is an unsigned type and there is no need to further cast to __u64 which is
another unsigned integer type; widening casts from unsigned integers
zero-extend the value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For PC-relative translation blocks, env->eip changes during the
execution of a translation block, Therefore, QEMU must be able to
recover an instruction's PC just from the TranslationBlock struct and
the instruction data with. Because a TB will not span two pages, QEMU
stores all the low bits of EIP in the instruction data and replaces them
in x86_restore_state_to_opc. Bits 12 and higher (which may vary between
executions of a PCREL TB, since these only use the physical address in
the hash key) are kept unmodified from env->eip. The assumption is that
these bits of EIP, unlike bits 0-11, will not change as the translation
block executes.
Unfortunately, this is incorrect when the CS base is not aligned to a page.
Then the linear address of the instructions (i.e. the one with the
CS base addred) indeed will never span two pages, but bits 12+ of EIP
can actually change. For example, if CS base is 0x80262200 and EIP =
0x6FF4, the first instruction in the translation block will be at linear
address 0x802691F4. Even a very small TB will cross to EIP = 0x7xxx,
while the linear addresses will remain comfortably within a single page.
The fix is simply to use the low bits of the linear address for data[0],
since those don't change. Then x86_restore_state_to_opc uses tb->cs_base
to compute a temporary linear address (referring to some unknown
instruction in the TB, but with the correct values of bits 12 and higher);
the low bits are replaced with data[0], and EIP is obtained by subtracting
again the CS base.
Huge thanks to Mark Cave-Ayland for the image and initial debugging,
and to Gitlab user @kjliew for help with bisecting another occurrence
of (hopefully!) the same bug.
It should be relatively easy to write a testcase that performs MMIO on
an EIP with different bits 12+ than the first instruction of the translation
block; any help is welcome.
Fixes: e3a79e0e87 ("target/i386: Enable TARGET_TB_PCREL", 2022-10-11)
Cc: qemu-stable@nongnu.org
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Richard Henderson <richard.henderson@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1759
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1964
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2012
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With PCREL, we have a page-relative view of EIP, and an
approximation of PC = EIP+CSBASE that is good enough to
detect page crossings. If we try to recompute PC after
masking EIP, we will mess up that approximation and write
a corrupt value to EIP.
We already handled masking properly for PCREL, so the
fix in b5e0d5d2 was only needed for the !PCREL path.
Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240101230617.129349-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The LuringState typedef is defined twice, in include/block/raw-aio.h and
block/io_uring.c. Move it in include/block/aio.h, which is included
everywhere the typedef is needed, since include/block/aio.h already has
to define the forward reference to the struct.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows passing the KVM device node to use as a file
descriptor via /dev/fdset/XX. Passing the device node to
use as a file descriptor allows running qemu unprivileged
even when the user running qemu is not in the kvm group
on distributions where access to /dev/kvm is gated behind
membership of the kvm group (as long as the process invoking
qemu is able to open /dev/kvm and passes the file descriptor
to qemu).
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Message-ID: <20231021134015.1119597-1-daan.j.demeyer@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jazz Jackrabbit has a very unusual VGA setup, where it uses odd/even mode
with 256-color graphics. Probably, it wants to use fast VRAM-to-VRAM
copies without having to store 4 copies of the sprites as needed in mode
X, one for each mod-4 alignment; odd/even mode simplifies the code a
lot if it's okay to place on a 160-pixels horizontal grid.
At the same time, because it wants to use double buffering (a la "mode X")
it uses byte mode, not word mode as is the case in text modes. In order
to implement the combination of odd/even mode (plane number comes from
bit 0 of the address) and byte mode (use all bytes of VRAM, whereas word
mode only uses bytes 0, 2, 4,... on each of the four planes), we need
to separate the effect on the plane number from the effect on the address.
Implementing the modes properly is a mess in QEMU, because it would
change the layout of VRAM and break migration. As an approximation,
shift right when the CPU accesses memory instead of shifting left when
the CRT controller reads it. A hack is needed in order to write font data
properly (see comment in the code), but it works well enough for the game.
Because doubleword and chain4 modes are now independent, chain4 does not
assert anymore that the address is in range. Instead it just returns
all ones and discards writes, like other modes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jazz Jackrabbit uses odd/even mode with 256-color graphics. This is
probably so that it can do very fast blitting with a decent resolution
(two pixels, compared to four pixels for "regular" mode X).
Accesses still use all planes (reads go to the latches and the game uses
read mode 1 so that the CPU always gets 0xFF; writes use the plane mask
register because the game sets bit 2 of the sequencer's memory mode
register). For this to work, QEMU needs to use the code for latched
memory accesses in odd/even mode. The only difference between odd/even
mode and "regular" planar mode is how the plane is computed in read mode
0, and how the planes are masked if the aforementioned bit 2 is reset.
It is almost enough to fix the game. You also need to honor byte/word
mode selection, which is done in the next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The next patch will reuse latched memory access in text modes. Start with
a patch that moves the latched access code out of the "if".
Best reviewed with "git diff -b".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This implements smooth scrolling, as used for example by Commander Keen
and Second Reality.
Unfortunately, this is not enough to avoid tearing in Commander Keen,
because sometimes the wrong start address is used for a frame.
On real EGA, the panning register is sampled on every line, while
the display start is latched for the next frame at the start of the
vertical retrace. On real VGA, the panning register is also latched,
but at the end of the vertical retrace. It looks like Keen exploits
this by only waiting for horizontal retrace when setting the display
start, but implementing it breaks the 256-color Keen games...
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows setting the start address to a high value, and reading the
bottom of the screen from the beginning of VRAM. Commander Keen 4
("Goodbye, Galaxy!") relies on this behavior.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The next patches will introduce more parameters that cause a full
refresh. Instead of adding arguments to get_offsets and lines to
update_basic_params, do everything through a struct.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The constant-expression bswap is provided by const_le32(), and GET_PLANE()
can also be implemented using cpu_to_le32(). Remove the custom macros in
vga.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
target/hppa qemu v8.2 regression fixes
There were some regressions introduced with Qemu v8.2 on the hppa/hppa64
target, e.g.:
- 32-bit HP-UX crashes on B160L (32-bit) machine
- NetBSD boot failure due to power button in page zero
- NetBSD FPU detection failure
- OpenBSD 7.4 boot failure
This patch series fixes those known regressions and additionally:
- allows usage of the max. 3840MB of memory (instead of 3GB),
- adds support for the qemu --nodefaults option (to debug other devices)
This patch set will not fix those known (non-regression) bugs:
- HP-UX and NetBSD still fail to boot on the new 64-bit C3700 machine
- Linux kernel will still fail to boot on C3700 as long as kernel modules are used.
Changes v2->v3:
- Added comment about Figures H-10 and H-11 in the parisc2.0 spec
in patch which calculate PDC address translation if PSW.W=0
- Introduce and use hppa_set_ior_and_isr()
- Use drive_get_max_bus(IF_SCSI), nd_table[] and serial_hd() to check
if default devices should be created
- Added Tested-by and Reviewed-by tags
Changes v1->v2:
- fix OpenBSD boot with SeaBIOS v15 instead of v14
- commit message enhancements suggested by BALATON Zoltan
- use uint64_t for ram_max in patch #1
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZaImPQAKCRD3ErUQojoP
# X2C5AP9fbIkCni45JU6KC6OmFsCbAReRQCPwLO+MzR8/us2ywgD+PsGxSBk8ASxM
# nqtv3J9JC3i+XSnbtwLV+qChnO+IXwc=
# =FAMY
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 13 Jan 2024 05:57:17 GMT
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'hppa-fixes-8.2-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: Update SeaBIOS-hppa to version 15
target/hppa: Fix IOR and ISR on error in probe
target/hppa: Fix IOR and ISR on unaligned access trap
target/hppa: Export function hppa_set_ior_and_isr()
target/hppa: Avoid accessing %gr0 when raising exception
hw/hppa: Move software power button address back into PDC
target/hppa: Fix PDC address translation on PA2.0 with PSW.W=0
hw/pci-host/astro: Add missing astro & elroy registers for NetBSD
hw/hppa/machine: Disable default devices with --nodefaults option
hw/hppa/machine: Allow up to 3840 MB total memory
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull request 2nd batch for 9.0
- Het's cleanup on migration qmp command paths
- Fabiano's migration cleanups and test improvements
- Fabiano's patch to re-enable multifd-cancel test
- Peter's migration doc reorganizations
- Nick Briggs's fix for Solaries build on rdma
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZaX1PhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wZSzwEAq6sp/ylNHLzNoMdWL28JLqCsb4DPYH2i
# u7XgYgT1qDAA/0vwoe4a5uFn1aaGCS+2d2syjJ8kOE7h+eZrbK520jsA
# =1zUG
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 16 Jan 2024 03:17:18 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240116-pull-request' of https://gitlab.com/peterx/qemu:
migration/rdma: define htonll/ntohll only if not predefined
docs/migration: Further move virtio to be feature of migration
docs/migration: Further move vfio to be feature of migration
docs/migration: Organize "Postcopy" page
docs/migration: Split "dirty limit"
docs/migration: Split "Postcopy"
docs/migration: Split "Debugging" and "Firmware"
docs/migration: Split "Backwards compatibility" separately
docs/migration: Convert virtio.txt into rST
docs/migration: Create index page
docs/migration: Create migration/ directory
tests/qtest: Re-enable multifd cancel test
tests/qtest/migration: Use the new migration_test_add
tests/qtest/migration: Add a wrapper to print test names
tests/qtest/migration: Print migration incoming errors
migration: Report error in incoming migration
migration/multifd: Change multifd_pages_init argument
migration/multifd: Remove QEMUFile from where it is not needed
migration/multifd: Remove MultiFDPages_t::packet_num
migration: Simplify initial conditionals in migration for better readability
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When variables are used without being initialized, there is potential
to take advantage of data that was pre-existing on the stack from an
earlier call, to drive an exploit.
It is good practice to always initialize variables, and the compiler
can warn about flaws when -Wuninitialized is present. This warning,
however, is by no means foolproof with its output varying depending
on compiler version and which optimizations are enabled.
The -ftrivial-auto-var-init option can be used to tell the compiler
to always initialize all variables. This increases the security and
predictability of the program, closing off certain attack vectors,
reducing the risk of unsafe memory disclosure.
While the option takes several possible values, using 'zero' is
considered to be the option that is likely to lead to semantically
correct or safe behaviour[1]. eg sizes/indexes are not likely to
lead to out-of-bounds accesses when initialized to zero. Pointers
are less likely to point something useful if initialized to zero.
Even with -ftrivial-auto-var-init=zero set, GCC will still issue
warnings with -Wuninitialized if it discovers a problem, so we are
not loosing diagnostics for developers, just hardening runtime
behaviour and making QEMU behave more predictably in case of hitting
bad codepaths.
[1] https://lists.llvm.org/pipermail/cfe-dev/2020-April/065221.html
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240103123414.2401208-3-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
To quote wikipedia:
"Return-oriented programming (ROP) is a computer security exploit
technique that allows an attacker to execute code in the presence
of security defenses such as executable space protection and code
signing.
In this technique, an attacker gains control of the call stack to
hijack program control flow and then executes carefully chosen
machine instruction sequences that are already present in the
machine's memory, called "gadgets". Each gadget typically ends in
a return instruction and is located in a subroutine within the
existing program and/or shared library code. Chained together,
these gadgets allow an attacker to perform arbitrary operations
on a machine employing defenses that thwart simpler attacks."
QEMU is by no means perfect with an ever growing set of CVEs from
flawed hardware device emulation, which could potentially be
exploited using ROP techniques.
Since GCC 11 there has been a compiler option that can mitigate
against this exploit technique:
-fzero-call-user-regs
To understand it refer to these two resources:
https://www.jerkeby.se/newsletter/posts/rop-reduction-zero-call-user-regs/https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552262.html
I used two programs to scan qemu-system-x86_64 for ROP gadgets:
https://github.com/0vercl0k/rphttps://github.com/JonathanSalwan/ROPgadget
When asked to find 8 byte gadgets, the 'rp' tool reports:
A total of 440278 gadgets found.
You decided to keep only the unique ones, 156143 unique gadgets found.
While the ROPgadget tool reports:
Unique gadgets found: 353122
With the --ropchain argument, the latter attempts to use the found
gadgets to product a chain that can execute arbitrary syscalls. With
current QEMU it succeeds in this task, which is an undesirable
situation.
With QEMU modified to use -fzero-call-user-regs=used-gpr the 'rp' tool
reports
A total of 528991 gadgets found.
You decided to keep only the unique ones, 121128 unique gadgets found.
This is 22% fewer unique gadgets
While the ROPgadget tool reports:
Unique gadgets found: 328605
This is 7% fewer unique gadgets. Crucially though, despite this more
modest reduction, the ROPgadget tool is no longer able to identify a
chain of gadgets for executing arbitrary syscalls. It fails at the
very first step, unable to find gadgets for populating registers for
a future syscall. Having said that, more advanced tools do still
manage to put together a viable ROP chain.
Also this only takes into account QEMU code. QEMU links to many 3rd
party shared libraries and ideally all of them would be compiled with
this same hardening. That becomes a distro policy question though.
In terms of performance impact, TCG was used as an evaluation test
case. We're not interested in protecting TCG since it isn't designed
to provide a security barrier, but it is performance sensitive code,
so useful as a guide to how other areas of QEMU might be impacted.
With the -fzero-call-user-regs=used-gpr argument present, using the
real world test of booting a linux kernel and having init immediately
poweroff, there is a ~1% slow down in performance under TCG. The QEMU
binary size also grows by approximately 1%.
By comparison, using the more aggressive -fzero-call-user-regs=all,
results in a slowdown of over 25% in TCG, which is clearly not an
acceptable impact, and a binary size increase of 5%.
Considering that 'used-gpr' successfully stopped ROPgadget assembling
a chain, this more targeted protection is a justifiable hardening
/ performance tradeoff.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240103123414.2401208-2-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The npcm7xx_watchdog_timer-test can take more than 60 seconds in
SPEED=slow mode on a loaded host system.
Bumping to 2 minutes will give more headroom.
Message-ID: <20240112164717.1063954-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The test_prescaler() part in the npcm7xx_watchdog_timer test is quite
repetitive, testing all possible combinations of the WTCLK and WTIS
bitfields. Since each test spins up a new instance of QEMU, this is
rather an expensive test, especially on loaded host systems.
For the normal quick test mode, it should be sufficient to test the
corner settings of these fields (i.e. 0 and 3), so we can speed up
this test in the default mode quite a bit.
Message-ID: <20240115070223.30178-1-thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Our usage of gtest results in us losing the very basic functionality
of "knowing which test failed". The issue is that gtest only prints
test names ("paths" in gtest parlance) once the test has finished, but
we use asserts in the tests and crash gtest itself before it can print
anything. We also use a final abort when the result of g_test_run is
not 0.
Depending on how the test failed/broke we can see the function that
trigged the abort, which may be representative of the test, but it
could also just be some generic function.
We have been relying on the primitive method of looking at the name of
the previous successful test and then looking at the code to figure
out which test should have come next.
Add a wrapper to the test registration that does the job of printing
the test name before running.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240104142144.9680-7-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
In arm_pamax(), we need to cope with the virt board calling this
function on a CPU object which has been inited but not realize.
We used to do propagation of feature-flag implications (such as
"V7VE implies LPAE") at realize, so we have some code in arm_pamax()
which manually checks for both V7VE and LPAE feature flags.
In commit b8f7959f28 we moved the feature propagation for
almost all features from realize to post-init. That means that
now when the virt board calls arm_pamax(), the feature propagation
has been done. So we can drop the manual propagation handling
and check only for the feature we actually care about, which
is ARM_FEATURE_LPAE.
Retain the comment that the virt board is calling this function
with a not completely realized CPU object, because that is a
potential beartrap for later changes which is worth calling out.
(Note that b8f7959f28 actually fixed a bug in the arm_pamax()
handling: arm_pamax() was missing a check for ARM_FEATURE_V8, so it
incorrectly thought that the qemu-system-arm 'max' CPU did not have
LPAE and turned off 'highmem' support in the virt board. Following
b8f7959f28 qemu-system-arm 'max' is treated the same as
'cortex-a15' and other v7 LPAE CPUs, because the generic feature
propagation code does correctly propagate V8 -> V7VE -> LPAE.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240109143804.1118307-1-peter.maydell@linaro.org
We don't currently document the syntax of .hx files anywhere
except in a few comments at the top of individual .hx files.
We don't even have somewhere in the developer docs where we
could do this.
Add a new files docs/devel/docs.rst which can be a place to
document how our docs build process works. For the moment,
put in only a brief introductory paragraph and the documentation
of the .hx files. We could later add to this file by for
example describing how the QAPI-schema-to-docs process works,
or anything else that developers might need to know about
how to add documentation.
Make the .hx files refer to this doc file, and clean
up their header comments to be more accurate for the
usage in each file and less cut-n-pasted.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-id: 20231212162313.1742462-1-peter.maydell@linaro.org
Put correct values (depending on CPU arch) into IOR and ISR on fault.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Put correct values (depending on CPU arch) into IOR and ISR on fault.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Move functionality to set IOR and ISR on fault into own
function. This will be used by follow-up patches.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The value of unwind_breg may reference register %r0, but we need to avoid
accessing gr0 directly and use the value 0 instead.
At runtime I've seen unwind_breg being zero with the Linux kernel when
rfi is used to jump to smp_callin().
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Bruno Haible <bruno@clisp.org>
The various operating systems (e.g. Linux, NetBSD) have issues
mapping the power button when it's stored in page zero.
NetBSD even crashes, because it fails to map that page and then
accesses unmapped memory.
Since we now have a consistent memory mapping of PDC in 32-bit
and 64-bit address space (the lower 32-bits of the address are in
sync) the power button can be moved back to PDC space.
This patch fixes the power button on Linux, NetBSD and HP-UX.
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Bruno Haible <bruno@clisp.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Fix the address translation for PDC space on PA2.0 if PSW.W=0.
Basically, for any address in the 32-bit PDC range from 0xf0000000 to
0xf1000000 keep the lower 32-bits and just set the upper 32-bits to
0xfffffff0.
This mapping fixes the emulated power button in PDC space for 32- and
64-bit machines and is how the physical C3700 machine seems to map
PDC.
Figures H-10 and H-11 in the parisc2.0 spec [1] show that the 32-bit
region will be mapped somewhere into a higher and bigger 64-bit PDC
space. The start and end of this 64-bit space is defined by the
physical address bits. But the figures don't specifiy where exactly the
mapping will start inside that region. Tests on a real HP C3700
regarding the address of the power button indicate, that the lower
32-bits will stay the same though.
[1] https://parisc.wiki.kernel.org/images-parisc/7/73/Parisc2.0.pdf
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Bruno Haible <bruno@clisp.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
NetBSD accesses some astro and elroy registers which aren't accessed
by Linux yet. Add emulation for those registers to allow NetBSD to
boot further.
Please note that this patch is not sufficient to completely boot up
NetBSD on the 64-bit C3700 machine yet.
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Bruno Haible <bruno@clisp.org>
Recognize the qemu --nodefaults option, which will disable the
following default devices on hppa:
- lsi53c895a SCSI controller,
- artist graphics card,
- LASI 82596 NIC,
- tulip PCI NIC,
- second serial PCI card,
- USB OHCI controller.
Adding this option is very useful to allow manual testing and
debugging of the other possible devices on the command line.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The physical hardware allows DIMMs of 4 MB size and above, allowing up
to 3840 MB of memory, but is restricted by setup code to 3 GB.
Increase the limit to allow up to the maximum amount of memory.
Btw. the memory area from 0xf000.0000 to 0xffff.ffff is reserved by
the architecture for firmware and I/O memory and can not be used for
standard memory.
An upcoming 64-bit SeaBIOS-hppa firmware will allow more than 3.75GB
on 64-bit HPPA64. In this case the ram_max for the pa20 case will change.
Signed-off-by: Helge Deller <deller@gmx.de>
Noticed-by: Nelson H. F. Beebe <beebe@math.utah.edu>
Fixes: b7746b1194 ("hw/hppa/machine: Restrict the total memory size to 3GB")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Bruno Haible <bruno@clisp.org>
* Fix non-deterministic failures of the 'netdev-socket' qtest
* Fix device presence checking in the virtio-ccw qtest
* Support codespell checking in checkpatch.pl
* Fix emulation of LAE s390x instruction
* Work around htags bug when environment is large
* Some other small clean-ups here and there
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmWgHlgRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbXAnBAAjQve/Jmfp9p8eQmswG7cl/a2TuJ59b9X
# SFRja2PprV/Wp4kxxEJX4er9F2+rlMusNL62LBp/QjZi9u4lCvCmuB7sMa0wEkjr
# BPPBrkxkAT+/8vhGpYg2GrxZv/UOLkycp3sjEp4v5yXWQw+OEBnkZZ+AuHddpnEr
# NKMKss71uQmccvuzD5FMDfbJQcSBD/yGPyFfDrv1RKreYRlbkEDVlcVoZpfoMwQY
# Pl167iDdmjVtsT+4wf8vHo5W/AYKDOjlV6AoujCnJVZnGx6BtDLiF/iNJ/VU1Ty5
# cRxySPT64HG+cGrbRqz9IjDvs++WW5EQn1jPY8NO2XFz3sney6Cs/pLKjqJY9S7P
# kfOXOBZG3zOI1kgd/CSR5b4szg4XvtTZaupczKiGOpYC9klf0oQNXGU5jXi3Csop
# Q332oUgiPeNdOx/4tXobFX6RwVCqLRYZbHx9RRYSxWlqJJPAB74/n+RZsmOtsxuJ
# RaiPKDmbVlslkUm78gIa5e6DMwDk2wmlkqa64W7VZxyqfQTRDPiPvfMGePkj6tmZ
# h9vUsELwwORlHpZyL08n0fzs3aeIYwzPwhfr+5iQZIawIp4Zqo8i8Lic/WfIlok9
# rmPIA0mjs1VtrUsroItw4NcY04xcVa7hkhz4EbkZROrfGamdkLuvbk2OKuOeoL0U
# lpgtQL6jA7E=
# =F/j2
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 11 Jan 2024 16:59:04 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2024-01-11' of https://gitlab.com/thuth/qemu:
.gitlab-ci.d/buildtest.yml: Work around htags bug when environment is large
tests/tcg/s390x: Test LOAD ADDRESS EXTENDED
target/s390x: Fix LAE setting a wrong access register
scripts/checkpatch: Support codespell checking
hw/s390x/ccw: Replace dirname() with g_path_get_dirname()
hw/s390x/ccw: Replace basename() with g_path_get_basename()
target/s390x/kvm/pv: Provide some more useful information if decryption fails
gitlab: fix s390x tag for avocado-system-centos
tests/qtest/virtio-ccw: Fix device presence checking
qtest: ensure netdev-socket tests have non-overlapping names
net: handle QIOTask completion to report useful error message
net: add explicit info about connecting/listening state
Revert "tests/qtest/netdev-socket: Raise connection timeout to 120 seconds"
Revert "osdep: add getloadavg"
Revert "netdev: set timeout depending on loadavg"
qtest: use correct boolean type for failover property
q800: move dp8393x_prom memory region to Q800MachineState
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There are warning messages printed from tests/qtest/numa-test.c,
to complain the CPU cluster and NUMA node boundary is broken. Since
the broken boundary is expected, we don't want to see the warning
messages.
# cd /home/gavin/sandbox/qemu.main/build
# MALLOC_PERTURB_=255 QTEST_QEMU_BINARY=./qemu-system-aarch64 \
G_TEST_DBUS_DAEMON=../tests/dbus-vmstate-daemon.sh \
QTEST_QEMU_IMG=./qemu-img \
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
tests/qtest/numa-test --tap -k
:
qemu-system-aarch64: warning: CPU-0 and CPU-4 in socket-0-cluster-0 \
have been associated with node-0 and node-1 respectively. \
It can cause OSes like Linux to misbehave
:
Skip the invalidation of CPU cluster and NUMA node boundary when
qtest is enabled, to avoid the warning messages.
Fixes: a494fdb715 ("numa: Validate cluster and NUMA node boundary if required")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is now expected by rtd so I've expanded using their example as
22.04 is one of our supported platforms. I tried to work out if there
was an easy way to re-generate a requirements.txt from our
pythondeps.toml but in the end went for the easier solution.
Cc: <qemu-stable@nongnu.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231221174200.2693694-1-alex.bennee@linaro.org>
The mtest2make.py script passes the arg '-t 0' to 'meson test' which
disables all test timeouts. This is a major source of pain when running
in GitLab CI and a test gets stuck. It will stall until GitLab kills the
CI job. This leaves us with little easily consumable information about
the stalled test. The TAP format doesn't show the test name until it is
completed, and TAP output from multiple tests it interleaved. So we
have to analyse the log to figure out what tests had un-finished TAP
output present and thus infer which test case caused the hang. This is
very time consuming and error prone.
By allowing meson to kill stalled tests, we get a direct display of what
test program got stuck, which lets us more directly focus in on what
specific test case within the test program hung.
The other issue with disabling meson test timeouts by default is that it
makes it more likely that maintainers inadvertantly introduce slowdowns.
For example the recent-ish change that accidentally made migrate-test
take 15-20 minutes instead of around 1 minute.
The main risk of this change is that the individual test timeouts might
be too short to allow completion in high load scenarios. Thus, there is
likely to be some short term pain where we have to bump the timeouts for
certain tests to make them reliable enough. The preceeding few patches
raised the timeouts for all failures that were immediately apparent
in GitLab CI.
Even with the possible short term instability, this should still be a
net win for debuggability of failed CI pipelines over the long term.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230717182859.707658-13-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-17-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
When running the tests in slow mode with --enable-debug on a very loaded
system, the fp-test-mulAdd test can take longer than 2 minutes. Bump the
timeout to three minutes to make sure it passes in such situations, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-16-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
When running the tests in slow mode on a very loaded system and with
--enable-debug, the test-crypto-block can take longer than 4 minutes.
Bump the timeout to 5 minutes to make sure that it also passes in
such situations.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-15-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
When running the tests in slow mode on a very loaded system and with
--enable-debug, the test-aio-multithread can take longer than 1 minute.
Bump the timeout to two minutes to make sure that it also passes in
such situations.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-14-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
When running the test in slow mode on a very loaded system with the
arm/aarch64 target and with --enable-debug, it can take longer than
10 minutes to finish the introspection test. Bump the timeout to twelve
minutes to make sure that it also finishes in such situations.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-13-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
On a loaded system with --enable-debug, this test can take longer than
5 minutes. Raising the timeout to 6 minutes gives greater headroom for
such situations.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[thuth: Increase the timeout to 6 minutes for very loaded systems]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-11-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The prom-env-test can take more than 5 minutes in a --enable-debug
build on a loaded system. Bumping to 6 minutes will give more headroom.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[thuth: Bump timeout to 6 minutes instead of 3]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-8-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The pxe-test uses the boot_sector_test() function, and that already
uses a timeout of 600 seconds. So adjust the timeout on the meson
side accordingly.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[thuth: Bump timeout to 600s and adjust commit description]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-7-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The migration test should take between 1 min 30 and 2 mins on reasonably
modern hardware. The test is not especially compute bound, rather its
running time is dominated by the guest RAM size relative to the
bandwidth cap, which forces each iteration to take at least 30 seconds.
None the less under high load conditions with multiple QEMU processes
spawned and competing with other parallel tests, the worst case running
time might be somewhat extended. Bumping the timeout to 8 minutes gives
us good headroom, while still catching stuck tests relatively quickly.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230717182859.707658-3-berrange@redhat.com>
[thuth: Bump timeout to 8 minutes to make it work on very loaded systems, too]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-3-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The function qemu_chr_fe_init already treats be->fe_open as a bool and
if it acts like a bool it should be one. While we are at it make the
variable name more descriptive and add kdoc decorations.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231211145959.93759-1-alex.bennee@linaro.org>
We've already got a test for a big endian microblaze machine, but so
far we lack one for a little endian machine. Now that the QEMU advent
calendar featured such an image, we can test the little endian mode,
too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215161851.71508-1-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Sometimes the CI "pages" job fails with a message like this from
htags:
$ htags -anT --tree-view=filetree -m qemu_init -t "Welcome to the QEMU sourcecode"
htags: Negative exec line limit = -371
This is due to a bug in hflags where if the environment is too large it
falls over:
https://lists.gnu.org/archive/html/bug-global/2024-01/msg00000.html
This happens to us because GitLab CI puts the commit message of the
commit under test into the CI_COMMIT_MESSAGE and/or CI_COMMIT_TAG_MESSAGE
environment variables, so the job will fail if the commit happens to
have a verbose commit message.
Work around the htags bug by unsetting these variables while running
htags.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2080
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240111125543.1573473-1-peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
edk2: update to git snapshot (maybe for-8.2)
This updates edk2 to git master as of today. This picks up a patch
(merged only yesterday, that's why this last-minute PR) which allows to
work around a bug in shim, and enables that workaround in the qemu
firmware builds.
This solves a real-world problem on arm hardware, walk over to
https://gitlab.com/qemu-project/qemu/-/issues/1990 to see the details.
Merging this firmware update that close to the 8.2 release clearly is
not without risks. If I get a 'no', I'm not going to complain.
That said I'm not aware of any bugs, and landing this in 8.2.0 would
make a bunch of folks hanging around in issue 1990 very happy.
Alternative plan would be to merge this after the release, give it some
time for testing, and assuming everything goes well schedule a backport
for 8.2.1
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmV5i70ACgkQTLbY7tPo
# cTg5mA//VDjGNmBYWhIhf5c7Z8+h1FspnqkxqResX3KgE2indCWkTlyZnCFGb7CO
# NgDiCR7xKMw9S1Cun14vTs/OK8BVFvmXGhTIgjecK+k6w6D8PtR4QvfXYUKxNajA
# Sd6reWAlojlgKOkpcrejrSSvtBTZqrJc8CrkowMR3FZXzD0GstUCMZ0jBvVhzlO6
# o9RMk0kbf+VNupsA+v9ZWPstMHXjLKs8v1eUqrc6LYOanY6mqQM5Wz9yWteUfrNp
# /0zShBrkmB+BgPoRQypphFdXRacP82fVXDMeTSbbXaReI0PR9MLKZnyk0UUkES6k
# BTtEVEM0cCAYLGaGFjHZVEpbrtFmVBisE0fLgdozsCU8SMCuxjNzXyj0HGRsJ7m4
# UQ+qGJLOR3Zx/Bnz3LLKOmWBlq6MQD5lYgxk3dwSPKzXTqun1ndlVKenJ3Z9fgXQ
# gibVbS/2fNylR9aoPSYkXnlE8l8vSo24sXIn8R2wX8rJ0xBc6bFDs1MKizzv2b9l
# YUeybDwgDvbbDLGSN4DgIeNSZxQBgNO/nmuFnx8jNxTqcNlCJFHO2jR7gPijj5ct
# ZPQQwLCCEIxD3OY3Dg94zXDm1EfWZQpNBFDD/83joJt/15Vu9GLsPqEs4QUdiQsp
# MO4Bd7HFavLSGsyX1rMe0yonWirbRX2uKYmyc+KwGjjS9LRGesU=
# =bcZj
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 13 Dec 2023 10:47:25 GMT
# gpg: using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* tag 'firmware/edk2-20231213-pull-request' of https://gitlab.com/kraxel/qemu:
tests/acpi: disallow tests/data/acpi/virt/SSDT.memhp changes
tests/acpi: update expected data files
edk2: update binaries to git snapshot
edk2: update build config, set PcdUninstallMemAttrProtocol = TRUE.
edk2: update to git snapshot
tests/acpi: allow tests/data/acpi/virt/SSDT.memhp changes
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pull-loongarch-20240111
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZZ/QKgAKCRBAov/yOSY+
# 34eqBADA48++Z9gETFNheLUHdYEaja2emn+gSaoHLFquyq/l53w8RfrUII+BzV1o
# T7D8xjlVQldAYZzqQn2pQe2S7r4ggfeNmxGxwJbCTW9sooGMwBnU8+Ix3ruSet7K
# gI+UHLU4oHk6jdrT384tux2EG+qUmlLN1c7j4G/z1OzKEwFv7Q==
# =+Pi0
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 11 Jan 2024 11:25:30 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240111' of https://gitlab.com/gaosong/qemu:
hw/intc/loongarch_extioi: Add vmstate post_load support
hw/intc/loongarch_extioi: Add dynamic cpu number support
hw/loongarch/virt: Set iocsr address space per-board rather than percpu
hw/intc/loongarch_ipi: Use MemTxAttrs interface for ipi ops
target/loongarch: Add loongarch kvm into meson build
target/loongarch: Implement set vcpu intr for kvm
target/loongarch: Restrict TCG-specific code
target/loongarch: Implement kvm_arch_handle_exit
target/loongarch: Implement kvm_arch_init_vcpu
target/loongarch: Implement kvm_arch_init function
target/loongarch: Implement kvm get/set registers
target/loongarch: Supplement vcpu env initial when vcpu reset
target/loongarch: Define some kvm_arch interfaces
linux-headers: Synchronize linux headers from linux v6.7.0-rc8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add two spelling check options (--codespell and --codespellfile) to
enhance spelling check through dictionary, which copied the Linux
kernel's implementation in checkpatch.pl.
This check uses the dictionary at "/usr/share/codespell/dictionary.txt"
by default, if there is no dictionary specified under this path, it
will look for the dictionary of python3's codespell (This requires user
to add python3's path in environment variable $PATH, and to install
codespell by "pip install codespell").
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Tested-by: Samuel Tardieu <sam@rfc1149.net>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240105083848.267192-1-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As commit 3e015d815b ("use g_path_get_basename instead of basename")
said, g_path_get_dirname() should be preferred over dirname() since
the former is a portable utility function that has the advantage of not
modifying the string argument.
Replace dirname() with g_path_get_dirname().
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20231221171921.57784-3-zhao1.liu@linux.intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It's a common scenario to copy guest images from one host to another
to run the guest on the other machine. This (of course) does not work
with "secure execution" guests since they are encrypted with one certain
host key. However, if you still (accidentally) do it, you only get a
very user-unfriendly error message that looks like this:
qemu-system-s390x: KVM PV command 2 (KVM_PV_SET_SEC_PARMS) failed:
header rc 108 rrc 5 IOCTL rc: -22
Let's provide at least a somewhat nicer hint to the users so that they
are able to figure out what might have gone wrong.
Buglink: https://issues.redhat.com/browse/RHEL-18212
Message-ID: <20240110142916.850605-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
There are elements sw_ipmap and sw_coremap, which is usd to speed
up irq injection flow. They are saved and restored in vmstate during
migration, indeed they can calculated from hw registers. Here
post_load is added for get sw_ipmap and sw_coremap from extioi hw
state.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20231215100333.3933632-5-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
On LoongArch physical machine, one extioi interrupt controller only
supports 4 cpus. With processor more than 4 cpus, there are multiple
extioi interrupt controllers; if interrupts need to be routed to
other cpus, they are forwarded from extioi node0 to other extioi nodes.
On virt machine model, there is simple extioi interrupt device model.
All cpus can access register of extioi interrupt controller, however
interrupt can only be route to 4 vcpu for compatible with old kernel.
This patch adds dynamic cpu number support about extioi interrupt.
With old kernel legacy extioi model is used, however kernel can detect
and choose new route method in future, so that interrupt can be routed to
all vcpus.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20231215100333.3933632-4-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
LoongArch system has iocsr address space, most iocsr registers are
per-board, however some iocsr register spaces banked for percpu such
as ipi mailbox and extioi interrupt status. For banked iocsr space,
each cpu has the same iocsr space, but separate data.
This patch changes iocsr address space per-board rather percpu,
for iocsr registers specified for cpu, MemTxAttrs.requester_id
can be parsed for the cpu. With this patches, the total address space
on board will be simple, only iocsr address space and system memory,
rather than the number of cpu and system memory.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20231215100333.3933632-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
There are two interface pairs for MemoryRegionOps, read/write and
read_with_attrs/write_with_attrs. The later is better for ipi device
emulation since initial cpu can be parsed from attrs.requester_id.
And requester_id can be overrided for IOCSR_IPI_SEND and mail_send
function when it is to forward message to another vcpu.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20231215100333.3933632-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Implement kvm_arch_init_vcpu interface for loongarch,
in this function, we register VM change state handler.
And when VM state changes to running, the counter value
should be put into kvm to keep consistent with kvm,
and when state change to stop, counter value should be
refreshed from kvm.
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: xianglai li <lixianglai@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240105075804.1228596-7-zhaotianrui@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
tcg/i386: Use more 8-bit immediate forms for add, sub, or, xor
tcg/ppc: Use new registers for LQ destination
util: fix build with musl libc on ppc64le
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmWfESodHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8OLQf/TnNOeBPGFVFRLycp
# rRbLxFar/oRP0SfH7I1S09vKFH+mlb5JK5Er4DL9CmUxV596r9ZGiwC6RlowK8nD
# INfC9Nnf3MgeyViDG41bA5oxiWom+XxbFtN4iVZo84CVDFEZFt0xjaq7d9Zhfj9J
# xWWAlCr013MnhamjmEB2NKxQzLnIMhJs1JuhkAbThKsaPoDwHLSmIMSMJlRwrf27
# Ey9blEt8GAOkd1iMA0xpw2vthNUfpCgZibg//CzqZevIq8pdxcieQ9ZjuxLjDM32
# N3u3eaX9SyuLwj4682MYuHYIxpuZ+HkIkjmuIQBsBuG8d3EoDs+rr/9Jzi47f/nR
# 0btVug==
# =rXXF
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 10 Jan 2024 21:50:34 GMT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-tcg-20240111' of https://gitlab.com/rth7680/qemu:
util: fix build with musl libc on ppc64le
tcg/ppc: Use new registers for LQ destination
tcg/i386: use 8-bit OR or XOR for unsigned 8-bit immediates
tcg/i386: convert add/sub of 128 to sub/add of -128
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
An apparent copy-paste error tests for the presence of the
virtio-rng-ccw device in order to perform tests on the virtio-scsi-ccw
device.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Message-ID: <20240106130121.1244993-1-sam@rfc1149.net>
Fixes: 65331bf5d1 ("tests/qtest: Check for virtio-ccw devices before using them")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The network stream backend uses the async QIO socket APIs for listening
and connecting sockets. It does not check the task object completion
status, however, instead just looking at whether the socket FD is -1
or not.
By checking the task completion, we can set a useful error message for
users instead of the non-actionable "connection error" string.
eg so users will see:
(qemu) info network
net: index=0,type=stream,error: Failed to connect to '/foo.unix': No such file or directory
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240104162942.211458-6-berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When running 'info network', if the stream backend is still in
the process of connecting, or waiting for an incoming connection,
no information is displayed.
There is also no way to distinguish whether the server is still
in the process of setting up the listener socket, or whether it
is ready to accept incoming client connections.
This leads to a race condition in the netdev-socket qtest which
launches a server process followed by a client process. Under
high load conditions it is possible for the client to attempt
to connect before the server is accepting clients. For the
scenarios which do not set the 'reconnect' option, this opens
up a race which can lead to the test scenario failing to reach
the expected state.
Now that 'info network' can distinguish between initialization
phase and the listening phase, the netdev-socket qtest will
correctly synchronize, such that the client QEMU is not spawned
until the server is ready.
This should solve the non-deterministic failures seen with the
netdev-socket qtest.
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240104162942.211458-5-berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit 0daaf2761f.
The test was not timing out because of slow execution. It was
timing out due to a race condition leading to the client QEMU
attempting (and fatally failing) to connect before the server
QEMU was listening.
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240104162942.211458-4-berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit cadfc72939.
The test was not timing out because of slow execution. It was
timing out due to a race condition leading to the client QEMU
attempting (and fatally failing) to connect before the server
QEMU was listening.
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240104162942.211458-2-berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QMP device_add does not historically validate the parameter types.
At some point it will likely change to enforce correct types, to
match behaviour of -device. The failover property is expected to
be a boolean in JSON.
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240103123005.2400437-1-berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
LQ has a constraint that RTp != RA, else SIGILL.
Therefore, force the destination of INDEX_op_qemu_*_ld128 to be a
new register pair, so that it cannot overlap the input address.
This requires new support in process_op_defs and tcg_reg_alloc_op.
Cc: qemu-stable@nongnu.org
Fixes: 526cd4ec01 ("tcg/ppc: Support 128-bit load/store")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240102013456.131846-1-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the case where OR or XOR has an 8-bit immediate between 128 and 255,
we can operate on a low-byte register and shorten the output by two or
three bytes (two if a prefix byte is needed for REX.B).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231228120524.70239-1-pbonzini@redhat.com>
[rth: Incorporate into switch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Extend the existing conditional that generates INC/DEC, to also swap an
ADD for a SUB and vice versa when the immediate is 128. This facilitates
using OPC_ARITH_EvIb instead of OPC_ARITH_EvIz.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231228120514.70205-1-pbonzini@redhat.com>
[rth: Use a switch on C]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
RISC-V PR for 9.0
* Make vector whole-register move (vmv) depend on vtype register
* Fix th.dcache.cval1 priviledge check
* Don't allow write mstatus_vs without RVV
* Use hwaddr instead of target_ulong for RV32
* Fix machine IDs QOM getters\
* Fix KVM reg id sizes
* ACPI: Enable AIA, PLIC and update RHCT
* Fix the interrupts-extended property format of PLIC
* Add support for Zacas extension
* Add amocas.[w,d,q] instructions
* Document acpi parameter of virt machine
* RVA22 profiles support
* Remove group setting of KVM AIA if the machine only has 1 socket
* Add RVV CSRs to KVM
* sifive_u: Update S-mode U-Boot image build instructions
* Upgrade OpenSBI from v1.3.1 to v1.4
* pmp: Ignore writes when RW=01 and MML=0
* Assert that the CSR numbers will be correct
* Don't adjust vscause for exceptions
* Ensure mideleg is set correctly on reset
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmWeW8kACgkQr3yVEwxT
# gBMB3BAAtpb7dC/NqDOjo/LjGf81wYUnF0KcfJUIbuHEM9S03mKJEvngV/sUhg+A
# fzsoJazijQZk2+Y02WLT/o+ppRDegb4P6n54Nn13xr024Dn2jf45+EKDLI+vtU5y
# lhwp/LH3SEo2MM/Qr0njl8+jJ7W9adhZeK6x+NFaLaQJ291xupbcwEnScdv2bPAo
# gvbM6yrfUoZ25MsQKIDGssozdGRwOD/keAT0q8C0gKDamqXBDrI80BOVhRms+uLm
# R33DXsAegPKluJTa9gfaWFI0eK34WHXRvSIjE36nZlGNNgqLAVdM2/QozMVz4cKA
# Ymz1nzqB9HeSn1pM4KCK/Y3LH89qLGWtyHYgldiDXA/wSyKajwkbXSWFOT9gPDqV
# i+5BRDvU0zIeMIt+ROqNKgx1Hry6U2aycMNsdHTmygJbGEpiTaXuES5tt+LKsyHe
# w/7a6wPd/kh9LQhXYQ4qbn7L534tWvn8zWyvKLZLxmYPcOn6SdjFbKWmk5ARky2W
# sx9ojn9ANlYaLfzQ3TMRcIhWD6n8Si3KFNiQ3353E8xkRkyfu0WHyXAy8/kIc5UT
# nScO2YD68XkdkcLF6uLUKuGiVZXFWXRY1Ttz9tvEmBckVsg6TIkoMONHeUWNP7ly
# A0bJwN5qEOk6XIYKHWwX5UzvkcfUpOb5VmuLuv3gRoNX0A7/+fc=
# =5K9J
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 10 Jan 2024 08:56:41 GMT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240110' of https://github.com/alistair23/qemu: (65 commits)
target/riscv: Ensure mideleg is set correctly on reset
target/riscv: Don't adjust vscause for exceptions
target/riscv: Assert that the CSR numbers will be correct
target/riscv: pmp: Ignore writes when RW=01 and MML=0
roms/opensbi: Upgrade from v1.3.1 to v1.4
docs/system/riscv: sifive_u: Update S-mode U-Boot image build instructions
target/riscv/kvm: add RVV and Vector CSR regs
target/riscv/kvm: do PR_RISCV_V_SET_CONTROL during realize()
linux-headers: riscv: add ptrace.h
linux-headers: Update to Linux v6.7-rc5
target/riscv/kvm.c: remove group setting of KVM AIA if the machine only has 1 socket
target/riscv: add rva22s64 cpu
target/riscv: add RVA22S64 profile
target/riscv: add 'parent' in profile description
target/riscv: add satp_mode profile support
target/riscv/cpu.c: add riscv_cpu_is_32bit()
target/riscv/cpu.c: finalize satp_mode earlier
target/riscv: add priv ver restriction to profiles
target/riscv: implement svade
target/riscv: add 'rva22u64' CPU
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The CSRs will always be between either CSR_MHPMCOUNTER3 and
CSR_MHPMCOUNTER31 or CSR_MHPMCOUNTER3H and CSR_MHPMCOUNTER31H.
So although ctr_index can't be negative, Coverity doesn't know this and
it isn't obvious to human readers either. Let's add an assert to ensure
that Coverity knows the values will be within range.
To simplify the code let's also change the RV32 adjustment.
Fixes: Coverity CID 1523910
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240108001328.280222-2-alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This patch changes behavior on writing RW=01 to pmpcfg with MML=0.
RWX filed is form of collective WARL with the combination of
pmpcfg.RW=01 remains reserved for future standard use.
According to definition of WARL writing the CSR has no other side
effect. But current implementation change architectural state and
change system behavior. After writing we will get unreadable-unwriteble
region regardless on the previous state.
On the other side WARL said that we should read legal value and nothing
says about what we should write. Current behavior change system state
regardless of whether we read this register or not.
Fixes: ac66f2f0 ("target/riscv: pmp: Ignore writes when RW=01")
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231220153205.11072-1-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently, the documentation outlines the process for building the
S-mode U-Boot image using `make menuconfig` and manual actions within
the menuconfig UI. However, this approach is fragile due to Kconfig
options potentially changing across different releases. For example,
CONFIG_OF_PRIOR_STAGE has been replaced by CONFIG_BOARD since v2022.01
release, and CONFIG_TEXT_BASE has been moved to the 'General setup'
menu from the 'Boot options' menu in v2024.01 release.
This update aims to make the S-mode U-Boot image build instructions
future-proof. It leverages the 'config' script provided in the U-Boot
source tree to edit the .config file, followed by a `make olddefconfig`.
Validated with U-Boot v2024.01 release.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240104071523.273702-1-bmeng@tinylab.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Linux RISC-V vector documentation (Document/arch/riscv/vector.rst)
mandates a prctl() in order to allow an userspace thread to use the
Vector extension from the host.
This is something to be done in realize() time, after init(), when we
already decided whether we're using RVV or not. We don't have a
realize() callback for KVM yet, so add kvm_cpu_realize() and enable RVV
for the thread via PR_RISCV_V_SET_CONTROL.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218204321.75757-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The emulated AIA within the Linux kernel restores the HART index
of the IMSICs according to the configured AIA settings. During
this process, the group setting is used only when the machine
partitions harts into groups. It's unnecessary to set the group
configuration if the machine has only one socket, as its address
space might not contain the group shift.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231218090543.22353-2-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The RVA22S64 profile consists of the following:
- all mandatory extensions of RVA22U64;
- priv spec v1.12.0;
- satp mode sv39;
- Ssccptr, a cache related named feature that we're assuming always
enable since we don't implement a cache;
- Other named features already implemented: Sstvecd, Sstvala,
Sscounterenw;
- the new Svade named feature that was recently added.
Most of the work is already done, so this patch is enough to implement
the profile.
After this patch, the 'rva22s64' user flag alone can be used with the
rva64i CPU to boot Linux:
-cpu rv64i,rva22s64=true
This is the /proc/cpuinfo with this profile enabled:
# cat /proc/cpuinfo
processor : 0
hart : 0
isa : rv64imafdc_zicbom_zicbop_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zfhmin_zca_zcd_zba_zbb_zbs_zkt_svinval_svpbmt
mmu : sv39
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-26-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Certain S-mode profiles, like RVA22S64 and RVA23S64, mandate all the
mandatory extensions of their respective U-mode profiles. RVA22S64
includes all mandatory extensions of RVA22U64, and the same happens with
RVA23 profiles.
Add a 'parent' field to allow profiles to enable other profiles. This
will allow us to describe S-mode profiles by specifying their parent
U-mode profile, then adding just the S-mode specific extensions.
We're naming the field 'parent' to consider the possibility of other
uses (e.g. a s-mode profile including a previous s-mode profile) in the
future.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-25-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
'satp_mode' is a requirement for supervisor profiles like RVA22S64.
User-mode/application profiles like RVA22U64 doesn't care.
Add 'satp_mode' to the profile description. If a profile requires it,
set it during cpu_set_profile(). We'll also check it during finalize()
to validate if the running config implements the profile.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-24-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Next patch will need to retrieve if a given RISCVCPU is 32 or 64 bit.
The existing helper riscv_is_32bit() (hw/riscv/boot.c) will always check
the first CPU of a given hart array, not any given CPU.
Create a helper to retrieve the info for any given CPU, not the first
CPU of the hart array. The helper is using the same 32 bit check that
riscv_cpu_satp_mode_finalize() was doing.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-23-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
'svade' is a RVA22S64 profile requirement, a profile we're going to add
shortly. It is a named feature (i.e. not a formal extension, not defined
in riscv,isa DT at this moment) defined in [1] as:
"Page-fault exceptions are raised when a page is accessed when A bit is
clear, or written when D bit is clear.".
As far as the spec goes, 'svade' is one of the two distinct modes of
handling PTE_A and PTE_D. The other way, i.e. update PTE_A/PTE_D when
they're cleared, is defined by the 'svadu' extension. Checking
cpu_helper.c, get_physical_address(), we can verify that QEMU is
compliant with that: we will update PTE_A/PTE_D if 'svadu' is enabled,
or throw a page-fault exception if 'svadu' isn't enabled.
So, as far as we're concerned, 'svade' translates to 'svadu must be
disabled'.
We'll implement it like 'zic64b': an internal flag that profiles can
enable. The flag will not be exposed to users.
[1] https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-20-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Expose all profile flags for all CPUs when executing
query-cpu-model-expansion. This will allow callers to quickly determine
if a certain profile is implemented by a given CPU. This includes vendor
CPUs - the fact that they don't have profile user flags doesn't mean
that they don't implement the profile.
After this change it's possible to quickly determine if our stock CPUs
implement the existing rva22u64 profile. Here's a few examples:
$ ./build/qemu-system-riscv64 -S -M virt -display none
-qmp tcp:localhost:1234,server,wait=off
$ ./scripts/qmp/qmp-shell localhost:1234
Welcome to the QMP low-level shell!
Connected to QEMU 8.1.50
- As expected, the 'max' CPU implements the rva22u64 profile.
(QEMU) query-cpu-model-expansion type=full model={"name":"max"}
{"return": {"model":
{"name": "rv64", "props": {... "rva22u64": true, ...}}}}
- rv64 is missing "zba", "zbb", "zbs", "zkt" and "zfhmin":
query-cpu-model-expansion type=full model={"name":"rv64"}
{"return": {"model":
{"name": "rv64", "props": {... "rva22u64": false, ...}}}}
query-cpu-model-expansion type=full model={"name":"rv64",
"props":{"zba":true,"zbb":true,"zbs":true,"zkt":true,"zfhmin":true}}
{"return": {"model":
{"name": "rv64", "props": {... "rva22u64": true, ...}}}}
We have no vendor CPUs that supports rva22u64 (veyron-v1 is the closest
- it is missing just 'zkt').
In short, aside from the 'max' CPU, we have no CPUs that supports
rva22u64 by default.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-18-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Enabling a profile and then disabling some of its mandatory extensions
is a valid use. It can be useful for debugging and testing. But the
common expected use of enabling a profile is to enable all its mandatory
extensions.
Add an user warning when mandatory extensions from an enabled profile
are disabled in the command line. We're also going to disable the
profile flag in this case since the profile must include all the
mandatory extensions. This flag can be exposed by QMP to indicate the
actual profile state after the CPU is realized.
After this patch, this will throw warnings:
-cpu rv64,rva22u64=true,zihintpause=false,zicbom=false,zicboz=false
qemu-system-riscv64: warning: Profile rva22u64 mandates disabled extension zihintpause
qemu-system-riscv64: warning: Profile rva22u64 mandates disabled extension zicbom
qemu-system-riscv64: warning: Profile rva22u64 mandates disabled extension zicboz
Note that the following will NOT throw warnings because the profile is
being enabled last, hence all its mandatory extensions will be enabled:
-cpu rv64,zihintpause=false,zicbom=false,zicboz=false,rva22u64=true
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-17-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
RVG behaves like a profile: a single flag enables a set of bits. Right
now we're considering user choice when handling RVG and zicsr/zifencei
and ignoring user choice on MISA bits.
We'll add user warnings for profiles when the user disables its
mandatory extensions in the next patch. We'll do the same thing with RVG
now to keep consistency between RVG and profile handling.
First and foremost, create a new RVG only helper to avoid clogging
riscv_cpu_validate_set_extensions(). We do not want to annoy users with
RVG warnings like we did in the past (see 9b9741c38f), thus we'll only
warn if RVG was user set and the user disabled a RVG extension in the
command line.
For every RVG MISA bit (IMAFD), zicsr and zifencei, the logic then
becomes:
- if enabled, do nothing;
- if disabled and not user set, enable it;
- if disabled and user set, throw a warning that it's a RVG mandatory
extension.
This same logic will be used for profiles in the next patch.
Note that this is a behavior change, where we would error out if the
user disabled either zicsr or zifencei. As long as users are explicitly
disabling things in the command line we'll let them have a go at it, at
least in this step. We'll error out later in the validation if needed.
Other notable changes from the previous RVG code:
- use riscv_cpu_write_misa_bit() instead of manually updating both
env->misa_ext and env->misa_ext_mask;
- set zicsr and zifencei directly. We're already checking if they
were user set and priv version will never fail for these
extensions, making cpu_cfg_ext_auto_update() redundant.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-16-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The profile support is handling multi-letter extensions only. Let's add
support for MISA bits as well.
We'll go through every known MISA bit. If the profile doesn't declare
the bit as mandatory, ignore it. Otherwise, set the bit in env->misa_ext
and env->misa_ext_mask.
Now that we're setting profile MISA bits, one can use the rv64i CPU to boot
Linux using the following options:
-cpu rv64i,rva22u64=true,rv39=true,s=true,zifencei=true
In the near future, when rva22s64 (where, 's', 'zifencei' and sv39 are
mandatory), is implemented, rv64i will be able to boot Linux loading
rva22s64 and no additional flags.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-14-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We already track user choice for multi-letter extensions because we
needed to honor user choice when enabling/disabling extensions during
realize(). We refrained from adding the same mechanism for MISA
extensions since we didn't need it.
Profile support requires tne need to check for user choice for MISA
extensions, so let's add the corresponding hash now. It works like the
existing multi-letter hash (multi_ext_user_opts) but tracking MISA bits
options in the cpu_set_misa_ext_cfg() callback.
Note that we can't re-use the same hash from multi-letter extensions
because that hash uses cpu->cfg offsets as keys, while for MISA
extensions we're using MISA bits as keys.
After adding the user hash in cpu_set_misa_ext_cfg(), setting default
values with object_property_set_bool() in add_misa_properties() will end
up marking the user choice hash with them. Set the default value
manually to avoid it.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231218125334.37184-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The TCG emulation implements all the extensions described in the
RVA22U64 profile, both mandatory and optional. The mandatory extensions
will be enabled via the profile flag. We'll leave the optional
extensions to be enabled by hand.
Given that this is the first profile we're implementing in TCG we'll
need some ground work first:
- all profiles declared in riscv_profiles[] will be exposed to users.
TCG is the main accelerator we're considering when adding profile
support in QEMU, so for now it's safe to assume that all profiles in
riscv_profiles[] will be relevant to TCG;
- we'll not support user profile settings for vendor CPUs. The flags
will still be exposed but users won't be able to change them;
- profile support, albeit available for all non-vendor CPUs, will be
based on top of the new 'rv64i' CPU. Setting a profile to 'true' means
enable all mandatory extensions of this profile, setting it to 'false'
will disable all mandatory profile extensions of the CPU, which will
obliterate preset defaults. This is not a problem for a bare CPU like
rv64i but it can allow for silly scenarios when using other CPUs. E.g.
an user can do "-cpu rv64,rva22u64=false" and have a bunch of default
rv64 extensions disabled. The recommended way of using profiles is the
rv64i CPU, but users are free to experiment.
For now we'll handle multi-letter extensions only. MISA extensions need
additional steps that we'll take care later. At this point we can boot a
Linux buildroot using rva22u64 using the following options:
-cpu rv64i,rva22u64=true,sv39=true,g=true,c=true,s=true
Note that being an usermode/application profile we still need to
explicitly set 's=true' to enable Supervisor mode to boot Linux.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-11-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
KVM does not have the means to support enabling the rva22u64 profile.
The main reasons are:
- we're missing support for some mandatory rva22u64 extensions in the
KVM module;
- we can't make promises about enabling a profile since it all depends
on host support in the end.
We'll revisit this decision in the future if needed. For now mark the
'rva22u64' profile as unavailable when running a KVM CPU:
$ qemu-system-riscv64 -machine virt,accel=kvm -cpu rv64,rva22u64=true
qemu-system-riscv64: can't apply global rv64-riscv-cpu.rva22u64=true:
'rva22u64' is not available with KVM
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231218125334.37184-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The rva22U64 profile, described in:
https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva22-profiles
Contains a set of CPU extensions aimed for 64-bit userspace
applications. Enabling this set to be enabled via a single user flag
makes it convenient to enable a predictable set of features for the CPU,
giving users more predicability when running/testing their workloads.
QEMU implements all possible extensions of this profile. All the so
called 'synthetic extensions' described in the profile that are cache
related are ignored/assumed enabled (Za64rs, Zic64b, Ziccif, Ziccrse,
Ziccamoa, Zicclsm) since we do not implement a cache model.
An abstraction called RISCVCPUProfile is created to store the profile.
'ext_offsets' contains mandatory extensions that QEMU supports. Same
thing with the 'misa_ext' mask. Optional extensions must be enabled
manually in the command line if desired.
The design here is to use the common target/riscv/cpu.c file to store
the profile declaration and export it to the accelerator files. Each
accelerator is then responsible to expose it (or not) to users and how
to enable the extensions.
Next patches will implement the profile for TCG and KVM.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231218125334.37184-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Named features (zic64b the sole example at this moment) aren't expose to
users, thus we need another way to expose them.
Go through each named feature, get its boolean value, do the needed
conversions (bool to qbool, qbool to QObject) and add it to output dict.
Another adjustment is needed: named features are evaluated during
finalize(), so riscv_cpu_finalize_features() needs to be mandatory
regardless of whether we have an input dict or not. Otherwise zic64b
will always return 'false', which is incorrect: the default values of
cache blocksizes ([cbom/cbop/cboz]_blocksize) are set to 64, satisfying
the conditions for zic64b.
Here's an API usage example after this patch:
$ ./build/qemu-system-riscv64 -S -M virt -display none
-qmp tcp:localhost:1234,server,wait=off
$ ./scripts/qmp/qmp-shell localhost:1234
Welcome to the QMP low-level shell!
Connected to QEMU 8.1.50
(QEMU) query-cpu-model-expansion type=full model={"name":"rv64"}
{"return": {"model":
{"name": "rv64", "props": {... "zic64b": true, ...}}}}
zic64b is set to 'true', as expected, since all cache sizes are 64
bytes by default.
If we change one of the cache blocksizes, zic64b is returned as 'false':
(QEMU) query-cpu-model-expansion type=full model={"name":"rv64","props":{"cbom_blocksize":128}}
{"return": {"model":
{"name": "rv64", "props": {... "zic64b": false, ...}}}}
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
zic64b is defined in the RVA22U64 profile [1] as a named feature for
"Cache blocks must be 64 bytes in size, naturally aligned in the address
space". It's a fantasy name for 64 bytes cache blocks. The RVA22U64
profile mandates this feature, meaning that applications using this
profile expects 64 bytes cache blocks.
To make the upcoming RVA22U64 implementation complete, we'll zic64b as
a 'named feature', not a regular extension. This means that:
- it won't be exposed to users;
- it won't be written in riscv,isa.
This will be extended to other named extensions in the future, so we're
creating some common boilerplate for them as well.
zic64b is default to 'true' since we're already using 64 bytes blocks.
If any cache block size (cbo{m,p,z}_blocksize) is changed to something
different than 64, zic64b is set to 'false'.
Our profile implementation will then be able to check the current state
of zic64b and take the appropriate action (e.g. throw a warning).
[1] https://github.com/riscv/riscv-profiles/releases/download/v1.0/profiles.pdf
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
QEMU already implements zicbom (Cache Block Management Operations) and
zicboz (Cache Block Zero Operations). Commit 59cb29d6a5 ("target/riscv:
add Zicbop cbo.prefetch{i, r, m} placeholder") added placeholders for
what would be the instructions for zicbop (Cache Block Prefetch
Operations), which are now no-ops.
The RVA22U64 profile mandates zicbop, which means that applications that
run with this profile might expect zicbop to be present in the riscv,isa
DT and might behave badly if it's absent.
Adding zicbop as an extension will make our future RVA22U64
implementation more in line with what userspace expects and, if/when
cache block prefetch operations became relevant to QEMU, we already have
the extension flag to turn then on/off as needed.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We don't have any form of a 'bare bones' CPU. rv64, our default CPUs,
comes with a lot of defaults. This is fine for most regular uses but
it's not suitable when more control of what is actually loaded in the
CPU is required.
A bare-bones CPU would be annoying to deal with if not by profile
support, a way to load a multitude of extensions with a single flag.
Profile support is going to be implemented shortly, so let's add a CPU
for it.
The new 'rv64i' CPU will have only RVI loaded. It is inspired in the
profile specification that dictates, for RVA22U64 [1]:
"RVA22U64 Mandatory Base
RV64I is the mandatory base ISA for RVA22U64"
And so it seems that RV64I is the mandatory base ISA for all profiles
listed in [1], making it an ideal CPU to use with profile support.
rv64i is a CPU of type TYPE_RISCV_BARE_CPU. It has a mix of features
from pre-existent CPUs:
- it allows extensions to be enabled, like generic CPUs;
- it will not inherit extension defaults, like vendor CPUs.
This is the minimum extension set to boot OpenSBI and buildroot using
rv64i:
./build/qemu-system-riscv64 -nographic -M virt \
-cpu rv64i,sv39=true,g=true,c=true,s=true,u=true
Our minimal riscv,isa in this case will be:
# cat /proc/device-tree/cpus/cpu@0/riscv,isa
rv64imafdc_zicntr_zicsr_zifencei_zihpm_zca_zcd#
[1] https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We'll add a new bare CPU type that won't have any default priv_ver. This
means that the CPU will default to priv_ver = 0, i.e. 1.10.0.
At the same we'll allow these CPUs to enable extensions at will, but
then, if the extension has a priv_ver newer than 1.10, we'll end up
disabling it. Users will then need to manually set priv_ver to something
other than 1.10 to enable the extensions they want, which is not ideal.
Change the setter() of extensions to allow user enabled extensions to
bump the priv_ver of the CPU. This will make it convenient for users to
enable extensions for CPUs that doesn't set a default priv_ver.
This change does not affect any existing CPU: vendor CPUs does not allow
extensions to be enabled, and generic CPUs are already set to priv_ver
LATEST.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Our current logic in get/setters of MISA and multi-letter extensions
works because we have only 2 CPU types, generic and vendor, and by using
"!generic" we're implying that we're talking about vendor CPUs. When adding
a third CPU type this logic will break so let's handle it beforehand.
In set_misa_ext_cfg() and set_multi_ext_cfg(), check for "vendor" cpu instead
of "not generic". The "generic CPU" checks remaining are from
riscv_cpu_add_misa_properties() and cpu_add_multi_ext_prop() before
applying default values for the extensions.
This leaves us with:
- vendor CPUs will not allow extension enablement, all other CPUs will;
- generic CPUs will inherit default values for extensions, all others
won't.
And now we can add a new, third CPU type, that will allow extensions to
be enabled and will not inherit defaults, without changing the existing
logic.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We want to add a new CPU type for bare CPUs that will inherit specific
traits of the 2 existing types:
- it will allow for extensions to be enabled/disabled, like generic
CPUs;
- it will NOT inherit defaults, like vendor CPUs.
We can make this conditions met by adding an explicit type for the
existing vendor CPUs and change the existing logic to not imply that
"not generic" means vendor CPUs.
Let's add the "vendor" CPU type first.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218125334.37184-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The interrupts-extended property of PLIC only has 2 * hart number
fields when KVM enabled, copy 4 * hart number fields to fdt will
expose some uninitialized value.
In this patch, I also refactor the code about the setting of
interrupts-extended property of PLIC for improved readability.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231218090543.22353-1-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
ACPI DSDT generator needs information like ECAM range, PIO range, 32-bit
and 64-bit PCI MMIO range etc related to the PCI host bridge. Instead of
making these values machine specific, create properties for the GPEX
host bridge with default value 0. During initialization, the firmware
can initialize these properties with correct values for the platform.
This basically allows DSDT generator code independent of the machine
specific memory map accesses.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231218150247.466427-11-sunilvl@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
kvm_riscv_reg_id() returns an id encoded with an ulong size, i.e. an u32
size when running TARGET_RISCV32 and u64 when running TARGET_RISCV64.
Rename it to kvm_riscv_reg_id_ulong() to enhance code readability. It'll
be in line with the existing kvm_riscv_reg_id_<size>() helpers.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231208183835.2411523-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
KVM_REG_RISCV_FP_F regs have u32 size according to the API, but by using
kvm_riscv_reg_id() in RISCV_FP_F_REG() we're returning u64 sizes when
running with TARGET_RISCV64. The most likely reason why no one noticed
this is because we're not implementing kvm_cpu_synchronize_state() in
RISC-V yet.
Create a new helper that returns a KVM ID with u32 size and use it in
RISCV_FP_F_REG().
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231208183835.2411523-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
mvendorid is an uint32 property, mimpid/marchid are uint64 properties.
But their getters are returning bools. The reason this went under the
radar for this long is because we have no code using the getters.
The problem can be seem via the 'qom-get' API though. Launching QEMU
with the 'veyron-v1' CPU, a model with:
VEYRON_V1_MVENDORID: 0x61f (1567)
VEYRON_V1_MIMPID: 0x111 (273)
VEYRON_V1_MARCHID: 0x8000000000010000 (9223372036854841344)
This is what the API returns when retrieving these properties:
(qemu) qom-get /machine/soc0/harts[0] mvendorid
true
(qemu) qom-get /machine/soc0/harts[0] mimpid
true
(qemu) qom-get /machine/soc0/harts[0] marchid
true
After this patch:
(qemu) qom-get /machine/soc0/harts[0] mvendorid
1567
(qemu) qom-get /machine/soc0/harts[0] mimpid
273
(qemu) qom-get /machine/soc0/harts[0] marchid
9223372036854841344
Fixes: 1e34150045 ("target/riscv/cpu.c: restrict 'mvendorid' value")
Fixes: a1863ad368 ("target/riscv/cpu.c: restrict 'mimpid' value")
Fixes: d6a427e2c0 ("target/riscv/cpu.c: restrict 'marchid' value")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231211170732.2541368-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The Sv32 page-based virtual-memory scheme described in RISCV privileged
spec Section 5.3 supports 34-bit physical addresses for RV32, so the
PMP scheme must support addresses wider than XLEN for RV32. However,
PMP address register format is still 32 bit wide.
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231123091214.20312-1-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The ratified version of RISC-V V spec section 16.6 says that
`The instructions operate as if EEW=SEW`.
So the whole vector register move instructions depend on the vtype
register that means the whole vector register move instructions should
raise an illegal-instruction exception when vtype.vill=1.
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231129170400.21251-2-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We already print various lines of information when we take an
exception, including the ELR and (if relevant) the FAR. Now
that FEAT_NV means that we might report something other than
the old PSTATE to the guest as the SPSR, it's worth logging
this as well.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
When interpreting CPU dumps where FEAT_NV and FEAT_NV2 are in use,
it's helpful to include the values of HCR_EL2.{NV,NV1,NV2} in the CPU
dump format, as a way of distinguishing when we are in EL1 as part of
executing guest-EL2 and when we are just in normal EL1.
Add the bits to the end of the log line that shows PSTATE and similar
information:
PSTATE=000003c9 ---- EL2h BTYPE=0 NV NV2
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Mark up the cpreginfo structs for the GIC CPU registers to indicate
the offsets from VNCR_EL2, as defined in table D8-66 in rule R_CSRPQ
in the Arm ARM.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Mark up the cpreginfo structs to indicate offsets for system
registers from VNCR_EL2, as defined in table D8-66 in rule R_CSRPQ in
the Arm ARM. This covers all the remaining offsets at 0x200 and
above, except for the GIC ICH_* registers.
(Note that because we don't implement FEAT_SPE, FEAT_TRF,
FEAT_MPAM, FEAT_BRBE or FEAT_AMUv1p1 we don't implement any
of the registers that use offsets at 0x800 and above.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Mark up the cpreginfo structs to indicate offsets for system
registers from VNCR_EL2, as defined in table D8-66 in rule R_CSRPQ in
the Arm ARM. This commit covers offsets 0x168 to 0x1f8.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Mark up the cpreginfo structs to indicate offsets for system
registers from VNCR_EL2, as defined in table D8-66 in rule R_CSRPQ in
the Arm ARM. This commit covers offsets 0x100 to 0x160.
Many (but not all) of the registers in this range have _EL12 aliases,
and the slot in memory is shared between the _EL12 version of the
register and the _EL1 version. Where we programmatically generate
the regdef for the _EL12 register, arrange that its
nv2_redirect_offset is set up correctly to do this.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Mark up the cpreginfo structs to indicate offsets for system
registers from VNCR_EL2, as defined in table D8-66 in rule R_CSRPQ in
the Arm ARM. This commit covers offsets below 0x100; all of these
registers are redirected to memory regardless of the value of
HCR_EL2.NV1.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
If FEAT_NV2 redirects a system register access to a memory offset
from VNCR_EL2, that access might fault. In this case we need to
report the correct syndrome information:
* Data Abort, from same-EL
* no ISS information
* the VNCR bit (bit 13) is set
and the exception must be taken to EL2.
Save an appropriate syndrome template when generating code; we can
then use that to:
* select the right target EL
* reconstitute a correct final syndrome for the data abort
* report the right syndrome if we take a FEAT_RME granule protection
fault on the VNCR-based write
Note that because VNCR is bit 13, we must start keeping bit 13 in
template syndromes, by adjusting ARM_INSN_START_WORD2_SHIFT.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV2 requires that when HCR_EL2.{NV,NV2} == 0b11 then accesses by
EL1 to certain system registers are redirected to RAM. The full list
of affected registers is in the table in rule R_CSRPQ in the Arm ARM.
The registers may be normally accessible at EL1 (like ACTLR_EL1), or
normally UNDEF at EL1 (like HCR_EL2). Some registers redirect to RAM
only when HCR_EL2.NV1 is 0, and some only when HCR_EL2.NV1 is 1;
others trap in both cases.
Add the infrastructure for identifying which registers should be
redirected and turning them into memory accesses.
This code does not set the correct syndrome or arrange for the
exception to be taken to the correct target EL if the access via
VNCR_EL2 faults; we will do that in the next commit.
Subsequent commits will mark up the relevant regdefs to set their
nv2_redirect_offset, and if relevant one of the two flags which
indicates that the redirect happens only for a particular value of
HCR_EL2.NV1.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Under FEAT_NV2, when HCR_EL2.{NV,NV2} == 0b11 at EL1, accesses to the
registers SPSR_EL2, ELR_EL2, ESR_EL2, FAR_EL2 and TFSR_EL2 (which
would UNDEF without FEAT_NV or FEAT_NV2) should instead access the
equivalent EL1 registers SPSR_EL1, ELR_EL1, ESR_EL1, FAR_EL1 and
TFSR_EL1.
Because there are only five registers involved and the encoding for
the EL1 register is identical to that of the EL2 register except
that opc1 is 0, we handle this by finding the EL1 register in the
hash table and using it instead.
Note that traps that apply to direct accesses to the EL1 register,
such as active fine-grained traps or other trap bits, do not trigger
when it is accessed via the EL2 encoding in this way. However, some
traps that are defined by the EL2 register may apply. We therefore
call the EL2 register's accessfn first. The only one of the five
which has such traps is TFSR_EL2: make sure its accessfn correctly
handles both FEAT_NV (where we trap to EL2 without checking ATA bits)
and FEAT_NV2 (where we check ATA bits and then redirect to TFSR_EL1).
(We don't need the NV1 tbflag bit until the next patch, but we
introduce it here to avoid putting the NV, NV1, NV2 bits in an
odd order.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
For FEAT_NV2, a new system register VNCR_EL2 holds the base
address of the memory which nested-guest system register
accesses are redirected to. Implement this register.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Enable FEAT_NV on the 'max' CPU, and stop filtering it out for the
Neoverse N2 and Neoverse V1 CPUs. We continue to downgrade FEAT_NV2
support to FEAT_NV for the latter two CPU types.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV requires that when HCR_EL2.{NV,NV1} == {1,1} the handling
of some of the page table attribute bits changes for the EL1&0
translation regime:
* for block and page descriptors:
- bit [54] holds PXN, not UXN
- bit [53] is RES0, and the effective value of UXN is 0
- bit [6], AP[1], is treated as 0
* for table descriptors, when hierarchical permissions are enabled:
- bit [60] holds PXNTable, not UXNTable
- bit [59] is RES0
- bit [61], APTable[0] is treated as 0
Implement these changes to the page table attribute handling.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV requires (per I_JKLJK) that when HCR_EL2.{NV,NV1} is {1,1} the
unprivileged-access instructions LDTR, STTR etc behave as normal
loads and stores. Implement the check that handles this.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
For FEAT_NV, when HCR_EL2.{NV,NV1} is {1,1} PAN is always disabled
even when the PSTATE.PAN bit is set. Implement this by having
arm_pan_enabled() return false in this situation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Currently the code in target/arm/helper.c mostly checks the PAN bits
in env->pstate or env->uncached_cpsr directly when it wants to know
if PAN is enabled, because in most callsites we know whether we are
in AArch64 or AArch32. We do have an arm_pan_enabled() function, but
we only use it in a few places where the code might run in either an
AArch32 or AArch64 context.
For FEAT_NV, when HCR_EL2.{NV,NV1} is {1,1} PAN is always disabled
even when the PSTATE.PAN bit is set, the "is PAN enabled" test
becomes more complicated. Make all places that check for PAN use
arm_pan_enabled(), so we have a place to put the FEAT_NV test.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
When HCR_EL2.{NV,NV1} is {1,1} we must trap five extra registers to
EL2: VBAR_EL1, ELR_EL1, SPSR_EL1, SCXTNUM_EL1 and TFSR_EL1.
Implement these traps.
This trap does not apply when FEAT_NV2 is implemented and enabled;
include the check that HCR_EL2.NV2 is 0 here, to save us having
to come back and add it later.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV requires that when HCR_EL2.{NV,NV1} == {1,0} and an exception
is taken from EL1 to EL1 then the reported EL in SPSR_EL1.M should be
EL2, not EL1. Implement this behaviour.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV requires that when HCR_EL2.NV is set reads of the CurrentEL
register from EL1 always report EL2 rather than the real EL.
Implement this.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
For FEAT_NV, accesses to system registers and instructions from EL1
which would normally UNDEF there but which work in EL2 need to
instead be trapped to EL2. Detect this both for "we know this will
UNDEF at translate time" and "we found this UNDEFs at runtime", and
make the affected registers trap to EL2 instead.
The Arm ARM defines the set of registers that should trap in terms
of their names; for our implementation this would be both awkward
and inefficent as a test, so we instead trap based on the opc1
field of the sysreg. The regularity of the architectural choice
of encodings for sysregs means that in practice this captures
exactly the correct set of registers.
Regardless of how we try to define the registers this trapping
applies to, there's going to be a certain possibility of breakage
if new architectural features introduce new registers that don't
follow the current rules (FEAT_MEC is one example already visible
in the released sysreg XML, though not yet in the Arm ARM). This
approach seems to me to be straightforward and likely to require
a minimum of manual overrides.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
In handle_sys() we don't do the check for whether the register is
marked as needing an FPU/SVE/SME access check until after we've
handled the special cases covered by ARM_CP_SPECIAL_MASK. This is
conceptually the wrong way around, because if for example we happen
to implement an FPU-access-checked register as ARM_CP_NOP, we should
do the access check first.
Move the access checks up so they are with all the other access
checks, not sandwiched between the special-case read/write handling
and the normal-case read/write handling. This doesn't change
behaviour at the moment, because we happen not to define any
cpregs with both ARM_CPU_{FPU,SVE,SME} and one of the cases
dealt with by ARM_CP_SPECIAL_MASK.
Moving this code also means we have the correct place to put the
FEAT_NV/FEAT_NV2 access handling, which should come after the access
checks and before we try to do any read/write action.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV and FEAT_NV2 will allow EL1 to attempt to access cpregs that
only exist at EL2. This means we're going to want to run their
accessfns when the CPU is at EL1. In almost all cases, the behaviour
we want is "the accessfn returns OK if at EL1".
Mostly the accessfn already does the right thing; in a few cases we
need to explicitly check that the EL is not 1 before applying various
trap controls, or split out an accessfn used both for an _EL1 and an
_EL2 register into two so we can handle the FEAT_NV case correctly
for the _EL2 register.
There are two registers where we want the accessfn to trap for
a FEAT_NV EL1 access: VSTTBR_EL2 and VSTCR_EL2 should UNDEF
an access from NonSecure EL1, not trap to EL2 under FEAT_NV.
The way we have written sel2_access() already results in this
behaviour.
We can identify the registers we care about here because they
all have opc1 == 4 or 5.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
The alias registers like SCTLR_EL12 only exist when HCR_EL2.E2H
is 1; they should UNDEF otherwise. We weren't implementing this.
Add an intercept of the accessfn for these aliases, and implement
the UNDEF check.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
For FEAT_VHE, we define a set of register aliases, so that for instance:
* the SCTLR_EL1 either accesses the real SCTLR_EL1, or (if E2H is 1)
SCTLR_EL2
* a new SCTLR_EL12 register accesses SCTLR_EL1 if E2H is 1
However when we create the 'new_reg' cpreg struct for the SCTLR_EL12
register, we duplicate the information in the SCTLR_EL1 cpreg, which
means the opcode fields are those of SCTLR_EL1, not SCTLR_EL12. This
is a problem for code which looks at the cpreg opcode fields to
determine behaviour (e.g. in access_check_cp_reg()). In practice
the current checks we do there don't intersect with the *_EL12
registers, but for FEAT_NV this will become a problem.
Write the correct values from the encoding into the new_reg struct.
This restores the invariant that the cpreg that you get back
from the hashtable has opcode fields that match the key you used
to retrieve it.
When we call the readfn or writefn for the target register, we
pass it the cpreg struct for that target register, not the one
for the alias, in case the readfn/writefn want to look at the
opcode fields to determine behaviour. This means we need to
interpose custom read/writefns for the e12 aliases.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
The TBFLAG_A64 TB flag bits go in flags2, which for AArch64 guests
we know is 64 bits. However at the moment we use FIELD_EX32() and
FIELD_DP32() to read and write these bits, which only works for
bits 0 to 31. Since we're about to add a flag that uses bit 32,
switch to FIELD_EX64() and FIELD_DP64() so that this will work.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
The HCR_EL2.TSC trap for trapping EL1 execution of SMC instructions
has a behaviour change for FEAT_NV when EL3 is not implemented:
* in older architecture versions TSC was required to have no
effect (i.e. the SMC insn UNDEFs)
* with FEAT_NV, when HCR_EL2.NV == 1 the trap must apply
(i.e. SMC traps to EL2, as it already does in all cases when
EL3 is implemented)
* in newer architecture versions, the behaviour either without
FEAT_NV or with FEAT_NV and HCR_EL2.NV == 0 is relaxed to
an IMPDEF choice between UNDEF and trap-to-EL2 (i.e. it is
permitted to always honour HCR_EL2.TSC) for AArch64 only
Add the condition to honour the trap bit when HCR_EL2.NV == 1. We
leave the HCR_EL2.NV == 0 case with the existing (UNDEF) behaviour,
as our IMPDEF choice (both because it avoids a behaviour change
for older CPU models and because we'd have to distinguish AArch32
from AArch64 if we opted to trap to EL2).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
When FEAT_NV is turned on via the HCR_EL2.NV bit, ERET instructions
are trapped, with the same syndrome information as for the existing
FEAT_FGT fine-grained trap (in the pseudocode this is handled in
AArch64.CheckForEretTrap()).
Rename the DisasContext and tbflag bits to reflect that they are
no longer exclusively for FGT traps, and set the tbflag bit when
FEAT_NV is enabled as well as when the FGT is enabled.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
FEAT_NV defines three new bits in HCR_EL2: NV, NV1 and AT. When the
feature is enabled, allow these bits to be written, and flush the
TLBs for the bits which affect page table interpretation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
The hypervisor can deliver (virtual) LPIs to a guest by setting up a
list register to have an intid which is an LPI. The GIC has to treat
these a little differently to standard interrupt IDs, because LPIs
have no Active state, and so the guest will only EOI them, it will
not also deactivate them. So icv_eoir_write() must do two things:
* if the LPI ID is not in any list register, we drop the
priority but do not increment the EOI count
* if the LPI ID is in a list register, we immediately deactivate
it, regardless of the split-drop-and-deactivate control
This can be seen in the VirtualWriteEOIR0() and VirtualWriteEOIR1()
pseudocode in the GICv3 architecture specification.
Without this fix, potentially a hypervisor guest might stall because
LPIs get stuck in a bogus Active+Pending state.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
The CTR_EL0 register has some bits which allow the implementation to
tell the guest that it does not need to do cache maintenance for
data-to-instruction coherence and instruction-to-data coherence.
QEMU doesn't emulate caches and so our cache maintenance insns are
all NOPs.
We already have some models of specific CPUs where we set these bits
(e.g. the Neoverse V1), but the 'max' CPU still uses the settings it
inherits from Cortex-A57. Set the bits for 'max' as well, so the
guest doesn't need to do unnecessary work.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
A SoC will not have a direct access to the NVIC embedded in its ARM
core. By aliasing the "num-prio-bits" property similarly to what is
done for the "num-irq" one, a SoC can easily configure it on its
armv7m instance.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240106181503.1746200-3-sam@rfc1149.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Record/replay fixes for replay_kernel tests
- add a 32 bit x86 replay test case
- fix some typos
- use modern snapshot setting for tests
- update replay_dump for current ABI
- remove stale replay variables
- improve kdoc for ReplayState
- introduce common error path for replay
- always fully drain chardevs when in replay
- catch unexpected waitio on playback
- remove flaky tags from replay_kernel tests
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmWcAJgACgkQ+9DbCVqe
# KkS/TQf+PuIPtuX71ENajfRBjz6450IbGqLUJ1HEaPGYGRj+fR6rg5g5u8qaBrT7
# TUv9ef9L22NtyL+Gbs1OGpGDWKoqV6RQc+A/MHa8IKFpcS24nUo3k4psIC6NSGRH
# 6w3++fPC1Q5cDk9Lei3Qt8fXzcnUZz+NTiIK05aC0xh7D6uGfdADvKqHeLav7qi+
# X2ztNdBsy/WJWCuWcMVzb/dGwDBtuyyxvqTD4EF+zn+gSYq9od2G8XdF+0o6ZVLM
# mXEHwNwB6UjOkLt2cYaay59SXcJFvwxKbEGTDnA7T+kgd3rknuBaWdVBIazoSPQh
# +522nPz5qq/3wO1l7+iQXuvd38fWyw==
# =nKRx
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 08 Jan 2024 14:03:04 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-replay-fixes-080124-1' of https://gitlab.com/stsquad/qemu:
tests/avocado: remove skips from replay_kernel
chardev: force write all when recording replay logs
replay: stop us hanging in rr_wait_io_event
replay/replay-char: use report_sync_error
replay: introduce a central report point for sync errors
replay: make has_unread_data a bool
replay: add proper kdoc for ReplayState
replay: remove host_clock_last
scripts/replay_dump: track total number of instructions
scripts/replay-dump: update to latest format
tests/avocado: modernise the drive args for replay_linux
tests/avocado: fix typo in replay_linux
tests/avocado: add a simple i386 replay kernel test
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().
The "iothread" name is historic and comes from when the main thread was
split into into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.
The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)
There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
aio_context_set_aio_params() doesn't use its undocumented
Error** argument. Remove it to simplify.
Note this removes a use of "unchecked Error**" in
iothread_set_aio_context_params().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231120171806.19361-1-philmd@linaro.org>
QEMU complains about us not being explicit with setting snapshot so
lets do that. Also as cdroms are RO media we don't need to jump the
hoops of setting up snapshots and replay disks - just declare the
drive is a cdrom and nothing should change.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231211091346.14616-4-alex.bennee@linaro.org>
vfio queue:
* Minor cleanups
* Fix for a regression in device reset introduced in 8.2
* Coverity fixes, including the removal of the iommufd backend mutex
* Introduced VFIOIOMMUClass, to avoid compiling spapr when !CONFIG_PSERIES
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmWbIrcACgkQUaNDx8/7
# 7KFtPRAAxWcH9uh4tjJe4CgL+wXC+JOgviiNaI3AS6KmxdTHXcAvXMNAiGJfTBo4
# y/lJg+PYNgcDWrOqZqp1jj6ulWpO8ekLD9Nxv03e6o3kaArX/o2MtsrndOtWYnG/
# CUrr+/kTNeEw9008OaOca9vuh03xh3AnSwb3DzjHTvpMkj5LTXzuE1mU50DTUkn9
# GZjuN3rqHcdjJ/fXpiS6IgJbxcxLdo2aSykmyuq+TZmGf02lTES94PRef3Btr7Q6
# sKQZpv+A+gcZ8DHDJqfOEzEgu1OSa257q4ic47O1X3CeSyiGTGQ7rVKHtX6bK7xP
# mB9WOVqzzdH/g+kHNG+kVXMCQXZ0qo7VlIkHabYD220RryZBCqMecQ4aKPLFULQE
# e7C5ZaEvb7TLe/EaEQUSFrLCns7Nq6ciurcoAmP0cn2Ef1Sr1luNQVAR9LWRH1pc
# 1TeNmHy4nQygT0dQtFBXwNUZfnTuGcKdr43twReiCjX1ViPBU4lrcajVQH4rAuoe
# K/bBak2Kyi1LsFn8AzIwKXZZl83L57EyL+XEW8i5GN1jFSAHFx4ocUq8NQBa//kS
# xei9LV3HEJbAMOQsPO8HEK40mg5WR17s22AUClMqtD2DAQbPUrmcLbZ6Ttq6hTuV
# BqL56JFjbfML5RGjxwF9G8v5mdLmLlNRCGF2KI3NsT7dkMbVh24=
# =zvPi
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 07 Jan 2024 22:16:23 GMT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20240107' of https://github.com/legoater/qemu:
backends/iommufd: Remove mutex
backends/iommufd: Remove check on number of backend users
vfio/migration: Add helper function to set state or reset device
vfio/container: Rename vfio_init_container to vfio_set_iommu
vfio/iommufd: Remove the use of stat() to check file existence
hw/vfio: fix iteration over global VFIODevice list
vfio/container: Replace basename with g_path_get_basename
vfio/iommufd: Remove CONFIG_IOMMUFD usage
vfio/spapr: Only compile sPAPR IOMMU support when needed
vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface
vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface
vfio/container: Intoduce a new VFIOIOMMUClass::setup handler
vfio/container: Introduce a VFIOIOMMU legacy QOM interface
vfio/container: Introduce a VFIOIOMMU QOM interface
vfio/container: Initialize VFIOIOMMUOps under vfio_init_container()
vfio/container: Introduce vfio_legacy_setup() for further cleanups
vfio/spapr: Extend VFIOIOMMUOps with a release handler
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pull-loongarch-20240106
Fixs patch conflict
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZZi5pAAKCRBAov/yOSY+
# 35iIA/4uXw92i0HJ3KK9NxzZlP9gwld6dATvincKNUpUgplK3NtBpRlVKm9NzLH8
# Jdg84k7D5pNXfWAfomT+R3UMnL8R/zZZqeCCd60JE0Kbn4gCyCou2QuLKuWxPvNM
# Jklf4mifq+gplg7lDF0GTLo8MUzhAmV8AuG7lUZb6IQJ68ui8A==
# =NnLa
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 06 Jan 2024 02:23:32 GMT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240106' of https://gitlab.com/gaosong/qemu:
target/loongarch: move translate modules to tcg/
target/loongarch/meson: move gdbstub.c to loongarch.ss
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity reports a concurrent data access violation because be->users
is being accessed in iommufd_backend_can_be_deleted() without holding
the mutex.
However, these routines are called from the QEMU main thread when a
device is created. In this case, the code paths should be protected by
the BQL lock and it should be safe to drop the IOMMUFD backend mutex.
Simply remove it.
Fixes: CID 1531550
Fixes: CID 1531549
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
QOM already has a ref count on objects and it will assert much
earlier, when INT_MAX is reached.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
There are several places where failure in setting the device state leads
to a device reset, which is done by setting ERROR as the recover state.
Add a helper function that sets the device state and resets the device
in case of failure. This will make the code cleaner and remove duplicate
comments.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
vfio_container_init() and vfio_init_container() names are confusing
especially when we see vfio_init_container() calls vfio_container_init().
vfio_container_init() operates on base container which is consistent
with all routines handling 'VFIOContainerBase *' ops.
vfio_init_container() operates on legacy container and setup IOMMU
context with ioctl(VFIO_SET_IOMMU).
So choose to rename vfio_init_container to vfio_set_iommu to avoid
the confusion.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Using stat() before opening a file or a directory can lead to a
time-of-check to time-of-use (TOCTOU) filesystem race, which is
reported by coverity as a Security best practices violations. The
sequence could be replaced by open and fdopendir but it doesn't add
much in this case. Simply use opendir to avoid the race.
Fixes: CID 1531551
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <Zhenzhong.duan@intel.com>
Availability of the IOMMUFD backend can now be fully determined at
runtime and the ifdef check was a build time protection (for PPC not
supporting it mostly).
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
sPAPR IOMMU support is only needed for pseries machines. Compile out
support when CONFIG_PSERIES is not set. This saves ~7K of text.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
As previously done for the sPAPR and legacy IOMMU backends, convert
the VFIOIOMMUOps struct to a QOM interface. The set of of operations
for this backend can be referenced with a literal typename instead of
a C struct.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move vfio_spapr_container_setup() to a VFIOIOMMUClass::setup handler
and convert the sPAPR VFIOIOMMUOps struct to a QOM interface. The
sPAPR QOM interface inherits from the legacy QOM interface because
because both have the same basic needs. The sPAPR interface is then
extended with the handlers specific to the sPAPR IOMMU.
This allows reuse and provides better abstraction of the backends. It
will be useful to avoid compiling the sPAPR IOMMU backend on targets
not supporting it.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Convert the legacy VFIOIOMMUOps struct to the new VFIOIOMMU QOM
interface. The set of of operations for this backend can be referenced
with a literal typename instead of a C struct. This will simplify
support of multiple backends.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIOContainerBase was not introduced as an abstract QOM object because
it felt unnecessary to expose all the IOMMU backends to the QEMU
machine and human interface. However, we can still abstract the IOMMU
backend handlers using a QOM interface class. This provides more
flexibility when referencing the various implementations.
Simply transform the VFIOIOMMUOps struct in an InterfaceClass and do
some initial name replacements. Next changes will start converting
VFIOIOMMUOps.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_init_container() already defines the IOMMU type of the container.
Do the same for the VFIOIOMMUOps struct. This prepares ground for the
following patches that will deduce the associated VFIOIOMMUOps struct
from the IOMMU type.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This will help subsequent patches to unify the initialization of type1
and sPAPR IOMMU backends.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This allows to abstract a bit more the sPAPR IOMMU support in the
legacy IOMMU backend.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
If "busses" might be encountered as a plural of "bus" (5 instances),
the correct spelling is "buses" (26 instances). Fixing those 5
instances makes the doc more consistent.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The edu_check_range function checks that start <= end1 < end2, where
end1 is the upper bound (exclusive) of the guest-supplied DMA range and
end2 is the upper bound (exclusive) of the device's allowed DMA range.
When the guest tries to transfer exactly DMA_SIZE (4096) bytes, end1
will be equal to end2, so the check fails and QEMU aborts with this
puzzling error message (newlines added for formatting):
qemu: hardware error: EDU: DMA range
0x0000000000040000-0x0000000000040fff out of bounds
(0x0000000000040000-0x0000000000040fff)!
By checking end1 <= end2 instead, guests will be allowed to transfer
exactly 4096 bytes. It is not necessary to explicitly check for
start <= end1 because the previous two checks (within(addr, start, end2)
and end1 > addr) imply start < end1.
Fixes: b30934cb52 ("hw: misc, add educational driver", 2015-01-21)
Signed-off-by: Max Erenberg <merenber@uwaterloo.ca>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Testing upstream U-Boot with 'sifive_u' machine we see:
=> dhcp
ethernet@10090000: PHY present at 0
Could not get PHY for ethernet@10090000: addr 0
phy_connect failed
This has been working till QEMU 8.1 but broken since QEMU 8.2.
Fixes: 1b09eeb122 ("hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields")
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
error_setg() appends newline to the formatted message.
Fixes: cb94ff5f80 ("audio: propagate Error * out of audio_init")
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Current error message:
qemu-system-x86_64: -chardev spice,id=foo: Parameter 'driver' expects an abstract device type
while in fact the meaning is in reverse, -chardev expects
a non-abstract device type.
Fixes: 777357d758 ("chardev: qom-ify" 2016-12-07)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
The mcycle/minstret counter's stop flag is mistakenly updated on a copy
on stack. Thus the counter increments even when the CY/IR bit in the
mcountinhibit register is set. This commit corrects its behavior.
Fixes: 3780e33732 (target/riscv: Support mcycle/minstret write operation)
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
HW core patch queue
- Unify CPU QOM type checks (Gavin)
- Simplify uses of some CPU related property (Philippe)
(start-powered-off, ARM reset-cbar and mp-affinity)
- Header and documentation cleanups (Zhao, Philippe)
- Have Memory API return boolean indicating possible error
- Fix frame filter mask in CAN sja1000 model (Pavel)
- QOM embed MCF5206 timer into SoC (Thomas)
- Simplify LEON3 qemu_irq_ack handler (Clément)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmWYIxwACgkQ4+MsLN6t
# wN66fA//UBwgYqcdpg6Wz17qzgq1TWeZHHzYh7HbZRUCxhdSgS6TSQOH9Fi8VNYq
# Ed5a5l4ovP/2NRN1/S5PPBydyKXTU7wintHm2+suQbLSmplIE6yr0Ca6o8FLEeJ3
# hnE0dAoQCLS7eDpoeOEpGjzmJFiBSWLvyqAZLa/rZkCnCiZRHB6g/nAEM8I3I9bl
# //H20d3a/fektZxGnpEAeoMxrl4iA9hkFYVW8lbu6EhNFBPUkkj5Y8w47Kq/BIvD
# NmLTPgu4d7oahwlfsM6jWdRDG9zlEkXQor817PHwl00o45yAfeITsy40GvJeEYaI
# BcDLFfWrSm9SQb7/suXGeyU/SLmx7rsmJWfNYUoMr6807QcSH4ScPCfgzEQ4j8IV
# PmeVsxxLxT9CSzfxhMx5cXt33H2l+tEzwJ5UJCLQvmvTu+aDkt46Q09X/7j0z89m
# zSk/HBtdACIzwEWBAJsKuzarRTZNUvyXEsOxZ5l7xOxJpzpsNV2YVuChClVGtHOJ
# kr1PE2hxEMPY1vDyKU6ckDvW+XXgYhOXrPAxdx8gIwwd4oyDC5vVlIajvlqbOAsp
# Es7zq40b/is3ZnByEDbZ+yYvdYRLtVf/lDPK3KIv7IhrTNzH/HT1egshOQAVirY1
# Gw8f3fXqL3/84w383VI4efrSlKBJeb0i2SJ50y2N1clrF1qnlx0=
# =an4B
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 05 Jan 2024 15:41:16 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'hw-cpus-20240105' of https://github.com/philmd/qemu: (71 commits)
target/sparc: Simplify qemu_irq_ack
hw/net/can/sja1000: fix bug for single acceptance filter and standard frame
hw/m68k/mcf5206: Embed m5206_timer_state in m5206_mbar_state
hw/pci-host/raven: Propagate error in raven_realize()
hw/nvram: Simplify memory_region_init_rom_device() calls
hw/misc: Simplify memory_region_init_ram_from_fd() calls
hw/sparc: Simplify memory_region_init_ram_nomigrate() calls
hw/arm: Simplify memory_region_init_rom() calls
hw: Simplify memory_region_init_ram() calls
misc: Simplify qemu_prealloc_mem() calls
util/oslib: Have qemu_prealloc_mem() handler return a boolean
backends: Reduce variable scope in host_memory_backend_memory_complete
backends: Have HostMemoryBackendClass::alloc() handler return a boolean
backends: Simplify host_memory_backend_memory_complete()
backends: Use g_autofree in HostMemoryBackendClass::alloc() handlers
memory: Have memory_region_init_ram_from_fd() handler return a boolean
memory: Have memory_region_init_ram_from_file() handler return a boolean
memory: Have memory_region_init_resizeable_ram() return a boolean
memory: Have memory_region_init_rom_device() handler return a boolean
memory: Simplify memory_region_init_rom_device_nomigrate() calls
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A CAN sja1000 standard frame filter mask has been computed and applied
incorrectly for standard frames when single Acceptance Filter Mode
(MOD_AFM = 1) has been selected. The problem has not been found
by Linux kernel testing because it uses dual filter mode (MOD_AFM = 0)
and leaves falters fully open.
The problem has been noticed by Grant Ramsay when testing with Zephyr
RTOS which uses single filter mode.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Reported-by: Grant Ramsay <gramsay@enphaseenergy.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2028
Fixes: 733210e754 ("hw/net/can: SJA1000 chip register level emulation")
Message-ID: <20240103231426.5685-1-pisa@fel.cvut.cz>
Return early if bc->alloc is NULL. De-indent the if() ladder.
Note, this avoids a pointless call to error_propagate() with
errp=NULL at the 'out:' label.
Change trivial when reviewed with 'git-diff --ignore-all-space'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-16-philmd@linaro.org>
Following the example documented since commit e3fe3988d7
("error: Document Error API usage rules"), have
memory_region_init_rom_device_nomigrate() return a boolean
indicating whether an error is set or not.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-9-philmd@linaro.org>
There is no universal BIOS, each machine needs a specific one.
Move the machine-specific definitions to each machine code and
remove this bogus header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231122184334.18201-1-philmd@linaro.org>
The 'start-powered-off' property has been added to ARM CPUs in
commit 5de164304a ("arm: Allow secondary KVM CPUs to be booted
via PSCI"), then eventually got generalized to all CPUs in commit
c1b701587e ("target/arm: Move start-powered-off property to generic
CPUState"). Since all CPUs have it, no need to check whether it is
available. Updating this property can't fail, so use &error_abort.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20231123143813.42632-5-philmd@linaro.org>
CPUState::start_powered_off field is part of the internal
implementation of a QDev CPU. It is exposed as the QDev
"start-powered-off" property. External components should
use the qdev properties API to access it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-Id: <20231123143813.42632-2-philmd@linaro.org>
The 'mp-affinity' property is present since commit 15a21fe028
("target-arm: Add mp-affinity property for ARM CPU class").
Use it and remove a /* TODO */ comment. Since all ARM CPUs
have this property, use &error_abort, because this call can
not fail.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20231123143813.42632-4-philmd@linaro.org>
bcm2836_realize() is called by
- bcm2836_class_init() which sets:
bc->cpu_type = ARM_CPU_TYPE_NAME("cortex-a7")
- bcm2837_class_init() which sets:
bc->cpu_type = ARM_CPU_TYPE_NAME("cortex-a53")
Both Cortex-A7 / A53 have the ARM_FEATURE_CBAR set. If it isn't,
then this is a programming error: use &error_abort.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20231123143813.42632-3-philmd@linaro.org>
Remove unused header (qemu/module.h and qemu/cutils.h) in cluster.c,
and reorder the remaining header files (except qemu/osdep.h) in
alphabetical order.
Tested by "./configure" and then "make".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231127145611.925817-3-zhao1.liu@linux.intel.com>
Remove unused header (qemu/module.h and sysemu/cpus.h) in core.c,
and reorder the remaining header files (except qemu/osdep.h) in
alphabetical order.
Tested by "./configure" and then "make".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231127145611.925817-2-zhao1.liu@linux.intel.com>
The 'host' CPU model isn't available until KVM or HVF is enabled.
For example, the following error messages are seen when the guest
is started with option '-cpu cortex-a8' on tcg after the next commit
is applied to check the CPU type in machine_run_board_init().
ERROR:../hw/core/machine.c:1423:is_cpu_type_supported: \
assertion failed: (model != NULL)
Bail out! ERROR:../hw/core/machine.c:1423:is_cpu_type_supported: \
assertion failed: (model != NULL)
Aborted (core dumped)
Hide 'host' CPU model until KVM or HVF is enabled. With this applied,
the valid CPU models can be shown.
qemu-system-aarch64: Invalid CPU type: cortex-a8
The valid types are: cortex-a7, cortex-a15, cortex-a35, \
cortex-a55, cortex-a72, cortex-a76, cortex-a710, a64fx, \
neoverse-n1, neoverse-v1, neoverse-n2, cortex-a53, \
cortex-a57, max
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231204004726.483558-6-gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The names of supported CPU models instead of CPU types should be
printed when the user specified CPU type isn't supported, to be
consistent with the output from '-cpu ?'.
Correct the error messages to print CPU model names instead of CPU
type names.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231204004726.483558-5-gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
It's no sense to check the CPU type when mc->valid_cpu_types[0] is
NULL, which is a program error. Raise an assert on this.
A precise hint for the error message is given when mc->valid_cpu_types[0]
is the only valid entry. Besides, enumeration on mc->valid_cpu_types[0]
when we have mutiple valid entries there is avoided to increase the code
readability, as suggested by Philippe Mathieu-Daudé.
Besides, @cc comes from machine->cpu_type or mc->default_cpu_type. For
the later case, it can be NULL and it's also a program error. We should
use assert() in this case.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Message-ID: <20231204004726.483558-4-gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The logic, to check if the specified CPU type is supported in
machine_run_board_init(), is independent enough. Factor it out into
helper is_cpu_type_supported(). machine_run_board_init() looks a bit
clean with this. Since we're here, @machine_class is renamed to @mc to
avoid multiple line spanning of code. The comments are tweaked a bit
either.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231204004726.483558-3-gshan@redhat.com>
[PMD: Only call new helper if machine->cpu_type is not NULL]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. The principle is violated by machine_run_board_init() because
it calls error_report(), error_printf(), and exit(1) when the machine
doesn't support the requested CPU type.
Clean this up by using error_setg() and error_append_hint() instead.
No functional change, as the only caller passes &error_fatal.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231204004726.483558-2-gshan@redhat.com>
[PMD: Correct error_append_hint() argument]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add generic cpu_list() to replace the individual target's implementation
in the subsequent commits. Currently, there are 3 targets with no cpu_list()
implementation: microblaze and nios2. With this applied, those two targets
switch to the generic cpu_list().
[gshan@gshan q]$ ./build/qemu-system-microblaze -cpu ?
Available CPUs:
microblaze-cpu
[gshan@gshan q]$ ./build/qemu-system-nios2 -cpu ?
Available CPUs:
nios2-cpu
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231114235628.534334-7-gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add helper cpu_model_from_type() to extract the CPU model name from
the CPU type name in two circumstances: (1) The CPU type name is the
combination of the CPU model name and suffix. (2) The CPU type name
is same to the CPU model name.
The helper will be used in the subsequent commits to conver the
CPU type name to the CPU model name.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231114235628.534334-6-gshan@redhat.com>
[PMD: Mention returned string must be released with g_free()]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
For all targets, the CPU class returned from CPUClass::class_by_name()
and object_class_dynamic_cast(oc, CPU_RESOLVING_TYPE) need to be
compatible. Lets apply the check in cpu_class_by_name() for once,
instead of having the check in CPUClass::class_by_name() for individual
target.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Message-ID: <20231114235628.534334-4-gshan@redhat.com>
'ev67' CPU class will be returned to match everything, which makes
no sense as mentioned in the comments. Remove the logic to fall
back to 'ev67' CPU class to match everything.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231114235628.534334-2-gshan@redhat.com>
[PMD: Reword subject, replace 'any' -> 'ev67' on linux-user]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
migration 1st pull for 9.0
- We lost Juan and Leo in the maintainers file
- Steven's suspend state fix
- Steven's fix for coverity on migrate_mode
- Avihai's migration cleanup series
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZZY0TxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbSxgEAoM5g3wkc22lpAlRpU+hJUqT9NVOVQSK+
# Fk7XJYTdSgABAKzykA6hAmU5Kj+yVI6jI874SVZbs2FWpFs4osvsKk4D
# =sfuM
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 04 Jan 2024 04:30:07 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
# gpg: aka "Peter Xu <peterx@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240104-pull-request' of https://gitlab.com/peterx/qemu: (26 commits)
migration: fix coverity migrate_mode finding
migration/multifd: Remove unnecessary usage of local Error
migration: Remove unnecessary usage of local Error
migration: Fix migration_channel_read_peek() error path
migration/multifd: Remove error_setg() in migration_ioc_process_incoming()
migration/multifd: Fix leaking of Error in TLS error flow
migration/multifd: Simplify multifd_channel_connect() if else statement
migration/multifd: Fix error message in multifd_recv_initial_packet()
migration: Remove errp parameter in migration_fd_process_incoming()
migration: Refactor migration_incoming_setup()
migration: Remove nulling of hostname in migrate_init()
migration: Remove migrate_max_downtime() declaration
tests/qtest: postcopy migration with suspend
tests/qtest: precopy migration with suspend
tests/qtest: option to suspend during migration
tests/qtest: migration events
migration: preserve suspended for bg_migration
migration: preserve suspended for snapshot
migration: preserve suspended runstate
migration: propagate suspended runstate
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to Error API, usage of ERRP_GUARD() or a local Error instead
of errp is needed if errp is passed to void functions, where it is later
dereferenced to see if an error occurred.
There are several places in multifd.c that use local Error although it
is not needed. Change these places to use errp directly.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231231093016.14204-12-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
According to Error API, usage of ERRP_GUARD() or a local Error instead
of errp is needed if errp is passed to void functions, where it is later
dereferenced to see if an error occurred.
There are several places in migration.c that use local Error although it
is not needed. Change these places to use errp directly.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-11-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
migration_channel_read_peek() calls qio_channel_readv_full() and handles
both cases of return value == 0 and return value < 0 the same way, by
calling error_setg() with errp. However, if return value < 0, errp is
already set, so calling error_setg() with errp will lead to an assert.
Fix it by handling these cases separately, calling error_setg() with
errp only in return value == 0 case.
Fixes: 6720c2b327 ("migration: check magic value for deciding the mapping of channels")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231231093016.14204-10-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
If multifd_load_setup() fails in migration_ioc_process_incoming(),
error_setg() is called with errp. This will lead to an assert because in
that case errp already contains an error.
Fix it by removing the redundant error_setg().
Fixes: 6720c2b327 ("migration: check magic value for deciding the mapping of channels")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-9-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
If there is an error in multifd TLS handshake task,
multifd_tls_outgoing_handshake() retrieves the error with
qio_task_propagate_error() but never frees it.
Fix it by freeing the obtained Error.
In addition, the error is not reported at all, so report it with
migrate_set_error().
Fixes: 2964714015 ("migration/tls: add support for multifd tls-handshake")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-8-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Add an option to suspend the src in a-b-bootblock.S, which puts the guest
in S3 state after one round of writing to memory. The option is enabled by
poking a 1 into the suspend_me word in the boot block prior to starting the
src vm. Generate symbol offsets in a-b-bootblock.h so that the suspend_me
offset is known. Generate the bootblock for each test, because suspend_me
may differ for each.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1704312341-66640-11-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Restoring a snapshot can break a suspended guest. Snapshots suffer from
the same suspended-state issues that affect live migration, plus they must
handle an additional problematic scenario, which is that a running vm must
remain running if it loads a suspended snapshot.
To save, the existing vm_stop call now completely stops the suspended
state. Finish with vm_resume to leave the vm in the state it had prior
to the save, correctly restoring the suspended state.
To load, if the snapshot is not suspended, then vm_stop + vm_resume
correctly handles all states, and leaves the vm in the state it had prior
to the load. However, if the snapshot is suspended, restoration is
trickier. First, call vm_resume to restore the state to suspended so the
current state matches the saved state. Then, if the pre-load state is
running, call wakeup to resume running.
Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
RUN_STATE_RESTORE_VM did not change runstate if the current state was
suspended, but now it does, so allow these transitions.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
A guest that is migrated in the suspended state automaticaly wakes and
continues execution. This is wrong; the guest should end migration in
the same state it started. The root cause is that the outgoing migration
code automatically wakes the guest, then saves the RUNNING runstate in
global_state_store(), hence the incoming migration code thinks the guest is
running and continues the guest if autostart is true.
On the outgoing side, delete the call to qemu_system_wakeup_request().
Now that vm_stop completely stops a vm in the suspended state (from the
preceding patches), the existing call to vm_stop_force_state is sufficient
to correctly migrate all vmstate.
On the incoming side, call vm_start if the pre-migration state was running
or suspended. For the latter, vm_start correctly restores the suspended
state, and a future system_wakeup monitor request will cause the vm to
resume running.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
When a vm transitions from running to suspended, runstate notifiers are
not called, so the notifiers still think the vm is running. Hence, when
we call vm_start to restore the suspended state, we call vm_state_notify
with running=1. However, some notifiers check for RUN_STATE_RUNNING.
They must check the running boolean instead.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently, a vm in the suspended state is not completely stopped. The VCPUs
have been paused, but the cpu clock still runs, and runstate notifiers for
the transition to stopped have not been called. This causes problems for
live migration. Stale cpu timers_state is saved to the migration stream,
causing time errors in the guest when it wakes from suspend, and state that
would have been modified by runstate notifiers is wrong.
Modify vm_stop to completely stop the vm if the current state is suspended,
transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
Modify vm_start to restore the suspended state.
This affects all callers of vm_stop and vm_start, notably, the qapi stop and
cont commands:
old behavior:
RUN_STATE_SUSPENDED --> stop --> RUN_STATE_SUSPENDED
new behavior:
RUN_STATE_SUSPENDED --> stop --> RUN_STATE_PAUSED
RUN_STATE_PAUSED --> cont --> RUN_STATE_SUSPENDED
For example:
(qemu) info status
VM status: paused (suspended)
(qemu) stop
(qemu) info status
VM status: paused
(qemu) system_wakeup
Error: Unable to wake up: guest is not in suspended state
(qemu) cont
(qemu) info status
VM status: paused (suspended)
(qemu) system_wakeup
(qemu) info status
VM status: running
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
To enable accelerated VirtIO GPUs for the guest we need the rendering
support on the host, which currently it's reported in the configuration
summary under the "dependencies" section. Add a graphics backend section
and report the status of the VirGL and Rutabaga support libraries.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231222114846.2850741-1-alex.bennee@linaro.org>
[Remove from dependencies as suggested by Philippe. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This variable is about the host OS, not the target. It is used a lot
more since the Meson conversion, but the original sin dates back to 2003.
Time to fix it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
config_all now lists only accelerators, rename it to indicate its actual
content.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CONFIG_ALL is tricky to use and was ported over to Meson from the
recursive processing of Makefile variables. Meson sourcesets
however have all_sources() and all_dependencies() methods that
remove the need for it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
config_targetos is now empty and can be removed; its use in sourcesets
that do not involve target-specific files can be replaced with an empty
dictionary.
In fact, at this point *all* sourcesets that do not involve
target-specific files are just glorified mutable arrays. Enforce that
they never test for symbols in "when:" by computing the set of files
without "strict: false".
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CONFIG_DARWIN, CONFIG_LINUX and CONFIG_BSD are used in some rules, but
only CONFIG_LINUX has substantial use. Convert them all to if...endif.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Keep it together with the other compiler modes, and before dependencies.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove assignments that match the default, and group the
targets for debian-legacy-test-cross and debian-all-test-cross
into a single arm.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not use a subshell to hide the shadowing of $config_host_mak.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While a simple lexicographic comparison usually works, it is less
robust than a more specific algorithm designed to compare versions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a 'current_lun' check for a null value
to avoid null pointer dereferencing and
recover host if NULL return
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4eb8606560 (esp: store lun coming from the MESSAGE OUT phase)
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Message-ID: <20231229152647.19699-1-adiupina@astralinux.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The main difficulty here is that a page fault when writing to the destination
must not overwrite the flags. Therefore, the flags computation must be
inlined instead of using gen_jcc1*.
For simplicity, I am using an unconditional cmpxchg operation, that becomes
a NOP if the comparison fails.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ALU instructions can write to both memory and flags. If the CC_SRC*
and CC_DST locations have been written already when a memory access
causes a fault, the value in CC_SRC* and CC_DST might be interpreted
with the wrong CC_OP (the one that is in effect before the instruction.
Besides just using the wrong result for the flags, something like
subtracting -1 can have disastrous effects if the current CC_OP is
CC_OP_EFLAGS: this is because QEMU does not expect bits outside the ALU
flags to be set in CC_SRC, and env->eflags can end up set to all-ones.
In the case of the attached testcase, this sets IOPL to 3 and would
cause an assertion failure if SUB is moved to the new decoder.
This mechanism is not really needed for BMI instructions, which can
only write to a register, but put it to use anyway for cleanliness.
In the case of BZHI, the code has to be modified slightly to ensure
that decode->cc_src is written, otherwise the new assertions trigger.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_jcc() has been changed to accept a relative offset since the
new decoder was written. Adjust the J operand, which is meant
to be used with jump instructions such as gen_jcc(), to not
include the program counter and to not truncate the result, as
both operations are now performed by common code.
The result is that J is now the same as the I operand.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Similar to gen_setcc1, make gen_cmovcc1 receive TCGv. This is more friendly
to simultaneous implementation in the old and the new decoder.
A small wart is that s->T0 of CMOV is currently the *second* argument (which
would ordinarily be in T1). Therefore, the condition has to be inverted in
order to overwrite s->T0 with cpu_regs[reg] if the MOV is not performed.
This only applies to the old decoder, and this code will go away soon.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create a new temporary, to ease the register allocator's work.
Creation of the temporary is pushed into gen_ext_tl, which
also allows NULL as the first parameter now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new x86 decoder wants the gen_* functions to compute EFLAGS before
writeback, which can be an issue for instructions with a memory
destination such as ARPL or shifts.
Extract code to compute the EFLAGS without clobbering CC_SRC, in case
the memory write causes a fault. The flags writeback mechanism will
take care of copying the result to CC_SRC.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new decoder would rather have the operand in T0 when expanding SCAS, rather
than use R_EAX directly as gen_scas currently does. This makes SCAS more similar
to CMP and SUB, in that CC_DST = T0 - T1.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new decoder likes to compute the address in A0 very early, so the
gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory
operand. Instead use T0 since it is already available and will be
overwritten immediately after.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
decode->mem is only used if one operand has has_ea == true. String
operations will not use decode->mem and will load A0 on their own, because
they are the only case of two memory operands in a single instruction.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Usually the registers are just moved into s->T0 without much care for
their operand size. However, in some cases we can get more efficient
code if the operand fetching logic syncs with the emission function
on what is nicer.
All the current uses are mostly demonstrative and only reduce the code
in the emission functions, because the instructions do not support
memory operands. However the logic is generic and applies to several
more instructions such as MOVSXD (aka movslq), one-byte shift
instructions, multiplications, XLAT, and indirect calls/jumps.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
X86_SPECIAL_ZExtOp0 and X86_SPECIAL_ZExtOp2 are poorly named; they are a hack
that is needed by scalar insertion and extraction instructions, and not really
related to zero extension: for PEXTR the zero extension is done by the generation
functions, for PINSR the high bits are not used at all and in fact are *not*
filled with zeroes when loaded into s->T1.
Rename the values to match the effect described in the manual, and explain
better in the comments.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use _tl operations for 32-bit operands on 32-bit targets, and only go
through trunc and extu ops for 64-bit targets. While the trunc/ext
ops should be pretty much free after optimization, the optimizer also
does not like having the same temporary used in multiple EBBs.
Therefore it is nicer to not use tmpN* unless necessary.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The previous check erroneously allowed CMP to be modified with LOCK.
Instead, tag explicitly the instructions that do support LOCK.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cpu_cc_compute_all() has an argument that is always equal to CC_OP for historical
reasons (dating back to commit a7812ae412, "TCG variable type checking.", 2008-11-17,
which added the argument to helper_cc_compute_all). It does not make sense for the
argument to have any other value, so remove it and clean up some lines that are not
too long anymore.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_lea_v_seg (called by gen_add_A0_ds_seg) already zeroes any
bits of s->A0 beyond s->aflag. It does so before summing the
segment base and, if not in 64-bit mode, also after summing it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Take advantage of the fact that there can be no 1 bits between SF and OF.
If they were adjacent, you could sum SF and get a carry only if SF was
already set. Then the value of OF in the sum is the XOR of OF itself,
the carry (which is SF) and 0 (the value of the OF bit in the addend):
this is OF^SF exactly.
Because OF and SF are not adjacent, just place more 1 bits to the
left so that the carry propagates, which means summing CC_O - CC_S.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio,pc,pci: features, cleanups, fixes
vhost-scsi support for worker ioctls
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmWKohIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpG2YH/1rJGV8TQm4V8kcGP9wOknPAMFADnEFdFmrB
# V+JEDnyKrdcEZLPRh0b846peWRJhC13iL7Ks3VNjeVsfE9TyzNyNDpUzCJPfYFjR
# 3m8ChLDvE9tKBA5/hXMIcgDXaYcPIrPvHyl4HG8EQn7oaeMpS2uecKqDpDDvNXGq
# oNamNvqimFSqA+3ChzA+0Qt07Ts7xFEw4OEXSwfRXlsam/dhQG0SI+crRheHuvFb
# HR8EwmNydA1D/M51AuBNuvX36u3SnPWm7Anp5711SZ1b59unshI0ztIqIJnGkvYe
# qpUJSmxR6ulwWe4nQfb+GhBsuJ2j2ORC7YfXyAT7mw8rds8loaI=
# =cNy2
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 26 Dec 2023 04:51:14 EST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (21 commits)
vdpa: move memory listener to vhost_vdpa_shared
vdpa: use dev_shared in vdpa_iommu
vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
vdpa: move iommu_list to vhost_vdpa_shared
vdpa: remove msg type of vhost_vdpa
vdpa: move backend_cap to vhost_vdpa_shared
vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
vdpa: move file descriptor to vhost_vdpa_shared
vdpa: use vdpa shared for tracing
vdpa: move shadow_data to vhost_vdpa_shared
vdpa: move iova_range to vhost_vdpa_shared
vdpa: move iova tree to the shared struct
vdpa: add VhostVDPAShared
vdpa: do not set virtio status bits if unneeded
Fix bugs when VM shutdown with virtio-gpu unplugged
vhost-scsi: fix usage of error_reportf_err()
hw/acpi: propagate vcpu hotplug after switch to modern interface
vhost-scsi: Add support for a worker thread per virtqueue
vhost: Add worker backend callouts
tests: bios-tables-test: Rename smbios type 4 related test functions
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
dirtylimit dirtyrate pull request 20231225
Nitpick about an unused parameter
Please apply, thanks,
Yong
# -----BEGIN PGP SIGNATURE-----
#
# iQJKBAABCAA0FiEEaF0CINwmSCgVLlfC3/Ij1rP+y5wFAmWJVtkWHHlvbmcuaHVh
# bmdAc21hcnR4LmNvbQAKCRDf8iPWs/7LnLexEADjuQ7dtbv2vq1ksqDq6fct++FK
# sGPe6P9UAyJrB0qxW3jKRn2P4E/xiPHpmi+BbOOu8PppeZt86NFlzun681vMpkKu
# d3oAlwRtX8eklk3oub+yBXzkAfsGr+EHEWms/12ATsd6apItxi47zUUw/2aznujz
# TlZR7qOD1kjvPHOVnP07fQ3NYiPIITpzSqVgo3fxE+upmb+dnELIkyGTAvJY2Pjz
# Uh4CtG/sQq8OJJwUn1TFmhB1Ujw9oo8v/lOtIpsxfcKtQcAwyvgmyfndOMZ/h6O3
# ri3axypg8PyGyidYEh0JSoR3OVUIt5eoqPNp+sQTNeZsjTIwKjpBGUOLXbIA56nB
# Z/ZRbxSKj+JRD9qQZ2NODE1QurgEuT51aWkEeBEsx0OtUINcjZkwWeQmiasJQw1T
# 4z4Hez0ipuET9MmeTbWkVh9rO3Bsxi9QxiXEiN6YPNqTqLhwomXU+MIlrCsZQzjB
# l4Jij9byeWL9ntx3qwtSWJGHV6Qql16UkPo+c2Co/q2rdx/FqiYrMgCtLee1yo5n
# lSCKmrVMLaLyQv7jnLY8txhU7/CXNnwij4YSgfi7nujOgM6L3uwYc9Tp5kcQHE/b
# o/Mr6oOi79Z+rdByk4pPUKaJV7FnbWMoUgjsqWPJgedGt0K9StRsXCGR1wYMUVdU
# R8Dv0IZyyGuSf6MhNw==
# =Y00D
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 25 Dec 2023 05:18:01 EST
# gpg: using RSA key 685D0220DC264828152E57C2DFF223D6B3FECB9C
# gpg: issuer "yong.huang@smartx.com"
# gpg: Good signature from "Yong Huang <yong.huang@smartx.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 685D 0220 DC26 4828 152E 57C2 DFF2 23D6 B3FE CB9C
* tag 'dirtylimit-dirtyrate-pull-request-20231225' of https://github.com/newfriday/qemu:
migration/dirtyrate: Remove an extra parameter
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the memory listener to a common place rather than always in the
first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-14-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iommu_list member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the backend_cap member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iotlb_batch_begin_sent member to VhostVDPAShared so all
vhost_vdpa can use it, rather than always in the first / last
vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-8-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the file descriptor to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-7-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the shadow_data member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-5-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iova range to VhostVDPAShared so all vhost_vdpa can use it,
rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iova tree to VhostVDPAShared so all vhost_vdpa can use it,
rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It will hold properties shared among all vhost_vdpa instances associated
with of the same device. For example, we just need one iova_tree or one
memory listener for the entire device.
Next patches will register the vhost_vdpa memory listener at the
beginning of the VM migration at the destination. This enables QEMU to
map the memory to the device before stopping the VM at the source,
instead of doing while both source and destination are stopped, thus
minimizing the downtime.
However, the destination QEMU is unaware of which vhost_vdpa struct will
register its memory_listener. If the source guest has CVQ enabled, it
will be the one associated with the CVQ. Otherwise, it will be the
first one.
Save the memory operations related members in a common place rather than
always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Virtio-gpu malloc memory for the queue when it realized, but the queues was not
released when it unrealized, which resulting in a memory leak. In addition,
vm_change_state_handler is not cleaned up, which is related to vdev and will
lead to segmentation fault when VM shutdown.
Signed-off-by: wangmeiling <wangmeiling21@huawei.com>
Signed-off-by: Binfeng Wu <wubinfeng@huawei.com>
Message-Id: <7bbbc0f3-2ad9-83ca-b39b-f976d0837daf@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is required to use error_report() instead of error_reportf_err(), if the
prior function does not take local_err as the argument. As a result, the
local_err is always NULL and segment fault may happen.
vhost_scsi_start()
-> vhost_scsi_set_endpoint(s) --> does not allocate local_err
-> error_reportf_err()
-> error_vprepend()
-> g_string_append(newmsg, (*errp)->msg) --> (*errp) is NULL
In addition, add ": " at the end of other error_reportf_err() logs.
Fixes: 7962e432b4 ("vhost-user-scsi: support reconnect to backend")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20231214003117.43960-1-dongli.zhang@oracle.com>
Reviewed-by: Feng Li <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If a vcpu with an apic-id that is not supported by the legacy
interface (>255) is hot-plugged, the legacy code will dynamically switch
to the modern interface. However, the hotplug event is not forwarded to
the new interface resulting in the vcpu not being fully/properly added
to the machine config. This BUG is evidenced by OVMF when it
it attempts to count the vcpus and reports an inconsistent vcpu count
reported by the fw_cfg interface and the modern hotpug interface.
Fix is to propagate the hotplug event after making the switch from
the legacy interface to the modern interface.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Aaron Young <aaron.young@oracle.com>
Message-Id: <0e8a9baebbb29f2a6c87fd08e43dc2ac4019759a.1702398644.git.Aaron.Young@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds support for vhost-scsi to be able to create a worker thread
per virtqueue. Right now for vhost-net we get a worker thread per
tx/rx virtqueue pair which scales nicely as we add more virtqueues and
CPUs, but for scsi we get the single worker thread that's shared by all
virtqueues. When trying to send IO to more than 2 virtqueues the single
thread becomes a bottlneck.
This patch adds a new setting, worker_per_virtqueue, which can be set
to:
false: Existing behavior where we get the single worker thread.
true: Create a worker per IO virtqueue.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231204231618.21962-3-michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Since the driver doesn't support interrupts, we must return early when
index is set to VIRTIO_CONFIG_IRQ_IDX. Basically the same thing Viresh
did for "91208dd297f2 virtio: i2c: Check notifier helpers for
VIRTIO_CONFIG_IRQ_IDX".
Fixes: 544f0278af ("virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX")
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Message-Id: <20231025171841.3379663-1-mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some options for -display cocoa were not described or not listed at all.
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Apparently the help entries were not merged when the patches got in.
Fixes: f844cdb997 ("ui/cocoa: capture all keys and combos when mouse is grabbed")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These are now redundant with the scr2 and old_scr2 fields in NeXTPC. Rename
the function from nextscr2_write() to next_scr2_rtc_update() to better
reflect its purpose. At the same time replace the manual bit manipulation with
the extract32() and deposit32() functions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Move the old_scr2 variable to NeXTPC so that the old SCR2 register state is
stored along with the current SCR2 state.
Since the SCR2 register is 32-bits wide, convert old_scr2 to uint32_t and
update the SCR2 register access code to allow unaligned writes.
Note that this is a migration break, but as nothing will currently boot then
we do not need to worry about this now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The phase variable represents part of the state machine used to clock data out
of the NextRtc device.
Note that this is a migration break for the NeXTRtc struct, but as nothing will
currently boot then we simply bump the migration version for now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Rename dma_ops to next_dma_ops and the read/write functions to next_dma_read()
and next_dma_write() respectively, mark next_dma_ops as DEVICE_BIG_ENDIAN and
also improve the consistency of the val variable in next_dma_read() and
next_dma_write().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The old QEMU memory accessors used in the original NextCube patch series had
separate functions for 1, 2 and 4 byte accessors. When the series was finally
merged a simple wrapper function was written to dispatch the memory accesses
using the original functions.
Convert scr_ops to use the memory API directly renaming it to next_scr_ops,
marking it as DEVICE_BIG_ENDIAN, and handling any unaligned accesses.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The old QEMU memory accessors used in the original NextCube patch series had
separate functions for 1, 2 and 4 byte accessors. When the series was finally
merged a simple wrapper function was written to dispatch the memory accesses
using the original functions.
Convert mmio_ops to use the memory API directly renaming it to next_mmio_ops,
marking it as DEVICE_BIG_ENDIAN, and handling any unaligned accesses.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Normally a DMA FLUSH command is used to ensure that data is completely written
to the device and/or memory, so remove the pulse of the SCSI DMA IRQ if a DMA
FLUSH command is received. This enables the NeXT ROM monitor to start to load
from a SCSI disk.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Commit c2118e9e1a ("configure: don't try a "native" cross for linux-user",
2023-11-23) sought to avoid issues with using the native compiler with a
cross-endian or cross-bitness setup. However, in doing so it ended up
requiring a cross compiler setup (and most likely a slow compiler setup)
even when building TCG tests that are native to the host architecture.
Always allow the host compiler in that case.
Cc: qemu-stable@nongnu.org
Fixes: c2118e9e1a ("configure: don't try a "native" cross for linux-user", 2023-11-23)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pull-loongarch-20231221
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZYPyvQAKCRBAov/yOSY+
# 38/vBADT3b+Wo/2AeXOO3OXOM1VBhIvzDjY1OWytuJpkF3JGW45cMLqgtIgMj8h7
# NtzRS3JbFYbYuxITeeo1Ppl6dAD0pCZjIU6OCBxAJ6ADPsE/xD8nYWrMGqYVXg7E
# hN0Cno2sf6dmJ0QxUxn7G+cUuvNtnGaDSZE+RAkjtzq1nvx7CQ==
# =mte5
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 21 Dec 2023 03:09:33 EST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20231221' of https://gitlab.com/gaosong/qemu:
target/loongarch: Add timer information dump support
hw/loongarch/virt: Align high memory base address with super page size
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add the iothread-vq-mapping parameter to assign virtqueues to IOThreads.
Store the vq:AioContext mapping in the new struct
VirtIOBlockDataPlane->vq_aio_context[] field and refactor the code to
use the per-vq AioContext instead of the BlockDriverState's AioContext.
Reimplement --device virtio-blk-pci,iothread= and non-IOThread mode by
assigning all virtqueues to the IOThread and main loop's AioContext in
vq_aio_context[], respectively.
The comment in struct VirtIOBlockDataPlane about EventNotifiers is
stale. Remove it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio-blk and virtio-scsi devices will need a way to specify the
mapping between IOThreads and virtqueues. At the moment all virtqueues
are assigned to a single IOThread or the main loop. This single thread
can be a CPU bottleneck, so it is necessary to allow finer-grained
assignment to spread the load.
Introduce DEFINE_PROP_IOTHREAD_VQ_MAPPING_LIST() so devices can take a
parameter that maps virtqueues to IOThreads. The command-line syntax for
this new property is as follows:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]}'
IOThreads are specified by name and virtqueues are specified by 0-based
index.
It will be common to simply assign virtqueues round-robin across a set
of IOThreads. A convenient syntax that does not require specifying
individual virtqueue indices is available:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]}'
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qdev_alias_all_properties() aliases a DeviceState's qdev properties onto
an Object. This is used for VirtioPCIProxy types so that --device
virtio-blk-pci has properties of its embedded --device virtio-blk-device
object.
Currently this function is implemented using qdev properties. Change the
function to use QOM object class properties instead. This works because
qdev properties create QOM object class properties, but it also catches
any QOM object class-only properties that have no qdev properties.
This change ensures that properties of devices are shown with --device
foo,\? even if they are QOM object class properties.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
StringOutputVisitor crashes when it visits a struct because
->start_struct() is NULL.
Show "<omitted>" instead of crashing. This is necessary because the
virtio-blk-pci iothread-vq-mapping parameter that I'd like to introduce
soon is a list of IOThreadMapping structs.
This patch is a quick fix to solve the crash, but the long-term solution
is replacing StringOutputVisitor with something that can handle the full
gamut of values in QEMU.
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231212134934.500289-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use qemu_get_current_aio_context() in mixed wrappers and coroutine
wrappers so that code runs in the caller's AioContext instead of moving
to the BlockDriverState's AioContext. This change is necessary for the
multi-queue block layer where any thread can call into the block layer.
Most wrappers are IO_CODE where it's safe to use the current AioContext
nowadays. BlockDrivers and the core block layer use their own locks and
no longer depend on the AioContext lock for thread-safety.
The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
AioContext is safe because this code is only called with the BQL held
from the main loop thread.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230912231037.826804-6-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The AioContext lock no longer exists.
There is one noteworthy change:
- * More specifically, these functions use BDRV_POLL_WHILE(bs), which
- * requires the caller to be either in the main thread and hold
- * the BlockdriverState (bs) AioContext lock, or directly in the
- * home thread that runs the bs AioContext. Calling them from
- * another thread in another AioContext would cause deadlocks.
+ * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires
+ * the caller to be either in the main thread or directly in the home thread
+ * that runs the bs AioContext. Calling them from another thread in another
+ * AioContext would cause deadlocks.
I am not sure whether deadlocks are still possible. Maybe they have just
moved to the fine-grained locks that have replaced the AioContext. Since
I am not sure if the deadlocks are gone, I have kept the substance
unchanged and just removed mention of the AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-15-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The SCSI subsystem no longer uses the AioContext lock. Request
processing runs exclusively in the BlockBackend's AioContext since
"scsi: only access SCSIDevice->requests from one thread" and hence the
lock is unnecessary.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-13-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Delete these functions because nothing calls these functions anymore.
I introduced these APIs in commit 98563fc3ec ("aio: add
aio_context_acquire() and aio_context_release()") in 2014. It's with a
sigh of relief that I delete these APIs almost 10 years later.
Thanks to Paolo Bonzini's vision for multi-queue QEMU, we got an
understanding of where the code needed to go in order to remove the
limitations that the original dataplane and the IOThread/AioContext
approach that followed it.
Emanuele Giuseppe Esposito had the splendid determination to convert
large parts of the codebase so that they no longer needed the AioContext
lock. This was a painstaking process, both in the actual code changes
required and the iterations of code review that Emanuele eked out of
Kevin and me over many months.
Kevin Wolf tackled multitudes of graph locking conversions to protect
in-flight I/O from run-time changes to the block graph as well as the
clang Thread Safety Analysis annotations that allow the compiler to
check whether the graph lock is being used correctly.
And me, well, I'm just here to add some pizzazz to the QEMU multi-queue
block layer :). Thank you to everyone who helped with this effort,
including Eric Blake, code reviewer extraordinaire, and others who I've
forgotten to mention.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-11-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is the big patch that removes
aio_context_acquire()/aio_context_release() from the block layer and
affected block layer users.
There isn't a clean way to split this patch and the reviewers are likely
the same group of people, so I decided to do it in one patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-ID: <20231205182011.1976568-7-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Stop acquiring/releasing the AioContext lock in
bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any
effect.
The distinction between bdrv_graph_wrunlock() and
bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed
into one function.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231205182011.1976568-6-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
aio_context_acquire()/aio_context_release() has been replaced by
fine-grained locking to protect state shared by multiple threads. The
AioContext lock still plays the role of balancing locking in
AIO_WAIT_WHILE() and many functions in QEMU either require that the
AioContext lock is held or not held for this reason. In other words, the
AioContext lock is purely there for consistency with itself and serves
no real purpose anymore.
Stop actually acquiring/releasing the lock in
aio_context_acquire()/aio_context_release() so that subsequent patches
can remove callers across the codebase incrementally.
I have performed "make check" and qemu-iotests stress tests across
x86-64, ppc64le, and aarch64 to confirm that there are no failures as a
result of eliminating the lock.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231205182011.1976568-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since the removal of AioContext locking, the correctness of the code
relies on running requests from a single AioContext at any given time.
Add assertions that verify that callbacks are invoked in the correct
AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231205182011.1976568-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Protect the Task Management Function BH state with a lock. The TMF BH
runs in the main loop thread. An IOThread might process a TMF at the
same time as the TMF BH is running. Therefore tmf_bh_list and tmf_bh
must be protected by a lock.
Run TMF request completion in the IOThread using aio_wait_bh_oneshot().
This avoids more locking to protect the virtqueue and SCSI layer state.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231205182011.1976568-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit abfcd2760b ("dma-helpers: prevent dma_blk_cb() vs
dma_aio_cancel() race") acquired the AioContext lock inside dma_blk_cb()
to avoid a race with scsi_device_purge_requests() running in the main
loop thread.
The SCSI code no longer calls dma_aio_cancel() from the main loop thread
while I/O is running in the IOThread AioContext. Therefore it is no
longer necessary to take this lock to protect DMAAIOCB fields. The
->cb() function also does not require the lock because blk_aio_*() and
friends do not need the AioContext lock.
Both hw/ide/core.c and hw/ide/macio.c also call dma_blk_io() but don't
rely on it taking the AioContext lock, so this change is safe.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231204164259.1515217-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_queue_aio_attach_host_notifier() does not require the AioContext
lock. Stop taking the lock and add an explicit smp_wmb() because we were
relying on the implicit barrier in the AioContext lock before.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231204164259.1515217-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Stop depending on the AioContext lock and instead access
SCSIDevice->requests from only one thread at a time:
- When the VM is running only the BlockBackend's AioContext may access
the requests list.
- When the VM is stopped only the main loop may access the requests
list.
These constraints protect the requests list without the need for locking
in the I/O code path.
Note that multiple IOThreads are not supported yet because the code
assumes all SCSIRequests are executed from a single AioContext. Leave
that as future work.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231204164259.1515217-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We have a few test cases that include tests for corner case aspects of
internal snapshots, but nothing that tests that they actually function
as snapshots or that involves deleting a snapshot. Add a test for this
kind of basic internal snapshot functionality.
The error cases include a regression test for the crash we just fixed
with snapshot operations on inactive images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, the conflict between -incoming and -loadvm is only detected
when loading the snapshot fails because the image is still inactive for
the incoming migration. This results in a suboptimal error message:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots
Catch the situation already in qemu_validate_options() to improve the
message:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: 'incoming' and 'loadvm' options are mutually exclusive
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_is_read_only() only checks if the node is configured to be
read-only eventually, but even if it returns false, writing to the node
may not be permitted at the moment (because it's inactive).
bdrv_is_writable() checks that the node can be written to right now, and
this is what the snapshot operations really need.
Change bdrv_can_snapshot() to use bdrv_is_writable() to fix crashes like
the following:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: ../block/io.c:1990: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
The resulting error message after this patch isn't perfect yet, but at
least it doesn't crash any more:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The file-posix block driver currently only sets up Linux AIO and
io_uring in the BDS's AioContext. In the multi-queue block layer we must
be able to submit I/O requests in AioContexts that do not have Linux AIO
and io_uring set up yet since any thread can call into the block driver.
Set up Linux AIO and io_uring for the current AioContext during request
submission. We lose the ability to return an error from
.bdrv_file_open() when Linux AIO and io_uring setup fails (e.g. due to
resource limits). Instead the user only gets warnings and we fall back
to aio=threads. This is still better than a fatal error after startup.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230914140101.1065008-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
NBDClient has a number of fields that are accessed by both the export
AioContext and the main loop thread. When the AioContext lock is removed
these fields will need another form of protection.
Add NBDClient->lock and protect fields that are accessed by both
threads. Also add assertions where possible and otherwise add doc
comments stating assumptions about which thread and lock holding.
Note this patch moves the client->recv_coroutine assertion from
nbd_co_receive_request() to nbd_trip() where client->lock is held.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231221192452.1785567-7-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The NBD clients list is currently accessed from both the export
AioContext and the main loop thread. When the AioContext lock is removed
there will be nothing protecting the clients list.
Adding a lock around the clients list is tricky because NBDClient
structs are refcounted and may be freed from the export AioContext or
the main loop thread. nbd_export_request_shutdown() -> client_close() ->
nbd_client_put() is also tricky because the list lock would be held
while indirectly dropping references to NDBClients.
A simpler approach is to only allow nbd_client_put() and client_close()
calls from the main loop thread. Then the NBD clients list is only
accessed from the main loop thread and no fancy locking is needed.
nbd_trip() just needs to reschedule itself in the main loop AioContext
before calling nbd_client_put() and client_close(). This costs more CPU
cycles per NBD request so add nbd_client_put_nonzero() to optimize the
common case where more references to NBDClient remain.
Note that nbd_client_get() can still be called from either thread, so
make NBDClient->refcount atomic.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231221192452.1785567-6-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nbd_trip() processes a single NBD request from start to finish and holds
an NBDClient reference throughout. NBDRequest does not outlive the scope
of nbd_trip(). Therefore it is unnecessary to ref/unref NBDClient for
each NBDRequest.
Removing these nbd_client_get()/nbd_client_put() calls will make
thread-safety easier in the commits that follow.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20231221192452.1785567-5-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
With LoongArch virt machine, there is low memory space with region
0--0x10000000, and high memory space with started from 0x90000000.
High memory space is aligned with 256M, it will be better if it is
aligned with 1G, which is super page aligned for 4K page size.
Currently linux kernel and uefi bios has no limitation with high
memory base address, it is ok to set high memory base address
with 0x80000000.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20231127040231.4123715-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
* Add compat machines for QEMU 9.0
* Some header clean-ups by Philippe
* Restrict type names to alphanumerical range (and a few special characters)
* Fix analyze-migration.py script on s390x
* Clean up and improve some tests
* Document handling of commas in CLI options parameters
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmWCtYsRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWLnw//cNJrxG0V+j0iakX+C7HRumVrLBDI4KYY
# Cp2Hx92SyeQ0Kk8DJS6JueTV0SLjMsV77APu2YPH7ELmPlk+CB9gqmV7xVoYNvsm
# QbRPlIjFw8MHLekadc2A+C+pn48tWACoOdBEDIfazKrxybnf0B57RC/fIfMKHjbs
# 2ALCoFbbgphs7yWuzTHK8ayKaGMhUVkWfzHQwpnq899olHyZBhkl951uKJA6VmLx
# KvggePkpszLjmmXA8MH1hDCcizki31cB0ZKTbQFCyE42s2S3Hvg0GueU90O7Y1cj
# lS5tPVQxyEhUYMLL+/hudlf2OYqVn2BalB7ieUQIy6rG8yoc9zxfIKQi0ccl+2oA
# s8HRq5S0bSjtilQogU1LQL/Gk6W1/N9MmnhKvCGB+BTK5KX7s4EQk02y9gGZm/8s
# pMErMyaXTG4dLiTAK42VgMVDqCYvzBmE+Gj91OmoUR7fb+VMrsWxeBFxMPDn+VtL
# TMJegIFsjw2QCSitcU4v+nP0qtKgXGbuZtrGXKabrxH5PmeQFJDSM7TwpTK4qvjK
# QMIQKBbz8BfJnUzN8qAaaJEpp1T5tcMJClKtfcgxq/+VyaSaHLmD0cljqBC+g+y7
# FTo+fa7oYx44sAlqapdEXBSGn4T+J26iuCef13CCCiPfYBv/tk3b2E0AWHj4y58I
# +VpInjUaPBQ=
# =TA1/
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 20 Dec 2023 04:36:11 EST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-12-20' of https://gitlab.com/thuth/qemu:
tests/unit/test-qmp-event: Replace fixture by global variables
tests/unit/test-qmp-event: Simplify event emission check
tests/unit/test-qmp-event: Drop superfluous mutex
tests/qtest/npcm7xx_pwm-test: Only do full testing in slow mode
qemu-options: Clarify handling of commas in options parameters
tests/qtest/migration-test: Fix analyze-migration.py for s390x
qom/object: Limit type names to alphanumerical and some few special characters
tests/unit/test-io-task: Rename "qemu:dummy" to avoid colon in the name
memory: Remove "qemu:" prefix from the "qemu:ram-discard-manager" type name
hw: Replace anti-social QOM type names (again)
docs/system/arm: Fix for rename of type "xlnx.bbram-ctrl"
target: Restrict 'sysemu/reset.h' to system emulation
hw/s390x/ipl: Remove unused 'exec/exec-all.h' included header
hw/misc/mips_itu: Remove unnecessary 'exec/exec-all.h' header
hw/ppc/spapr_hcall: Remove unused 'exec/exec-all.h' included header
system/qtest: Restrict QTest API to system emulation
system/qtest: Include missing 'hw/core/cpu.h' header
MAINTAINERS: Add some more vmware-related files to the corresponding section
hw: Add compat machines for 9.0
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
target-arm queue:
* arm/kvm: drop the split between "common KVM support" and
"64-bit KVM support", since 32-bit Arm KVM no longer exists
* arm/kvm: clean up APIs to be consistent about CPU arguments
* Don't implement *32_EL2 registers when EL1 is AArch64 only
* Restrict DC CVAP & DC CVADP instructions to TCG accel
* Restrict TCG specific helpers
* Propagate MDCR_EL2.HPMN into PMCR_EL0.N
* Include missing 'exec/exec-all.h' header
* fsl-imx: add simple RTC emulation for i.MX6 and i.MX7 boards
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmWB6o0ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3mxMEACRpRxJ81pLs8fFYC5BgRhU
# BCxr+ZqarBygzsH9YWUN2TFFKlEZi7mLu6lzFsfN/qEmYCg8VslPbulQHqcGkx51
# kVxXFp/KuGlKt4zGRagZUJxgYAwwU5mnK6dTZT5/ZF6yWX67dXn8V7MP9lqqEPw5
# 5gut7Mu4f7MiAQbwZY1CWP+iu5uZmdsBuKxA6zkxOWJh/A1SfaqQRO6xVQttLAxS
# DPMTpQGmwPS4I+3gGNnqlSu6etp2tdy2K0cW3fhMp6hx70uNMHmFNzRhT/6TaKka
# 9AqXQsFHQiFXDGAm6PmCvfQI6KpLljDyNL/TuUkQWi72bGEHjUsJAdG0aXVOa30W
# uC7vuJkdZrP/t5P1AkZhWQUrlawDRV2YHNDD+gY4fxJL/STkGyU6M8R1nm1J+InN
# n0SeK0VHRC6DRPXCMQhC5QwKUH6ZjFZRs/r2opTu9p+ThQAQRmZBiVfdISCDMYnN
# DCiSb78gIFaUkwtiP44qq8MJQjsHnXtTD1Akqyo2fXSKs66jDK9Gnc8gENYdpghe
# 7V36bOp6scROHOB2a/r8gT42RKzSN6uh6xByaaToza63/bPgvHnn8vvQQbB01AgX
# zJC1xs3dwY8JMyqDefda0K0NDPS8TzNsXYmgxxxcQJpUvB4VVjet9VIMF3T+d8HO
# Pas41Z1gsQY+rcaRk/9mPA==
# =GWIA
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Dec 2023 14:10:05 EST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* tag 'pull-target-arm-20231219' of https://git.linaro.org/people/pmaydell/qemu-arm: (43 commits)
fsl-imx: add simple RTC emulation for i.MX6 and i.MX7 boards
target/arm/helper: Propagate MDCR_EL2.HPMN into PMCR_EL0.N
target/arm/tcg: Including missing 'exec/exec-all.h' header
target/arm: Restrict DC CVAP & DC CVADP instructions to TCG accel
target/arm: Restrict TCG specific helpers
target/arm: Don't implement *32_EL2 registers when EL1 is AArch64 only
target/arm/kvm: Have kvm_arm_hw_debug_active take a ARMCPU argument
target/arm/kvm: Have kvm_arm_handle_debug take a ARMCPU argument
target/arm/kvm: Have kvm_arm_handle_dabt_nisv take a ARMCPU argument
target/arm/kvm: Have kvm_arm_verify_ext_dabt_pending take a ARMCPU arg
target/arm/kvm: Have kvm_arm_[get|put]_virtual_time take ARMCPU argument
target/arm/kvm: Have kvm_arm_vcpu_finalize take a ARMCPU argument
target/arm/kvm: Have kvm_arm_vcpu_init take a ARMCPU argument
target/arm/kvm: Have kvm_arm_pmu_set_irq take a ARMCPU argument
target/arm/kvm: Have kvm_arm_pmu_init take a ARMCPU argument
target/arm/kvm: Have kvm_arm_pvtime_init take a ARMCPU argument
target/arm/kvm: Have kvm_arm_set_device_attr take a ARMCPU argument
target/arm/kvm: Have kvm_arm_sve_get_vls take a ARMCPU argument
target/arm/kvm: Have kvm_arm_sve_set_vls take a ARMCPU argument
target/arm/kvm: Have kvm_arm_add_vcpu_properties take a ARMCPU argument
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
vfio queue:
* Introduce an IOMMU interface backend for VFIO devices
* Convert IOMMU type1 and sPAPR IOMMU to respective backends
* Introduce a new IOMMUFD backend for ARM, x86_64 and s390x platforms
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmWB34AACgkQUaNDx8/7
# 7KGOMxAAqXegvAneHqIlu4c8TzTuUR2rkYgev9RdfIHRDuY2XtaX14xlWn/rpTXZ
# qSgeta+iT8Cv4YV1POJeHWFDNs9E29p1w+R7nLcH1qTIIaZHtxwbVVQ3s7kAo1Vb
# 1S1G0/zIznzGVI50a0lj1gO2yQJnu/79nXpnICgA5REW0CscMssnvboQODlwq17V
# ZLNVM8CSAvKl6ppkmzRdfNXCfq6x7bf4MsvnuXsqda4TBbvyyTjAqdo/8sjKiGly
# gSDQqhgy6cvEXIF0UUHPJzFApf0YdXUDlL8hzH90hvRVu4W/t24dPmT7UkVIX9Ek
# TA7RVxv7iJlHtFDqfSTAJFr7nKO9Tm2V9N7xbD1OJUKrMoPZRT6+0R1hMKqsZ5z+
# nG6khqHGzuo/aI9n70YxYIPXt+vs/EHI4WUtslGLUTL0xv8lUzk6cxyIJupFRmDS
# ix6GM9TXOV8RyOveL2knHVymlFnAR6dekkMB+6ljUTuzDwG0oco4vno8z9bi7Vct
# j36bM56U3lhY+w+Ljoy0gPwgrw/FROnGG3mp1mwp1KRHqtEDnUQu8CaLbJOBsBGE
# JJDP6AKAYMczdmYVkd4CvE0WaeSxtOUxW5H5NCPjtaFQt0qEcght2lA2K15g521q
# jeojoJ/QK5949jnNCqm1Z66/YQVL79lPyL0E+mxEohwu+yTORk4=
# =U0x5
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Dec 2023 13:22:56 EST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20231219' of https://github.com/legoater/qemu: (47 commits)
hw/ppc/Kconfig: Imply VFIO_PCI
docs/devel: Add VFIO iommufd backend documentation
vfio: Introduce a helper function to initialize VFIODevice
vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init
vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init
vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init
vfio/pci: Move VFIODevice initializations in vfio_instance_init
hw/i386: Activate IOMMUFD for q35 machines
kconfig: Activate IOMMUFD for s390x machines
hw/arm: Activate IOMMUFD for virt machines
vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacks
vfio/ccw: Make vfio cdev pre-openable by passing a file handle
vfio/ccw: Allow the selection of a given iommu backend
vfio/ap: Make vfio cdev pre-openable by passing a file handle
vfio/ap: Allow the selection of a given iommu backend
vfio/platform: Make vfio cdev pre-openable by passing a file handle
vfio/platform: Allow the selection of a given iommu backend
vfio/pci: Make vfio cdev pre-openable by passing a file handle
vfio/pci: Allow the selection of a given iommu backend
vfio/iommufd: Enable pci hot reset through iommufd cdev interface
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The generated qapi_event_send_FOO() call an event emitter function.
It's test_qapi_event_emit() in this test. It compares the actual
event to the expected event, and sets a flag to record it was called.
The test functions set expected data and clear the flag before calling
qapi_event_send_FOO(), and check the flag afterwards.
Make test_qapi_event_emit() consume expected data, and the test
functions check it was consumed. Delete the flag. This is simpler.
It also catches extraneous calls of test_qapi_event_emit(). Catching
that is not worthwhile, but since the cost is negative...
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231122072456.2518816-3-armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mutex @test_event_lock is held from fixture setup to teardown,
protecting global variable @test_event_data. But tests always run one
after the other, so this is superfluous. It also confuses Coverity.
Drop the mutex.
Fixes: CID 1527425
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231122072456.2518816-2-armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The npcm7xx_pwm-test can take quite a while when running with
--enable-debug on a loaded system. The tests here are quite
repetitive - by default it should be fine if we only execute
some of them and only execute all when running in slow testing mode.
Message-ID: <20231215143524.49241-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The migration stream on s390x contains data for the storage_attributes
which the analyze-migration.py cannot handle yet. Add the basic code
for handling this, so we can re-enable the check in the migration-test.
Message-ID: <20231120113951.162090-1-thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QOM names currently don't have any enforced naming rules. This
can be problematic, e.g. when they are used on the command line
for the "-device" option (where the comma is used to separate
properties). To avoid that such problematic type names come in
again, let's restrict the set of acceptable characters during the
type registration.
Ideally, we'd apply here the same rules as for QAPI, i.e. all type
names should begin with a letter, and contain only ASCII letters,
digits, hyphen, and underscore. However, we already have so many
pre-existing types like:
486-x86_64-cpu
cfi.pflash01
power5+_v2.1-spapr-cpu-core
virt-2.6-machine
pc-i440fx-3.0-machine
... so that we have to allow "." and "+" for now, too. While the
dot is used in a lot of places, the "+" can fortunately be limited
to two classes of legacy names ("power" and "Sun-UltraSparc" CPUs).
We also cannot enforce the rule that names must start with a letter
yet, since there are lot of types that start with a digit. Still,
at least limiting the first characters to the alphanumerical range
should be way better than nothing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231117114457.177308-6-thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Type names should not contain special characters like ":" (so that
they are easier to use with QAPI and other parts). We are going to
forbid such names in an upcoming patch. Thus let's replace the ":"
here with a "-".
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231117114457.177308-5-thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Type names should not contain special characters like ":". Let's
remove the whole prefix here since it does not really seem to be
helpful to have such a prefix here. The type name is only used
internally for an interface type, so the renaming should not affect
the user interface or migration.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231117114457.177308-4-thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QOM type names containing ',' result in awful UI. We got rid of them
in v6.0.0 (commit e178113ff6 hw: Replace anti-social QOM type names).
A few have crept back since:
xlnx,cframe-reg
xlnx,efuse
xlnx,pmc-efuse-cache
xlnx,versal-cfu-apb
xlnx,versal-cfu-fdro
xlnx,versal-cfu-sfr
xlnx,versal-crl
xlnx,versal-efuse
xlnx,zynqmp-efuse
These are all device types. They can't be plugged with -device /
device_add, except for "xlnx,efuse" (I'm not sure that one is
intentional).
They *can* be used with -device / device_add to request help.
Usability is poor, though: you have to double the comma, like this:
$ qemu-system-aarch64 -device xlnx,,pmc-efuse-cache,help
They can also be used with -global, where you must *not* double the
comma:
$ qemu-system-aarch64 -global xlnx,efuse.drive-index=2
Trap for the unwary.
"xlnx,efuse", "xlnx,versal-efuse", "xlnx,pmc-efuse-cache",
"xlnx-zynqmp-efuse" are from v6.2.0, "xlnx,versal-crl" is from v7.1.0,
and the remainder are new.
Rename them all to "xlnx-FOO", like commit e178113ff6 did.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Message-ID: <20231117114457.177308-3-thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
"hw/core/cpu.h" declares 'first_cpu'. Include it to avoid
when unrelated headers are refactored:
system/qtest.c:548:33: error: use of undeclared identifier 'first_cpu'
address_space_write(first_cpu->as, addr, MEMTXATTRS_UNSPECIFIED,
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231212113016.29808-2-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When the legacy and iommufd backends were introduced, a set of common
vfio-pci routines were exported in pci.c for both backends to use :
vfio_pci_pre_reset
vfio_pci_get_pci_hot_reset_info
vfio_pci_host_match
vfio_pci_post_reset
This introduced a build failure on PPC when --without-default-devices
is use because VFIO is always selected in ppc/Kconfig but VFIO_PCI is
not.
Use an 'imply VFIO_PCI' in ppc/Kconfig and bypass compilation of the
VFIO EEH hooks routines defined in hw/ppc/spapr_pci_vfio.c with
CONFIG_VFIO_PCI.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce a helper function to replace the common code to initialize
VFIODevice in pci, platform, ap and ccw VFIO device.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some of the VFIODevice initializations is in vfio_ccw_realize,
move all of them in vfio_ccw_instance_init.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some of the VFIODevice initializations is in vfio_ap_realize,
move all of them in vfio_ap_instance_init.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some of the VFIODevice initializations is in vfio_platform_realize,
move all of them in vfio_platform_instance_init.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some of the VFIODevice initializations is in vfio_realize,
move all of them in vfio_instance_init.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some of the callbacks in VFIOIOMMUOps pass VFIOContainerBase poiner,
those callbacks only need read access to the sub object of VFIOContainerBase.
So make VFIOContainerBase, VFIOContainer and VFIOIOMMUFDContainer as const
in these callbacks.
Local functions called by those callbacks also need same changes to avoid
build error.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-ccw device:
If the user wants to use the legacy backend, it shall not
link the vfio-ccw device with any iommufd object:
-device vfio-ccw,sysfsdev=/sys/bus/mdev/devices/XXX
This is called the legacy mode/backend.
If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-ccw device options:
-object iommufd,id=iommufd0
-device vfio-ccw,sysfsdev=/sys/bus/mdev/devices/XXX,iommufd=iommufd0
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-ap device:
if the user wants to use the legacy backend, it shall not
link the vfio-ap device with any iommufd object:
-device vfio-ap,sysfsdev=/sys/bus/mdev/devices/XXX
This is called the legacy mode/backend.
If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-ap device options:
-object iommufd,id=iommufd0
-device vfio-ap,sysfsdev=/sys/bus/mdev/devices/XXX,iommufd=iommufd0
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-platform device:
If the user wants to use the legacy backend, it shall not
link the vfio-platform device with any iommufd object:
-device vfio-platform,host=XXX
This is called the legacy mode/backend.
If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-platform device options:
-object iommufd,id=iommufd0
-device vfio-platform,host=XXX,iommufd=iommufd0
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.
Together with the earlier support of pre-opening /dev/iommu device,
now we have full support of passing a vfio device to unprivileged
qemu by management tool. This mode is no more considered for the
legacy backend. So let's remove the "TODO" comment.
Add helper functions vfio_device_set_fd() and vfio_device_get_name()
to set fd and get device name, they will also be used by other vfio
devices.
There is no easy way to check if a device is mdev with FD passing,
so fail the x-balloon-allowed check unconditionally in this case.
There is also no easy way to get BDF as name with FD passing, so
we fake a name by VFIO_FD[fd].
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-pci device:
If the user wants to use the legacy backend, it shall not
link the vfio-pci device with any iommufd object:
-device vfio-pci,host=0000:02:00.0
This is called the legacy mode/backend.
If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-pci device options:
-object iommufd,id=iommufd0
-device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Legacy vfio pci and iommufd cdev have different process to hot reset
vfio device, expand current code to abstract out pci_hot_reset callback
for legacy vfio, this same interface will also be used by iommufd
cdev vfio device.
Rename vfio_pci_hot_reset to vfio_legacy_pci_hot_reset and move it
into container.c.
vfio_pci_[pre/post]_reset and vfio_pci_host_match are exported so
they could be called in legacy and iommufd pci_hot_reset callback.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some vIOMMU such as virtio-iommu use IOVA ranges from host side to
setup reserved ranges for passthrough device, so that guest will not
use an IOVA range beyond host support.
Use an uAPI of IOMMUFD to get IOVA ranges of host side and pass to
vIOMMU just like the legacy backend, if this fails, fallback to
64bit IOVA range.
Also use out_iova_alignment returned from uAPI as pgsizes instead of
qemu_real_host_page_size() as a fallback.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Currently iommufd doesn't support dirty page sync yet,
but it will not block us doing live migration if VFIO
migration is force enabled.
So in this case we allow set_dirty_page_tracking to be NULL.
Note we don't need same change for query_dirty_bitmap because
when dirty page sync isn't supported, query_dirty_bitmap will
never be called.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The iommufd backend is implemented based on the new /dev/iommu user API.
This backend obviously depends on CONFIG_IOMMUFD.
So far, the iommufd backend doesn't support dirty page sync yet.
Co-authored-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is a trivial optimization. If there is active container in space,
vfio_reset_handler will never be unregistered. So revert the check of
space->containers and return early.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce an iommufd object which allows the interaction
with the host /dev/iommu device.
The /dev/iommu can have been already pre-opened outside of qemu,
in which case the fd can be passed directly along with the
iommufd object:
This allows the iommufd object to be shared accross several
subsystems (VFIO, VDPA, ...). For example, libvirt would open
the /dev/iommu once.
If no fd is passed along with the iommufd object, the /dev/iommu
is opened by the qemu code.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce an empty spapr backend which will hold spapr specific
content, currently only prereg_listener and hostwin_list.
Also introduce two spapr specific callbacks add/del_window into
VFIOIOMMUOps. Instantiate a spapr ops with a helper setup_spapr_ops
and assign it to bcontainer->ops.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Meanwhile remove the helper function vfio_free_container as it
only calls g_free now.
No functional change intended.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
In the prospect to get rid of VFIOContainer refs
in common.c lets convert misc functions to use the base
container object instead:
vfio_devices_all_dirty_tracking
vfio_devices_all_device_dirty_tracking
vfio_devices_all_running_and_mig_active
vfio_devices_query_dirty_bitmap
vfio_get_dirty_bitmap
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIO Device is also changed to point to base container instead of
legacy container.
No functional change intended.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This adds two helper functions vfio_container_init/destroy which will be
used by both legacy and iommufd containers to do base container specific
initialization and release.
No functional change intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This empty VFIOIOMMUOps named vfio_legacy_ops will hold all general
IOMMU ops of legacy container.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce a dumb VFIOContainerBase object and its targeted interface.
This is willingly not a QOM object because we don't want it to be
visible from the user interface. The VFIOContainerBase will be
smoothly populated in subsequent patches as well as interfaces.
No functional change intended.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The system registers DBGVCR32_EL2, FPEXC32_EL2, DACR32_EL2 and
IFSR32_EL2 are present only to allow an AArch64 EL2 or EL3 to read
and write the contents of an AArch32-only system register. The
architecture requires that they are present only when EL1 can be
AArch32, but we implement them unconditionally. This was OK when all
our CPUs supported AArch32 EL1, but we have quite a lot of CPU models
now which only support AArch64 at EL1:
a64fx
cortex-a76
cortex-a710
neoverse-n1
neoverse-n2
neoverse-v1
Only define these registers for CPUs which allow AArch32 EL1.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231121144605.3980419-1-peter.maydell@linaro.org
Drop fprintfs and actually use the return values in the callers.
This is OK to do since commit 7191f24c7f which added the
error-check to the generic accel/kvm functions that eventually
call into these ones.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMM: tweak commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since kvm32.c was removed, there is no need to keep them separate.
This will allow more symbols to be unexported.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMM: retain copyright lines from kvm64.c in kvm.c]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap.
Failure to do so results in incorrect memory exceptions to the guest.
Before 732d548732, this was implicitly done via truncation to
target_ulong but only in qemu-system-i386, not qemu-system-x86_64.
To fix this, we must add conditional zero-extensions.
Since we have to test for 32 vs 64-bit anyway, note that cs_base
is always zero in 64-bit mode.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>
I noticed the code blocks where not rendering properly so thought I'd
better fix things up. So:
- Use better title for the machine type
- Explain why Xen is a little different
- Add a proper anchor to the tpm-device link
- add newline so code block properly renders
- add some indentation to make continuation clearer
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231207130623.360473-1-alex.bennee@linaro.org>
GUEST_VIRTIO_MMIO_* was added in Xen 4.17, so only define them
for CONFIG_XEN_CTRL_INTERFACE_VERSIONs up to 4.16.
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A misspelled condition in xen_native.h is hiding a bug in the enablement of
Xen for qemu-system-aarch64. The bug becomes apparent when building for
Xen 4.18.
While the i386 emulator provides the xenpv machine type for multiple architectures,
and therefore can be compiled with Xen enabled even when the host is Arm, the
opposite is not true: qemu-system-aarch64 can only be compiled with Xen support
enabled when the host is Arm.
Expand the computation of accelerator_targets['CONFIG_XEN'] similar to what is
already there for KVM.
Cc: Stefano Stabellini <stefano.stabellini@amd.com>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Fixes: 0c8ab1cddd ("xen_arm: Create virtio-mmio devices during initialization", 2023-08-30)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 7191f24c7f ("accel/kvm/kvm-all: Handle register access errors")
added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it
exposed a long-running bug in current KVM support for SEV-ES where the
kernel assumes that MSR_EFER_LMA will be set explicitly by the guest
kernel, in which case EFER write traps would result in KVM eventually
seeing MSR_EFER_LMA get set and recording it in such a way that it would
be subsequently visible when accessing it via KVM_GET_SREGS/etc.
However, guest kernels currently rely on MSR_EFER_LMA getting set
automatically when MSR_EFER_LME is set and paging is enabled via
CR0_PG_MASK. As a result, the EFER write traps don't actually expose the
MSR_EFER_LMA bit, even though it is set internally, and when QEMU
subsequently tries to pass this EFER value back to KVM via
KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL,
which is now considered fatal due to the aforementioned QEMU commit.
This can be addressed by inferring the MSR_EFER_LMA bit being set when
paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure
the expected bits are all present in subsequent handling on the host
side.
Ultimately, this handling will be implemented in the host kernel, but to
avoid breaking QEMU's SEV-ES support when using older host kernels, the
same handling can be done in QEMU just after fetching the register
values via KVM_GET_SREGS*. Implement that here.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Lara Lazier <laramglazier@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: <kvm@vger.kernel.org>
Fixes: 7191f24c7f ("accel/kvm/kvm-all: Handle register access errors")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231206155821.1194551-1-michael.roth@amd.com>
ufs fixes for 8.2
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEUBfYMVl8eKPZB+73EuIgTA5dtgIFAmVurjcACgkQEuIgTA5d
# tgLWVBAAkzus4nN2+Z0H23VUmeBPCLPFXRSkK8mOWC3ymbX3kiy/IjgM7Ept6QWA
# btssTf3YEeDtycgbrb5GZ4kEfKThDN7bbGRHvCW5bjwkyLQN1Ys2K61CTRX0VhSi
# U4HDE3gCm+LpO28BuV/1KunlSH4TWjt76AB6YG5PuyzSH+AbC8yY7m+VSJTmCw1k
# cZv0TQ+9lqWc4C6ziETV8UqhhltBmd/57P3xFDKhYNl0EtzxnKGSZ2szzWqE7guY
# DsmTlfB5bnkYPE51xxTcJnRj907utNrIfa2kbu9wXU/GuPuEf9QkDo1Dt3t1Z0Zm
# OZPkloXC2eNufVcGYVJa2PylRjwFlg01IuhYmlhsgerg5LZz2RIyrWM61JTONF2J
# 6EvO89e2S3XpBbnl2ugf2rMIdW1tlLSWhnLZD+jZzOu+V2TeLm6/onHWCVQ02sLr
# ddDVpf2djvUsmRvcBBYlI40FcC9Wt828Spm+wkRsGHC+VbAg2al6jRNXyJ2LWeiS
# wGsAwRV6XhQz996uMOWTA7jEsAawHUFgYCsH4bgiqiWEn+FblufY2iicRxY4ZsJA
# GXpvxGoUHWE8e0XjXG1BnRFo2Q5ns9SRl5gx5X7rcmIKUGGCh3ZI72zfeVgCjm7b
# 5/CV/YzKuCRWJCYcORguli1GVuPO01FJrBloTJc0OSaDAtZL2Mg=
# =o2kr
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Dec 2023 23:59:35 EST
# gpg: using RSA key 5017D831597C78A3D907EEF712E2204C0E5DB602
# gpg: Good signature from "Jeuk Kim <jeuk20.kim@samsung.com>" [unknown]
# gpg: aka "Jeuk Kim <jeuk20.kim@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5017 D831 597C 78A3 D907 EEF7 12E2 204C 0E5D B602
* tag 'pull-ufs-20231205' of https://gitlab.com/jeuk20.kim/qemu:
hw/ufs: avoid generating the same ID string for different LU devices
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QEMU would not start when trying to create two UFS host controllers and
a UFS logical unit for each with the following options:
-device ufs,id=bus0 \
-device ufs-lu,drive=drive1,bus=bus0,lun=0 \
-device ufs,id=bus1 \
-device ufs-lu,drive=drive2,bus=bus1,lun=0 \
This is because the same ID string ("0:0:0/scsi-disk") is generated
for both UFS logical units.
To fix this issue, prepend the parent pci device's path to make
the ID string unique.
("0000:00:03.0/0:0:0/scsi-disk" and "0000:00:04.0/0:0:0/scsi-disk")
Resolves: #2018
Fixes: 096434fea1 ("hw/ufs: Modify lu.c to share codes with SCSI subsystem")
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Jeuk Kim <jeuk20.kim@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231204150543.48252-1-akinobu.mita@gmail.com>
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
KVM_RISCV_GET_CSR() and KVM_RISCV_SET_CSR() use an 'int ret' variable
that is used to do an early 'return' if ret > 0. Both are being called
in functions that are also declaring a 'ret' integer, initialized with
'0', and this integer is used as return of the function.
The result is that the compiler is less than pleased and is pointing
shadowing errors:
../target/riscv/kvm/kvm-cpu.c: In function 'kvm_riscv_get_regs_csr':
../target/riscv/kvm/kvm-cpu.c:90:13: error: declaration of 'ret' shadows a previous local [-Werror=shadow=compatible-local]
90 | int ret = kvm_get_one_reg(cs, RISCV_CSR_REG(env, csr), ®); \
| ^~~
../target/riscv/kvm/kvm-cpu.c:539:5: note: in expansion of macro 'KVM_RISCV_GET_CSR'
539 | KVM_RISCV_GET_CSR(cs, env, sstatus, env->mstatus);
| ^~~~~~~~~~~~~~~~~
../target/riscv/kvm/kvm-cpu.c:536:9: note: shadowed declaration is here
536 | int ret = 0;
| ^~~
../target/riscv/kvm/kvm-cpu.c: In function 'kvm_riscv_put_regs_csr':
../target/riscv/kvm/kvm-cpu.c:98:13: error: declaration of 'ret' shadows a previous local [-Werror=shadow=compatible-local]
98 | int ret = kvm_set_one_reg(cs, RISCV_CSR_REG(env, csr), ®); \
| ^~~
../target/riscv/kvm/kvm-cpu.c:556:5: note: in expansion of macro 'KVM_RISCV_SET_CSR'
556 | KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus);
| ^~~~~~~~~~~~~~~~~
../target/riscv/kvm/kvm-cpu.c:553:9: note: shadowed declaration is here
553 | int ret = 0;
| ^~~
The macros are doing early returns for non-zero returns and the local
'ret' variable for both functions is used just to do 'return 0', so
remove them from kvm_riscv_get_regs_csr() and kvm_riscv_put_regs_csr()
and do a straight 'return 0' in the end.
For good measure let's also rename the 'ret' variables in
KVM_RISCV_GET_CSR() and KVM_RISCV_SET_CSR() to '_ret' to make them more
resilient to these kind of errors.
Fixes: 937f0b4512 ("target/riscv: Implement kvm_arch_get_registers")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231123101338.1040134-1-dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
There is no architectural requirement that SME implies SVE, but
our implementation currently assumes it. (FEAT_SME_FA64 does
imply SVE.) So if you try to run a CPU with eg "-cpu max,sve=off"
you quickly run into an assert when the guest tries to write to
SMCR_EL1:
#6 0x00007ffff4b38e96 in __GI___assert_fail
(assertion=0x5555566e69cb "sm", file=0x5555566e5b24 "../../target/arm/helper.c", line=6865, function=0x5555566e82f0 <__PRETTY_FUNCTION__.31> "sve_vqm1_for_el_sm") at ./assert/assert.c:101
#7 0x0000555555ee33aa in sve_vqm1_for_el_sm (env=0x555557d291f0, el=2, sm=false) at ../../target/arm/helper.c:6865
#8 0x0000555555ee3407 in sve_vqm1_for_el (env=0x555557d291f0, el=2) at ../../target/arm/helper.c:6871
#9 0x0000555555ee3724 in smcr_write (env=0x555557d291f0, ri=0x555557da23b0, value=2147483663) at ../../target/arm/helper.c:6995
#10 0x0000555555fd1dba in helper_set_cp_reg64 (env=0x555557d291f0, rip=0x555557da23b0, value=2147483663) at ../../target/arm/tcg/op_helper.c:839
#11 0x00007fff60056781 in code_gen_buffer ()
Avoid this unsupported and slightly odd combination by
disabling SME when SVE is not present.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2005
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231127173318.674758-1-peter.maydell@linaro.org
Flaky avocado tests, gdbstub and gitlab tweaks
- gdbstub, properly halt when QEMU is having IO issues
- convert skipIf(GITLAB_CI) to skipUnless(QEMU_TEST_FLAKY_TESTS)
- tag sbsa-ref tests as TCG only
- build the correct microblaze for avocado-system-ubuntu
- add optional flaky tests job to CI
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmVqHFgACgkQ+9DbCVqe
# KkQHLwgAjP2iL5LSa3FaMUoESJQqRB0rpoJ80gtEtmvmgRF0fHsRfHtDdMN9h2Ed
# YilCDhMKLyr2ZoK4atyuc5SR6vCXI5RAvfTddex0xSxlvBX5Z5+1FMC6yA8SDJM7
# ezEXACEKHiGv+l8gvOZOf9ZYEgh8DMJYFMbrtxuxKWw/kAjZ3R3X/ChCL94ZCPRe
# 486wqPIQfp5EPs2ddsW4DYFTjLpK5ImX+u/5kdaEGXwcg8UoLmQ9BVIrN/hYJ6u5
# t/mAp1qVIQwSOSUBnerQ4ZkVQfCgLtEtiDtt8EZjUbQD3DcLjfHFjTwVlpqcC1zs
# wHXYpLbD5jkthqav5E0DObCF9gIZdA==
# =qtvU
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Dec 2023 12:48:08 EST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-more-8.2-fixes-011223-2' of https://gitlab.com/stsquad/qemu:
gitlab: add optional job to run flaky avocado tests
gitlab: build the correct microblaze target
tests/avocado: tag sbsa tests as tcg only
docs/devel: rationalise unstable gitlab tests under FLAKY_TESTS
gdbstub: use a better signal when we halt for IO reasons
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-sound device is currently not migratable. QEMU crashes
on the source machine at some point during the migration with a
segmentation fault.
Even with this bug fixed, the virtio-sound device doesn't migrate
the state of the audio streams. For example, running streams leave
the device on the destination machine in a broken condition.
Mark the device as unmigratable until these issues have been fixed.
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231204072837.6058-1-vr_qemu@t-online.de>
Commit d921fea338 ("ui/vnc-clipboard: fix infinite loop in
inflate_buffer (CVE-2023-3255)") removed this hunk, but it is still
required, because it can happen that stream.avail_in becomes zero
before coming across a return value of Z_STREAM_END in the loop.
This fixes the host->guest direction of the clipboard with noVNC and
TigerVNC as clients.
Fixes: d921fea338 ("ui/vnc-clipboard: fix infinite loop in inflate_buffer (CVE-2023-3255)")
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231122125826.228189-1-f.ebner@proxmox.com>
Commit 6f189a08c1 ("ui/gtk-egl: Check EGLSurface before doing
scanout") introduced a regression when QEMU is running with a
virtio-gpu-gl-device on a host under X11. After the guest has
initialized the virtio-gpu-gl-device, the guest screen only
shows "Display output is not active.".
Commit 6f189a08c1 moved all function calls in
gd_egl_scanout_texture() to a code path which is only called
once after gd_egl_init() succeeds in gd_egl_scanout_texture().
Move all function calls in gd_egl_scanout_texture() back to
the regular code path so they get always called if one of the
gd_egl_init() calls was successful.
Fixes: 6f189a08c1 ("ui/gtk-egl: Check EGLSurface before doing scanout")
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231111104020.26183-1-vr_qemu@t-online.de>
The code already checks iommu_mr is not NULL so there is no
need to check container_of() is not NULL. Remove the check.
Fixes: CID 1523901
Fixes: 09b4c3d6a2 ("virtio-iommu: Record whether a probe request has
been issued")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Coverity (CID 1523901)
Message-Id: <20231109170715.259520-1-eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
`kvm_enabled()` is compiled down to `0` and short-circuit logic is
used to remove references to undefined symbols at the compile stage.
Some build configurations with some compilers don't attempt to
simplify this logic down in some cases (the pattern appears to be
that the literal false must be the first term) and this was causing
some builds to emit references to undefined symbols.
An example of such a configuration is clang 16.0.6 with the following
configure: ./configure --enable-debug --without-default-features
--target-list=x86_64-softmmu --enable-tcg-interpreter
Signed-off-by: Daniel Hoffman <dhoff749@gmail.com>
Message-Id: <20231119203116.3027230-1-dhoff749@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
erst_realizefn() passes @errp to functions without checking for
failure. If it runs into another failure, it trips error_setv()'s
assertion.
Use the ERRP_GUARD() macro and check *errp, as suggested in commit
ae7c80a7bd ("error: New macro ERRP_GUARD()").
Cc: qemu-stable@nongnu.org
Fixes: f7e26ffa59 ("ACPI ERST: support for ACPI ERST feature")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231120130017.81286-1-philmd@linaro.org>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU crashes on exit when a virtio-sound device has failed to
realise. Its vmstate field was not cleaned up properly with
qemu_del_vm_change_state_handler().
This patch changes the realize() order as
1. Validate the given configuration values (no resources allocated
by us either on success or failure)
2. Try AUD_register_card() and return on failure (no resources allocated
by us on failure)
3. Initialize vmstate, virtio device, heap allocations and stream
parameters at once.
If error occurs, goto error_cleanup label which calls
virtio_snd_unrealize(). This cleans up all resources made in steps
1-3.
Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Fixes: 2880e676c0 ("Add virtio-sound device stub")
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20231116072046.4002957-1-manos.pitsidianakis@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Commit b7639b7dd0 ("hw/audio: Simplify hda audio init") inverted
the sense of hda codec property mixer during initialization.
Change the code so that mixer=on enables the hda mixer emulation
and mixer=off disables the hda mixer emulation.
With this change audio playback and recording streams don't start
muted by default.
Fixes: b7639b7dd0 ("hw/audio: Simplify hda audio init")
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20231105172552.8405-2-vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
After a relatively short time, there is an multiplication overflow
when multiplying (now - buft_start) with hda_bytes_per_second().
While the uptime now - buft_start only overflows after 2**63 ns
= 292.27 years, this happens hda_bytes_per_second() times faster
with the multiplication. At 44100 samples/s * 2 channels
* 2 bytes/channel = 176400 bytes/s that is 14.52 hours. After the
multiplication overflow the affected audio stream stalls.
Replace the multiplication and following division with muldiv64()
to prevent a multiplication overflow.
Fixes: 280c1e1cdb ("audio/hda: create millisecond timers that handle IO")
Reported-by: M_O_Bz <m_o_bz@163.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20231105172552.8405-1-vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio sound device is currently an unclassified PCI device.
~> sudo lspci -s '00:02.0' -v -nn | head -n 2
00:02.0 Unclassified device [00ff]:
Red Hat, Inc. Device [1af4:1059] (rev 01)
Subsystem: Red Hat, Inc. Device [1af4:1100]
Set the correct PCI class code to change the device to a
multimedia audio controller.
~> sudo lspci -s '00:02.0' -v -nn | head -n 2
00:02.0 Multimedia audio controller [0401]:
Red Hat, Inc. Device [1af4:1059] (rev 01)
Subsystem: Red Hat, Inc. Device [1af4:1100]
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20231107185034.6434-1-vr_qemu@t-online.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When dumping table blobs using rebuild-expected-aml.sh, table blobs from all
test variants are dumped regardless of whether there are any actual changes to
the tables or not. This creates lot of new files for various test variants that
are not part of the git repository. This is because we do not check in all table
blobs for all test variants into the repository. Only those blobs for those
variants that are different from the generic test-variant agnostic blob are
checked in.
This change makes the test smarter by checking if at all there are any changes
in the tables from the checked-in gold master blobs and take actions
accordingly.
When there are no changes:
- No new table blobs would be written.
- Existing table blobs will be refreshed (git diff will show no changes).
When there are changes:
- New table blob files will be dumped.
- Existing table blobs will be refreshed (git diff will show that the files
changed, asl diff will show the actual changes).
When new tables are introduced:
- Zero byte empty file blobs for new tables as instructed in the header of
bios-tables-test.c will be regenerated to actual table blobs.
This would make analyzing changes to tables less confusing and there would
be no need to clean useless untracked files when there are no table changes.
CC: peter.maydell@linaro.org
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20231107044952.5461-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
It doesn't make sense to have two classes of flaky tests. While it may
take the constrained environment of CI to trigger failures easily it
doesn't mean they don't occasionally happen on developer machines. As
CI is the gating factor to passing there is no point developers
running the tests locally anyway unless they are trying to fix things.
While we are at it update the language in the docs to discourage the
QEMU_TEST_FLAKY_TESTS becoming a permanent solution.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231201093633.2551497-3-alex.bennee@linaro.org>
netdev test keeps failing sometimes.
I don't think we should increase the timeout some more:
let's try something else instead, testing how busy the
system is.
Seems to work for me.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
getloadavg is supported on Linux, BSDs, Solaris.
Following man page:
RETURN VALUE
If the load average was unobtainable, -1 is returned; otherwise,
the number of samples actually retrieved is returned.
accordingly, make stub for systems which don't support this function return -1
for consistency.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Add a default BIOS for the new amigaone machine so it does not
require out of tree binary blob.
* SLOF update to fix virtio serial bugs.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmVof8kACgkQZ7MCdqhi
# HK71ng//TCpoi02/aZY5kAd1a1NxvRDd/gR9d5y79TaixgJ9FoV7joNg7Labu21r
# Gezghpgj7Ph+Wy175/qYhIJJ6JheK6xsAb7JmCJUq5HeOixJHkK0xHCJ0uGf1tcb
# c24+6JYa7K1Yd48EhGQUDwd+7J7QeAKPyJLSZHG2Qg9+sPX2koxa9tzZMoaWoA2L
# pMfXhUTBiK6Q93FtrQw16pRUcGrY542wLeA/nRaUFtuPdv38TDmJ4ktnid27fIh5
# 1+QVGQD0HCO29SVT/VP1TJenJukrYVjBfT8ulVC/wo53tZHhNSDVffXbRijrVFlX
# CPowJ2UebPwpvnvv8F8CSGPL4XPI+IBVdUOwZZMkH5oGaMXQW6mP4zsB7TK+g5z3
# 8+hQ0VZS0MzrrfSqufup8SUJAqJ1Sckx104clrpXtrBSAoiF634Qi1+UurwDVLFS
# VibKnMl31LauNRIWXVfj4BYOdH9oHOEHR5ghoaRguOAe58N7fGNiXC/WnScWbp8r
# PXE9D7SUMPtxNejDFRam+Df7JwTY+CdB56uvZ/behgs3FABfMmqBX+WgBbNhLaP4
# B4Wa0MTOAHz3itXRHYtvd6n3M9ts4nU88Srkuf0akAzp4Nv4b3+isuIncUazDREt
# q2z94oolhuZarLhsi/8Qo2G/SfJBNM0s4fmx4NTrqscupl5SadM=
# =7rvy
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 30 Nov 2023 07:27:53 EST
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-8.2-20231130' of https://gitlab.com/npiggin/qemu:
ppc/amigaone: Allow running AmigaOS without firmware image
pseries: Update SLOF firmware image
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The machine uses a modified U-Boot under GPL license but the sources
of it are lost with only a binary available so it cannot be included
in QEMU. Allow running without the firmware image which can be used
when calling a boot loader directly and thus simplifying booting
guests. We need a small routine that AmigaOS calls from ROM which is
added in this case to allow booting AmigaOS without external firmware
image.
Fixes: d9656f860a ("hw/ppc: Add emulation of AmigaOne XE board")
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
It's been a while. This fixes compile warning, typos and
a bug with virtio-serial being used after it was shutdown
at "quiesce".
The full changelog is here:
Alexey Kardashevskiy (2):
Remove ?PICK
version: update to 20230918
Jordan Niethe (1):
virtio-serial: Do not close stdout on quiesce
Kautuk Consul (1):
virtio-serial: Make read and write methods report failure
Thomas Huth (10):
lib/libnet/ipv6: Silence compiler warning from Clang
Fix typos in the board-qemu folder
Fix typos in the lib/libnet folder
Fix typos in the remaining lib folders
Fix typos in the slof folder
Fix typos in the board-js2x folder
Fix typos in the llfw folder
Fix typos in the board-js2x folder
Fix typos in the clients folder
Fix remaining typos in various folders
Compiled with gcc-12.1.0-nolibc
Tested with (sorry, no KVM):
/home/aik/b/q-slof/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-m 2G \
-kernel /home/aik/t/vml4150le \
-initrd /home/aik/t/le.cpio \
-machine pseries,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
-bios pc-bios/slof.bin \
-trace events=/home/aik/qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server=on,wait=off,path=qemu.mon.604650 \
-mon chardev=SOCKET0,mode=control \
-name 604650,debug-threads=on
[ npiggin: Also tested with KVM, including with virtio-console. ]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Since socket_parse() will allocate memory for 'saddr',and its value
will pass to 'addr' that allocated by migrate_uri_parse(),
then 'saddr' will no longer used,need to free.
But due to 'saddr->u' is shallow copying the contents of the union,
the members of this union containing allocated strings,and will be used after that.
So just free 'saddr' itself without doing a deep free on the contents of the SocketAddress.
Fixes: 72a8192e22 ("migration: convert migration 'uri' into 'MigrateAddress'")
Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231120031428.908295-1-zhouzongmin@kylinos.cn>
Return default value in legacy mode for BAR4 when unset. This can't be
set in reset method because BARs are cleared on reset so we return it
instead when BARs are read in legacy mode. This fixes UDMA on amigaone
with AmigaOS.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-ID: <20231125140135.AF6A075A4C3@zero.eik.bme.hu>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The vhost-user-blk export implement AioContext switches in its drain
implementation. This means that on drain_begin, it detaches the server
from its AioContext and on drain_end, attaches it again and schedules
the server->co_trip coroutine in the updated AioContext.
However, nothing guarantees that server->co_trip is even safe to be
scheduled. Not only is it unclear that the coroutine is actually in a
state where it can be reentered externally without causing problems, but
with two consecutive drains, it is possible that the scheduled coroutine
didn't have a chance yet to run and trying to schedule an already
scheduled coroutine a second time crashes with an assertion failure.
Following the model of NBD, this commit makes the vhost-user-blk export
shut down server->co_trip during drain so that resuming the export means
creating and scheduling a new coroutine, which is always safe.
There is one exception: If the drain call didn't poll (for example, this
happens in the context of bdrv_graph_wrlock()), then the coroutine
didn't have a chance to shut down. However, in this case the AioContext
can't have changed; changing the AioContext always involves a polling
drain. So in this case we can simply assert that the AioContext is
unchanged and just leave the coroutine running or wake it up if it has
yielded to wait for the AioContext to be attached again.
Fixes: e1054cd4aa
Fixes: https://issues.redhat.com/browse/RHEL-1708
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231127115755.22846-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The machine type is being detected based on "-M help" output, and we're
searching for the line ending with " (default)". However, in downstream
one of the machine types s marked as deprecated might become the
default, in which case this logic breaks as the line would now end with
" (default) (deprecated)". To fix potential issues here, let's relax
that requirement and detect the mere presence of " (default)" line
instead.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20231122121538.32903-1-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
From s390_possible_cpu_arch_ids() in hw/s390x/s390-virtio-ccw.c, the
"core-id" is the index of possible_cpus->cpus[], so it should only be
less than possible_cpus->len, which is equal to ms->smp.max_cpus.
Fix the wrong "core-id" 112, because it isn't less than maxcpus (36) in
-smp, and the valid core ids are 0-35 inclusive.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231127134917.568552-1-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The chip has 4 pins (called PIRQA-D in VT82C686B and PINTA-D in
VT8231) that are meant to be connected to PCI IRQ lines and allow
routing PCI interrupts to the ISA PIC. Route these in
via_isa_set_irq() to make it possible to share them with internal
functions that can also be routed to the same ISA IRQs.
Fixes: 2fdadd02e6
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-ID: <8c4513d8b78fac40e6d4e65a0a4b3a7f2f278a4b.1701035944.git.balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The VIA integrated south bridge chips combine several functions and
allow routing their interrupts to any of the ISA IRQs also allowing
multiple sources to share the same ISA IRQ. E.g. pegasos2 firmware
configures everything to use IRQ 9 but amigaone routes them to
separate ISA IRQs so the current simplified routing does not work.
Bring back via_isa_set_irq() and change it to take the component that
wants to change an IRQ and keep track of interrupt status of each
source separately and do the mapping to ISA IRQ within the ISA bridge.
This may not handle cases when an ISA IRQ is controlled by devices
directly, not going through via_isa_set_irq() such as serial, parallel
or keyboard but these IRQs being conventionally fixed are not likely
to be change by guests or share with other devices so this does not
cause a problem in practice.
This reverts commit 4e5a20b6da.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-ID: <1c3902d4166234bef0a476026441eaac3dd6cda5.1701035944.git.balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
With the introduction of list-based array properties in qdev, the string
output visitor has to deal with lists of non-integer elements now ('info
qtree' prints all properties with the string output visitor).
Currently there is no explicit support for such lists, and the resulting
output is only the last element because string_output_set() always
replaces the output with the latest value. Instead of replacing the old
value, append comma separated values in list context.
The difference can be observed in 'info qtree' with a 'rocker' device
that has a 'ports' list with more than one element.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231121173416.346610-3-kwolf@redhat.com>
Passing an uninitialised list to visit_start_list() happens to work for
the QObject output visitor because it treats the pointer as an opaque
value and never dereferences it, but the string output visitor expects a
valid list to check if it has more than one element.
The existing code crashes with the string output visitor if the
uninitialised value is non-NULL. Passing an explicit NULL would fix the
crash, but still result in wrong output.
Rework get_prop_array() so that it conforms to the expectations that the
string output visitor has. This includes building a real list first and
using visit_next_list() to iterate it.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1993
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Dan Hoffman <dhoff749@gmail.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231121173416.346610-2-kwolf@redhat.com>
The spips, qspips, and zynqmp-qspips share the same realize function
(xilinx_spips_realize) and initialize their io memory region with different
mmio_ops passed through the class. The size of the memory region is set to
the largest area (0x200 bytes for zynqmp-qspips) thus it is possible to write
out of s->regs[addr] in xilinx_spips_write for spips and qspips.
This fixes that wrong behavior.
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Frederic Konrad <fkonrad@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Message-id: 20231124143505.1493184-2-fkonrad@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit a75f336b97 added:
DEFINE_PROP_ARRAY("keycodes", StellarisGamepad, num_buttons,
keycodes, qdev_prop_uint32, uint32_t),
but forgot to free the 'keycodes' array. Do it in the instance_finalize
handler.
Fixes: a75f336b97 ("hw/input/stellaris_input: Convert to qdev")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-7-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 9e4aa1fafe added:
DEFINE_PROP_ARRAY("pg0-lock",
XlnxVersalEFuseCtrl, extra_pg0_lock_n16,
extra_pg0_lock_spec, qdev_prop_uint16, uint16_t),
but forgot to free the 'extra_pg0_lock_spec' array. Do it in the
instance_finalize() handler.
Cc: qemu-stable@nongnu.org
Fixes: 9e4aa1fafe ("hw/nvram: Xilinx Versal eFuse device") # v6.2.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-6-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 68fbcc344e added:
DEFINE_PROP_ARRAY("read-only", XlnxEFuse, ro_bits_cnt, ro_bits,
qdev_prop_uint32, uint32_t),
but forgot to free the 'ro_bits' array. Do it in the instance_finalize
handler.
Cc: qemu-stable@nongnu.org
Fixes: 68fbcc344e ("hw/nvram: Introduce Xilinx eFuse QOM") # v6.2.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-5-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 4fb013afcc added:
DEFINE_PROP_ARRAY("oscclk", MPS2SCC, num_oscclk, oscclk_reset,
qdev_prop_uint32, uint32_t),
but forgot to free the 'oscclk_reset' array. Do it in the
instance_finalize() handler.
Cc: qemu-stable@nongnu.org
Fixes: 4fb013afcc ("hw/misc/mps2-scc: Support configurable number of OSCCLK values") # v6.0.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-4-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 8077b8e549 added:
DEFINE_PROP_ARRAY("reserved-regions", VirtIOIOMMUPCI,
vdev.nb_reserved_regions, vdev.reserved_regions,
qdev_prop_reserved_region, ReservedRegion),
but forgot to free the 'vdev.reserved_regions' array. Do it in the
instance_finalize() handler.
Cc: qemu-stable@nongnu.org
Fixes: 8077b8e549 ("virtio-iommu-pci: Add array of Interval properties") # v5.1.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20231121174051.63038-3-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit edac4d8a16 back in 2015 when we added support for
the virtual timer offset CNTVOFF_EL2, we didn't correctly update
the timer-recalculation code that figures out when the timer
interrupt is next going to change state. We got it wrong in
two ways:
* for the 0->1 transition, we didn't notice that gt->cval + offset
can overflow a uint64_t
* for the 1->0 transition, we didn't notice that the transition
might now happen before the count rolls over, if offset > count
In the former case, we end up trying to set the next interrupt
for a time in the past, which results in QEMU hanging as the
timer fires continuously.
In the latter case, we would fail to update the interrupt
status when we are supposed to.
Fix the calculations in both cases.
The test case is Alex Bennée's from the bug report, and tests
the 0->1 transition overflow case.
Fixes: edac4d8a16 ("target-arm: Add CNTVOFF_EL2")
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/60
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231120173506.3729884-1-peter.maydell@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
The syndrome register value always has an IL field at bit 25, which
is 0 for a trap on a 16 bit instruction, and 1 for a trap on a 32
bit instruction (or for exceptions which aren't traps on a known
instruction, like PC alignment faults). This means that our
syn_*() functions should always either take an is_16bit argument to
determine whether to set the IL bit, or else unconditionally set it.
We missed setting the IL bit for the syndrome for three kinds of trap:
* an SVE access exception
* a pointer authentication check failure
* a BTI (branch target identification) check failure
All of these traps are AArch64 only, and so the instruction causing
the trap is always 64 bit. This means we can unconditionally set
the IL bit in the syn_*() function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231120150121.3458408-1-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
The URL to the Coverity tools download has changed; the old one points
to an obsolete version that is not supported anymore. Adjust to point
to the correct and supported tools.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pseudo-"in source tree" build used to run make in the build directory
as many times as goals. Worse, although .NOTPARALLEL is specified,
it does not work for patterns, and run make in parallel, which can break
things.
Add a new rule "build", and let it call make. The pattern rule only
needs to specify "build" as its prerequisite and have a no-op recipe so
that it does more than canceling built-in implicit rules.
Fixes: dedad02720 ("configure: add support for pseudo-"in source tree" builds")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20231119101604.47325-1-akihiko.odaki@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Propagate the buffer size to format_dec() and use snprintf().
This should silence this UBSan -Wformat-overflow warning:
In file included from /usr/include/stdio.h:906,
from include/qemu/osdep.h:114,
from ../disas/cris.c:21:
In function 'sprintf',
inlined from 'format_dec' at ../disas/cris.c:1737:3,
inlined from 'print_with_operands' at ../disas/cris.c:2477:12,
inlined from 'print_insn_cris_generic.constprop' at ../disas/cris.c:2690:8:
/usr/include/bits/stdio2.h:30:10: warning: null destination pointer [-Wformat-overflow=]
30 | return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31 | __glibc_objsize (__s), __fmt,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32 | __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~
Reported-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231120132222.82138-1-philmd@linaro.org>
[Rewritten to fix logic and avoid repeated expression. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Upgrade libvirt-ci so it covers macOS 14. Add a manual entry
(QEMU_JOB_OPTIONAL: 1) to test on Sonoma release. Refresh the
lci-tool generated files.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231109160504.93677-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Given the recent confusion around how QEMU detects the system
Meson installation, and/or decides to install its own, it is
time to fill in the "Python virtual environments and the QEMU
build system" section of the documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
various random fixes for 8.2
- replace fedora-i386 cross compiler with debian
- update cirrus MacOS image to Ventura
- merge debian-native and debian-amd64 docker images
- fix compile of plugins on Windows mingw cross
- add some doc notes on semihosting READC
- add some doc notes on gdbstub
- skip loading debug symbols if we have failed
- enable arm-softmmu TCG tests
- don't attempt to use native cross builds for linux-user
- clean up registers gdb test case (ppc64/s390x)
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmVfXowACgkQ+9DbCVqe
# KkQY6Af5AVjPG2aHmixvhTjxEx5dXAH3cGYsWbny3EByT2RijaTBBK/A4OB7RTVV
# fr11kGpCkJDk4JPoUz4yTuw6Q+7WBmB0tJJ5wcGyC9cyCjI/PttSTJUC7hiikifw
# dg1IVrJZX0ahOpUiDXAtDbeHK1/i95mDRtot40mnyv5HHYHlJKohKsUVtiQEWMeq
# 0/X/M5Zq8oJ6wCkbw1nsCqkWpZa7eh4YcB9cGNf87dd0ZJ9M93CbjdSQlsugF2gB
# pH+5ZGOj+L/zkbEKoaWJNwYzF4G6hJeLpqP2rLMqRfA5MM43wdd0dJ6gK0ylKeuR
# Bo7jC1oEOcuLibZY40OhlOwLTMWiDg==
# =ME/l
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 23 Nov 2023 09:15:40 EST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-for-8.2-fixes-231123-1' of https://gitlab.com/stsquad/qemu:
tests/tcg: finesse the registers check for "hidden" regs
configure: don't try a "native" cross for linux-user
tests/tcg: enable semiconsole test for Arm
tests/tcg: enable arm softmmu tests
testing: move arm system tests into their own folder
hw/core: skip loading debug on all failures
docs/system: clarify limits of using gdbstub in system emulation
docs/emulation: expand warning about semihosting
tests/tcg: fixup Aarch64 semiconsole test
target/nios2: Deprecate the Nios II architecture
plugins: fix win plugin tests on cross compile
tests/docker: merge debian-native with debian-amd64
.gitlab-ci.d/cirrus: Upgrade macOS to 13 (Ventura)
tests/docker: replace fedora-i386 with debian-i686
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Pass the content of $mkvenv_flags (which is either "--online"
or empty) down to tests/Makefile.include.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unfortunately Coverity doesn't follow the logic aroung "len" and "l"
variables in stacks finishing with flatview_{read,write}_continue() and
generate a lot of OVERRUN false-positives. When small buffer (2 or 4
bytes) is passed to mem read/write path, Coverity assumes the worst
case of sz=8 in stn_he_p()/ldn_he_p() (defined in
include/qemu/bswap.h), and reports buffer overrun.
To silence these false-positives we have model functions, which hide
real logic from Coverity.
However, it turned out that these new two assertions are enough to
quiet Coverity.
Assertions are better than hiding the logic, so let's drop the
modelling and move to assertions for memory r/w call stacks.
After patch, the sequence
cov-make-library --output-file /tmp/master.xmldb \
scripts/coverity-scan/model.c
cov-build --dir ~/covtmp/master make -j9
cov-analyze --user-model-file /tmp/master.xmldb \
--dir ~/covtmp/master --all --strip-path "$(pwd)
cov-format-errors --dir ~/covtmp/master \
--html-output ~/covtmp/master_html_report
Generate for me the same big set of CIDs excepept for 6 disappeared (so
it becomes even better).
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: David Hildenbrand <david@redhat.com>
Message-ID: <20231005140326.332830-1-vsementsov@yandex-team.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The reason the ppc64 and s390x test where failing was because gdb
hides them although they are still accessible via regnum. We can
re-arrange the test a little bit and include these two arches in our
test.
We also need to be a bit more careful handling remote-registers as the
format isn't easily parsed with pure white space separation. Once we
fold types like "long long" and "long double" into a single word we
can now assert all registers are either listed or elided.
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: <qemu-s390x@nongnu.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: <qemu-ppc@nongnu.org>
Cc: Luis Machado <luis.machado@arm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231121153606.542101-1-alex.bennee@linaro.org>
As 32 bit x86 become rarer we are starting to run into problems with
search paths. Although we switched to a Debian container we still
favour the native CC on a Bookworm host. As a result we have a broken
cross compile setup which then fails to build with:
BUILD i386-linux-user guest-tests
In file included from /usr/include/linux/stat.h:5,
from /usr/include/bits/statx.h:31,
from /usr/include/sys/stat.h:465,
from /home/alex/lsrc/qemu.git/tests/tcg/multiarch/linux/linux-test.c:28:
/usr/include/linux/types.h:5:10: fatal error: asm/types.h: No such file or directory
5 | #include <asm/types.h>
| ^~~~~~~~~~~~~
compilation terminated.
make[1]: *** [Makefile:119: linux-test] Error 1
make: *** [/home/alex/lsrc/qemu.git/tests/Makefile.include:50: build-tcg-tests-i386-linux-user] Error 2
This is likely to affect more and more linux-user builds so wrap the
whole check in a test for softmmu targets (aka bare metal) which don't
worry about such header niceties. This allows us to keep using the
host compiler for softmmu tests and the roms.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231120150833.2552739-14-alex.bennee@linaro.org>
We need to ensure we squash the serial port if we want to hand craft
our muxed input. As a bonus emit the example with a V=1 build to make
it easier for people to figure out.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231120150833.2552739-7-alex.bennee@linaro.org>
debian-native isn't really needed and suffers from the problem of
tracking a distros dependencies rather than the projects. With a
little surgery we can make the debian-amd64 container architecture
neutral and allow people to use it to build a native QEMU.
Rename it so it follows the same non-arch pattern of the other distro
containers.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231120150833.2552739-4-alex.bennee@linaro.org>
Fourth RISC-V PR for 8.2
This is a few bug fixes for the 8.2 release
* Add Zicboz block size to hwprobe
* Creat the virt machine FDT before machine init is complete
* Don't verify ISA compatibility for zicntr and zihpm
* Fix SiFive E CLINT clock frequency
* Fix invalid exception on MMU translation stage
* Fix mxr bit behavior
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmVdk4sACgkQr3yVEwxT
# gBP6gQ/+NzdRT8Wx/9ynnKs0XwXBwOjQTHDcxCIKLWYrM26c3M+4XEU6IBdg2X1T
# qRv9Xal/pXqvAz8tIunF1fNd0Syom4UezcjvLjzipWwS32+D9KEKhKz89aoQc2SQ
# lnTBYz6lSUNppp3wj68gNAyPpht+5zVwYZDsjeZCRlAS00dcl26Xde8kt9tJW7zy
# tPBvHtJP9AVc+HJdClytEZ79G+EHN5Y4ScoJsVinXSBZs9lIQD+nPmFbxopre6kg
# +RUk56eATIlVMISD5pCYyCr3jTebMqVIFY9xtQxb4R09aLYN6+k13NfsJeIcQgaF
# MbhAGE0WbXEhKyHe4BuVtyz2k+zYtoh6YSE2Czub2pzPAfpKKWiu4Odi7vHlYejw
# Nksn3N7LR3FbhrDst71+EQ28vUuEYfECEFICjzHb+DhxlPxHW9WC4f8ciTUpT57O
# HPWYN7zn5Yw97nGBVuITVO7DfcQcw8MS8HcFEelkeDOephiDKr327SWTL+lp5+P5
# fm7PM4Z92GRvT3Voj4mebVxC62CGqehDotWRvXCvc87m4DfLsmpt0nNeX9q18zw+
# phEZ5Q8AMmEnRzpmoXEzzcDWyJIO6huJFad0imTR6MqvXYxsJYIr+wURDB6POelP
# SfMqdX9cTu8xJ7Hw4gJT9ZgcTlKsTq5LNpGZ/kLPXS6/y7fgC5Y=
# =QK14
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 Nov 2023 00:37:15 EST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20231122' of https://github.com/alistair23/qemu:
target/riscv/cpu_helper.c: Fix mxr bit behavior
target/riscv/cpu_helper.c: Invalid exception on MMU translation stage
riscv: Fix SiFive E CLINT clock frequency
target/riscv: don't verify ISA compatibility for zicntr and zihpm
hw/riscv/virt.c: do create_fdt() earlier, add finalize_fdt()
linux-user/riscv: Add Zicboz block size to hwprobe
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
SeaBIOS-hppa v13
Please pull an update of SeaBIOS-hppa to v13 to fix
a system reboot crash in qemu-system-hppa as reported in
https://gitlab.com/qemu-project/qemu/-/issues/1991
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZV0uiQAKCRD3ErUQojoP
# X/UEAP4vVLO/21SwO8/UpmImQPGTpoGUxA2DWYHBfjmyVGEoqwEA1sfhqpdahDJ0
# FLSculh9fFG7vWOMCZo2Xnur+X9ahgQ=
# =FaBT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 21 Nov 2023 17:26:17 EST
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'seabios-hppa-v13-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: Update SeaBIOS-hppa to version 13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
According to RISCV Specification sect 9.5 on two stage translation when
V=1 the vsstatus(mstatus in QEMU's terms) field MXR, which makes
execute-only pages readable, only overrides VS-stage page protection.
Setting MXR at HS-level(mstatus_hs), however, overrides both VS-stage
and G-stage execute-only permissions.
The hypervisor extension changes the behavior of MXR\MPV\MPRV bits.
Due to RISCV Specification sect. 9.4.1 when MPRV=1, explicit memory
accesses are translated and protected, and endianness is applied, as
though the current virtualization mode were set to MPV and the current
nominal privilege mode were set to MPP. vsstatus.MXR makes readable
those pages marked executable at the VS translation stage.
Fixes: 36a18664ba ("target/riscv: Implement second stage MMU")
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231121071757.7178-3-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
According to RISCV privileged spec sect. 5.3.2 Virtual Address Translation Process
access-fault exceptions may raise only after PMA/PMP check. Current implementation
generates an access-fault for mbare mode even if there were no PMA/PMP errors.
This patch removes the erroneous MMU mode check and generates an access-fault
exception based on the pmp_violation flag only.
Fixes: 1448689c7b ("target/riscv: Allow specifying MMU stage")
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231121071757.7178-2-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The extensions zicntr and zihpm were officially added in the privilege
instruction set specification 1.12. However, QEMU has been implemented
them long before it and thus they are forced to be on during the cpu
initialization to ensure compatibility (see riscv_cpu_init).
riscv_cpu_disable_priv_spec_isa_exts was not updated when the above
behavior was introduced, resulting in these extensions to be disabled
after all.
Signed-off-by: Clément Chigot <chigot@adacore.com>
Fixes: c004099330 ("target/riscv: add zicntr extension flag for TCG")
Fixes: 0824121660 ("target/riscv: add zihpm extension flag for TCG")
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231114123913.536194-1-chigot@adacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 49554856f0 fixed a problem, where TPM devices were not appearing
in the FDT, by delaying the FDT creation up until virt_machine_done().
This create a side effect (see gitlab #1925) - devices that need access
to the '/chosen' FDT node during realize() stopped working because, at
that point, we don't have a FDT.
This happens because our FDT creation is monolithic, but it doesn't need
to be. We can add the needed FDT components for realize() time and, at
the same time, do another FDT round where we account for dynamic sysbus
devices. In other words, the problem fixed by 49554856f0 could also be
fixed by postponing only create_fdt_sockets() and its dependencies,
leaving everything else from create_fdt() to be done during init().
Split the FDT creation in two parts:
- create_fdt(), now moved back to virt_machine_init(), will create FDT
nodes that doesn't depend on additional (dynamic) devices from the
sysbus;
- a new finalize_fdt() step is added, where create_fdt_sockets() and
friends is executed, accounting for the dynamic sysbus devices that
were added during realize().
This will make both use cases happy: TPM devices are still working as
intended, and devices such as 'guest-loader' have a FDT to work on
during realize().
Fixes: 49554856f0 ("riscv: Generate devicetree only after machine initialization is complete")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1925
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231110172559.73209-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Allow the VIA IDE controller to switch between both legacy and native modes by
calling pci_ide_update_mode() to reconfigure the device whenever PCI_CLASS_PROG
is updated.
This patch moves the initial setting of PCI_CLASS_PROG from via_ide_realize() to
via_ide_reset(), and removes the direct setting of PCI_INTERRUPT_PIN during PCI
bus reset since this is now managed by pci_ide_update_mode(). This ensures that
the device configuration is always consistent with respect to the currently
selected mode.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20231116103355.588580-5-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The via-ide device currently attempts to set the default BAR addresses to the
values shown in the datasheet, but this doesn't work for 2 reasons: firstly
BARS 1-4 do not set the bottom 2 bits to PCI_BASE_ADDRESS_SPACE_IO, and
secondly the initial PCI bus reset clears the values of all PCI device BARs
after the device itself has been reset.
Remove the setting of the default BAR addresses from via_ide_reset() to ensure
there is no doubt that these values are never exposed to the guest.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20231116103355.588580-4-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function reads the value of the PCI_CLASS_PROG register for PCI IDE
controllers and configures the PCI BARs and/or IDE ioports accordingly.
In the case where we switch to legacy mode, the PCI BARs are set to return zero
(as suggested in the "PCI IDE Controller" specification), the legacy IDE ioports
are enabled, and the PCI interrupt pin cleared to indicate legacy IRQ routing.
Conversely when we switch to native mode, the legacy IDE ioports are disabled
and the PCI interrupt pin set to indicate native IRQ routing. The contents of
the PCI BARs are unspecified, but this is not an issue since if a PCI IDE
controller has been switched to native mode then its BARs will need to be
programmed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20231116103355.588580-3-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This tests two parallel stream jobs that will complete around the same
time and run on two different disks in the same iothreads. It is loosely
based on the bug report at https://issues.redhat.com/browse/RHEL-1761.
For me, this test hangs reliably with the originally reported bug in
blk_remove_bs(). After fixing it, it intermittently hangs for the bugs
fixed after it, missing AioContext unlocking in bdrv_graph_wrunlock()
and in stream_prepare(). The deadlocks seem to happen more frequently
when the test directory is on tmpfs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231115172012.112727-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In stream_prepare(), we need to temporarily drop the AioContext lock
that job_prepare_locked() took for us while calling the graph write lock
functions which can poll.
All block nodes related to this block job are in the same AioContext, so
we can pass any of them to bdrv_graph_wrlock()/ bdrv_graph_wrunlock().
Unfortunately, the one that we picked is base, which can be NULL - and
in this case the AioContext lock is not released and deadlocks can
occur.
Fix this by passing s->target_bs, which is never NULL.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231115172012.112727-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that
have a nested event loop. Nested event loops can depend on other
iothreads making progress, so in order to allow them to make progress it
must not hold the AioContext lock of another thread while calling
aio_poll().
This introduces a @bs parameter to bdrv_graph_wrunlock() whose
AioContext is temporarily dropped (which matches bdrv_graph_wrlock()),
and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState
doesn't necessarily exist any more when unlocking.
This also requires a change to bdrv_schedule_unref(), which was relying
on the incorrectly taken lock. It needs to take the lock itself now.
While this is a separate bug, it can't be fixed a separate patch because
otherwise the intermediate state would either deadlock or try to release
a lock that we don't even hold.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231115172012.112727-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[kwolf: Fixed up bdrv_schedule_unref()]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
While not all callers of blk_remove_bs() are correct in this respect,
the assumption in the function is that callers hold the AioContext lock
of the BlockBackend (this is required by the drain calls in it).
In order to avoid deadlock in the nested event loop, bdrv_graph_wrlock()
has then to be called with the root BlockDriverState as its parameter
instead of NULL, so that this AioContext lock is temporarily dropped.
Fixes: https://issues.redhat.com/browse/RHEL-1761
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231115172012.112727-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Legacy software contains a standard mechanism for generating a reset to a
Serial ATA device - setting the SRST (software reset) bit in the Device
Control register.
Serial ATA has a more robust mechanism called COMRESET, also referred to
as port reset. A port reset is the preferred mechanism for error
recovery and should be used in place of software reset.
Commit e2a5d9b3d9 ("hw/ide/ahci: simplify and document PxCI handling")
improved the handling of PxCI, such that PxCI gets cleared after handling
a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
receiving anything - even a FIS that failed to parse, which should NOT
clear PxCI, so that you can see which command slot that caused an error).
However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
enough, we also need to clear PxCI when receiving a SRST in the Device
Control register.
A legacy software reset is performed by the host sending two H2D FISes,
the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
The first H2D FIS will not get a D2H reply, and requires the FIS to have
the C bit set to one, such that the HBA itself will clear the bit in PxCI.
The second H2D FIS will get a D2H reply once the diagnostic is completed.
The clearing of the bit in PxCI for this command should ideally be done
in ahci_init_d2h() (if it was a legacy software reset that caused the
reset (a COMRESET does not use a command slot)). However, since the reset
value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
that way we can avoid complex logic in ahci_init_d2h().
This fixes an issue for FreeBSD where the device would fail to reset.
The problem was not noticed in Linux, because Linux uses a COMRESET
instead of a legacy software reset by default.
Fixes: e2a5d9b3d9 ("hw/ide/ahci: simplify and document PxCI handling")
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-ID: <20231108222657.117984-1-nks@flawful.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Coverity couldn't see that nr_existing was always going to be zero when
qemu_xen_xs_directory() returned NULL in the ENOENT case (CID 1523906).
Perhaps more to the point, neither could Peter at first glance. Improve
the code to hopefully make it clearer to Coverity and human reviewers
alike.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
If a Xen console is configured on the command line, do not add a default
serial port.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
target-arm queue:
* enable FEAT_RNG on Neoverse-N2
* hw/intc/arm_gicv3: ICC_PMR_EL1 high bits should be RAZ
* Fix SME FMOPA (16-bit), BFMOPA
* hw/core/machine: Constify MachineClass::valid_cpu_types[]
* stm32f* machines: Report error when user asks for wrong CPU type
* hw/arm/fsl-imx: Do not ignore Error argument
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmVchLYZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3kHMD/47tKxzrsXc6+V9esRQGi2H
# 1hAgLBwglEdxLXokF+Di41sh/fvK7wYVXO/hiWlq+9h3kG3D/u1N5r1TdMPMUb9j
# 4Sg3rOejn7nzkxVZ6MZ/K/1j84C9bfrt4sboVHZVRvWuvbiyuTuivEr4IqLYO4x3
# AIwhFMQ5gbNrmClZh/DBxj0keO13cp63Fg2JSSICdi+1Dw9rRXTyhJloMu1omeqc
# k/BXzjSeNXpLSMyGWBR3uaPcJBaGC1xnz3Z1V7fUY1EYD2Cu1oo5lEZ9aNO5t30d
# XW/qVGLa3b1Cb7WuEO247RnU3N2oZotozjFtdj/8IQoYWspM9RHyipEimUlegVdO
# 3fpu8QGsN1ljNiwjdk0i6OwS7SGxcPtteFOaqEf/Yogj4EOKTn/Rx5TT4vJ5DhmI
# 2w/9J15JWDIE1paNwecuFWbxCOOzSsOtSxzuyLSZDU3GlNfJ4zoF6YboROLYfejy
# NXZABFhGd/0ykX7r0VY1GGYXUQ+akv6q+VDmVZCP9gMiRUiqmFPwMLMLlcuHb8G5
# 8UztN5SvOG2EYXj28Zx0BnGCNiGdI15rWMb0veqAtbnn3yEdltW3O475BAhZ0PB7
# OVpLWnXwmWURm/BGlwb1PH5s3kgWgzOebcBgcnCftwFQ8EedQAQDA5FmT+nK5SfV
# VoOf89PngTubU6B3BOfeBw==
# =thIa
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 21 Nov 2023 05:21:42 EST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* tag 'pull-target-arm-20231121' of https://git.linaro.org/people/pmaydell/qemu-arm:
hw/arm/fsl-imx: Do not ignore Error argument
hw/arm/stm32f100: Report error when incorrect CPU is used
hw/arm/stm32f205: Report error when incorrect CPU is used
hw/arm/stm32f405: Report error when incorrect CPU is used
hw/core/machine: Constify MachineClass::valid_cpu_types[]
target/arm: Fix SME FMOPA (16-bit), BFMOPA
hw/intc/arm_gicv3: ICC_PMR_EL1 high bits should be RAZ
target/arm: enable FEAT_RNG on Neoverse-N2
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
fixes tcg_out_mov aborted.
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZVwXJgAKCRBAov/yOSY+
# 30HKBAC4+3oAaMqRDEBTlYT0oHmU3IVRv7Pkuht72YZ57qQwjq21jMpxRdeuAAT2
# McGzDIH/IbF0qG1HBako00jiwgGpx90aBU0KwOVgBjyjvUK2VXE268UoRs+WYVG/
# 7ljOHEnpvwJVTquAtDNFZIw0EFwiF75MP2rKvrSG8KmmrSu4hg==
# =oHNA
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 20 Nov 2023 21:34:14 EST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20231121' of https://gitlab.com/gaosong/qemu:
tcg/loongarch64: Fix tcg_out_mov() Aborted
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In the minimal pixman API stub that is used when the real pixman
dependency is missing a NULL dereference happens when
virtio-gpu-rutabaga allocates a pixman image with bits = NULL and
rowstride_bytes = zero. A buffer of rowstride_bytes * height is
allocated which is NULL. However, in that scenario pixman calculates a
new stride value based on given width, height and format size.
This commit adds a helper function that performs the same logic as
pixman.
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231121093840.2121195-1-manos.pitsidianakis@linaro.org>
We should also consider -display vnc= as setting up a remote display,
and not attempt to add another default one.
The display_remote++ in qemu_setup_display() isn't necessary at this
point, but is there for completeness and further usages of the variable.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1988
Fixes: commit 484629fc81 ("vl: simplify display_remote logic ")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
When display is "none", we may still have remote displays (I think it
would be simpler if VNC/Spice were regular display btw). Return the
default VC then, and set them up to fix a regression when using remote
display and it used the TTY instead.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1989
Fixes: commit 1bec1cc0d ("ui/console: allow to override the default VC")
Reported-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Those display have their own implementation of "vc" chardev, which
doesn't use pixman. They also don't implement the width/height/cols/rows
options, so qemu_display_get_vc() should return a compatible argument.
This patch was meant to be with the pixman series, when the "vc" field
was introduced. It fixes a regression where VC are created on the
tty (or null) instead of the display own "vc" implementation.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Commit 1bec1cc0d ("ui/console: allow to override the default VC") changed
the behaviour of the "-display none" option, so that it now creates a
QEMU monitor on the terminal. "-display none" should not be tangled up
with whether we create a monitor or a serial terminal; it should purely
and only disable the graphical window. Changing its behaviour like this
breaks command lines which, for example, use semihosting for their
output and don't want a graphical window, as they now get a monitor they
never asked for.
It also breaks the command line we document for Xen in
docs/system/i386/xen.html:
$ ./qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split \
-display none -chardev stdio,mux=on,id=char0,signal=off -mon char0 \
-device xen-console,chardev=char0 -drive file=${GUEST_IMAGE},if=xen
qemu-system-x86_64: cannot use stdio by multiple character devices
qemu-system-x86_64: could not connect serial device to character backend
'stdio'
When qemu is compiled without PIXMAN, by default the serials aren't
muxed with the monitor anymore on stdio. The serials are redirected to
"null" instead, and the monitor isn't set up.
Fixes: commit 1bec1cc0d ("ui/console: allow to override the default VC")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
In net_cleanup() we only need to delete the netdevs, as those may have
state which outlives Qemu when it exits, and thus may actually need to
be cleaned up on exit.
The nics, on the other hand, are owned by the device which created them.
Most devices don't bother to clean up on exit because they don't have
any state which will outlive Qemu... but XenBus devices do need to clean
up their nodes in XenStore, and do have an exit handler to delete them.
When the XenBus exit handler destroys the xen-net-device, it attempts
to delete its nic after net_cleanup() had already done so. And crashes.
Fix this by only deleting netdevs as we walk the list. As the comment
notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
*multiple* entries, including the "safely" saved 'next' pointer. But
we can store the *previous* entry, since nics are safe.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.
This implementation follows what bottom half does, but it does not add
a tracepoint for the case that the network device backend started
delivering a packet to a device which is already engaging in I/O. This
is because such reentrancy frequently happens for
qemu_flush_queued_packets() and is insignificant.
Fixes: CVE-2023-3019
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.
In preparation for such a change, add MemReentrancyGuard * as a
parameter of qemu_new_nic().
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The PNV I2C Controller was clearing the status register
after a reset without repopulating the "upper threshold
for I2C ports", "Command Complete" and the SCL/SDA input
level fields.
Fixed this for resets caused by a system reset as well
as from writing to the "Immediate Reset" register.
Fixes: 263b81ee15 ("ppc/pnv: Add an I2C controller model")
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The PNV I2C engines for power9 and power10 were being assigned a base
XSCOM address that was off by one I2C engine's address range such
that engine 0 had engine 1's address and so on. The xscom address
assignment was being based on the device tree engine numbering, which
starts at 1. Rather than changing the device tree numbering to start
with 0, the addressing was changed to be based on the existing device
tree numbers minus one.
Fixes: 1ceda19c28 ("ppc/pnv: Connect PNV I2C controller to powernv10)
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The patch below fixes a bug in the VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2
macros in target/ppc/fpu_helper.c where a non-NaN floating point value from the
source vector is incorrectly converted to 0, 0x80000000, or 0x8000000000000000
instead of the expected value if a preceding source floating point value from
the same source vector was a NaN.
The bug in the VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macros in
target/ppc/fpu_helper.c was introduced with commit c3f24257e3.
This patch also adds a new vsx_f2i_nan test in tests/tcg/ppc64 that checks that
the VSX xvcvspsxws, xvcvspuxws, xvcvspsxds, xvcvspuxds, xvcvdpsxws, xvcvdpuxws,
xvcvdpsxds, and xvcvdpuxds instructions correctly convert non-NaN floating point
values to integer values if the source vector contains NaN floating point values.
Fixes: c3f24257e3 ("target/ppc: Clear fpstatus flags on helpers missing it")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1941
Signed-off-by: John Platts <john_platts@hotmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Coverity warns that "i2c_bus_busy(i2c->busses[i]) << i" might overflow
because the expression is evaluated using 32-bit arithmetic and then
used in a context expecting a uint64_t.
While we are at it, introduce a PNV_I2C_MAX_BUSSES constant and check
the number of busses at realize time.
Fixes: Coverity CID 1523918
Cc: Glenn Miles <milesg@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
On LoongArch host, we got an Aborted from tcg_out_mov().
qemu-x86_64 configure with '--enable-debug'.
> (gdb) b /home1/gaosong/code/qemu/tcg/loongarch64/tcg-target.c.inc:312
> Breakpoint 1 at 0x2576f0: file /home1/gaosong/code/qemu/tcg/loongarch64/tcg-target.c.inc, line 312.
> (gdb) run hello
[...]
> Thread 1 "qemu-x86_64" hit Breakpoint 1, tcg_out_mov (s=0xaaaae91760 <tcg_init_ctx>, type=TCG_TYPE_V128, ret=TCG_REG_V2,
> arg=TCG_REG_V0) at /home1/gaosong/code/qemu/tcg/loongarch64/tcg-target.c.inc:312
> 312 g_assert_not_reached();
> (gdb) bt
> #0 tcg_out_mov (s=0xaaaae91760 <tcg_init_ctx>, type=TCG_TYPE_V128, ret=TCG_REG_V2, arg=TCG_REG_V0)
> at /home1/gaosong/code/qemu/tcg/loongarch64/tcg-target.c.inc:312
> #1 0x000000aaaad0fee0 in tcg_reg_alloc_mov (s=0xaaaae91760 <tcg_init_ctx>, op=0xaaaaf67c20) at ../tcg/tcg.c:4632
> #2 0x000000aaaad142f4 in tcg_gen_code (s=0xaaaae91760 <tcg_init_ctx>, tb=0xffe8030340 <code_gen_buffer+197328>,
> pc_start=4346094) at ../tcg/tcg.c:6135
[...]
> (gdb) c
> Continuing.
> **
> ERROR:/home1/gaosong/code/qemu/tcg/loongarch64/tcg-target.c.inc:312:tcg_out_mov: code should not be reached
> Bail out! ERROR:/home1/gaosong/code/qemu/tcg/loongarch64/tcg-target.c.inc:312:tcg_out_mov: code should not be reached
>
> Thread 1 "qemu-x86_64" received signal SIGABRT, Aborted.
> 0x000000fff7b1c390 in raise () from /lib64/libc.so.6
> (gdb) q
Fixes: 16288ded94 ("tcg/loongarch64: Lower basic tcg vec ops to LSX")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20231120065916.374045-1-gaosong@loongson.cn>
Both i.MX25 and i.MX6 SoC models ignore the Error argument when
setting the PHY number. Pick &error_abort which is the error
used by the i.MX7 SoC (see commit 1f7197deb0 "ability to change
the FEC PHY on i.MX7 processor").
Fixes: 74c1330582 ("ability to change the FEC PHY on i.MX25 processor")
Fixes: a9c167a3c4 ("ability to change the FEC PHY on i.MX6 processor")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231120115116.76858-1-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The 'stm32vldiscovery' machine ignores the CPU type requested by
the command line. This might confuse users, since the following
will create a machine with a Cortex-M3 CPU:
$ qemu-system-aarch64 -M stm32vldiscovery -cpu neoverse-n1
Set the MachineClass::valid_cpu_types field (introduced in commit
c9cf636d48 "machine: Add a valid_cpu_types property").
Remove the now unused MachineClass::default_cpu_type field.
We now get:
$ qemu-system-aarch64 -M stm32vldiscovery -cpu neoverse-n1
qemu-system-aarch64: Invalid CPU type: neoverse-n1-arm-cpu
The valid types are: cortex-m3-arm-cpu
Since the SoC family can only use Cortex-M3 CPUs, hard-code the
CPU type name at the SoC level, removing the QOM property
entirely.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-id: 20231117071704.35040-5-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The 'netduino2' machine ignores the CPU type requested by the
command line. This might confuse users, since the following will
create a machine with a Cortex-M3 CPU:
$ qemu-system-arm -M netduino2 -cpu cortex-a9
Set the MachineClass::valid_cpu_types field (introduced in commit
c9cf636d48 "machine: Add a valid_cpu_types property").
Remove the now unused MachineClass::default_cpu_type field.
We now get:
$ qemu-system-arm -M netduino2 -cpu cortex-a9
qemu-system-arm: Invalid CPU type: cortex-a9-arm-cpu
The valid types are: cortex-m3-arm-cpu
Since the SoC family can only use Cortex-M3 CPUs, hard-code the
CPU type name at the SoC level, removing the QOM property
entirely.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-id: 20231117071704.35040-4-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Both 'netduinoplus2' and 'olimex-stm32-h405' machines ignore the
CPU type requested by the command line. This might confuse users,
since the following will create a machine with a Cortex-M4 CPU:
$ qemu-system-aarch64 -M netduinoplus2 -cpu cortex-r5f
Set the MachineClass::valid_cpu_types field (introduced in commit
c9cf636d48 "machine: Add a valid_cpu_types property").
Remove the now unused MachineClass::default_cpu_type field.
We now get:
$ qemu-system-aarch64 -M netduinoplus2 -cpu cortex-r5f
qemu-system-aarch64: Invalid CPU type: cortex-r5f-arm-cpu
The valid types are: cortex-m4-arm-cpu
Since the SoC family can only use Cortex-M4 CPUs, hard-code the
CPU type name at the SoC level, removing the QOM property
entirely.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-id: 20231117071704.35040-3-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ICC_PMR_ELx and ICV_PMR_ELx bit masks returned from
ic{c,v}_fullprio_mask should technically also remove any
bit above 7 as these are marked reserved (read 0) and should
therefore should not be written as anything other than 0.
This was noted during a run of a proprietary test system and
discused on the mailing list [1] and initially thought not to
be an issue due to RES0 being technically allowed to be
written to and read back as long as the implementation does
not use the RES0 bits. It is very possible that the values
are used in comparison without masking, as pointed out by
Peter in [2], if (cs->hppi.prio >= cs->icc_pmr_el1) may well
do the wrong thing.
Masking these values in ic{c,v}_fullprio_mask() should fix
this and prevent any future problems with playing with the
values.
[1]: https://lists.nongnu.org/archive/html/qemu-arm/2023-11/msg00607.html
[2]: https://lists.nongnu.org/archive/html/qemu-arm/2023-11/msg00737.html
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Message-id: 20231116172818.792364-1-ben.dooks@codethink.co.uk
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
HPPA64-PATCHES-for-8.2
Two patches for 8.2.
The SHRPD patch fixes a real translation bug which then allows to boot
the 64-bit Linux kernels of the Debian-11 and Debian-12 installation CDs.
The second patch adds the instruction byte sequence to the
assembly log. This is not an actual bug fix, but it's important since
it helps a lot when trying to fix qemu translation bugs on hppa.
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZVfHPwAKCRD3ErUQojoP
# X3TrAQD2SfFsTWIYqTamh1ZHmydaJRL1xhXmPMqXgXFkDmiyhQD/VeyIyWEGj5Oe
# x70WR8HrtkadsUddgSGzFRChaVb0/wI=
# =Sapq
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 17 Nov 2023 15:04:15 EST
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'hppa64-fixes-pull-request' of https://github.com/hdeller/qemu-hppa:
disas/hppa: Show hexcode of instruction along with disassembly
target/hppa: Fix 64-bit SHRPD instruction
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In FDPIC signal handlers are passed around as FD pointers. Actual code
address and GOT pointer must be fetched from memory by the QEMU code
that implements kernel signal delivery functionality. This change is
equivalent to the following kernel change:
9c2cc74fb31e ("xtensa: fix signal delivery to FDPIC process")
Cc: qemu-stable@nongnu.org
Fixes: d2796be69d ("linux-user: add support for xtensa FDPIC")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
On hppa many instructions can be expressed by different bytecodes.
To be able to debug qemu translation bugs it's therefore necessary to see the
currently executed byte codes without the need to lookup the sequence without
the full executable.
With this patch the instruction byte code is shown beside the disassembly.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
When shifting the two joined 64-bit registers right, shift the upper
64-bit register to the left and the lower 64-bit register to the right
before merging them with OR.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Improve
$ qemu-system-x86_64 -device max-x86_64-cpu,vendor=me
qemu-system-x86_64: -device max-x86_64-cpu,vendor=me: Property '.vendor' doesn't take value 'me'
to
qemu-system-x86_64: -device max-x86_64-cpu,vendor=0123456789abc: value of property 'vendor' must consist of exactly 12 characters
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031111059.3407803-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Typo corrected]
The error message
{"execute": "balloon", "arguments":{"value": -1}}
{"error": {"class": "GenericError", "desc": "Parameter 'target' expects a size"}}
points to 'target' instead of 'value'. Fix:
{"error": {"class": "GenericError", "desc": "Parameter 'value' expects a size"}}
Root cause: qmp_balloon()'s parameter is named @target. Rename it to
@value to match the QAPI schema.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031111059.3407803-7-armbru@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
The error message
$ qemu-system-x86_64 -netdev user,id=net0,ipv6-net=fec0::0/
qemu-system-x86_64: -netdev user,id=net0,ipv6-net=fec0::0/: Parameter 'ipv6-prefixlen' expects a number
points to ipv6-prefixlen instead of ipv6-net. Fix:
qemu-system-x86_64: -netdev user,id=net0,ipv6-net=fec0::0/: parameter 'ipv6-net' expects a number after '/'
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031111059.3407803-6-armbru@redhat.com>
set_password with "protocol": "vnc" supports only "connected": "keep".
Any other value is rejected with
Invalid parameter 'connected'
Improve this to
parameter 'connected' must be 'keep' when 'protocol' is 'vnc'
client_migrate_info requires "port" or "tls-port". When both are
missing, it fails with
Parameter 'port/tls-port' is missing
Improve this to
parameter 'port' or 'tls-port' is required
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031111059.3407803-5-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When dynamic-reconfiguration is off, hot plug / unplug can fail with
"Bus 'spapr-pci-host-bridge' does not support hotplugging".
spapr-pci-host-bridge is a device, not a bus. Report the name of the
bus it provides instead: 'pci.0'.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031111059.3407803-2-armbru@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Like replay_linux.py, reverse_debugging.py starts the vm with console
set but does not interact with it (e.g., with wait_for_console_pattern).
In this situation, the console should have a drainer attached so the
socket does not fill. replay_linux.py has a drainer, but it is missing
from reverse_debugging.py.
Per analysis in Link: this can cause the console socket/pipe to fill and
QEMU get stuck in qemu_chr_write_buffer, leading to strange test case
failures (ppc64 fails because it prints a lot to console in early bios).
Attaching a drainer prevents this.
Note, this commit does not fix bugs introduced by the commits referenced
in the first two Fixes: tags, but together those commits conspire to
irritate the problem and cause test case failure, which this commit
fixes.
Link: https://lore.kernel.org/qemu-devel/ZVT-bY9YOr69QTPX@redhat.com/
Fixes: 1d4796cd00 ("python/machine: use socketpair() for console connections")
Fixes: 761a13b239 ("tests/avocado: ppc64 reverse debugging tests for pseries and powernv")
Fixes: be52eca309 ("tests/acceptance: add reverse debugging test")
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20231116115354.228678-1-npiggin@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In a perfect world we'd have reproducible tests,
but then we'd be sure we run the same binaries.
If a binary artifact isn't hashed, we have no idea
what we are running. Therefore enforce hashing for
all our artifacts.
With this change, unhashed artifacts produce:
$ avocado run tests/avocado/multiprocess.py
(1/2) tests/avocado/multiprocess.py:Multiprocess.test_multiprocess_x86_64:
ERROR: QemuBaseTest.fetch_asset() missing 1 required positional argument: 'asset_hash' (0.19 s)
Inspired-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20231115205149.90765-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The multiprocess test is currently succeeding with an annoying warning:
(1/2) tests/avocado/multiprocess.py:Multiprocess.test_multiprocess_x86_64:
WARN: Test passed but there were warnings during execution. Check
the log for details
In the log, you can find an entry like:
WARNI| No hash provided. Cannot check the asset file integrity.
Add the proper asset hashes to avoid those warnings.
Message-ID: <20231115145852.494052-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The "edid" feature has been added to vhost-user-gpu in commit
c06444261e ("contrib/vhost-user-gpu: implement get_edid feature"),
so waiting for "features: +virgl -edid" in the test does not work
anymore, it's "+edid" instead of "-edid" now!
While we're at it, move the expected string to the preceeding
exec_command_and_wait_for_pattern() instead (since waiting for
empty string here does not make too much sense).
Message-ID: <20231114203456.319093-1-thuth@redhat.com>
Reviewed-by: Antonio Caggiano <quic_acaggian@quicinc.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Fixes: 2e12dd405c "util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow"
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: bb67ec32a0 "target/hppa: Include PSW_P in tb flags and mmu index"
Fixes: d7553f3591 "target/hppa: Populate an interval tree with valid tlb entries"
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 593c28c02c "migration/doc: How to migrate when hosts have different features"
Fixes: 1aefe2ca14 "migration/doc: Add documentation for backwards compatiblity"
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 761e3c1088 "gdbstub: fixes cases where wrong threads were reported to GDB on SIGINT"
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 388d6b574e "hw/cxl: Use switch statements for read and write of cachemem registers"
Fixes: 3314efd276 "hw/cxl/mbox: Add Physical Switch Identify command."
Fixes: 004e3a93b8 "hw/cxl: Add tunneled command support to mailbox for switch cci."
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: a99d740347 "bsd-user: Implement do_obreak function"
Fixes: 8632729060 "bsd-user: Implement freebsd_exec_common, used in implementing execve/fexecve."
Fixes: bf14f13d8b "bsd-user: Implement stat related syscalls"
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qdict.txt only consists of more or less random test data. We
can simply drop the lines with the problematic words here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The tests/decode/ folder belongs to scripts/decodetree.py, so
it should be listed in the same section as the script.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Aspeed watchdog doesn't use anything from the System Control Unit.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Without this, we just dirty a single byte, and so if the caller writes
more than one byte to the host memory then we won't have invalidated any
translation blocks that start after the first byte and overlap those
writes. In particular, AArch64's DC ZVA implementation uses probe_access
(via probe_write), and so we don't invalidate the entire block, only the
TB overlapping the first byte (and, in the unusual case an unaligned VA
is given to the instruction, we also probe that specific address in
order to get the right VA reported on an exception, so will invalidate a
TB overlapping that address too). Since our IC IVAU implementation is a
no-op for system emulation that relies on the softmmu already having
detected self-modifying code via this mechanism, this means we have
observably wrong behaviour when jumping to code that has been DC ZVA'ed.
In practice this is an unusual thing for software to do, as in reality
the OS will DC ZVA the page and the application will go and write actual
instructions to it that aren't UDF #0, but you can write a test that
clearly shows the faulty behaviour.
For functions other than probe_access it's not clear what size to use
when 0 is passed in. Arguably a size of 0 shouldn't dirty at all, since
if you want to actually write then you should pass in a real size, but I
have conservatively kept the implementation as dirtying the first byte
in that case so as to avoid breaking any assumptions about that
behaviour.
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Message-Id: <20231104031232.3246614-1-jrtc27@jrtc27.com>
[rth: Move the dirtysize computation next to notdirty_write.]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PV dumps block vcpu runs until dump end is reached. If there's an
error between PV dump init and PV dump end the vm will never be able
to run again. One example of such an error is insufficient disk space
for the dump file.
Let's add a cleanup function that tries to do a dump end. The dump
completion data is discarded but there's no point in writing it to a
file anyway if there's a possibility that other PV dump data is
missing.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20231109120443.185979-4-frankja@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Python 3.12 now warns about backslashes in strings that aren't used
for escaping a special character from Python. Silence the warning
by using raw strings here instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231113140721.46903-1-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The new SeaBIOS-hppa version 12 includes the necessary fixes to
support emulated PA2.0 CPUs and which allows starting 64-bit Linux
kernels in the guest.
To boot a 64-bit machine use the "-machine C3700" qemu option.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
SEABIOS_HPPA_VERSION 12 contains those fixes and enhancements:
- Reduce debug level
- Update README file for PA-RISC
- Fix debug name of CPU_HPA_xx if xx >= 10
- Disable device indexing
SEABIOS_HPPA_VERSION 11 contains those fixes and enhancements
(mostly to enable support for 64-bit Linux kernel):
- Fixed 64-bit CPU detection via "mfctl,w" instruction
- Implement PDC_PSW for 64-bit CPUs
- Added PAT PDC functions:
- PDC_PAT_CELL
- PDC_PAT_CHASSIS_LOG
- PDC_PAT_PD_GET_ADDR_MAP
- PDC_PAT_CPU
- Fix return value of PDC_CACHE_RET_SPID space-id bits
- Introduce new default software IDs for the machines
- Fix CPU and FPU model numbers
- Fix 64-bit SMP rendezvous
- Fix Linux 64-bit kernel crash in STI due to usage of unsigned
32-bit "next_font" pointer in sti header files
- Fix graphics output to LASI artist card on PA2.0 machines
- More USB OHCI endianess fixes
- Fixes which make ODE run on B160L
- Fixes which make ODE detect Astro Runway port and CPUs
- Implement "firmware unlocking" via PDC_MODEL/PDC_MODEL_CAPABILITIES call
- Add subfunction 2 for PDC_MODEL_VERSIONS
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Something appears to be off between the 64-bit CPU, the 32-bit PDC
(SeaBIOS-hppa firmware), and the 64-bit kernel in addressing the
power button address in high-mapped firmware memory.
Use a 32-bit value at PAGE0->pad0[4] instead.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Apply the "32-bit PCI addressing on 40-bit Runway" as the default
iommu transformation. This allows PCI devices to dma PDC memory.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is the value that is supported by both PA-8500 and Astro.
If we support a larger address space than expected, we trip up
software that did not fill in all of the page table bits,
expecting them to be ignored.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Align the language with pa2.0, separating absolute and physical.
The translation from absolute to physical depends on PSW.W, and
we prefer not to flush between changes, therefore use 2 mmu_idx.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Coverty found that the shift of TARGET_PAGE_SIZE (32-bit type) might
overflow. Fix it by casting TARGET_PAGE_SIZE to a 64-bit type before
doing the shift (CID 1523902 and CID 1523908).
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <ZU6F/H8CZr3q4pP/@p100>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Need to use iasq_b and iaoq_b to determine back register of CR_IIASQ.
This fixes random faults when booting up Linux user space.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Direct privilege level to mmu_idx mapping has been
false for some time. Provide the correct value to
hppa_get_physical_address.
Fixes: fa824d99f9 ("target/hppa: Switch to use MMU indices 11-15")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
During the conversion to decodetree, the 2-bit mask was lost.
Fixes: deee69a19f ("target/hppa: Convert memory management insns")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
According to the technical reference manual, the Cortex-A9
has a Perfomance Unit Monitor (PMU):
https://developer.arm.com/documentation/100511/0401/performance-monitoring-unit/about-the-performance-monitoring-unit
The Cortex-A8 does also.
We already already define the PMU registers when emulating the
Cortex-A8 and Cortex-A9, because we put them in v7_cp_reginfo[]
rather than guarding them behind ARM_FEATURE_PMU. So the only thing
that setting the feature bit changes is that the registers actually
do something.
Enable ARM_FEATURE_PMU for Cortex-A8 and Cortex-A9, to avoid
this anomaly.
(The A8 and A9 PMU predates the standardisation of ID_DFR0.PerfMon,
so the field there is 0, but the PMU is still present.)
Signed-off-by: Nikita Ostrenkov <n.ostrenkov@gmail.com>
Message-id: 20231112165658.2335-1-n.ostrenkov@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message; also enable PMU for A8]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU has validations to make sure that a VM is not started with more memory
(static and hotpluggable memory) than what the guest processor can address
directly with its addressing bits. This change adds a test to make sure QEMU
fails to start with a specific error message when an attempt is made to
start a VM with more memory than what the processor can directly address.
The test also checks for passing cases when the address space of the processor
is capable of addressing all memory. Boundary cases are tested.
CC: imammedo@redhat.com
CC: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-ID: <20231109045601.33349-1-anisinha@redhat.com>
Message-ID: <D5D8D419-76BA-4FB0-9BAC-4F7470A052FC@redhat.com>
[PMD: Use SPDX tag]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When calling trace_vmware_verify_rect_greater_than_bound() replace
"y" with "h" and y with h
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 02218aedb1 ("hw/display/vmware_vga: replace fprintf calls with trace events")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231110174104.13280-1-adiupina@astralinux.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When we are doing a FEAT_MOPS copy that must be performed backwards,
we call mte_mops_probe_rev(), passing it the address of the last byte
in the region we are probing. However, allocation_tag_mem_probe()
wants the address of the first byte to get the tag memory for.
Because we passed it (ptr, size) we could incorrectly trip the
allocation_tag_mem_probe() check for "does this access run across to
the following page", and if that following page happened not to be
valid then we would assert.
We know we will always be only dealing with a single page because the
code that calls mte_mops_probe_rev() ensures that. We could make
mte_mops_probe_rev() pass 'ptr - (size - 1)' to
allocation_tag_mem_probe(), but then we would have to adjust the
returned 'mem' pointer to get back to the tag RAM for the last byte
of the region. It's simpler to just pass in a size of 1 byte,
because we know that allocation_tag_mem_probe() in pure-probe
single-page mode doesn't care about the size.
Fixes: 69c51dc372 ("target/arm: Implement MTE tag-checking functions for FEAT_MOPS copies")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231110162546.2192512-1-peter.maydell@linaro.org
Since commit 9036e917f8 ("{include/}hw/arm: refactor virt PPI logic"),
GIC maintenance IRQ registration fails on arm64:
[ 0.979743] kvm [1]: Cannot register interrupt 9
That commit re-defined VIRTUAL_PMU_IRQ to be a INTID but missed a case
where the maintenance IRQ is actually referred by its PPI index. Just
like commit fa68ecb330 ("hw/arm/virt: fix PMU IRQ registration"), use
INITID_TO_PPI(). A search of "GIC_FDT_IRQ_TYPE_PPI" indicates that there
shouldn't be more similar issues.
Fixes: 9036e917f8 ("{include/}hw/arm: refactor virt PPI logic")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20231110090557.3219206-2-jean-philippe@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- One virtio-mem fix leading to a QEMU crash in QEMU debug builds
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmVR4DsRHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1qKMQ//fe/4mJOXQ8l5OZ3ScpC2K7yoB9dowJiQ
# vobja0X0UhyMIOEH4V5RDtMrW3WcYzD2rVwehpLel3QbwcGa7TTB8NtkTx/t4L8P
# tRQe3epGvz+0Kkx4kBFcNBYNR5Skl1rg9kcDhYxNmoOLngWjJcDqRBryfc3V9pEs
# dl9sWXaQn82MGNQGuWFnTOUeOgg1LIdKMRcU2AzhAhrA/e4BqOof/JW+PVdQfzDq
# 4Jhq74pDmKiuH9GmRZgbNlNFX+GxRk63jJrRw4HDAbSD5dBmVnLAjgFZ0sBcKxe0
# HyiGrZOZNIMhMl/GwwQ7NilN03Hl6Hqlx03nz96/2DbiEKr6sOAErIclkUOVlr7k
# YeJvFv+iijqyC4XF43OqoIOz8mtkxan8CuiZW/6/FV9mS/Rb3r8of/BnrK2a8/Kh
# RJLX3tsmrxFdFDxVXWPw+UYrJy8g0xQP2Ils3OReO8QO9qqCytPqJFQsSHDlK3T3
# 2K5FiDpMu7cjFezLyRF0LkPSWg1CV7D6Vc8mp+amc2K4Ltiyhp4xZ2TBKrEC8HHE
# zs+EyEIfsna4SaKwVUVRimWF3+B4GojoAcAD0zju+uhD8Zw+z553zXpr5TSx0Une
# cbMs1n5MTzE6pQo1MmL3hu1xaf6Xdx7hnJPlcnjlKXGFol8ghv6tBkHbOQA5B1/H
# 7hVX43f3epM=
# =7M1K
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 13 Nov 2023 03:37:15 EST
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2023-11-13' of https://github.com/davidhildenbrand/qemu:
virtio-mem: fix division by zero in virtio_mem_activate_memslots_to_plug()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Coverity complains about passing "&expected" to "run_range_inverse_array",
which dereferences null "expected". I guess the problem is that the
compare_ranges() loop dereferences 'e' without testing it. However the
loop condition is based on 'ranges' which is garanteed to have
the same length as 'expected' given the g_assert_cmpint() just
before the loop. So the code looks safe to me.
Nevertheless adding a test on expected before the loop to get rid of the
warning.
Fixes: CID 1523901
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Coverity (CID 1523901)
Message-ID: <20231110083654.277345-1-eric.auger@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We requiere the 'ninja-build', which depends on 'python311':
$ pkgin show-deps ninja-build
direct dependencies for ninja-build-1.11.1nb1
python311>=3.11.0
So we end up installing both Python v3.10 and v3.11:
[31/76] installing python311-3.11.5...
[54/76] installing python310-3.10.13...
[74/76] installing py310-expat-3.10.13nb1...
Then the build system picks Python v3.11, and doesn't find
py-expat because we only installed the 3.10 version:
python determined to be '/usr/pkg/bin/python3.11'
python version: Python 3.11.5
*** Ouch! ***
Python's pyexpat module is not found.
It's normally part of the Python standard library, maybe your distribution packages it separately?
Either install pyexpat, or alleviate the need for it in the first place by installing pip and setuptools for '/usr/pkg/bin/python3.11'.
(Hint: NetBSD's pkgsrc debundles this to e.g. 'py310-expat'.)
ERROR: python venv creation failed
Fix by installing py-expat for v3.11. Remove the v3.10
packages since we aren't using them anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231109150900.91186-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It's a little bit weird that the files in target/i386/ which
are not in a subfolder there do not have any associated
maintainer (and thus nobody might be CC:-ed on changes to
these files). We should have a general x86 section for these
files, similar to what we already have for s390x and mips.
Since Paolo is already listed as maintainer for both, the
x86 KVM and TCG CPUs, I'd like to suggest him as maintainer
for the general files, too.
Message-ID: <20230929134551.395438-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This header include/hw/timer/stellaris-gptm.h obviously belongs to the
Stellaris machines, so let's add it to the corresponding section.
And hw/display/ssd0303.c and hw/display/ssd0323.c are only used
by hw/arm/stellaris.c, so add them to the corresponding section
in the MAINTAINERS file, too.
Message-ID: <20231020060936.524988-5-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The current code assumes that there is always a vfio group, but
that's no longer guaranteed with the iommufd backend when using
cdev. In this case, we don't need to track the vfio dma limit
anyway.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-ID: <20231110175108.465851-2-mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When compiling QEMU with Clang 17 on a s390x, the compilation fails:
In file included from ../accel/tcg/cputlb.c:32:
In file included from /root/qemu/include/exec/helper-proto-common.h:10:
In file included from /root/qemu/include/qemu/atomic128.h:62:
/root/qemu/host/include/generic/host/atomic128-ldst.h:68:15: error:
__sync builtin operation MUST have natural alignment (consider using __
atomic). [-Werror,-Wsync-alignment]
68 | } while (!__sync_bool_compare_and_swap_16(ptr_align, old, new.i));
| ^
In file included from ../accel/tcg/cputlb.c:32:
In file included from /root/qemu/include/exec/helper-proto-common.h:10:
In file included from /root/qemu/include/qemu/atomic128.h:61:
/root/qemu/host/include/generic/host/atomic128-cas.h:36:11: error:
__sync builtin operation MUST have natural alignment (consider using __a
tomic). [-Werror,-Wsync-alignment]
36 | r.i = __sync_val_compare_and_swap_16(ptr_align, c.i, n.i);
| ^
2 errors generated.
It's arguably a bug in Clang since we already use __builtin_assume_aligned()
to tell the compiler that the pointer is properly aligned. But according to
https://github.com/llvm/llvm-project/issues/69146 it seems like the Clang
folks don't see an easy fix on their side and recommend to use a type
declared with __attribute__((aligned(16))) to work around this problem.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1934
Message-ID: <20231108085954.313071-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Pylint warns:
tests/qapi-schema/test-qapi.py:139:13: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
tests/qapi-schema/test-qapi.py:143:13: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
Add encoding='utf-8'.
Pylint advises:
tests/qapi-schema/test-qapi.py:143:13: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
Silence this by returning the value directly.
Pylint advises:
tests/qapi-schema/test-qapi.py:221:4: R1722: Consider using sys.exit() (consider-using-sys-exit)
tests/qapi-schema/test-qapi.py:226:4: R1722: Consider using sys.exit() (consider-using-sys-exit)
Sure, why not.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231025092925.1785934-1-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Pylint advises:
docs/sphinx/qapidoc.py:518:12: W0707: Consider explicitly re-raising using 'raise ExtensionError(str(err)) from err' (raise-missing-from)
>From its manual:
Python's exception chaining shows the traceback of the current
exception, but also of the original exception. When you raise a
new exception after another exception was caught it's likely that
the second exception is a friendly re-wrapping of the first
exception. In such cases `raise from` provides a better link
between the two tracebacks in the final error.
Makes sense, so do it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231025092159.1782638-2-armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Bugs love to hide in such code.
Evidence: commit bbde656263 (migration/rdma: Fix save_page method to
fail on polling error).
Enable -Wshadow=local to prevent such issues. Possible thanks to
recent cleanups. Enabling -Wshadow would prevent more issues, but
we're not yet ready for that.
As usual, the warning is only enabled when the compiler recognizes it.
GCC does, Clang doesn't.
Some shadowed locals remain in bsd-user. Since BSD prefers Clang,
let's not wait for its cleanup.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231026053115.2066744-2-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When running with "dynamic-memslots=off", we enter
virtio_mem_activate_memslots_to_plug() to return immediately again
because "vmem->dynamic_memslots == false". However, the compiler might
not optimize out calculating start_idx+end_idx, where we divide by
vmem->memslot_size. In such a configuration, the memslot size is 0 and
we'll get a division by zero:
(qemu) qom-set vmem0 requested-size 3G
(qemu) q35.sh: line 38: 622940 Floating point exception(core dumped)
The same is true for virtio_mem_deactivate_unplugged_memslots(), however
we never really reach that code without a prior
virtio_mem_activate_memslots_to_plug() call.
Let's fix it by simply calling these functions only with
"dynamic-memslots=on".
This was found when using a debug build of QEMU.
Message-ID: <20231023111341.219317-1-david@redhat.com>
Reprted-by: Mario Casquero <mcasquer@redhat.com>
Fixes: 177f9b1ee4 ("virtio-mem: Expose device memory dynamically via multiple memslots if enabled")
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
The Intel 82576EB GbE Controller say that the Physical and Virtual
Functions support Function Level Reset. Add the capability to the PF
device model using device property "x-pcie-flr-init" which is "on" by
default and "off" for machines <= 8.1 to preserve compatibility.
The FLR capability of the VF model is defined according to the FLR
property of the PF, this to avoid adding an extra compatibility
property.
Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
No need to declare a new variable in the the inner code block
here, we can re-use the "ret" variable that has been declared
at the beginning of the function. With this change, the code
can now be successfully compiled with -Wshadow=local again.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231023175038.111607-1-thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The system mask is a restricted subset of the psw, with only
a couple of reserved bits. It is better to handle this up
front in the translator than require helper_swap_system_mask
to use cpu_hppa_get_psw and cpu_hppa_put_psw.
Signed-off-by: Helge Deller <deller@gmx.de>
[rth: Handle this in expand_sm_imm not helper_swap_system_mask.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qdev: Make array properties user accessible again
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmVOZicRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9arpw/+NKGRhSMrSq9Az+z5+ANUfw5SNLJYf1hH
# jm5ITA1Gr9htqHtBfEOdkms2wef6m7onF72rHVUlBKdqCPNMGLme5B0oQ8PZ1X1t
# OxAZ8KYwlO98QvOYl617SA/8wxc0U4/zi192kJpbRkKF6KdbbMGtLKjHyEitA/Yv
# izx1vkKOgQyMFGF1JgIyG4R3WmsKQW1XLqb3emVNRzCqmJpkvMJZQG8tnyEAXlIS
# gkY69cTpaKVaM1OxdB45gjlKTGzLWC/3tTGH+u8q356fvgm/QIgrokCirCZFPIl0
# C8hvzPm/L8hkvWtUb3EZx0DLiunWcAGvoLgBNODHojKRtQ6X9TRTrjJ41ZCLXVqv
# tVJm+XGKC0CZ/WW5yqVOmnzfPH4z8ubzSoRv5ryz3xDb5B/Zr10+ScE+/Ee24wJ2
# HIehxc1LgVGGpikP88/Ns/nAlIVUQxxYvSJ23R5D1+UpP6FCy6Y1pKyRtZGzPCIe
# N4Y+52GtelBR8gOjay5INn/Yf8Fh6sFxX556BW0XKYcbQgvl2bxASe/KVnAVZ1NB
# 8DsaAWlK+hPGopwyp2lDRuGd4kusNbzQvIUZ0mr1g9HQ/iSnT/9RFdExsj+K6QTr
# pX42QCe4mWHPAKx38cez+Bhx4TEOw+GmHuTp/oLdBRuY8DPu/I0Ny364uiW+At/R
# 8jF+jt5uVZc=
# =MV6O
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 11 Nov 2023 01:19:35 HKT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'qdev-array-prop' of https://repo.or.cz/qemu/kevin:
qdev: Rework array properties based on list visitor
qdev: Make netdev properties work as list elements
qom: Add object_property_set_default_list()
hw/rx/rx62n: Use qdev_prop_set_array()
hw/arm/xlnx-versal: Use qdev_prop_set_array()
hw/arm/virt: Use qdev_prop_set_array()
hw/arm/vexpress: Use qdev_prop_set_array()
hw/arm/sbsa-ref: Use qdev_prop_set_array()
hw/arm/mps2: Use qdev_prop_set_array()
hw/arm/mps2-tz: Use qdev_prop_set_array()
hw/i386/pc: Use qdev_prop_set_array()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-11 11:23:25 +08:00
3247 changed files with 124526 additions and 91080 deletions
- if test -n "@PYPI_PKGS@" ; then @PIP3@ install @PYPI_PKGS@ ; fi
- if test -n "@PYPI_PKGS@" ; then PYLIB=$(@PYTHON@ -c 'import sysconfig; print(sysconfig.get_path("stdlib"))'); rm -f $PYLIB/EXTERNALLY-MANAGED; @PIP3@ install @PYPI_PKGS@ ; fi
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.