There are 27 pre-copy live migration scenarios being tested. In all of
these we force non-convergence and run for one iteration, then let it
converge and wait for completion during the second (or following)
iterations. At 3 mbps bandwidth limit the first iteration takes a very
long time (~30 seconds).
While it is important to test the migration passes and convergence
logic, it is overkill to do this for all 27 pre-copy scenarios. The
TLS migration scenarios in particular are merely exercising different
code paths during connection establishment.
To optimize time taken, switch most of the test scenarios to run
non-live (ie guest CPUs paused) with no bandwidth limits. This gives
a massive speed up for most of the test scenarios.
For test coverage the following scenarios are unchanged
* Precopy with UNIX sockets
* Precopy with UNIX sockets and dirty ring tracking
* Precopy with XBZRLE
* Precopy with UNIX compress
* Precopy with UNIX compress (nowait)
* Precopy with multifd
On a test machine this reduces execution time from 13 minutes to
8 minutes.
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-10-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Change the migration test to use the new qtest event callback to watch
for the stop event. This ensures that we only watch for the STOP event
on the source QEMU. The previous code would set the single 'got_stop'
flag when either source or dest QEMU got the STOP event.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-6-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This function duplicates logic of qtest_qmp_assert_success_ref.
The qtest_qmp_assert_success_ref method has better diagnostics
on failure because it prints the entire QMP response, instead
of just asserting on existance of the 'error' key.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-4-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Currently code must call one of the qtest_qmp_event* functions to
fetch events. These are only usable if the immediate caller knows
the particular event they want to capture, and are only interested
in one specific event type. Adding ability to register an event
callback lets the caller capture a range of events over any period
of time.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-3-berrange@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Currently, it is only done when the iteration finishes successfully.
Not cleaning up the userfaultfd write protection can lead to
symptoms/issues such as the process hanging in memmove or GDB not
being able to attach.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20230526115908.196171-1-f.ebner@proxmox.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
1. Otherwise failed migration just drops guest-panicked state, which is
not good for management software.
2. We do keep different paused states like guest-panicked during
migration with help of global_state state.
3. We do restore running state on source when migration is cancelled or
failed.
4. "postmigrate" state is documented as "guest is paused following a
successful 'migrate'", so originally it's only for successful path
and we never documented current behavior.
Let's restore paused states like guest-panicked in case of cancel or
fail too. Allow same transitions like for inmigrate state.
This commit changes the behavior that was introduced by commit
42da5550d6 "migration: set state to post-migrate on failure" and
provides a bit different fix on related
https://bugzilla.redhat.com/show_bug.cgi?id=1355683
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-6-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Pull request
- Stefano Garzarella's blkio block driver 'fd' parameter
- My thread-local blk_io_plug() series
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmR4uHoACgkQnKSrs4Gr
# c8hFBAgAo+SFrOteYgdELM9s0EWb0AU39MTOyNXW7i5mPZNXrn5J7pfRD/5wvI6l
# wl5GNMQ+M5HVYO7CumKWr4M1IpKV5Jin6FN/2h15fWkeg17lBOmNHUF+LctLYQbq
# HwtNA4hdw1+SEv8kQLBgiqSJMqWcn80X09emgPMCIwET9zxokRYwVjQJx2alM5bd
# SqgitDp5qlHyj5HQPX2orT9KrXYWQdGr8i50bn0S67r1wdqTRMu93wrWdEUUncId
# 7otlUaq8cARbRMJzIwDmy/cF24Ynr0wCJb4aHW+trRtf+PNgx1Ki+YOiz+LFyjq7
# t6KOMeignzhz9Uzq8EVG4XW8SHpGkw==
# =Ms48
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Jun 2023 08:25:46 AM PDT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
qapi: add '@fdset' feature for BlockdevOptionsVirtioBlkVhostVdpa
block/blkio: use qemu_open() to support fd passing for virtio-blk
block: remove bdrv_co_io_plug() API
block/linux-aio: convert to blk_io_plug_call() API
block/io_uring: convert to blk_io_plug_call() API
block/blkio: convert to blk_io_plug_call() API
block/nvme: convert to blk_io_plug_call() API
block: add blk_io_plug_call() API
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The virtio-blk-vhost-vdpa driver in libblkio 1.3.0 supports the fd
passing through the new 'fd' property.
Since now we are using qemu_open() on '@path' if the virtio-blk driver
supports the fd passing, let's announce it.
In this way, the management layer can pass the file descriptor of an
already opened vhost-vdpa character device. This is useful especially
when the device can only be accessed with certain privileges.
Add the '@fdset' feature only when the virtio-blk-vhost-vdpa driver
in libblkio supports it.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230530071941.8954-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Some virtio-blk drivers (e.g. virtio-blk-vhost-vdpa) supports the fd
passing. Let's expose this to the user, so the management layer
can pass the file descriptor of an already opened path.
If the libblkio virtio-blk driver supports fd passing, let's always
use qemu_open() to open the `path`, so we can handle fd passing
from the management layer through the "/dev/fdset/N" special path.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230530071941.8954-2-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now we no longer have dynamic state affecting things we can remove the
additional fields in cpu.h and simplify the TB hash calculation.
For the benchmark:
hyperfine -w 2 -m 20 \
"./arm-softmmu/qemu-system-arm -cpu cortex-a15 \
-machine type=virt,highmem=off \
-display none -m 2048 \
-serial mon:stdio \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-pci,netdev=unet \
-device virtio-scsi-pci \
-blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \
-device scsi-hd,drive=hd -smp 4 \
-kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \
-append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \
-snapshot"
It has a marginal effect on runtime, before:
Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s]
Range (min … max): 24.420 s … 32.565 s 20 runs
after:
Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s]
Range (min … max): 21.663 s … 29.937 s 20 runs
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-10-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stop using the .bdrv_co_io_plug() API because it is not multi-queue
block layer friendly. Use the new blk_io_plug_call() API to batch I/O
submission instead.
Note that a dev_max_batch check is dropped in laio_io_unplug() because
the semantics of unplug_fn() are different from .bdrv_co_unplug():
1. unplug_fn() is only called when the last blk_io_unplug() call occurs,
not every time blk_io_unplug() is called.
2. unplug_fn() is per-thread, not per-BlockDriverState, so there is no
way to get per-BlockDriverState fields like dev_max_batch.
Therefore this condition cannot be moved to laio_unplug_fn(). It is not
obvious that this condition affects performance in practice, so I am
removing it instead of trying to come up with a more complex mechanism
to preserve the condition.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230530180959.1108766-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce a new API for thread-local blk_io_plug() that does not
traverse the block graph. The goal is to make blk_io_plug() multi-queue
friendly.
Instead of having block drivers track whether or not we're in a plugged
section, provide an API that allows them to defer a function call until
we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is
called multiple times with the same fn/opaque pair, then fn() is only
called once at the end of the function - resulting in batching.
This patch introduces the API and changes blk_io_plug()/blk_io_unplug().
blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument
because the plug state is now thread-local.
Later patches convert block drivers to blk_io_plug_call() and then we
can finally remove .bdrv_co_io_plug() once all block drivers have been
converted.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20230530180959.1108766-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Favor using connect() when passing a socket instead of
open_with_socket(). Simultaneously, update constructor calls to use the
combined address argument for QEMUMonitorProtocol().
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20230517163406.2593480-5-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Instead of using accept() with sockets (which uses open_with_socket()),
use calls to connect() to utilize existing sockets instead. A benefit of
this is more robust error handling already present within the connect()
call that isn't present in open_with_socket().
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20230517163406.2593480-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Allow existing sockets to be passed to connect(). The changes are pretty
minimal, and this allows for far greater flexibility in setting up
communications with an endpoint.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20230517163406.2593480-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Improvements to 128-bit atomics:
- Separate __int128_t type and arithmetic detection
- Support 128-bit load/store in backend for i386, aarch64, ppc64, s390x
- Accelerate atomics via host/include/
Decodetree:
- Add named field syntax
- Move tests to meson
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmR2R10dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/bsgf/XLi8q+ITyoEAKwG4
# 6ML7DktLAdIs9Euah9twqe16U0BM0YzpKfymBfVVBKKaIa0524N4ZKIT3h6EeJo+
# f+ultqrpsnH+aQh4wc3ZCkEvRdhzhFT8VcoRTunJuJrbL3Y8n2ZSgODUL2a0tahT
# Nn+zEPm8rzQanSKQHq5kyNBLpgTUKjc5wKfvy/WwttnFmkTnqzcuEA6nPVOVwOHC
# lZBQCByIQWsHfFHUVJFvsFzBQbm0mAiW6FNKzPBkoXon0h/UZUI1lV+xXzgutFs+
# zR2O8IZwLYRu2wOWiTF8Nn2qQafkB3Dhwoq3JTEXhOqosOPExbIiWlsZDlPiKRJk
# bwmQlg==
# =XQMb
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 May 2023 11:58:37 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230530' of https://gitlab.com/rth7680/qemu: (27 commits)
tests/decode: Add tests for various named-field cases
scripts/decodetree: Implement named field support
scripts/decodetree: Implement a topological sort
scripts/decodetree: Pass lvalue-formatter function to str_extract()
docs: Document decodetree named field syntax
tests/decode: Convert tests to meson
decodetree: Do not remove output_file from /dev
decodetree: Diagnose empty pattern group
decodetree: Fix recursion in prop_format and build_tree
decodetree: Add --test-for-error
tcg: Remove TCG_TARGET_TLB_DISPLACEMENT_BITS
accel/tcg: Add aarch64 store_atom_insert_al16
accel/tcg: Add aarch64 lse2 load_atom_extract_al16_or_al8
accel/tcg: Add x86_64 load_atom_extract_al16_or_al8
accel/tcg: Extract store_atom_insert_al16 to host header
accel/tcg: Extract load_atom_extract_al16_or_al8 to host header
tcg/s390x: Support 128-bit load/store
tcg/ppc: Support 128-bit load/store
tcg/aarch64: Support 128-bit load/store
tcg/aarch64: Simplify constraints on qemu_ld/st
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The test_vhost_user_vga_virgl test currently fails on some CI
machines with:
qemu-system-x86_64: egl: no drm render node available
qemu-system-x86_64: egl: render node init failed
The other test in this file already checks whether there is
an error while starting QEMU - we should do the same for the
test_vhost_user_vga_virgl test, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230530180330.48722-1-thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implement support for named fields, i.e. where one field is defined
in terms of another, rather than directly in terms of bits extracted
from the instruction.
The new method referenced_fields() on all the Field classes returns a
list of fields that this field references. This just passes through,
except for the new NamedField class.
We can then use referenced_fields() to:
* construct a list of 'dangling references' for a format or
pattern, which is the fields that the format/pattern uses but
doesn't define itself
* do a topological sort, so that we output "field = value"
assignments in an order that means that we assign a field before
we reference it in a subsequent assignment
* check when we output the code for a pattern whether we need to
fill in the format fields before or after the pattern fields, and
do other error checking
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230523120447.728365-6-peter.maydell@linaro.org>
To support named fields, we will need to be able to do a topological
sort (so that we ensure that we output the assignment to field A
before the assignment to field B if field B refers to field A by
name). The good news is that there is a tsort in the python standard
library; the bad news is that it was only added in Python 3.9.
To bridge the gap between our current minimum supported Python
version and 3.9, provide a local implementation that has the
same API as the stdlib version for the parts we care about.
In future when QEMU's minimum Python version requirement reaches
3.9 we can delete this code and replace it with an 'import' line.
The core of this implementation is based on
https://code.activestate.com/recipes/578272-topological-sort/
which is MIT-licensed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230523120447.728365-5-peter.maydell@linaro.org>
To support referring to other named fields in field definitions, we
need to pass the str_extract() method a function which tells it how
to emit the code for a previously initialized named field. (In
Pattern::output_code() the other field will be "u.f_foo.field", and
in Format::output_extract() it is "a->field".)
Refactor the two callsites that currently do "output code to
initialize each field", and have them pass a lambda that defines how
to format the lvalue in each case. This is then used both in
emitting the LHS of the assignment and also passed down to
str_extract() as a new argument (unused at the moment, but will be
used in the following patch).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230523120447.728365-4-peter.maydell@linaro.org>
Document the named field syntax that we want to implement for the
decodetree script. This allows a field to be defined in terms of
some other field that the instruction pattern has already set, for
example:
%sz_imm 10:3 sz:3 !function=expand_sz_imm
to allow a function to be passed both an immediate field from the
instruction and also a sz value which might have been specified by
the instruction pattern directly (sz=1, etc) rather than being a
simple field within the instruction.
Note that the restriction on not having the format referring to the
pattern and the pattern referring to the format simultaneously is a
restriction of the decoder generator rather than inherently being a
silly thing to do.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230523120447.728365-3-peter.maydell@linaro.org>
Nor report any PermissionError on remove.
The primary purpose is testing with -o /dev/null.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Test err_pattern_group_empty.decode failed with exception:
Traceback (most recent call last):
File "./scripts/decodetree.py", line 1424, in <module> main()
File "./scripts/decodetree.py", line 1342, in main toppat.build_tree()
File "./scripts/decodetree.py", line 627, in build_tree
self.tree = self.__build_tree(self.pats, self.fixedbits,
File "./scripts/decodetree.py", line 607, in __build_tree
fb = i.fixedbits & innermask
TypeError: unsupported operand type(s) for &: 'NoneType' and 'int'
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use LPQ/STPQ when 16-byte atomicity is required.
Note that these instructions require 16-byte alignment.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use LQ/STQ with ISA v2.07, and 16-byte atomicity is required.
Note that these instructions do not require 16-byte alignment.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With FEAT_LSE2, LDP/STP suffices. Without FEAT_LSE2, use LDXP+STXP
16-byte atomicity is required and LDP/STP otherwise.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Adjust the softmmu tlb to use TMP[0-2], not any of the normally available
registers. Since we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need to allocate a second general-purpose temporary.
Rename the existing temps to add a distinguishing number.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With CPUINFO_ATOMIC_VMOVDQA, we can perform proper atomic
load/store without cmpxchg16b.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Older versions of clang have missing runtime functions for arithmetic
with -fsanitize=undefined (see 464e3671f9), so we cannot use
__int128_t for implementing Int128. But __int128_t is present,
data movement works, and it can be used for atomic128.
Probe for both CONFIG_INT128_TYPE and CONFIG_INT128, adjust
qemu/int128.h to define Int128Alias if CONFIG_INT128_TYPE,
and adjust the meson probe for atomics to use has_int128_type.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PAGE_WRITE is current writability, as modified by TB protection;
PAGE_WRITE_ORG is the original page writability.
Fixes: cdfac37be0 ("accel/tcg: Honor atomicity of loads")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The first move was incorrectly using TCG_TYPE_I32 while the second
move was correctly using TCG_TYPE_REG. This prevents a 64-bit host
from moving all 128-bits of the return value.
Fixes: ebebea53ef ("tcg: Support TCG_TYPE_I128 in tcg_out_{ld,st}_helper_{args,ret}")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Block layer patches
- Fix blockdev-create with iothreads
- Remove aio_disable_external() API
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmR2JIARHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9brtA/9HVdAdtJxW78J60TE2lTqE9XlqMOEHBZl
# 8GN72trjP2geY/9mVsv/XoFie4ecqFsYjwAWWUuXZwLgAo53jh7oFN7gBH5iGyyD
# +EukYEfjqoykX5BkoK0gbMZZUe5Y4Dr2CNXYw4bNg8kDzj2RLifGA1XhdL3HoiVt
# PHZrhwBR7ddww6gVOnyJrfGL8fMkW/ZNeKRhrTZuSP+63oDOeGTsTumD+YKJzfPs
# p5WlwkuPjcqbO+w32FeVOHVhNI4swkN5svz3fkr8NuflfA7kH6nBQ5wymObbaTLc
# Erx03lrtP1+6nw43V11UnYt6iDMg4EBUQwtzNaKFnk3rMIdjoQYxIM5FTBWL2rYD
# Dg6PhkncXQ1WNWhUaFqpTFLB52XAYsSa4/y2QAGP6nWbqAUAUknQ3exaMvWiq7Z0
# nZeyyhIWvpJIHGCArWRdqqh+zsBdsmUVuPGyZnZgL/cXoJboYiHMyMJSUWE0XxML
# NGrncwxdsBXkVGGwTdHpBT64dcu3ENRgwtraqRLQm+tp5MKNTJB/+Ug2/p1vonHT
# UOoHz//UPskn8sHIyevoHXeu2Ns0uIHzrAXr+7Ay+9UYyIH6a07F4b2BGqkfyi/i
# 8wQsDmJ/idx5C4q1+jS+GuIbpnjIx6nxXwXMqpscUXZmM4Am8OMkiKxQAa1wExGF
# paId+HHwyks=
# =yuER
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 May 2023 09:29:52 AM PDT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (32 commits)
aio: remove aio_disable_external() API
virtio: do not set is_external=true on host notifiers
virtio-scsi: implement BlockDevOps->drained_begin()
virtio-blk: implement BlockDevOps->drained_begin()
virtio: make it possible to detach host notifier from any thread
block/fuse: do not set is_external=true on FUSE fd
block/export: don't require AioContext lock around blk_exp_ref/unref()
block/export: rewrite vduse-blk drain code
hw/xen: do not set is_external=true on evtchn fds
xen-block: implement BlockDevOps->drained_begin()
block: drain from main loop thread in bdrv_co_yield_to_drain()
block: add blk_in_drain() API
hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore
block/export: stop using is_external in vhost-user-blk server
block/export: wait for vhost-user-blk requests when draining
util/vhost-user-server: rename refcount to in_flight counter
virtio-scsi: stop using aio_disable_external() during unplug
virtio-scsi: avoid race between unplug and transport event
hw/qdev: introduce qdev_is_realized() helper
block-backend: split blk_do_set_aio_context()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All callers now pass is_external=false to aio_set_fd_handler() and
aio_set_event_notifier(). The aio_disable_external() API that
temporarily disables fd handlers that were registered is_external=true
is therefore dead code.
Remove aio_disable_external(), aio_enable_external(), and the
is_external arguments to aio_set_fd_handler() and
aio_set_event_notifier().
The entire test-fdmon-epoll test is removed because its sole purpose was
testing aio_disable_external().
Parts of this patch were generated using the following coccinelle
(https://coccinelle.lip6.fr/) semantic patch:
@@
expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
@@
- aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
+ aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)
@@
expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
@@
- aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
+ aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The virtio-scsi Host Bus Adapter provides access to devices on a SCSI
bus. Those SCSI devices typically have a BlockBackend. When the
BlockBackend enters a drained section, the SCSI device must temporarily
stop submitting new I/O requests.
Implement this behavior by temporarily stopping virtio-scsi virtqueue
processing when one of the SCSI devices enters a drained section. The
new scsi_device_drained_begin() API allows scsi-disk to message the
virtio-scsi HBA.
scsi_device_drained_begin() uses a drain counter so that multiple SCSI
devices can have overlapping drained sections. The HBA only sees one
pair of .drained_begin/end() calls.
After this commit, virtio-scsi no longer depends on hw/virtio's
ioeventfd aio_set_event_notifier(is_external=true). This commit is a
step towards removing the aio_disable_external() API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-19-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Detach ioeventfds during drained sections to stop I/O submission from
the guest. virtio-blk is no longer reliant on aio_disable_external()
after this patch. This will allow us to remove the
aio_disable_external() API once all other code that relies on it is
converted.
Take extra care to avoid attaching/detaching ioeventfds if the data
plane is started/stopped during a drained section. This should be rare,
but maybe the mirror block job can trigger it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-18-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_queue_aio_detach_host_notifier() does two things:
1. It removes the fd handler from the event loop.
2. It processes the virtqueue one last time.
The first step can be peformed by any thread and without taking the
AioContext lock.
The second step may need the AioContext lock (depending on the device
implementation) and runs in the thread where request processing takes
place. virtio-blk and virtio-scsi therefore call
virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
AioContext.
The next patch will introduce a .drained_begin() function that needs to
call virtio_queue_aio_detach_host_notifier(). .drained_begin() functions
cannot call aio_poll() to wait synchronously for the BH. It is possible
for a .drained_poll() callback to asynchronously wait for the BH, but
that is more complex than necessary here.
Move the virtqueue processing out to the callers of
virtio_queue_aio_detach_host_notifier() so that the function can be
called from any thread. This is in preparation for the next patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-17-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is part of ongoing work to remove the aio_disable_external() API.
Use BlockDevOps .drained_begin/end/poll() instead of
aio_set_fd_handler(is_external=true).
As a side-effect the FUSE export now follows AioContext changes like the
other export types.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-16-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The FUSE export calls blk_exp_ref/unref() without the AioContext lock.
Instead of fixing the FUSE export, adjust blk_exp_ref/unref() so they
work without the AioContext lock. This way it's less error-prone.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-15-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vduse_blk_detach_ctx() waits for in-flight requests using
AIO_WAIT_WHILE(). This is not allowed according to a comment in
bdrv_set_aio_context_commit():
/*
* Take the old AioContex when detaching it from bs.
* At this point, new_context lock is already acquired, and we are now
* also taking old_context. This is safe as long as bdrv_detach_aio_context
* does not call AIO_POLL_WHILE().
*/
Use this opportunity to rewrite the drain code in vduse-blk:
- Use the BlockExport refcount so that vduse_blk_exp_delete() is only
called when there are no more requests in flight.
- Implement .drained_poll() so in-flight request coroutines are stopped
by the time .bdrv_detach_aio_context() is called.
- Remove AIO_WAIT_WHILE() from vduse_blk_detach_ctx() to solve the
.bdrv_detach_aio_context() constraint violation. It's no longer
needed due to the previous changes.
- Always handle the VDUSE file descriptor, even in drained sections. The
VDUSE file descriptor doesn't submit I/O, so it's safe to handle it in
drained sections. This ensures that the VDUSE kernel code gets a fast
response.
- Suspend virtqueue fd handlers in .drained_begin() and resume them in
.drained_end(). This eliminates the need for the
aio_set_fd_handler(is_external=true) flag, which is being removed from
QEMU.
This is a long list but splitting it into individual commits would
probably lead to git bisect failures - the changes are all related.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-14-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
is_external=true suspends fd handlers between aio_disable_external() and
aio_enable_external(). The block layer's drain operation uses this
mechanism to prevent new I/O from sneaking in between
bdrv_drained_begin() and bdrv_drained_end().
The previous commit converted the xen-block device to use BlockDevOps
.drained_begin/end() callbacks. It no longer relies on is_external=true
so it is safe to pass is_external=false.
This is part of ongoing work to remove the aio_disable_external() API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-13-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Detach event channels during drained sections to stop I/O submission
from the ring. xen-block is no longer reliant on aio_disable_external()
after this patch. This will allow us to remove the
aio_disable_external() API once all other code that relies on it is
converted.
Extend xen_device_set_event_channel_context() to allow ctx=NULL. The
event channel still exists but the event loop does not monitor the file
descriptor. Event channel processing can resume by calling
xen_device_set_event_channel_context() with a non-NULL ctx.
Factor out xen_device_set_event_channel_context() calls in
hw/block/dataplane/xen-block.c into attach/detach helper functions.
Incidentally, these don't require the AioContext lock because
aio_set_fd_handler() is thread-safe.
It's safer to register BlockDevOps after the dataplane instance has been
created. The BlockDevOps .drained_begin/end() callbacks depend on the
dataplane instance, so move the blk_set_dev_ops() call after
xen_block_dataplane_create().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-12-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For simplicity, always run BlockDevOps .drained_begin/end/poll()
callbacks in the main loop thread. This makes it easier to implement the
callbacks and avoids extra locks.
Move the function pointer declarations from the I/O Code section to the
Global State section for BlockDevOps, BdrvChildClass, and BlockDriver.
Narrow IO_OR_GS_CODE() to GLOBAL_STATE_CODE() where appropriate.
The test-bdrv-drain test case calls bdrv_drain() from an IOThread. This
is now only allowed from coroutine context, so update the test case to
run in a coroutine.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-11-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The BlockBackend quiesce_counter is greater than zero during drained
sections. Add an API to check whether the BlockBackend is in a drained
section.
The next patch will use this API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-10-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need to suspend activity between aio_disable_external() and
aio_enable_external(), which is mainly used for the block layer's drain
operation.
This is part of ongoing work to remove the aio_disable_external() API.
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-9-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vhost-user activity must be suspended during bdrv_drained_begin/end().
This prevents new requests from interfering with whatever is happening
in the drained section.
Previously this was done using aio_set_fd_handler()'s is_external
argument. In a multi-queue block layer world the aio_disable_external()
API cannot be used since multiple AioContext may be processing I/O, not
just one.
Switch to BlockDevOps->drained_begin/end() callbacks.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-8-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Each vhost-user-blk request runs in a coroutine. When the BlockBackend
enters a drained section we need to enter a quiescent state. Currently
any in-flight requests race with bdrv_drained_begin() because it is
unaware of vhost-user-blk requests.
When blk_co_preadv/pwritev()/etc returns it wakes the
bdrv_drained_begin() thread but vhost-user-blk request processing has
not yet finished. The request coroutine continues executing while the
main loop thread thinks it is in a drained section.
One example where this is unsafe is for blk_set_aio_context() where
bdrv_drained_begin() is called before .aio_context_detached() and
.aio_context_attach(). If request coroutines are still running after
bdrv_drained_begin(), then the AioContext could change underneath them
and they race with new requests processed in the new AioContext. This
could lead to virtqueue corruption, for example.
(This example is theoretical, I came across this while reading the
code and have not tried to reproduce it.)
It's easy to make bdrv_drained_begin() wait for in-flight requests: add
a .drained_poll() callback that checks the VuServer's in-flight counter.
VuServer just needs an API that returns true when there are requests in
flight. The in-flight counter needs to be atomic.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-7-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VuServer object has a refcount field and ref/unref APIs. The name is
confusing because it's actually an in-flight request counter instead of
a refcount.
Normally a refcount destroys the object upon reaching zero. The VuServer
counter is used to wake up the vhost-user coroutine when there are no
more requests.
Avoid confusing by renaming refcount and ref/unref to in_flight and
inc/dec.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-6-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch is part of an effort to remove the aio_disable_external()
API because it does not fit in a multi-queue block layer world where
many AioContexts may be submitting requests to the same disk.
The SCSI emulation code is already in good shape to stop using
aio_disable_external(). It was only used by commit 9c5aad84da
("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi
disk") to ensure that virtio_scsi_hotunplug() works while the guest
driver is submitting I/O.
Ensure virtio_scsi_hotunplug() is safe as follows:
1. qdev_simple_device_unplug_cb() -> qdev_unrealize() ->
device_set_realized() calls qatomic_set(&dev->realized, false) so
that future scsi_device_get() calls return NULL because they exclude
SCSIDevices with realized=false.
That means virtio-scsi will reject new I/O requests to this
SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while
virtio_scsi_hotunplug() is still executing. We are protected against
new requests!
2. scsi_qdev_unrealize() already contains a call to
scsi_device_purge_requests() so that in-flight requests are cancelled
synchronously. This ensures that no in-flight requests remain once
qdev_simple_device_unplug_cb() returns.
Thanks to these two conditions we don't need aio_disable_external()
anymore.
Cc: Zhengui Li <lizhengui@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Only report a transport reset event to the guest after the SCSIDevice
has been unrealized by qdev_simple_device_unplug_cb().
qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field
to false so that scsi_device_find/get() no longer see it.
scsi_target_emulate_report_luns() also needs to be updated to filter out
SCSIDevices that are unrealized.
Change virtio_scsi_push_event() to take event information as an argument
instead of the SCSIDevice. This allows virtio_scsi_hotunplug() to emit a
VIRTIO_SCSI_T_TRANSPORT_RESET event after the SCSIDevice has already
been unrealized.
These changes ensure that the guest driver does not see the SCSIDevice
that's being unplugged if it responds very quickly to the transport
reset event.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-4-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a helper function to check whether the device is realized without
requiring the Big QEMU Lock. The next patch adds a second caller. The
goal is to avoid spreading DeviceState field accesses throughout the
code.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_set_aio_context() is not fully transactional because
blk_do_set_aio_context() updates blk->ctx outside the transaction. Most
of the time this goes unnoticed but a BlockDevOps.drained_end() callback
that invokes blk_get_aio_context() fails assert(ctx == blk->ctx). This
happens because blk->ctx is only assigned after
BlockDevOps.drained_end() is called and we're in an intermediate state
where BlockDrvierState nodes already have the new context and the
BlockBackend still has the old context.
Making blk_set_aio_context() fully transactional solves this assertion
failure because the BlockBackend's context is updated as part of the
transaction (before BlockDevOps.drained_end() is called).
Split blk_do_set_aio_context() in order to solve this assertion failure.
This helper function actually serves two different purposes:
1. It drives blk_set_aio_context().
2. It responds to BdrvChildClass->change_aio_ctx().
Get rid of the helper function. Do #1 inside blk_set_aio_context() and
do #2 inside blk_root_set_aio_ctx_commit(). This simplifies the code.
The only drawback of the fully transactional approach is that
blk_set_aio_context() must contend with blk_root_set_aio_ctx_commit()
being invoked as part of the AioContext change propagation. This can be
solved by temporarily setting blk->allow_aio_context_change to true.
Future patches call blk_get_aio_context() from
BlockDevOps->drained_end(), so this patch will become necessary.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If blockdev-create references an existing node in an iothread (e.g. as
it's 'file' child), then suddenly all of the image creation code must
run in that AioContext, too. Test that this actually works.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-13-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It has no internal callers, so its only use is being called from
individual test cases. If the name starts with an underscore, it is
considered private and linters warn against calling it. 256 only gets
away with it currently because it's on the exception list for linters.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-12-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The AioContext lock must not be held for bdrv_open_child(), but it is
necessary for the following operations, in particular those using nested
event loops in coroutine wrappers.
Temporarily dropping the main AioContext lock is not necessary because
we know we run in the main thread.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-9-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When opening the 'file' child moves bs to an iothread, we need to hold
the AioContext lock of it before we can call raw_apply_options() (and
more specifically, bdrv_getlength() inside of it).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-8-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_open() doesn't work correctly when opening the 'file' child moves
bs to an iothread, for several reasons:
- It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry()
coroutine, which involves dropping the AioContext lock for bs when it
is not in the main context - but we don't hold it, so this crashes.
- It runs the qcow2_open_entry() coroutine in the current thread instead
of the new AioContext of bs.
- qcow2_open_entry() doesn't notify the main loop when it's done.
This patches fixes these issues around delegating work to a coroutine.
Temporarily dropping the main AioContext lock is not necessary because
we know we run in the main thread.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-7-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This fixes blk_new_open() to not assume that bs is in the main context.
In particular, the BlockBackend must be created with the right
AioContext because it will refuse to move to a different context
afterwards. (blk->allow_aio_context_change is false.)
Use this opportunity to use blk_insert_bs() instead of duplicating the
bdrv_root_attach_child() call. This is consistent with what
blk_new_with_bs() does. Add comments to document the locking rules.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function documentation already says that all callers must hold the
main AioContext lock, but not all of them do. This can cause assertion
failures when functions called by bdrv_open() try to drop the lock. Fix
a few more callers to take the lock before calling bdrv_open().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions specify that the caller must hold the "@filename
AioContext lock". This doesn't make sense, file names don't have an
AioContext. New BlockDriverStates always start in the main AioContext,
so this is what we really need here.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All of the functions that currently take a BlockDriverState, BdrvChild
or BlockBackend as their first parameter expect the associated
AioContext to be locked when they are called. In the case of
no_co_wrappers, they are called from bottom halves directly in the main
loop, so no other caller can be expected to take the lock for them. This
can result in assertion failures because a lock that isn't taken is
released in nested event loops.
Looking at the first parameter is already done by co_wrappers to decide
where the coroutine should run, so doing the same in no_co_wrappers is
only consistent. Take the lock in the generated bottom halves to fix the
problem.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
target-arm queue:
* fsl-imx6: Add SNVS support for i.MX6 boards
* smmuv3: Add support for stage 2 translations
* hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop
* hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
* cleanups for recent Kconfig changes
* target/arm: Explicitly select short-format FSR for M-profile
* tests/qtest: Run arm-specific tests only if the required machine is available
* hw/arm/sbsa-ref: add GIC node into DT
* docs: sbsa: correct graphics card name
* Update copyright dates to 2023
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmR2DYsZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3ubED/40MFaRWfJuVhD3NzWltzhD
# 5Y2/kxd3Bm51ki56hiBWXBXeovR3Exve9rP8OOGJ5RUK0SoEb4xdIjwMAGRt1Ksi
# Ln4MUqjv0tqUNv1hBDKgnGJ4dW34bhmJAnU/Jdzt8yrpGuSmN+LCWoPC+vTNCWYm
# sNFm8VLB+nmVq/sjTKwQc/Uo+7l9JZ+aY6poyHfN7kKpITUHtoCPgwz34btRrXEk
# 4+eNYQV1UvofRhLRVsIrvA89bd7Rcn5iHbhY+xYHaJDEaoYy7iBfUJeDlUtEgW8k
# 0fXt5Z5bXUXpz7jmzXdbq//68p8HcqinarIFH4r0Nbu+u2UgkZDJZRns+p5i8Wmv
# qE+hLGOgEg8s9n2e6chGuvw6wX49T3Xtr7tNpKQfi5ou5VT7qZIwl50m/JefuoPI
# eHu4uPj7pS0z/s8KDk0mNtbfcHkzmT5KpZkbvS2JOzg9o2t1fwGCbKPlcgJPxcIV
# Ro7R3rNvd6XSSQBlmcYNXWE7P7zuJyfjfSN7D7b0MdFP/hBTpLGKI2LBggZEdcce
# 21fiEkEE6d1L2oN+Eiq3q8xQNoVjYSGaE5LJ34+997z7W1JRB/dyJhZM0AkabSMl
# mkgyi9kBKxU4S9pxtZ//Uh9B/5blpMQAI4U8S/svuGqzwfI6luY/Qxue+YzRUD0H
# XsDSBnq1x2LW2Fhu7YVW3Q==
# =/OdY
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 May 2023 07:51:55 AM PDT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
* tag 'pull-target-arm-20230530-1' of https://git.linaro.org/people/pmaydell/qemu-arm: (21 commits)
docs: sbsa: correct graphics card name
hw/arm/sbsa-ref: add GIC node into DT
Update copyright dates to 2023
arm/Kconfig: Make TCG dependence explicit
arm/Kconfig: Keep Kconfig default entries in default.mak as documentation
target/arm: Explain why we need to select ARM_V7M
target/arm: Explicitly select short-format FSR for M-profile
tests/qtest: Run arm-specific tests only if the required machine is available
hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop.
hw/arm/smmuv3: Add knob to choose translation stage and enable stage-2
hw/arm/smmuv3: Add stage-2 support in iova notifier
hw/arm/smmuv3: Add CMDs related to stage-2
hw/arm/smmuv3: Add VMID to TLB tagging
hw/arm/smmuv3: Make TLB lookup work for stage-2
hw/arm/smmuv3: Parse STE config for stage-2
hw/arm/smmuv3: Add page table walk for stage-2
hw/arm/smmuv3: Refactor stage-1 PTW
hw/arm/smmuv3: Update translation config to hold stage-2
hw/arm/smmuv3: Add missing fields for IDR0
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let add GIC information into DeviceTree as part of SBSA-REF versioning.
Trusted Firmware will read it and provide to next firmware level.
Bumps platform version to 0.1 one so we can check is node is present.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we moved the arm default CONFIGs into Kconfig and removed them
from default.mak, we made it harder to identify which CONFIGs are
selected by default in case users want to disable them.
Bring back the default entries into default.mak, but keep them
commented out. This way users can keep their workflows of editing
default.mak to remove build options without needing to search through
Kconfig.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230523180525.29994-3-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We currently need to select ARM_V7M unconditionally when TCG is
present in the build because some translate.c helpers and the whole of
m_helpers.c are not yet under CONFIG_ARM_V7M.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230523180525.29994-2-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For M-profile, there is no guest-facing A-profile format FSR, but we
still use the env->exception.fsr field to pass fault information from
the point where a fault is raised to the code in
arm_v7m_cpu_do_interrupt() which interprets it and sets the M-profile
specific fault status registers. So it doesn't matter whether we
fill in env->exception.fsr in the short format or the LPAE format, as
long as both sides agree. As it happens arm_v7m_cpu_do_interrupt()
assumes short-form.
In compute_fsr_fsc() we weren't explicitly choosing short-form for
M-profile, but instead relied on it falling out in the wash because
arm_s1_regime_using_lpae_format() would be false. This was broken in
commit 452c67a4 when we added v8R support, because we said "PMSAv8 is
always LPAE format" (as it is for v8R), forgetting that we were
implicitly using this code path on M-profile. At that point we would
hit a g_assert_not_reached():
ERROR:../../target/arm/internals.h:549:arm_fi_to_lfsc: code should not be reached
#7 0x0000555555e055f7 in arm_fi_to_lfsc (fi=0x7fffecff9a90) at ../../target/arm/internals.h:549
#8 0x0000555555e05a27 in compute_fsr_fsc (env=0x555557356670, fi=0x7fffecff9a90, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff9a1c)
at ../../target/arm/tlb_helper.c:95
#9 0x0000555555e05b62 in arm_deliver_fault (cpu=0x555557354800, addr=268961344, access_type=MMU_INST_FETCH, mmu_idx=1, fi=0x7fffecff9a90)
at ../../target/arm/tlb_helper.c:132
#10 0x0000555555e06095 in arm_cpu_tlb_fill (cs=0x555557354800, address=268961344, size=1, access_type=MMU_INST_FETCH, mmu_idx=1, probe=false, retaddr=0)
at ../../target/arm/tlb_helper.c:260
The specific assertion changed when commit fcc7404eff added
"assert not M-profile" to arm_is_secure_below_el3(), because the
conditions being checked in compute_fsr_fsc() include
arm_el_is_aa64(), which will end up calling arm_is_secure_below_el3()
and asserting before we try to call arm_fi_to_lfsc():
#7 0x0000555555efaf43 in arm_is_secure_below_el3 (env=0x5555574665a0) at ../../target/arm/cpu.h:2396
#8 0x0000555555efb103 in arm_is_el2_enabled (env=0x5555574665a0) at ../../target/arm/cpu.h:2448
#9 0x0000555555efb204 in arm_el_is_aa64 (env=0x5555574665a0, el=1) at ../../target/arm/cpu.h:2509
#10 0x0000555555efbdfd in compute_fsr_fsc (env=0x5555574665a0, fi=0x7fffecff99e0, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff996c)
Avoid the assertion and the incorrect FSR format selection by
explicitly making M-profile use the short-format in this function.
Fixes: 452c67a427 ("target/arm: Enable TTBCR_EAE for ARMv8-R AArch32")a
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1658
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230523131726.866635-1-peter.maydell@linaro.org
pflash-cfi02-test.c always uses the "musicpal" machine for testing,
test-arm-mptimer.c always uses the "vexpress-a9" machine, and
microbit-test.c requires the "microbit" machine, so we should only
run these tests if the machines have been enabled in the configuration.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230524080600.1618137-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When passing --smp with a number lower than XLNX_ZYNQMP_NUM_APU_CPUS,
the expression (ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS) will result
in a positive number as ms->smp.cpus is a unsigned int.
This will raise the following error afterwards, as Qemu will try to
instantiate some additional RPUs.
| $ qemu-system-aarch64 --smp 1 -M xlnx-zcu102
| **
| ERROR:../src/tcg/tcg.c:777:tcg_register_thread:
| assertion failed: (n < tcg_max_ctxs)
Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20230524143714.565792-1-chigot@adacore.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we receive a packet from the xilinx_axienet and then try to s2mem
through the xilinx_axidma, if the descriptor ring buffer is full in the
xilinx axidma driver, we’ll assert the DMASR.HALTED in the
function : stream_process_s2mem and return 0. In the end, we’ll be stuck in
an infinite loop in axienet_eth_rx_notify.
This patch checks the DMASR.HALTED state when we try to push data
from xilinx axi-enet to xilinx axi-dma. When the DMASR.HALTED is asserted,
we will not keep pushing the data and then prevent the infinte loop.
Signed-off-by: Tommy Wu <tommy.wu@sifive.com>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20230519062137.1251741-1-tommy.wu@sifive.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As everything is in place, we can use a new system property to
advertise which stage is supported and remove bad_ste from STE
stage2 config.
The property added arm-smmuv3.stage can have 3 values:
- "1": Stage-1 only is advertised.
- "2": Stage-2 only is advertised.
If not passed or an unsupported value is passed, it will default to
stage-1.
Advertise VMID16.
Don't try to decode CD, if stage-2 is configured.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-11-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Right now, either stage-1 or stage-2 are supported, this simplifies
how we can deal with TLBs.
This patch makes TLB lookup work if stage-2 is enabled instead of
stage-1.
TLB lookup is done before a PTW, if a valid entry is found we won't
do the PTW.
To be able to do TLB lookup, we need the correct tagging info, as
granularity and input size, so we get this based on the supported
translation stage. The TLB entries are added correctly from each
stage PTW.
When nested translation is supported, this would need to change, for
example if we go with a combined TLB implementation, we would need to
use the min of the granularities in TLB.
As stage-2 shouldn't be tagged by ASID, it will be set to -1 if S1P
is not enabled.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-7-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Parse stage-2 configuration from STE and populate it in SMMUS2Cfg.
Validity of field values are checked when possible.
Only AA64 tables are supported and Small Translation Tables (STT) are
not supported.
According to SMMUv3 UM(IHI0070E) "5.2 Stream Table Entry": All fields
with an S2 prefix (with the exception of S2VMID) are IGNORED when
stage-2 bypasses translation (Config[1] == 0).
Which means that VMID can be used(for TLB tagging) even if stage-2 is
bypassed, so we parse it unconditionally when S2P exists. Otherwise
it is set to -1.(only S1P)
As stall is not supported, if S2S is set the translation would abort.
For S2R, we reuse the same code used for stage-1 with flag
record_faults. However when nested translation is supported we would
need to separate stage-1 and stage-2 faults.
Fix wrong shift in STE_S2HD, STE_S2HA, STE_S2S.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230516203327.2051088-6-smostafa@google.com
[PMM: fixed format string]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for adding stage-2 support, add Stage-2 PTW code.
Only Aarch64 format is supported as stage-1.
Nesting stage-1 and stage-2 is not supported right now.
HTTU is not supported, SW is expected to maintain the Access flag.
This is described in the SMMUv3 manual(IHI 0070.E.a)
"5.2. Stream Table Entry" in "[181] S2AFFD".
This flag determines the behavior on access of a stage-2 page whose
descriptor has AF == 0:
- 0b0: An Access flag fault occurs (stall not supported).
- 0b1: An Access flag fault never occurs.
An Access fault takes priority over a Permission fault.
There are 3 address size checks for stage-2 according to
(IHI 0070.E.a) in "3.4. Address sizes".
- As nesting is not supported, input address is passed directly to
stage-2, and is checked against IAS.
We use cfg->oas to hold the OAS when stage-1 is not used, this is set
in the next patch.
This check is done outside of smmu_ptw_64_s2 as it is not part of
stage-2(it throws stage-1 fault), and the stage-2 function shouldn't
change it's behavior when nesting is supported.
When nesting is supported and we figure out how to combine TLB for
stage-1 and stage-2 we can move this check into the stage-1 function
as described in ARM DDI0487I.a in pseudocode
aarch64/translation/vmsa_translation/AArch64.S1Translate
aarch64/translation/vmsa_translation/AArch64.S1DisabledOutput
- Input to stage-2 is checked against s2t0sz, and throws stage-2
transaltion fault if exceeds it.
- Output of stage-2 is checked against effective PA output range.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-5-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for adding stage-2 support, rename smmu_ptw_64 to
smmu_ptw_64_s1 and refactor some of the code so it can be reused in
stage-2 page table walk.
Remove AA64 check from PTW as decode_cd already ensures that AA64 is
used, otherwise it faults with C_BAD_CD.
A stage member is added to SMMUPTWEventInfo to differentiate
between stage-1 and stage-2 ptw faults.
Add stage argument to trace_smmu_ptw_level be consistent with other
trace events.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-4-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for adding stage-2 support, add a S2 config
struct(SMMUS2Cfg), composed of the following fields and embedded in
the main SMMUTransCfg:
-tsz: Size of IPA input region (S2T0SZ)
-sl0: Start level of translation (S2SL0)
-affd: AF Fault Disable (S2AFFD)
-record_faults: Record fault events (S2R)
-granule_sz: Granule page shift (based on S2TG)
-vmid: Virtual Machine ID (S2VMID)
-vttb: Address of translation table base (S2TTB)
-eff_ps: Effective PA output range (based on S2PS)
They will be used in the next patches in stage-2 address translation.
The fields in SMMUS2Cfg, are reordered to make the shared and stage-1
fields next to each other, this reordering didn't change the struct
size (104 bytes before and after).
Stage-1 only fields: aa64, asid, tt, ttb, tbi, record_faults, oas.
oas is stage-1 output address size. However, it is used to check
input address in case stage-1 is unimplemented or bypassed according
to SMMUv3 manual IHI0070.E "3.4. Address sizes"
Shared fields: stage, disabled, bypassed, aborted, iotlb_*.
No functional change intended.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-3-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue for 2023-05-28:
This queue includes several assorted fixes for PowerPC SPR
emulation, a change in the default Pegasos2 CPU, the addition
of AIL mode 3 for spapr, a PIC->CPU interrupt fix for prep and
performance enhancements in fpu_helper.c.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZHOFiRYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFkVZ0BAMV+9RlHKRlldOSPMEWCWo6hmA/U
# 9SMyJsZPY3OpDbE3AP9XOQR1boqyT5MJXoeOUq1OLlFm6mY7UA300kBZ7wxVCw==
# =IGNT
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 28 May 2023 09:47:05 AM PDT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu:
ppc/pegasos2: Change default CPU to 7457
target/ppc: Add POWER9 DD2.2 model
target/ppc: Merge COMPUTE_CLASS and COMPUTE_FPRF
pnv_lpc: disable reentrancy detection for lpc-hc
target/ppc: Use SMT4 small core chip type in POWER9/10 PVRs
hw/ppc/prep: Fix wiring of PIC -> CPU interrupt
spapr: Add SPAPR_CAP_AIL_MODE_3 for AIL mode 3 support for H_SET_MODE hcall
target/ppc: Alignment faults do not set DSISR in ISA v3.0 onward
target/ppc: Fix width of some 32-bit SPRs
target/ppc: Fix fallback to MFSS for MFFS* instructions on pre 3.0 ISAs
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Previously 7400 was selected as a safe choice as that is used by other
machines so it's better tested but AmigaOS does not know this CPU and
disables some features when running on it. The real hardware has
7447/7457 G4 CPU so change the default to match that now that it was
confirmed to work better with AmigaOS.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Rene Engel <ReneEngel80@emailn.de>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230528152937.B8DAD74633D@zero.eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
POWER9 DD2.1 and earlier had significant limitations when running KVM,
including lack of "mixed mode" MMU support (ability to run HPT and RPT
mode on threads of the same core), and a translation prefetch issue
which is worked around by disabling "AIL" mode for the guest.
These processors are not widely available, and it's difficult to deal
with all these quirks in qemu +/- KVM, so create a POWER9 DD2.2 CPU
and make it the default POWER9 CPU.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-Id: <20230515160201.394587-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
GTK3 provides the infrastructure to receive and process multi-touch
events through the "touch-event" signal and the GdkEventTouch type.
Make use of it to transpose events from the host to the guest.
This allows users of machines with hardware capable of receiving
multi-touch events to run guests that can also receive those events
and interpret them as gestures, when appropriate.
An example of this in action can be seen here:
https://fosstodon.org/@slp/109545849296546767
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230526112925.38794-7-slp@redhat.com>
QEMU's PVR value for POWER9 DD2.0 has chip type 1, which is the SMT4
"small core" type that OpenPOWER processors use. QEMU's PVR for all
other POWER9/10 have chip type 0, which "enterprise" systems use.
The difference does not really matter to QEMU (because it does not care
about SMT mode in the target), but for consistency all PVRs should use
the same chip type. We'll go with the SMT4 OpenPOWER type.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230515160131.394562-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Commit cef2e7148e ("hw/isa/i82378: Remove intermediate IRQ forwarder")
passes s->cpu_intr to i8259_init() in i82378_realize() directly. However, s-
>cpu_intr isn't initialized yet since that happens after the south bridge's
pci_realize_and_unref() in board code. Fix this by initializing s->cpu_intr
before realizing the south bridge.
Fixes: cef2e7148e ("hw/isa/i82378: Remove intermediate IRQ forwarder")
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230304114043.121024-4-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The behaviour of the Address Translation Mode on Interrupt resource is
not consistently supported by all CPU versions or all KVM versions: KVM
HV does not support mode 2, and does not support mode 3 on POWER7 or
early POWER9 processesors. KVM PR only supports mode 0. TCG supports all
modes (0, 2, 3) on CPUs with support for the corresonding LPCR[AIL] mode.
This leads to inconsistencies in guest behaviour and could cause problems
migrating guests.
This was not noticable for Linux guests for a long time because the
kernel only uses modes 0 and 3, and it used to consider AIL-3 to be
advisory in that it would always keep the AIL-0 vectors around, so it
did not matter whether or not interrupts were delivered according to
the AIL mode. Recent Linux guests depend on AIL mode 3 working as
specified in order to support the SCV facility interrupt. If AIL-3 can
not be provided, then H_SET_MODE must return an error to Linux so it can
disable the SCV facility (failure to do so can lead to userspace being
able to crash the guest kernel).
Add the ail-mode-3 capability to specify that AIL-3 is supported. AIL-0
is implied as the baseline, and AIL-2 is no longer supported by spapr.
AIL-2 is not known to be used by any software, but support in TCG could
be restored with an ail-mode-2 capability quite easily if a regression
is reported.
Modify the H_SET_MODE Address Translation Mode on Interrupt resource
handler to check capabilities and correctly return error if not
supported.
KVM has a cap to advertise support for AIL-3.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230515160216.394612-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
As there are other bitmap-based config properties that need to be dealt in a
similar fashion as VIRTIO_INPUT_CFG_EV_BITS, generalize the function to
receive select and subsel as arguments, and rename it to
virtio_input_extend_config()
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230526112925.38794-2-slp@redhat.com>
Although not actually exploitable at the moment, a negative width/height
could make datasize wrap around and potentially lead to buffer overflow.
Since there is no reason a negative width/height is ever appropriate,
modify QEMUCursor struct and cursor_alloc prototype to accept uint16_t.
This protects us against accidentally introducing future bugs.
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Jacek Halon <jacek.halon@gmail.com>
Reported-by: Yair Mizrahi <yairh33@gmail.com>
Reported-by: Elsayed El-Refa'ei <e.elrefaei99@gmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230523163023.608121-1-mcascell@redhat.com>
Windows sends an extra left control key up/down input event for
every right alt key up/down input event for keyboards with
international layout. Since commit 830473455f ("ui/sdl2: fix
handling of AltGr key on Windows") QEMU uses a Windows low level
keyboard hook procedure to reliably filter out the special left
control key and to grab the keyboard on Windows.
The SDL2 version 2.0.16 introduced its own Windows low level
keyboard hook procedure to grab the keyboard. Windows calls this
callback before the QEMU keyboard hook procedure. This disables
the special left control key filter when the keyboard is grabbed.
To fix the problem, disable the SDL2 Windows low level keyboard
hook procedure.
Reported-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230418062823.5683-1-vr_qemu@t-online.de>
SDL doesn't grab Alt+F4 under Windows by default. Pressing Alt+F4 thus closes
the VM immediately without confirmation, possibly leading to data loss. Fix
this by always grabbing Alt+F4 on Windows hosts, too.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417192139.43263-3-shentey@gmail.com>
By default, SDL grabs Alt+Tab only in non-fullscreen mode. This causes Alt+Tab
to switch tasks on the host rather than in the VM in fullscreen mode while it
switches tasks in non-fullscreen mode in the VM. Fix this confusing behavior
by grabbing Alt+Tab in fullscreen mode, always causing tasks to be switched in
the VM.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417192139.43263-2-shentey@gmail.com>
Except SDL, display backends seem to fail at handing full scanout
geometry correctly. It would need some test/reproducer to actually check
it. In the meantime, fill some missing fields, and leave a FIXME.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132537.1026310-1-marcandre.lureau@redhat.com>
It looks like the virtio_gpu_load() does not compute and set the offset,
the same way virtio_gpu_set_scanout() does. This probably results in
incorrect display until the scanout/framebuffer is updated again, I
guess we should fix it, although I haven't checked this yet.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132518.1025853-1-marcandre.lureau@redhat.com>
Since commit abe34282 ("win32: avoid mixing SOCKET and file descriptor
space"), we set HANDLE_FLAG_PROTECT_FROM_CLOSE on the socket FD, to
prevent closing the HANDLE with CloseHandle. This raises an exception
which under gdb is fatal, and qemu exits.
Let's catch the expected error instead.
Note: this appears to work, but the mingw64 macro is not well documented
or tested, and it's not obvious how it is meant to be used.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132440.1025315-1-marcandre.lureau@redhat.com>
commit 4814d3cbf ("ui/dbus: restrict opengl to gbm-enabled config")
assumes that whenever GBM is available, OpenGL is. This is not always
the case, let's further restrict opengl-related paths and fix some
compilation issues.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132348.1024663-1-marcandre.lureau@redhat.com>
vc->gfx.w and vc->gfx.h are not updated appropriately in this code path,
which leads to a different scaling factor for rendering the cursor on
some edge cases (e.g. the focus has left and re-entered the gtk window).
This can be reproduced using vhost-user-gpu with the gtk ui on the x11
backend.
Use the surface dimensions which are already updated accordingly.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230320160856.364319-2-ernunes@redhat.com>
The gd_motion_event size has some calculations for the cursor position,
which also take into account things like different size of the
framebuffer compared to the window size.
The use of window size makes things more difficult though, as at least
in the case of Wayland includes the size of ui elements like a menu bar
at the top of the window. This leads to a wrong position calculation by
a few pixels.
Fix it by using the size of the widget, which already returns the size
of the actual space to render the framebuffer.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Message-Id: <20230320160856.364319-1-ernunes@redhat.com>
The dmabuf->y0_top flag is passed to .dpy_gl_scanout_dmabuf(), however
in the gtk ui both implementations dropped it when doing the next
scanout_texture call.
Fixes flipped linux console using vhost-user-gpu with the gtk ui
display.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230220175605.43759-1-ernunes@redhat.com>
This optional behavior was removed from the ISA in v3.0, see
Summary of Changes preface:
Data Storage Interrupt Status Register for Alignment Interrupt:
Simplifies the Alignment interrupt by remov- ing the Data Storage
Interrupt Status Register (DSISR) from the set of registers modified
by the Alignment interrupt.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230515092655.171206-5-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Some 32-bit SPRs are incorrectly implemented as 64-bits on 64-bit
targets.
This changes VRSAVE, DSISR, HDSISR, DAWRX0, PIDR, LPIDR, DEXCR,
HDEXCR, CTRL, TSCR, MMCRH, and PMC[1-6] from to be 32-bit registers.
This only goes by the 32/64 classification in the architecture, it
does not try to implement finer details of SPR implementation (e.g.,
not all bits implemented as simple read/write storage).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-Id: <20230515092655.171206-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The following commits changed the code such that the fallback to MFSS for MFFSCRN,
MFFSCRNI, MFFSCE and MFFSL on pre 3.0 ISAs was removed and became an illegal instruction:
bf8adfd88b - target/ppc: Move mffscrn[i] to decodetree
394c2e2fda - target/ppc: Move mffsce to decodetree
3e5bce70ef - target/ppc: Move mffsl to decodetree
The hardware will handle them as a MFFS instruction as the code did previously.
This means applications that were segfaulting under qemu when encountering these
instructions which is used in glibc libm functions for example.
The fallback for MFFSCDRN and MFFSCDRNI added in a later patch was also missing.
This patch restores the fallback to MFSS for these instructions on pre 3.0s ISAs
as the hardware decoder would, fixing the segfaulting libm code. It doesn't have
the fallback for 3.0 onwards to match hardware behaviour.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Reviewed-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230510111913.1718734-1-richard.purdie@linuxfoundation.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Hexagon update
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEENjXHiM5iuR/UxZq0ewJE+xLeRCIFAmRwv6QACgkQewJE+xLe
# RCLRvQf/e0utA8/KAYwmay4dYiiVlrtJ4UVpwogQ8JC7je5H2+Gv633P4BF8uGAF
# HmhdUk031jvG/BvKGH+493ESKgtIX3caLxJInPtYu3elqKxZhqKpke2VPF3srrwI
# Mli8IqdwE2scSilG591xTjhU8vBGSm+hiQptSg9OaSotVcH8Qc/32+vudnr2JZtK
# ko3MqISMW/KvfD+x47UcX4IX4bmQfDyysQITQs9lfwYgzv/4drl6/7CUFQZ3b8Go
# Rz4ClbYhKT8YybJjX+yaKuTaHSrL9r0+90ORzYisEYcPiOOChmy9vv4HbZ1zTCbY
# MVJM69IPdZDi1quE00jULYEEPrHRoA==
# =vczK
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 26 May 2023 07:18:12 AM PDT
# gpg: using RSA key 3635C788CE62B91FD4C59AB47B0244FB12DE4422
# gpg: Good signature from "Taylor Simpson (Rock on) <tsimpson@quicinc.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3635 C788 CE62 B91F D4C5 9AB4 7B02 44FB 12DE 4422
* tag 'pull-hex-20230526' of https://github.com/quic/qemu:
Hexagon (target/hexagon) Change Hexagon maintainer
Hexagon: fix outdated `hex_new_*` comments
target/hexagon/*.py: clean up used 'toss' and 'numregs' vars
Hexagon (target/hexagon) Fix assignment to tmp registers
Hexagon (tests/tcg/hexagon) Clean up Hexagon check-tcg tests
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
pull-loongarch-20230526
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEIAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZHB6VwAKCRBAov/yOSY+
# 390YA/98bGE+W8NGBoKI4sxke6LE6jbF1vYiOz4DiqvbGFcyL+sYKnlN92mpfNaP
# K8BlgD3kvL7wV/DtCGTq4c0aAtUmSZNCC1w7PSlOkFxkJ+QONQGMGZKmI75BRYdY
# Q/JQxUG02Hm4K/ghJDMGAm3+m+VaZaqxYNCv/6gLhmTERB5l5A==
# =yu/e
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 26 May 2023 02:22:31 AM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20230526' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix the vinsgr2vr/vpickve2gr instructions cause system coredump
target/loongarch: Fix LD/ST{LE/GT} instructions get wrong CSR_ERA and CSR_BADV
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Use MachineClass->default_nic in more machines to allow running them
without "--nodefaults" in builds that used "--without-default-devices"
* Improve qtests for such builds
* Add up-/downsampling qtest
* Avoid crash if default RAM backend name has been stolen
* Fix reentrant DMA problem in the lsi53c895a device (CVE-2023-0330)
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmRwdqsRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbXk6g//eQzVGv1Ep4ZusQXPDpFJLgBNq7JMOF6a
# bWa6fTluzCn2ivnbgPEf0lV1TsCrUuQwqWlEozylltE6l4zbmIWBMO8F/6Wy0JZH
# DuBrO9fio+nKhcEqeFLE+wTWUCiBqM66n8LL+rznO3RjXv2QU8zhk9owmsEKZUV0
# vXrMO5XdUO/dTrxyBdVjbok9L1UpkF+Sp9LEHNxIJZnAqhVmx13jnKq6WTrDR/fX
# ZwGbwWxsnTZl5PuPsHePdTWhXigzZJYcI5TSfcdTVHbzIxVKzFIvTX7stKxySL3b
# 3rXqmkmdozi28UPq7kXvLRoN8VscORgC3J+0izVxd1P0q+sh6p+hF/8T1r0UCqWa
# cgPoqGP5fcqfQiQxdaPbm3Ar9qscZPqzpZWxzjFQsptxf69RIEg+8XZq/EP+6g+c
# GxCh1cqugLdWvZPpBjoGIDlftxJZ99rMKnOZJEudaAIDzRWbNBuqzVo5osj8n5ht
# m68Nanlil451+ySuTS7iiWyyKXF6hIfe5I6A72QdxMPeHsavcCk5D5AN76dFSTmN
# XWWqlk9CNYbvaYSIqyxJpANiwA5Y0j7r6GVXdWFZ9YRt//+z2rMwOrZIqYyvoscE
# 5p+ul/qgUq10XkNwI9t1pd9DX8g+5yuIY0chfC9G1B0AuiPHzvmszORBYY+8+7GT
# 2Rwq/HqraC4=
# =eab7
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 26 May 2023 02:06:51 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-05-26' of https://gitlab.com/thuth/qemu:
hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)
lsi53c895a: disable reentrancy detection for MMIO region, too
machine: do not crash if default RAM backend name has been stolen
tests/qtest/ac97-test: add up-/downsampling tests
tests/qtest/usb-hcd-ehci-test: Check for EHCI and UHCI HCDs before using them
tests/qtest/rtl8139-test: Check whether the rtl8139 device is available
tests/qtest: Check for virtio-blk before using -cdrom with the arm virt machine
tests/qtest/usb-hcd-uhci-test: Check whether "usb-storage" is available
hw/mips: Use MachineClass->default_nic in the virt machine
hw/arm: Use MachineClass->default_nic in the sbsa-ref machine
hw/xtensa: Use MachineClass->default_nic in the virt machine
hw/loongarch64: Use MachineClass->default_nic in the virt machine
hw/arm: Use MachineClass->default_nic in the virt machine
hw/alpha: Use MachineClass->default_nic in the alpha machine
hw/hppa: Use MachineClass->default_nic in the hppa machine
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Some code comments refer to hex_new_value and hex_new_pred_value, which
have been transferred to DisasContext and, in the case of hex_new_value,
should now be accessed through get_result_gpr().
In order to fix this outdated comments and also avoid having to tweak
them whenever we make a variable name change in the future, let's
replace them with pseudocode.
Suggested-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <8e1689e28dd7b1318369b55127cf47b82ab75921.1684939078.git.quic_mathbern@quicinc.com>
Many Hexagon python scripts call hex_common.get_tagregs(), but only one
call site use the full reg structure given by this function. To make the
code cleaner, let's make get_tagregs() filter out the unused fields
(i.e. 'toss' and 'numregs'), properly removed the unused variables at
the call sites. The hex_common.bad_register() function is also adjusted
to work exclusively with 'regtype' and 'regid' args. For the single call
site that does use toss/numregs, we provide an optional parameter to
get_tagregs() which will restore the old full behavior.
Suggested-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <3ffd4ccb972879f57f499705c624e8eaba7f8b52.1684939078.git.quic_mathbern@quicinc.com>
The order in which instructions are generated by gen_insn() influences
assignment to tmp registers. During generation, tmp instructions (e.g.
generate_V6_vassign_tmp) use vreg_src_off() to determine what kind of
register to use as source. If some instruction (e.g.
generate_V6_vmpyowh_64_acc) uses a tmp register but is generated prior
to the corresponding tmp instruction, the vregs_updated_tmp bit map
isn't updated in time.
Exmple:
{ v14.tmp = v16; v25 = v14 } This works properly because
generate_V6_vassign_tmp is generated before generate_V6_vassign
and the bit map is updated.
{ v15:14.tmp = vcombine(v21, v16); v25:24 += vmpyo(v18.w,v14.h) }
This does not work properly because vmpyo is generated before
vcombine and therefore the bit map does not yet know that there's
a tmp register.
The parentheses in the decoding function were in the wrong place.
Moving them to the correct location makes shuffling of .tmp vector
registers work as expected.
Signed-off-by: Marco Liebel <quic_mliebel@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20230522174708.464197-1-quic_mliebel@quicinc.com>
Move test infra to header file
check functions (always print line number on error)
USR manipulation
Useful floating point values
Use stdint.h types
Use stdbool.h bool where appropriate
Use trip counts local to for loop
Suggested-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230522174341.1805460-1-tsimpson@quicinc.com>
Setting the MAKE variable to a GNU Make executable does not really have
any effect: if a non-GNU Make is used, the QEMU Makefile will fail to
parse. Just remove everything related to --make and $make as dead code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
By using a subproject, our own meson.build can use variables from
the subproject instead of hard-coded paths. This is also the first step
towards managing downloads with .wrap files instead of submodule.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent dtc/libfdt can use either Make or meson as the build system.
By using a subproject, our own meson.build can remove the hard
coded list of source files.
This is also the first step towards managing downloads with .wrap
files instead of submodule.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
fdt_opt == 'disabled' is going to give an error if libfdt is required
by any target, so catch that immediately. For fdt_opt == 'enabled',
instead, do not check immediately whether the internal libfdt is present.
Instead do the check after ascertaining that libfdt is absent or too old.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VirtioInfoList is already allocated by QAPI_LIST_PREPEND and
need not be allocated by the caller.
Fixes Coverity CID 1508724.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is recommended to use SSIZE_T for ssize_t on win32, but the commit
that is being used for slirp.wrap uses int. Update to include the fix
as well as the other bugfix commit "ip: Enforce strict aliasing".
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We recently moved glib detection code to meson but this changes the
linker command line from -lglib-2.0 to using a path to libglib-2.0.so.
This does not work for static linking, which is used by stress.c:
$ make V=1 tests/migration/initrd-stress.img
cc -m64 -mcx16 -o tests/migration/stress ... -static -Wl,--start-group
/usr/lib64/libglib-2.0.so -Wl,--end-group
...
bin/ld: attempted static link of dynamic object `/usr/lib64/libglib-2.0.so'
Add a specific dependency for stress.c, which is linked statically.
The compiler command line is now:
cc -m64 -mcx16 -o tests/migration/stress ... -static -pthread
-Wl,--start-group -lm /usr/lib64/libpcre.a -lglib-2.0 -Wl,--end-group
Fixes: fc9a809e0d ("build: move glib detection and workarounds to meson")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20230525212044.30222-3-farosas@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
1.helper_asrtle_d/helper_asrtgt_d need use GETPC() to get PC;
2 LD/ST{LE/GT} need set CSR_BADV = gpr[rj];
3 ASRTLE.D/ASRTGT.D also write CSR_BADV, but this value is random
and has no reference value.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230515130042.2719712-1-gaosong@loongson.cn>
Apart from CLICOLOR_FORCE and GREP_OPTIONS, there are other variables
that are listed in the Autoconf manual. While Autoconf neutralizes them
very early, and assumes it does not (yet) run in a shell that has "unset",
QEMU assumes that the user invoked configure under a POSIX shell, and
therefore can simply use "unset" to clear them.
CDPATH is particularly nasty because it messes up "cd ... && pwd".
Reported-by: Juan Quintela <quintela@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is now the same as $(PYTHON), since the latter always points at pyvenv/bin/python3.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ARCH is always empty, so just define HOST_ARCH as the result of uname.
The incorrect definition was not being used because the "ifeq" statement
is wrong; replace it with the same idiom based on $(realpath) that the
main Makefile uses.
With this change, vm-build-netbsd in a configured tree will not use
the PYTHONPATH hack.
Reported-by: John Snow <jsnow@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ARCH is always empty, so just define HOST_ARCH as the result of uname.
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
While trying to use a SCSI disk on the LSI controller with an
older version of Fedora (25), I'm getting:
qemu: warning: Blocked re-entrant IO on MemoryRegion: lsi-mmio at addr: 0x34
and the SCSI controller is not usable. Seems like we have to
disable the reentrancy checker for the MMIO region, too, to
get this working again.
The problem could be reproduced it like this:
./qemu-system-x86_64 -accel kvm -m 2G -machine q35 \
-device lsi53c810,id=lsi1 -device scsi-hd,drive=d0 \
-drive if=none,id=d0,file=.../somedisk.qcow2 \
-cdrom Fedora-Everything-netinst-i386-25-1.3.iso
Where somedisk.qcow2 is an image that contains already some partitions
and file systems.
In the boot menu of Fedora, go to
"Troubleshooting" -> "Rescue a Fedora system" -> "3) Skip to shell"
Then check "dmesg | grep -i 53c" for failure messages, and try to mount
a partition from somedisk.qcow2.
Message-Id: <20230516090556.553813-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU aborts when default RAM backend should be used (i.e. no
explicit '-machine memory-backend=' specified) but user
has created an object which 'id' equals to default RAM backend
name used by board.
$QEMU -machine pc \
-object memory-backend-ram,id=pc.ram,size=4294967296
Actual results:
QEMU 7.2.0 monitor - type 'help' for more information
(qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
Aborted (core dumped)
Instead of abort, check for the conflicting 'id' and exit with
an error, suggesting how to remedy the issue.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230522131717.3780533-1-imammedo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Though we are already using CONFIG_RTL8139_PCI in the meson.build file
for testing whether the rtl8139 device is available or not, this is not
enough: The CONFIG switch might have been selected by another target
(e.g. the mips fuloong2e machine has the rtl8139 chip soldered on the
board), so CONFIG_RTL8139_PCI ends up in config_all_devices and the
test then gets executed on x86. We need an additional run-time check
to be on the safe side to make this test also work when configure has
been run with "--without-default-devices".
Message-Id: <20230525081016.1870364-4-thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The arm "virt" machine needs "virtio-blk-pci" for devices that get attached
via the "-cdrom" option. Since this is an optional device that might not
be available in the binary, we should check for the availability of this
device first before using it.
Message-Id: <20230525081016.1870364-3-thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The "usb-storage" device might not have been compiled into the binary
(e.g. when compiling with "--without-default-devices"), so we have to
check first before using it.
Message-Id: <20230525081016.1870364-2-thuth@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Inspired-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230524122559.28863-1-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230524082037.1620952-1-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230523110435.1375774-6-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230523110435.1375774-4-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230523110435.1375774-3-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230523110435.1375774-2-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
tcg/mips:
- Constant formation improvements
- Replace MIPS_BE with HOST_BIG_ENDIAN
- General cleanups
tcg/riscv:
- Improve setcond
- Support movcond
- Support Zbb, Zba
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmRvo9kdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/ECwf/eQSKdXsppLfgH1zj
# 1VYOfSHB7kKacm5s9de6n0n0aT5DdBYGT1VkYqczMyanpYrK5jHIyzxYIcxa2KjN
# /pMRKALUTq1Aku1wvovpybUT9Qt38+6jHw0U9inj11NJIYX4bheVJon3gztOUBRp
# O67Z22RdfBBu+jL6VD00AE8OhCfeU7CZ+Bj9oNRKYCxXyr1ASla9gfTDy8UG+h2k
# WqNti04xmgXqOZ+pEQ+ZyOCzhCHNLm8XBCtFjWXBe30ibX1PwWdSXqkuUtddd5nJ
# MEbzQV42RCk1CNRrFz0RoAJhpcOEiSeDcI3Vx/PN8xS5mIS2jaWqW+5sMyCcI54h
# JcfcUg==
# =GI+F
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 May 2023 11:07:21 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230525' of https://gitlab.com/rth7680/qemu: (23 commits)
tcg/riscv: Support CTZ, CLZ from Zbb
tcg/riscv: Implement movcond
tcg/riscv: Improve setcond expansion
tcg/riscv: Support CPOP from Zbb
tcg/riscv: Support REV8 from Zbb
tcg/riscv: Support rotates from Zbb
tcg/riscv: Use ADD.UW for guest address generation
tcg/riscv: Support ADD.UW, SEXT.B, SEXT.H, ZEXT.H from Zba+Zbb
tcg/riscv: Support ANDN, ORN, XNOR from Zbb
tcg/riscv: Probe for Zba, Zbb, Zicond extensions
disas/riscv: Decode czero.{eqz,nez}
tcg/mips: Replace MIPS_BE with HOST_BIG_ENDIAN
tcg/mips: Use qemu_build_not_reached for LO/HI_OFF
tcg/mips: Try three insns with shift and add in tcg_out_movi
tcg/mips: Try tb-relative addresses in tcg_out_movi
tcg/mips: Aggressively use the constant pool for n64 calls
tcg/mips: Use the constant pool for 64-bit constants
tcg/mips: Split out tcg_out_movi_two
tcg/mips: Split out tcg_out_movi_one
tcg/mips: Create and use TCG_REG_TB
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* hot-unplug fixes for ioport
* purge qatomic_mb_read/set from monitor
* build system fixes
* OHCI fix from gitlab
* provide EPYC-Rome CPU model not susceptible to XSAVES erratum
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRvGpEUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOa/Af/WS5/tmIlEYgH7UOPERQXNqf7+Jwj
# bA2wgqv3ZoQwcgp5f4EVjfA8ABfpGxLZy6xIdUSbWANb8lDJNuh/nPd/em3rWUAU
# LnJGGdo1vF31gfsVQnlzb7hJi3ur+e2f8JqkRVskDCk3a7YY44OCN42JdKWLrN9u
# CFf2zYqxMqXHjrYrY0Kx2oTkfGDZrfwUlx0vM4dHb8IEoxaplfDd8lJXQzjO4htr
# 3nPBPjQ+h08EeC7mObH4XoJE0omzovR10GkBo8K4q952xGOQ041Y/2YY7JwLfx0D
# na7IanVo+ZAmvTJZoJFSBwNnXkTMHvDH5+Hc45NSTsDBtz0YJhRxPw/z/A==
# =A5Lp
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 May 2023 01:21:37 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
monitor: do not use mb_read/mb_set
monitor: extract request dequeuing to a new function
monitor: introduce qmp_dispatcher_co_wake
monitor: cleanup fetching of QMP requests
monitor: cleanup detection of qmp_dispatcher_co shutting down
monitor: do not use mb_read/mb_set for suspend_cnt
monitor: add more *_locked() functions
monitor: allow calling monitor_resume under mon_lock
monitor: use QEMU_LOCK_GUARD a bit more
softmmu/ioport.c: make MemoryRegionPortioList owner of portio_list MemoryRegions
softmmu/ioport.c: QOMify MemoryRegionPortioList
softmmu/ioport.c: allocate MemoryRegionPortioList ports on the heap
usb/ohci: Set pad to 0 after frame update
meson: move -no-pie from linker to compiler
meson: fix rule for qemu-ga installer
meson.build: Fix glib -Wno-unused-function workaround
target/i386: EPYC-Rome model without XSAVES
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implement with and without Zicond. Without Zicond, we were letting
the middle-end expand to a 5 insn sequence; better to use a branch
over a single insn.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out a helper function, tcg_out_setcond_int, which does not
always produce the complete boolean result, but returns a set of
flags to do so.
Based on 21af161984, the same improvement for loongarch64.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since e03b56863d, which replaced HOST_WORDS_BIGENDIAN
with HOST_BIG_ENDIAN, there is no need to define a second
symbol which is [0,1].
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These sequences are inexpensive to test. Maxing out at three insns
results in the same space as a load plus the constant pool entry.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These addresses are often loaded by the qemu_ld/st slow path,
for loading the retaddr value.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Repeated calls to a single helper are common -- especially
the ones for softmmu memory access. Prefer the constant pool
to longer sequences to increase sharing.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
During normal processing, the constant pool is accessible via
TCG_REG_TB. During the prologue, it is accessible via TCG_REG_T9.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This vastly reduces the size of code generated for 64-bit addresses.
The code for exit_tb, for instance, where we load a (tagged) pointer
to the current TB, goes from
0x400aa9725c: li v0,64
0x400aa97260: dsll v0,v0,0x10
0x400aa97264: ori v0,v0,0xaa9
0x400aa97268: dsll v0,v0,0x10
0x400aa9726c: j 0x400aa9703c
0x400aa97270: ori v0,v0,0x7083
to
0x400aa97240: j 0x400aa97040
0x400aa97244: daddiu v0,s6,-189
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In tcg_out_qemu_ld/st, we already check for guest_base matching int16_t.
Mirror that when setting up TCG_GUEST_BASE_REG in the prologue.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No functional change; just moving the saved reserved regs to the end.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No functional change; just moving the saved reserved regs to the end.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of relying on magic memory barriers, document the pattern that
is being used. It is the one based on Dekker's algorithm, and in this
case it is embodied as follows:
enqueue request; sleeping = true;
smp_mb(); smp_mb();
if (sleeping) kick(); if (!have a request) yield();
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a continue statement so that "after going to sleep" is treated the same
way as "after processing a request". Pull the monitor_lock critical
section out of monitor_qmp_requests_pop_any_with_lock() and protect
qmp_dispatcher_co_shutdown with the monitor_lock.
The two changes are complex to separate because monitor_qmp_dispatcher_co()
previously had a complicated logic to check for shutdown both before
and after going to sleep.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of overloading qmp_dispatcher_co_busy, make the coroutine
pointer NULL. This will make things break spectacularly if somebody
tries to start a request after monitor_cleanup().
AIO_WAIT_WHILE_UNLOCKED() does not need qatomic_mb_read(), because
the macro contains all the necessary memory barriers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clean up monitor_event to just use monitor_suspend/monitor_resume,
using mon->mux_out to protect against incorrect nesting (especially
on startup).
The only remaining case of reading suspend_cnt is in the can_read
callback, which is just advisory and can use qatomic_read.
As an extra benefit, mux_out is now simply protected by mon_lock.
Also, moving the prompt to the beginning of the main loop removes
it from the output in some error cases where QEMU does not actually
start successfully. It is not a full fix and it would be nice to
also remove the monitor heading, but this is already a small (though
unintentional) improvement.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow flushing and printing to the monitor while mon->mon_lock is
held. This will help cleaning up the locking of mon->mux_out and
mon->suspend_cnt.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move monitor_resume()'s call to readline_show_prompt() outside the
potentially locked section. Reuse the existing monitor_accept_input()
bottom half for this purpose.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently when portio_list MemoryRegions are freed using portio_list_destroy() the RCU
thread segfaults generating a backtrace similar to that below:
#0 0x5555599a34b6 in phys_section_destroy ../softmmu/physmem.c:996
#1 0x5555599a37a3 in phys_sections_free ../softmmu/physmem.c:1011
#2 0x5555599b24aa in address_space_dispatch_free ../softmmu/physmem.c:2430
#3 0x55555996a283 in flatview_destroy ../softmmu/memory.c:292
#4 0x55555a2cb9fb in call_rcu_thread ../util/rcu.c:284
#5 0x55555a29b71d in qemu_thread_start ../util/qemu-thread-posix.c:541
#6 0x7ffff4a0cea6 in start_thread nptl/pthread_create.c:477
#7 0x7ffff492ca2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfca2e)
The problem here is that portio_list_destroy() unparents the portio_list
MemoryRegions causing them to be freed immediately, however the flatview
still has a reference to the MemoryRegion and so causes a use-after-free
segfault when the RCU thread next updates the flatview.
Solve the lifetime issue by making MemoryRegionPortioList the owner of the
portio_list MemoryRegions, and then reparenting them to the portio_list
owner. This ensures that they can be accessed as QOM children via the
portio_list owner, yet the MemoryRegionPortioList owns the refcount.
Update portio_list_destroy() to unparent the MemoryRegion from the
portio_list owner (while keeping mrpio->mr live until finalization of the
MemoryRegionPortioList), so that the portio_list MemoryRegions remain
allocated until flatview_destroy() removes the final refcount upon the
next flatview update.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230419151652.362717-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The aim of QOMification is so that the lifetime of the MemoryRegionPortioList
structure can be managed using QOM's in-built refcounting instead of having to
handle this manually.
Due to the use of an opaque pointer it isn't possible to model the new
TYPE_MEMORY_REGION_PORTIO_LIST directly using QOM properties, however since
use of the new object is restricted to the portio API we can simply set the
opaque pointer (and the heap-allocated port list) internally.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230419151652.362717-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to facilitate a conversion of MemoryRegionPortioList to a QOM object
move the allocation of MemoryRegionPortioList ports to the heap instead of
using a variable-length member at the end of the MemoryRegionPortioList
structure.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230419151652.362717-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the OHCI controller's framenumber is incremented, HccaPad1 register
should be set to zero (Ref OHCI Spec 4.4)
ReactOS uses hccaPad1 to determine if the OHCI hardware is running,
consequently it fails this check in current qemu master.
Signed-off-by: Ryan Wendland <wendland@live.com.au>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The large comment in the patch says it all; the -no-pie flag is broken and
this is why it was not included in QEMU_LDFLAGS before commit a988b4c561
("build: move remaining compiler flag tests to meson", 2023-05-18). And
some distros made things even worse, so we have to add it to the compiler
command line.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1664
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The bindir variable is not available in the "glib" variable, which is an internal
dependency (created with "declare_dependency"). Use glib_pc instead, which contains
the variable as it is instantiated from glib-2.0.pc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We want to only enable '-Wno-unused-function' if glib's version is
smaller than '2.57.2' and has a G_DEFINE_AUTOPTR_CLEANUP_FUNC()
implementation that doesn't take into account unused functions. But the
compilation test isn't working as intended as '-Wunused-function' isn't
enabled while running it.
Let's enable it.
Fixes: fc9a809e0d ("build: move glib detection and workarounds to meson")
Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230524173123.66483-1-nsaenz@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Based on the kernel commit "b0563468ee x86/CPU/AMD: Disable XSAVES on
AMD family 0x17", host system with EPYC-Rome can clear XSAVES capability
bit. In another words, EPYC-Rome host without XSAVES can occur. Thus, we
need an EPYC-Rome cpu model (without this feature) that matches the
solution of fixing this erratum
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-Id: <20230524213748.8918-1-davydov-max@yandex-team.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vfio queue:
* Fix for a memory corruption due to an extra free
* Fix for a compile breakage
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmRtyB4ACgkQUaNDx8/7
# 7KFQvRAAhexL/Q8rWM8og+VESL5gPlpxDhWCI+l76+YJqQzZkgebwZ5rw920f8EG
# bRs5AAk8fPTX/qKKq/JkYMmQwpM2jo8W4elcNumm44WAG7hDwd1LQ3nAZeOcvgU0
# jQ1IwRYcgNo+oOTN9b7GhePQK27OraliLUrf/sBGUWvbdAttVc2pcB91CMur0Dxb
# 9KK2vEA4MJ9B8zf2/ZkaK6Z+28GsratR7803Nvv25rm5sP3VBb9w0TnKZAOmaHLv
# X5Tz8yjNvQxxzB9SzgOK6yMtnrp42ArVC5u2aDa33uzSWUeFiTF1HEFeGAps2nJg
# 8tSNo0fTKhznrVR3q2pyxC05Dp+jmKicrmivc26iBdAWAUxQYX44UQoLYD5ISdti
# nlSE+Is+0ZE5E2tHE9yAOPa4rrXHNBqpueu+VMPbYMyVEqzblP7twYe6HkGPYhrD
# zbx/ABZAAGOf+3YmyL1yQrCc0WyJ2lHDySQt/llMrhkBTCHGEF8yjfWFypluZFWX
# X7Mb0YZP0qPpFsV3TDcrqV3onaFSNehp2EJs2EJAa/DeUNbnKlz4LiYBzZE95egb
# 9PGrLnB5w1Vlp44H+ctrnYj55TnspHT+Qqwvhkr/vOMupZukbGus0VFIU2IDrh2g
# qEqhaigwxfVyZ1Eqwti4IgX8RVX8bW43slR33aD6vsO7jpiP2Pk=
# =TA2V
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 May 2023 01:17:34 AM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20230524' of https://github.com/legoater/qemu:
util/vfio-helpers: Use g_file_read_link()
vfio/pci: Fix a use-after-free issue
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
12.1.0, the compiler complains as follows:
In file included from /usr/include/features.h:490,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdint.h:26,
from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
from ../util/vfio-helpers.c:13:
In function 'readlink',
inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
/usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull]
119 | return __glibc_fortify (readlink, __len, sizeof (char),
| ^~~~~~~~~~~~~~~
This error implies the allocated buffer can be NULL. Use
g_file_read_link(), which allocates buffer automatically to avoid the
error.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
util: Host cpu detection for x86 and aa64
util: Use cpu detection for bufferiszero
migration: Use cpu detection for xbzrle
tcg: Replace and remove cpu_atomic_{ld,st}o*
host/include: Split qemu/atomic128.h
tcg: Remove DEBUG_DISAS
tcg: Remove USE_TCG_OPTIMIZATIONS
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmRtbwAdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8xlgf7B/RnVG7u7Hjndr6h
# fH07ujjElAivs+H05S0GGbQYpSNlqVv8PzXT2olJTAe15ryb537dCkqxyKW53vgb
# pUWzZf9Zy8XfN48W5V91dSKQE3gm5wBlOM6LI85F8XrIQyjZqkHti+rw3GxsamNL
# 8n2euOR0vx/jculBRxvZUAJDzb/0shN583mC5+wX/KInCHiNmMC6sCggyd5bpFJZ
# 1wqWwrUCqJ0KAAYKd9WrIKt6QwAX3kUDiBQPa1g+psBjZ1CYQ4lqZZn9uYQ4hEtG
# yBnT0ER2LOBQaKXJ0BrdG5c/mUNX7WkLBDTb+QjGGkfPc/bHIirXqeFzuyrXahg8
# kY155w==
# =XH8Z
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 May 2023 06:57:20 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230523-3' of https://gitlab.com/rth7680/qemu: (28 commits)
tcg: Remove USE_TCG_OPTIMIZATIONS
tcg: Remove DEBUG_DISAS
qemu/atomic128: Add runtime test for FEAT_LSE2
qemu/atomic128: Improve cmpxchg fallback for atomic16_set
tcg: Split out tcg/debug-assert.h
accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.inc
qemu/atomic128: Split atomic16_read
accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128
accel/tcg: Remove prot argument to atomic_mmu_lookup
accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmu
target/s390x: Always use cpu_atomic_cmpxchgl_be_mmu in do_csst
target/s390x: Use cpu_{ld,st}*_mmu in do_csst
accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu
target/s390x: Use tcg_gen_qemu_{ld,st}_i128 for LPQ, STPQ
target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ
include/qemu: Move CONFIG_ATOMIC128_OPT handling to atomic128.h
meson: Fix detect atomic128 support with optimization
include/host: Split out atomic128-ldst.h
include/host: Split out atomic128-cas.h
util: Add cpuinfo-aarch64.c
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is always defined, and the optimization pass is
essential to producing reasonable code.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This had been set since the beginning, is never undefined,
and it would seem to be harmful to debugging to do so.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use __sync_bool_compare_and_swap_16 to control the loop,
rather than a separate comparison.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove the locally defined load_atomic16 and store_atomic16,
along with HAVE_al16 and HAVE_al16_fast in favor of the
routines defined in atomic128.h.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Create both atomic16_read_ro and atomic16_read_rw.
Previously we pretended that we had atomic16_read in system mode,
because we "know" that all ram is always writable to the host.
Now, expose read-only and read-write versions all of the time.
For aarch64, do not fall back to __atomic_read_16 even if
supported by the compiler, to work around a clang bug.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These symbols will shortly become dynamic runtime tests and
therefore not appropriate for the preprocessor. Use the
matching CONFIG_* symbols for that purpose.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that load/store are gone, we're always passing
PAGE_READ | PAGE_WRITE for RMW atomic operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use cpu_ld16_mmu and cpu_st16_mmu to eliminate the special case,
and change all of the *_data_ra functions to match.
Note that we check the alignment of both compare and store
pointers at the top of the function, so MO_ALIGN* may be
safely removed from the individual memory operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert. Unify the pairs of functions.
The only use of the functions with explicit endianness was in
target/sparc64, and that was only to satisfy the assert: the
correct endianness is already built into memop.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No need to roll our own, as this is now provided by tcg.
This was the last use of retxl, so remove that too.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No need to roll our own, as this is now provided by tcg.
This was the last use of retxl, so remove that too.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Not only the routines in ldst_atomicity.c.inc need markup,
but also the ones in the headers.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The items in migration_files are built for libmigration and included
info softmmu_ss from there; no need to also include them directly.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Perform the function selection once, and only if CONFIG_AVX512_OPT
is enabled. Centralize the selection to xbzrle.c, instead of
spreading the init across 3 files.
Remove xbzrle-bench.c. The benefit of being able to benchmark
the different implementations is less important than not peeking
into the internals of the implementation.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use cpuinfo_init() during init_accel(), and the variable cpuinfo
during test_buffer_is_zero_next_accel(). Adjust the logic that
cycles through the set of accelerators for testing.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the CPUINFO_* bits instead of the individual boolean
variables that we had been using. Remove all of the init
code that was moved over to cpuinfo-i386.c.
Note that have_avx512* check both AVX512{F,VL}, as we had
previously done during tcg_target_init.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add cpuinfo.h for i386 and x86_64, and the initialization
for that in util/. Populate that with a slightly altered
copy of the tcg host probing code. Other uses of cpuid.h
will be adjusted one patch at a time.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The entire contents of the header is host-specific, but the
existence of such a header is not, which could prevent some
host specific ifdefs at the top of the file for the include.
Add host/include/{arch,generic} to the project arguments.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add an option for hostmem-file to start the memory object at an offset
into the target file. This is useful if multiple memory objects reside
inside the same target file, such as a device node.
In particular, it's useful to map guest memory directly into /dev/mem
for experimentation.
To make this work consistently, also fix up all places in QEMU that
expect fd offsets to be 0.
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20230403221421.60877-1-graf@amazon.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
# -----BEGIN PGP SIGNATURE-----
# Version: GnuPG v1
#
# iQEcBAABAgAGBQJkbGmXAAoJEO8Ells5jWIR4ogH/R5+IgkZi1dwN/IxCpzTIc5H
# l5ncKK6TCqKCfgpFnFFLNKhcDqDczq4LhO42s/vnuOF8vIXcUVhLAz0HULARb46o
# p/7Ufn1k8Zg/HGtWwIW+9CcTkymsHzTOwFcTRFiCjpdkjaW1Wprb2q968f0Px8eS
# cKqC5xln8U+s02KWQMHlJili6BTPuw1ZNnYV3iq/81Me96WOtPd8c8ZSF4aVR2AB
# Kqah+BBOnk4p4kg9Gs0OvM4TffEBrsab8iu4s6SSQGA6ymCWY6GeCX0Ik4u9P1yE
# 6NtKLixBPO4fqLwWxWuKVJmaLKmuEd/FjZXWwITx9EPNtDuBuGLDKuvW8fJxkhw=
# =dw2I
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 May 2023 12:21:59 AM PDT
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu: (50 commits)
rtl8139: fix large_send_mss divide-by-zero
docs/system/devices/igb: Note igb is tested for DPDK
MAINTAINERS: Add a reviewer for network packet abstractions
vmxnet3: Do not depend on PC
igb: Clear-on-read ICR when ICR.INTA is set
igb: Notify only new interrupts
e1000e: Notify only new interrupts
igb: Implement Tx timestamp
igb: Implement Rx PTP2 timestamp
igb: Implement igb-specific oversize check
igb: Filter with the second VLAN tag for extended VLAN
igb: Strip the second VLAN tag for extended VLAN
igb: Implement Tx SCTP CSO
igb: Implement Rx SCTP CSO
igb: Use UDP for RSS hash
igb: Implement MSI-X single vector mode
tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX
hw/net/net_rx_pkt: Enforce alignment for eth_header
net/eth: Always add VLAN tag
net/eth: Use void pointers
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
I have made significant changes for network packet abstractions so add
me as a reviewer.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
For GPIE.NSICR, Section 7.3.2.1.2 says:
> ICR bits are cleared on register read. If GPIE.NSICR = 0b, then the
> clear on read occurs only if no bit is set in the IMS or at least one
> bit is set in the IMS and there is a true interrupt as reflected in
> ICR.INTA.
e1000e does similar though it checks for CTRL_EXT.IAME, which does not
exist on igb.
Suggested-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This follows the corresponding change for e1000e. This fixes:
tests/avocado/netdev-ethtool.py:NetDevEthtool.test_igb
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In MSI-X mode, if there are interrupts already notified but not cleared
and a new interrupt arrives, e1000e incorrectly notifies the notified
ones again along with the new one.
To fix this issue, replace e1000e_update_interrupt_state() with
two new functions: e1000e_raise_interrupts() and
e1000e_lower_interrupts(). These functions don't only raise or lower
interrupts, but it also performs register writes which updates the
interrupt state. Before it performs a register write, these function
determines the interrupts already raised, and compares with the
interrupts raised after the register write to determine the interrupts
to notify.
The introduction of these functions made tracepoints which assumes that
the caller of e1000e_update_interrupt_state() performs register writes
obsolete. These tracepoints are now removed, and alternative ones are
added to the new functions.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
igb has a configurable size limit for LPE, and uses different limits
depending on whether the packet is treated as a VLAN packet.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
eth_strip_vlan and eth_strip_vlan_ex refers to ehdr_buf as struct
eth_header. Enforce alignment for the structure.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It is possible to have another VLAN tag even if the packet is already
tagged.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The uses of uint8_t pointers were misleading as they are never accessed
as an array of octets and it even require more strict alignment to
access as struct eth_header.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Rename variable "n" to "causes", which properly represents the content
of the variable.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Section 7.3.4.1 says:
> When auto-clear is enabled for an interrupt cause, the EICR bit is
> set when a cause event mapped to this vector occurs. When the EITR
> Counter reaches zero, the MSI-X message is sent on PCIe. Then the
> EICR bit is cleared and enabled to be set by a new cause event
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Keeping Tx packet state after the transmit queue is emptied but this
behavior is unreliable as the state can be reset anytime the migration
happens.
Always reset Tx packet state always after the queue is emptied.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Keeping Tx packet state after the transmit queue is emptied has some
problems:
- The datasheet says the descriptors can be reused after the transmit
queue is emptied, but the Tx packet state may keep references to them.
- The Tx packet state cannot be migrated so it can be reset anytime the
migration happens.
Always reset Tx packet state always after the queue is emptied.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Section 7.2.2.3 Advanced Transmit Data Descriptor says:
> For frames that spans multiple descriptors, all fields apart from
> DCMD.EOP, DCMD.RS, DCMD.DEXT, DTALEN, Address and DTYP are valid only
> in the first descriptors and are ignored in the subsequent ones.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The goto is a bit confusing as it changes the control flow only if L4
protocol is not recognized. It is also different from e1000e, and
noisy when comparing e1000e and igb.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
e1000e and igb employs NetPktRssIpV6TcpEx for RSS hash if TcpIpv6 MRQC
bit is set. Moreover, igb also has a MRQC bit for NetPktRssIpV6Tcp
though it is not implemented yet. Rename it to TcpIpv6Ex to avoid
confusion.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Section 13.7.15 Receive Length Error Count says:
> Packets over 1522 bytes are oversized if LongPacketEnable is 0b
> (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet
> is considered oversized if it exceeds 16384 bytes.
> These lengths are based on bytes in the received packet from
> <Destination Address> through <CRC>, inclusively.
As QEMU processes packets without CRC, the number of bytes for CRC
need to be subtracted. This change adds some size definitions to be used
to derive the new size thresholds to eth.h.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The old eth_setup_vlan_headers has no user so remove it and rename
eth_setup_vlan_headers_ex.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
igb_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.
The size of this copy is just 22 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it. Removing this also allows igb to assume
aligned accesses for the ethernet header.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
e1000e_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.
The size of this copy is just 18 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it.
Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
While the datasheet of e1000e says it checks CTRL.VME for tx VLAN
tagging, igb's datasheet has no such statements. It also says for
"CTRL.VLE":
> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.
(Appendix A. Changes from the 82575)
There is no "CTRL.VLE" so it is more likely that it is a mistake of
CTRL.VME.
Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
igb's advanced descriptor uses a packet type encoding different from
one used in e1000e's extended descriptor. Fix the logic to encode
Rx packet type accordingly.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Before this change, e1000 and the common code updated BPRC and MPRC
depending on the matched filter, but e1000e and igb decided to update
those counters by deriving the packet type independently. This
inconsistency caused a multicast packet to be counted twice.
Updating BPRC and MPRC depending on are fundamentally flawed anyway as
a filter can be used for different types of packets. For example, it is
possible to filter broadcast packets with MTA.
Always determine what counters to update by inspecting the packets.
Fixes: 3b27430177 ("e1000: Implementing various counters")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This allows to use the network packet abstractions even if PCI is not
used.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This is intended to be followed by another change for the interface.
It also fixes the leak of memory mapping when the specified memory is
partially mapped.
Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The bytes and packets counter registers are cleared on read.
Copying the "total counter" registers to the "good counter" registers has
side effects.
If the "total" register is never read by the OS, it only gets incremented.
This leads to exponential growth of the "good" register.
This commit increments the counters individually to avoid this.
Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
* First batch of fixes to allow "make check" with "--without-default-devices"
* Enable the "bios bits" avocado test in the gitlab-CI
* Another minor fix for the redundancy DMA blocker code
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmRrVhoRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUaiRAApPVveet6WPQ7Ag1448LtqHTGiwl8x2Ba
# jQ7FTKhqdTC5O+/BU7IQkvGmErPxCc8WPB7eoowwBVA/4dr8YIIBLKqO4RtP6LXs
# rtUkzsPI9ExW+iJjIMVOmHsp/shlRhuf+Tmlr8OsTObecCeA4Vbxc+RlvYXfCPhM
# 8tOuLO8n6LQY/62fgXSzI5WlLQSzIo3aDSmCeWa1QHkPLf6itvGkwsNBytMJLoUT
# pXZnBNqlXiuyPtloLp+DMfRRkpq8AHB04+Sri7TVPxi7bJL28RMZiaAXpvHSFLz8
# JR2ApRrzBthiLMK1I6A0c2ZGCbVOAi1dhNDNqWCyx8ZBASEJj0XuT/+Qse81sKmG
# zNXr57x0CzWAJ59/taBM2hjUks10rJOmxHJYxS6i1JJR7u1zTuvii7toPMmf35zX
# bM7TYjKpYGa2HneHpw1eOjpTgUYZpgla/pVXZhKqoGdfmseBMlFU424MNl/xDRng
# bxuam3Ku+ClOeQlzXt8aceL/gTApJfvy5FAIAK5yUOQDTs6HjJJL2AfcOzss8kXb
# k6IMHgV1tnLed8B7K4iml2rzvk+RT3CPGvmaNwSAkdh8SnE5/bv1I6s4fHiXMlvC
# mmfvFSoWwdhcsD5r+XOFxfke8sGrOeQIXKefp6UL3hYVV7o2NUe89BytXZCzut/Y
# 6ulR25HHtmI=
# =m1Px
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 22 May 2023 04:46:34 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-05-22' of https://gitlab.com/thuth/qemu:
memory: stricter checks prior to unsetting engaged_in_io
acpi/tests/avocado/bits: enable bios bits avocado tests on gitlab CI pipeline
.gitlab-ci.d/buildtest.yml: Run full "make check" with --without-default-devices
tests/qemu-iotests/172: Run QEMU with -vga none and -nic none
tests/qtest/meson.build: Run the net filter tests only with default devices
tests/qtest: Check for the availability of virtio-ccw devices before using them
tests/qtest/virtio-ccw-test: Remove superfluous tests
tests/qtest/cdrom-test: Fix the test to also work without optional devices
tests/qtest/usb-hcd-uhci-test: Skip test if UHCI controller is not available
tests/qtest/readconfig-test: Check for the availability of USB controllers
hw/sparc64/sun4u: Use MachineClass->default_nic and MachineClass->no_parallel
hw/i386: Ignore the default parallel port if it has not been compiled into QEMU
hw/char/parallel: Move TYPE_ISA_PARALLEL to the header file
hw/sh4: Use MachineClass->default_nic in the sh4 r2d machine
hw/s390x: Use MachineClass->default_nic in the s390x machine
hw/ppc: Use MachineClass->default_nic in the ppc machines
softmmu/vl.c: Disable default NIC if it has not been compiled into the binary
hw: Move the default NIC machine class setting from the x86 to the generic one
softmmu/vl.c: Check for the availability of the VGA device before using it
hw/i386/Kconfig: ISAPC works fine without VGA_ISA
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Block layer patches
- qcow2 spec: Rename "zlib" compression to "deflate"
- Honour graph read lock even in the main thread + prerequisite fixes
- aio-posix: do not nest poll handlers (fixes infinite recursion)
- Refactor QMP blockdev transactions
- graph-lock: Disable locking for now
- iotests/245: Check if 'compress' driver is available
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmRnrxURHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9aHyw/9H0xpceVb0kcC5CStOWCcq4PJHzkl/8/m
# c6ABFe0fgEuN2FCiKiCKOt6+V7qaIAw0+YLgPr/LGIsbIBzdxF3Xgd2UyIH6o4dK
# bSaIAaes6ZLTcYGIYEVJtHuwNgvzhjyBlW5qqwTpN0YArKS411eHyQ3wlUkCEVwK
# ZNmDY/MC8jq8r1xfwpPi7CaH6k1I6HhDmyl1PdURW9hmoAKZQZMhEdA5reJrUwZ9
# EhfgbLIaK0kkLLsufJ9YIkd+b/P3mUbH30kekNMOiA0XlnhWm1Djol5pxlnNiflg
# CGh6CAyhJKdXzwV567cSF11NYCsFmiY+c/l0xRIGscujwvO4iD7wFT5xk2geUAKV
# yaox8JA7Le36g7lO2CRadlS24/Ekqnle6q09g2i8s2tZwB4fS286vaZz6QDPmf7W
# VSQp9vuDj6ZcVjMsuo2+LzF3yA2Vqvgd9s032iBAjRDSGLAoOdQZjBJrreypJ0Oi
# pVFwgK+9QNCZBsqVhwVOgElSoK/3Vbl1kqpi30Ikgc0epAn0suM1g2QQPJ2Zt/MJ
# xqMlTv+48OW3vq3ebr8GXqkhvG/u0ku6I1G6ZyCrjOce89osK8QUaovERyi1eOmo
# ouoZ8UJJa6VfEkkmdhq2vF6u/MP4PeZ8MW3pYQy6qEnSOPDKpLnR30Z/s/HZCZcm
# H4QIbfQnzic=
# =edNP
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 May 2023 10:17:09 AM PDT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (21 commits)
iotests: Test commit with iothreads and ongoing I/O
nbd/server: Fix drained_poll to wake coroutine in right AioContext
graph-lock: Disable locking for now
tested: add test for nested aio_poll() in poll handlers
aio-posix: do not nest poll handlers
iotests/245: Check if 'compress' driver is available
graph-lock: Honour read locks even in the main thread
blockjob: Adhere to rate limit even when reentered early
test-bdrv-drain: Call bdrv_co_unref() in coroutine context
test-bdrv-drain: Take graph lock more selectively
qemu-img: Take graph lock more selectively
qcow2: Unlock the graph in qcow2_do_open() where necessary
block/export: Fix null pointer dereference in error path
block: Call .bdrv_co_create(_opts) unlocked
docs/interop/qcow2.txt: fix description about "zlib" clusters
blockdev: qmp_transaction: drop extra generic layer
blockdev: use state.bitmap in block-dirty-bitmap-add action
blockdev: transaction: refactor handling transaction properties
blockdev: qmp_transaction: refactor loop to classic for
blockdev: transactions: rename some things
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Biosbits avocado tests on gitlab has thus far been disabled because some
packages needed by this test was missing in the container images used by gitlab
CI. These packages have now been added with the commit:
da9000784c ("tests/lcitool: Add mtools and xorriso and remove genisoimage as dependencies")
Therefore, this change enables bits avocado test on gitlab.
At the same time, the bits cleanup code has also been made more robust with
this change.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230517065357.5614-1-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
qmp-intro.txt is quite small and provides very little information
that isn't already in the documentation elsewhere. Fold the example
command lines into qemu-options.hx, and delete the now-unneeded plain
text document.
While we're touching the qemu-options.hx documentation text,
wordsmith it a little bit and improve the rST formatting.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-4-peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Convert the qmp-spec.txt document to restructuredText.
Notable points about the conversion:
* numbers at the start of section headings are removed, to match
the style of the rest of the manual
* cross-references to other sections or documents are hyperlinked
* various formatting tweaks (notably the examples, which need the
-> and <- prefixed so the QMP code-block lexer will accept them)
* English prose fixed in a few places
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-2-peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[.. code-block:: dumbed down to :: to work around CI failure]
The error message is bad when the section is untagged. For instance,
test case doc-interleaved-section produces "'@foobar:' can't follow
'Note' section", which is okay, but if we drop the "Note:" tag, we get
"'@foobar:' can't follow 'None' section, which is bad.
Change the error message to "description of '@foobar:' follows a
section".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230510141637.3685080-1-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Conflict with commit 3e32dca3f0 resolved]
This way QEMU won't complain in case the VGA card or the NIC device
are not available in the binary, thus it won't spoil the output
and the test then passes with such QEMU binaries that have a limited
configuration, too.
Message-Id: <20230512124033.502654-18-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
virtio-balloon-ccw is already tested in the device-plug-test,
virtio-blk-ccw is already tested in cdrom-test, and virtio-net-ccw
is already tested in the pxe-test, so there is not much point
in doing "nop" tests here again.
Message-Id: <20230512124033.502654-15-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The test is already fenced with CONFIG_USB_UHCI in meson.build, but in
case we build the ppc or mips targets in parallel, this config switch
is still set in "config_all_devices" and thus the test is still run.
Thus we need an explicit additional check here before adding the tests
to the test plan.
Message-Id: <20230512124033.502654-13-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The USB controllers might not be available in the QEMU binary
(e.g. when using the "--without-default-devices" configure switch),
so we have to check whether the devices can be used before running
the related test.
Message-Id: <20230512124033.502654-12-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Announce the default NIC via MachineClass->default_nic and set up
MachineClass->no_parallel according to the availability of the
"isa-parallel" device, so that the Sun machines also work when
QEMU has been configured with "--without-default-devices".
Message-Id: <20230512124033.502654-11-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230512124033.502654-8-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230512124033.502654-7-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230512124033.502654-6-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In case the user disabled the default VGA device in the binary (e.g.
with the "--without-default-devices" configure switch), we should
not try to use it by default if QEMU is running with the default
devices, otherwise it aborts when trying to use it. Simply emit a
warning instead.
Message-Id: <20230512124033.502654-3-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
virtio,pc,pci: fixes, features, cleanups
CXL volatile memory support
More memslots for vhost-user on x86 and ARM.
vIOMMU support for vhost-vdpa
pcie-to-pci bridge can now be compiled out
MADT revision bumped to 3
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRniWoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpN4MH/RqdvHmujrjvjzXbbN/gq87Njp+kQLKEooIE
# ZkqdNaVUE6vjCH8iU+chjsxt4VSquSjOL9CWWrYefEIeqCFLWsuXSAY0VDAbY67x
# +aes51tTYILVsx7fbb+T5mJKRgVuWW4C5KaGeQ1djSexy42nvplZUJdIJUhZr0t9
# dzzOsD+mezHS7Xu2QOzSfl5QQRuOVVJnjJXkqJG/yRvHrZM5aTolatr/X7jNGedm
# 4oyMsVMaAcQ+dnEQigRJodf/MpFfs9DfNZAH55VwwQWsNT0t0ueD0xigR203jjaE
# mJJJipAqetFax2JjC7QMXWf+LR36BnL/0/xH+x/BWb0FI42wr0I=
# =ajmR
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 May 2023 07:36:26 AM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (40 commits)
hw/i386/pc: No need for rtc_state to be an out-parameter
hw/i386/pc: Create RTC controllers in south bridges
hw/cxl: Introduce cxl_device_get_timestamp() utility function
hw/cxl: rename mailbox return code type from ret_code to CXLRetCode
hw/pci-bridge: make building pcie-to-pci bridge configurable
virtio-pci: add handling of PCI ATS and Device-TLB enable/disable
hw/pci-host/pam: Make init_pam() usage more readable
hw/i386/pc: Initialize ram_memory variable directly
hw/i386/pc_{q35,piix}: Minimize usage of get_system_memory()
hw/i386/pc_{q35,piix}: Reuse MachineClass::desc as SMB product name
hw/i386/pc_q35: Reuse machine parameter
hw/pci-host/q35: Inline sysbus_add_io()
hw/pci-host/i440fx: Inline sysbus_add_io()
vhost-vdpa: Add support for vIOMMU.
vhost-vdpa: Add check for full 64-bit in region delete
vhost_vdpa: fix the input in trace_vhost_vdpa_listener_region_del()
vhost: expose function vhost_dev_has_iommu()
virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_request
virtio-net: not enable vq reset feature unconditionally
vhost-user: Remove acpi-specific memslot limit
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Some scripts are invoked via the first "python3" binary in the PATH,
because they are executable and their shebang line is "#! /usr/bin/env
python3". To enforce usage of $(PYTHON), make them nonexecutable.
Scripts invoked via meson need nothing else, and meson-buildoptions.py
is already using $(PYTHON). For probe-gdb-support.py however the
invocation in the configure script has to be adjusted.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since custom runners are not generally available, make it possible to
debug the differences between a successful and a failing build by
comparing the logs and the build.ninja rules.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If sphinx is present but the theme is not, mkvenv will print an
inaccurate diagnostic:
ERROR: Could not find a version that satisfies the requirement sphinx-rtd-theme>=0.5.0 (from versions: none)
ERROR: No matching distribution found for sphinx-rtd-theme>=0.5.0
'sphinx>=1.6.0' not found:
• Python package 'sphinx' version '5.3.0' was found, but isn't suitable.
• mkvenv was configured to operate offline and did not check PyPI.
Instead, ignore the packages that were found to be present, and report
an error based on the first absent package.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reintroduce the cmd_line.txt mangling to remove the sphinx_build option
when rerunning meson. The mechanism was removed in commit 75cc286485
("configure: remove backwards-compatibility code", 2023-01-11) because
the fixups were obsolete at the time; however, the Meson deprecation
mechanism doesn't quite work when options are finally removed, so we
need to bring it back.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not use the rule in build.ninja, because the path to meson is hardcoded
in build.ninja and this breaks if meson moves (for example if the distro
meson suddenly becomes too old after an update).
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
importlib.metadata is just as good as distlib.database and a bit more
battle-proven for "egg" based distributions, and in fact that is exactly
why mkvenv.py is not using distlib.database to find entry points: it
simply does not work for eggs.
The only disadvantage of importlib.metadata is that it is not available
by default before Python 3.8, so we need a fallback to pkg_resources
(again, just like for the case of finding entry points). Do so to
fix issues where incorrect egg metadata results in a JSONDecodeError.
While at it, reuse the new _get_version function to diagnose an incorrect
version of the package even if importlib.metadata is not available.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This tests exercises graph locking, draining, and graph modifications
with AioContext switches a lot. Amongst others, it serves as a
regression test for bdrv_graph_wrlock() deadlocking because it is called
with a locked AioContext and for AioContext handling in the NBD server.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-4-kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nbd_drained_poll() generally runs in the main thread, not whatever
iothread the NBD server coroutine is meant to run in, so it can't
directly reenter the coroutines to wake them up.
The code seems to have the right intention, it specifies the correct
AioContext when it calls qemu_aio_coroutine_enter(). However, this
functions doesn't schedule the coroutine to run in that AioContext, but
it assumes it is already called in the home thread of the AioContext.
To fix this, add a new thread-safe qio_channel_wake_read() that can be
called in the main thread to wake up the coroutine in its AioContext,
and use this in nbd_drained_poll().
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
come from callers that hold an AioContext lock, which is not allowed
during polling. In theory, we could temporarily release the lock, but
callers are inconsistent about whether they hold a lock, and if they do,
some are also confused about which one they hold. While all of this is
fixable, it's not trivial, and the best course of action for 8.0.1 is
probably just disabling the graph locking code temporarily.
We don't currently rely on graph locking yet. It is supposed to replace
the AioContext lock eventually to enable multiqueue support, but as long
as we still have the AioContext lock, it is sufficient without the graph
lock. Once the AioContext lock goes away, the deadlock doesn't exist any
more either and this commit can be reverted. (Of course, it can also be
reverted while the AioContext lock still exists if the callers have been
fixed.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QEMU's event loop supports nesting, which means that event handler
functions may themselves call aio_poll(). The condition that triggered a
handler must be reset before the nested aio_poll() call, otherwise the
same handler will be called and immediately re-enter aio_poll. This
leads to an infinite loop and stack exhaustion.
Poll handlers are especially prone to this issue, because they typically
reset their condition by finishing the processing of pending work.
Unfortunately it is during the processing of pending work that nested
aio_poll() calls typically occur and the condition has not yet been
reset.
Disable a poll handler during ->io_poll_ready() so that a nested
aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the
disabled poll handler and its associated fd handler do not run during
the nested aio_poll(). Calling aio_set_fd_handler() from inside nested
aio_poll() could cause it to run again. If the fd handler is pending
inside nested aio_poll(), then it will also run again.
In theory fd handlers can be affected by the same issue, but they are
more likely to reset the condition before calling nested aio_poll().
This is a special case and it's somewhat complex, but I don't see a way
around it as long as nested aio_poll() is supported.
Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181
Fixes: c382706925 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK")
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Skip TestBlockdevReopen.test_insert_compress_filter() if the 'compress'
driver isn't available.
In order to make the test succeed when the case is skipped, we also need
to remove any output from it (which would be missing in the case where
we skip it). This is done by replacing qemu_io_log() with qemu_io(). In
case of failure, qemu_io() raises an exception with the output of the
qemu-io binary in its message, so we don't actually lose anything.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230511143801.255021-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are some conditions under which we don't actually need to do
anything for taking a reader lock: Writing the graph is only possible
from the main context while holding the BQL. So if a reader is running
in the main context under the BQL and knows that it won't be interrupted
until the next writer runs, we don't actually need to do anything.
This is the case if the reader code neither has a nested event loop
(this is forbidden anyway while you hold the lock) nor is a coroutine
(because a writer could run when the coroutine has yielded).
These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
They are not fulfilled in bdrv_graph_co_rdlock(), which always runs in a
coroutine.
This deletes the shortcuts in bdrv_graph_co_rdlock() that skip taking
the reader lock in the main thread.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-9-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When jobs are sleeping, for example to enforce a given rate limit, they
can be reentered early, in particular in order to get paused, to update
the rate limit or to get cancelled.
Before this patch, they behave in this case as if they had fully
completed their rate limiting delay. This means that requests are sped
up beyond their limit, violating the constraints that the user gave us.
Change the block jobs to sleep in a loop until the necessary delay is
completed, while still allowing cancelling them immediately as well
pausing (handled by the pause point in job_sleep_ns()) and updating the
rate limit.
This change is also motivated by iotests cases being prone to fail
because drain operations pause and unpause them so often that block jobs
complete earlier than they are supposed to. In particular, the next
commit would fail iotests 030 without this change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-8-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-6-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-5-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_do_open() calls a few no_co_wrappers that wrap functions taking
the graph lock internally as a writer. Therefore, it can't hold the
reader lock across these calls, it causes deadlocks. Drop the lock
temporarily around the calls.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-4-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.
Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps could have been set for it (in practice, this only happens
in drv->create() today, so this part of the change isn't strictly
necessary).
Fixes: Coverity CID 1509238
Fixes: de79b52604
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These are functions that modify the graph, so they must be able to take
a writer lock. This is impossible if they already hold the reader lock.
If they need a reader lock for some of their operations, they should
take it internally.
Many of them go through blk_*(), which will always take the lock itself.
Direct calls of bdrv_*() need to take the reader lock. Note that while
locking for bdrv_co_*() calls is checked by TSA, this is not the case
for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
when they are called from coroutine context like here!
This effectively reverts 4ec8df0183, but adds some internal locking
instead.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Let's simplify things:
First, actions generally don't need access to common BlkActionState
structure. The only exclusion are backup actions that need
block_job_txn.
Next, for transaction actions of Transaction API is more native to
allocated state structure in the action itself.
So, do the following transformation:
1. Let all actions be represented by a function with corresponding
structure as arguments.
2. Instead of array-map marshaller, let's make a function, that calls
corresponding action directly.
3. BlkActionOps and BlkActionState structures become unused
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-7-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Other bitmap related actions use the .bitmap pointer in .abort action,
let's do same here:
1. It helps further refactoring, as bitmap-add is the only bitmap
action that uses state.action in .abort
2. It must be safe: transaction actions rely on the fact that on
.abort() the state is the same as at the end of .prepare(), so that
in .abort() we could precisely rollback the changes done by
.prepare().
The only way to remove the bitmap during transaction should be
block-dirty-bitmap-remove action, but it postpones actual removal to
.commit(), so we are OK on any rollback path. (Note also that
bitmap-remove is the only bitmap action that has .commit() phase,
except for simple g_free the state on .clean())
3. Again, other bitmap actions behave this way: keep the bitmap pointer
during the transaction.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-6-vsementsov@yandex-team.ru>
[kwolf: Also remove the now unused BlockDirtyBitmapState.prepared]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Look at qmp_transaction(): dev_list is not obvious name for list of
actions. Let's look at qapi spec, this argument is "actions". Let's
follow the common practice of using same argument names in qapi scheme
and code.
To be honest, rename props to properties for same reason.
Next, we have to rename global map of actions, to not conflict with new
name for function argument.
Rename also dev_entry loop variable accordingly to new name of the
list.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510150624.310640-3-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are going to add more block-graph modifying transaction actions,
and block-graph modifying functions are already based on Transaction
API.
Next, we'll need to separately update permissions after several
graph-modifying actions, and this would be simple with help of
Transaction API.
So, now let's just transform what we have into new-style transaction
actions.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230510150624.310640-2-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts commit b320e21c48,
which accidentally broke TCG, because it made the TCG -cpu max
report the presence of MTE to the guest even if the board hadn't
enabled MTE by wiring up the tag RAM. This meant that if the guest
then tried to use MTE QEMU would segfault accessing the
non-existent tag RAM:
==346473==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address (pc 0x55f328952a4a bp 0x00000213a400 sp 0x7f7871859b80 T346476)
==346473==The signal is caused by a READ memory access.
==346473==Hint: this fault was caused by a dereference of a high value address (see register values below). Disassemble the provided pc to learn which register was used.
#0 0x55f328952a4a in address_space_to_flatview /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/exec/memory.h:1108:12
#1 0x55f328952a4a in address_space_translate /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/exec/memory.h:2797:31
#2 0x55f328952a4a in allocation_tag_mem /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/../../target/arm/tcg/mte_helper.c:176:10
#3 0x55f32895366c in helper_stgm /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/../../target/arm/tcg/mte_helper.c:461:15
#4 0x7f782431a293 (<unknown module>)
It's also not clear that the KVM logic is correct either:
MTE defaults to on there, rather than being only on if the
board wants it on.
Revert the whole commit for now so we can sort out the issues.
(We didn't catch this in CI because we have no test cases in
avocado that use guests with MTE support.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230519145808.348701-1-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
According to PCIe Address Translation Services specification 5.1.3.,
ATS Control Register has Enable bit to enable/disable ATS. Guest may
enable/disable PCI ATS and, accordingly, Device-TLB for the VirtIO PCI
device. So, raise/lower a flag and call a trigger function to pass this
event to a device implementation.
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Message-Id: <20230512135122.70403-2-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Unlike pam_update() which takes the subject -- PAMMemoryRegion -- as
first argument, init_pam() takes it as fifth (!) argument. This makes it
quite hard to figure out what an init_pam() invocation actually
initializes. By moving the subject to the front this should become
clearer.
While at it, lower the DeviceState parameter to Object, also
communicating more clearly that this parameter is just the owner rather
than some (heavy?) dependency.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213162004.2797-8-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
sysbus_add_io() just wraps memory_region_add_subregion() while also
obscuring where the memory is attached. So use
memory_region_add_subregion() directly and attach it to the existing
memory region s->mch.address_space_io which is set as an alias to
get_system_io() by the q35 machine.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
sysbus_add_io() just wraps memory_region_add_subregion() while also
obscuring where the memory is attached. So use
memory_region_add_subregion() directly and attach it to the existing
memory region s->bus->address_space_io which is set as an alias to
get_system_io() by the pc machine.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
1. The vIOMMU support will make vDPA can work in IOMMU mode. This
will fix security issues while using the no-IOMMU mode.
To support this feature we need to add new functions for IOMMU MR adds and
deletes.
Also since the SVQ does not support vIOMMU yet, add the check for IOMMU
in vhost_vdpa_dev_start, if the SVQ and IOMMU enable at the same time
the function will return fail.
2. Skip the iova_max check vhost_vdpa_listener_skipped_section(). While
MR is IOMMU, move this check to vhost_vdpa_iommu_map_notify()
Verified in vp_vdpa and vdpa_sim_net driver
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230510054631.2951812-5-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The commit 93a97dc520 ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.
The entire final commit is unneeded as device system already enable or
disable the feature properly.
This reverts commit 93a97dc520.
Fixes: 93a97dc520 ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's just support 512 memslots on x86-64 and aarch64 as well. The maximum
number of ACPI slots (256) is no longer completely expressive ever since
we supported virtio-based memory devices. Further, we're completely
ignoring other memslots used outside of memory device context, such as
memslots used for boot memory.
Note that the vhost memslot limit in the kernel is usually configured to
be 509. With this change, we prepare vhost-user on the QEMU side to be
closer to that limit, to eventually support ~512 memslots in most vhost
implementations and have less "surprises" when cold/hotplugging vhost
devices while also consuming more memslots than we're currently used to
by memory devices (e.g., once virtio-mem starts using multiple memslots).
Note that most vhost-user implementations only support a small number of
memslots so far, which we can hopefully improve in the near future.
We'll leave the PPC special-case as is for now.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230503184144.808478-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Allowing guests to read unplugged memory simplified the bring-up of
virtio-mem in Linux guests -- which was limited to x86-64 only. On arm64
(which was added later), we never had legacy guests and don't even allow
to configure it, essentially always having "unplugged-inaccessible=on".
At this point, all guests we care about
should be supporting VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, so let's
change the default for the 8.1 machine.
This change implies that also memory that supports the shared zeropage
(private anonymous memory) will now require
VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE in the driver in order to be usable by
the guest -- as default, one can still manually set the
unplugged-inaccessible property.
Disallowing the guest to read unplugged memory will be important for
some future features, such as memslot optimizations or protection of
unplugged memory, whereby we'll actually no longer allow the guest to
even read from unplugged memory.
At some point, we might want to deprecate and remove that property.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230503182352.792458-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since it's implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK
set for machine types < 8.0 will cause migration to fail if the target
QEMU version is < 8.0.0 :
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load e1000e:parent_obj
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
qemu-system-x86_64: load of migration failed: Invalid argument
The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
with this cmdline:
./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]
In order to fix this, property x-pcie-err-unc-mask was introduced to
control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
default, but is disabled if machine type <= 7.2.
Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Setting the VIRTIO Device Status Field to 0 resets the device. The
device's state is lost, including the vring configuration.
vhost-user.c currently sends SET_STATUS 0 before GET_VRING_BASE. This
risks confusion about the lifetime of the vhost-user state (e.g. vring
last_avail_idx) across VIRTIO device reset.
Eugenio Pérez <eperezma@redhat.com> adjusted the order for vhost-vdpa.c
in commit c3716f260b ("vdpa: move vhost reset after get vring base")
and in that commit description suggested doing the same for vhost-user
in the future.
Go ahead and adjust vhost-user.c now. I ran various online code searches
to identify vhost-user backends implementing SET_STATUS. It seems only
DPDK implements SET_STATUS and Yajun Wu <yajunw@nvidia.com> has
confirmed that it is safe to make this change.
Fixes: commit 923b8921d2 ("vhost-user: Support vhost_dev_start")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
Cc: Yajun Wu <yajunw@nvidia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501230409.274178-1-stefanha@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit enables each CXL Type-3 device to contain one volatile
memory region and one persistent region.
Two new properties have been added to cxl-type3 device initialization:
[volatile-memdev] and [persistent-memdev]
The existing [memdev] property has been deprecated and will default the
memory region to a persistent memory region (although a user may assign
the region to a ram or file backed region). It cannot be used in
combination with the new [persistent-memdev] property.
Partitioning volatile memory from persistent memory is not yet supported.
Volatile memory is mapped at DPA(0x0), while Persistent memory is mapped
at DPA(vmem->size), per CXL Spec 8.2.9.8.2.0 - Get Partition Info.
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421160827.2227-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The failure paths in CDAT file loading did not clear up properly.
Change to using g_auto_free and a local pointer for the buffer to
ensure this function has no side effects on error.
Also drop some unnecessary checks that can not fail.
Cleanup properly after a failure to load a CDAT file.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421132020.7408-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU invokes vhost_svq_add() when adding a guest's element
into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots()
to check whether QEMU can add the element into SVQ. If there is
enough space, then QEMU combines some out descriptors and some
in descriptors into one descriptor chain, and adds it into
`svq->vring.desc` by vhost_svq_vring_write_descs().
Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx`
in vhost_svq_available_slots() returns the number of occupied elements,
or the number of descriptor chains, instead of the number of occupied
descriptors, which may cause wrapping in SVQ descriptor ring.
Here is an example. In vhost_handle_guest_kick(), QEMU forwards
as many available buffers to device by virtqueue_pop() and
vhost_svq_add_element(). virtqueue_pop() returns a guest's element,
and then this element is added into SVQ by vhost_svq_add_element(),
a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and
vhost_svq_add_element() `svq->vring.num` times,
vhost_svq_available_slots() thinks QEMU just ran out of slots and
everything should work fine. But in fact, virtqueue_pop() returns
`svq->vring.num` elements or descriptor chains, more than
`svq->vring.num` descriptors due to guest memory fragmentation,
and this causes wrapping in SVQ descriptor ring.
This bug is valid even before marking the descriptors used.
If the guest memory is fragmented, SVQ must add chains
so it can try to add more descriptors than possible.
This patch solves it by adding `num_free` field in
VhostShadowVirtqueue structure and updating this field
in vhost_svq_add() and vhost_svq_get_buf(), to record
the number of free descriptors.
Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230509084817.3973-1-yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Hexagon update
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEENjXHiM5iuR/UxZq0ewJE+xLeRCIFAmRmgQgACgkQewJE+xLe
# RCJLtAf8C/0kQRa4mjnbsztXuFyca53UxAv3BSBEDla4ZcMfFBoVJsGB3OP7IPXd
# KBQpkLyJAVye9idex5xqdp9nIfoGKDTsc6YtCfGujZ17cDpzLRDpHdUTex8PcZYK
# wpfM3hoVJsYRBMsojZ4OaxatjFQ+FWzrIH6FcgH086Q8TH4w9dZLNEJzHC4lOj0s
# 7qOuw2tgm+vOVlzsk/fv6/YD/BTeZTON3jgTPvAnvdRLb/482UpM9JkJ8E4rbte3
# Ss5PUK8QTQHU0yamspGy/PfsYxiptM+jIWGd836fAGzwF12Ug27mSc1enndRtQVW
# pQTdnOnWuuRzOwEpd7x3xh9upACm4g==
# =1CyJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 May 2023 12:48:24 PM PDT
# gpg: using RSA key 3635C788CE62B91FD4C59AB47B0244FB12DE4422
# gpg: Good signature from "Taylor Simpson (Rock on) <tsimpson@quicinc.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3635 C788 CE62 B91F D4C5 9AB4 7B02 44FB 12DE 4422
* tag 'pull-hex-20230518-1' of https://github.com/quic/qemu: (44 commits)
Hexagon (linux-user/hexagon): handle breakpoints
Hexagon (gdbstub): add HVX support
Hexagon (gdbstub): fix p3:0 read and write via stub
Hexagon: add core gdbstub xml data for LLDB
gdbstub: add test for untimely stop-reply packets
gdbstub: only send stop-reply packets when allowed to
Remove test_vshuff from hvx_misc tests
Hexagon (decode): look for pkts with multiple insns at the same slot
Hexagon (iclass): update J4_hintjumpr slot constraints
Hexagon: append eflags to unknown cpu model string
Hexagon: list available CPUs with `-cpu help`
Hexagon (target/hexagon/*.py): raise exception on reg parsing error
target/hexagon: fix = vs. == mishap
Hexagon (target/hexagon) Additional instructions handled by idef-parser
Hexagon (target/hexagon) Move items to DisasContext
Hexagon (target/hexagon) Move pkt_has_store_s1 to DisasContext
Hexagon (target/hexagon) Move pred_written to DisasContext
Hexagon (target/hexagon) Move new_pred_value to DisasContext
Hexagon (target/hexagon) Move new_value to DisasContext
Hexagon (target/hexagon) Make special new_value for USR
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
GDB's remote serial protocol allows stop-reply messages to be sent by
the stub either as a notification packet or as a reply to a GDB command
(provided that the cmd accepts such a response). QEMU currently does not
implement notification packets, so it should only send stop-replies
synchronously and when requested. Nevertheless, it still issues
unsolicited stop messages through gdb_vm_state_change().
Although this behavior doesn't seem to cause problems with GDB itself
(the messages are just ignored), it can impact other debuggers that
implement the GDB remote serial protocol, like hexagon-lldb. Let's
change the gdbstub to send stop messages only as a response to a
previous GDB command that accepts such a reply.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <a49c0897fc22a6a7827c8dfc32aef2e1d933ec6b.1683214375.git.quic_mathbern@quicinc.com>
Each slot in a packet can be assigned to at most one instruction.
Although the assembler generally ought to enforce this rule, we better
be safe than sorry and also do some check to properly throw an "invalid
packet" exception on wrong slot assignments.
This should also make it easier to debug possible future errors caused
by missing updates to `find_iclass_slots()` rules in
target/hexagon/iclass.c.
Co-authored-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <f8b829443523568823d062adf8bf6659bc6d4a3f.1683552984.git.quic_mathbern@quicinc.com>
The Hexagon PRM says that "The assembler automatically encodes
instructions in the packet in the proper order. In the binary encoding
of a packet, the instructions must be ordered from Slot 3 down to
Slot 0."
Prior to the architecture version v73, the slot constraints from
instruction "hintjr" only allowed it to be executed at slot 2.
With that in mind, consider the packet:
{
hintjr(r0)
nop
nop
if (!p0) memd(r1+#0) = r1:0
}
To satisfy the ordering rule quoted from the PRM, the assembler would,
thus, move one of the nops to the first position, so that it can be
assigned to slot 3 and the subsequent hintjr to slot 2.
However, since v73, hintjr can be executed at either slot 2 or 3. So
there is no need to reorder that packet and the assembler will encode it
as is. When QEMU tries to execute it, however, we end up hitting a
"misaliged store" exception because both the store and the hintjr will
be assigned to store 0, and some functions like `slot_is_predicated()`
expect the decode machinery to assign only one instruction per slot. In
particular, the mentioned function will traverse the packet until it
finds the first instruction at the desired slot which, for slot 0, will
be hintjr. Since hintjr is not predicated, the result is that we try to
execute the store regardless of the predicate. And because the predicate
is false, we had not previously loaded hex_store_addr[0] or
hex_store_width[0]. As a result, the store will decide de width based on
trash memory, causing it to be misaligned.
Update the slot constraints for hintjr so that QEMU can properly handle
such encodings.
Note: to avoid similar-but-not-identical issues in the future, we should
look for multiple instructions at the same slot during decoding time and
throw an invalid packet exception. That will be done in the subsequent
commit.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <0fcd8293642c6324119fbbab44741164bcbd04fb.1673616964.git.quic_mathbern@quicinc.com>
Currently, qemu-hexagon only models the v67 cpu. Nonetheless if we try
to get this information with `-cpu help`, qemu just exists with an error
code and no output. Let's correct that.
The code is basically a copy from target/alpha/cpu.h, but we strip the
"-hexagon-cpu" suffix before printing. This is to avoid confusing
situations like the following:
$ qemu-hexagon -cpu help
Available CPUs:
v67-hexagon-cpu
$ qemu-hexagon -cpu v67-hexagon-cpu ./prog
qemu-hexagon: unable to find CPU model 'v67-hexagon-cpu'
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <b946e17c7e17eed9095700b54c5ead36e5d55dfa.1683225804.git.quic_mathbern@quicinc.com>
Currently, the python scripts used for the hexagon building will not
abort the compilation when there is an error parsing a register. Let's
make the compilation properly fail in such cases by rasing an exception
instead of just printing a warning message, which might get lost in the
output.
This patch was generated with:
git grep -l "Bad register" *hexagon* | \
xargs sed -i "" -e 's/print("Bad register parse: "[, ]*\([^)]*\))/hex_common.bad_register(\1)/g'
Plus the bad_register() helper added to hex_common.py.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1f5dbd92f68fdd89e2647e4ba527a2c32cf0f070.1683217043.git.quic_mathbern@quicinc.com>
**** Changes in v3 ****
Fix bugs exposed by dpmpyss_rnd_s0 instruction
Set correct size/signedness for constants
Test cases added to tests/tcg/hexagon/misc.c
**** Changes in v2 ****
Fix bug in imm_print identified in clang build
Currently, idef-parser skips all floating point instructions. However,
there are some floating point instructions that can be handled.
The following instructions are now parsed
F2_sfimm_p
F2_sfimm_n
F2_dfimm_p
F2_dfimm_n
F2_dfmpyll
F2_dfmpylh
To make these instructions work, we fix some bugs in parser-helpers.c
gen_rvalue_extend
gen_cast_op
imm_print
lexer properly sets size/signedness of constants
Test cases added to tests/tcg/hexagon/fpstuff.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230501203125.4025991-1-tsimpson@quicinc.com>
The following items in the CPUHexagonState are only used for bookkeeping
within the translation of a packet. With recent changes that eliminate
the need to free TCGv variables, these make more sense to be transient
and kept in DisasContext.
The following items are moved
dczero_addr
branch_taken
this_PC
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-22-tsimpson@quicinc.com>
The pkt_has_store_s1 field is only used for bookkeeping helpers with
a load. With recent changes that eliminate the need to free TCGv
variables, it makes more sense to make this transient.
These helpers already take the instruction slot as an argument. We
combine the slot and pkt_has_store_s1 into a single argument called
slotval.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-21-tsimpson@quicinc.com>
The following have overrides
S2_insert
S2_insert_rp
S2_asr_r_svw_trun
A2_swiz
These instructions have semantics that write to the destination
before all the operand reads have been completed. Therefore,
the idef-parser versions were disabled with the short-circuit patch.
Test cases added to tests/tcg/hexagon/read_write_overlap.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-16-tsimpson@quicinc.com>
The generated helpers for HVX use pass-by-reference, so they can't
short-circuit when the reads/writes overlap. The instructions with
overrides are OK because they use tcg_gen_gvec_*.
We add a flag has_hvx_helper to DisasContext and extend gen_analyze_funcs
to set the flag when the instruction is an HVX instruction with a
generated helper.
We add an override for V6_vcombine so that it can be short-circuited
along with a test case in tests/tcg/hexagon/hvx_misc.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-15-tsimpson@quicinc.com>
In certain cases, we can avoid the overhead of writing to future_VRegs
and write directly to VRegs. We consider HVX reads/writes when computing
ctx->need_commit. Then, we can early-exit from gen_commit_hvx.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-14-tsimpson@quicinc.com>
In certain cases, we can avoid the overhead of writing to hex_new_pred_value
and write directly to hex_pred. We consider predicate reads/writes when
computing ctx->need_commit. The get_result_pred() function uses this
field to decide between hex_new_pred_value and hex_pred. Then, we can
early-exit from gen_pred_writes.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-13-tsimpson@quicinc.com>
In certain cases, we can avoid the overhead of writing to hex_new_value
and write directly to hex_gpr. We add need_commit field to DisasContext
indicating if the end-of-packet commit is needed. If it is not needed,
get_result_gpr() and get_result_gpr_pair() can return hex_gpr.
We pass the ctx->need_commit to helpers when needed.
Finally, we can early-exit from gen_reg_writes during packet commit.
There are a few instructions whose semantics write to the result before
reading all the inputs. Therefore, the idef-parser generated code is
incompatible with short-circuit. We tell idef-parser to skip them.
For debugging purposes, we add a cpu property to turn off short-circuit.
When the short-circuit property is false, we skip the analysis and force
the end-of-packet commit.
Here's a simple example of the TCG generated for
0x004000b4: 0x7800c020 { R0 = #0x1 }
BEFORE:
---- 004000b4
movi_i32 new_r0,$0x1
mov_i32 r0,new_r0
AFTER:
---- 004000b4
movi_i32 r0,$0x1
This patch reintroduces a use of check_for_attrib, so we remove the
G_GNUC_UNUSED added earlier in this series.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20230427230012.3800327-12-tsimpson@quicinc.com>
When generating TCG, make sure we have read all the operand registers
before writing to the destination registers.
This is a prerequesite for short-circuiting where the source and dest
operands could be the same.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-10-tsimpson@quicinc.com>
Only endloop instructions will conditionally write to a predicate.
When there is an endloop instruction, we preload the values into
new_pred_value.
The only place pred_written is needed is when HEX_DEBUG is on.
We remove the last use of check_for_attrib. However, new uses will be
introduced later in this series, so we mark it with G_GNUC_UNUSED.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-9-tsimpson@quicinc.com>
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
The following instructions are overriden
S2_cabacdecbin
SA1_cmpeqi
Remove the log_pred_write function from op_helper.c
Remove references in macros.h
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-8-tsimpson@quicinc.com>
The following instructions are tested
V6_vasrvuhubrndsat
V6_vasrvuhubsat
V6_vasrvwuhrndsat
V6_vasrvwuhsat
V6_vassign_tmp
V6_vcombine_tmp
V6_vmpyuhvs
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-8-tsimpson@quicinc.com>
The following instructions are added
L2_loadw_aq
L4_loadd_aq
R6_release_at_vi
R6_release_st_vi
S2_storew_rl_at_vi
S4_stored_rl_at_vi
S2_storew_rl_st_vi
S4_stored_rl_st_vi
The release instructions are nop's in qemu. The others behave as
loads/stores.
The encodings for these instructions changed some "don't care" bits
L2_loadw_locked
L4_loadd_locked
S2_storew_locked
S4_stored_locked
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-3-tsimpson@quicinc.com>
Add support for the ELF flags
Move target/hexagon/cpu.[ch] to be v73
Change the compiler flag used by "make check-tcg"
The decbin instruction is removed in Hexagon v73, so check the
version before trying to compile the instruction.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-2-tsimpson@quicinc.com>
Migration Pull request
Hi
Based on latest reviewed parts of migration:
- Disable colo (vladimir)
- Migration atomic counters (juan)
Please apply.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRmXJUACgkQ9IfvGFhy
# 1yNRAxAAjDYJELL34Qovt/WE9qKhYJEvIUGTl1IMWJ22YMFnqIFKRdka57dWoU3P
# 7EK1BHmokEEtzGT7Fe1ecERXsOwQIJDIkDTJ5g8Oc8Jt1iqY1AC8h5T+LghijCar
# mbZ6qWHaSjsg2lmek/xc9quymzFGGK36PSyB5WkaLRviKQn4RIkEDpUaWny7nDbA
# Q8zJJpBqNFqKfC5/DN0ePa3QQscXQJhey3nxqFd8hYp8RFNIV5UJVW5Lf6ombtK7
# atgdWC4ckkfO2z3OsghKeo/UiMFWpPktgBVVMhDLmk+P/E6czc2gfzD6SCvrPKTj
# XowI8hro22HVmq9bEY8PtbjMOfpxrAxer+tM2KR/0O9l3UzUacFsi7KGqCJ1/trQ
# 1tSDjlgyczb8GOgLwwxj8XE+jPHPfVrzCNfDqrBKBNxz6nnZSdZUwhV5mG8FdVtm
# oVVV96BIrNXLl/lIxYIFD/Zyvl8/lrSWQdLkEHTzihYQeXaQfyvPVbV/dOLT4sii
# YUuGCuEhF+DW/qz43G1krwq5/bfxsiZoQzrMV/Odtf0wYQKkabA3KNBIda/vxBCR
# dsLQ7QtmOwKmCzjqw4LUov9vDNYOYr98o7ZqwJ3qeKL4QgFwtEZUFO3VW6UR8fnF
# arVXiTn9wVlkTpu4sT5hLm9400iadhX4Fppji7Ce0tUpLbWbghA=
# =3x32
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 May 2023 10:12:53 AM PDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20230518-pull-request' of https://gitlab.com/juan.quintela/qemu:
migration: Fix duplicated included in meson.build
migration/multifd: Compute transferred bytes correctly
migration: We don't need the field rate_limit_used anymore
migration: Use migration_transferred_bytes() to calculate rate_limit
migration: Add a trace for migration_transferred_bytes
migration: Move migration_total_bytes() to migration-stats.c
migration: Move rate_limit_max and rate_limit_used to migration_stats
qemu-file: Account for rate_limit usage on qemu_fflush()
migration: Don't use INT64_MAX for unlimited rate
migration: process_incoming_migration_co(): move colo part to colo
migration: split migration_incoming_co
configure: add --disable-colo-proxy option
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the past, we had to put the in the main thread all the operations
related with sizes due to qemu_file not beeing thread safe. As now
all counters are atomic, we can update the counters just after the
do the write. As an aditional bonus, we are able to use the right
value for the compression methods. Right now we were assuming that
there were no compression at all.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230515195709.63843-17-quintela@redhat.com>
Since previous commit, we calculate how much data we have send with
migration_transferred_bytes() so no need to maintain this counter and
remember to always update it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230515195709.63843-10-quintela@redhat.com>
These way we can make them atomic and use this functions from any
place. I also moved all functions that use rate_limit to
migration-stats.
Functions got renamed, they are not qemu_file anymore.
qemu_file_rate_limit -> migration_rate_exceeded
qemu_file_set_rate_limit -> migration_rate_set
qemu_file_get_rate_limit -> migration_rate_get
qemu_file_reset_rate_limit -> migration_rate_reset
qemu_file_acct_rate_limit -> migration_rate_account.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515195709.63843-6-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Let's make better public interface for COLO: instead of
colo_process_incoming_thread and not trivial logic around creating the
thread let's make simple colo_incoming_co(), hiding implementation from
generic code.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515130640.46035-4-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Originally, migration_incoming_co was introduced by
25d0c16f62
"migration: Switch to COLO process after finishing loadvm"
to be able to enter from COLO code to one specific yield point, added
by 25d0c16f62.
Later in 923709896b
"migration: poll the cm event for destination qemu"
we reused this variable to wake the migration incoming coroutine from
RDMA code.
That was doubtful idea. Entering coroutines is a very fragile thing:
you should be absolutely sure which yield point you are going to enter.
I don't know how much is it safe to enter during qemu_loadvm_state()
which I think what RDMA want to do. But for sure RDMA shouldn't enter
the special COLO-related yield-point. As well, COLO code doesn't want
to enter during qemu_loadvm_state(), it want to enter it's own specific
yield-point.
As well, when in 8e48ac9586
"COLO: Add block replication into colo process" we added
bdrv_invalidate_cache_all() call (now it's called activate_all())
it became possible to enter the migration incoming coroutine during
that call which is wrong too.
So, let't make these things separate and disjoint: loadvm_co for RDMA,
non-NULL during qemu_loadvm_state(), and colo_incoming_co for COLO,
non-NULL only around specific yield.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515130640.46035-3-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
* kvm: enable dirty ring for arm64
* target/i386: new features
* target/i386: AVX fixes
* configure: create a python venv unconditionally
* meson: bump to 0.63.0 and move tests from configure
* meson: Pass -j option to sphinx
* drop support for Python 3.6
* fix check-python-tox
* fix "make clean" in the source directory
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRmDYQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOXSwf/WKmYPe09yHfxfVSFsSz83QpB3e+f
# KJx6FdyMMt26ZQJpcqorobrDV23R8FyxngXPkwoxqobAEtXB/AH0/S/u8RUZ46Qt
# IrF8FXr4ZdyLW7CW6nmIejmlul0iRmFD7D98E6dZ3QXfype3Ifra7gG74spZ1B44
# ZNvaomJKUK8Ga8rbChs9KtgrxlOC5q8IfTWF5ZExmZszPC9NRnZmU5Oncnuwek9T
# Ic6zDPoAeF3jDtovZhxg1HAB9e/ENZX/V9NjO92yZa8u/TITQ88l4tJctf7uiLxO
# 2oGY12ln8i//pbjyUe4iM+bNh5+reAChEI8iv7WxEsj9s2HBUJ68f3tpbQ==
# =Zg00
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 May 2023 04:35:32 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (68 commits)
docs/devel: update build system docs
configure: remove unnecessary check
configure: reorder option parsing code
configure: remove unnecessary mkdir
configure: do not rerun the tests with -Werror
configure: remove compiler sanity check
build: move --disable-debug-info to meson
build: move compiler version check to meson
build: move remaining compiler flag tests to meson
build: move warning flag selection to meson
build: move stack protector flag selection to meson
build: move coroutine backend selection to meson
build: move SafeStack tests to meson
build: move sanitizer tests to meson
meson: prepare move of QEMU_CFLAGS to meson
configure, meson: move --enable-modules to Meson
configure: remove pkg-config functions
build: move glib detection and workarounds to meson
meson: drop unnecessary declare_dependency()
meson: add more version numbers to the summary
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
configure is only doing compiler and host setup now, so adjust the
relevant documentation. It is also possible to build emulators with
ninja directly if one is so inclined, so mention that as well.
The Python virtual environment set up is a new major task of configure
as well. Mention it in the list of produced files, while leaving it
for a future patch to document how it works and how ``mkvenv ensure``
is used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All calls to probe_target_compiler are conditioned on
some "have_target" invocation, or inside a loop on target_list.
Therefore there is no issue with building unnecessary
firmware images and tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move some variable assignments around for clarity and to remove
one of three loops on the command line arguments.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tests run in configure are pretty trivial at this point, so
do not bother with the extra complication of running tests
both with and without -Werror.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The comment is not correct anymore, in that the usability test for
the compiler and linker are done after probing $cpu, and Meson will
redo them anyway.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the slighly nicer .version_compare() function for GCC; for Clang that is
not possible due to the mess that Apple does with version numbers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the only remaining uses of QEMU_CFLAGS. Now that no
feature tests are done in configure, it is possible to remove
CONFIGURE_CFLAGS and CONFIGURE_LDFLAGS as well.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Meson already knows to test with the positive form of the flag, which
simplifies the test. Warnings are now tested explicitly for the C++
compiler, instead of hardcoding those that are only available for
the C language.
At this point all compiler flags in QEMU_CFLAGS are global and only
depend on the OS. No feature tests are performed in configure.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert the u2f.txt file to rST, and place it in the right place
in our manual layout. The old text didn't fit very well into our
manual style, so the new version ends up looking like a rewrite,
although some of the original text is preserved:
* the 'building' section of the old file is removed, since we
generally assume that users have already built QEMU
* some rather verbose text has been cut back
* document the passthrough device first, on the assumption
that's most likely to be of interest to users
* cut back on the duplication of text between sections
* format example command lines etc with rST
As it's a short document it seemed simplest to do this all
in one go rather than try to do a minimal syntactic conversion
and then clean up the wording and layout.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230421163734.1152076-1-peter.maydell@linaro.org
In the vexpress board code, we allocate a new MemoryRegion at the top
of vexpress_common_init() but only set it up and use it inside the
"if (map[VE_NORFLASHALIAS] != -1)" conditional, so we leak it if not.
This isn't a very interesting leak as it's a tiny amount of memory
once at startup, but it's easy to fix.
We could silence Coverity simply by moving the g_new() into the
if() block, but this use of g_new(MemoryRegion, 1) is a legacy from
when this board model was originally written; we wouldn't do that
if we wrote it today. The MemoryRegions are conceptually a part of
the board and must not go away until the whole board is done with
(at the end of the simulation), so they belong in its state struct.
This machine already has a VexpressMachineState struct that extends
MachineState, so statically put the MemoryRegions in there instead of
dynamically allocating them separately at runtime.
Spotted by Coverity (CID 1509083).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230512170223.3801643-3-peter.maydell@linaro.org
The IMPDEF sysreg L2CTLR_EL1 found on the Cortex-A35, A53, A57, A72
and which we (arguably dubiously) also provide in '-cpu max' has a
2 bit field for the number of processors in the cluster. On real
hardware this must be sufficient because it can only be configured
with up to 4 CPUs in the cluster. However on QEMU if the board code
does not explicitly configure the code into clusters with the right
CPU count we default to "give the value assuming that all CPUs in
the system are in a single cluster", which might be too big to fit
in the field.
Instead of just overflowing this 2-bit field, saturate to 3 (meaning
"4 CPUs", so at least we don't overwrite other fields in the register.
It's unlikely that any guest code really cares about the value in
this field; at least, if it does it probably also wants the system
to be more closely matching real hardware, i.e. not to have more
than 4 CPUs.
This issue has been present since the L2CTLR was first added in
commit 377a44ec8f back in 2014. It was only noticed because
Coverity complains (CID 1509227) that the shift might overflow 32 bits
and inadvertently sign extend into the top half of the 64 bit value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512170223.3801643-2-peter.maydell@linaro.org
Convert the exception-return insns ERET, ERETA and ERETB to
decodetree. These were the last insns left in the legacy
decoder function disas_uncond_reg_b(), which allows us to
remove it.
The old decoder explicitly decoded the DRPS instruction,
only in order to call unallocated_encoding() on it, exactly
as would have happened if it hadn't decoded it. This is
because this insn always UNDEFs unless the CPU is in
halting-debug state, which we don't emulate. So we list
the pattern in a comment in a64.decode, but don't actively
decode it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-21-peter.maydell@linaro.org
Convert the last four BR-with-pointer-auth insns to decodetree.
The remaining cases in the outer switch in disas_uncond_b_reg()
all return early rather than leaving the case statement, so we
can delete the now-unused code at the end of that function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-20-peter.maydell@linaro.org
The SVE and SME decode is already done by decodetree. Pull the calls
to these decoders out of the legacy decoder. This doesn't change
behaviour because all the patterns in sve.decode and sme.decode
already require the bits that the legacy decoder is decoding to have
the correct values.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-4-peter.maydell@linaro.org
The A64 translator uses a hand-written decoder for everything except
SVE or SME. It's fairly well structured, but it's becoming obvious
that it's still more painful to add instructions to than the A32
translator, because putting a new instruction into the right place in
a hand-written decoder is much harder than adding new instruction
patterns to a decodetree file.
As the first step in conversion to decodetree, create the skeleton of
the decodetree decoder; where it does not handle instructions we will
fall back to the legacy decoder (which will be for everything at the
moment, since there are no patterns in a64.decode).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-3-peter.maydell@linaro.org
Extend the 'mte' property for the virt machine to cover KVM as
well. For KVM, we don't allocate tag memory, but instead enable the
capability.
If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230428095533.21747-2-cohuck@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To simplify the code, rename coroutine-win32.c to match the option
passed to configure.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This disables the old behavior of detecting SafeStack from environment
CFLAGS. SafeStack is now enabled purely based on the configure arguments.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clean up the handling of compiler flags in meson.build, splitting
the general flags that should be included in subprojects as well,
from warning flags that only apply to QEMU itself. The two were
mixed in both configure tests and meson tests.
This split makes it easier to move the compiler tests piecewise
from configure to Meson.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU adds the path to glib.h to all compilation commands. This is simpler
due to the pervasive use of static_library, and was grandfathered in from
the previous Make-based build system. Until Meson 0.63 the only way to
do this was to detect glib in configure and use add_project_arguments,
but now it is possible to use add_project_dependencies instead.
gmodule is detected in a separate variable, with export enabled for
modules and disabled for plugin.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The libvfio_user_dep variable of subprojects/libvfio-user/lib/meson.build
is already a dependency, so there is no need to wrap it with another
declare_dependency().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever declare_dependency is used to add some compile flags or dependent
libraries to the outcome of dependency(), the version of the original
dependency is dropped in the summary. Make sure that declare_dependency()
has a version argument in those cases.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After static_kwargs has been changed to an empty dictionary, it has
no functional effect and can be removed.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The option is new in Meson 0.63 and removes the need to pass "static:
true" to all dependency and find_library invocation. Actually cleaning
up the invocations is left for a separate patch.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This version allows cleanups in modinfo collection, but they only
work with Ninja 1.9.x and 1.8.x is still supported. It also supports the
equivalent of QEMU's --static option to configure.
The wheel file is bumped to 0.63.3, the last release in the 0.63 branch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The version of pyflakes that is listed in python/tests/minreqs.txt
breaks on Python 3.8 with the following message:
AttributeError: 'FlakesChecker' object has no attribute 'CONSTANT'
Now that we do not support EOL'd Python versions anymore, we can
update to newer, fixed versions. It is a good time to do so, before
Python packages start dropping support for Python 3.7 as well!
The new mypy is also a bit smarter about which packages are actually
being used, so remove the now-unnecessary sections from setup.cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-27-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Python 3.6 was EOL 2021-12-31. Newer versions of upstream libraries have
begun dropping support for this version and it is becoming more
cumbersome to support. Avocado-framework and qemu.qmp each have their
own reasons for wanting to drop Python 3.6, but won't until QEMU does.
Versions of Python available in our supported build platforms as of today,
with optional versions available in parentheses:
openSUSE Leap 15.4: 3.6.15 (3.9.10, 3.10.2)
CentOS Stream 8: 3.6.8 (3.8.13, 3.9.16)
CentOS Stream 9: 3.9.13
Fedora 36: 3.10
Fedora 37: 3.11
Debian 11: 3.9.2
Alpine 3.14, 3.15: 3.9.16
Alpine 3.16, 3.17: 3.10.10
Ubuntu 20.04 LTS: 3.8.10
Ubuntu 22.04 LTS: 3.10.4
NetBSD 9.3: 3.9.13*
FreeBSD 12.4: 3.9.16
FreeBSD 13.1: 3.9.16
OpenBSD 7.2: 3.9.16
Note: Our VM tests install 3.9 explicitly for FreeBSD and 3.10 for
NetBSD; the default for "python" or "python3" in FreeBSD is
3.9.16. NetBSD does not appear to have a default meta-package, but
offers several options, the lowest of which is 3.7.15. "python39"
appears to be a pre-requisite to one of the other packages we request in
tests/vm/netbsd. pip, ensurepip and other Python essentials are
currently only available for Python 3.10 for NetBSD.
CentOS and OpenSUSE support parallel installation of multiple Python
interpreters, and binaries in /usr/bin will always use Python 3.6. However,
the newly introduced support for virtual environments ensures that all build
steps that execute QEMU Python code use a single interpreter.
Since it is safe to under our supported platform policy, bump our
minimum supported version of Python to 3.7.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230511035435.734312-24-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the event that there's no vendored source present and no sufficient
version of $package can be found, we will attempt to connect to PyPI to
install the package if '--disable-pypi' was not passed.
This means that PyPI access is "enabled by default", but there are some
subtleties that make this action occur much less frequently than you
might imagine:
(1) While --enable-pypi is the default, vendored source will always be
preferred when found, making PyPI a fallback. This should ensure
that configure-time venv building "just works" for almost everyone
in almost every circumstance.
(2) Because meson source is, at time of writing, vendored directly into
qemu.git, PyPI will never be used for sourcing meson.
(3) Because Sphinx is an optional dependency, if docs are set to "auto",
PyPI will not be used to obtain Sphinx source as a fallback and
instead docs will be disabled. If PyPI sourcing of sphinx is
desired, --enable-docs should be passed to force the lookup. I chose
this as the default behavior to avoid adding new internet lookups to
a "default" invocation of configure.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-23-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When docs are explicitly requested, require Sphinx>=1.6.0. When docs are
explicitly disabled, don't bother to check for Sphinx at all. If docs
are set to "auto", attempt to locate Sphinx, but continue onward if it
wasn't located.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-22-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move this option back from meson into configure for the purposes of
using the configuration value to bootstrap Sphinx in different ways
based on this value.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-21-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch changes how the avocado tests are provided, ever so
slightly. Instead of creating a new testing venv, use the
configure-provided 'pyvenv' instead, and install optional packages into
that.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-20-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This commit changes how we detect and install meson. It notably removes
'--meson='.
Currently, configure creates a lightweight Python virtual environment
unconditionally using the user's configured $python that inherits system
packages. Temporarily, we forced the use of meson source present via git
submodule or in the release tarball.
With this patch, we restore the ability to use a system-provided meson:
If Meson is installed in the build venv and meets our minimum version
requirements, we will use that Meson. This includes a system provided
meson, which would be visible via system-site packages inside the venv.
In the event that Meson is installed but *not for the chosen Python
interpreter*, not found, or of insufficient version, we will attempt to
install Meson from vendored source into the newly created Python virtual
environment. This vendored installation replaces both the git submodule
and tarball source mechanisms for sourcing meson.
As a result of this patch, the Python interpreter we use for both our
own build scripts *and* Meson extensions are always known to be the
exact same Python. As a further benefit, there will also be a symlink
available in the build directory that points to the correct, configured
python and can be used by e.g. manual tests to invoke the correct,
configured Python unambiguously.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-18-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In preference to vendoring meson source, vendor a built distributable
("bdist" in python parlance). This has some benefits:
(1) We can get rid of a git submodule,
(2) Installing built meson into a venv doesn't require any extra
dependencies (the python "wheel" package, chiefly.)
(3) We don't treat meson any differently than we would any other python
package (we install it, end of story, done.)
(4) All future tarball *and* developer checkouts will function offline;
No git or PyPI connection needed to fetch meson.
Note that because mkvenv prefers vendored packages to PyPI, as mkvenv is
currently written we will never consult PyPI for meson. (Do keep in mind
that your distribution's meson will be preferred above the vendored
version, though.)
```
jsnow@scv ~/s/q/python (python-configure-venv)> python3 scripts/vendor.py
pip download --dest /home/jsnow/src/qemu/python/wheels --require-hashes -r /tmp/tmpvo5qav7i
Collecting meson==0.61.5
Using cached meson-0.61.5-py3-none-any.whl (862 kB)
Saved ./wheels/meson-0.61.5-py3-none-any.whl
Successfully downloaded meson
```
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-17-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch changes the configure script so that it always creates and
uses a python virtual environment unconditionally.
Meson bootstrapping is temporarily altered to force the use of meson
from git or vendored source (as packaged in our source tarballs). A
subsequent commit restores the use of distribution-vendored Meson.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-16-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a teeny-tiny script that just downloads any packages we want to
vendor from PyPI and stores them in qemu.git/python/wheels/. If I'm hit
by a meteor, it'll be easy to replicate what I have done in order to
udpate the vendored source.
We don't really care which python runs it; it exists as a meta-utility
with no external dependencies and we won't package or install it. It
will be monitored by the linters/type checkers, though; so it's
guaranteed safe on python 3.6+.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-15-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
NetBSD cannot successfully run "ensurepip" without access to the pyexpat
module, which NetBSD debundles. Like the Debian patch, it would be
strictly faster long term to install pip/setuptools, and I recommend
developers at their workstations take that approach instead.
For the purposes of a throwaway VM, there's not really a speed
difference for who is responsible for installing pip; us (needs
py310-pip) or Python (needs py310-expat).
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230511035435.734312-14-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several debian-based tests need the python3-venv dependency as a
consequence of Debian debundling the "ensurepip" module normally
included with Python.
As mkvenv.py stands as of this commit, Debian requires EITHER:
(A) setuptools and pip, or
(B) ensurepip
mkvenv is a few seconds faster if you have setuptools and pip, so
developers should prefer the first requirement. For the purposes of CI,
the time-save is a wash; it's only a matter of who is responsible for
installing pip and when; the timing is about the same.
Arbitrarily, I chose adding ensurepip to the test configuration because
it is normally part of the Python stdlib, and always having it allows us
a more consistent cross-platform environment.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230511035435.734312-12-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a workaround intended for Debian 10, where the debian-patched
pip does not function correctly if accessed from within a virtual
environment.
We don't support Debian 10 as a build platform any longer, though we do
still utilize it for our build-tricore-softmmu CI test. It's also
possible that this bug might appear on other derivative platforms and
this workaround may prove useful.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-11-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
distlib is usually not installed on Linux distribution, but it is vendored
into pip. Because the virtual environment has pip via ensurepip, we
can piggy-back on pip's vendored version. This could break if they move
our cheese in the future, but the fix would be simply to require distlib.
If it is debundled, as it is on msys, it is simply available directly.
Signed-off-by: John Snow <jsnow@redhat.com>
[Move to toplevel. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When creating a virtual environment that inherits system packages,
script entry points (like "meson", "sphinx-build", etc) are not
re-generated with the correct shebang. When you are *inside* of the
venv, this is not a problem, but if you are *outside* of it, you will
not have a script that engages the virtual environment appropriately.
Add a mechanism that generates new entry points for pre-existing
packages so that we can use these scripts to run "meson",
"sphinx-build", "pip", unambiguously inside the venv.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-9-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a routine that is designed to print some usable info for human
beings back out to the terminal if/when "mkvenv ensure" fails to locate
or install a package during configure time, such as meson or sphinx.
Since we are requiring that "meson" and "sphinx" are installed to the
same Python environment as QEMU is configured to build with, this can
produce some surprising failures when things are mismatched. This method
is here to try and ease that sting by offering some actionable
diagnosis.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-8-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This command is to be used to add various packages (or ensure they're
already present) into the configure-provided venv in a modular fashion.
Examples:
mkvenv ensure --online --dir "${source_dir}/python/wheels/" "meson>=0.61.5"
mkvenv ensure --online "sphinx>=1.6.0"
mkvenv ensure "qemu.qmp==0.0.2"
It's designed to look for packages in three places, in order:
(1) In system packages, if the version installed is already good
enough. This way your distribution-provided meson, sphinx, etc are
always used as first preference.
(2) In a vendored packages directory. Here I am suggesting
qemu.git/python/wheels/ as that directory. This is intended to serve as
a replacement for vendoring the meson source for QEMU tarballs. It is
also highly likely to be extremely useful for packaging the "qemu.qmp"
package in source distributions for platforms that do not yet package
qemu.qmp separately.
(3) Online, via PyPI, ***only when "--online" is passed***. This is only
ever used as a fallback if the first two sources do not have an
appropriate package that meets the requirement. The ability to build
QEMU and run tests *completely offline* is not impinged.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-7-jsnow@redhat.com>
[Use distlib to lookup distributions. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Python virtual environments do not typically nest; they may inherit from
the top-level system packages or not at all.
For our purposes, it would be convenient to emulate "nested" virtual
environments to allow callers of the configure script to install
specific versions of python utilities in order to test build system
features, utility version compatibility, etc.
While it is possible to install packages into the system environment
(say, by using the --user flag), it's nicer to install test packages
into a totally isolated environment instead.
As detailed in https://www.qemu.org/2023/03/24/python/, Emulate a nested
venv environment by using .pth files installed into the site-packages
folder that points to the parent environment when appropriate.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-6-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Debian debundles ensurepip for python; NetBSD debundles pyexpat but
ensurepip needs pyexpat. Try our best to offer a helpful error message
instead of just failing catastrophically.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-5-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This script will be responsible for building a lightweight Python
virtual environment at configure time. It works with Python 3.6 or
newer.
It has been designed to:
- work *offline*, no PyPI required.
- work *quickly*, The fast path is only ~65ms on my machine.
- work *robustly*, with multiple fallbacks to keep things working.
- work *cooperatively*, using system packages where possible.
(You can use your distro's meson, no problem.)
Due to its unique position in the build chain, it exists outside of the
installable python packages in-tree and *must* be runnable without any
third party dependencies.
Under normal circumstances, the only dependency required to execute this
script is Python 3.6+ itself. The script is *faster* by several seconds
when setuptools and pip are installed in the host environment, which is
probably the case for a typical multi-purpose developer workstation.
In the event that pip/setuptools are missing or not usable, additional
dependencies may be required on some distributions which remove certain
Python stdlib modules to package them separately:
- Debian may require python3-venv to provide "ensurepip"
- NetBSD may require py310-expat to provide "pyexpat" *
(* Or whichever version is current for NetBSD.)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-4-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"make check-minreqs" runs pip without the --disable-pip-version-check
option, which causes the obnoxious "A new release of pip available"
message.
Recent versions of pip also complain that some of the dependencies in
our virtual environment rely on "setup.py install" instead of providing
a pyproject.toml file; apparently it is deprecated to install them
directly from pip instead of letting the "wheel" package take care
of them. So, install "wheel" in the virtual environment.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20230511035435.734312-2-jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Run 'make distclean' in a tree, and GNUmakefile is removed.
But, GNUmakefile is where we change directory to build.
Run 'make distclean' or 'make clean' again, and Makefile applies
the clean actions, such as this one, at the top level of the tree.
For example, it removes the .d source files in 'meson/test cases/d/*/*.d'.
find . \( -name '*.so' -o -name '*.dll' -o \
-name '*.[oda]' -o -name '*.gcno' \) -type f \
! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-aarch64.a \
! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-arm.a \
-exec rm {} +
To fix, remove clean and distclean from UNCHECKED_GOALS, so those targets
are "checked", meaning that configure must be run before make. However,
the check action does not trigger, because clean does not depend on
config-host.mak, so change the action to simply throw an error.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Message-Id: <1681909700-94095-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic
device makes qemu crash. This is caused by a buffer overflow when
scsi-generic patches the block limits VPD page.
Do the operations on a temporary on-stack buffer that is guaranteed
to be large enough.
Reported-by: Théo Maillart <tmaillart@freebox.fr>
Analyzed-by: Théo Maillart <tmaillart@freebox.fr>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arm64 has different capability from x86 to enable the dirty ring, which
is KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Besides, arm64 also needs the backup
bitmap extension (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) when 'kvm-arm-gicv3'
or 'arm-its-kvm' device is enabled. Here the extension is always enabled
and the unnecessary overhead to do the last stage of dirty log synchronization
when those two devices aren't used is introduced, but the overhead should
be very small and acceptable. The benefit is cover future cases where those
two devices are used without modifying the code.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230509022122.20888-5-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Due to multiple capabilities associated with the dirty ring for different
architectures: KVM_CAP_DIRTY_{LOG_RING, LOG_RING_ACQ_REL} for x86 and
arm64 separately. There will be more to be done in order to support the
dirty ring for arm64.
Lets add helper kvm_dirty_ring_init() to enable the dirty ring. With this,
the code looks a bit clean.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-Id: <20230509022122.20888-4-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The global dirty log synchronization is used when KVM and dirty ring
are enabled. There is a particularity for ARM64 where the backup
bitmap is used to track dirty pages in non-running-vcpu situations.
It means the dirty ring works with the combination of ring buffer
and backup bitmap. The dirty bits in the backup bitmap needs to
collected in the last stage of live migration.
In order to identify the last stage of live migration and pass it
down, an extra parameter is added to the relevant functions and
callbacks. This last stage indicator isn't used until the dirty
ring is enabled in the subsequent patches.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-Id: <20230509022122.20888-2-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Save a bit of build time by passing the number of jobs option to
sphinx.
We cannot use the -j option from make because meson does not support
setting build time parameters for custom targets. Use nproc instead or
the equivalent sphinx option "-j auto", if that is available (version
>=1.7.0).
Also make sure our plugins support parallelism and report it properly
to sphinx. Particularly, implement the merge_domaindata method in
DBusDomain that is used to merge in data from other subprocesses.
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20230503203947.3417-2-farosas@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this change, MOVNTPS and MOVNTPD were labeled as Exception Class
4 (only requiring alignment for legacy SSE instructions). This changes
them to Exception Class 1 (always requiring memory alignment), as
documented in the Intel manual.
Message-Id: <20230501111428.95998-3-ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the exception classes for some SSE/AVX instructions to match what is
documented in the Intel manual.
These changes are expected to have no functional effect on the behavior
that qemu implements (primarily >= 16-byte memory alignment checks). For
instance, since qemu does not implement the AC flag, there is no
difference in behavior between Exception Classes 4 and 5 for
instructions where the SSE version only takes <16 byte memory operands.
Message-Id: <20230501111428.95998-2-ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds some comments describing what instructions correspond to decoding
table entries and fixes some existing comments which named the wrong
instruction.
Message-Id: <20230501111428.95998-1-ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
the single and double precision versions are distinguished through a
prefix, however they use no-prefix and 0x66 for SS and SD respectively.
Scalar values usually are associated with 0xF2 and 0xF3.
Because of these, they incorrectly perform a 128-bit memory load instead
of a 32- or 64-bit load. Fix this by writing a custom decoding function.
I tested that the reproducer is fixed and the test-avx output does not
change.
Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637
Fixes: f8d19eec0d ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As reported by the Intel's doc:
"FB_CLEAR: The processor will overwrite fill buffer values as part of
MD_CLEAR operations with the VERW instruction.
On these processors, L1D_FLUSH does not overwrite fill buffer values."
If this cpu feature is present in host, allow QEMU to choose whether to
show it to the guest too.
One disadvantage of not exposing it is that the guest will report
a non existing vulnerability in
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
because the mitigation is present only when the cpu has
(FLUSH_L1D and MD_CLEAR) or FB_CLEAR
features enabled.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230201135759.555607-3-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As reported by Intel's doc:
"L1D_FLUSH: Writeback and invalidate the L1 data cache"
If this cpu feature is present in host, allow QEMU to choose whether to
show it to the guest too.
One disadvantage of not exposing it is that the guest will report
a non existing vulnerability in
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
because the mitigation is present only when the cpu has
(FLUSH_L1D and MD_CLEAR) or FB_CLEAR
features enabled.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230201135759.555607-2-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
linux-user getgroups(), setgroups(), getgroups32() and setgroups32()
used alloca() to allocate grouplist arrays, with unchecked gidsetsize
coming from the "guest". With NGROUPS_MAX being 65536 (linux, and it
is common for an application to allocate NGROUPS_MAX for getgroups()),
this means a typical allocation is half the megabyte on the stack.
Which just overflows stack, which leads to immediate SIGSEGV in actual
system getgroups() implementation.
An example of such issue is aptitude, eg
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087#72
Cap gidsetsize to NGROUPS_MAX (return EINVAL if it is larger than that),
and use heap allocation for grouplist instead of alloca(). While at it,
fix coding style and make all 4 implementations identical.
Try to not impose random limits - for example, allow gidsetsize to be
negative for getgroups() - just do not allocate negative-sized grouplist
in this case but still do actual getgroups() call. But do not allow
negative gidsetsize for setgroups() since its argument is unsigned.
Capping by NGROUPS_MAX seems a bit arbitrary, - we can do more, it is
not an error if set size will be NGROUPS_MAX+1. But we should not allow
integer overflow for the array being allocated. Maybe it is enough to
just call g_try_new() and return ENOMEM if it fails.
Maybe there's also no need to convert setgroups() since this one is
usually smaller and known beforehand (KERN_NGROUPS_MAX is actually 63, -
this is apparently a kernel-imposed limit for runtime group set).
The patch fixes aptitude segfault mentioned above.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230409105327.1273372-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The kernel does not require PROT_READ for addresses passed to mincore.
For example the fincore(1) tool from util-linux uses PROT_NONE and
currently does not work under qemu-user.
Example (with fincore(1) from util-linux 2.38):
$ fincore /proc/self/exe
RES PAGES SIZE FILE
24K 6 22.1K /proc/self/exe
$ qemu-x86_64 /usr/bin/fincore /proc/self/exe
fincore: failed to do mincore: /proc/self/exe: Cannot allocate memory
With this patch:
$ ./build/qemu-x86_64 /usr/bin/fincore /proc/self/exe
RES PAGES SIZE FILE
24K 6 22.1K /proc/self/exe
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20230422100314.1650-3-thomas@t-8ch.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This can be used to validate that an address range is mapped but without
being readable or writable.
It will be used by an updated implementation of mincore().
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20230422100314.1650-2-thomas@t-8ch.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The correct error number for unknown ioctls is ENOTTY.
ENOSYS would mean that the ioctl() syscall itself is not implemented,
which is very improbable and unexpected for userspace.
ENOTTY means "Inappropriate ioctl for device". This is what the kernel
returns on unknown ioctls, what qemu is trying to express and what
userspace is prepared to handle.
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230426070659.80649-1-thomas@t-8ch.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
TCG will need this declaration, without all of the other
bits that come with cpu-all.h.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect guest tlb parameters from TCG compilation.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect guest page size from TCG compilation.
While this could be done via exec/target_page.h, we want to cache
the value across multiple memory access operations, so we might
as well initialize this early.
The changes within tcg/ are entirely mechanical:
sed -i s/TARGET_PAGE_BITS/s->page_bits/g
sed -i s/TARGET_PAGE_MASK/s->page_mask/g
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Eliminate the test vs TARGET_LONG_BITS by considering this
predicate to be always true, and simplify accordingly.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses can be infered from the INDEX_op_qemu_*_a{32,64}_* opcode
being used. Add a field into TCGLabelQemuLdst to record the usage.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because of its use on tgen_arithi, this value must be a signed
32-bit quantity, as that is what may be encoded in the insn.
The truncation of the value to unsigned for 32-bit guests is
done via the REX bit via 'trexw'.
Removes the only uses of target_ulong from this tcg backend.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since TCG_TYPE_I32 values are kept zero-extended in registers, via
omission of the REXW bit, we need not extend if the register matches.
This is already relied upon by qemu_{ld,st}.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Keep all 32-bit values zero extended in the register, not solely when
addresses are 32 bits. This eliminates a dependency on TARGET_LONG_BITS.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We now have the address size as part of the opcode, so
we no longer need to test TARGET_LONG_BITS. We can use
uint64_t for target_ulong, as passed into load/store helpers.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For 32-bit hosts, we cannot simply rely on TCGContext.addr_bits,
as we need one or two host registers to represent the guest address.
Create the new opcodes and update all users. Since we have not
yet eliminated TARGET_LONG_BITS, only one of the two opcodes will
ever be used, so we can get away with treating them the same in
the backends.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expand from TCGv to TCGTemp inline in the translators,
and validate that the size matches tcg_ctx->addr_type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expand from TCGv to TCGTemp inline in the translators,
and validate that the size matches tcg_ctx->addr_type.
These inlines will eventually be seen only by target-specific code.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since we do this inside gen_empty_mem_cb anyway, let's
do this earlier inside tcg expansion.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We only need to make copies for loads, when the destination
overlaps the address. For now, only eliminate the copy for
stores and 128-bit loads.
Rename plugin_prep_mem_callbacks to plugin_maybe_preserve_addr,
returning NULL if no copy is made.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As gen_mem_wrapped is only used in plugin_gen_empty_mem_callback,
we can avoid the curiosity of union mem_gen_fn by inlining it.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always pass the target address as uint64_t.
Adjust tcg_out_{ld,st}_helper_args to match.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already pass uint64_t to restore_state_to_opc; this changes all
of the other uses from insn_start through the encoding to decoding.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No change to the ultimate load/store routines yet, so some atomicity
conditions not yet honored, but plumbs the change to alignment through
the relevant functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No change to the ultimate load/store routines yet, so some atomicity
conditions not yet honored, but plumbs the change to alignment through
the relevant functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Examine MemOp for atomicity and alignment, adjusting alignment
as required to implement atomicity on the host.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that tcg_out_helper_load_regs is not recursive, we can
merge it into its only caller, tcg_out_helper_load_slots.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With x86_64 as host, we do not have any temporaries with which to
resolve cycles, but we do have xchg. As a side bonus, the set of
graphs that can be made with 3 nodes and all nodes conflicting is
small: two. We can solve the cycle with a single temp.
This is required for x86_64 to handle stores of i128: 1 address
register and 2 data registers.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the unparameterized TCG_TARGET_HAS_MEMORY_BSWAP macro
with a function with a memop argument.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The system is required to emulate unaligned accesses, even if the
hardware does not support it. The resulting trap may or may not
be more efficient than the qemu slow path. There are linux kernel
patches in flight to allow userspace to query hardware support;
we can re-evaluate whether to enable this by default after that.
In the meantime, softmmu now matches useronly, where we already
assumed that unaligned accesses are supported.
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Test the final byte of an unaligned access.
Use BSTRINS.D to clear the range of bits, rather than AND.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Drop the target-specific trampolines for the standard slow path.
This lets us use tcg_out_helper_{ld,st}_args, and handles the new
atomicity bits within MemOp.
At the same time, use the full load/store helpers for user-only mode.
Drop inline unaligned access support for user-only mode, as it does
not handle atomicity.
Use TCG_REG_T[1-3] in the tlb lookup, instead of TCG_REG_O[0-2].
This allows the constraints to be simplified.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Shuffle the order in tcg_out_movi_int to check s13 first, and
drop this check from tcg_out_movi_imm32. This might make the
sequence for in_prologue larger, but not worth worrying about.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always reserve r3 for tlb softmmu lookup. Fix a bug in user-only
ALL_QLDST_REGS, in that r14 is clobbered by the BLNE that leads
to the misaligned trap. Remove r0+r1 from user-only ALL_QLDST_REGS;
I believe these had been reserved for bswap, which we no longer
perform during qemu_st.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Notice when the host has additional atomic instructions.
The new variables will also be used in generated code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
The new variable will also be used in generated code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is an edge condition prior to gcc13 for which optimization
is required to generate 16-byte atomic sequences. Detect this.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TCG backends may need to defer to a helper to implement
the atomicity required by a given operation. Mirror the
interface used in system mode.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert. Unify the pairs of functions.
Hoist the qemu_{ld,st}_helpers arrays to tcg.c.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Create ldst_atomicity.c.inc.
Not required for user-only code loads, because we've ensured that
the page is read-only before beginning to translate code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This field may be used to describe the precise atomicity requirements
of the guest, which may then be used to constrain the methods by which
it may be emulated by the host.
For instance, the AArch64 LDP (32-bit) instruction changes semantics
with ARMv8.4 LSE2, from
MO_64 | MO_ATOM_IFALIGN_PAIR
(64-bits, single-copy atomic only on 4 byte units,
nonatomic if not aligned by 4),
to
MO_64 | MO_ATOM_WITHIN16
(64-bits, single-copy atomic within a 16 byte block)
The former may be implemented with two 4 byte loads, or a single 8 byte
load if that happens to be efficient on the host. The latter may not
be implemented with two 4 byte loads and may also require a helper when
misaligned.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed
out when free is called. Do the teardown in _disconnect(). This
matches the setup done in _connect().
trace-events are also added for the XenDevOps functions.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20230502143722.15613-1-jandryuk@gmail.com>
[C.S.: - Remove redundant return in xen_9pfs_free().
- Add comment to trace-events. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
It's only required for the proxy helper.
Add a new option for the proxy helper rather than enabling it
implicitly.
Change-Id: I95b73fca625529e99d16b0a64e01c65c0c1d43f2
Signed-off-by: Peter Foley <pefoley@google.com>
Message-Id: <20230503130757.863824-1-pefoley@google.com>
[C.S.: - Resolve merge conflict. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
* Various small test updates
* Some small doc updates
* Introduce replacement for -async-teardown that shows up in the QAPI
* Make machine-qmp-cmds.c and xilinx_ethlite.c target-independent
* Fix s390x LDER instruction
* Fix s390x EXECUTE instruction with relative branches
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmRjagURHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVyPxAAhlqIbVWir264DQkpLKM/4CVWPxVPwBxh
# OPvSG42wM7+uCNefnIWYr4qT1+Iz14w8OYBCEON2u8Pwfgxrjf2ZkS2C79iL3FHG
# 37NsFGkxhLLeexzYyCpSg3FNikZql+RNg9I9um4NRPH0lgu4L3aQk58WyXFyBHU2
# mxvbAEOyiSbGr8bp6ZcU7k1UryRZ6qQoBUzFvMQpUD7jlo88MVUu5D+4xZclH6EV
# R6WerbyKUWnfY0rFWxA8RGt785aUVq9iD8tIkPkPhQ/UjvzZKosCHIpjF0qCkd6P
# 42Ahz6kP7Ce/XlTcS/Q3gIEzKViCFJtZiZIG/N2sBAWqisTkaSKDeQMrM6vAmmBr
# ju44CUk2tupZSG20G/Gz7a09ZKr3S7+6BpJ+tUdnK2W9PSU7CycesZ6s9hqKJL8W
# QUOMKyEMF/+W+pubdfYJNvUha6hYPoaR9vTNAhC50NiahhhIxIRcyRtpteVgsjwW
# lxHMeIz8PUHxp+tvl3CzLZyDWF0maq5/JzhkCoUhvzVUAh+tDYAfWOKxIxEVNPVt
# E1Igj6N4TYvkrXltSyxMxs9uHWhNi4ObETbB+7greYOWFVhtKhphnG78wt0uu81O
# iZIqdLzWFeqaH5/Li3VnuVhLDnhSfpDiWUNqaVvWu6V5WrXDuIGQoe7pxAhRvZTB
# zsOUpGdprPo=
# =sWOT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 16 May 2023 04:33:25 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-05-15v2' of https://gitlab.com/thuth/qemu: (21 commits)
tests/tcg/s390x: Test EXECUTE of relative branches
target/s390x: Fix EXECUTE of relative branches
tests/tcg/s390x: Enable the multiarch system tests
tests/tcg/multiarch: Make the system memory test work on big-endian
s390x/tcg: Fix LDER instruction format
hw/net: Move xilinx_ethlite.c to the target-independent source set
hw/core: Move machine-qmp-cmds.c into the target independent source set
cpu: Introduce a wrapper for being able to use TARGET_NAME in common code
hw/core: Use a callback for target specific query-cpus-fast information
docs/about/emulation: fix typo
docs/devel: remind developers to run CI container pipeline when updating images
s390x/pv: Fix spurious warning with asynchronous teardown
util/async-teardown: wire up query-command-line-options
tests/lcitool: Add mtools and xorriso and remove genisoimage as dependencies
tests: libvirt-ci: Update to commit 'c8971e90ac' to pull in mformat and xorriso
Add information how to fix common build error on Windows in symlink-install-tree
hw/pci-bridge: Fix release ordering by embedding PCIBridgeWindows within PCIBridge
tests/qtest: replace qmp_discard_response with qtest_qmp_assert_success
net: stream: test reconnect option with an unix socket
sysemu/kvm: Remove unused headers
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Multiarch tests are written in C and need support for printing
characters. Instead of implementing the runtime from scratch, just
reuse the pc-bios/s390-ccw one.
Run tests with -nographic in order to enable SCLP (enable this for
the existing tests as well, since it does not hurt).
Use the default linker script for the new tests.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230511114651.439872-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The only target specific code that is left in here are two spots that
use TARGET_NAME. Change them to use the new target_name() wrapper
function instead, so we can move the file into the common softmmu_ss
source set. That way we only have to compile this file once, and not
for each target anymore.
Message-Id: <20230424160434.331175-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In some spots, it would be helpful to be able to use TARGET_NAME
in common (target independent) code, too. Thus introduce a wrapper
that can be called from common code, too, just like we already
have one for target_words_bigendian().
Message-Id: <20230424160434.331175-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
For being able to create a universal QEMU binary one day, core
files like machine-qmp-cmds.c must not contain any "#ifdef TARGET_..."
parts. Thus let's provide the target specific function via a
function pointer in CPUClass instead, as a first step towards
making this file target independent.
Message-Id: <20230424160434.331175-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When new dependencies and packages are added to containers, its important to
run CI container generation pipelines on gitlab to make sure that there are no
obvious conflicts between packages that are being added and those that are
already present. Running CI container pipelines will make sure that there are
no such breakages before we commit the change updating the containers. Add a
line in the documentation reminding developers to run the pipeline before
submitting the change. It will also ease the life of the maintainers.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230506072012.10350-1-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Kernel commit 292a7d6fca33 ("KVM: s390: pv: fix asynchronous teardown
for small VMs") causes the KVM_PV_ASYNC_CLEANUP_PREPARE ioctl to fail
if the VM is not larger than 2GiB. QEMU would attempt it and fail,
print an error message, and then proceed with a normal teardown.
Avoid attempting to use asynchronous teardown altogether when the VM is
not larger than 2 GiB. This will avoid triggering the error message and
also avoid pointless overhead; normal teardown is fast enough for small
VMs.
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: c3a073c610 ("s390x/pv: Add support for asynchronous teardown for reboot")
Link: https://lore.kernel.org/all/20230421085036.52511-2-imbrenda@linux.ibm.com/
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230510105531.30623-2-imbrenda@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Fix inline function parameter in pv.h]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add new -run-with option with an async-teardown=on|off parameter. It is
visible in the output of query-command-line-options QMP command, so it
can be discovered and used by libvirt.
The option -async-teardown is now redundant, deprecate it.
Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Fixes: c891c24b1a ("os-posix: asynchronous teardown for shutdown on Linux")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230505120051.36605-2-imbrenda@linux.ibm.com>
[thuth: Add curly braces to fix error with GCC 8.5, fix bug in deprecated.rst]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Bios bits avocado tests need mformat (provided by the mtools package) and
xorriso tools in order to run within gitlab CI containers. Add those
dependencies within the Dockerfiles so that containers can be built with
those tools present and bios bits avocado tests can be run there.
xorriso package conflicts with genisoimage package on some distributions.
Therefore, it is not possible to have both the packages at the same time
in the container image uniformly for all distribution flavors. Further,
on some distributions like RHEL, both xorriso and genisoimage
packages provide /usr/bin/genisoimage and on some other distributions like
Fedora, only genisoimage package provides the same utility.
Therefore, this change removes the dependency on geninsoimage for building
container images altogether keeping only xorriso package. At the same time,
cdrom-test.c is updated to use and check for existence of only xorrisofs.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230504154611.85854-3-anisinha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Pull in the following changes from lcitool:
* tests/lcitool/libvirt-ci 85487e1...c8971e9 (18):
> mappings: add new package mappings for mformat and xorriso
> docs: testing: Update contents with tox
> .gitlab-ci.yml: Always test against installed lcitool
> gitlab-ci.yml: Start using tox for testing
> tox: Allow running with custom pytest options with {posargs}
> gitignore: Add the default .tox directory
> dev-requirements: Reference VM requirements
> requirements: Add tox to dev-requirements.txt and drop pytest and flake
> test-requirements: Rename to dev-requirements.txt
> Add tox.ini configuration file
> tests: commands: Consolidate the installed package/run from git tests
> Add a pytest.ini
> facts: targets: Drop Fedora 36 target
> gitlab-ci.yml: Add Fedora 38 target
> facts: targets: Add Fedora 38
> facts: mappings: Drop 'zstd' mapping
> facts: projects: nbdkit: Replace zstd mapping with libzstd
> docs: mappings: Add a section on the preferred mapping naming scheme
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230504154611.85854-2-anisinha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
By default, Windows doesn't allow to create soft links for user account
and only administrator is allowed to do this. To fix this problem you have
to raise your permissions or enable Developer Mode, which available since
Windows 10. Additional explanation when build fails will allow developer
to fix the problem on his computer faster.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1386
Signed-off-by: Mateusz Krawczuk <mat.krawczuk@gmail.com>
Message-Id: <20230504211101.1386-1-mat.krawczuk@gmail.com>
[thuth: Drop the hunk with the white space changes]
Signed-off-by: Thomas Huth <thuth@redhat.com>
The lifetime of the PCIBridgeWindows instance accessed via the windows pointer
in struct PCIBridge is managed separately from the PCIBridge itself.
Triggered by ./qemu-system-x86_64 -M x-remote -display none -monitor stdio
QEMU monitor: device_add cxl-downstream
In some error handling paths (such as the above due to attaching a cxl-downstream
port anything other than a cxl-upstream port) the g_free() of the PCIBridge
windows in pci_bridge_region_cleanup() is called before the final call of
flatview_uref() in address_space_set_flatview() ultimately from
drain_call_rcu()
At one stage this resulted in a crash, currently can still be observed using
valgrind which records a use after free.
When present, only one instance is allocated. pci_bridge_update_mappings()
can operate directly on an instance rather than creating a new one and
swapping it in. Thus there appears to be no reason to not directly
couple the lifetimes of the two structures by embedding the PCIBridgeWindows
within the PCIBridge removing the need for the problematic separate free.
Patch is same as was posted deep in the discussion.
https://lore.kernel.org/qemu-devel/20230403171232.000020bb@huawei.com/
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421122550.28234-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The qmp_discard_response method simply ignores the result of the QMP
command, merely unref'ing the object. This is a bad idea for tests
as it leaves no trace if the QMP command unexpectedly failed. The
qtest_qmp_assert_success method will validate that the QMP command
returned without error, and if errors occur, it will print a message
on the console aiding debugging.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230421171411.566300-2-berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We can have failure with the inet type test because the port address
is not allocated atomically and can be taken by another test between its
selection and the start of QEMU. To avoid that, use an unix socket with
a path that is unique
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20230503094109.1198248-1-lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Pull request
This pull request contain's Sam Li's zoned storage support in the QEMU block
layer and virtio-blk emulation.
v2:
- Sam fixed the CI failures. CI passes for me now. [Richard]
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmRiWCgACgkQnKSrs4Gr
# c8h/7gf+MMm2cGEaf376t8HMwTc6wbXVfbmAlZrge2EXPZfFvEaxj7HClcEraOgV
# yJsGWeU6mOw4r68ICJ/4KhrY1cdv+VZym/LsMLMcFUTXFHnyX4pyU3am31FPOI4K
# +wrDYJOJhc4DkAESWGgEWiMKpuO/uUEgBmHdW+qPFCl77Yl/eP6H5uNP6nGFn55p
# QpS/l8iha7PDkc81EsrjA+e/YI0ubfNSP7+zZElhQ98354CQ0MCfmZ6h9bT+o2bu
# R7SBUj80e+2X0a1b9s/2Jz/x8l4TEsl8kr48/Q1usq3GVVkbjEgqsk6wTN13Q/4g
# CeIR7E61ZeYzmpb4tLFRIqK2Jw+NEQ==
# =Q8xW
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 15 May 2023 09:04:56 AM PDT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
docs/zoned-storage:add zoned emulation use case
virtio-blk: add some trace events for zoned emulation
block: add accounting for zone append operation
virtio-blk: add zoned storage emulation for zoned devices
block: add some trace events for zone append
qemu-iotests: test zone append operation
block: introduce zone append write for zoned devices
file-posix: add tracking of the zone write pointers
docs/zoned-storage: add zoned device documentation
block: add some trace events for new block layer APIs
iotests: test new zone operations
block: add zoned BlockDriver check to block layer
block/raw-format: add zone operations to pass through requests
block/block-backend: add block layer APIs resembling Linux ZonedBlockDevice ioctls
block/file-posix: introduce helper functions for sysfs attributes
block/block-common: add zoned device structs
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Migration Pull request 20230515
Hi
On this PULL:
- use xxHash for calculate dirty_rate (andrei)
- Create qemu_target_pages_to_MiB() and use them (quintela)
- make dirtyrate target independent (quintela)
- Merge 5 patches from atomic counters series (quintela)
Please apply.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRiJoUACgkQ9IfvGFhy
# 1yO1ExAAsSStVAUh/tSgu5fhXydJVkBMO6LOj1k+tA7qylwv4QsqZ/pLNBvY8Zms
# 8/bpYtlvw1LwCSaq01oNA6RhBhkBaZ5x0PUViCY87dsJhu0hEo68Jcp0FkrkW93E
# OiIsp9NU7wpnqd88ZhzjcZ/viWebPw3660V5KY4/8ZZFVxJaKMhG+vW3pGYH8yDR
# TmZK5E5e3t5yiwDRRPrkAw3+e+GDwfwNuOBkk+NBJdL1mOZnIfVwFwxRAXWn/vEM
# f6NdT3aXplsNeKPCN1w9zrLhOJdHeu8IlhWhT/cjTgOKemBJBYzftH6dI/X9D0ix
# ghWAzFSJh1S38gw0mMef1VERJqh7JpAkTq7vT2x7J/0UIbIAru0yRiSrHbNBCcvL
# efsVFtjyseKq70qKN515uoqbK6mlnxP+eECIAUmesUx0bJI9jDWzn+KVc86xUvWy
# +98KDcPuYVxdVp4XHAIsyHYOfTY/tJwG5KI4hYgGP7uxFVr/qus3eBB/Q5BBVPOx
# X0A/760iehfV0V0UmVEt8mC7uDjI0JBouenUHcURAtbsnuGRMCz6s1kLsZYaHuGV
# NhihXq6jnwcvn2nGGnXY44TsgBWesfUrCFZOjJzbaSjGH5UpipC0SECKqh1GKoQP
# kdknvyej5h8egU2QFdS8sCUeXIfwAtHfCamtnui3b3E3iF3TSco=
# =8gfA
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 15 May 2023 05:33:09 AM PDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20230515-pull-request' of https://gitlab.com/juan.quintela/qemu:
qemu-file: Remove total from qemu_file_total_transferred_*()
qemu-file: Make rate_limit_used an uint64_t
qemu-file: make qemu_file_[sg]et_rate_limit() use an uint64_t
migration: We set the rate_limit by a second
migration: A rate limit value of 0 is valid
migration: Make dirtyrate.c target independent
migration: Teach dirtyrate about qemu_target_page_bits()
migration: Teach dirtyrate about qemu_target_page_size()
Use new created qemu_target_pages_to_MiB()
softmmu: Create qemu_target_pages_to_MiB()
migration/calc-dirty-rate: replaced CRC32 with xxHash
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
pull-loongarch-20230515
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEIAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZGIThgAKCRBAov/yOSY+
# 34NVA/0b99XxYeeOnJYspjKGgVk+R51+1ilMHqPGlNEG6HB2eHyIJdDgenBDaa/h
# lxqzDU9YQI4DzuvUcC75uWrShMkR5/Fb8Z0CCEToQUyAwfh2pNeAIzuB7TXHW5Ox
# SRGMs3eF23q5BUSCeD7DS2Ar1Zv4Gm3ytutiMAvCxNzxJWF1aA==
# =g93p
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 15 May 2023 04:12:06 AM PDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20230515' of https://gitlab.com/gaosong/qemu:
hw/intc: Add NULL pointer check on LoongArch ipi device
hw/loongarch/virt: Set max 256 cpus support on loongarch virt machine
hw/loongarch/virt: Modify ipi as percpu device
tests/avocado: Add LoongArch machine start test
loongarch: mark loongarch_ipi_iocsr re-entrnacy safe
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch extends virtio-blk emulation to handle zoned device commands
by calling the new block layer APIs to perform zoned device I/O on
behalf of the guest. It supports Report Zone, four zone oparations (open,
close, finish, reset), and Append Zone.
The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does
support zoned block devices. Regular block devices(conventional zones)
will not be set.
The guest os can use blktests, fio to test those commands on zoned devices.
Furthermore, using zonefs to test zone append write is also supported.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-id: 20230508051916.178322-2-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The patch tests zone append writes by reporting the zone wp after
the completion of the call. "zap -p" option can print the sector
offset value after completion, which should be the start sector
where the append write begins.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508051510.177850-4-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A zone append command is a write operation that specifies the first
logical block of a zone as the write position. When writing to a zoned
block device using zone append, the byte offset of the call may point at
any position within the zone to which the data is being appended. Upon
completion the device will respond with the position where the data has
been written in the zone.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508051510.177850-3-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since Linux doesn't have a user API to issue zone append operations to
zoned devices from user space, the file-posix driver is modified to add
zone append emulation using regular writes. To do this, the file-posix
driver tracks the wp location of all zones of the device. It uses an
array of uint64_t. The most significant bit of each wp location indicates
if the zone type is conventional zones.
The zones wp can be changed due to the following operations issued:
- zone reset: change the wp to the start offset of that zone
- zone finish: change to the end location of that zone
- write to a zone
- zone append
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-id: 20230508051510.177850-2-faithilikerun@gmail.com
[Fix errno propagation from handle_aiocb_zone_mgmt()
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add zoned device option to host_device BlockDriver. It will be presented only
for zoned host block devices. By adding zone management operations to the
host_block_device BlockDriver, users can use the new block layer APIs
including Report Zone and four zone management operations
(open, close, finish, reset, reset_all).
Qemu-io uses the new APIs to perform zoned storage commands of the device:
zone_report(zrp), zone_open(zo), zone_close(zc), zone_reset(zrs),
zone_finish(zf).
For example, to test zone_report, use following command:
$ ./build/qemu-io --image-opts -n driver=host_device, filename=/dev/nullb0
-c "zrp offset nr_zones"
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508045533.175575-4-faithilikerun@gmail.com
Message-id: 20230324090605.28361-4-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
<philmd@linaro.org> and remove spurious ret = -errno in
raw_co_zone_mgmt().
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
That the implementation does the check every 100 milliseconds is an
implementation detail that shouldn't be seen on the interfaz.
Notice that all callers of qemu_file_set_rate_limit() used the
division or pass 0, so this change is a NOP.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230508130909.65420-4-quintela@redhat.com>
Add separate macro EXTIOI_CPUS for extioi interrupt controller, extioi
only supports 4 cpu. And set macro LOONGARCH_MAX_CPUS as 256 so that
loongarch virt machine supports more cpus.
Interrupts from external devices can only be routed cpu 0-3 because
of extioi limits, cpu internal interrupt such as timer/ipi can be
triggered on all cpus.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230512100421.1867848-3-gaosong@loongson.cn>
ipi is used to communicate between cpus, this patch modified
loongarch ipi device as percpu device, so that there are
2 MemoryRegions with ipi device, rather than 2*cpus
MemoryRegions, which may be large than QDEV_MAX_MMIO if
more cpus are added on loongarch virt machine.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230512100421.1867848-2-gaosong@loongson.cn>
OpenRISC FPU Updates for 8.1
A few fixes and updates to bring OpenRISC inline with the latest
architecture spec updates:
- Allow FPCSR to be accessed in user mode
- Select tininess detection before rounding
- Fix FPE Exception PC value
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAmRfPIEACgkQw7McLV5m
# J+RFuhAAt4xxci52fxvPpgUu/mjKU6mbYNjBEPEh+OAcb+m/BrvKhazZDACkyLMe
# ehavWtI856jfy6DsIA5wj5+zhgV8W5DR6a1mHIhmSAoVq7e+NnC5y0GJC9B0Xd/2
# FNOq/LZPtv/w7u+D1pFJaTb07hAaFVIC05Arn4dXa1k3yBuyjqIJnlrXa3Jt0pLW
# To/z1zch1rUp6RhFmGxU+8/qvTbzqkm/F3kbe8l2z34371lTd6KhPwvKaImMpTYQ
# dvULTMXjZ6Dp8BmUrDcnLMTL3NbYcPrI+qOHX1X+dwzNFyui2I8Ci7IfEKJ460ja
# Fe2Ku/aDfHSZYYayWaYSlrrZ1AH0fLLwIkHSs95+xUMsl81mtS6lIysj7fAFRnM5
# 7tU4ov1T/leupvvUCUX5N4Yje/yvbuoAqGyhjDHzJ98vIe6fDhutU4Bm8/30q6Dy
# nKnfSgRHrrTrH042xW32DJnzaN2pEWrNtOMaegLMaqZ60app2YDaKJvtHLua1VjD
# b+g+X/+xBNb34k5e/f4z+GeGPoqE2wvwMcSkD+NBE8je3idPdMS/u5lQrvqvcbI/
# DJBRoPifNME/oYoTxPVKRnrCQIWQ6YkeLWVmqMfCVpjCF97gexo+UBUawJimTXFr
# gmcIYxv87oKF4KbCn7LsLlXGSpWSihKSBTHDxFPaKiRbnYZ5ais=
# =zqbX
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 13 May 2023 08:30:09 AM BST
# gpg: using RSA key D9C47354AEF86C103A25EFF1C3B31C2D5E6627E4
# gpg: Good signature from "Stafford Horne <shorne@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D9C4 7354 AEF8 6C10 3A25 EFF1 C3B3 1C2D 5E66 27E4
* tag 'or1k-pull-request-20230513' of https://github.com/stffrdhrn/qemu:
target/openrisc: Setup FPU for detecting tininess before rounding
target/openrisc: Set PC to cpu state on FPU exception
target/openrisc: Allow fpcsr access in user mode
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In check_s2_mmu_setup() we have a check that is attempting to
implement the part of AArch64.S2MinTxSZ that is specific to when EL1
is AArch32:
if !s1aarch64 then
// EL1 is AArch32
min_txsz = Min(min_txsz, 24);
Unfortunately we got this wrong in two ways:
(1) The minimum txsz corresponds to a maximum inputsize, but we got
the sense of the comparison wrong and were faulting for all
inputsizes less than 40 bits
(2) We try to implement this as an extra check that happens after
we've done the same txsz checks we would do for an AArch64 EL1, but
in fact the pseudocode is *loosening* the requirements, so that txsz
values that would fault for an AArch64 EL1 do not fault for AArch32
EL1, because it does Min(old_min, 24), not Max(old_min, 24).
You can see this also in the text of the Arm ARM in table D8-8, which
shows that where the implemented PA size is less than 40 bits an
AArch32 EL1 is still OK with a configured stage2 T0SZ for a 40 bit
IPA, whereas if EL1 is AArch64 then the T0SZ must be big enough to
constrain the IPA to the implemented PA size.
Because of part (2), we can't do this as a separate check, but
have to integrate it into aa64_va_parameters(). Add a new argument
to that function to indicate that EL1 is 32-bit. All the existing
callsites except the one in get_phys_addr_lpae() can pass 'false',
because they are either doing a lookup for a stage 1 regime or
else they don't care about the tsz/tsz_oob fields.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1627
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230509092059.3176487-1-peter.maydell@linaro.org
On a build configured with: --disable-tcg --enable-xen it is possible
to produce a QEMU binary with no TCG nor KVM support. Skip the cdrom
boot tests if that's the case.
Fixes: 0c1ae3ff9d ("tests/qtest: Fix tests when no KVM or TCG are present")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230508181611.2621-4-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We cannot allow this config to be disabled at the moment as not all of
the relevant code is protected by it.
Commit 29d9efca16 ("arm/Kconfig: Do not build TCG-only boards on a
KVM-only build") moved the CONFIGs of several boards to Kconfig, so it
is now possible that nothing selects ARM_V7M (e.g. when doing a
--without-default-devices build).
Return the CONFIG_ARM_V7M entry to a state where it is always selected
whenever TCG is available.
Fixes: 29d9efca16 ("arm/Kconfig: Do not build TCG-only boards on a KVM-only build")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230508181611.2621-3-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Semihosting has been made a 'default y' entry in Kconfig, which does
not work because when building --without-default-devices, the
semihosting code would not be available.
Make semihosting unconditional when TCG is present.
Fixes: 29d9efca16 ("arm/Kconfig: Do not build TCG-only boards on a KVM-only build")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230508181611.2621-2-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity points out (in CID 1508390) that write_bootloader has
some dead code, where we assign to 'p' and then in the following
line assign to it again. This happened as a result of the
refactoring in commit cd5066f861.
Fix the dead code by removing the 'void *v' variable entirely and
instead adding a cast when calling bl_setup_gt64120_jump_kernel(), as
we do at its other callsite in write_bootloader_nanomips().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the doc sources, we have a few cross-reference targets with odd
names "pcsys_005fxyz". These are the legacy of the semi-automated
conversion of the old info docs to rST (the '005f' is because ASCII
0x5f is '_' and the old info link names had underscores in them).
Remove the targets which nothing links to, and rename the two targets
which are used to something a bit more descriptive.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230421163642.1151904-1-peter.maydell@linaro.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
When we take a PNG screenshot the ordering of the colour channels in
the data is not correct, resulting in the image having weird
colouring compared to the actual display. (Specifically, on a
little-endian host the blue and red channels are swapped; on
big-endian everything is wrong.)
This happens because the pixman idea of the pixel data and the libpng
idea differ. PIXMAN_a8r8g8b8 defines that pixels are 32-bit values,
with A in bits 24-31, R in bits 16-23, G in bits 8-15 and B in bits
0-7. This means that on little-endian systems the bytes in memory
are
B G R A
and on big-endian systems they are
A R G B
libpng, on the other hand, thinks of pixels as being a series of
values for each channel, so its format PNG_COLOR_TYPE_RGB_ALPHA
always wants bytes in the order
R G B A
This isn't the same as the pixman order for either big or little
endian hosts.
The alpha channel is also unnecessary bulk in the output PNG file,
because there is no alpha information in a screenshot.
To handle the endianness issue, we already define in ui/qemu-pixman.h
various PIXMAN_BE_* and PIXMAN_LE_* values that give consistent
byte-order pixel channel formats. So we can use PIXMAN_BE_r8g8b8 and
PNG_COLOR_TYPE_RGB, which both have an in-memory byte order of
R G B
and 3 bytes per pixel.
(PPM format screenshots get this right; they already use the
PIXMAN_BE_r8g8b8 format.)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1622
Fixes: 9a0a119a38 ("Added parameter to take screenshot with screendump as PNG")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20230502135548.2451309-1-peter.maydell@linaro.org
We currently don't correctly handle the VSTCR_EL2.SW and VTCR_EL2.NSW
configuration bits. These allow configuration of whether the stage 2
page table walks for Secure IPA and NonSecure IPA should do their
descriptor reads from Secure or NonSecure physical addresses. (This
is separate from how the translation table base address and other
parameters are set: an NS IPA always uses VTTBR_EL2 and VTCR_EL2
for its base address and walk parameters, regardless of the NSW bit,
and similarly for Secure.)
Provide a new function ptw_idx_for_stage_2() which returns the
MMU index to use for descriptor reads, and use it to set up
the .in_ptw_idx wherever we call get_phys_addr_lpae().
For a stage 2 walk, wherever we call get_phys_addr_lpae():
* .in_ptw_idx should be ptw_idx_for_stage_2() of the .in_mmu_idx
* .in_secure should be true if .in_mmu_idx is Stage2_S
This allows us to correct S1_ptw_translate() so that it consistently
always sets its (out_secure, out_phys) to the result it gets from the
S2 walk (either by calling get_phys_addr_lpae() or by TLB lookup).
This makes better conceptual sense because the S2 walk should return
us an (address space, address) tuple, not an address that we then
randomly assign to S or NS.
Our previous handling of SW and NSW was broken, so guest code
trying to use these bits to put the s2 page tables in the "other"
address space wouldn't work correctly.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1600
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504135425.2748672-3-peter.maydell@linaro.org
Bit 63 in a Table descriptor is only the NSTable bit for stage 1
translations; in stage 2 it is RES0. We were incorrectly looking at
it all the time.
This causes problems if:
* the stage 2 table descriptor was incorrectly setting the RES0 bit
* we are doing a stage 2 translation in Secure address space for
a NonSecure stage 1 regime -- in this case we would incorrectly
do an immediate downgrade to NonSecure
A bug elsewhere in the code currently prevents us from getting
to the second situation, but when we fix that it will be possible.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230504135425.2748672-2-peter.maydell@linaro.org
OpenRISC defines tininess to be detected before rounding. Setup qemu to
obey this.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Store the PC to ensure the correct value can be read in the exception
handler.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Instead of trying to unify all operations on uint64_t, use
mmu_lookup() to perform the basic tlb hit and resolution.
Create individual functions to handle access by size.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of trying to unify all operations on uint64_t, pull out
mmu_lookup() to perform the basic tlb hit and resolution.
Create individual functions to handle access by size.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of playing with offsetof in various places, use
MMUAccessType to index an array. This is easily defined
instead of the previous dummy padding array in the union.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Like cpu_in_exclusive_context, but also true if
there is no other cpu against which we could race.
Use it in tb_flush as a direct replacement.
Use it in cpu_loop_exit_atomic to ensure that there
is no loop against cpu_exec_step_atomic.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mark all memory operations that are not already marked with UNALIGN.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In gen_ldx/gen_stx, the only two locations for memory operations,
mark the operation as either aligned (softmmu) or unaligned
(user-only, as if emulated by the kernel).
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Memory operations that are not already aligned, or otherwise
marked up, require addition of ctx->default_tcg_memop_mask.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Adjust the softmmu tlb to use R0+R1, not any of the normally available
registers. Since we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than zero-extend the guest address into a register,
use an add instruction which zero-extends the second input.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_{TMP1,TMP2,R0}, not any of the normally
available registers. Now that we handle overlap betwen inputs and
helper arguments, we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allocate TCG_REG_TMP2. Use R0, TMP1, TMP2 instead of any of
the normally allocated registers for the tlb load.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_TMP[0-3], not any of the normally available
registers. Now that we handle overlap betwen inputs and helper arguments,
and have eliminated use of A0, we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Compare the address vs the tlb entry with sign-extended values.
This simplifies the page+alignment mask constant, and the
generation of the last byte address for the misaligned test.
Move the tlb addend load up, and the zero-extension down.
This frees up a register, which allows us use TMP3 as the returned base
address register instead of A0, which we were using as a 5th temporary.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While performing the load in the delay slot of the call to the common
bswap helper function is cute, it is not worth the added complexity.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.
We are no longer filling the call or return branch
delay slots, nor are we tail-calling for the store,
but this seems a small price to pay.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_st_helper_args. This eliminates the use of a tail call to
the store helper. This may or may not be an improvement, depending on
the call/return branch prediction of the host microarchitecture.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. These and their subroutines
use the existing knowledge of the host function call abi
to load the function call arguments and return results.
These will be used to simplify the backends in turn.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_prepare_user_ldst, and some code that lived in both tcg_out_qemu_ld
and tcg_out_qemu_st into one function that returns HostAddress and
TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns TCGReg and TCGLabelQemuLdst.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_out_zext_addr_if_32_bit, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns
HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, and some code that lived
in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that
returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since tcg_out_{ld,st}_helper_args, the slow path no longer requires
the address argument to be set up by the tlb load sequence. Use a
plain load for the addend and indexed addressing with the original
input address register.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label,
tcg_out_test_alignment, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function
that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The round-robin scheduler will iterate over the CPU list with an
assigned budget until the next timer expiry and may exit early because
of a TB exit. This is fine under normal operation but with icount
enabled and SMP it is possible for a CPU to be starved of run time and
the system live-locks.
For example, booting a riscv64 platform with '-icount
shift=0,align=off,sleep=on -smp 2' we observe a livelock once the kernel
has timers enabled and starts performing TLB shootdowns. In this case
we have CPU 0 in M-mode with interrupts disabled sending an IPI to CPU
1. As we enter the TCG loop, we assign the icount budget to next timer
interrupt to CPU 0 and begin executing where the guest is sat in a busy
loop exhausting all of the budget before we try to execute CPU 1 which
is the target of the IPI but CPU 1 is left with no budget with which to
execute and the process repeats.
We try here to add some fairness by splitting the budget across all of
the CPUs on the thread fairly before entering each one. The CPU count
is cached on CPU list generation ID to avoid iterating the list on each
loop iteration. With this change it is possible to boot an SMP rv64
guest with icount enabled and no hangs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427020925.51003-3-quic_jiles@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use target_words_bigendian() instead of an ifdef.
Remove CONFIG_RISCV_DIS from the check for riscv as a host; this is
a poisoned identifier, and anyway will always be set by meson.build
when building on a riscv host.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230508133745.109463-3-thuth@redhat.com>
[rth: Type change done in a separate patch]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We'd like to move disas.c into the common code source set, where
CONFIG_USER_ONLY is not available anymore. So we have to move
the related code into a separate file instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230508133745.109463-2-thuth@redhat.com>
[rth: Type change done in a separate patch]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Migration Pull request (20230509 vintage) take 2
Hi
In this take 2:
- Change uint -> uint32_t to fix mingw32 compilation.
Please apply.
[take 1]
In this PULL request:
- 1st part of colo support for multifd (lukas)
- 1st part of disabling colo option (vladimir)
Please, apply.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRb3dgACgkQ9IfvGFhy
# 1yNLBxAAwHiAOdSPS7TqJXH2/PkBKsd42XMtWzC9UowZ6SUdQi0Q2bQUBnygJ8BA
# 59yLOTPdwUhaPWk4KsyKM2znOCJ+f9MF5V4QXbyILf1WCAq6d+mtPwArnYF1TRwi
# XIewVDeRopdOO5lnWGcfAKZZ5WIDzA/bn6NiGLi+pQa5HGyk84Bk+tFa8kJI6xBL
# 5CWfhNTcxDNYRFg/z/9YVirkuxIXEEL6VEeRFV+pmFuj05q9bysWJkLFoEcFNawO
# gp1foHDkU7wHmHDJ3D4AVTm3TW641ft1wdlHIHZRoOiIIu3EUOoDEVVsaCfdxrY8
# pPJZ5m37wb52GIaCJmigG8rkHxIJ8xKLk4HKu4umDqFq5jZQ2krnnj7AkQhpp7p2
# aEIOXJQQq7XCsKpuvSUIexPv4gbN5SEYKi7XKoOPe3sZ03Rkn0I5xY3KSyMQMamP
# jtk8tNlRA+9Wug82eb/FtIKDj3//4SbuQOJEdRXjKJBldd3mtWTT/FRj/8oo96/p
# hmTu/cGDrP5qgtWpz0kKI/xaBf8at1nwpDgdEzOjRw4zf6xQHFjbXgJ7tQBH/JUI
# T3A9pdiXN6QdRupcWUSV0iJsfS/5i3mOUTA/C529qGXabSnZzfMK+unL/I8N02yt
# 83o7jSg22etMjaS1c+VuDmzKCAfuZloDZv2Bms/+yM/8k8Xe5S4=
# =vbqf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 10 May 2023 07:09:28 PM BST
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20230509-pull-request' of https://gitlab.com/juan.quintela/qemu:
migration: block incoming colo when capability is disabled
migration: disallow change capabilities in COLO state
migration: process_incoming_migration_co: simplify code flow around ret
migration: drop colo_incoming_thread from MigrationIncomingState
build: move COLO under CONFIG_REPLICATION
colo: make colo_checkpoint_notify static and provide simpler API
block/meson.build: prefer positive condition for replication
multifd: Add the ramblock to MultiFDRecvParams
ram: Let colo_flush_ram_cache take the bitmap_mutex
ram: Add public helper to set colo bitmap
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
COLO is not listed as running state in migrate_is_running(), so, it's
theoretically possible to disable colo capability in COLO state and the
unexpected error in migration_iteration_finish() is reachable.
Let's disallow that in qmp_migrate_set_capabilities. Than the error
becomes absolutely unreachable: we can get into COLO state only with
enabled capability and can't disable it while we are in COLO state. So
substitute the error by simple assertion.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230428194928.1426370-10-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We don't allow to use x-colo capability when replication is not
configured. So, no reason to build COLO when replication is disabled,
it's unusable in this case.
Note also that the check in migrate_caps_check() is not the only
restriction: some functions in migration/colo.c will just abort if
called with not defined CONFIG_REPLICATION, for example:
migration_iteration_finish()
case MIGRATION_STATUS_COLO:
migrate_start_colo_process()
colo_process_checkpoint()
abort()
It could probably make sense to have possibility to enable COLO without
REPLICATION, but this requires deeper audit of colo & replication code,
which may be done later if needed.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230428194928.1426370-4-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
colo_checkpoint_notify() is mostly used in colo.c. Outside we use it
once when x-checkpoint-delay migration parameter is set. So, let's
simplify the external API to only that function - notify COLO that
parameter was set. This make external API more robust and hides
implementation details from external callers. Also this helps us to
make COLO module optional in further patch (i.e. we are going to add
possibility not build the COLO module).
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Message-Id: <20230428194928.1426370-3-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Testing updates:
- fix up xtensa docker container base to current Debian
- document breakpoint and watchpoint support
- clean up the ansible scripts for Ubuntu 22.04
- add a minimal device profile
- drop https on mipsdistros URL
- fix Kconfig bug for XLNX_VERSAL
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmRbspsACgkQ+9DbCVqe
# KkSBowf+JjcVxZMb2kS8pV8WEdAq+fceBYI7mDBSEu0DFqZF+w0XSM+T+VZHyZ8+
# QmPeE+McKBUXvq/V4osPnDVVZfBKmwzFN548M6qIMLUbHjbDp94DtudNkAZ0ejhc
# +Ack73vzTiTWsGmBaqQxZlcYkZNZiZAhQsTF6cPwna74cDkcRghvd/Zxzy831rVB
# gVWhbEkk7SBQhJ+PqRIeso60DbWvCaVDMrkPc2WX8kup6QltbUpoayS/eNOtBkfA
# C557eOBxoM8s0cu33O780K5mCPCyk1IaIynvZtmkty0DXUSd5y9SNpsofhAY7BGy
# 4QdlolLygDgEC3s4bMULGy04nzaylw==
# =a+97
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 10 May 2023 04:04:59 PM BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-testing-updates-100523-1' of https://gitlab.com/stsquad/qemu:
hw/arm: Select XLNX_USB_SUBSYS for xlnx-zcu102 machine
tests/avocado: use http for mipsdistros.mips.com
gitlab: enable minimal device profile for aarch64 --disable-tcg
gitlab: add ubuntu-22.04-aarch64-without-defaults
scripts/ci: clean-up the 20.04/22.04 confusion in ansible
scripts/ci: add gitlab-runner to kvm group
docs: document breakpoint and watchpoint support
tests/docker: bump the xtensa base to debian:11-slim
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This machine hardcodes initialization of the USB device, so select the
corresponding Kconfig. It is not enough to have it as "default y if
XLNX_VERSAL" at usb/Kconfig because building --without-default-devices
disables the default selection resulting in:
$ ./qemu-system-aarch64 -M xlnx-zcu102
qemu-system-aarch64: missing object type 'usb_dwc3'
Aborted (core dumped)
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230208192654.8854-8-farosas@suse.de>
Message-Id: <20230503091244.1450613-8-alex.bennee@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
As the cached assets have fallen out of our cache new attempts to
fetch these binaries fail hard due to certificate expiry. It's hard
to find a contact email for the domain as the root page of mipsdistros
throws up some random XML. I suspect Amazon are merely the hosts.
The checksums should protect us from any man-in-the-middle type
attacks.
Message-Id: <20230503091244.1450613-22-alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
We have a bunch of references to 20.04 (which s390x is still on)
although we are basically building on 22.04 now. Clean up the textual
references and use lcitool to generate the full package list to be
consistent.
We can drop "Install packages to build QEMU on Ubuntu on non-s390x" as
when we upgrade the s390x builder to 22.04 it won't need this
workaround.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230503091244.1450613-19-alex.bennee@linaro.org>
Block layer patches
- Graph locking, part 3 (more block drivers)
- Compile out assert_bdrv_graph_readable() by default
- Add configure options for vmdk, vhdx and vpc
- Fix use after free in blockdev_mark_auto_del()
- migration: Attempt disk reactivation in more failure scenarios
- Coroutine correctness fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmRbi6ERHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9Y66A//ZRk/0M6EZUJPAKG6m/XLTDNrOCNBZ1Tu
# kBGvxXsVQZMt4gGpBad4l2INN6IQKTIdIf+lK71EpxMPmFG6xK32btn38yywCAfQ
# lr1p5nR0Y/zSlT+XzP4yKy/CtQl6U0rkysmjCIk35bZc7uLy6eo4oFR4vmhRRt2M
# UGltB50/Nicx12YFufVjodbhv+apxTGwS2XHatmwqtjKeYReSz8mJHslEy6DvC8m
# ziNThD6YBy7hMktAhNaqUqtZD0OSWz66VMObco/4i2++sOAMZIspXQkjv3AjH74e
# lmgMhNc/xgJKPwFBPsj6F7dOKxwhdKD9jzZlx3yaBtAU18hpWX54QWuA3/CFlySc
# 5QbbqIstFTC8lqoRWThQrcHHRKbDBJCP4ImRXUIKhuPaxEzXA9zb3+f3QPTIjLSA
# KO7nxuSmO+tC7hQ1K9kAjRZHWlxxAk4clk+7UrK4UrWgGxfCUKgFg4Tyx7RrpwA6
# j4L5vwAY60LW74tikWe9xJx2QbdRoWBTTZhUyirbO7rLX1e8mS1nUWmtIsFSQxAq
# Z7nX7ygN0WEF+8qIsk3jTGaEeJoCM7+7B+X2RpSy0sftFjFYmybIiUgLMO7e+ozK
# rvUPnwlHAbGCVIJOKrUDj3cGt6k3/xnrTajUc7pCB3KKqG4pe+IlZuHyKIUMActb
# dBLaBnj0M2o=
# =hw9E
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 10 May 2023 01:18:41 PM BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (28 commits)
block: compile out assert_bdrv_graph_readable() by default
block: Mark bdrv_refresh_limits() and callers GRAPH_RDLOCK
block: Mark bdrv_recurse_can_replace() and callers GRAPH_RDLOCK
block: Mark bdrv_query_block_graph_info() and callers GRAPH_RDLOCK
block: Mark bdrv_query_bds_stats() and callers GRAPH_RDLOCK
block: Mark BlockDriver callbacks for amend job GRAPH_RDLOCK
block: Mark bdrv_co_debug_event() GRAPH_RDLOCK
block: Mark bdrv_co_get_info() and callers GRAPH_RDLOCK
block: Mark bdrv_co_get_allocated_file_size() and callers GRAPH_RDLOCK
mirror: Require GRAPH_RDLOCK for accessing a node's parent list
vhdx: Require GRAPH_RDLOCK for accessing a node's parent list
nbd: Mark nbd_co_do_establish_connection() and callers GRAPH_RDLOCK
nbd: Remove nbd_co_flush() wrapper function
block: .bdrv_open is non-coroutine and unlocked
graph-lock: Fix GRAPH_RDLOCK_GUARD*() to be reader lock
graph-lock: Add GRAPH_UNLOCKED(_PTR)
test-bdrv-drain: Don't modify the graph in coroutines
iotests: Test resizing image attached to an iothread
block: Don't call no_coroutine_fns in qmp_block_resize()
block: bdrv/blk_co_unref() for calls in coroutine context
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.
The --enable-debug-graph-lock option is also enabled by the more general
--enable-debug option.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501173443.153062-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_co_debug_event() need to hold a reader lock for the graph.
Unfortunately we cannot use a co_wrapper_bdrv_rdlock (i.e. make the
coroutine wrapper a no_coroutine_fn), because the function is called
(using the BLKDBG_EVENT macro) by mixed functions that run both in
coroutine and non-coroutine context (for example many of the functions
in qcow2-cluster.c and qcow2-refcount.c).
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-16-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drivers were a bit confused about whether .bdrv_open can run in a
coroutine and whether or not it holds a graph lock.
It cannot keep a graph lock from the caller across the whole function
because it both changes the graph (requires a writer lock) and does I/O
(requires a reader lock). Therefore, it should take these locks
internally as needed.
The functions used to be called in coroutine context during image
creation. This was buggy for other reasons, and as of commit 32192301,
all block drivers go through no_co_wrappers. So it is not called in
coroutine context any more.
Fix qcow2 and qed to work with the correct assumptions: The graph lock
needs to be taken internally instead of just assuming it's already
there, and the coroutine path is dead code that can be removed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-9-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For some functions, it is part of their interface to be called without
holding the graph lock. Add a new macro to document this.
The macro expands to TSA_EXCLUDES(), which is a relatively weak check
because it passes in cases where the compiler just doesn't know if the
lock is held. Function pointers can't be checked at all. Therefore, its
primary purpose is documentation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-7-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
test-bdrv-drain contains a few test cases that are run both in coroutine
and non-coroutine context. Running the entire code including the setup
and shutdown in coroutines is incorrect because graph modifications can
generally not happen in coroutines.
Change the test so that creating and destroying the test nodes and
BlockBackends always happens outside of coroutine context.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230504115750.54437-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This tests that trying to resize an image with QMP block_resize doesn't
hang or otherwise fail when the image is attached to a device running in
an iothread.
This is a regression test for the recent fix that changed
qmp_block_resize, which is a coroutine based QMP handler, to avoid
calling no_coroutine_fns directly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230509134133.373408-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Migration code can call bdrv_activate() in coroutine context, whereas
other callers call it outside of coroutines. As it calls other code that
is not supposed to run in coroutines, standardise on running outside of
coroutines.
This adds a no_co_wrapper to switch to the main loop before calling
bdrv_activate().
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is a bdrv_co_getlength() now, which should be used in coroutine
context.
This requires adding GRAPH_RDLOCK to some functions so that this still
compiles with TSA because bdrv_co_getlength() is GRAPH_RDLOCK.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit fe904ea824 added a fail_inactivate label, which tries to
reactivate disks on the source after a failure while s->state ==
MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
qemu_savevm_state_complete_precopy() failed. This failure to
reactivate is also present in commit 6039dd5b1c (also covering the new
s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
s->block_inactive is set more reliably).
Consolidate the two labels back into one - no matter HOW migration is
failed, if there is any chance we can reach vm_start() after having
attempted inactivation, it is essential that we have tried to restart
disks before then. This also makes the cleanup more like
migrate_fd_cancel().
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502205212.134680-1-eblake@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Socket paths need to be short to avoid failures. This is why there is a
iotests.sock_dir (defaulting to /tmp) separate from the disk image base
directory.
Make use of it to fix failures in too deeply nested test directories.
Fixes: ab7f7e67a7
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230503165019.8867-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
job_cancel_locked() drops the job list lock temporarily and it may call
aio_poll(). We must assume that the list has changed after this call.
Also, with unlucky timing, it can end up freeing the job during
job_completed_txn_abort_locked(), making the job pointer invalid, too.
For both reasons, we can't just continue at block_job_next_locked(job).
Instead, start at the head of the list again after job_cancel_locked()
and skip those jobs that we already cancelled (or that are completing
anyway).
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230503140142.474404-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need for the AioContext lock in aio_wait_bh_oneshot().
It's easy to remove the lock from existing callers and then switch from
AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED() in aio_wait_bh_oneshot().
Document that the AioContext lock should not be held across
aio_wait_bh_oneshot(). Holding a lock across aio_poll() can cause
deadlock so we don't want callers to do that.
This is a step towards getting rid of the AioContext lock.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230404153307.458883-1-stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QAPI patches patches for 2023-05-09
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmRbUEYSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTmzEP/3pDpVxpP7xXLevl2vFqkFyHEjc0L3N4
# x//ljgQojAdM6WU3e0qqOfp/NE2ktUg5D3z+QNiVP1/xXv/dtMGATdG+X9AZs0US
# XnhdicYdBng8bGuhlNuNY8QJ/I4ALwUR44LVOYibVohv2RVYWBapGiHowpyTyABq
# sFSHrj/cgvTMUn53yp7veZTo6rWG6RU/D5uUTOMsvKeAoHoOXMyBxV01SCt84t/J
# pcelINcriP6cQVzgfm1B39UNa0IxinGxEx/IIaxz5Ju66G05HTs4CsBFAF6/0QI/
# 3YerGWPt9fF6+qYNn21Gg9CL1fHHppNqTXkcuTeGn/Ohg53bosktti5Ysn73vtpR
# GWsJr6M4KQ1SwEbZIiFZCS3A4VTbRcr7WkXets39pcpxGDlNisi+zfV95kNo09xR
# hxi8SuWgb2OfQpVs/71eunp+PM1ZQsODurcy4x0/rlYJfhk53kQSMRtlfy5Cn6uY
# +weWUgygBSWG/w0qanWWK5TF1DNlRKzbix6cmMuGGKcpyF7EMWE1kqmjmmu7CQvM
# a3aPTqGtUt0LeqBQIhmeq/jEwd3vxQa1R85gd0/0sWxEMHkPXVfVoaryiaWAykye
# 7r+c8o/41c44zs8YxdZrz72su9fqKC/TeVf5soU46ZucmH8D6f7QHy+s1ec2PEjY
# l6cRIXTXHeQe
# =j6cJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 10 May 2023 09:05:26 AM BST
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-qapi-2023-05-09-v2' of https://repo.or.cz/qemu/armbru:
qapi: Reformat doc comments to conform to current conventions
qga/qapi-schema: Reformat doc comments to conform to current conventions
docs/devel/qapi-code-gen: Update doc comment conventions
qapi: Section parameter @indent is no longer used, drop
qapi: Relax doc string @name: description indentation rules
qapi: Rewrite parsing of doc comment section symbols and tags
qapi: Fix argument description indentation stripping
tests/qapi-schema/doc-good: Improve argument description tests
tests/qapi-schema/doc-good: Improve a comment
qapi/dump: Indent bulleted lists consistently
qapi: Tidy up a slightly awkward TODO comment
sphinx/qapidoc: Do not emit TODO sections into user manuals
Revert "qapi: BlockExportRemoveMode: move comments to TODO"
meson: Fix to make QAPI generator output depend on main.py
qapi: Fix crash on stray double quote character
docs/devel/qapi-code-gen: Turn FIXME admonitions into comments
docs/devel/qapi-code-gen: Clean up use of quotes a bit
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
VFIO updates 2023-05-09
* Add vf-token device option allowing QEMU to assign VFs where the PF
is managed by a userspace driver. (Minwoo Im)
* Skip log_sync during migration setup as a potential source of failure
and likely source of redundancy. (Avihai Horon)
* Virtualize PCIe Resizable BAR capability rather than hiding it,
exposing only the current size as available. (Alex Williamson)
# -----BEGIN PGP SIGNATURE-----
#
# iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmRaqfobHGFsZXgud2ls
# bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiwNYP/2KtCbKqylnGPuwLbRMP
# HC4Id4mme7jUribmhM7FP57nQrb0tgnQoGvalkmB6M3833e3p4ivH2ezTyPxIawx
# UH4mAEBtR03rxh54eVBbOvDVf+XHd6qll/rFw5dBI0C5s7JQyMOourNRLTZLvqzD
# 2bwI7dfQzWbXWPj8QGPmDti9wbeATZ3RjqC7onoWq6A6Cw4aRGj1gHBQH9v81iA+
# m8hnZh+e5eFkQRc4mPXxFjm1Kw6ZYXWGoEEZrYPXvQn9+3MDCLcNb++KIrLsGujP
# qOnZG534vs+EZtUsGI8F02CBBXMAQFuBZhxCtuuG8iI9OQSE6R3E29iIc0Lpz5aO
# s8rN5OW4m7wXPdGkU1/7/N7kdeZvg+R8Jc4ozx3Mez3eSFbVkABSSX9vyvdHAezi
# 02Np1+ZBldZWBbBhYbWfqhvcg4iYNnHknSkS2CYY8jdsGttbrNY2f7Xllf3KC/Iv
# 6Un5WpU//0LuJjmH6onzswUUEmulchzR7OpBj68jFsB8rnTaZWM4Sqb/Jx+KXlRB
# BnNck0PCPoblpT8lgjAD3H9NaXx3mdVsml8i/7YIZjx8Zc4eanRGlsH9DmnHbB7U
# i4orDvL3SR3ZKVy6Zssti5jt8GwrEnqg97uTbS/jiTai1tOCP9n6U4T/wslHIUR4
# rIxvyJnmqrPAiWtVF+0cvGmT
# =VTJU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 09 May 2023 09:15:54 PM BST
# gpg: using RSA key 42F6C04E540BD1A99E7B8A90239B9B6E3BB08B22
# gpg: issuer "alex.williamson@redhat.com"
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [undefined]
# gpg: aka "Alex Williamson <alex@shazbot.org>" [undefined]
# gpg: aka "Alex Williamson <alwillia@redhat.com>" [undefined]
# gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22
* tag 'vfio-updates-20230509.0' of https://gitlab.com/alex.williamson/qemu:
vfio/pci: Static Resizable BAR capability
vfio/migration: Skip log_sync during migration SETUP state
vfio/pci: add support for VF token
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Change
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.
to
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.
See recent commit "qapi: Relax doc string @name: description
indentation rules" for rationale.
Reflow paragraphs to 70 columns width, and consistently use two spaces
to separate sentences.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the reflown
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-18-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Lukas Straub <lukasstraub2@web.de>
[Straightforward conflicts in qapi/audio.json qapi/misc-target.json
qapi/run-state.json resolved]
Change
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.
to
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.
See recent commit "qapi: Relax doc string @name: description
indentation rules" for rationale.
Reflow paragraphs to 70 columns width, and consistently use two spaces
to separate sentences.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the reflown
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-17-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
The commit before previous relaxed the indentation rules to let us
improve the doc comment conventions. This commit changes the written
conventions. The next commits will update QAPI schemas to conform to
them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-16-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
The QAPI schema doc comment language provides special syntax for
command and event arguments, struct and union members, alternate
branches, enumeration values, and features: descriptions starting with
"@name:".
By convention, we format them like this:
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit,
# sed do eiusmod tempor incididunt ut labore et dolore
# magna aliqua.
Okay for names as short as "name", but we have much longer ones. Their
description gets squeezed against the right margin, like this:
# @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization could
# not avoid copying dirty pages. This is between
# 0 and @dirty-sync-count * @multifd-channels.
# (since 7.1)
The description text is effectively just 50 characters wide. Easy
enough to read, but can be cumbersome to write.
The awkward squeeze against the right margin makes people go beyond it,
which produces two undesirables: arguments about style, and descriptions
that are unnecessarily hard to read, like this one:
# @postcopy-vcpu-blocktime: list of the postcopy blocktime per vCPU. This is
# only present when the postcopy-blocktime migration capability
# is enabled. (Since 3.0)
We could instead format it like
# @postcopy-vcpu-blocktime:
# list of the postcopy blocktime per vCPU. This is only present
# when the postcopy-blocktime migration capability is
# enabled. (Since 3.0)
or, since the commit before previous, like
# @postcopy-vcpu-blocktime:
# list of the postcopy blocktime per vCPU. This is only present
# when the postcopy-blocktime migration capability is
# enabled. (Since 3.0)
However, I'd rather have
# @postcopy-vcpu-blocktime: list of the postcopy blocktime per vCPU.
# This is only present when the postcopy-blocktime migration
# capability is enabled. (Since 3.0)
because this is how rST field and option lists work.
To get this, we need to let the first non-blank line after the
"@name:" line determine expected indentation.
This fills up the indentation pitfall mentioned in
docs/devel/qapi-code-gen.rst. A related pitfall still exists. Update
the text to show it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-14-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Work around lack of walrus operator in Python 3.7 and older]
To recognize a line starting with a section symbol and or tag, we
first split it at the first space, then examine the part left of the
space. We can just as well examine the unsplit line, so do that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-13-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Work around lack of walrus operator in Python 3.7 and older]
* target/i386: improved EPYC models
* more removal of mb_read/mb_set
* bump _WIN32_WINNT to the Windows 8 API
* fix for modular builds with --disable-system
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRZK7wUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroObngf8D6A5l1QQAnImRrZAny6HZV/9xseD
# 9QhkUW3fxXlUhb8tXomv2BlT8h9GzLIN6aWvcCotT+xK3kAX7mRcYKgPMr9CYL7y
# vev/hh+B6RY1CJ/xPT09/BMVjkj50AL0O/OuWMhcQ5nCO7F2sdMjMrsYqqeZcjYf
# zx9RTX7gVGt+wWFHxgCgdfL0kfgzexK55YuZU0vLzcA+pYsZWoEfW+fKBIf4rzDV
# r9M6mDBUkHBQ0rIVC3QFloAXnYb1JrpeqqL2i2qwhAkLz8LyGqk3lZF20hE/04im
# XZcZjWO5pxAxIEPeTken+2x1n8tn2BLkMtvwJdV5TpvICCFRtPZlbH79qw==
# =rXLN
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 08 May 2023 06:05:00 PM BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
meson: leave unnecessary modules out of the build
docs: clarify --without-default-devices
target/i386: Add EPYC-Genoa model to support Zen 4 processor series
target/i386: Add VNMI and automatic IBRS feature bits
target/i386: Add missing feature bits in EPYC-Milan model
target/i386: Add feature bits for CPUID_Fn80000021_EAX
target/i386: Add a couple of feature bits in 8000_0008_EBX
target/i386: Add new EPYC CPU versions with updated cache_info
target/i386: allow versioned CPUs to specify new cache_info
include/qemu/osdep.h: Bump _WIN32_WINNT to the Windows 8 API
MAINTAINERS: add stanza for Kconfig files
tb-maint: do not use mb_read/mb_set
call_rcu: stop using mb_set/mb_read
test-aio-multithread: simplify test_multi_co_schedule
test-aio-multithread: do not use mb_read/mb_set for simple flags
rcu: remove qatomic_mb_set, expand comments
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The PCI Resizable BAR (ReBAR) capability is currently hidden from the
VM because the protocol for interacting with the capability does not
support a mechanism for the device to reject an advertised supported
BAR size. However, when assigned to a VM, the act of resizing the
BAR requires adjustment of host resources for the device, which
absolutely can fail. Linux does not currently allow us to reserve
resources for the device independent of the current usage.
The only writable field within the ReBAR capability is the BAR Size
register. The PCIe spec indicates that when written, the device
should immediately begin to operate with the provided BAR size. The
spec however also notes that software must only write values
corresponding to supported sizes as indicated in the capability and
control registers. Writing unsupported sizes produces undefined
results. Therefore, if the hypervisor were to virtualize the
capability and control registers such that the current size is the
only indicated available size, then a write of anything other than
the current size falls into the category of undefined behavior,
where we can essentially expose the modified ReBAR capability as
read-only.
This may seem pointless, but users have reported that virtualizing
the capability in this way not only allows guest software to expose
related features as available (even if only cosmetic), but in some
scenarios can resolve guest driver issues. Additionally, no
regressions in behavior have been reported for this change.
A caveat here is that the PCIe spec requires for compatibility that
devices report support for a size in the range of 1MB to 512GB,
therefore if the current BAR size falls outside that range we revert
to hiding the capability.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230505232308.2869912-1-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Currently, VFIO log_sync can be issued while migration is in SETUP
state. However, doing this log_sync is at best redundant and at worst
can fail.
Redundant -- all RAM is marked dirty in migration SETUP state and is
transferred only after migration is set to ACTIVE state, so doing
log_sync during migration SETUP is pointless.
Can fail -- there is a time window, between setting migration state to
SETUP and starting dirty tracking by RAM save_live_setup handler, during
which dirty tracking is still not started. Any VFIO log_sync call that
is issued during this time window will fail. For example, this error can
be triggered by migrating a VM when a GUI is active, which constantly
calls log_sync.
Fix it by skipping VFIO log_sync while migration is in SETUP state.
Fixes: 758b96b61d ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/r/20230403130000.6422-1-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When an argument's description starts on the line after the "#arg: "
line, indentation is stripped only from the description's first line,
as demonstrated by the previous commit. Moreover, subsequent lines
with less indentation are not rejected.
Make the first line's indentation the expected indentation for the
remainder of the description. This fixes indentation stripping, and
also requires at least that much indentation.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-12-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Improve the comments to better describe what they test.
Cover argument description starting on a new line indented. This
style isn't documented in docs/devel/qapi-code-gen.rst. qapi-gen.py
accepts it, but messes up indentation: it's stripped from the first
line, not subsequent ones. The next commit will fix this.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-11-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
The QAPI generator doesn't reject undocumented members and
features (yet). doc-good.json covers this, with clear "is
undocumented" notes to signal intent.
Except for @Variant1 member @var1, where it's "(but no @var: line)".
Less clear. Replace by "@var1 is undocumented".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-10-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Documentation of dump-guest-memory contains two bulleted lists. The
first one is indented, the second one isn't. Delete the first one's
indentation for a more consistent look.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-9-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
MigrateSetParameters has a TODO comment sitting right behind its doc
comment. I wrote it this way to keep it out of the manual, but that
reason is not obvious.
The previous commit (sphinx/qapidoc: Do not emit TODO sections into
user manuals) lets me move it into the doc comment as a TODO section.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-8-armbru@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
QAPI doc comments are for QMP users: they go into the "QEMU QMP
Reference Manual" and the "QEMU Storage Daemon QMP Reference Manual".
The doc comment TODO sections are for somebody else, namely for the
people who can do: developers. Do not emit them into the user
manuals.
This elides the following TODOs:
* SchemaInfoCommand
# TODO: @success-response (currently irrelevant, because it's QGA, not QMP)
This is a note to developers adding introspection to the guest
agent. It makes no sense to users.
* @query-hotpluggable-cpus
# TODO: Better documentation; currently there is none.
This is a reminder for developers. It doesn't help users.
* @device_add
# TODO: This command effectively bypasses QAPI completely due to its
# "additional arguments" business. It shouldn't have been added to
# the schema in this form. It should be qapified properly, or
# replaced by a properly qapified command.
Likewise.
Eliding them is an improvement.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-7-armbru@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
When the lexer chokes on a stray character, its shows the characters
until the next structural character in the error message. It uses a
regular expression to match a non-empty string of non-structural
characters. Bug: the regular expression treats '"' as structural.
When the lexer chokes on '"', the match fails, and trips
must_match()'s assertion. Fix the regular expression.
Fixes: 14c3279502 (qapi: Improve reporting of lexical errors)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-4-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
We have two FIXME notes. These FIXMEs are for QAPI developers. They
are not useful for QAPI schema developers. They are marked up as
admonitions, which makes them look important in generated HTML.
Turn them into comments. QAPI developers will still see them (they
read and write the .rst). QAPI schema developers may still see
them (if they read the .rst instead of the generated .html), but "this
is just for QAPI developers" should be more obvious.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-3-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Migration PULL request (20230508 edition, take 2)
Hi
This is just the compression bits of the Migration PULL request for
20230428. Only change is that we don't run the compression tests by
default.
The problem already exist with compression code. The test just show
that it don't work.
- Add migration tests for (old) compress migration code (lukas)
- Make compression code independent of ram.c (lukas)
- Move compression code into ram-compress.c (lukas)
Please apply, Juan.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRZRMwACgkQ9IfvGFhy
# 1yOdixAA1fOLanaYMUJZGLZ9sVTt7rDc4AEPRGkQOYYZNGK3LHaG2Dx9ob2/CEkS
# /YPp9Oth9QAYHZgiI2Xx8GSg98PRVr9b/GlQPseoCOFXnUL89rTpQtxQq4CV41E6
# AA5Dr8Z07hsr47ERQERFfDGD4zsvpn+NWM1ZBy+CCilf/o8UU4eIyfRF34YgSScv
# FVdWM4czUKei9fe2Go1KnMCz1GnT/6epl47Hs8zn9WAEeUfLILp7dbkbNq26F65G
# 8YC8YnrikxU+2j+NIyIbRxbIdjR+JUbR14AyezwWZ2zGbirwWN1DP2WQx0QIZOqM
# ZuCqIDj5HpNSlHmShI0gNDfPvs+iM+sFSwQ7JE8Q03hlES9HF5c+MOr3Pl3J91hH
# EEmkk5gBJ2v2tvBuHgwVAQ2UH1+XT+a7RXeoMU1iizc2sXRGDK12ZsyaAg4D0oaF
# eohzJk2j1QXcx/DNK2G5uhzwgKvKv1/+rHyYQFtg+XuWVVipSNwqRjDJkDANAYZP
# VwKOOqDd5lHLOIzE1j61Yu06DJhkSoMvz74RQlqnk+r1EKJcTUZL52uhQor//DaL
# ULpBsgYzoMUMrtw7myHxq4t0t6mmOtOkb0CvO8dTzkIV0YgIFTtPFB0ySXOFUFf5
# UoFoMFKlfbPpDsvTNEVErxpaG4FBwZNVt67V2KXQ53xRPShyBiQ=
# =SG8L
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 08 May 2023 07:51:56 PM BST
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'compression-code-pull-request' of https://gitlab.com/juan.quintela/qemu:
migration: Initialize and cleanup decompression in migration.c
ram-compress.c: Make target independent
ram compress: Assert that the file buffer matches the result
ram.c: Move core decompression code into its own file
ram.c: Move core compression code into its own file
ram.c: Remove last ram.c dependency from the core compress code
ram.c: Call update_compress_thread_counts from compress_send_queued_data
ram.c: Do not call save_page_header() from compress threads
ram.c: Reset result after sending queued data
ram.c: Dont change param->block in the compress thread
ram.c: Let the compress threads return a CompressResult enum
qtest/migration-test.c: Add postcopy tests with compress enabled
qtest/migration-test.c: Add tests with compress enabled
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
meson.build files choose whether to build modules based on foo.found()
expressions. If a feature is enabled (e.g. --enable-gtk), these expressions
are true even if the code is not used by any emulator, and this results
in an unexpected difference between modular and non-modular builds.
For non-modular builds, the files are not included in any binary, and
therefore the source files are never processed. For modular builds,
however, all .so files are unconditionally built by default, and therefore
a normal "make" tries to build them. However, the corresponding trace-*.h
files are absent due to this conditional:
if have_system
trace_events_subdirs += [
...
'ui',
...
]
endif
which was added to avoid wasting time running tracetool on unused trace-events
files. This causes a compilation failure; fix it by skipping module builds
entirely if (depending on the module directory) have_block or have_system
are false.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
--without-default-devices is a specialized option that should only be used
when configs/devices/ is changed manually.
Explain the model towards which we should tend, with respect to failures
to start guests and to run "make check".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds the support for AMD EPYC Genoa generation processors. The model
display for the new processor will be EPYC-Genoa.
Adds the following new feature bits on top of the feature bits from
the previous generation EPYC models.
avx512f : AVX-512 Foundation instruction
avx512dq : AVX-512 Doubleword & Quadword Instruction
avx512ifma : AVX-512 Integer Fused Multiply Add instruction
avx512cd : AVX-512 Conflict Detection instruction
avx512bw : AVX-512 Byte and Word Instructions
avx512vl : AVX-512 Vector Length Extension Instructions
avx512vbmi : AVX-512 Vector Byte Manipulation Instruction
avx512_vbmi2 : AVX-512 Additional Vector Byte Manipulation Instruction
gfni : AVX-512 Galois Field New Instructions
avx512_vnni : AVX-512 Vector Neural Network Instructions
avx512_bitalg : AVX-512 Bit Algorithms, add bit algorithms Instructions
avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
Quadword Instructions
avx512_bf16 : AVX-512 BFLOAT16 instructions
la57 : 57-bit virtual address support (5-level Page Tables)
vnmi : Virtual NMI (VNMI) allows the hypervisor to inject the NMI
into the guest without using Event Injection mechanism
meaning not required to track the guest NMI and intercepting
the IRET.
auto-ibrs : The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, unlike e.g.,
s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
resources automatically across CPL transitions.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <20230504205313.225073-8-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following featute bits.
vnmi: Virtual NMI (VNMI) allows the hypervisor to inject the NMI into the
guest without using Event Injection mechanism meaning not required to
track the guest NMI and intercepting the IRET.
The presence of this feature is indicated via the CPUID function
0x8000000A_EDX[25].
automatic-ibrs :
The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, unlike e.g.,
s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
resources automatically across CPL transitions.
The presence of this feature is indicated via the CPUID function
0x80000021_EAX[8].
The documention for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-7-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits for EPYC-Milan model and bump the version.
vaes : Vector VAES(ENC|DEC), VAES(ENC|DEC)LAST instruction support
vpclmulqdq : Vector VPCLMULQDQ instruction support
stibp-always-on : Single Thread Indirect Branch Prediction Mode has enhanced
performance and may be left Always on
amd-psfd : Predictive Store Forward Disable
no-nested-data-bp : Processor ignores nested data breakpoints
lfence-always-serializing : LFENCE instruction is always serializing
null-sel-clr-base : Null Selector Clears Base. When this bit is
set, a null segment load clears the segment base
These new features will be added in EPYC-Milan-v2. The "-cpu help" output
after the change will be.
x86 EPYC-Milan (alias configured by machine type)
x86 EPYC-Milan-v1 AMD EPYC-Milan Processor
x86 EPYC-Milan-v2 AMD EPYC-Milan Processor
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
c. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-6-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits.
no-nested-data-bp : Processor ignores nested data breakpoints.
lfence-always-serializing : LFENCE instruction is always serializing.
null-sel-cls-base : Null Selector Clears Base. When this bit is
set, a null segment load clears the segment base.
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-5-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits.
amd-psfd : Predictive Store Forwarding Disable:
PSF is a hardware-based micro-architectural optimization
designed to improve the performance of code execution by
predicting address dependencies between loads and stores.
While SSBD (Speculative Store Bypass Disable) disables both
PSF and speculative store bypass, PSFD only disables PSF.
PSFD may be desirable for the software which is concerned
with the speculative behavior of PSF but desires a smaller
performance impact than setting SSBD.
Depends on the following kernel commit:
b73a54321ad8 ("KVM: x86: Expose Predictive Store Forwarding Disable")
stibp-always-on :
Single Thread Indirect Branch Prediction mode has enhanced
performance and may be left always on.
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Message-Id: <20230504205313.225073-4-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce new EPYC cpu versions: EPYC-v4 and EPYC-Rome-v3.
The only difference vs. older models is an updated cache_info with
the 'complex_indexing' bit unset, since this bit is not currently
defined for AMD and may cause problems should it be used for
something else in the future. Setting this bit will also cause
CPUID validation failures when running SEV-SNP guests.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230504205313.225073-3-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
New EPYC CPUs versions require small changes to their cache_info's.
Because current QEMU x86 CPU definition does not support versioned
cach_info, we would have to declare a new CPU type for each such case.
To avoid the dup work, add "cache_info" in X86CPUVersionDefinition",
to allow new cache_info pointers to be specified for a new CPU version.
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230504205313.225073-2-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit cf60ccc330 ("cutils: Introduce bundle mechanism") abandoned
compatibility with Windows older than 8 - we should reflect this
in our _WIN32_WINNT and set it to the value that corresponds to
Windows 8.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230504081351.125140-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this series, "nothing to send" was handled by the file buffer
being empty. Now it is tracked via param->result.
Assert that the file buffer state matches the result.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Make compression interfaces take send_queued_data() as an argument.
Remove save_page_use_compression() from flush_compressed_data().
This removes the last ram.c dependency from the core compress code.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
save_page_header() accesses several global variables, so calling it
from multiple threads is pretty ugly.
Instead, call save_page_header() before writing out the compressed
data from the compress buffer to the migration stream.
This also makes the core compress code more independend from ram.c.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
And take the param->mutex lock for the whole section to ensure
thread-safety.
Now, it is explicitly clear if there is no queued data to send.
Before, this was handled by param->file stream being empty and thus
qemu_put_qemu_file() not sending anything.
This will be used in the next commits to move save_page_header()
out of compress code.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Instead introduce a extra parameter to trigger the compress thread.
Now, when the compress thread is done, we know what RAMBlock and
offset it did compress.
This will be used in the next commits to move save_page_header()
out of compress code.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This will be used in the next commits to move save_page_header()
out of compress code.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add postcopy tests with compress enabled to ensure nothing breaks
with the refactoring in the next commits.
preempt+compress is blocked, so no test needed for that case.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There has never been tests for migration with compress enabled.
Add suitable tests, testing with compress-wait-thread = false
too.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The load side can use a relaxed load, which will surely happen before
the work item is run by async_safe_run_on_cpu() or before double-checking
under mmap_lock. The store side can use an atomic RMW operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a store-release when enqueuing a new call_rcu, and a load-acquire
when dequeuing; and read the tail after checking that node->next is
consistent, which is the standard message passing pattern and it is
clearer than mb_read/mb_set.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of using qatomic_mb_{read,set} mindlessly, just use a per-coroutine
flag that requires no synchronization.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The remaining use of mb_read/mb_set is just to force a thread to exit
eventually. It does not order two memory accesses and therefore can be
just read/set.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.