Until now, array properties are actually implemented with a hack that
uses multiple properties on the QOM level: a static "foo-len" property
and after it is set, dynamically created "foo[i]" properties.
In external interfaces (-device on the command line and device_add in
QMP), this interface was broken by commit f3558b1b ('qdev: Base object
creation on QDict rather than QemuOpts') because QDicts are unordered
and therefore it could happen that QEMU tried to set the indexed
properties before setting the length, which fails and effectively makes
array properties inaccessible. In particular, this affects the 'ports'
property of the 'rocker' device, which used to be configured like this:
-device rocker,len-ports=2,ports[0]=dev0,ports[1]=dev1
This patch reworks the external interface so that instead of using a
separate top-level property for the length and for each element, we use
a single true array property that accepts a list value. In the external
interfaces, this is naturally expressed as a JSON list and makes array
properties accessible again. The new syntax looks like this:
-device '{"driver":"rocker","ports":["dev0","dev1"]}'
Creating an array property on the command line without using JSON format
is currently not possible. This could be fixed by switching from
QemuOpts to a keyval parser, which however requires consideration of the
compatibility implications.
All internal users of devices with array properties go through
qdev_prop_set_array() at this point, so updating it takes care of all of
them.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1090
Fixes: f3558b1b76
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231109174240.72376-12-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The 'name' parameter of QOM setters is primarily used to specify the name
of the currently parsed input element in the visitor interface. For
top-level qdev properties, this is always set and matches 'prop->name'.
However, for list elements it is NULL, because each element of a list
doesn't have a separate name. Passing a non-NULL value runs into
assertion failures in the visitor code.
Therefore, using 'name' in error messages is not right for property
types that are used in lists, because "(null)" (or even a segfault)
isn't very helpful to identify what QEMU is complaining about.
Change netdev properties to use 'prop->name' instead, which will contain
the name of the array property after switching array properties to lists
in the external interface. (This is still not perfect, as it doesn't
identify which element in the list caused the error, but strictly better
than before.)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231109174240.72376-11-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block layer patches
- Graph locking part 6 (bs->file/backing)
- ahci: trigger either error IRQ or regular IRQ, not both
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmVLvccRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9ZkFg//awQoPiGnYzHpqcx2tGCM2AqBV+mFkbZr
# BKI5vp8FYfJtgMuHjC8jabL24NRMPpT+HbCzoxwjJU+nnnr85qr7R5iGwG6kfgX6
# HJlAXYXdY6e7l+FV44PBJ52vOoGCsh1GHg8HlKsHMaxSdXi9C1axHJ6rCAjnWXE0
# FQ4znCBVs/9HiKsvu4Wdm5muX2ShftFRM/toAwA+fLEOealX8WEXoRFJXI40bYbR
# OR7aJXWMDQrljlqdKk2FXvK337/tpofXmXf3NIE1R2pmY4x5Fg8bfChZn4UaaCdN
# n+0AhmE4ScI0rXuaXXYOvTO9vdTzXeBROG6tX03t9rrQfB6wPcGVeXRo/uusslAW
# sDH8NLz7uHFOooV02Fs8CKDdVrNNw5qjziclSGa0Po7vqOV1TKI8OTiNpsDLmdI5
# +DQvC6N+IU1pSOXImATSHkheGWggsegrsgN6PdrlzHEXJwWoAaRD0T06MRn74/pL
# gCegK2ez4RJYsci7C5b0gaqY/QBsMj8EUfEGVHvVyuVSoPRwiq4ehPqSQ+siA3xP
# KxYR0e4+QIfRmxqCzaJhiQ3DDGdt8UcO3yF0XcKXEqWwgFAGQKNeUG314jginvmA
# iaJzC0dHbiGcagAk7Ey8iyzfxQDWM6ixzJtGv7VLILepzCuu8vaJXy5qeEtTM/ZI
# EXoDGceNSvw=
# =ikBW
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 Nov 2023 00:56:39 HKT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (25 commits)
hw/ide/ahci: trigger either error IRQ or regular IRQ, not both
block: Protect bs->file with graph_lock
block: Take graph lock for most of .bdrv_open
vhdx: Take locks for accessing bs->file
qcow2: Take locks for accessing bs->file
block: Add missing GRAPH_RDLOCK annotations
block: Introduce bdrv_co_change_backing_file()
blkverify: Add locking for request_fn
block: Protect bs->backing with graph_lock
block: Mark bdrv_replace_node() GRAPH_WRLOCK
block: Mark bdrv_replace_node_common() GRAPH_WRLOCK
block: Inline bdrv_set_backing_noperm()
block: Mark bdrv_set_backing_hd_drained() GRAPH_WRLOCK
block: Mark bdrv_cow_child() and callers GRAPH_RDLOCK
block: Mark bdrv_filter_child() and callers GRAPH_RDLOCK
block: Mark bdrv_chain_contains() and callers GRAPH_RDLOCK
block: Mark bdrv_(un)freeze_backing_chain() and callers GRAPH_RDLOCK
block: Mark bdrv_skip_filters() and callers GRAPH_RDLOCK
block: Mark bdrv_skip_implicit_filters() and callers GRAPH_RDLOCK
block: Mark bdrv_filter_or_cow_bs() and callers GRAPH_RDLOCK
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Final test, gdbstub, plugin and gitdm updates for 8.2
- fix duplicate register in arm xml
- hide various duplicate system registers from gdbstub
- add new gdb register test to the CI (skipping s390x/ppc64 for now)
- introduce GDBFeatureBuilder
- move plugin initialisation to after vCPU init completes
- enable building TCG plugins on Windows platform
- various gitdm updates
- some mailmap fixes
- disable testing for nios2 signals which have regressed
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmVLpk4ACgkQ+9DbCVqe
# KkT7Zwf+LgNS2T8Gd6UBMk50Zwew3DSzK3HRRkAlxSV9vN9TCprnVDGJn7ObRpfq
# QCwiTmh20JRPFFBEsPGy/ozNPZsuWbt1/vyh3fnU4KD3aMySuyc/Hb9/mONPC9VE
# zh1mUxLCx10uwG5qF8jupIp22BQYD7B9i4YSF1gAUGsQNU7BPvcBDeDzyhCItJen
# 73oG9RQm7vDbjTOcGDkAMAG8iwLt07oMgFrDSgD8x7RWOxG8aiM3ninAW6S5GcO3
# s49t0rTqJIu+pOncYYzmPvFxyZ/6W82tsJYtfxlVML02qj24HOmLWywRWgL5b10y
# TyXsDba3Ru8ez/kEaVVX6u9N1G/Ktg==
# =or8W
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 08 Nov 2023 23:16:30 HKT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-halloween-omnibus-081123-1' of https://gitlab.com/stsquad/qemu: (23 commits)
Revert "tests/tcg/nios2: Re-enable linux-user tests"
mailmap: fixup some more corrupted author fields
contrib/gitdm: add Daynix to domain-map
contrib/gitdm: map HiSilicon to Huawei
contrib/gitdm: add domain-map for Cestc
contrib/gitdm: Add Rivos Inc to the domain map
plugins: allow plugins to be enabled on windows
gitlab: add dlltool to Windows CI
plugins: disable lockstep plugin on windows
plugins: make test/example plugins work on windows
plugins: add dllexport and dllimport to api funcs
configure: tell meson and contrib_plugins about DLLTOOL
cpu: Call plugin hooks only when ready
gdbstub: Introduce GDBFeatureBuilder
gdbstub: Introduce gdb_find_static_feature()
gdbstub: Add num_regs member to GDBFeature
tests/avocado: update the tcg_plugins test
tests/tcg: add an explicit gdbstub register tester
target/arm: hide aliased MIDR from gdbstub
target/arm: hide all versions of DBGD[RS]AR from gdbstub
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
According to AHCI 1.3.1, 5.3.8.1 RegFIS:Entry, if ERR_STAT is set,
we jump to state ERR:FatalTaskfile, which will raise a TFES IRQ
unconditionally, regardless if the I bit is set in the FIS or not.
Thus, we should never raise a normal IRQ after having sent an error
IRQ.
NOTE: for QEMU platforms that use SeaBIOS, this patch depends on QEMU
commit 784155cdcb ("seabios: update submodule to git snapshot"), and
QEMU commit 14f5a7bae4 ("seabios: update binaries to git snapshot"),
which update SeaBIOS to a version that contains SeaBIOS commit 1281e340
("ahci: handle TFES irq correctly").
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-ID: <20231011131220.1992064-1-nks@flawful.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Almost all functions that access bs->file already take the graph
lock now. Add locking to the remaining users and finally annotate the
struct field itself as protected by the graph lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-25-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Most implementations of .bdrv_open first open their file child (which is
an operation that internally takes the write lock and therefore we
shouldn't hold the graph lock while calling it), and afterwards many
operations that require holding the graph lock, e.g. for accessing
bs->file.
This changes block drivers that follow this pattern to take the graph
lock after opening the child node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-24-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK to some driver callbacks that are already called
with the graph lock held, and which will need the annotation because
they access bs->file, but don't have it yet.
This also covers a few callbacks that were not marked GRAPH_RDLOCK
before, but where updating BlockDriver is trivially possible.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-21-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_change_backing_file() is called both inside and outside coroutine
context. This makes it difficult for it to take the graph lock
internally. It also means that driver implementations need to be able to
run outside of coroutines, too. Switch it to the usual model with a
coroutine based implementation and a co_wrapper instead. The new
function is marked GRAPH_RDLOCK.
As the co_wrapper now runs the function in the AioContext of the node
(as it should always have done), this is not GLOBAL_STATE_CODE() any
more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-20-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is either bdrv_co_preadv() or bdrv_co_pwritev() which both need to
have the graph locked. Annotate the function pointer accordingly and add
locking to its callers.
This shouldn't actually have resulted in a bug because the graph lock is
already held by blkverify_co_prwv(), which waits for the coroutines to
terminate. Annotate with GRAPH_RDLOCK as well to make this clearer.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-19-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Almost all functions that access bs->backing already take the graph
lock now. Add locking to the remaining users and finally annotate the
struct field itself as protected by the graph lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-18-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nios2 signal tests are broken again:
retry.py -n 10 -c -- ./qemu-nios2 ./tests/tcg/nios2-linux-user/signals
Results summary:
0: 8 times (80.00%), avg time 2.254 (0.00 varience/0.00 deviation)
-11: 2 times (20.00%), avg time 0.253 (0.00 varience/0.00 deviation)
Ran command 10 times, 8 passes
This wasn't picked up by CI as we don't have a docker container that
can build QEMU with the nios2 compiler. I don't have time to bisect
the breakage and the target is orphaned anyway so take the easy route
and revert it.
This reverts commit 20e7524ff9.
Cc: Chris Wulff <crwulff@gmail.com>
Cc: Marek Vasut <marex@denx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231106185112.2755262-23-alex.bennee@linaro.org>
We also --disable-plugins for the two mingw based cross builds as
although they have dlltool they seem to be unhappy linking.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
There are a number of things that are broken on the test currently so
lets fix that up:
- replace retired Debian kernel for tuxrun_baseline one
- remove "detected repeat instructions test" since ea185a55
- log total counted instructions/memory accesses
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231106185112.2755262-8-alex.bennee@linaro.org>
We already do a couple of "info registers" for specific tests but this
is a more comprehensive multiarch test. It also has some output
helpful for debugging the gdbstub by showing which XML features are
advertised and what the underlying register numbers are.
My initial motivation was to see if there are any duplicate register
names exposed via the gdbstub while I was reviewing the proposed
register interface for TCG plugins.
Mismatches between the xml and remote-desc are reported for debugging
but do not fail the test.
We also skip the tests for the following arches for now until we can
investigate and fix any issues:
- s390x (fails to read v0l->v15l, not seen in remote-registers)
- ppc64 (fails to read vs0h->vs31h, not seen in remote-registers)
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Luis Machado <luis.machado@linaro.org>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: qemu-s390x@nongnu.org
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231106185112.2755262-7-alex.bennee@linaro.org>
ppc patch queue for 2023-11-07:
This queue, the last one before the 8.2 feature freeze, has miscellanous
changes that includes new PowerNV features and the new AmigaONE XE
board.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZUqiORYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFkBSUA/2qm8CyrRqY5+tsjtWQqZmPZ3L1F
# CgnXFNqtY2tzbTe5AQCi6FeQBEmXbZYVfryZyA+CQ4DUERc+18pe6hV3bBR9Cg==
# =cnHS
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 08 Nov 2023 04:46:49 HKT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20231107' of https://gitlab.com/danielhb/qemu:
ppc: qtest already exports qtest_rtas_call()
hw/pci-host: Update PHB5 XSCOM registers
ppc/pnv: Fix number of I2C engines and ports for power9/10
ppc/pnv: Connect PNV I2C controller to powernv10
ppc/pnv: Connect I2C controller model to powernv9 chip
ppc/pnv: Add an I2C controller model
tests/avocado: Add test for amigaone board
hw/ppc: Add emulation of AmigaOne XE board
hw/pci-host: Add emulation of Mai Logic Articia S
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Fix s390x CPU reconfiguration information in the SCLP facility map
* Fix condition code problem in the CLC and LAALG instruction
* Fix ordering of the new s390x topology list entries
* Add some more files to the MAINTAINERS file
* Allow newer versions of Tesseract in the m68k nextcube test
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmVKgksRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWIHg//TM3JOpsMEqHKlUKqOJH02mFQrK6H7LG0
# BC56FG7T+/mpYs1NTG92t8nCK03C2ZCweQWD7ZulRJAjPhZv+TF5bJEForivU7+k
# PKEshz9xKCWn2YGyNnf2LA06J1JkF215+KlReOoxwSgj1cPlHfBLQ0DtxmpJJZ1G
# h5p4d26BbSlwR58HrFWTlhgJMPenl59BETUGIK1FklBxunmZeeijddfniAhOT44y
# i0u9/H9KCg3tkwBROUy+42QV+ef32kz/yvi5RmYQI5W7PixO4sxH6MYduOjshsu9
# wK70f8EOwiZV6lFxqmbV7vxFeNnp5IuaVU7PMBoAkwZqLw99mSFy1+1BabCuL5b+
# 3iUTiD4UW48MYwE2Ua6Lit4kpfjhwcp/UYz6pIk6TCBQX6LfzO+nj+rod0GdIpyZ
# 4Lwm7jBtpTlYkGrsMvpA/qcidOtqPA1lmBTNlY1hFodQF6KWtyObn0w5AM80xeeU
# /mGxQDz97Bpz7LKZvhu+k38jaWvnJFnl3jF1zet88CYL9YL+YI/k1KjhFafCXb0V
# 38Xpt5JTWxyLSh2B3gx0OpokX5bftvW9GlLix0HqL7c23uYwR2Bq+Rd6I8SAlk4C
# uJq6gqP8IFBFHfgbmyqf/fyd/eHxm7J1voIdy9PZyxZ1JYT9A7yu56qV6SJYwCpr
# aARwui/Dm4o=
# =y+cC
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 08 Nov 2023 02:30:35 HKT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-11-07' of https://gitlab.com/thuth/qemu:
target/s390x/cpu topology: Fix ordering and creation of TLEs
tests/tcg/s390x: Test ADD LOGICAL WITH CARRY
tests/tcg/s390x: Test LAALG with negative cc_src
target/s390x: Fix LAALG not updating cc_src
tests/tcg/s390x: Test CLC with inaccessible second operand
target/s390x: Fix CLC corrupting cc_src
target/s390x/cpu_models: Use 'first_cpu' in s390_get_feat_block()
s390/sclp: fix SCLP facility map
tests/avocado: Allow newer versions of tesseract in the nextcube test
MAINTAINERS: Add artist.c to the hppa machine section
MAINTAINERS: Add the virtio-gpu documentation to the corresponding section
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Misc hardware patch queue
HW emulation:
- PMBus fixes and tests (Titus)
- IDE fixes and tests (Fiona)
- New ADM1266 sensor (Titus)
- Better error propagation in PCI-ISA i82378 (Philippe)
- Declare SD model QOM types using DEFINE_TYPES macro (Philippe)
Topology:
- Fix CPUState::nr_cores calculation (Zhuocheng Ding and Zhao Liu)
Monitor:
- Synchronize CPU state in 'info lapic' (Dongli Zhang)
QOM:
- Have 'cpu-qom.h' target-agnostic (Philippe)
- Move ArchCPUClass definition to each target's cpu.h (Philippe)
- Call object_class_is_abstract once in cpu_class_by_name (Philippe)
UI:
- Use correct key names in titles on MacOS / SDL2 (Adrian)
MIPS:
- Fix MSA BZ/BNZ and TX79 LQ/SQ opcodes (Philippe)
Nios2:
- Create IRQs *after* vCPU is realized (Philippe)
PPC:
- Restrict KVM objects to system emulation (Philippe)
- Move target-specific definitions out of 'cpu-qom.h' (Philippe)
S390X:
- Make hw/s390x/css.h and hw/s390x/sclp.h headers target agnostic (Philippe)
X86:
- HVF & KVM cleanups (Philippe)
Various targets:
- Use env_archcpu() to optimize (Philippe)
Misc:
- Few global variable shadowing removed (Philippe)
- Introduce cpu_exec_reset_hold and factor tcg_cpu_reset_hold out (Philippe)
- Remove few more 'softmmu' mentions (Philippe)
- Fix and cleanup in vl.c (Akihiko & Marc-André)
- Resource leak fix in dump (Zongmin Zhou)
- MAINTAINERS updates (Thomas, Daniel)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmVKKmEACgkQ4+MsLN6t
# wN4xHQ//X/enH4C7K3VP/tSinDiwmXN2o61L9rjqSDQkBaCtktZx4c8qKSDL7V4S
# vwzmvvBn3biMXQwZNVJo9d0oz2qoaF9tI6Ao0XDHAan9ziagfG9YMqWhkCfj077Q
# jLdCqkUuMJBvQgXGB1a6UgCme8PQx7h0oqjbCNfB0ZBls24b5DiEjO87LE4OTbTi
# zKRhYEpZpGwIVcy+1dAsbaBpGFP06sr1doB9Wz4c06eSx7t0kFSPk6U4CyOPrGXh
# ynyCxPwngxIXmarY8gqPs3SBs7oXsH8Q/ZOHr1LbuXhwSuw/0zBQU9aF7Ir8RPan
# DB79JjPrtxTAhICKredWT79v9M18D2/1MpONgg4vtx5K2FzGYoAJULCHyfkHMRSM
# L6/H0ZQPHvf7w72k9EcSQIhd0wPlMqRmfy37/8xcLiw1h4l/USx48QeKaeFWeSEu
# DgwSk+R61HbrKvQz/U0tF98zUEyBaQXNrKmyzht0YE4peAtpbPNBeRHkd0GMae/Z
# HOmkt8QlFQ0T14qSK7mSHaSJTUzRvFGD01cbuCDxVsyCWWsesEikXBACZLG5RCRY
# Rn1WeX1H9eE3kKi9iueLnhzcF9yM5XqFE3f6RnDzY8nkg91lsTMSQgFcIpv6uGyp
# 3WOTNSC9SoFyI3x8pCWiKOGytPUb8xk+PnOA85wYvVmT+7j6wus=
# =OVdQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 20:15:29 HKT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'misc-cpus-20231107' of https://github.com/philmd/qemu: (75 commits)
dump: Add close fd on error return to avoid resource leak
ui/sdl2: use correct key names in win title on mac
MAINTAINERS: Add more guest-agent related files to the corresponding section
MAINTAINERS: Add include/hw/xtensa/mx_pic.h to the XTFPGA machine section
MAINTAINERS: update libvirt devel mailing list address
MAINTAINERS: Add the CAN documentation file to the CAN section
MAINTAINERS: Add include/hw/timer/tmu012.h to the SH4 R2D section
hw/sd: Declare QOM types using DEFINE_TYPES() macro
hw/i2c: pmbus: reset page register for out of range reads
hw/i2c: pmbus: immediately clear faults on request
tests/qtest: add tests for ADM1266
hw/sensor: add ADM1266 device model
hw/i2c: pmbus: add VCAP register
hw/i2c: pmbus: add fan support
hw/i2c: pmbus: add vout mode bitfields
hw/i2c: pmbus add support for block receive
tests/qtest: ahci-test: add test exposing reset issue with pending callback
hw/ide: reset: cancel async DMA operation before resetting state
hw/cpu: Update the comments of nr_cores and nr_dies
system/cpus: Fix CPUState.nr_cores' calculation
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Having two functions with the same name is a bad idea. As spapr only
uses the function locally, made it static.
When you compile with clang, you get this compilation error:
/usr/bin/ld: tests/qtest/libqos/libqos.fa.p/.._libqtest.c.o: in function `qtest_rtas_call':
/scratch/qemu/clang/full/all/../../../../../mnt/code/qemu/full/tests/qtest/libqtest.c:1195: multiple definition of `qtest_rtas_call'; libqemu-ppc64-softmmu.fa.p/hw_ppc_spapr_rtas.c.o:/scratch/qemu/clang/full/all/../../../../../mnt/code/qemu/full/hw/ppc/spapr_rtas.c:536: first defined here
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
make: *** [Makefile:162: run-ninja] Error 1
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20231030163834.4638-1-quintela@redhat.com>
[dhb: remove 'spapr_rtas.h' include from spapr_rtas.c]
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Power9 is supposed to have 4 PIB-connected I2C engines with the
following number of ports on each engine:
0: 2
1: 13
2: 2
3: 2
Power10 also has 4 engines but has the following number of ports
on each engine:
0: 14
1: 14
2: 2
3: 16
Current code assumes that they all have the same (maximum) number.
This can be a problem if software expects to see a certain number
of ports present (Power Hypervisor seems to care).
Fixed this by adding separate tables for power9 and power10 that
map the I2C controller number to the number of I2C buses that should
be attached for that engine.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Message-ID: <20231025152714.956664-1-milesg@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Wires up four I2C controller instances to the powernv10 chip
XSCOM address space.
Each controller instance is wired up to two I2C buses of
its own. No other I2C devices are connected to the buses
at this time.
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20231017221434.810363-1-milesg@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Wires up three I2C controller instances to the powernv9 chip
XSCOM address space.
Each controller instance is wired up to a single I2C bus of
its own. No other I2C devices are connected to the buses
at this time.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[milesg: Split wiring from addition of model itself]
[milesg: Added new commit message]
[milesg: Moved hardcoded attributes into PnvChipClass]
[milesg: Removed TODO comment for I2C]
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Acked-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-ID: <20231016222013.3739530-3-milesg@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The more recent IBM power processors have an embedded I2C
controller that is accessible by software via the XSCOM
address space.
Each instance of the I2C controller is capable of controlling
multiple I2C buses (one at a time). Prior to beginning a
transaction on an I2C bus, the bus must be selected by writing
the port number associated with the bus into the PORT_NUM
field of the MODE register. Once an I2C bus is selected,
the status of the bus can be determined by reading the
Status and Extended Status registers.
I2C bus transactions can be started by writing a command to
the Command register and reading/writing data from/to the
FIFO register.
Not supported :
. 10 bit I2C addresses
. Multimaster
. Slave
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[milesg: Split wiring to powernv9 into its own commit]
[milesg: Added more detail to commit message]
[milesg: Added SPDX Licensed Identifier to new files]
[milesg: updated copyright dates]
[milesg: Added use of g_autofree]
[milesg: Added NULL check after pnv_i2c_get_bus]
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Acked-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-ID: <20231016222013.3739530-2-milesg@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The AmigaOne is a rebranded MAI Teron board that uses U-Boot firmware
with patches to support AmigaOS and is very similar to pegasos2 so can
be easily emulated sharing most code with pegasos2. The reason to
emulate it is that AmigaOS comes in different versions for AmigaOne
and PegasosII which only have drivers for one machine and firmware so
these only run on the specific machine. Adding this board allows
another AmigaOS version to be used reusing already existing peagasos2
emulation. (The AmigaOne was the first of these boards so likely most
widespread which then inspired Pegasos that was later replaced with
PegasosII due to problems with Articia S, so these have a lot of
similarity. Pegasos mainly ran MorphOS while the PegasosII version of
AmigaOS was added later and therefore less common than the AmigaOne
version.)
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Rene Engel <ReneEngel80@emailn.de>
Acked-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-ID: <804935e7a5921548d630576159ae2c758fe6e275.1699382232.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
In case of horizontal polarization entitlement has no effect on
ordering.
Moreover, since the comparison is used to insert CPUs at the correct
position in the TLE list, this affects the creation of TLEs and now
correctly collapses horizontally polarized CPUs into one TLE.
Fixes: f4f54b582f ("target/s390x/cpu topology: handle STSI(15) and build the SYSIB")
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231027163637.3060537-1-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
LAALG uses op_laa() and wout_addu64(). The latter expects cc_src to be
set, but the former does not do it. This can lead to assertion failures
if something sets cc_src to neither 0 nor 1 before.
Fix by introducing op_laa_addu64(), which sets cc_src, and using it for
LAALG.
Fixes: 4dba4d6fef ("target/s390x: Use atomic operations for LOAD AND OP")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231106093605.1349201-4-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Qemu's SCLP implementation incorrectly reports that it supports CPU
reconfiguration. If a guest issues a CPU reconfiguration request it
is rejected as invalid command.
Fix the SCLP_HAS_CPU_INFO mask, and remove the unused
SCLP_CMDW_CONFIGURE_CPU and SCLP_CMDW_DECONFIGURE_CPU defines.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Message-ID: <20231024100703.929679-1-hca@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Current Linux distros ship version 5 of the tesseract OCR software,
so the nextcube screen test is ignored there. Let's make the check
more flexible to allow newer versions, too, and remove the old v3
test since most Linux distros don't ship this version anymore.
Message-ID: <20231101204323.35533-1-huth@tuxfamily.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_replace_node(). Its callers may already want
to hold the graph lock and so wouldn't be able to call functions that
take it internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-17-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_replace_node_common(). Basically everthing in
the function needs the lock and its callers may already want to hold the
graph lock and so wouldn't be able to call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-16-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_set_backing_hd_drained(). Basically everthing
in the function needs the lock and its callers may already want to hold
the graph lock and so wouldn't be able to call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-14-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_chain_contains() need to hold a reader lock for the graph because
it calls bdrv_filter_or_cow_bs(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-11-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_(un)freeze_backing_chain() need to hold a reader lock for the
graph because it calls bdrv_filter_or_cow_child(), which accesses
bs->file/backing.
Use the opportunity to make bdrv_is_backing_chain_frozen() static, it
has no external callers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-10-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_skip_filters() need to hold a reader lock for the graph because it
calls bdrv_filter_child(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-9-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_skip_implicit_filters() need to hold a reader lock for the graph
because it calls bdrv_filter_child(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-8-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_filter_or_cow_bs() need to hold a reader lock for the graph because
it calls bdrv_filter_or_cow_child(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-7-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling block_job_add_bdrv(). These callers will typically
already hold the graph lock once the locking work is completed, which
means that they can't call functions that take it internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-6-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_root_attach_child(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-5-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_filter_bs() need to hold a reader lock for the graph because
it calls bdrv_filter_child(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-4-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_has_zero_init() need to hold a reader lock for the graph because
it calls bdrv_filter_bs(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_probe_blocksizes() need to hold a reader lock for the graph because
it calls bdrv_filter_bs(), which accesses bs->file/backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When run this script, there's the error:
python3 scripts/cpu-x86-uarch-abi.py /tmp/qmp
Traceback (most recent call last):
File "/path-to-qemu/qemu/scripts/cpu-x86-uarch-abi.py", line 96, in <module>
cpu = shell.cmd("query-cpu-model-expansion",
TypeError: QEMUMonitorProtocol.cmd() takes 2 positional arguments but 3 were given
Commit 7f521b023b ("scripts/cpu-x86-uarch-abi.py: use .command()
instead of .cmd()") converts the the original .cmd() to .command()
(which was later renamed to "cmd" to replace the original one).
But the new .cmd() only accepts typing.Mapping as the parameter instead
of typing.Dict (see _qmp.execute()).
Change the paremeters of "query-cpu-model-expansion" to typing.Mapping
format to fix this error.
Fixes: 7f521b023b ("scripts/cpu-x86-uarch-abi.py: use .command() instead of .cmd()")
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Print a debug message as is done for other unsupported audio formats
to give the user the chance to understand their mistake.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
All callers of qio_net_listener_set_name() already add some sort of
"listen" or "listener" suffix.
For intance, we currently have "migration-socket-listener-listen" and
"vnc-listen-listen" as ioc names.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When qcrypto_builtin_rsa_public_key_parse() is about to fail, but no
error has been set, it makes one up. Actually, there's just one way
to fail without setting an error. Set it there instead.
Same for qcrypto_builtin_rsa_private_key_parse().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Previously, when using the SDL2 UI on MacOS, the title bar uses incorrect
key names (such as Ctrl and Alt instead of the standard MacOS key symbols
like ⌃ and ⌥). This commit changes sdl_update_caption in ui/sdl2.c to
use the correct symbols when compiling for MacOS (CONFIG_DARWIN is
defined).
Unfortunately, standard Mac keyboards do not include a "Right-Ctrl" key,
so in the case that the SDL grab mode is set to HOT_KEY_MOD_RCTRL, the
default text is still used.
Signed-off-by: Adrian Wowk <dev@adrianwowk.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231030024119.28342-1-dev@adrianwowk.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
contrib/systemd/qemu-guest-agent.service, tests/data/test-qga-config
and tests/data/test-qga-os-release belong to the guest agent, so make
sure that these files are covered here, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <20231107101811.14189-1-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The linux pmbus driver scans all possible pages and does not reset the
current page after the scan, making all future page reads fail as out of range
on devices with a single page.
This change resets out of range pages immediately on write.
Also added a qtest for simultaneous writes to all pages.
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Titus Rwantare <titusr@google.com>
Message-ID: <20231023-staging-pmbus-v3-v4-8-07a8cb7cd20a@google.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The ADM1266 is a cascadable super sequencer with margin control and
fault recording.
This commit adds basic support for its PMBus commands and models
the identification registers that can be modified in a firmware
update.
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Titus Rwantare <titusr@google.com>
[PMD: Cover file in MAINTAINERS]
Message-ID: <20231023-staging-pmbus-v3-v4-5-07a8cb7cd20a@google.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The VOUT_MODE command is described in the PMBus Specification,
Part II, Ver 1.3 Section 8.3
VOUT_MODE has a three bit mode and 4 bit parameter, the three bit
mode determines whether voltages are formatted as uint16, uint16,
VID, and Direct modes. VID and Direct modes use the remaining 5 bits
to scale the voltage readings.
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Titus Rwantare <titusr@google.com>
Message-ID: <20231023-staging-pmbus-v3-v4-2-07a8cb7cd20a@google.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
PMBus devices can send and receive variable length data using the
block read and write format, with the first byte in the payload
denoting the length.
This is mostly used for strings and on-device logs. Devices can
respond to a block read with an empty string.
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Titus Rwantare <titusr@google.com>
Message-ID: <20231023-staging-pmbus-v3-v4-1-07a8cb7cd20a@google.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Before commit "hw/ide: reset: cancel async DMA operation before
resetting state", this test would fail, because a reset with a
pending write operation would lead to an unsolicited write to the
first sector of the disk.
The test writes a pattern to the beginning of the disk and verifies
that it is still intact after a reset with a pending operation. It
also checks that the pending operation actually completes correctly.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20230906130922.142845-2-f.ebner@proxmox.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEState is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEState which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.
Traces showing the unsolicited write happening with IDEState
0x5595af6949d0 being used after reset:
> ahci_port_write ahci(0x5595af6923f0)[0]: port write [reg:PxSCTL] @ 0x2c: 0x00000300
> ahci_reset_port ahci(0x5595af6923f0)[0]: reset port
> ide_reset IDEstate 0x5595af6949d0
> ide_reset IDEstate 0x5595af694da8
> ide_bus_reset_aio aio_cancel
> dma_aio_cancel dbs=0x7f64600089a0
> dma_blk_cb dbs=0x7f64600089a0 ret=0
> dma_complete dbs=0x7f64600089a0 ret=0 cb=0x5595acd40b30
> ahci_populate_sglist ahci(0x5595af6923f0)[0]
> ahci_dma_prepare_buf ahci(0x5595af6923f0)[0]: prepare buf limit=512 prepared=512
> ide_dma_cb IDEState 0x5595af6949d0; sector_num=0 n=1 cmd=DMA WRITE
> dma_blk_io dbs=0x7f6420802010 bs=0x5595ae2c6c30 offset=0 to_dev=1
> dma_blk_cb dbs=0x7f6420802010 ret=0
> (gdb) p *qiov
> $11 = {iov = 0x7f647c76d840, niov = 1, {{nalloc = 1, local_iov = {iov_base = 0x0,
> iov_len = 512}}, {__pad = "\001\000\000\000\000\000\000\000\000\000\000",
> size = 512}}}
> (gdb) bt
> #0 blk_aio_pwritev (blk=0x5595ae2c6c30, offset=0, qiov=0x7f6420802070, flags=0,
> cb=0x5595ace6f0b0 <dma_blk_cb>, opaque=0x7f6420802010)
> at ../block/block-backend.c:1682
> #1 0x00005595ace6f185 in dma_blk_cb (opaque=0x7f6420802010, ret=<optimized out>)
> at ../softmmu/dma-helpers.c:179
> #2 0x00005595ace6f778 in dma_blk_io (ctx=0x5595ae0609f0,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> io_func=io_func@entry=0x5595ace6ee30 <dma_blk_write_io_func>,
> io_func_opaque=io_func_opaque@entry=0x5595ae2c6c30,
> cb=0x5595acd40b30 <ide_dma_cb>, opaque=0x5595af6949d0,
> dir=DMA_DIRECTION_TO_DEVICE) at ../softmmu/dma-helpers.c:244
> #3 0x00005595ace6f90a in dma_blk_write (blk=0x5595ae2c6c30,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> cb=cb@entry=0x5595acd40b30 <ide_dma_cb>, opaque=opaque@entry=0x5595af6949d0)
> at ../softmmu/dma-helpers.c:280
> #4 0x00005595acd40e18 in ide_dma_cb (opaque=0x5595af6949d0, ret=<optimized out>)
> at ../hw/ide/core.c:953
> #5 0x00005595ace6f319 in dma_complete (ret=0, dbs=0x7f64600089a0)
> at ../softmmu/dma-helpers.c:107
> #6 dma_blk_cb (opaque=0x7f64600089a0, ret=0) at ../softmmu/dma-helpers.c:127
> #7 0x00005595ad12227d in blk_aio_complete (acb=0x7f6460005b10)
> at ../block/block-backend.c:1527
> #8 blk_aio_complete (acb=0x7f6460005b10) at ../block/block-backend.c:1524
> #9 blk_aio_write_entry (opaque=0x7f6460005b10) at ../block/block-backend.c:1594
> #10 0x00005595ad258cfb in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>) at ../util/coroutine-ucontext.c:177
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: simon.rowe@nutanix.com
Message-ID: <20230906130922.142845-1-f.ebner@proxmox.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
From CPUState.nr_cores' comment, it represents "number of cores within
this CPU package".
After 003f230e37 ("machine: Tweak the order of topology members in
struct CpuTopology"), the meaning of smp.cores changed to "the number of
cores in one die", but this commit missed to change CPUState.nr_cores'
calculation, so that CPUState.nr_cores became wrong and now it
misses to consider numbers of clusters and dies.
At present, only i386 is using CPUState.nr_cores.
But as for i386, which supports die level, the uses of CPUState.nr_cores
are very confusing:
Early uses are based on the meaning of "cores per package" (before die
is introduced into i386), and later uses are based on "cores per die"
(after die's introduction).
This difference is due to that commit a94e142899 ("target/i386: Add
CPUID.1F generation support for multi-dies PCMachine") misunderstood
that CPUState.nr_cores means "cores per die" when calculated
CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
wrong understanding.
With the influence of 003f230e37 and a94e142899, for i386 currently
the result of CPUState.nr_cores is "cores per die", thus the original
uses of CPUState.cores based on the meaning of "cores per package" are
wrong when multiple dies exist:
1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
incorrect because it expects "cpus per package" but now the
result is "cpus per die".
2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
EAX[bits 31:26] is incorrect because they expect "cpus per package"
but now the result is "cpus per die". The error not only impacts the
EAX calculation in cache_info_passthrough case, but also impacts other
cases of setting cache topology for Intel CPU according to cpu
topology (specifically, the incoming parameter "num_cores" expects
"cores per package" in encode_cache_cpuid4()).
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
15:00] is incorrect because the EBX of 0BH.01H (core level) expects
"cpus per package", which may be different with 1FH.01H (The reason
is 1FH can support more levels. For QEMU, 1FH also supports die,
1FH.01H:EBX[bits 15:00] expects "cpus per die").
4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
calculated, here "cpus per package" is expected to be checked, but in
fact, now it checks "cpus per die". Though "cpus per die" also works
for this code logic, this isn't consistent with AMD's APM.
5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
"cpus per package" but it obtains "cpus per die".
6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
package", but in these functions, it obtains "cpus per die" and
"cores per die".
On the other hand, these uses are correct now (they are added in/after
a94e142899):
1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
meets the actual meaning of CPUState.nr_cores ("cores per die").
2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
04H's calculation) considers number of dies, so it's correct.
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
15:00] needs "cpus per die" and it gets the correct result, and
CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
When CPUState.nr_cores is correctly changed to "cores per package" again
, the above errors will be fixed without extra work, but the "currently"
correct cases will go wrong and need special handling to pass correct
"cpus/cores per die" they want.
Fix CPUState.nr_cores' calculation to fit the original meaning "cores
per package", as well as changing calculation of topo_info.cores_per_die,
vcpus_per_socket and CPUID.1FH.
Fixes: a94e142899 ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20231024090323.1859210-4-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In commit 40f8214fcd ("hw/audio/pcspk: Inline pcspk_init()")
we neglected to give a change to the caller to handle failed
device creation cleanly. Respect the caller API contract and
propagate the error if creating the PC_SPEAKER device ever
failed. This avoid yet another bad API use to be taken as
example and copy / pasted all over the code base.
Reported-by: Bernhard Beschow <shentey@gmail.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231020171509.87839-5-philmd@linaro.org>
Fix:
hw/core/loader.c:1073:27: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
bool option_rom, MemoryRegion *mr,
^
include/sysemu/sysemu.h:57:22: note: previous declaration is here
extern QEMUOptionRom option_rom[MAX_OPTION_ROMS];
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20231010115048.11856-3-philmd@linaro.org>
Fix:
hw/core/machine.c:1302:22: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
const CPUArchId *cpus = possible_cpus->cpus;
^
hw/core/numa.c:69:17: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
uint16List *cpus = NULL;
^
hw/acpi/aml-build.c:2005:20: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
CPUArchIdList *cpus = ms->possible_cpus;
^
hw/core/machine-smp.c:77:14: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
unsigned cpus = config->has_cpus ? config->cpus : 0;
^
include/hw/core/cpu.h:589:17: note: previous declaration is here
extern CPUTailQ cpus;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20231010115048.11856-2-philmd@linaro.org>
The OBJECT_DECLARE_CPU_TYPE() macro forward-declares each
ArchCPUClass type. These forward declarations are sufficient
for code in hw/ to use the QOM definitions. No need to expose
these structure definitions. Keep each local to their target/
by moving them to the corresponding "cpu.h" header.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-13-philmd@linaro.org>
Architecture specific hardware doesn't have a particular dependency
on the accelerator vCPU (created with cpu_exec_realizefn), and can
be initialized *after* the vCPU is realized. Doing so allows further
generic API simplification (in few commits).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230918160257.30127-12-philmd@linaro.org>
"target/s390x/cpu-qom.h" has to be target-agnostic. However, it
currently declares CPUS390XState, which is target-specific.
Move that declaration to "cpu.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231106114500.5269-5-philmd@linaro.org>
cpu_get_tb_cpu_state() is TCG specific. Another accelerator
calling it would be a bug, so restrict the definition to TCG,
along with "tcg_s390x.h" header inclusion.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231106114500.5269-4-philmd@linaro.org>
"hw/s390x/sclp.h" is a header used by target-agnostic objects
(such hw/char/sclpconsole[-lm].c), thus can not use target-specific
types, such CPUS390XState.
Have sclp_service_call[_protected]() take a S390CPU pointer, which
is target-agnostic.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231106114500.5269-3-philmd@linaro.org>
"hw/s390x/css.h" is a header used by target-agnostic objects
(such hw/s390x/virtio-ccw-gpu.c), thus can not use target-specific
types, such CPUS390XState.
Have css_do_sic() take S390CPU a pointer, which is target-agnostic.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231106114500.5269-2-philmd@linaro.org>
The OBJECT_DECLARE_CPU_TYPE() macro forward-declares the
PowerPCCPUClass type. This forward declaration is sufficient
for code in hw/ to use the QOM definitions. No need to expose
the structure definition. Keep it local to target/ppc/ by
moving it to target/ppc/cpu.h.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013125630.95116-5-philmd@linaro.org>
None of these target-specific prototypes should be used
by user emulation. Remove their declaration there, so we
get a compile failure if ever used (instead of having to
deal with linker and its possible optimizations, such
dead code removal).
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20231003070427.69621-5-philmd@linaro.org>
CONFIG_KVM is always FALSE on user emulation, so 'kvm.c'
won't be added to ppc_ss[] source set; direcly use the system
specific ppc_system_ss[] source set.
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231003070427.69621-4-philmd@linaro.org>
kvm_get_radix_page_info() is only defined for ppc targets (in
target/ppc/kvm.c). The declaration is not useful in other targets,
reduce its scope.
Rename using the 'kvmppc_' prefix following other declarations
from target/ppc/kvm_ppc.h.
Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20231003070427.69621-2-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When CPUArchState* is available (here CPUX86State*), we can
use the fast env_archcpu() macro to get ArchCPU* (here X86CPU*).
The QOM cast X86_CPU() macro will be slower when building with
--enable-qom-cast-debug.
Pass CPUX86State* as argument to simulate_rdmsr / simulate_wrmsr
instead of a CPUState* to avoid an extra cast.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231009110239.66778-7-philmd@linaro.org>
When CPUArchState* is available (here CPUXtensaState*), we
can use the fast env_archcpu() macro to get ArchCPU* (here
XtensaCPU*). The QOM cast XTENSA_CPU() macro will be slower
when building with --enable-qom-cast-debug.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20231009110239.66778-5-philmd@linaro.org>
When CPUArchState* is available (here CPUS390XState*), we
can use the fast env_archcpu() macro to get ArchCPU* (here
S390CPU*). The QOM cast S390_CPU() macro will be slower when
building with --enable-qom-cast-debug.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20231009110239.66778-4-philmd@linaro.org>
TYPE_RISCV_CPU_BASE depends on the TARGET_RISCV32/TARGET_RISCV64
definitions which are target specific. Such target specific
definition taints "cpu-qom.h".
Since "cpu-qom.h" must be target agnostic, remove its target
specific definition uses by moving TYPE_RISCV_CPU_BASE to
"target/riscv/cpu.h".
"target/riscv/cpu-qom.h" is now fully target agnostic.
Add a comment clarifying that in the header.
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-12-philmd@linaro.org>
"target/foo/cpu.h" contains the target specific declarations.
A heterogeneous setup need to access target agnostic declarations
(at least the QOM ones, to instantiate the objects).
Our convention is to add such target agnostic QOM declarations in
the "target/foo/cpu-qom.h" header.
Add a comment clarifying that in the header.
Extract QOM definitions from "cpu.h" to "cpu-qom.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-11-philmd@linaro.org>
"target/foo/cpu.h" contains the target specific declarations.
A heterogeneous setup need to access target agnostic declarations
(at least the QOM ones, to instantiate the objects).
Our convention is to add such target agnostic QOM declarations in
the "target/foo/cpu-qom.h" header.
Add a comment clarifying that in the header.
Extract QOM definitions from "cpu.h" to "cpu-qom.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-10-philmd@linaro.org>
"target/foo/cpu.h" contains the target specific declarations.
A heterogeneous setup need to access target agnostic declarations
(at least the QOM ones, to instantiate the objects).
Our convention is to add such target agnostic QOM declarations in
the "target/foo/cpu-qom.h" header.
Add a comment clarifying that in the header.
Extract QOM definitions from "cpu.h" to "cpu-qom.h".
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-9-philmd@linaro.org>
"target/foo/cpu.h" contains the target specific declarations.
A heterogeneous setup need to access target agnostic declarations
(at least the QOM ones, to instantiate the objects).
Our convention is to add such target agnostic QOM declarations in
the "target/foo/cpu-qom.h" header.
Add a comment clarifying that in the header.
Extract QOM definitions from "cpu.h" to "cpu-qom.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20231013140116.255-8-philmd@linaro.org>
Hegerogeneous code needs access to the FOO_CPU_TYPE_NAME()
macro to resolve target CPU types. Move the declaration
(along with the required FOO_CPU_TYPE_SUFFIX) to "cpu-qom.h".
"target/foo/cpu-qom.h" is supposed to be target agnostic
(include-able by any target). Add such mention in the
header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-7-philmd@linaro.org>
CPU_RESOLVING_TYPE is a per-target definition, and is
irrelevant for other targets. Move it to "cpu.h".
"target/ppc/cpu-qom.h" is supposed to be target agnostic
(include-able by any target). Add such mention in the
header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-5-philmd@linaro.org>
Enforce the style described by commit 067109a11c ("docs/devel:
mention the spacing requirement for QOM"):
The first declaration of a storage or class structure should
always be the parent and leave a visual space between that
declaration and the new code. It is also useful to separate
backing for properties (options driven by the user) and internal
state to make navigation easier.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231013140116.255-2-philmd@linaro.org>
Factor the TCG specific code from cpu_common_reset_hold() to
tcg_cpu_reset_hold() within tcg-accel-ops.c. Since this file
is sysemu specific, we can inline tcg_flush_softmmu_tlb(),
removing its declaration in "exec/cpu-common.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230918104153.24433-4-philmd@linaro.org>
Introduce cpu_exec_reset_hold() which call an accelerator
specific AccelOpsClass::cpu_reset_hold() handler.
Define a stub on TCG user emulation, because CPU reset is
irrelevant there.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230918104153.24433-3-philmd@linaro.org>
"exec/cpu-common.h" is meant to contain the declarations
related to CPU usable with any accelerator / target
combination.
tcg_flush_jmp_cache() is specific to TCG, so restrict its
declaration by moving it to "exec/tb-flush.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230918104153.24433-2-philmd@linaro.org>
Wether we are using a software MMU or not is irrelevant for the
seccomp facility. The facility is restricted to system emulation,
but such detail isn't really helpful, so directly drop the
'softmmu' mention from the test names.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231002145104.52193-3-philmd@linaro.org>
virtio,pc,pci: features, fixes
virtio sound card support
vhost-user: back-end state migration
cxl:
line length reduction
enabling fabric management
vhost-vdpa:
shadow virtqueue hash calculation Support
shadow virtqueue RSS Support
tests:
CPU topology related smbios test cases
Fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01
# GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83
# nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68
# D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG
# BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX
# 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc=
# =vEv+
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 18:06:50 HKT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (63 commits)
acpi/tests/avocado/bits: enable console logging from bits VM
acpi/tests/avocado/bits: enforce 32-bit SMBIOS entry point
hw/cxl: Add tunneled command support to mailbox for switch cci.
hw/cxl: Add dummy security state get
hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functions
hw/cxl/mbox: Add Get Background Operation Status Command
hw/cxl: Add support for device sanitation
hw/cxl/mbox: Wire up interrupts for background completion
hw/cxl/mbox: Add support for background operations
hw/cxl: Implement Physical Ports status retrieval
hw/pci-bridge/cxl_downstream: Set default link width and link speed
hw/cxl/mbox: Add Physical Switch Identify command.
hw/cxl/mbox: Add Information and Status / Identify command
hw/cxl: Add a switch mailbox CCI function
hw/pci-bridge/cxl_upstream: Move defintion of device to header.
hw/cxl/mbox: Generalize the CCI command processing
hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceState
hw/cxl/mbox: Split mailbox command payload into separate input and output
hw/cxl/mbox: Pull the payload out of struct cxl_cmd and make instances constant
hw/cxl: Fix a QEMU_BUILD_BUG_ON() in switch statement scope issue.
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Xen PV guest support for 8.2
Add Xen PV console and network support, the former of which enables the
Xen "PV shim" to be used to support PV guests.
Also clean up the block support and make it work when the user passes
just 'drive file=IMAGE,if=xen' on the command line.
Update the documentation to reflect all of these, taking the opportunity
to simplify what it says about q35 by making unplug work for AHCI.
Ignore the VCPU_SSHOTTMR_future timer flag, and advertise the 'fixed'
per-vCPU upcall vector support, as newer upstream Xen do.
# -----BEGIN PGP SIGNATURE-----
#
# iQJIBAABCAAyFiEEvgfZ/VSAmrLEsP9fY3Ys2mfi81kFAmVJ/7EUHGR3bXcyQGlu
# ZnJhZGVhZC5vcmcACgkQY3Ys2mfi81k+/xAAswivVR4+nwz3wTSN7EboGogS3hy+
# ZsTpvbJnfprGQJAK8vv8OP4eunaCJkO/dy3M/33Dh270msmV6I/1ki0E1RIPG45D
# n5wKM1Zxk0ABvjIgdp3xiLwITTdruJ+k9aqV8U9quhjgNFdOa7yjBOG8MD32GEPZ
# KHbavJ++huOu7+DZHJRNRq4gI/fREIULoPGHVg7WuEiRDYokOOmMROXqmTHTaUkV
# yFhkofzWxlpYhh7qRQx6/A80CSf7xwCof8krjdMCOYj3XGzYVZND0z5ZfHQYEwqt
# fowhargA8gH4V3d21S/MWCaZ+QrswFXZhcnl5wuGgWakV4ChvFETKs+fz2mODWUx
# 2T13trqeFJ5ElTrSpH1iWCoSEy6KCeLecvx7c/6HPSkDYQ3w5q8dXPpqgEtXY24S
# Wcmw4PkQ+HrLX7wbSU7QLyTZjvCQLFZ3Sb0uTf2zwsJZyeCCiT2lqAaogoMm6Kg0
# m/jG1JzE+9AC3j0Upp1lS3EK1qdxIuLdBuIcaEBEjy7Am+Y14PlZYoU2c751KbRF
# kqnIOYMoijX0PJDomPqCQtYNE0mrtogo0AbcFFIu+4k25vGbkl7xS5p2du9qw2Rd
# ++IdqQYzdzrUcIwmxocFQqFBJQ2dcbOGB1d7+VJ+A1Uj3yY2/DnFG5WqSaqS0KJi
# ZhBdFs3OTlPnRoM=
# =Dg79
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 17:13:21 HKT
# gpg: using RSA key BE07D9FD54809AB2C4B0FF5F63762CDA67E2F359
# gpg: issuer "dwmw2@infradead.org"
# gpg: Good signature from "David Woodhouse <dwmw2@infradead.org>" [unknown]
# gpg: aka "David Woodhouse <dwmw2@exim.org>" [unknown]
# gpg: aka "David Woodhouse <david@woodhou.se>" [unknown]
# gpg: aka "David Woodhouse <dwmw2@kernel.org>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: BE07 D9FD 5480 9AB2 C4B0 FF5F 6376 2CDA 67E2 F359
* tag 'pull-xenfv.for-upstream-20231107' of git://git.infradead.org/users/dwmw2/qemu:
docs: update Xen-on-KVM documentation
xen-platform: unplug AHCI disks
hw/i386/pc: support '-nic' for xen-net-device
hw/xen: update Xen PV NIC to XenDevice model
hw/xen: only remove peers of PCI NICs on unplug
hw/xen: add support for Xen primary console in emulated mode
hw/xen: update Xen console to XenDevice model
hw/xen: do not repeatedly try to create a failing backend device
hw/xen: add get_frontend_path() method to XenDeviceClass
hw/xen: automatically assign device index to block devices
hw/xen: populate store frontend nodes with XenStore PFN/port
i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID
include: update Xen public headers to Xen 4.17.2 release
hw/xen: Clean up event channel 'type_val' handling to use union
i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Change the "x-pixman" property default value and use the fallback path
when PIXMAN support is disabled.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: BALATON Zoltan <balaton@eik.bme.hu>
Change the "x-pixman" property default value and use the fallback path
when PIXMAN support is disabled.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
The QEMU fallback covers the requirements. We still need the flags of
header inclusion with CONFIG_PIXMAN.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This simply means that 2d drawing updates won't be handled, but 3d
should work.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Use a simpler implementation for rectangle geometry & intersect, drop
the need for (more complex) PIXMAN functions.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
The command requires color conversion and line-by-line feeding. We could
have a simple fallback for simple formats though.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Add stubs for the fallback paths.
get_vc() now returns NULL by default if !PIXMAN.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
If a display is backed by a specialized VC, allow to override the
default "vc:80Cx24C".
As suggested by Paolo, if the display doesn't implement a VC (get_vc()
returns NULL), use a fallback that will use a muxed console on stdio.
This changes the behaviour of "qemu -display none", to create a muxed
serial/monitor by default (on TTY & not daemonized).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The next commit needs to have the display registered itself before
creating the default VCs.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Bump the display_remote variable when the -vnc option is parsed, just
like -spice.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Since commit 5324e3e958 ("qemu-options: define -spice only #ifdef
CONFIG_SPICE"), it is unnecessary to check at runtime for "-spice"
option.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
This is a tiny subset of PIXMAN API that is used pervasively in QEMU
codebase to manage images and identify the underlying format.
It doesn't seems worth to wrap this in a QEMU-specific API.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
For now, pixman is mandatory, but we set config_host.h and Kconfig.
Once compilation is fixed, "pixman" will become actually optional.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.
Update the -initrd option documentation to explain how to quote commas
in module command lines, and reference it when documenting PV guests.
Also update stale avocado test filename in MAINTAINERS.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
To support Xen guests using the Q35 chipset, the unplug protocol needs
to also remove AHCI disks.
Make pci_xen_ide_unplug() more generic, iterating over the children
of the PCI device and destroying the "ide-hd" devices. That works the
same for both AHCI and IDE, as does the detection of the primary disk
as unit 0 on the bus named "ide.0".
Then pci_xen_ide_unplug() can be used for both AHCI and IDE devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The default NIC creation seems a bit hackish to me. I don't understand
why each platform has to call pci_nic_init_nofail() from a point in the
code where it actually has a pointer to the PCI bus, and then we have
the special cases for things like ne2k_isa.
If qmp_device_add() can *find* the appropriate bus and instantiate
the device on it, why can't we just do that from generic code for
creating the default NICs too?
But that isn't a yak I want to shave today. Add a xenbus field to the
PCMachineState so that it can make its way from pc_basic_device_init()
to pc_nic_init() and be handled as a special case like ne2k_isa is.
Now we can launch emulated Xen guests with '-nic user'.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This allows us to use Xen PV networking with emulated Xen guests, and to
add them on the command line or hotplug.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
When the Xen guest asks to unplug *emulated* NICs, it's kind of unhelpful
also to unplug the peer of the *Xen* PV NIC.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The primary console is special because the toolstack maps a page into
the guest for its ring, and also allocates the guest-side event channel.
The guest's grant table is even primed to export that page using a known
grant ref#. Add support for all that in emulated mode, so that we can
have a primary console.
For reasons unclear, the backends running under real Xen don't just use
a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which
would also be in the ring-ref node in XenStore). Instead, the toolstack
sets the ring-ref node of the primary console to the GFN of the guest
page. The backend is expected to handle that special case and map it
with foreignmem operations instead.
We don't have an implementation of foreignmem ops for emulated Xen mode,
so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably
work for real Xen too, but we can't work out how to make real Xen create
a primary console of type "ioemu" to make QEMU drive it, so we can't
test that; might as well leave it as it is for now under Xen.
Now at last we can boot the Xen PV shim and run PV kernels in QEMU.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This allows (non-primary) console devices to be created on the command
line and hotplugged.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
If xen_backend_device_create() fails to instantiate a device, the XenBus
code will just keep trying over and over again each time the bus is
re-enumerated, as long as the backend appears online and in
XenbusStateInitialising.
The only thing which prevents the XenBus code from recreating duplicates
of devices which already exist, is the fact that xen_device_realize()
sets the backend state to XenbusStateInitWait. If the attempt to create
the device doesn't get *that* far, that's when it will keep getting
retried.
My first thought was to handle errors by setting the backend state to
XenbusStateClosed, but that doesn't work for XenConsole which wants to
*ignore* any device of type != "ioemu" completely.
So, make xen_backend_device_create() *keep* the XenBackendInstance for a
failed device, and provide a new xen_backend_exists() function to allow
xen_bus_type_enumerate() to check whether one already exists before
creating a new one.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The primary Xen console is special. The guest's side is set up for it by
the toolstack automatically and not by the standard PV init sequence.
Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
instead it appears under …/console in the guest's XenStore node.
To allow the Xen console driver to override the frontend path for the
primary console, add a method to the XenDeviceClass which can be used
instead of the standard xen_device_get_frontend_path()
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.
This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -driver argument, just like if=virtio.
Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
This is kind of redundant since without being able to get these through
some other method (HVMOP_get_param) the guest wouldn't be able to access
XenStore in order to find them.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This will allow Linux guests (since v6.0) to use the per-vCPU upcall
vector delivered as MSI through the local APIC.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
... in order to advertise the XEN_HVM_CPUID_UPCALL_VECTOR feature,
which will come in a subsequent commit.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Paul Durrant <paul@xen.org>
A previous implementation of this stuff used a 64-bit field for all of
the port information (vcpu/type/type_val) and did atomic exchanges on
them. When I implemented that in Qemu I regretted my life choices and
just kept it simple with locking instead.
So there's no need for the XenEvtchnPort to be so simplistic. We can
use a union for the pirq/virq/interdomain information, which lets us
keep a separate bit for the 'remote domain' in interdomain ports. A
single bit is enough since the only possible targets are loopback or
qemu itself.
So now we can ditch PORT_INFO_TYPEVAL_REMOTE_QEMU and the horrid
manual masking, although the in-memory representation is identical
so there's no change in the saved state ABI.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Console logs from the VM can be useful for debugging when things go wrong.
Other avocado tests enables them. This change enables console logging with the
following changes:
- point to the newer bios bits image that actually enabled VM console.
- change the bits test to drain the console logs from the VM and write the
logs.
- wait for SHUTDOWN event from QEMU so that console logs can be drained out
of the socket before it is closed as a part of vm.wait().
Additionally, following two cosmetic changes have been made:
- Removed VM QEMU command line logging as avocado framework already logs it.
This is a minor cleanup along the way.
- Update my email to my work email in the avocado acpi bios bits test.
CC: jsnow@redhat.com
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20231027032120.6012-3-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU defaults to 64-bit entry point since the following commit
bf376f3020 ("hw/i386/pc: Default to use SMBIOS 3.0 for newer machine models")
The above change is applicable for all newer machine versions from version 8.1
and newer. i440fx and q35 machine versions 8.0 and older still use 32-bit entry
points.
Unfortunately, bits currently does not recognize 64-bit entry points and hence
is not able to parse SMBIOS tables. Therefore, we need to enforce 32-bit
SMBIOS entry point in QEMU command line so that bits is able to parse the
SMBIOS tables.
Once we implement the support in bits to parse 64-bit entry points, we can
remove the extra command line that is passed to enforce a 32-bit entry point.
The support can be added to the following smbios test script:
tests/avocado/acpi-bits/bits-tests/smbios.py2 in QEMU repository.
CC: jusual@redhat.com
CC: imammedo@redhat.com
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20231027032120.6012-2-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This implementation of tunneling makes the choice that our Type 3 device is
a Logical Device (LD) of a Multi-Logical Device (MLD) that just happens to
only have one LD for now.
Tunneling is supported from a Switch Mailbox CCI (and shortly via MCTP over
I2C connected to the switch MCTP CCI) via an outer level to the FM owned LD
in the MLD Type 3 device. From there an inner tunnel may be used to access
particular LDs.
Protocol wise, the following is what happens in a real system but we
don't emulate the transports - just the destinations and the payloads.
( Host -> Switch Mailbox CCI - in band FM-API mailbox command
or
Host -> Switch MCTP CCI - MCTP over I2C using the CXL FM-API
MCTP Binding.
)
then (if a tunnel command)
Switch -> Type 3 FM Owned LD - MCTP over PCI VDM using the
CXL FM-API binding (addressed by switch port)
then (if unwrapped command also a tunnel command)
Type 3 FM Owned LD to LD0 via internal transport
(addressed by LD number)
or (added shortly)
Host to Type 3 FM Owned MCTP CCI - MCTP over I2C using the
CXL FM-API MCTP Binding.
then (if unwrapped comand is a tunnel comamnd)
Type 3 FM Owned LD to LD0 via internal transport.
(addressed by LD number)
It is worth noting that the tunneling commands over PCI VDM
presumably use the appropriate MCTP binding depending on opcode.
This may be the CXL FMAPI binding or the CXL Memory Device Binding.
Additional commands will need to be added to make this
useful beyond testing the tunneling works.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20231023160806.13206-18-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Support background commands in the mailbox, and update
cmd_infostat_bg_op_sts() accordingly. This patch does not implement mbox
interrupts upon completion, so the kernel driver must rely on polling to
know when the operation is done.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20231023160806.13206-12-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CXL switch CCIs were added in CXL r3.0. They are a PCI function,
identified by class code that provides a CXL mailbox (identical
to that previously defined for CXL type 3 memory devices) over which
various FM-API commands may be used. Whilst the intent of this
feature is enable switch control from a BMC attached to a switch
upstream port, it is also useful to allow emulation of this feature
on the upstream port connected to a host using the CXL devices as
this greatly simplifies testing.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20231023160806.13206-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To avoid repetition of switch upstream port specific data in the
CXLDeviceState structure it will be necessary to access the switch USP
specific data from mailbox callbacks. Hence move it to cxl_device.h so it
is no longer an opaque structure.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20231023160806.13206-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By moving the parts of the mailbox command handling that are CCI type
specific out to the caller, make the main handling code generic. Rename it
to cxl_process_cci_message() to reflect this new generality.
Change the type3 mailbox handling (reused shortly for the switch
mailbox CCI) to take a snapshot of the mailbox input data rather
than operating on it in place. This reduces the chance of bugs
due to aliasing going forwars.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20231023160806.13206-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
New CCI types that will be supported shortly do not have a single buffer
used in both directions. As such, split it up. To avoid the complexities
of implementing all commands to handle potential aliasing, take a copy of
the input before use.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20231023160806.13206-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Establishing that only register accesses of size 4 and 8 can occur
using these functions requires looking at their callers. Make it
easier to see that by using switch statements.
Assertions are used to enforce that the register storage is of the
matching size, allowing fixed values to be used for divisors of
the array indices.
Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20231023140210.3089-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This tests the commit 7298fd7de5 ("hw/smbios: Fix thread count in
type4").
In smbios_build_type_4_table() (hw/smbios/smbios.c), if the number of
threads in the socket is more than 255, then smbios type4 table encodes
threads per socket into the thread count2 field.
So for the topology in this case, there're the following considerations:
1. threads per socket should be more than 255 to ensure we could cover
the thread count2 field.
2. The original bug was that threads per socket was miscalculated, so
now we should configure as many topology levels as possible (multiple
dies, no module since x86 hasn't supported it) to cover more general
topology scenarios, to ensure that the threads per socket encoded in
the thread count2 field is correct.
3. For the more general topology, we should also add "cpus" (presented
threads for machine) and "maxcpus" (total threads for machine) to
make sure that configuring unpluged CPUs in smp (cpus < maxcpus)
does not affect the correctness of threads per socket for thread
count2 field.
Note we don't consider the topology with multiple sockets since this
topology would create too many vCPUs (more than 255 threads per socket
with at least 2 sockets, which may cause the failure "Number of
hotpluggable cpus requested (*) exceeds the maximum cpus supported by
KVM (*) socket_accept failed: Resource temporarily unavailable"), and
the calculation of threads per socket has already been covered by
"thread count" test case.
Based on these considerations, select the topology as the follow:
-smp cpus=210,maxcpus=260,dies=2,cores=65,threads=2
The expected thread count2 = threads per socket = threads (2)
* cores (65) * dies (2) = 260.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231023094635.1588282-16-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This tests the commit 7298fd7de5 ("hw/smbios: Fix thread count in
type4").
In smbios_build_type_4_table() (hw/smbios/smbios.c), if the number of
threads in the socket is not more than 255, then smbios type4 table
encodes threads per socket into the thread count field.
So for the topology in this case, there're the following considerations:
1. threads per socket should be not more than 255 to ensure we could
cover the thread count field.
2. The original bug was that threads per socket was miscalculated, so
now we should configure as many topology levels as possible (multiple
sockets & dies, no module since x86 hasn't supported it) to cover
more general topology scenarios, to ensure that the threads per
socket encoded in the thread count field is correct.
3. For the more general topology, we should also add "cpus" (presented
threads for machine) and "maxcpus" (total threads for machine) to
make sure that configuring unpluged CPUs in smp (cpus < maxcpus)
does not affect the correctness of threads per socket for thread
count field.
Based on these considerations, select the topology as the follow:
-smp cpus=15,maxcpus=54,sockets=2,dies=3,cores=3,threads=3
The expected thread count = threads per socket = threads (3) * cores (3)
* dies (3) = 27.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231023094635.1588282-13-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The commit 196ea60a73 ("hw/smbios: Fix core count in type4") fixed
the miscalculation of cores per socket.
The original core count2 test (with the topology configured by
"-smp 275") didn't recognize that topology-related but because it just
created a special topology with only one socket and one die by default,
ignoring the effect of more topology levels (between socket and core) on
the cores per socket calculation.
So for the topology in this case, there're the following considerations:
1. cores per socket should be more than 255 to ensure we could cover
the core count2 field.
2. The original bug was that cores per socket was miscalculated, so now
we should include as many topology levels as possible (multiple
sockets or dies, no module since x86 hasn't supported it) to cover
more general topology scenarios, to ensure that the cores per socket
encoded in the core count2 field is correct.
Based on these considerations, select the topology with multiple dies:
-smp 260,dies=2,cores=130,threads=1
Note, here we doesn't configure multiple sockets to avoid the error
("kvm_init_vcpu: kvm_get_vcpu failed (*): Too many open files") if user
uses the default ulimit seeting on his machine.
And the cores per socket calculation for multiple sockets has already
been covered by the core count test case, so that only multiple dies
configuration is enough.
The expected core count2 = cores per socket = cores (130) * dies (2) =
260.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231023094635.1588282-10-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This tests the commit 196ea60a73 ("hw/smbios: Fix core count in
type4").
In smbios_build_type_4_table() (hw/smbios/smbios.c), if the number of
cores in the socket is not more than 255, then smbios type4 table
encodes cores per socket into the core count field.
So for the topology in this case, there're the following considerations:
1. cores per socket should be not more than 255 to ensure we could cover
the core count field.
2. The original bug was that cores per socket was miscalculated, so now
we should include as many topology levels as possible (mutiple
sockets & dies, no module since x86 hasn't supported it) to cover
more general topology scenarios, to ensure that the cores per socket
encoded in the core count field is correct.
Based on these considerations, select the topology with multiple sockets
and dies:
-smp 54,sockets=2,dies=3,cores=3,threads=3
The expected core count = cores per socket = cores (3) * dies (3) = 9.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231023094635.1588282-7-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This tests the commit d79a284a44 ("hw/smbios: Fix smbios_smp_sockets
calculation").
In smbios_get_tables() (hw/smbios/smbios.c), smbios type4 table is built
for each socket, so the count of type4 tables should be equal to the
number of sockets.
Thus for the topology in this case, there're the following considerations:
1. The topology should include multiple sockets to ensure smbios could
create type4 tables for each socket.
2. In addition to sockets, for the more general topology, we should also
configure as many topology levels as possible (multiple dies, no
module since x86 hasn't supported it), to ensure that smbios is able
to exclude the effect of other topology levels to create the type4
tables only for sockets.
3. The original miscalculation bug also misused "smp.cpus", so it's
necessary to configure "cpus" (presented threads for machine) and
"maxcpus" (total threads for machine) as well to make sure that
configuring unpluged CPUs in smp (cpus < maxcpus) does not affect
the correctness of the count of type4 tables.
Based on these considerations, select the topology as the follow:
-smp cpus=100,maxcpus=120,sockets=5,dies=2,cores=4,threads=3
The expected count of type4 tables = sockets (5).
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231023094635.1588282-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the different ways to calculate cores/threads per socket, so that
the new CPU topology levels won't be missed in these 2 helpes:
* machine_topo_get_cores_per_socket()
* machine_topo_get_threads_per_socket()
Test the commit a1d027be95 ("machine: Add helpers to get cores/
threads per socket").
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231023094635.1588282-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
At present, to enable the VIRTIO_NET_F_RSS feature, eBPF must
be loaded for the vhost backend.
Given that vhost-vdpa is one of the vhost backend, we need to
implement the SetSteeringEBPF method to support RSS for vhost-vdpa,
even if vhost-vdpa calculates the rss hash in the hardware device
instead of in the kernel by eBPF.
Although this requires QEMU to be compiled with `--enable-bpf`
configuration even if the vdpa device does not use eBPF to
calculate the rss hash, this can avoid adding the specific
conditional statements for vDPA case to enable the VIRTIO_NET_F_RSS
feature, which reduces code maintainbility.
Suggested-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <280e20ddce55b6de60f1552ba0865bffffe909b2.1698195059.git.yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Handle output IO messages in the transmit (TX) virtqueue.
It allocates a VirtIOSoundPCMBuffer for each IO message and copies the
data buffer to it. When the IO buffer is written to the host's sound
card, the guest will be notified that it has been consumed.
The lifetime of an IO message is:
1. Guest sends IO message to TX virtqueue.
2. QEMU adds it to the appropriate stream's IO buffer queue.
3. Sometime later, the host audio backend calls the output callback,
virtio_snd_pcm_out_cb(), which is defined with an AUD_open_out()
call. The callback gets an available number of bytes the backend can
receive. Then it writes data from the IO buffer queue to the backend.
If at any time a buffer is exhausted, it is returned to the guest as
completed.
4. If the guest releases the stream, its buffer queue is flushed by
attempting to write any leftover data to the audio backend and
releasing all IO messages back to the guest. This is how according to
the spec the guest knows the release was successful.
Based-on: 5a2f350eec
Signed-off-by: Igor Skalkin <Igor.Skalkin@opensynergy.com>
Signed-off-by: Anton Yakovlev <Anton.Yakovlev@opensynergy.com>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <b7c6fc458c763d09a4abbcb620ae9b220afa5b8f.1698062525.git.manos.pitsidianakis@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds a PCI wrapper device for the virtio-sound device.
It is necessary to instantiate a virtio-snd device in a guest.
All sound logic will be added to the virtio-snd device in the following
commits.
To add this device with a guest, you'll need a >=5.13 kernel compiled
with CONFIG_SND_VIRTIO=y, which at the time of writing most distros have
off by default.
Use with following flags in the invocation:
Pulseaudio:
-audio driver=pa,model=virtio
or
-audio driver=pa,model=virtio,server=/run/user/1000/pulse/native
sdl:
-audio driver=sdl,model=virtio
coreaudio (macos/darwin):
-audio driver=coreaudio,model=virtio
etc.
Based-on: 5a2f350eec
Signed-off-by: Igor Skalkin <Igor.Skalkin@opensynergy.com>
Signed-off-by: Anton Yakovlev <Anton.Yakovlev@opensynergy.com>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <b223598d59f56ead6a6d8d9bb6801e17489ddaa4.1698062525.git.manos.pitsidianakis@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A virtio-fs device's VM state consists of:
- the virtio device (vring) state (VMSTATE_VIRTIO_DEVICE)
- the back-end's (virtiofsd's) internal state
We get/set the latter via the new vhost operations to transfer migratory
state. It is its own dedicated subsection, so that for external
migration, it can be disabled.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-8-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_save_backend_state() and vhost_load_backend_state() can be used by
vhost front-ends to easily save and load the back-end's state to/from
the migration stream.
Because we do not know the full state size ahead of time,
vhost_save_backend_state() simply reads the data in 1 MB chunks, and
writes each chunk consecutively into the migration stream, prefixed by
its length. EOF is indicated by a 0-length chunk.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-7-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For vhost-user devices, qemu can migrate the virtio state, but not the
back-end's internal state. To do so, we need to be able to transfer
this internal state between front-end (qemu) and back-end.
At this point, this new feature is added for the purpose of virtio-fs
migration. Because virtiofsd's internal state will not be too large, we
believe it is best to transfer it as a single binary blob after the
streaming phase.
These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_DEVICE_STATE
- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a file
descriptor over which to transfer the state.
- CHECK_DEVICE_STATE: After the state has been transferred through the
file descriptor, the front-end invokes this function to verify
success. There is no in-band way (through the file descriptor) to
indicate failure, so we need to check explicitly.
Once the transfer FD has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into it, and the reading side
reads it until it sees an EOF. Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-5-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In vDPA, GET_VRING_BASE does not stop the queried vring, which is why
SUSPEND was introduced so that the returned index would be stable. In
vhost-user, it does stop the vring, so under the same reasoning, it can
get away without SUSPEND.
Still, we do want to clarify that if the device is completely stopped,
i.e. all vrings are stopped, the back-end should cease to modify any
state relating to the guest. Do this by calling it "suspended".
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-4-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, the vhost-user documentation says that rings are to be
initialized in a disabled state when VHOST_USER_F_PROTOCOL_FEATURES is
negotiated. However, by the time of feature negotiation, all rings have
already been initialized, so it is not entirely clear what this means.
At least the vhost-user-backend Rust crate's implementation interpreted
it to mean that whenever this feature is negotiated, all rings are to
put into a disabled state, which means that every SET_FEATURES call
would disable all rings, effectively halting the device. This is
problematic because the VHOST_F_LOG_ALL feature is also set or cleared
this way, which happens during migration. Doing so should not halt the
device.
Other implementations have interpreted this to mean that the device is
to be initialized with all rings disabled, and a subsequent SET_FEATURES
call that does not set VHOST_USER_F_PROTOCOL_FEATURES will enable all of
them. Here, SET_FEATURES will never disable any ring.
This interpretation does not suffer the problem of unintentionally
halting the device whenever features are set or cleared, so it seems
better and more reasonable.
We can clarify this in the documentation by making it explicit that the
enabled/disabled state is tracked even while the vring is stopped.
Every vring is initialized in a disabled state, and SET_FEATURES without
VHOST_USER_F_PROTOCOL_FEATURES simply becomes one way to enable all
vrings.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-3-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
GET_VRING_BASE does not mention that it stops the respective ring. Fix
that.
Furthermore, it is not fully clear what the "base offset" these
commands' documentation refers to is; an offset could be many things.
Be more precise and verbose about it, especially given that these
commands use different payload structures depending on whether the vring
is split or packed.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-2-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Provides a display option, zoom-to-fit, that enables scaling of the
display when full-screen mode is enabled.
Also ensures that the corresponding menu item is marked as enabled when
the option is set to on.
Signed-off-by: Carwyn Ellis <carwynellis@gmail.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20231027154920.80626-2-carwynellis@gmail.com>
The first time gd_egl_scanout_texture() is called, there's a possibility
that the GTK drawing area might not be realized yet, in which case its
associated GdkWindow is NULL. This means gd_egl_init() was also skipped
and the EGLContext and EGLSurface stored in the VirtualGfxConsole are
not valid yet.
Continuing with the scanout in this conditions would result in hitting
an assert in libepoxy: "Couldn't find current GLX or EGL context".
A possible workaround is to just ignore the scanout request, giving the
the GTK drawing area some time to finish its realization. At that point,
the gd_egl_init() will succeed and the EGLContext and EGLSurface stored
in the VirtualGfxConsole will be valid.
Signed-off-by: Antonio Caggiano <quic_acaggian@quicinc.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231016123215.2699269-1-quic_acaggian@quicinc.com>
target/hppa: Implement PA2.0 instructions
hw/hppa: Map astro chip 64-bit I/O mem
hw/hppa: Turn on 64-bit cpu for C3700
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmVJqDEdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8n5Qf/R15CvXGMgjDJjoV2
# ILMFM+Rpg17SR2yu060sEZ01R3iHdobeCcDB184K0RI9JLrpcBFar+PeF023o9fn
# O9MnfIyL6/ggzaeIpQ9AD2uT0HJMU9hLFoyQqQvnhDHHcT34raL2+Zkrkb2vvauH
# XET7awXN9xYCnY4ALrfcapzlrHqI77ahz0vReUWPxk7eGY2ez8dEOiFW2WLBmuMx
# mAFAMrFQhq66GjoMDl8JiGHD/KBJQ9X4eUAEotS27lTCOYU0ryA6dWBGqBSTWCUa
# smpxkeGQKOew+717HV1H4FdCRYG1Rgm7yFN423JULeew+T7DHvfe0K55vMIulx5I
# g3oVZA==
# =dxC7
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 11:00:01 HKT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-pa-20231106' of https://gitlab.com/rth7680/qemu: (85 commits)
hw/hppa: Allow C3700 with 64-bit and B160L with 32-bit CPU only
hw/hppa: Turn on 64-bit CPU for C3700 machine
hw/pci-host/astro: Trigger CPU irq on CPU HPA in high memory
hw/pci-host/astro: Map Astro chip into 64-bit I/O memory region
target/hppa: Improve interrupt logging
target/hppa: Update IIAOQ, IIASQ for pa2.0
target/hppa: Create raise_exception_with_ior
target/hppa: Add unwind_breg to CPUHPPAState
target/hppa: Clear upper bits in mtctl for pa1.x
target/hppa: Avoid async_safe_run_on_cpu on uniprocessor system
target/hppa: Add pa2.0 cpu local tlb flushes
target/hppa: Implement pa2.0 data prefetch instructions
linux-user/hppa: Drop EXCP_DUMP from handled exceptions
hw/hppa: Translate phys addresses for the cpu
include/hw/elf: Remove truncating signed casts
target/hppa: Return zero for r0 from load_gpr
target/hppa: Precompute zero into DisasContext
target/hppa: Fix interruption based on default PSW
target/hppa: Implement PERMH
target/hppa: Implement MIXH, MIXW
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Third RISC-V PR for 8.2
* Rename ext_icboz to ext_zicboz
* Rename ext_icbom to ext_zicbom
* Rename ext_icsr to ext_zicsr
* Rename ext_ifencei to ext_zifencei
* Add RISC-V Virtual IRQs and IRQ filtering support
* Change default linux-user cpu to 'max'
* Update 'virt' machine core limit
* Add query-cpu-model-expansion API
* Rename epmp to smepmp and expose the extension
* Clear pmp/smepmp bits on reset
* Ignore pmp writes when RW=01
* Support zicntr/zihpm flags and disable support
* Correct CSR_MSECCFG operations
* Update mail address for Weiwei Li
* Update RISC-V vector crypto to ratified v1.0.0
* Clear the Ibex/OpenTitan SPI interrupts even if disabled
* Set the OpenTitan priv to 1.12.0
* Support discontinuous PMU counters
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmVJoOEACgkQr3yVEwxT
# gBPwcw/5AXgSVu521IHpobofq4Skc2rpO9P0Hep3IniBuS+5+h2XM3fwWNBaeeGj
# LZgdXDrCfcCnPuFh2I5j1D885xJDncDF4LET9EFtxK+BTT8eC5JpaCnORdV3Zd2T
# C7qdq1r4J/wKBel3cAz1jlLXc2Pssle4NFaMZGmOGlNX/mLJUYkI6BwKG9wNiCI+
# cCRQW5bEv9g8XzPYPsIKhX9aTegDKdV5x4Xj3YyVs8qkZTVM7Ona8GTpy6eShNfL
# h/RW+yvSxLwfKC9YJHesjI1oqhLsAuA7hFu5AVHiedFNAD5FevMZsZwrqjrmeBOG
# 5awBw9XgfXFFl7jQ0VQVRknt/PFANzTmGGbjLUkaXgJ6iTmH7oIMzwbkx2pM/0Qd
# HV2EboUPe5rJl0SNhcDMCJkYJYpt4z6TVXFpN5p10WU4K1AJXZf9P3YkChcxWiSK
# B4DlY4ax3W77voySwbKCvJRIRWCFQZmtl7doFY5dEQz2ERcNfI7VIB1GKIj7BlGm
# AVTCc5G9KghsaB8q0BzYbDplzCggdaaUBRgpIgLS/n22GKJlOisFwMCawWquPkEw
# i0t3ftt+Ket4Qnnq+dO4W3ehR4qW1/XatCWgQ3NCSgUeS4/9VK3h/nz5t+L7iKwp
# mjp86gNN11wcJRsBIIV7nOAmSAs9ybCm2F4J6YAyh3n1IlRVN0Q=
# =2A+W
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 10:28:49 HKT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20231107' of https://github.com/alistair23/qemu: (49 commits)
docs/about/deprecated: Document RISC-V "pmu-num" deprecation
target/riscv: Add "pmu-mask" property to replace "pmu-num"
target/riscv: Use existing PMU counter mask in FDT generation
target/riscv: Don't assume PMU counters are continuous
target/riscv: Propagate error from PMU setup
target/riscv: cpu: Set the OpenTitan priv to 1.12.0
hw/ssi: ibex_spi_host: Clear the interrupt even if disabled
disas/riscv: Replace TABs with space
disas/riscv: Add support for vector crypto extensions
disas/riscv: Add rv_codec_vror_vi for vror.vi
disas/riscv: Add rv_fmt_vd_vs2_uimm format
target/riscv: Move vector crypto extensions to riscv_cpu_extensions
target/riscv: Expose Zvks[c|g] extnesion properties
target/riscv: Add cfg properties for Zvks[c|g] extensions
target/riscv: Expose Zvkn[c|g] extnesion properties
target/riscv: Add cfg properties for Zvkn[c|g] extensions
target/riscv: Expose Zvkb extension property
target/riscv: Replace Zvbb checking by Zvkb
target/riscv: Add cfg property for Zvkb extension
target/riscv: Expose Zvkt extension property
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Prevent that users try to boot a 64-bit only C3700 machine with a 32-bit
CPU, and to boot a 32-bit only B160L machine with a 64-bit CPU.
Signed-off-by: Helge Deller <deller@gmx.de>
The CPU HPA is in the high F-region on PA2.0 CPUs, so use F_EXTEND()
to trigger interrupt request at the right CPU HPA address.
Note that the cpu_hpa value comes out of the IRT, which doesn't store the
higher addresss bits.
Signed-off-by: Helge Deller <deller@gmx.de>
Fill in the insn_start value during form_gva, and copy
it out to the env field in hppa_restore_state_to_opc.
The value is not yet consumed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The previous decoding misnamed the bit it called "local".
Other than the name, the implementation was correct for pa1.x.
Rename this field to "tlbe".
PA2.0 adds (a real) local bit to PxTLB, and also adds a range
of pages to flush in GR[b].
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are aliased onto the normal integer loads to %g0.
Since we don't emulate caches, prefetch is a nop.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Hack the machine to use pa2.0 physical layout when required,
using the PSW.W=0 absolute to physical mapping.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There's nothing about elf that specifically requires signed vs unsigned.
This is very much a target-specific preference.
In the meantime, casting low and high from uint64_t back to Elf_SWord
to uint64_t discards high bits that might have been set by translate_fn.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The default PSW is set by the operating system with the PDC_PSW
firmware call. Use that setting to decide if wide mode is to be
enabled for interruptions and EIRR usage.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out the tlb to a subsection so that it can be separately
versioned -- the format is only partially following the architecture
and is partially guided by the qemu implementation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The conversions to/from i64 can be eliminated entirely,
folding computation into adjacent operations.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename the existing insert tlb helpers to emphasize that they
are for pa1.1 cpus. Implement a combined i/d tlb for pa2.0.
Still missing is the new 'P' tlb bit.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allow both user-only and system mode to run pa2.0 cpus.
Avoid creating a separate qemu-system-hppa64 binary;
force the qemu-hppa binary to use TARGET_ABI32.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is no support for hppa64 in gdb. Any attempt to provide the
data for the larger hppa64 registers results in an error from gdb.
Mask CR_SAR writes to the width of the register: 5 or 6 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Hoist the resolution of d up one level above do_unit_cond.
All computations are logical, and are simplified by using a mask of the
correct width, after which the result may be compared with zero.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Hoist the resolution of d up one level above do_sed_cond.
The MOVB comparison and the existing shift/extract/deposit
are all 32-bit.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The sar shift amount register is limited to 5 bits when running
a 32-bit CPU. Strip off the remaining bits.
The interesting part is, that this register allows to detect at runtime
if a physical CPU is capable to execute PA2.0 (64-bit) instructions.
Signed-off-by: Helge Deller <deller@gmx.de>
We need to make sure the link is masked properly along the
use_nullify_skip path. The other three settings of a link
register already use this.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will be how we ensure that the IAOQ is always
valid per PSW.W, therefore all stores to these two
variables must be done with this function.
Use third argument -1 if the destination is always dynamic,
and fourth argument NULL if the destination is always static.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In form_gva and cpu_get_tb_cpu_state, we must truncate when PSW_W == 0.
In space_select, the bits that choose the space depend on PSW_W.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With pa2.0, absolute addresses are not the same as physical addresses,
and undergo a transformation based on PSW_W.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With 64-bit registers, there are 16 carry bits in the PSW.
Clear reserved bits based on cpu revision.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prepare for the qemu binary supporting both pa10 and pa20
at the same time.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Select the proper carry bit for input to the arithmetic
and for output for the condition.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This instruction always uses the input carry from bit 32,
but produces all 16 output carry bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The destination is TCGv_i32, so use tcg_gen_qemu_ld_i32
not tcg_gen_qemu_ld_reg.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Complete the data structure conversion started earlier. This reduces
the perf overhead of hppa_get_physical_address from ~5% to ~0.25%.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No need to trigger the large_page_mask code unnecessarily.
Drop the now unused HPPATLBEntry.page_size field.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the va_b and va_b fields with the interval tree node.
The actual interval tree is not yet used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use a separate mmu index for PSW_P enabled vs disabled.
This means we can elide the tlb flush in cpu_hppa_put_psw
when PSW_P changes. This turns out to be the majority
of all tlb flushes.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Block patches:
- One patch to make qcow2's discard-no-unref option do better what it is
supposed to do (i.e. prevent fragmentation)
- Two fixes for zoned requests
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmVJHbgSHGhyZWl0ekBy
# ZWRoYXQuY29tAAoJEKH6QNCYAZzfLn4QAKxuUYZaXirv6K4U2tW4aAJtc5uESdwv
# WYhG7YU7MleBGCY0fRoih5thrPrzRLC8o1QhbRcA36+/PAZf4BYrJEfqLUdzuN5x
# 6Vb1n3NRUzPD1+VfL/B9hVZhFbtTOUZuxPGEqCoHAmqBaeKuYRT1bLZbtRtPVLSk
# 5eTMiyrpRMlBWc7O71eGKLqU4k0vAznwHBGf2Z93qWAsKcRZCwbAWYa7Q6rJ9jJ8
# 1jNsQuAk0p74/uGEpFhoEVrFEcV6pMbI4+jB9i0t9YYxT0tLIdIX1VUx+AHJfItk
# IF2stB6SFOaAy2W3Fn+0oJvz40aMLzg9VjEeTpGmdlKC67ZTYa6Obwzy5WNLPIap
# k7VUheUEe8qoKUtxQNxGLR/HKEJSFXyhU0lgAGxE1gl2xc1QFFFsrimpwFd3d37j
# 3PwfhjARHonf4ZXgsvtIjb7nG9seMZYO7Vht0OztJyW8c2XN5OFVPir9xLbd9VUg
# wZNGB8jAsHgj77+S/mRIwpP+laKL8wB7zYZ1mgFI98QJIYqL8tGdV/IiUhLljHzc
# XAmwekOhBMMbgHhliBy9zDuTy59+zZ0FoxZPn/JvBjqBAkEnz9EbhHxi2imQg+1d
# XSoLbx1X1yEbepWz8mCGiveLIPkt+3qMJuuQF76nURaA+nm3tCl/nKca6QLnVKzU
# 2QtPWS0qRmwd
# =5w7S
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 01:09:12 HKT
# gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF
# gpg: issuer "hreitz@redhat.com"
# gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF
* tag 'pull-block-2023-11-06' of https://gitlab.com/hreitz/qemu:
file-posix: fix over-writing of returning zone_append offset
block/file-posix: fix update_zones_wp() caller
qcow2: keep reference on zeroize with discard-no-unref enabled
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
vfio queue:
* Support for non 64b IOVA space
* Introduction of a PCIIOMMUOps callback structure to ease future
extensions
* Fix for a buffer overrun when writing the VF token
* PPC cleanups preparing ground for IOMMUFD support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmVI+bIACgkQUaNDx8/7
# 7KHW4g/9FmgX0k2Elm1BAul3slJtuBT8/iHKfK19rhXICxhxS5xBWJA8FmosTWAT
# 91YqQJhOHARxLd9VROfv8Fq8sAo+Ys8bP3PTXh5satjY5gR9YtmMSVqvsAVLn7lv
# a/0xp7wPJt2UeKzvRNUqFXNr7yHPwxFxbJbmmAJbNte8p+TfE2qvojbJnu7BjJbg
# sTtS/vFWNJwtuNYTkMRoiZaUKEoEZ8LnslOqKUjgeO59g4i3Dq8e2JCmHANPFWUK
# cWmr7AqcXgXEnLSDWTtfN53bjcSCYkFVb4WV4Wv1/7hUF5jQ4UR0l3B64xWe0M3/
# Prak3bWOM/o7JwLBsgaWPngXA9V0WFBTXVF4x5qTwhuR1sSV8MxUvTKxI+qqiEzA
# FjU89oSZ+zXId/hEUuTL6vn1Th8/6mwD0L9ORchNOQUKzCjBzI4MVPB09nM3AdPC
# LGThlufsZktdoU2KjMHpc+gMIXQYsxkgvm07K5iZTZ5eJ4tV5KB0aPvTZppGUxe1
# YY9og9F3hxjDHQtEuSY2rzBQI7nrUpd1ZI5ut/3ZgDWkqD6aGRtMme4n4GsGsYb2
# Ht9+d2RL9S8uPUh+7rV8K/N3+vXgXRaEYTuAScKtflEbA7YnZA5nUdMng8x0kMTQ
# Y73XCd4UGWDfSSZsgaIHGkM/MRIHgmlrfcwPkWqWW9vF+92O6Hw=
# =/Du0
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Nov 2023 22:35:30 HKT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20231106' of https://github.com/legoater/qemu: (22 commits)
vfio/common: Move vfio_host_win_add/del into spapr.c
vfio/spapr: Make vfio_spapr_create/remove_window static
vfio/container: Move spapr specific init/deinit into spapr.c
vfio/container: Move vfio_container_add/del_section_window into spapr.c
vfio/container: Move IBM EEH related functions into spapr_pci_vfio.c
util/uuid: Define UUID_STR_LEN from UUID_NONE string
util/uuid: Remove UUID_FMT_LEN
vfio/pci: Fix buffer overrun when writing the VF token
util/uuid: Add UUID_STR_LEN definition
hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps
test: Add some tests for range and resv-mem helpers
virtio-iommu: Consolidate host reserved regions and property set ones
virtio-iommu: Implement set_iova_ranges() callback
virtio-iommu: Record whether a probe request has been issued
range: Introduce range_inverse_array()
virtio-iommu: Introduce per IOMMUDevice reserved regions
util/reserved-region: Add new ReservedRegion helpers
range: Make range_compare() public
virtio-iommu: Rename reserved_regions into prop_resv_regions
vfio: Collect container iova range info
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Hyper-V Dynamic Memory protocol driver.
This driver is like virtio-balloon on steroids for Windows guests:
it allows both changing the guest memory allocation via ballooning and
inserting pieces of extra RAM into it on demand from a provided memory
backend via Windows-native Hyper-V Dynamic Memory protocol.
* Preparatory patches to support empty memory devices and ones with
large alignment requirements.
* Revert of recently added "hw/virtio/virtio-pmem: Replace impossible
check by assertion" commit 5960f254db since this series makes this
situation possible again.
* Protocol definitions.
* Hyper-V DM protocol driver (hv-balloon) base (ballooning only).
* Hyper-V DM protocol driver (hv-balloon) hot-add support.
* qapi query-memory-devices support for the driver.
* qapi HV_BALLOON_STATUS_REPORT event.
* The relevant PC machine plumbing.
* New MAINTAINERS entry for the above.
# -----BEGIN PGP SIGNATURE-----
#
# iQGzBAABCAAdFiEE4ndqq6COJv9aG0oJUrHW6VHQzgcFAmVI81IACgkQUrHW6VHQ
# zgdzTgv+I5eV2R01YLOBBJhBjzxZ4/BUqkuUHNxHpfjuCqEIzPb7FIfoZ4ZyXZFT
# YJdSE4lPeTZLrmmi/Nt6G0rUKDvdCeIgkS2VLHFSsTV8IzcT71BTRGzV0zAjUF5v
# yDH6uzo6e9gmaziIalRjibUxSDjCQmoCifms2rS2DwazADudUp+naGfm+3uyA0gM
# raOfBfRkNZsDqhXg2ayuqPIES75xQONoON9xYPKDAthS48POEbqtWBKuFopr3kXY
# y0eph+NAw+RajCyLYKM3poIgaSu3l4WegInuKQffzqKR8dxrbwPdCmtgo6NSHx0W
# uDfl7FUBnGzrR18VU4ZfTSrF5SVscGwF9EL7uocJen15inJjl1q3G53uZgyGzHLC
# cw8fKMjucmE8njQR2qiMyX0b+T4+9nKO1rykBgTG/+c9prRUVoxYpFCF117Ei0U8
# QzLGACW1oK+LV41bekWAye7w9pShUtFaxffhPbJeZDDGh7q0x61R3Z3yKkA07p46
# /YWWFWUD
# =RAb0
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Nov 2023 22:08:18 HKT
# gpg: using RSA key E2776AABA08E26FF5A1B4A0952B1D6E951D0CE07
# gpg: Good signature from "Maciej S. Szmigiero <mail@maciej.szmigiero.name>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 727A 0D4D DB9E D9F6 039B ECEF 847F 5E37 90CE 0977
# Subkey fingerprint: E277 6AAB A08E 26FF 5A1B 4A09 52B1 D6E9 51D0 CE07
* tag 'pull-hv-balloon-20231106' of https://github.com/maciejsszmigiero/qemu:
MAINTAINERS: Add an entry for Hyper-V Dynamic Memory Protocol
hw/i386/pc: Support hv-balloon
qapi: Add HV_BALLOON_STATUS_REPORT event and its QMP query command
qapi: Add query-memory-devices support to hv-balloon
Add Hyper-V Dynamic Memory Protocol driver (hv-balloon) hot-add support
Add Hyper-V Dynamic Memory Protocol driver (hv-balloon) base
Add Hyper-V Dynamic Memory Protocol definitions
memory-device: Drop size alignment check
Revert "hw/virtio/virtio-pmem: Replace impossible check by assertion"
memory-device: Support empty memory devices
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Bugfixes for emulated Xen support
Selected bugfixes for mainline and stable, especially to the per-vCPU
local APIC vector delivery mode for event channel notifications, which
was broken in a number of ways.
The xen-block driver has been defaulting to the wrong protocol for x86
guest, and this fixes that — which is technically an incompatible change
but I'm fairly sure nobody relies on the broken behaviour (and in
production I *have* seen guests which rely on the correct behaviour,
which now matches the blkback driver in the Linux kernel).
A handful of other simple fixes for issues which came to light as new
features (qv) were being developed.
# -----BEGIN PGP SIGNATURE-----
#
# iQJIBAABCAAyFiEEvgfZ/VSAmrLEsP9fY3Ys2mfi81kFAmVIvv4UHGR3bXcyQGlu
# ZnJhZGVhZC5vcmcACgkQY3Ys2mfi81nFmRAAvK3VNuGDV56TJqFdtEWD+3jzSZU0
# CoL1mxggvwnlFn1SdHvbC5jl+UscknErcNbqlxMTTg9jQiiQqzFuaWujJnL0dEOY
# RJiS2scKln/1gv9NRbLE31FjPwoNz+zJI/iMvdutjT7Ll//v34jY0vd1Y5Wo53ay
# MBschuuxD1sUUTHNj5f9afrgZaetJfgBSNZraiLR5T2HEadJVJuhItdGxW1+KaPI
# zBIcflIeZmJl9b/L1a2bP3KJmRo8QzHB56X3uzwkPhYhYSU2dnCaJTLCkiNfK+Qh
# SgCBMlzsvJbIZqDA9YPOGdKK1ArfTJRmRDwAkqH0YQknQGoIkpN+7eQiiSv6PMS5
# U/93V7r6MfaftIs6YdWSnFozWeBuyKZL9H2nAXqZgL5t6uEMVR8Un/kFnGfslTFY
# 9gQ1o4IM6ECLiXhIP/sPNOprrbFb0HU7QPtEDJOxrJzBM+IfLbldRHn4p9CccqQA
# LHvJF98VhX1d0nA0iZBT3qqfKPbmUhRV9Jrm+WamqNrRXhiGdF8EidsUf8RWX+JD
# xZWJiqhTwShxdLE6TC/JgFz4cQCVHG8QiZstZUbdq59gtz9YO5PGByMgI3ds7iNQ
# lGXAPFm+1wU85W4dZOH7qyim6d9ytFm2Fm110BKM8l9B6UKEuKHpsxXMqdo65JXI
# 7uBKbVpdPKul0DY=
# =dQ7h
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Nov 2023 18:25:02 HKT
# gpg: using RSA key BE07D9FD54809AB2C4B0FF5F63762CDA67E2F359
# gpg: issuer "dwmw2@infradead.org"
# gpg: Good signature from "David Woodhouse <dwmw2@infradead.org>" [unknown]
# gpg: aka "David Woodhouse <dwmw2@exim.org>" [unknown]
# gpg: aka "David Woodhouse <david@woodhou.se>" [unknown]
# gpg: aka "David Woodhouse <dwmw2@kernel.org>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: BE07 D9FD 5480 9AB2 C4B0 FF5F 6376 2CDA 67E2 F359
* tag 'pull-xenfv-stable-20231106' of git://git.infradead.org/users/dwmw2/qemu:
hw/xen: use correct default protocol for xen-block on x86
hw/xen: take iothread mutex in xen_evtchn_reset_op()
hw/xen: fix XenStore watch delivery to guest
hw/xen: don't clear map_track[] in xen_gnttab_reset()
hw/xen: select kernel mode for per-vCPU event channel upcall vector
i386/xen: fix per-vCPU upcall vector for Xen emulation
i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Using a mask instead of the number of PMU devices supports the accurate
emulation of platforms that have a discontinuous set of PMU counters.
The "pmu-num" property now generates a warning when used by the user on
the command line.
Rather than storing the value for "pmu-num" convert it directly to the
mask if it is specified (overwriting the default "pmu-mask" value)
likewise the value is calculated from the mask if the property value is
obtained.
In the unusual situation that both "pmu-mask" and "pmu-num" are provided
then then the order on the command line determines which takes
precedence (later overwriting earlier.)
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231031154000.18134-5-rbradford@rivosinc.com>
[Changes by AF
- Fixup ext_zihpm logic after rebase
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We currently don't clear the interrupts if they are disabled. This means
that if an interrupt occurs and the guest disables interrupts the QEMU
IRQ will remain high.
This doesn't immediately affect guests, but if the
guest re-enables interrupts it's possible that we will miss an
interrupt as it always remains set.
Let's update the logic to always call qemu_set_irq() even if the
interrupts are disabled to ensure we set the level low. The level will
never be high unless interrupts are enabled, so we won't generate
interrupts when we shouldn't.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231102003424.2003428-2-alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
zihpm is the Hardware Performance Counters extension described in
chapter 12 of the unprivileged spec. It describes support for 29
unprivileged performance counters, hpmcounter3-hpmcounter31.
As with zicntr, QEMU already implements zihpm before it was even an
extension. zihpm is also part of the RVA22 profile, so add it to QEMU
to complement the future profile implementation. Default it to 'true'
for all existing CPUs since it was always present in the code.
As for disabling it, there is already code in place in
target/riscv/csr.c in all predicates for these counters (ctr() and
mctr()) that disables them if cpu->cfg.pmu_num is zero. Thus, setting
cpu->cfg.pmu_num to zero if 'zihpm=false' is enough to disable the
extension.
Set cpu->pmu_avail_ctrs mask to zero as well since this is also checked
to verify if the counters exist.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231023153927.435083-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
zicntr is the Base Counters and Timers extension described in chapter 12
of the unprivileged spec. It describes support for RDCYCLE, RDTIME and
RDINSTRET.
QEMU already implements it in TCG way before it was a discrete
extension. zicntr is part of the RVA22 profile, so let's add it to QEMU
to make the future profile implementation flag complete. Given than it
represents an already existing feature, default it to 'true' for all
CPUs.
For TCG, we need a way to disable zicntr if the user wants to. This is
done by restricting access to the CYCLE, TIME, and INSTRET counters via
the 'ctr()' predicate when we're about to access them.
Disabling zicntr happens via the command line or if its dependency,
zicsr, happens to be disabled. We'll check for zicsr during realize()
and, in case it's absent, disable zicntr. However, if the user was
explicit about having zicntr support, error out instead of disabling it.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231023153927.435083-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
As per the Priv spec: "The R, W, and X fields form a collective WARL
field for which the combinations with R=0 and W=1 are reserved."
However currently such writes are not ignored as ought to be. The
combinations with RW=01 are allowed only when the Smepmp extension
is enabled and mseccfg.MML is set.
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231019065705.1431868-1-mchitale@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
As per the Priv and Smepmp specifications, certain bits such as the 'L'
bit of pmp entries and mseccfg.MML can only be cleared upon reset and it
is necessary to do so to allow 'M' mode firmware to correctly reinitialize
the pmp/smpemp state across reboots. As required by the spec, also clear
the 'A' field of pmp entries.
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231019065644.1431798-1-mchitale@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Use the recently added riscv_cpu_accelerator_compatible() to filter
unavailable CPUs for a given accelerator. At this moment this is the
case for a QEMU built with KVM and TCG support querying a binary running
with TCG:
qemu-system-riscv64 -S -M virt,accel=tcg -display none
-qmp tcp:localhost:1234,server,wait=off
./qemu/scripts/qmp/qmp-shell localhost:1234
(QEMU) query-cpu-model-expansion type=full model={"name":"host"}
{"error": {"class": "GenericError", "desc": "'host' CPU not available with tcg"}}
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231018195638.211151-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Add an API to check if a given CPU is compatible with the current
accelerator.
This will allow query-cpu-model-expansion to work properly in conditions
where QEMU supports both accelerators (TCG and KVM), QEMU is then
launched using TCG, and the API requests information about a KVM only
CPU (e.g. 'host' CPU).
KVM doesn't have such restrictions and, at least in theory, all CPUs
models should work with KVM. We will revisit this API in case we decide
to restrict the amount of KVM CPUs we support.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231018195638.211151-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Callers can add 'props' when querying for a cpu model expansion to see
if a given CPU model supports a certain criteria, and what's the
resulting CPU object.
If we have 'props' to handle, gather it in a QDict and use the new
riscv_cpuobj_validate_qdict_in() helper to validate it. This helper will
add the custom properties in the CPU object and validate it using
riscv_cpu_finalize_features(). Users will be aware of validation errors
if any occur, if not a CPU object with 'props' will be returned.
Here's an example with the veyron-v1 vendor CPU. Disabling vendor CPU
extensions is allowed, assuming the final config is valid. Disabling
'smstateen' is a valid expansion:
(QEMU) query-cpu-model-expansion type=full model={"name":"veyron-v1","props":{"smstateen":false}}
{"return": {"model": {"name": "veyron-v1", "props": {"zicond": false, ..., "smstateen": false, ...}
But enabling extensions isn't allowed for vendor CPUs. E.g. enabling 'V'
for the veyron-v1 CPU isn't allowed:
(QEMU) query-cpu-model-expansion type=full model={"name":"veyron-v1","props":{"v":true}}
{"error": {"class": "GenericError", "desc": "'veyron-v1' CPU does not allow enabling extensions"}}
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231018195638.211151-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The query-cpu-model-expansion API is capable of passing extra properties
to a given CPU model and tell callers if this custom configuration is
valid.
The RISC-V version of the API is not quite there yet. The reason is the
realize() flow in the TCG driver, where most of the validation is done
in tcg_cpu_realizefn(). riscv_cpu_finalize_features() is then used to
validate satp_mode for both TCG and KVM CPUs.
Our ARM friends uses a concept of 'finalize_features()', a step done in
the end of realize() where the CPU features are validated. We have a
riscv_cpu_finalize_features() helper that, at this moment, is only
validating satp_mode.
Re-use this existing helper to do all CPU extension validation we
required after at the end of realize(). Make it public to allow APIs to
use it. At this moment only the TCG driver requires a realize() time
validation, thus, to avoid adding accelerator specific helpers in the
API, riscv_cpu_finalize_features() uses
riscv_tcg_cpu_finalize_features() if we are running TCG. The API will
then use riscv_cpu_finalize_features() regardless of the current
accelerator.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231018195638.211151-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We got along without property getters in the KVM driver because we never
needed them. But the incoming query-cpu-model-expansion API will use
property getters and setters to retrieve the CPU characteristics.
Add the missing getters for the KVM driver for both MISA and
multi-letter extension properties. We're also adding an special getter
for absent multi-letter properties that KVM doesn't implement that
always return false.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231018195638.211151-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit f57d5f8004 deprecated the 'any' CPU type but failed to change the
default CPU for linux-user. The result is that all linux-users
invocations that doesn't specify a different CPU started to show a
deprecation warning:
$ ./build/qemu-riscv64 ./foo-novect.out
qemu-riscv64: warning: The 'any' CPU is deprecated and will be removed in the future.
Change the default CPU for RISC-V linux-user from 'any' to 'max'.
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: f57d5f8004 ("target/riscv: deprecate the 'any' CPU type")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231020074501.283063-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This change adds support for inserting virtual interrupts from HS-mode
into VS-mode using hvien and hvip csrs. This also allows for IRQ filtering
from HS-mode.
Also, the spec doesn't mandate the interrupt to be actually supported
in hardware. Which allows HS-mode to assert virtual interrupts to VS-mode
that have no connection to any real interrupt events.
This is defined as part of the AIA specification [0], "6.3.2 Virtual
interrupts for VS level".
[0]: https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231016111736.28721-7-rkanwal@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This change adds support for inserting virtual interrupts from M-mode
into S-mode using mvien and mvip csrs. IRQ filtering is a use case of
this change, i-e M-mode can stop delegating an interrupt to S-mode and
instead enable it in MIE and receive those interrupts in M-mode and then
selectively inject the interrupt using mvien and mvip.
Also, the spec doesn't mandate the interrupt to be actually supported
in hardware. Which allows M-mode to assert virtual interrupts to S-mode
that have no connection to any real interrupt events.
This is defined as part of the AIA specification [0], "5.3 Interrupt
filtering and virtual interrupts for supervisor level".
[0]: https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231016111736.28721-6-rkanwal@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This is to allow virtual interrupts to be inserted into S and VS
modes. Given virtual interrupts will be maintained in separate
mvip and hvip CSRs, riscv_cpu_update_mip will no longer be in the
path and interrupts need to be triggered for these cases from
rmw_hvip64 and rmw_mvip64 functions.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231016111736.28721-5-rkanwal@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
With H-Ext supported, VS bits are all hardwired to one in MIDELEG
denoting always delegated interrupts. This is being done in rmw_mideleg
but given mideleg is used in other places when routing interrupts
this change initializes it in riscv_cpu_realize to be on the safe side.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231016111736.28721-4-rkanwal@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Fixes a bug wherein raw uses of tcg_constant_internal
do not have their TempOptInfo initialized.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Compare two temps for "better", split out from finding
the best from a whole list. Use TCGKind, which already
gives the proper priority.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Avoid reusing vector temporaries so that we may re-use them
when propagating stores to loads.
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move all of it into accel/tcg/monitor.c. This puts everything
about tcg that is only used by the monitor in the same place.
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
raw_co_zone_append() sets "s->offset" where "BDRVRawState *s". This pointer
is used later at raw_co_prw() to save the block address where the data is
written.
When multiple IOs are on-going at the same time, a later IO's
raw_co_zone_append() call over-writes a former IO's offset address before
raw_co_prw() completes. As a result, the former zone append IO returns the
initial value (= the start address of the writing zone), instead of the
proper address.
Fix the issue by passing the offset pointer to raw_co_prw() instead of
passing it through s->offset. Also, remove "offset" from BDRVRawState as
there is no usage anymore.
Fixes: 4751d09adc ("block: introduce zone append write for zoned devices")
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Message-Id: <20231030073853.2601162-1-naohiro.aota@wdc.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
When the zoned request fail, it needs to update only the wp of
the target zones for not disrupting the in-flight writes on
these other zones. The wp is updated successfully after the
request completes.
Fixed the callers with right offset and nr_zones.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-Id: <20230825040556.4217-1-faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[hreitz: Rebased and fixed comment spelling]
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
When the discard-no-unref flag is enabled, we keep the reference for
normal discard requests.
But when a discard is executed on a snapshot/qcow2 image with backing,
the discards are saved as zero clusters in the snapshot image.
When committing the snapshot to the backing file, not
discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
any logic to keep the reference when discard-no-unref is enabled.
Therefor we add logic in the zero_in_l2_slice call to keep the reference
on commit.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-Id: <20231003125236.216473-2-jean-louis@dupond.be>
[hreitz: Made the documentation change more verbose, as discussed
on-list]
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
On the vexpress-a9 board we try to map both RAM and flash to address 0,
as seen in "info mtree":
address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-0000000003ffffff (prio 0, romd): alias vexpress.flashalias @vexpress.flash0 0000000000000000-0000000003ffffff
0000000000000000-0000000003ffffff (prio 0, ram): alias vexpress.lowmem @vexpress.highmem 0000000000000000-0000000003ffffff
0000000010000000-0000000010000fff (prio 0, i/o): arm-sysctl
0000000010004000-0000000010004fff (prio 0, i/o): pl041
(etc)
The flash "wins" and the RAM mapping is useless (but also harmless).
This happened as a result of commit 6ec1588e in 2014, which changed
"we always map the RAM to the low addresses for vexpress-a9" to "we
always map flash in the low addresses", but forgot to stop mapping
the RAM.
In real hardware, this low part of memory is remappable, both at
runtime by the guest writing to a control register, and configurably
as to what you get out of reset -- you can have the first flash
device, or the second, or the DDR2 RAM, or the external AXI bus
(which for QEMU means "nothing there"). In an ideal world we would
support that remapping both at runtime and via a machine property to
select the out-of-reset behaviour.
Pending anybody caring enough to implement the full remapping
behaviour:
* remove the useless mapped-but-inaccessible lowram MR
* document that QEMU doesn't support remapping of low memory
Fixes: 6ec1588e ("hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1761
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231103185602.875849-1-peter.maydell@linaro.org
We support only 3- and 4-level page-tables, which is firstly checked in
vtd_decide_config(), then setup in vtd_init(). Than level fields are
checked by vtd_is_level_supported().
So here we can't have level out from 1..4 inclusive range. Let's assert
it. That also explains Coverity that we are not going to overflow the
array.
CID: 1487158, 1487186
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-2-vsementsov@yandex-team.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Documentation for using the GAS in ACPI tables to report debug UART addresses at
https://learn.microsoft.com/en-us/windows-hardware/drivers/bringup/acpi-debug-port-table
states the following:
- The Register Bit Width field contains the register stride and must be a
power of 2 that is at least as large as the access size. On 32-bit
platforms this value cannot exceed 32. On 64-bit platforms this value
cannot exceed 64.
- The Access Size field is used to determine whether byte, WORD, DWORD, or
QWORD accesses are to be used. QWORD accesses are only valid on 64-bit
architectures.
Documentation for the ARM PL011 at
https://developer.arm.com/documentation/ddi0183/latest/
states that the registers are:
- spaced 4 bytes apart (see Table 3-2), so register stride must be 32.
- 16 bits in size in some cases (see individual registers), so access
size must be at least 2.
Linux doesn't seem to care about this error in the table, but it does
affect at least the NOVA microhypervisor.
In theory we therefore have a choice between reporting the access
size as 2 (16 bit accesses) or 3 (32-bit accesses). In practice,
Linux does not correctly handle the case where the table reports the
access size as 2: as of kernel commit 750b95887e5678, the code in
acpi_parse_spcr() tries to tell the serial driver to use 16 bit
accesses by passing "mmio16" in the option string, but the PL011
driver code in pl011_console_match() only recognizes "mmio" or
"mmio32". The result is that unless the user has enabled 'earlycon'
there is no console output from the guest kernel.
We therefore choose to report the access size as 32 bits; this works
for NOVA and also for Linux. It is also what the UEFI firmware on a
Raspberry Pi 4 reports, so we're in line with existing real-world
practice.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1938
Signed-off-by: Udo Steinberg <udo@hypervisor.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: minor commit message tweaks; use 32 bit accesses]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allow changes to the virt board SPCR and DBG2 -- we are going to fix
an error in the UART descriptions there.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since commit 9036e917f8 ("{include/}hw/arm: refactor virt PPI logic")
PMU IRQ registration fails for arm64 guests:
[ 0.563689] hw perfevents: unable to request IRQ14 for ARM PMU counters
[ 0.565160] armv8-pmu: probe of pmu failed with error -22
That commit re-defined VIRTUAL_PMU_IRQ to be a INTID but missed a case
where the PMU IRQ is actually referred by its PPI index. Fix that by using
INTID_TO_PPI() in that case.
Fixes: 9036e917f8 ("{include/}hw/arm: refactor virt PPI logic")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1960
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 475d918d-ab0e-f717-7206-57a5beb28c7b@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If we decide to apply this patch (for easier backporting reasons), we
can now revert it.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Add the necessary plumbing for the hv-balloon driver to the PC machine.
Co-developed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Used by the hv-balloon driver for (optional) guest memory status reports.
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
One of advantages of using this protocol over ACPI-based PC DIMM hotplug is
that it allows hot-adding memory in much smaller granularity because the
ACPI DIMM slot limit does not apply.
In order to enable this functionality a new memory backend needs to be
created and provided to the driver via the "memdev" parameter.
This can be achieved by, for example, adding
"-object memory-backend-ram,id=mem1,size=32G" to the QEMU command line and
then instantiating the driver with "memdev=mem1" parameter.
The device will try to use multiple memslots to cover the memory backend in
order to reduce the size of metadata for the not-yet-hot-added part of the
memory backend.
Co-developed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
This driver is like virtio-balloon on steroids: it allows both changing the
guest memory allocation via ballooning and (in the next patch) inserting
pieces of extra RAM into it on demand from a provided memory backend.
The actual resizing is done via ballooning interface (for example, via
the "balloon" HMP command).
This includes resizing the guest past its boot size - that is, hot-adding
additional memory in granularity limited only by the guest alignment
requirements, as provided by the next patch.
In contrast with ACPI DIMM hotplug where one can only request to unplug a
whole DIMM stick this driver allows removing memory from guest in single
page (4k) units via ballooning.
After a VM reboot the guest is back to its original (boot) size.
In the future, the guest boot memory size might be changed on reboot
instead, taking into account the effective size that VM had before that
reboot (much like Hyper-V does).
For performance reasons, the guest-released memory is tracked in a few
range trees, as a series of (start, count) ranges.
Each time a new page range is inserted into such tree its neighbors are
checked as candidates for possible merging with it.
Besides performance reasons, the Dynamic Memory protocol itself uses page
ranges as the data structure in its messages, so relevant pages need to be
merged into such ranges anyway.
One has to be careful when tracking the guest-released pages, since the
guest can maliciously report returning pages outside its current address
space, which later clash with the address range of newly added memory.
Similarly, the guest can report freeing the same page twice.
The above design results in much better ballooning performance than when
using virtio-balloon with the same guest: 230 GB / minute with this driver
versus 70 GB / minute with virtio-balloon.
During a ballooning operation most of time is spent waiting for the guest
to come up with newly freed page ranges, processing the received ranges on
the host side (in QEMU and KVM) is nearly instantaneous.
The unballoon operation is also pretty much instantaneous:
thanks to the merging of the ballooned out page ranges 200 GB of memory can
be returned to the guest in about 1 second.
With virtio-balloon this operation takes about 2.5 minutes.
These tests were done against a Windows Server 2019 guest running on a
Xeon E5-2699, after dirtying the whole memory inside guest before each
balloon operation.
Using a range tree instead of a bitmap to track the removed memory also
means that the solution scales well with the guest size: even a 1 TB range
takes just a few bytes of such metadata.
Since the required GTree operations aren't present in every Glib version
a check for them was added to the meson build script, together with new
"--enable-hv-balloon" and "--disable-hv-balloon" configure arguments.
If these GTree operations are missing in the system's Glib version this
driver will be skipped during QEMU build.
An optional "status-report=on" device parameter requests memory status
events from the guest (typically sent every second), which allow the host
to learn both the guest memory available and the guest memory in use
counts.
Following commits will add support for their external emission as
"HV_BALLOON_STATUS_REPORT" QMP events.
The driver is named hv-balloon since the Linux kernel client driver for
the Dynamic Memory Protocol is named as such and to follow the naming
pattern established by the virtio-balloon driver.
The whole protocol runs over Hyper-V VMBus.
The driver was tested against Windows Server 2012 R2, Windows Server 2016
and Windows Server 2019 guests and obeys the guest alignment requirements
reported to the host via DM_CAPABILITIES_REPORT message.
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
This commit adds Hyper-V Dynamic Memory Protocol definitions, taken
from hv_balloon Linux kernel driver, adapted to the QEMU coding style and
definitions.
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
There is no strong requirement that the size has to be multiples of the
requested alignment, let's drop it. This is a preparation for hv-baloon.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
This reverts commit 5960f254db since the
previous commit made this situation possible again.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Only spapr supports a customed host window list, other vfio driver
assume 64bit host window. So remove the check in listener callback
and move vfio_host_win_add/del into spapr.c and make it static.
With the check removed, we still need to do the same check for
VFIO_SPAPR_TCE_IOMMU which allows a single host window range
[dma32_window_start, dma32_window_size). Move vfio_find_hostwin
into spapr.c and do same check in vfio_container_add_section_window
instead.
When mapping a ram device section, if it's unaligned with
hostwin->iova_pgsizes, this mapping is bypassed. With hostwin
moved into spapr, we changed to check container->pgsizes.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_spapr_create_window calls vfio_spapr_remove_window,
With reoder of definition of the two, we can make
vfio_spapr_create/remove_window static.
No functional changes intended.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move spapr specific init/deinit code into spapr.c and wrap
them with vfio_spapr_container_init/deinit, this way footprint
of spapr is further reduced, vfio_prereg_listener could also
be made static.
vfio_listener_release is unnecessary when prereg_listener is
moved out, so have it removed.
No functional changes intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_container_add/del_section_window are spapr specific functions,
so move them into spapr.c to make container.c cleaner.
No functional changes intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
With vfio_eeh_as_ok/vfio_eeh_as_op moved and made static,
vfio.h becomes empty and is deleted.
No functional changes intended.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
As we are going to introduce an extra subsection for "blob" resources,
scanout have to be restored after.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
"blob" resources don't have an associated pixman image:
#0 pixman_image_get_stride (image=0x0) at ../pixman/pixman-image.c:921
#1 0x0000562327c25236 in virtio_gpu_save (f=0x56232bb13b00, opaque=0x56232b555a60, size=0, field=0x5623289ab6c8 <__compound_literal.3+104>, vmdesc=0x56232ab59fe0) at ../hw/display/virtio-gpu.c:1225
Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=2236353
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Apparently these should be half the memory region sizes confirmed at
least by Radeon FCocde ROM while Rage 128 Pro ROMs don't seem to use
these. Linux r100 DRM driver also checks for a bit in HOST_PATH_CNTL
so we also add that even though the FCode ROM does not seem to set it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <d077d4f90d19db731df78da6f05058db074cada1.1698871239.git.balaton@eik.bme.hu>
Add an empty element to the interfaces array, which is consistent with
the behavior of other devices in qemu and fixes the crash on arm64.
0 0x0000fffff5c18550 in () at /usr/lib64/libc.so.6
1 0x0000fffff6c9cd6c in g_strdup () at /usr/lib64/libglib-2.0.so.0
2 0x0000aaaaab4945d8 in g_strdup_inline (str=<optimized out>) at /usr/include/glib-2.0/glib/gstrfuncs.h:321
3 type_new (info=info@entry=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:133
4 0x0000aaaaab494f14 in type_register_internal (info=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:143
5 type_register (info=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:152
6 type_register_static (info=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:157
7 type_register_static_array (infos=<optimized out>, nr_infos=<optimized out>) at ../qom/object.c:165
8 0x0000aaaaab6147e8 in module_call_init (type=type@entry=MODULE_INIT_QOM) at ../util/module.c:109
9 0x0000aaaaab10a0ec in qemu_init_subsystems () at ../system/runstate.c:817
10 0x0000aaaaab10d334 in qemu_init (argc=13, argv=0xfffffffff198) at ../system/vl.c:2760
11 0x0000aaaaaae4da6c in main (argc=<optimized out>, argv=<optimized out>) at ../system/main.c:47
Signed-off-by: Cong Liu <liucong2@kylinos.cn>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231031012515.15504-1-liucong2@kylinos.cn>
Even on x86_64 the default protocol is the x86-32 one if the guest doesn't
specifically ask for x86-64.
Cc: qemu-stable@nongnu.org
Fixes: b6af8926fb ("xen: add implementations of xen-block connect and disconnect functions...")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The xen_evtchn_soft_reset() function requires the iothread mutex, but is
also called for the EVTCHNOP_reset hypercall. Ensure the mutex is taken
in that case.
Cc: qemu-stable@nongnu.org
Fixes: a15b10978f ("hw/xen: Implement EVTCHNOP_reset")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.
Add the missing call to put_rsp().
It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.
This fixes Xen device hot-unplug.
Cc: qemu-stable@nongnu.org
Fixes: 0254c4d19d ("hw/xen: Add xenstore wire implementation and implementation stubs")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).
If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().
So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.
Cc: qemu-stable@nongnu.org
Fixes: de26b26197 ("hw/xen: Implement soft reset for emulated gnttab")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.
For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:
/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);
That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.
Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.
Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)
Cc: qemu-stable@nongnu.org
Fixes: 91cce75617 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The per-vCPU upcall vector support had three problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT when
the guest tried to set it up. Secondly it was using the wrong ioctl() to
pass the vector to the kernel and thus the *kernel* would always return
-EINVAL. Finally, even when delivering the event directly from userspace
with an MSI, it put the destination CPU ID into the wrong bits of the
MSI address.
Linux doesn't (yet) use this mode so it went without decent testing
for a while.
Cc: qemu-stable@nongnu.org
Fixes: 105b47fdf2 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Return both result and overflow from helper_[us]div.
Compute all flags explicitly in gen_op_[us]divcc.
Marginally improve the INT64_MIN special case in helper_sdiv.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Step in removing CC_OP: change the representation of CC_OP_FLAGS.
The 8 bits are distributed between 6 variables, which should make
it easy to keep up to date.
The code within cc_helper.c is quite ugly but is only temporary.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The original tests with MacOS showed that only the bottom 8 bits of the DAFB_LUT
register were used when writing to the LUT, however A/UX performs some of its
writes using 4 byte accesses. Expand the address range for the DAFB_LUT register
so that different size accesses write the correct value to the color_palette
array.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231026085650.917663-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When A/UX uses the MacOS Device Manager Status (GetEntries) call to read the
contents of the CLUT, it is easy to see that the requested index is written to
the DAFB_RESET register. Update the palette_current index with the requested
value, and rename it to DAFB_LUT_INDEX to reflect its true purpose.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231026085650.917663-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Let's support empty memory devices -- memory devices that don't have a
memory device region in the current configuration. hv-balloon with an
optional memdev is the primary use case.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
We were not unlocking bitmap mutex on the error case. To fix it
forever change to enclose the code with WITH_QEMU_LOCK_GUARD().
Coverity CID 1523750.
Fixes: a2326705e5 ("migration: Stop migration immediately in RDMA error paths")
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231103074245.55166-1-quintela@redhat.com>
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Use the
recently added UUID_STR_LEN which defines the correct size.
Fixes: CID 1522913
Fixes: 2dca1b37a7 ("vfio/pci: add support for VF token")
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: "Denis V. Lunev" <den@openvz.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Add a
define for this size and use it where required.
Cc: Fam Zheng <fam@euphon.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: "Denis V. Lunev" <den@openvz.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add unit tests for both resv_region_list_insert() and
range_inverse_array().
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
[ clg: Removal of unused variable in compare_ranges() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Up to now we were exposing to the RESV_MEM probe requests the
reserved memory regions set though the reserved-regions array property.
Combine those with the host reserved memory regions if any. Those
latter are tagged as RESERVED. We don't have more information about
them besides then cannot be mapped. Reserved regions set by
property have higher priority.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The implementation populates the array of per IOMMUDevice
host reserved ranges.
It is forbidden to have conflicting sets of host IOVA ranges
to be applied onto the same IOMMU MR (implied by different
host devices).
In case the callback is called after the probe request has
been issues by the driver, a warning is issued.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add an IOMMUDevice 'probe_done' flag to record that the driver
already issued a probe request on that device.
This will be useful to double check host reserved regions aren't
notified after the probe and hence are not taken into account
by the driver.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This helper reverses a list of regions within a [low, high]
span, turning original regions into holes and original
holes into actual regions, covering the whole UINT64_MAX span.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
For the time being the per device reserved regions are
just a duplicate of IOMMU wide reserved regions. Subsequent
patches will combine those with host reserved regions, if any.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce resv_region_list_insert() helper which inserts
a new ReservedRegion into a sorted list of reserved region.
In case of overlap, the new region has higher priority and
hides the existing overlapped segments. If the overlap is
partial, new regions are created for parts which are not
overlapped. The new region has higher priority independently
on the type of the regions.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Let's expose range_compare() in the header so that it can be
reused outside of util/range.c
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Rename VirtIOIOMMU (nb_)reserved_regions fields with the "prop_" prefix
to highlight those fields are set through a property, at machine level.
They are IOMMU wide.
A subsequent patch will introduce per IOMMUDevice reserved regions
that will include both those IOMMU wide property reserved
regions plus, sometimes, host reserved regions, if the device is
backed by a host device protected by a physical IOMMU. Also change
nb_ prefix by nr_.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Collect iova range information if VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
capability is supported.
This allows to propagate the information though the IOMMU MR
set_iova_ranges() callback so that virtual IOMMUs
get aware of those aperture constraints. This is only done if
the info is available and the number of iova ranges is greater than
0.
A new vfio_get_info_iova_range helper is introduced matching
the coding style of existing vfio_get_info_dma_avail. The
boolean returned value isn't used though. Code is aligned
between both.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This helper will allow to convey information about valid
IOVA ranges to virtual IOMMUS.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
[ clg: fixes in memory_region_iommu_set_iova_ranges() and
iommu_set_iova_ranges() documentation ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
A reserved region is a range tagged with a type. Let's directly use
the Range type in the prospect to reuse some of the library helpers
shipped with the Range type.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Dirty ring size configuration is not supported by guestperf tool.
Introduce dirty-ring-size (ranges in [1024, 65536]) option so
developers can play with dirty-ring and dirty-limit feature easier.
To set dirty ring size with 4096 during migration test:
$ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <8a388cec5c1f73a34d42515bbc43837e97ee3839.1698847223.git.yong.huang@smartx.com>
Add migration dirty-limit capability test if kernel support
dirty ring.
Migration dirty-limit capability introduce dirty limit
capability, two parameters: x-vcpu-dirty-limit-period and
vcpu-dirty-limit are introduced to implement the live
migration with dirty limit.
The test case does the following things:
1. start src, dst vm and enable dirty-limit capability
2. start migrate and set cancel it to check if dirty limit
stop working.
3. restart dst vm
4. start migrate and enable dirty-limit capability
5. check if migration satisfy the convergence condition
during pre-switchover phase.
Note that this test case involves many passes, so it runs
in slow mode only.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <e55a302df9da7dbc00ad825f47f57c1a756d303e.1698847223.git.yong.huang@smartx.com>
Add support for the query-cpu-model-expansion QMP command to LoongArch.
We support query the cpu features.
e.g
la464 and max cpu support LSX/LASX, default enable,
la132 not support LSX/LASX.
1. start with '-cpu max,lasx=off'
(QEMU) query-cpu-model-expansion type=static model={"name":"max"}
{"return": {"model": {"name": "max", "props": {"lasx": false, "lsx": true}}}}
2. start with '-cpu la464,lasx=off'
(QEMU) query-cpu-model-expansion type=static model={"name":"la464"}
{"return": {"model": {"name": "max", "props": {"lasx": false, "lsx": true}}}
3. start with '-cpu la132,lasx=off'
qemu-system-loongarch64: can't apply global la132-loongarch-cpu.lasx=off: Property 'la132-loongarch-cpu.lasx' not found
4. start with '-cpu max,lasx=off' or start with '-cpu la464,lasx=off' query cpu model la132
(QEMU) query-cpu-model-expansion type=static model={"name":"la132"}
{"return": {"model": {"name": "la132"}}}
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231020084925.3457084-4-gaosong@loongson.cn>
Some users may not need LSX/LASX, this patch allows the user
enable/disable LSX/LASX features.
e.g
'-cpu max,lsx=on,lasx=on' (default);
'-cpu max,lsx=on,lasx=off' (enabled LSX);
'-cpu max,lsx=off,lasx=on' (enabled LASX, LSX);
'-cpu max,lsx=off' (disable LSX and LASX).
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231020084925.3457084-3-gaosong@loongson.cn>
target-arm queue:
* linux-user/elfload: Add missing arm64 hwcap values
* stellaris-gamepad: Convert to qdev
* docs/specs: Convert various txt docs to rST
* MAINTAINERS: Make sure that gicv3_internal.h is covered, too
* hw/arm/pxa2xx_gpio: Pass CPU using QOM link property
* hw/watchdog/wdt_imx2: Trace MMIO access and timer activity
* hw/misc/imx7_snvs: Trace MMIO access
* hw/misc/imx6_ccm: Convert DPRINTF to trace events
* hw/i2c/pm_smbus: Convert DPRINTF to trace events
* target/arm: Enable FEAT_MOPS insns in user-mode emulation
* linux-user: Report AArch64 hwcap2 fields above bit 31
* target/arm: Make FEAT_MOPS SET* insns handle Xs == XZR correctly
* target/arm: Fix SVE STR increment
* hw/char/stm32f2xx_usart: implement TX interrupts
* target/arm: Correctly propagate stage 1 BTI guarded bit in a two-stage walk
* xlnx-versal-virt: Add AMD/Xilinx TRNG device
* tag 'pull-target-arm-20231102' of https://git.linaro.org/people/pmaydell/qemu-arm: (33 commits)
tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device
hw/arm: xlnx-versal-virt: Add AMD/Xilinx TRNG device
hw/misc: Introduce AMD/Xilix Versal TRNG device
target/arm: Correctly propagate stage 1 BTI guarded bit in a two-stage walk
hw/char/stm32f2xx_usart: Add more definitions for CR1 register
hw/char/stm32f2xx_usart: Update IRQ when DR is written
hw/char/stm32f2xx_usart: Extract common IRQ update code to update_irq()
target/arm: Fix SVE STR increment
target/arm: Make FEAT_MOPS SET* insns handle Xs == XZR correctly
linux-user: Report AArch64 hwcap2 fields above bit 31
target/arm: Enable FEAT_MOPS insns in user-mode emulation
hw/i2c/pm_smbus: Convert DPRINTF to trace events
hw/misc/imx6_ccm: Convert DPRINTF to trace events
hw/misc/imx7_snvs: Trace MMIO access
hw/watchdog/wdt_imx2: Trace timer activity
hw/watchdog/wdt_imx2: Trace MMIO access
hw/arm/pxa2xx_gpio: Pass CPU using QOM link property
MAINTAINERS: Make sure that gicv3_internal.h is covered, too
docs/specs/vmgenid: Convert to rST
docs/specs/vmcoreinfo: Convert to rST
...
Conflicts:
hw/input/stellaris_input.c
The qdev conversion in this pull request ("stellaris-gamepad: Convert
to qdev") eliminates the vmstate_register() call that was converted to
vmstate_register_any() in the conflicting migration pull request.
vmstate_register_any() is no longer necessary now that this device has
been converted to qdev, so take this pull request's version of
stellaris_gamepad.c over the previous pull request's
stellaris_input.c (the file was renamed).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Migration Pull request (20231102)
Hi
In this pull request:
- migration reboot mode (steve)
* I disabled the test because our CI don't like programs using so
much shared memory. Searching for a fix.
- test for postcopy recover (fabiano)
- MigrateAddress QAPI (het)
- better return path error handling (peter)
- traces for downtime (peter)
- vmstate_register() check for duplicates (juan)
thomas find better solutions for s390x and ipmi.
now also works on s390x
Please, apply.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmVDipMACgkQ9IfvGFhy
# 1yNYnQ/9E5Cywsoqljqa/9FiKBSII2qMrmkfu6JLKqePnsh5pFZiukbudYRuJCCe
# ZTDEmD0NmKRJbDx2xRU1qx/e6gKJy+gz37KP89Buuh/WwZHPboPYtxQpGvCSiH26
# J3i+1+TgaqmkLzcO35wa8tp6gneQclWeAwKgMvdb4cm2pJEhgWRKI62ccyLzxeve
# UCzFQn60t55ETyVZGnRD4YwdTQvGKH+DPlyTuJOLR3DePuvZd8EdH+ypvB4RLAy7
# 3+CuQOxmF5LRXPbpJuAeOsudbmhhHzrO/yL7ZmsiKQTthsJv+SzC1bO94jhQrawZ
# Q7GCii5KpGq0KnRTRKZRGk6XKwxcYRduXMX3R5tXuVmDmCZsjhXzziU8yEdftph8
# 5TJdk1o0Gb043EFu81mrsQYS+9yJqe6sy6m3PTJaec54cAty5ln+c17WOvpAOaSV
# +1phe05ftuVPmQ3KWhbIR/tCmavNLwEZxpVIfyaKJx04bFbtQ9gRpRyURORX4KXc
# s4WXvNirQEohxYBnP4TPvA09xBTW3V08pk/wRDwt0YDXnLiqCltOuxD8r05K8K4B
# MkCLcWj0g7he2tBkF60oz1KSIE0oTB81um9AzLIv5F2YSYLaJM5BIcoC437MR2f4
# MOR7drR1fP5GsRu/SeU5BWvhVq3IvdOxR7G2MLNRJJvl7ZtGXDc=
# =uaqL
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 02 Nov 2023 19:40:03 HKT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20231102-pull-request' of https://gitlab.com/juan.quintela/qemu: (40 commits)
migration: modify test_multifd_tcp_none() to use new QAPI syntax.
migration: Implement MigrateChannelList to hmp migration flow.
migration: Implement MigrateChannelList to qmp migration flow.
migration: modify migration_channels_and_uri_compatible() for new QAPI syntax
migration: New migrate and migrate-incoming argument 'channels'
migration: Convert the file backend to the new QAPI syntax
migration: convert exec backend to accept MigrateAddress.
migration: convert rdma backend to accept MigrateAddress
migration: convert socket backend to accept MigrateAddress
migration: convert migration 'uri' into 'MigrateAddress'
migration: New QAPI type 'MigrateAddress'
migration: Change ram_dirty_bitmap_reload() retval to bool
tests/migration-test: Add a test for postcopy hangs during RECOVER
migration: Allow network to fail even during recovery
migration: Refactor error handling in source return path
tests/qtest: migration: add reboot mode test
cpr: reboot mode
cpr: relax vhost migration blockers
cpr: relax blockdev migration blockers
migration: per-mode blockers
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Connect the support for Versal True Random Number Generator
(TRNG) device.
Warning: unlike the TRNG component in a real device from the
Versal device familiy, the connected TRNG model is not of
cryptographic grade and is not intended for use cases when
cryptograpically strong TRNG is needed.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20231031184611.3029156-3-tong.ho@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds a non-cryptographic grade implementation of the
model for the True Random Number Generator (TRNG) component
in AMD/Xilinx Versal device family.
This implements all 3 modes defined by the actual hardware
specs, all of which selectable by guest software at will
at anytime:
1) PRNG mode, in which the generated sequence is required to
be reproducible after reseeded by the same 384-bit value
as supplied by guest software.
2) Test mode, in which the generated sequence is required to
be reproducible ater reseeded by the same 128-bit test
seed supplied by guest software.
3) TRNG mode, in which non-reproducible sequence is generated
based on periodic reseed by a suitable entropy source.
This model is only intended for non-real world testing of
guest software, where cryptographically strong PRNG or TRNG
is not needed.
This model supports versions 1 & 2 of the device, with
default to be version 2; the 'hw-version' uint32 property
can be set to 0x0100 to override the default.
Other implemented properties:
- 'forced-prng', uint64
When set to non-zero, mode 3's entropy source is implemented
as a deterministic sequence based on the given value and other
deterministic parameters.
This option allows the emulation to test guest software using
mode 3 and to reproduce data-dependent defects.
- 'fips-fault-events', uint32, bit-mask
bit 3: Triggers the SP800-90B entropy health test fault irq
bit 1: Triggers the FIPS 140-2 continuous test fault irq
Signed-off-by: Tong Ho <tong.ho@amd.com>
Message-id: 20231031184611.3029156-2-tong.ho@amd.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
dump_init() first computes the size of the dump, taking the filter
area into account, and fails if its zero. It then looks for memory in
the filter area, and fails if there is none.
This is redundant: if the size of the dump is zero, there is no
memory, and vice versa. Delete this check.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231031104531.3169721-6-armbru@redhat.com>
Zero @length is rejected with "Invalid parameter 'length'". Improve
to "parameter 'length' expects a non-zero length".
qemu_open_old() is a wrapper around qemu_open_internal() that throws
away error information. Switch to the wrapper that doesn't:
qemu_create(). Example improvement:
(qemu) dump-guest-memory /dev/fdset/x 0 1
Error: Could not open '/dev/fdset/x': Invalid argument
becomes
Error: Could not parse fdset /dev/fdset/x
@protocol values not starting with "fd:" or "file:" are rejected with
"Invalid parameter 'protocol'". Improve to "parameter 'protocol' must
start with 'file:' or 'fd:'".
While there, make the conditional checking @protocol a little more
obvious.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231031104531.3169721-5-armbru@redhat.com>
A few QMP command can work with named file descriptors.
The only way to create a named file descriptor used to be QMP command
getfd, which only works on POSIX hosts. Thus, named file descriptors
were actually usable only there.
They became usable on Windows hosts when we added QMP command
get-win32-socket (commit 4cda177c60 "qmp: add 'get-win32-socket'").
Except in dump-guest-memory, because qmp_dump_guest_memory() compiles
its named file descriptor code only #if !defined(WIN32).
Compile it unconditionally, like we do for the other commands
supporting them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231031104531.3169721-4-armbru@redhat.com>
When dump_init()'s check for non-zero @length fails, dump_cleanup()
passes null s->string_table_buf to g_array_unref(), which spews "GLib:
g_array_unref: assertion 'array' failed" to stderr.
Guard the g_array_unref().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231031104531.3169721-3-armbru@redhat.com>
The name of the second parameter differs between QAPI schema and C
implementation: it's @protocol in the former and @file in the latter.
Potentially confusing. Change the C implementation to match the QAPI
schema.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231031104531.3169721-2-armbru@redhat.com>
The QMP dump API represents the dump format as an enumeration. Add three
new enumerators, one for each supported kdump compression, each named
"kdump-raw-*".
For the HMP command line, rather than adding a new flag corresponding to
each format, it seems more human-friendly to add a single flag "-R" to
switch the kdump formats to "raw" mode. The choice of "-R" also
correlates nicely to the "makedumpfile -R" option, which would serve to
reassemble a flattened vmcore.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[ Marc-André: replace loff_t with off_t, indent fixes ]
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230918233233.1431858-4-stephen.s.brennan@oracle.com>
The flattened format (currently output by QEMU) is used by makedumpfile
only when it is outputting a vmcore to a file which is not seekable. The
flattened format functions essentially as a set of instructions of the
form "seek to the given offset, then write the given bytes out".
The flattened format can be reconstructed using makedumpfile -R, or
makedumpfile-R.pl, but it is a slow process because it requires copying
the entire vmcore. The flattened format can also be directly read by
crash, but still, it requires a lengthy reassembly phase.
To sum up, the flattened format is not an ideal one: it should only be
used on files which are actually not seekable. This is the exact
strategy which makedumpfile uses, as seen in the implementation of
"write_buffer()" in makedumpfile [1]. However, QEMU has always used the
flattened format. For compatibility it is best not to change the default
output format without warning. So, add a flag to DumpState which changes
the output to use the normal (i.e. raw) format. This flag will be added
to the QMP and HMP commands in the next change.
[1]: f23bb94356/makedumpfile.c (L5008-L5040)
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[ Marc-André: replace loff_t with off_t ]
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230918233233.1431858-3-stephen.s.brennan@oracle.com>
In a two-stage translation, the result of the BTI guarded bit should
be the guarded bit from the first stage of translation, as there is
no BTI guard information in stage two. Our code tried to do this,
but got it wrong, because we currently have two fields where the GP
bit information might live (ARMCacheAttrs::guarded and
CPUTLBEntryFull::extra::arm::guarded), and we were storing the GP bit
in the latter during the stage 1 walk but trying to copy the former
in combine_cacheattrs().
Remove the duplicated storage, and always use the field in
CPUTLBEntryFull; correctly propagate the stage 1 value to the output
in get_phys_addr_twostage().
Note for stable backports: in v8.0 and earlier the field is named
result->f.guarded, not result->f.extra.arm.guarded.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1950
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231031173723.26582-1-peter.maydell@linaro.org
Most of the registers used by the FEAT_MOPS instructions cannot use
31 as a register field value; this is CONSTRAINED UNPREDICTABLE to
NOP or UNDEF (we UNDEF). However, it is permitted for the "source
value" register for the memset insns SET* to be 31, which (as usual
for most data-processing insns) means it should be the zero register
XZR. We forgot to handle this case, with the effect that trying to
set memory to zero with a "SET* Xd, Xn, XZR" sets the memory to
the value that happens to be in the low byte of SP.
Handle XZR when getting the SET* data value from the register file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231030174000.3792225-4-peter.maydell@linaro.org
The AArch64 ELF hwcap2 field is 64 bits, but our get_elf_hwcap2()
works with uint32_t, so it accidentally fails to report any hwcaps
over bit 31. Use uint64_t here.
The Arm hwcap2 is only 32 bits (because the ELF format makes these
fields be the size of "long" in the ABI), but since it shares the
prototype declaration for get_elf_hwcap2() it is easier to also
expand it to 64 bits.
The only hwcap fields we implement already that are affected by this
are the HBC and MOPS ones, neither of which were implemented in a
previous release, so this doesn't need backporting to older stable
branches.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231030174000.3792225-3-peter.maydell@linaro.org
Convert docs/specs/ivshmem-spec.txt to rST format.
In converting, I have dropped the sections on the device's command
line interface and usage, as they are already covered by the
user-facing docs in system/devices/ivshmem.rst.
I have also removed the reference to Memnic, because the URL is dead
and a web search suggests that whatever this was it's pretty much
sunk without trace.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230927151205.70930-4-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Now that we have converted to qdev, we can use the newer
qemu_input_handler_register() API rather than the legacy
qemu_add_kbd_event_handler().
Since we only have one user, take the opportunity to convert
from scancodes to QCodes, rather than using
qemu_input_key_value_to_scancode() (which adds an 0xe0
prefix and encodes up/down indication in the scancode,
which our old handler function then had to reverse). That
lets us drop the old state field which was tracking whether
we were halfway through a two-byte scancode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231030114802.3671871-7-peter.maydell@linaro.org
Convert the hw/input/stellaris_input device to qdev.
The interface uses an array property for the board to specify the
keycodes to use, so the s->keycodes memory is now allocated by the
array-property machinery.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231030114802.3671871-6-peter.maydell@linaro.org
Currently for each button on the device we have a
StellarisGamepadButton struct which has the irq, keycode and pressed
state for it. When we convert to qdev, the qdev property and GPIO
APIs are going to require that we have separate arrays for the irqs
and keycodes. Convert from array-of-structs to three separate arrays
in preparation.
This is a migration compatibility break for the stellaris boards
(lm3s6965evb, lm3s811evb).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231030114802.3671871-5-peter.maydell@linaro.org
--
v1=>v2: mention migration compat break in commit message;
bump version fields in vmstate
Instead of exposing the ugly hack of how we represent arrays in qdev (a
static "foo-len" property and after it is set, dynamically created
"foo[i]" properties) to boards, add an interface that allows setting the
whole array at once.
Once all internal users of devices with array properties have been
converted to use this function, we can change the implementation to move
away from this hack.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231030114802.3671871-4-peter.maydell@linaro.org
Integrate MigrateChannelList with all transport backends
(socket, exec and rdma) for both src and dest migration
endpoints for qmp migration.
For current series, limit the size of MigrateChannelList
to single element (single interface) as runtime check.
Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-13-farosas@suse.de>
migration_channels_and_uri_compatible() check for transport mechanism
suitable for multifd migration gets executed when the caller calls old
uri syntax. It needs it to be run when using the modern MigrateChannel
QAPI syntax too.
After URI -> 'MigrateChannel' :
migration_channels_and_uri_compatible() ->
migration_channels_and_transport_compatible() passes object as argument
and check for valid transport mechanism.
Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-12-farosas@suse.de>
MigrateChannelList allows to connect accross multiple interfaces.
Add MigrateChannelList struct as argument to migration QAPIs.
We plan to include multiple channels in future, to connnect
multiple interfaces. Hence, we choose 'MigrateChannelList'
as the new argument over 'MigrateChannel' to make migration
QAPIs future proof.
Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-10-farosas@suse.de>
This patch introduces well defined MigrateAddress struct
and its related child objects.
The existing argument of 'migrate' and 'migrate-incoming' QAPI
- 'uri' is of type string. The current implementation follows
double encoding scheme for fetching migration parameters like
'uri' and this is not an ideal design.
Motive for intoducing struct level design is to prevent double
encoding of QAPI arguments, as Qemu should be able to directly
use the QAPI arguments without any level of encoding.
Note: this commit only adds the type, and actual uses comes
in later commits.
Fabiano fixed for "file" transport.
Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-2-farosas@suse.de>
Message-Id: <20231023182053.8711-3-farosas@suse.de>
Now we have a Error** passed into the return path thread stack, which is
even clearer than an int retval. Change ram_dirty_bitmap_reload() and the
callers to use a bool instead to replace errnos.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231017202633.296756-5-peterx@redhat.com>
To do so, create two paired sockets, but make them not providing real data.
Feed those fake sockets to src/dst QEMUs for recovery to let them go into
RECOVER stage without going out. Test that we can always kick it out and
recover again with the right ports.
This patch is based on Fabiano's version here:
https://lore.kernel.org/r/877cowmdu0.fsf@suse.de
Signed-off-by: Fabiano Rosas <farosas@suse.de>
[peterx: write commit message, remove case 1, fix bugs, and more]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231017202633.296756-4-peterx@redhat.com>
Normally the postcopy recover phase should only exist for a super short
period, that's the duration when QEMU is trying to recover from an
interrupted postcopy migration, during which handshake will be carried out
for continuing the procedure with state changes from PAUSED -> RECOVER ->
POSTCOPY_ACTIVE again.
Here RECOVER phase should be super small, that happens right after the
admin specified a new but working network link for QEMU to reconnect to
dest QEMU.
However there can still be case where the channel is broken in this small
RECOVER window.
If it happens, with current code there's no way the src QEMU can got kicked
out of RECOVER stage. No way either to retry the recover in another channel
when established.
This patch allows the RECOVER phase to fail itself too - we're mostly
ready, just some small things missing, e.g. properly kick the main
migration thread out when sleeping on rp_sem when we found that we're at
RECOVER stage. When this happens, it fails the RECOVER itself, and
rollback to PAUSED stage. Then the user can retry another round of
recovery.
To make it even stronger, teach QMP command migrate-pause to explicitly
kick src/dst QEMU out when needed, so even if for some reason the migration
thread didn't got kicked out already by a failing rethrn-path thread, the
admin can also kick it out.
This will be an super, super corner case, but still try to cover that.
One can try to test this with two proxy channels for migration:
(a) socat unix-listen:/tmp/src.sock,reuseaddr,fork tcp:localhost:10000
(b) socat tcp-listen:10000,reuseaddr,fork unix:/tmp/dst.sock
So the migration channel will be:
(a) (b)
src -> /tmp/src.sock -> tcp:10000 -> /tmp/dst.sock -> dst
Then to make QEMU hang at RECOVER stage, one can do below:
(1) stop the postcopy using QMP command postcopy-pause
(2) kill the 2nd proxy (b)
(3) try to recover the postcopy using /tmp/src.sock on src
(4) src QEMU will go into RECOVER stage but won't be able to continue
from there, because the channel is actually broken at (b)
Before this patch, step (4) will make src QEMU stuck in RECOVER stage,
without a way to kick the QEMU out or continue the postcopy again. After
this patch, (4) will quickly fail qemu and bounce back to PAUSED stage.
Admin can also kick QEMU from (4) into PAUSED when needed using
migrate-pause when needed.
After bouncing back to PAUSED stage, one can recover again.
Reported-by: Xiaohui Li <xiaohli@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231017202633.296756-3-peterx@redhat.com>
rp_state.error was a boolean used to show error happened in return path
thread. That's not only duplicating error reporting (migrate_set_error),
but also not good enough in that we only do error_report() and set it to
true, we never can keep a history of the exact error and show it in
query-migrate.
To make this better, a few things done:
- Use error_setg() rather than error_report() across the whole lifecycle
of return path thread, keeping the error in an Error*.
- With above, no need to have mark_source_rp_bad(), remove it, alongside
with rp_state.error itself.
- Use migrate_set_error() to apply that captured error to the global
migration object when error occured in this thread.
- Do the same when detected qemufile error in source return path
We need to re-export qemu_file_get_error_obj() to do the last one.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231017202633.296756-2-peterx@redhat.com>
The NeXTcube uses a NCR 53C90 SCSI interface for its disks, so we should
be able to use the ESP controller from QEMU here. The code here has been
basically taken from Bryce Lanham's GSoC 2011 contribution, except for
the next_scsi_init() function which has been rewritte as a replacement
for the esp_init() function (that has been removed quite a while ago).
Note that SCSI is not working yet. The ESP code likely needs some more
fixes first and there still might be some bugs left in they way we wire
it up for the NeXT-Cube machine.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20230930132351.30282-4-huth@tuxfamily.org>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Add the cpr-reboot migration mode. Usage:
$ qemu-system-$arch -monitor stdio ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate -d file:vm.state
(qemu) info status
VM status: paused (postmigrate)
(qemu) quit
$ qemu-system-$arch -monitor stdio -incoming defer ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate_incoming file:vm.state
(qemu) info status
VM status: running
In this mode, the migrate command saves state to a file, allowing one
to quit qemu, reboot to an updated kernel, and restart an updated version
of qemu. The caller must specify a migration URI that writes to and reads
from a file. Unlike normal mode, the use of certain local storage options
does not block the migration, but the caller must not modify guest block
devices between the quit and restart. To avoid saving guest RAM to the
file, the memory backend must be shared, and the @x-ignore-shared migration
capability must be set. Guest RAM must be non-volatile across reboot, such
as by backing it with a dax device, but this is not enforced. The restarted
qemu arguments must match those used to initially start qemu, plus the
-incoming option.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-6-git-send-email-steven.sistare@oracle.com>
vhost blocks migration if logging is not supported to track dirty
memory, and vhost-user blocks it if the log cannot be saved to a shm fd.
vhost-vdpa blocks migration if both hosts do not support all the device's
features using a shadow VQ, for tracking requests and dirty memory.
vhost-scsi blocks migration if storage cannot be shared across hosts,
or if state cannot be migrated.
None of these conditions apply if the old and new qemu processes do
not run concurrently, and if new qemu starts on the same host as old,
which is the case for cpr.
Narrow the scope of these blockers so they only apply to normal mode.
They will not block cpr modes when they are added in subsequent patches.
No functional change until a new mode is added.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-5-git-send-email-steven.sistare@oracle.com>
Some blockdevs block migration because they do not support sharing across
hosts and/or do not support dirty bitmaps. These prohibitions do not apply
if the old and new qemu processes do not run concurrently, and if new qemu
starts on the same host as old, which is the case for cpr. Narrow the scope
of these blockers so they only apply to normal mode. They will not block
cpr modes when they are added in subsequent patches.
No functional change until a new mode is added.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-4-git-send-email-steven.sistare@oracle.com>
Extend the blocker interface so that a blocker can be registered for
one or more migration modes. The existing interfaces register a
blocker for all modes, and the new interfaces take a varargs list
of modes.
Internally, maintain a separate blocker list per mode. The same Error
object may be added to multiple lists. When a block is deleted, it is
removed from every list, and the Error is freed.
No functional change until a new mode is added.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-3-git-send-email-steven.sistare@oracle.com>
Create a mode migration parameter that can be used to select alternate
migration algorithms. The default mode is normal, representing the
current migration algorithm, and does not need to be explicitly set.
No functional change until a new mode is added, except that the mode is
shown by the 'info migrate' command.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>
This patch is inspired by Joao Martin's patch here:
https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com
Add tracepoints for major downtime checkpoints on both src and dst. They
share the same tracepoint with a string showing its stage.
Besides the checkpoints in the previous patch, this patch also added
destination checkpoints.
On src, we have these checkpoints added:
- src-downtime-start: right before vm stops on src
- src-vm-stopped: after vm is fully stopped
- src-iterable-saved: after all iterables saved (END sections)
- src-non-iterable-saved: after all non-iterable saved (FULL sections)
- src-downtime-stop: migration fully completed
On dst, we have these checkpoints added:
- dst-precopy-loadvm-completes: after loadvm all done for precopy
- dst-precopy-bh-*: record BH steps to resume VM for precopy
- dst-postcopy-bh-*: record BH steps to resume VM for postcopy
On dst side, we don't have a good way to trace total time consumed by
iterable or non-iterable for now. We can mark it by 1st time receiving a
FULL / END section, but rather than that let's just rely on the other
tracepoints added for vmstates to back up the information.
With this patch, one can enable "vmstate_downtime*" tracepoints and it'll
enable all tracepoints for downtime measurements necessary.
Drop loadvm_postcopy_handle_run_bh() tracepoint alongside, because they
service the same purpose, which was only for postcopy. We then have
unified prefix for all downtime relevant tracepoints.
Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231030163346.765724-6-peterx@redhat.com>
We have a bunch of savevm_section* tracepoints, they're good to analyze
migration stream, but not always suitable if someone would like to analyze
the migration downtime. Two major problems:
- savevm_section* tracepoints are dumping all sections, we only care
about the sections that contribute to the downtime
- They don't have an identifier to show the type of sections, so no way
to filter downtime information either easily.
We can add type into the tracepoints, but instead of doing so, this patch
kept them untouched, instead of adding a bunch of downtime specific
tracepoints, so one can enable "vmstate_downtime*" tracepoints and get a
full picture of how the downtime is distributed across iterative and
non-iterative vmstate save/load.
Note that here both save() and load() need to be traced, because both of
them may contribute to the downtime. The contribution is not a simple "add
them together", though: consider when the src is doing a save() of device1
while the dest can be load()ing for device2, so they can happen
concurrently.
Tracking both sides make sense because device load() and save() can be
imbalanced, one device can save() super fast, but load() super slow, vice
versa. We can't figure that out without tracing both.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231030163346.765724-4-peterx@redhat.com>
Postcopy calculates its downtime separately. It always sets
MigrationState.downtime properly, but not MigrationState.downtime_start.
Make postcopy do the same as other modes on properly recording the
timestamp when the VM is going to be stopped. Drop the temporary variable
in postcopy_start() along the way.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231030163346.765724-2-peterx@redhat.com>
We can have more than one audio backend.
void audio_init_audiodevs(void)
{
AudiodevListEntry *e;
QSIMPLEQ_FOREACH(e, &audiodevs, next) {
audio_init(e->dev, &error_fatal);
}
}
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231020090731.28701-12-quintela@redhat.com>
Before finally register one SaveStateEntry, we detect for duplicated
entries. This could be helpful to notify us asap instead of get
silent migration failures which could be hard to diagnose.
For example, this patch will generate a message like this (if without
previous fixes on x2apic) as long as we wants to boot a VM instance
with "-smp 200,maxcpus=288,sockets=2,cores=72,threads=2" and QEMU will
bail out even before VM starts:
savevm_state_handler_insert: Detected duplicate SaveStateEntry: id=apic, instance_id=0x0
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231020090731.28701-10-quintela@redhat.com>
Current code does:
- register pre_2_10_vmstate_dummy_icp with "icp/server" and instance
dependinfg on cpu number
- for newer machines, it register vmstate_icp with "icp/server" name
and instance 0
- now it unregisters "icp/server" for the 1st instance.
This is wrong at many levels:
- we shouldn't have two VMSTATEDescriptions with the same name
- In case this is the only solution that we can came with, it needs to
be:
* register pre_2_10_vmstate_dummy_icp
* unregister pre_2_10_vmstate_dummy_icp
* register real vmstate_icp
Created vmstate_replace_hack_for_ppc() with warnings left and right
that it is a hack.
CC: Cedric Le Goater <clg@kaod.org>
CC: Daniel Henrique Barboza <danielhb413@gmail.com>
CC: David Gibson <david@gibson.dropbear.id.au>
CC: Greg Kurz <groug@kaod.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231020090731.28701-8-quintela@redhat.com>
Each user network conection create a new slirp instance. We register
more than one slirp instance for number 0.
qemu-system-x86_64: -netdev user,id=hs1: savevm_state_handler_insert: Detected duplicate SaveStateEntry: id=slirp, instance_id=0x0
Broken pipe
../../../../../mnt/code/qemu/full/tests/qtest/libqtest.c:195: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
Aborted (core dumped)
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231020090731.28701-6-quintela@redhat.com>
We have lots of cases where we are using an instance_id==0 when we
should be using VMSTATE_INSTANCE_ID_ANY (-1). Basically everything
that can have more than one needs to have a proper instance_id or -1
and the system will take one for it.
vmstate_register_any(): We register with -1.
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231020090731.28701-2-quintela@redhat.com>
Since the instance_init() function immediately tries to set the
property to "true", the s390_skeys_set_migration_enabled() tries
to register a savevm handler during instance_init(). However,
instance_init() functions can be called multiple times, e.g. for
introspection of devices. That means multiple instances of devices
can be created during runtime (which is fine as long as they all
don't get realized, too), so the "Prevent double registration of
savevm handler" check in the s390_skeys_set_migration_enabled()
function does not work at all as expected (since there could be
more than one instance).
Thus we must not call register_savevm_live() from an instance_init()
function at all. Move this to the realize() function instead. This
way we can also get rid of the property getter and setter functions
completely, simplifying the code along the way quite a bit.
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231020150554.664422-2-thuth@redhat.com>
There is no point in having mcf_uart_init() demote the DeviceState
pointer and return a void one. Directly return the real typedef.
mcf_uart_init() do both init + realize: rename as mcf_uart_create().
Similarly, mcf_uart_mm_init() do init / realize / mmap: rename as
mcf_uart_create_mmap().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231019104929.16517-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Mechanical change using the following coccinelle script:
@@
identifier dev;
identifier sbd;
expression qom_type;
expression addr;
@@
- dev = qdev_new(qom_type);
- sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
- sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, addr);
+ dev = sysbus_create_simple(qom_type, addr, NULL);
then manually removing the 'dev' variable to avoid:
error: variable 'dev' set but not used [-Werror,-Wunused-but-set-variable]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231024083010.12453-6-philmd@linaro.org>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
QOM objects shouldn't access each other internals fields
except using the QOM API.
Here the caller of mcf_intc_init() access the MMIO region from
the MCF_INTC state. Avoid that by exposing that region via
sysbus_init_mmio(), then get it with sysbus_mmio_get_region().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Thomas Huth <huth@tuxfamily.org>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231024083010.12453-4-philmd@linaro.org>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Block layer patches
- virtio-blk: use blk_io_plug_call() instead of notification BH
- mirror: allow switching from background to active mode
- qemu-img rebase: add compression support
- Fix locking in media change monitor commands
- Fix a few blockjob-related deadlocks when using iothread
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmVBTkERHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9ZiqRAAqvsWbblmEGJ7TBKYQK3f8QshJ66RxzbC
# 4eSjKHrciWNTeeIeU8r8OvFcPPoTcPXxpcmasD2gsAxG5W5N8vkPbBkW+YT4YdDJ
# pWJXrbJ15nILC4DmnR1ARVtvxKgv9zy5LSm5bjss1K+OSYJl/nx+ILjmfVZnYDF7
# z1dP/G0JxKKm4JzAIdBE3uZS+6Q5kx/wGYlJv8EQmlH3DYfsJfy6Lthe9jfw8ijg
# lSqLoQ+D0lEd6Bk4XbkUqqBxFcYBWTfU6qPZoyIO94zCTwTG9yIjmoivxmmfwQZq
# cJUTGGZjcxpJYnvcC6P13WgcWBtcD9L2kYFVH0JyjpwcSg9cCGHMF66n9pSlyEGq
# DUikwVzbTwOotwzYQyM88v4ET+2+Qdcwn8pRbv9PllEczh0kAsUAEuxSgtz4NEcN
# bZrap/16xHFybNOKkMZcmpqxspT5NXKbDODUP0IvbSYMOYpWS983nBTxwMRpyHog
# 2TFDZu4DjNiPkI2BcYM5VOKk6diNowZFShcEKvoaOLX/n9EBhP0tjoH9VUn1800F
# myHrhF2jpIf9GhErMWB7N2W3/0aK0pqdQgbpVnd1ARDdIdYkr7G/S+50D9K80b6n
# 0q2E7br4S5bcsY0HQzBL9YARSayY+lVOssLoolCWEsYzijdBQmAvs5THajFKcism
# /idI6nlp2Vs=
# =RdxS
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 01 Nov 2023 03:58:09 JST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (27 commits)
iotests: add test for changing mirror's copy_mode
mirror: return mirror-specific information upon query
blockjob: query driver-specific info via a new 'query' driver method
qapi/block-core: turn BlockJobInfo into a union
qapi/block-core: use JobType for BlockJobInfo's type
mirror: implement mirror_change method
block/mirror: determine copy_to_target only once
block/mirror: move dirty bitmap to filter
block/mirror: set actively_synced even after the job is ready
blockjob: introduce block-job-change QMP command
virtio-blk: remove batch notification BH
virtio: use defer_call() in virtio_irqfd_notify()
util/defer-call: move defer_call() to util/
block: rename blk_io_plug_call() API to defer_call()
blockdev: mirror: avoid potential deadlock when using iothread
block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close()
blockjob: drop AioContext lock before calling bdrv_graph_wrlock()
iotests: Test media change with iothreads
block: Fix locking in media change monitor commands
iotests: add tests for "qemu-img rebase" with compression
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Maintainer updates for testing, gitlab, gdbstub and plugins:
- add dtc package to openbsd VMs
- use -fno-stack-protector for non-stdlib tests
- split alpha and sh4 compilers into legacy image
- harmonise other compilers into debian-all-test-cross
- fix NULL check in gdb_regs
- fix memleak in semihosting
- remove unused parameter in plugin code
- fix fd leak in lockstep plugin
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmVBCvQACgkQ+9DbCVqe
# KkR8jAgAjFC3BE6fu80zYT0Dmeu8zh20QY/wgKQebaFfGEmPL4Bqkl2D/Rx7PhQA
# EH8fR/LAH/iXAO07+LYOB6QiyMb9PWiXS52iHyE3q11mOaM8iKkkj7a59NW8DfGC
# biSrj9o3wpz9gGkJjzTCcHC8DOMbrAuE12XnmhW7uTqqkrcTMC393dSEeyL+nrP9
# lKS5XzFyn3FOT4YIL8hAC02ObKH4LpWIO3gdWeDAo56yg24fLir9a2wYSXMaxQtN
# kDf6UtL97CIIhbNi6qrUPBB13MV8MlXno3wnb9+E4Cn5sGntGSnTyh7j6XrGqYj9
# p/Vio6ye8xP1IjlavKiBM0nnozcAhw==
# =ZOMS
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 31 Oct 2023 23:11:00 JST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-halloween-omnibus-311023-2' of https://gitlab.com/stsquad/qemu:
contrib/plugins: Close file descriptor on error return
plugins: Remove an extra parameter
semihosting: fix memleak at semihosting_arg_fallback
gdbstub: Check if gdb_regs is NULL
tests/docker: upgrade debian-all-test-cross to bookworm
tests/docker: use debian-all-test-cross for sparc64
tests/docker: use debian-all-test-cross for riscv64
tests/docker: use debian-all-test-cross for mips
tests/docker: use debian-all-test-cross for mips64
tests/docker: use debian-all-test-cross for m68k
tests/docker: use debian-all-test-cross for hppa
tests/docker: use debian-all-test-cross for power
tests/docker: move sh4 to use debian-legacy-test-cross
tests/docker: use debian-legacy-test-cross for alpha
gitlab: add build-loongarch to matrix
gitlab: clean-up build-soft-softmmu job
gitlab: split alpha testing into a legacy container
tests/tcg: Add -fno-stack-protector
tests/vm/openbsd: Use the system dtc package
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
One part of the test is using a throttled source to ensure that there
are no obvious issues when changing the copy_mode while there are
ongoing requests (source and target images are compared at the very
end).
The other part of the test is using a throttled target to ensure that
the change to active mode actually happened. This is done by hitting
the throttling limit, issuing a synchronous write and then immediately
verifying the target side. QSD is used, because otherwise, a
synchronous write would hang there.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-11-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
To start out, only actively-synced is returned.
For example, this is useful for jobs that started out in background
mode and switched to active mode. Once actively-synced is true, it's
clear that the mode switch has been completed. Note that completion of
the switch might happen much earlier, e.g. if the switch happens
before the job is ready, once all background operations have finished.
It's assumed that whether the disks are actively-synced or not is more
interesting than whether the mode switch completed. That information
can still be added if required in the future.
In presence of an iothread, the actively_synced member is now shared
between the iothread and the main thread, so turn accesses to it
atomic.
Requires to adapt the output for iotest 109.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-10-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In preparation to turn BlockJobInfo into a union with @type as the
discriminator. That requires it to be an enum. Even without that
requirement, it's nicer to have an enum instead of a str here.
No functional change is intended.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031135431.393137-7-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
which allows switching the @copy-mode from 'background' to
'write-blocking'.
This is useful for management applications, so they can start out in
background mode to avoid limiting guest write speed and switch to
active mode when certain criteria are fulfilled.
In presence of an iothread, the copy_mode member is now shared between
the iothread and the main thread, so turn accesses to it atomic.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-6-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In preparation to allow changing the copy_mode via QMP. When running
in an iothread, it could be that copy_mode is changed from the main
thread in between reading copy_mode in bdrv_mirror_top_pwritev() and
reading copy_mode in bdrv_mirror_top_do_write(), so they might end up
disagreeing about whether copy_to_target is true or false. Avoid that
scenario by determining copy_to_target only once and passing it to
bdrv_mirror_top_do_write() as an argument.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-5-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In preparation to allow switching to active mode without draining.
Initialization of the bitmap in mirror_dirty_init() still happens with
the original/backing BlockDriverState, which should be fine, because
the mirror top has the same length.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-4-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is a batching mechanism for virtio-blk Used Buffer Notifications
that is no longer needed because the previous commit added batching to
virtio_notify_irqfd().
Note that this mechanism was rarely used in practice because it is only
enabled when EVENT_IDX is not negotiated by the driver. Modern drivers
enable EVENT_IDX.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used
Buffer Notifications from an IOThread. This involves an eventfd
write(2) syscall. Calling this repeatedly when completing multiple I/O
requests in a row is wasteful.
Use the defer_call() API to batch together virtio_irqfd_notify() calls
made during thread pool (aio=threads), Linux AIO (aio=native), and
io_uring (aio=io_uring) completion processing.
Behavior is unchanged for emulated devices that do not use
defer_call_begin()/defer_call_end() since defer_call() immediately
invokes the callback when called outside a
defer_call_begin()/defer_call_end() region.
fio rw=randread bs=4k iodepth=64 numjobs=8 IOPS increases by ~9% with a
single IOThread and 8 vCPUs. iodepth=1 decreases by ~1% but this could
be noise. Detailed performance data and configuration specifics are
available here:
https://gitlab.com/stefanha/virt-playbooks/-/tree/blk_io_plug-irqfd
This duplicates the BH that virtio-blk uses for batching. The next
commit will remove it.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The networking subsystem may wish to use defer_call(), so move the code
to util/ where it can be reused.
As a reminder of what defer_call() does:
This API defers a function call within a defer_call_begin()/defer_call_end()
section, allowing multiple calls to batch up. This is a performance
optimization that is used in the block layer to submit several I/O requests
at once instead of individually:
defer_call_begin(); <-- start of section
...
defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call
defer_call(my_func, my_obj); <-- another
defer_call(my_func, my_obj); <-- another
...
defer_call_end(); <-- end of section, my_func(my_obj) is called once
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-3-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Prepare to move the blk_io_plug_call() API out of the block layer so
that other subsystems call use this deferred call mechanism. Rename it
to defer_call() but leave the code in block/plug.c.
The next commit will move the code out of the block layer.
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-2-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This requires a few more tweaks than usual as:
- the default sources format has changed
- bring in python3-tomli from the repos
- split base install from cross compilers
- also include libclang-rt-dev for sanitiser builds
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-16-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-15-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-14-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-13-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-12-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-11-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-10-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-9-alex.bennee@linaro.org>
Maintaining two sets of containers for test building is silly. While
it makes sense for the QEMU cross-compile targets to have their own
fat containers built by lcitool we might as well merge the other
random debian based compilers into the same one used on gitlab.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-7-alex.bennee@linaro.org>
Having dropped alpha we also now drop xtensa as we don't have the
compiler in this image. It's not all doom and gloom though as a number
of other targets have gained softmmu TCG tests so we can add them. We
will take care of the other targets with their own containers in
future commits.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-5-alex.bennee@linaro.org>
The current bookworm compiler doesn't build the static binaries due to
bug #1054412 and it might be awhile before it gets fixed. The problem
of keeping older architecture compilers running isn't going to go away
so lets prepare the ground. Create a legacy container and move some
tests around so the others can get upgraded.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-4-alex.bennee@linaro.org>
The bdrv_getlength() function is a generated co-wrapper and uses
AIO_WAIT_WHILE() to wait for the spawned coroutine. AIO_WAIT_WHILE()
expects the lock to be acquired exactly once.
Fix a case where it may be acquired twice. This can happen when the
source node is explicitly specified as the @replaces parameter or if the
source node is a filter node.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-4-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
by passing the BlockDriverState along, so the held AioContext can be
dropped before polling. See commit 31b2ddfea3 ("graph-lock: Unlock the
AioContext while polling") which introduced this functionality for
more information.
The only way to reach bdrv_close() is via bdrv_unref() and for calling
that the BlockDriverState's AioContext lock is supposed to be held.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-3-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Same rationale as in 31b2ddfea3 ("graph-lock: Unlock the AioContext
while polling"). Otherwise, a deadlock can happen.
The alternative would be to pass a BlockDriverState along to
bdrv_graph_wrlock(), but there is no BlockDriverState readily
available and it's also better conceptually, because the lock is held
for the job.
The function is always called with the job's AioContext lock held, via
one of the .abort, .clean, .free or .prepare job driver functions.
Thus, it's safe to drop it.
While mirror_exit_common() does hold a second AioContext lock while
calling block_job_remove_all_bdrv(), that is for the main thread's
AioContext and does not need to be dropped (bdrv_graph_wrlock(bs) also
skips dropping the lock if bdrv_get_aio_context(bs) ==
qemu_get_aio_context()).
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iotests case 118 already tests all relevant operations for media change
with multiple devices, however never with iothreads. This changes the
test so that the virtio-scsi tests run with an iothread.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231013153302.39234-3-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_insert_bs() requires that the caller holds the AioContext lock for
the node to be inserted. Since commit c066e808e1, neglecting to do so
causes a crash when the child has to be moved to a different AioContext
to attach it to the BlockBackend.
This fixes qmp_blockdev_insert_anon_medium(), which is called for the
QMP commands 'blockdev-insert-medium' and 'blockdev-change-medium', to
correctly take the lock.
Cc: qemu-stable@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-3922
Fixes: c066e808e1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231013153302.39234-2-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The test cases considered so far:
314 (new test suite):
1. Check that compression mode isn't compatible with "-f raw" (raw
format doesn't support compression).
2. Check that rebasing an image onto no backing file preserves the data
and writes the copied clusters actually compressed.
3. Same as 2, but with a raw backing file (i.e. the clusters copied from the
backing are originally uncompressed -- we check they end up compressed
after being merged).
4. Remove a single delta from a backing chain, perform the same checks
as in 2.
5. Check that even when backing and overlay are initially uncompressed,
copied clusters end up compressed when rebase with compression is
performed.
271:
1. Check that when target image has subclusters, rebase with compression
will make an entire cluster containing the written subcluster
compressed.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-9-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we rebase an image whose backing file has compressed clusters, we
might end up wasting disk space since the copied clusters are now
uncompressed. In order to have better control over this, let's add
"--compress" option to the "qemu-img rebase" command.
Note that this option affects only the clusters which are actually being
copied from the original backing file. The clusters which were
uncompressed in the target image will remain so.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-8-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As the previous commit changes the logic of "qemu-img rebase" (it's using
write alignment now), let's add a couple more test cases which would
ensure it works correctly. In particular, the following scenarios:
024: add test case for rebase within one backing chain when the overlay
cluster size > backings cluster size;
271: add test case for rebase images that contain subclusters. Check
that no extra allocations are being made.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-7-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When rebasing an image from one backing file to another, we need to
compare data from old and new backings. If the diff between that data
happens to be unaligned to the target cluster size, we might end up
doing partial writes, which would lead to copy-on-write and additional IO.
Consider the following simple case (virtual_size == cluster_size == 64K):
base <-- inc1 <-- inc2
qemu-io -c "write -P 0xaa 0 32K" base.qcow2
qemu-io -c "write -P 0xcc 32K 32K" base.qcow2
qemu-io -c "write -P 0xbb 0 32K" inc1.qcow2
qemu-io -c "write -P 0xcc 32K 32K" inc1.qcow2
qemu-img rebase -f qcow2 -b base.qcow2 -F qcow2 inc2.qcow2
While doing rebase, we'll write a half of the cluster to inc2, and block
layer will have to read the 2nd half of the same cluster from the base image
inc1 while doing this write operation, although the whole cluster is already
read earlier to perform data comparison.
In order to avoid these unnecessary IO cycles, let's make sure every
write request is aligned to the overlay subcluster boundaries. Using
subcluster size is universal as for the images which don't have them
this size equals to the cluster size. so in any case we end up aligning
to the smallest unit of allocation.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230919165804.439110-6-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add @chsize param to the function which, if non-zero, would represent
the chunk size to be used for comparison. If it's zero, then
BDRV_SECTOR_SIZE is used as default chunk size, which is the previous
behaviour.
In particular, we're going to use this param in img_rebase() to make the
write requests aligned to a predefined alignment value.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-5-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since commit bb1c05973c ("qemu-img: Use qemu_blockalign"), buffers for
the data read from the old and new backing files are aligned using
BlockDriverState (or BlockBackend later on) referring to the target image.
However, this isn't quite right, because buf_new is only being used for
reading from the new backing, while buf_old is being used for both reading
from the old backing and writing to the target. Let's take that into account
and use more appropriate values as alignments.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230919165804.439110-4-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In case when we're rebasing within one backing chain, and when target image
is larger than old backing file, bdrv_is_allocated_above() ends up setting
*pnum = 0. As a result, target offset isn't getting incremented, and we
get stuck in an infinite for loop. Let's detect this case and proceed
further down the loop body, as the offsets beyond the old backing size need
to be explicitly zeroed.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-2-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
target-arm queue:
* Correct minor errors in Cortex-A710 definition
* Implement Neoverse N2 CPU model
* Refactor feature test functions out into separate header
* Fix syndrome for FGT traps on ERET
* Remove 'hw/arm/boot.h' includes from various header files
* pxa2xx: Refactoring/cleanup
* Avoid using 'first_cpu' when first ARM CPU is reachable
* misc/led: LED state is set opposite of what is expected
* hw/net/cadence_gen: clean up to use FIELD macros
* hw/net/cadence_gem: perform PHY access on write only
* hw/net/cadence_gem: enforce 32 bits variable size for CRC
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmU7yz0ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3n4xEACK4ti+PFSJHVCQ69NzLLBT
# ybFGFMsMhXJTSNS30Pzs+KWCKWPP59knYBD4qO43W1iV6pPUhy+skr+BFCCRvBow
# se74+Fm1l4LmnuHxgukJzTdvRffI3v37alLn6Y/ioWe8bDpf/IJj8WLj8B1IPoNg
# fswJSGDLpPMovaz8NBQRzglUWpfyzxH+uuW779qBS1nuFdPOfIHKrocvvdrfogBP
# aO8AeiBzz5STW9Naeq+BIKho8S9LinSB6FHa+rRPUDkWx03lvRIvkgGPzHpXYy8I
# zAZ8gUQZyXprHAHMpnoBv8Wcw3Bwc2f+8xx8hnRRki3iBroXKfJA9NkeN0StQmL1
# ZHhfYkiKSS5diIFW5pX6ZixKbXHE2a4aH4zPVUNQriNWOevhe7n82mAPNFIYjk97
# ciTtd4I2oew48sDLSodMiirGL987Mit7KC23itVGezcNfQ9FnVTDmuGy8Rq52BZm
# u4TZjVBrtjQOdMBUcD2hKvXhikQNAdOhArPwNfOr0esSQL44MMEe+6Q5/Cbp0BOE
# stAY/xwSP2cY5mIPnAbIBELseEZsV8ySA3M0y1iRCJptjwbyWM+s1TYz0iXcqeOn
# l6LfiI6r1BqUeoWLGP4042R4FLyLNh6gU/TiFNLu7JJQjXl/EkRgqVXWYfzy2n51
# KKY6iGFi5r41sAU6GIXOkQ==
# =szC7
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 27 Oct 2023 23:37:49 JST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* tag 'pull-target-arm-20231027' of https://git-us.linaro.org/people/pmaydell/qemu-arm: (41 commits)
hw/net/cadence_gem: enforce 32 bits variable size for CRC
hw/net/cadence_gem: perform PHY access on write only
hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields
hw/net/cadence_gem: use FIELD to describe DESCONF6 register fields
hw/net/cadence_gem: use FIELD to describe IRQ register fields
hw/net/cadence_gem: use FIELD to describe [TX|RX]STATUS register fields
hw/net/cadence_gem: use FIELD to describe DMACFG register fields
hw/net/cadence_gem: use FIELD to describe NWCFG register fields
hw/net/cadence_gem: use FIELD to describe NWCTRL register fields
hw/net/cadence_gem: use FIELD for screening registers
hw/net/cadence_gem: use REG32 macro for register definitions
misc/led: LED state is set opposite of what is expected
hw/arm: Avoid using 'first_cpu' when first ARM CPU is reachable
hw/arm/pxa2xx: Realize PXA2XX_I2C device before accessing it
hw/intc/pxa2xx: Factor pxa2xx_pic_realize() out of pxa2xx_pic_init()
hw/intc/pxa2xx: Pass CPU reference using QOM link property
hw/intc/pxa2xx: Convert to Resettable interface
hw/pcmcia/pxa2xx: Inline pxa2xx_pcmcia_init()
hw/pcmcia/pxa2xx: Do not open-code sysbus_create_simple()
hw/pcmcia/pxa2xx: Realize sysbus device before accessing it
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is not ideal, since it requires all cross-compilers
to be present rather than a simple subset. But since it
is only run manually, should be good enough for now.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add support in gen-vdso-elfn.c.inc for the DT_PPC64_OPT
dynamic tag: this is an integer, so does not need relocation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Requires a relatively recent binutils version in order to avoid
spurious R_LARCH_NONE relocations. The presence of these relocs
are diagnosed by our gen-vdso tool.
Tested-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This tool will be used for post-processing the linked vdso image,
turning it into something that is easy to include into elfload.c.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The vdso image will be pre-processed into a C data array, with
a simple list of relocations to perform, and identifying the
location of signal trampolines.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There are only a couple of uses of bprm->fd remaining.
Migrate to the other field.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Aside from the section headers, we're unlikely to hit the
ImageSource cache on guest executables. But the interface
for imgsrc_read_* is better.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Change parse_elf_properties as well, as the bprm_buf argument
ties the two functions closely.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rearrange the allocation of storage for ehdr between load_elf_image
and load_elf_binary. The same set of copies are done, but we don't
modify bprm_buf, which will be important later.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reorg the if cases to reduce indentation.
Test for 4 bytes in the file before checking the signatures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduced and initialized, but not yet really used.
These will tidy the current tests vs BPRM_BUF_SIZE.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch removes the code that ufs-lu was duplicating from
scsi-hd and allows them to share code.
It makes ufs-lu have a virtual scsi-bus and scsi-hd internally.
This allows scsi related commands to be passed thorugh to the scsi-hd.
The query request and nop command work the same as the existing logic.
Well-known lus do not have a virtual scsi-bus and scsi-hd, and
handle the necessary scsi commands by emulating them directly.
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
The MDIO access is done only on a write to the PHYMNTNC register. A
subsequent read is used to retrieve the result but does not trigger an
MDIO access by itself.
Refactor the PHY access logic to perform all accesses (MDIO reads and
writes) at PHYMNTNC write time.
Signed-off-by: Luc Michel <luc.michel@amd.com>
Reviewed-by: sai.pavan.boddu@amd.com
Message-id: 20231017194422.4124691-11-luc.michel@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit 442c9d682c when we converted the ERET, ERETAA, ERETAB
instructions to decodetree, the conversion accidentally lost the
correct setting of the syndrome register when taking a trap because
of the FEAT_FGT HFGITR_EL1.ERET bit. Instead of reporting a correct
full syndrome value with the EC and IL bits, we only reported the low
two bits of the syndrome, because the call to syn_erettrap() got
dropped.
Fix the syndrome values for these traps by reinstating the
syn_erettrap() calls.
Fixes: 442c9d682c ("target/arm: Convert ERET, ERETAA, ERETAB to decodetree")
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231024172438.2990945-1-peter.maydell@linaro.org
Our list of isar_feature functions is not in any particular order,
but tests on fields of the same ID register tend to be grouped
together. A few functions that are tests of fields in ID_AA64MMFR1
and ID_AA64MMFR2 are not in the same place as the rest; move them
into their groups.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231024163510.2972081-3-peter.maydell@linaro.org
The feature test functions isar_feature_*() now take up nearly
a thousand lines in target/arm/cpu.h. This header file is included
by a lot of source files, most of which don't need these functions.
Move the feature test functions to their own header file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231024163510.2972081-2-peter.maydell@linaro.org
Implement a model of the Neoverse N2 CPU. This is an Armv9.0-A
processor very similar to the Cortex-A710. The differences are:
* no FEAT_EVT
* FEAT_DGH (data gathering hint)
* FEAT_NV (not yet implemented in QEMU)
* Statistical Profiling Extension (not implemented in QEMU)
* 48 bit physical address range, not 40
* CTR_EL0.DIC = 1 (no explicit icache cleaning needed)
* PMCR_EL0.N = 6 (always 6 PMU counters, not 20)
Because it has 48-bit physical address support, we can use
this CPU in the sbsa-ref board as well as the virt board.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230915185453.1871167-3-peter.maydell@linaro.org
Correct a couple of minor errors in the Cortex-A710 definition:
* ID_AA64DFR0_EL1.DebugVer is 9 (indicating Armv8.4 debug architecture)
* ID_AA64ISAR1_EL1.APA is 5 (indicating more PAuth support)
* there is an IMPDEF CPUCFR_EL1, like that on the Neoverse-N1
Fixes: e3d45c0a89 ("target/arm: Implement cortex-a710")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230915185453.1871167-2-peter.maydell@linaro.org
2023-10-27 11:41:13 +01:00
884 changed files with 34679 additions and 13636 deletions
0600 : qemu extended register region size, in bytes.
0604 : framebuffer endianness register.
- 0xbebebebe indicates big endian.
- 0x1e1e1e1e indicates little endian.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.