ppc patch queue for 2023-09-18:
In this short queue we're making two important changes:
- Nicholas Piggin is now the qemu-ppc maintainer. Cédric Le Goater and
Daniel Barboza will act as backup during Nick's transition to this new
role.
- Support for NVIDIA V100 GPU with NVLink2 is dropped from qemu-ppc.
Linux removed the same support back in 5.13, we're following suit now.
A xive Coverity fix is also included.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZQhPnBYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFk5QUBAJJNnCtv/SPP6bQVNGMgtfI9sz2z
# MEttDa7SINyLCiVxAP0Y9z8ZHEj6vhztTX0AAv2QubCKWIVbJZbPV5RWrHCEBQ==
# =y3nh
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 18 Sep 2023 09:24:44 EDT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20230918' of https://gitlab.com/danielhb/qemu:
spapr: Remove support for NVIDIA V100 GPU with NVLink2
ppc/xive: Fix uint32_t overflow
MAINTAINERS: Nick Piggin PPC maintainer, other PPC changes
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When updating ioeventfds, we need to iterate all address spaces,
but some address spaces do not register eventfd_add|del call when
memory_listener_register() and they do nothing when updating ioeventfds.
So we can skip these AS in address_space_update_ioeventfds().
The overhead of memory_region_transaction_commit() can be significantly
reduced. For example, a VM with 8 vhost net devices and each one has
64 vectors, can reduce the time spent on memory_region_transaction_commit by 20%.
Message-ID: <20230830032906.12488-1-hongmianquan@bytedance.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
There is a difference between how we open a file and how we mmap it,
and we want to support writable private mappings of readonly files. Let's
define RAM_READONLY and RAM_READONLY_FD flags, to replace the single
"readonly" parameter for file-related functions.
In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(),
initialize mr->readonly based on the new RAM_READONLY flag.
While at it, add some RAM_* flags we missed to add to the list of accepted
flags in the documentation of some functions.
No change in functionality intended. We'll make use of both flags next
and start setting them independently for memory-backend-file.
Message-ID: <20230906120503.359863-3-david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Currently, when using a true R/O NVDIMM (ROM memory backend) with a label
area, the VM can easily crash QEMU by trying to write to the label area,
because the ROM memory is mmap'ed without PROT_WRITE.
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
-> QEMU segfaults
Let's remember whether we have a ROM memory backend and properly
reject the write request:
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
zeroed 0 nmem
In comparison, on a system with a R/W NVDIMM:
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
zeroed 1 nmem
For ACPI, just return "unsupported", like if no label exists. For spapr,
return "H_P2", similar to when no label area exists.
Could we rely on the "unarmed" property? Maybe, but it looks cleaner to
only disallow what certainly cannot work.
After all "unarmed=on" primarily means: cannot accept persistent writes. In
theory, there might be setups where devices with "unarmed=on" set could
be used to host non-persistent data (temporary files, system RAM, ...); for
example, in Linux, admins can overwrite the "readonly" setting and still
write to the device -- which will work as long as we're not using ROM.
Allowing writing label data in such configurations can make sense.
Message-ID: <20230906120503.359863-2-david@redhat.com>
Fixes: dbd730e859 ("nvdimm: check -object memory-backend-file, readonly=on option")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
NVLink2 support was removed from the PPC PowerNV platform and VFIO in
Linux 5.13 with commits :
562d1e207d32 ("powerpc/powernv: remove the nvlink support")
b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2")
This was 2.5 years ago. Do the same in QEMU with a revert of commit
ec132efaa8 ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some
adjustements are required on the NUMA part.
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20230918091717.149950-1-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tap indicates support for USO features according to
capabilities of current kernel module.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Rather than saving MemoryRegionSection and offset,
save phys_addr and MemoryRegion. This matches up
much closer with the plugin api.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we defer address space update and tlb_flush until
the next async_run_on_cpu, the plugin run at the end of the
instruction no longer has to contend with a flushed tlb.
Therefore, delete SavedIOTLB entirely.
Properly return false from tlb_plugin_lookup when we do
not have a tlb match.
Fixes a bug in which SavedIOTLB had stale data, because
there were multiple i/o accesses within a single insn.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This update contains the required header changes for the
"target/s390x: AP-passthrough for PV guests" patch from
Steffen Eiden.
Message-ID: <20230912093432.180041-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Ensure that it only get called when dpy_ui_info_supported(). The
function should always return a result. There should be a non-null
console or active_console.
Modify the argument to be const as well.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
The function calls to `kbd_put_keysym` have been updated to now call
`kbd_put_keysym_console` with a NULL console parameter.
Like most console functions, NULL argument is now for the active console.
This will allow to rename the text console functions in a consistent manner.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
vfio queue:
* Small downtime optimisation for VFIO migration
* P2P support for VFIO migration
* Introduction of a save_prepare() handler to fail VFIO migration
* Fix on DMA logging ranges calculation for OVMF enabling dynamic window
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmT+uZQACgkQUaNDx8/7
# 7KGFSw//UIqSet6MUxZZh/t7yfNFUTnxx6iPdChC3BphBaDDh99FCQrw5mPZ8ImF
# 4rz0cIwSaHXraugEsC42TDaGjEmcAmYD0Crz+pSpLU21nKtYyWtZy6+9kyYslMNF
# bUq0UwD0RGTP+ZZi6GBy1hM30y/JbNAGeC6uX8kyJRuK5Korfzoa/X5h+B2XfouW
# 78G1mARHq5eOkGy91+rAJowdjqtkpKrzkfCJu83330Bb035qAT/PEzGs5LxdfTla
# ORNqWHy3W+d8ZBicBQ5vwrk6D5JIZWma7vdXJRhs1wGO615cuyt1L8nWLFr8klW5
# MJl+wM7DZ6UlSODq7r839GtSuWAnQc2j7JKc+iqZuBBk1v9fGXv2tZmtuTGkG2hN
# nYXSQfuq1igu1nGVdxJv6WorDxsK9wzLNO2ckrOcKTT28RFl8oCDNSPPTKpwmfb5
# i5RrGreeXXqRXIw0VHhq5EqpROLjAFwE9tkJndO8765Ag154plxssaKTUWo5wm7/
# kjQVuRuhs5nnMXfL9ixLZkwD1aFn5fWAIaR0psH5vGD0fnB1Pba+Ux9ZzHvxp5D8
# Kg3H6dKlht6VXdQ/qb0Up1LXCGEa70QM6Th2iO924ydZkkmqrSj+CFwGHvBsINa4
# 89fYd77nbRbdwWurj3JIznJYVipau2PmfbjZ/jTed4RxjBQ+fPA=
# =44e0
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Sep 2023 02:54:12 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20230911' of https://github.com/legoater/qemu:
vfio/common: Separate vfio-pci ranges
vfio/migration: Block VFIO migration with background snapshot
vfio/migration: Block VFIO migration with postcopy migration
migration: Add .save_prepare() handler to struct SaveVMHandlers
migration: Move more initializations to migrate_init()
vfio/migration: Fail adding device with enable-migration=on and existing blocker
migration: Add migration prefix to functions in target.c
vfio/migration: Allow migration of multiple P2P supporting devices
vfio/migration: Add P2P support for VFIO migration
vfio/migration: Refactor PRE_COPY and RUNNING state checks
qdev: Add qdev_add_vm_change_state_handler_full()
sysemu: Add prepare callback to struct VMChangeStateEntry
vfio/migration: Move from STOP_COPY to STOP in vfio_save_cleanup()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
First RISC-V PR for 8.2
* Remove 'host' CPU from TCG
* riscv_htif Fixup printing on big endian hosts
* Add zmmul isa string
* Add smepmp isa string
* Fix page_check_range use in fault-only-first
* Use existing lookup tables for MixColumns
* Add RISC-V vector cryptographic instruction set support
* Implement WARL behaviour for mcountinhibit/mcounteren
* Add Zihintntl extension ISA string to DTS
* Fix zfa fleq.d and fltq.d
* Fix upper/lower mtime write calculation
* Make rtc variable names consistent
* Use abi type for linux-user target_ucontext
* Add RISC-V KVM AIA Support
* Fix riscv,pmu DT node path in the virt machine
* Update CSR bits name for svadu extension
* Mark zicond non-experimental
* Fix satp_mode_finalize() when satp_mode.supported = 0
* Fix non-KVM --enable-debug build
* Add new extensions to hwprobe
* Use accelerated helper for AES64KS1I
* Allocate itrigger timers only once
* Respect mseccfg.RLB for pmpaddrX changes
* Align the AIA model to v1.0 ratified spec
* Don't read the CSR in riscv_csrrw_do64
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmT+ttMACgkQr3yVEwxT
# gBN/rg/+KhOvL9xWSNb8pzlIsMQHLvndno0Sq5b9Rb/o5z1ekyYfyg6712N3JJpA
# TIfZzOIW7oYZV8gHyaBtOt8kIbrjwzGB2rpCh4blhm+yNZv7Ym9Ko6AVVzoUDo7k
# 2dWkLnC+52/l3SXGeyYMJOlgUUsQMwjD6ykDEr42P6DfVord34fpTH7ftwSasO9K
# 35qJQqhUCgB3fMzjKTYICN6Rm1UluijTjRNXUZXC0XZlr+UKw2jT/UsybbWVXyNs
# SmkRtF1MEVGvw+b8XOgA/nG1qVCWglTMcPvKjWMY+cY9WLM6/R9nXAV8OL/JPead
# v1LvROJNukfjNtDW6AOl5/svOJTRLbIrV5EO7Hlm1E4kftGmE5C+AKZZ/VT4ucUK
# XgqaHoXh26tFEymVjzbtyFnUHNv0zLuGelTnmc5Ps1byLSe4lT0dBaJy6Zizg0LE
# DpTR7s3LpyV3qB96Xf9bOMaTPsekUjD3dQI/3X634r36+YovRXapJDEDacN9whbU
# BSZc20NoM5UxVXFTbELQXolue/X2BRLxpzB+BDG8/cpu/MPgcCNiOZaVrr/pOo33
# 6rwwrBhLSCfYAXnJ52qTUEBz0Z/FnRPza8AU/uuRYRFk6JhUXIonmO6xkzsoNKuN
# QNnih/v1J+1XqUyyT2InOoAiTotzHiWgKZKaMfAhomt2j/slz+A=
# =aqcx
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Sep 2023 02:42:27 EDT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230911' of https://github.com/alistair23/qemu: (45 commits)
target/riscv: don't read CSR in riscv_csrrw_do64
target/riscv: Align the AIA model to v1.0 ratified spec
target/riscv/pmp.c: respect mseccfg.RLB for pmpaddrX changes
target/riscv: Allocate itrigger timers only once
target/riscv: Use accelerated helper for AES64KS1I
linux-user/riscv: Add new extensions to hwprobe
hw/intc/riscv_aplic.c fix non-KVM --enable-debug build
hw/riscv/virt.c: fix non-KVM --enable-debug build
riscv: zicond: make non-experimental
target/riscv: fix satp_mode_finalize() when satp_mode.supported = 0
target/riscv: Update CSR bits name for svadu extension
hw/riscv: virt: Fix riscv,pmu DT node path
target/riscv: select KVM AIA in riscv virt machine
target/riscv: update APLIC and IMSIC to support KVM AIA
target/riscv: Create an KVM AIA irqchip
target/riscv: check the in-kernel irqchip support
target/riscv: support the AIA device emulation with KVM enabled
linux-user/riscv: Use abi type for target_ucontext
hw/intc: Make rtc variable names consistent
hw/intc: Fix upper/lower mtime write calculation
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
is called early, even before migration starts, and can be used by
devices to perform early checks.
Refactor migrate_init() to be able to return errors and call
.save_prepare() from there.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move the PRE_COPY and RUNNING state checks to helper functions.
This is in preparation for adding P2P VFIO migration support, where
these helpers will also test for PRE_COPY_P2P and RUNNING_P2P states.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add qdev_add_vm_change_state_handler_full() variant that allows setting
a prepare callback in addition to the main callback.
This will facilitate adding P2P support for VFIO migration in the
following patches.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add prepare callback to struct VMChangeStateEntry.
The prepare callback is optional and can be set by the new function
qemu_add_vm_change_state_handler_prio_full() that allows setting this
callback in addition to the main callback.
The prepare callbacks and main callbacks are called in two separate
phases: First all prepare callbacks are called and only then all main
callbacks are called.
The purpose of the new prepare callback is to allow all devices to run a
preliminary task before calling the devices' main callbacks.
This will facilitate adding P2P support for VFIO migration where all
VFIO devices need to be put in an intermediate P2P quiescent state
before being stopped or started by the main callback.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The AES MixColumns and InvMixColumns operations are relatively
expensive 4x4 matrix multiplications in GF(2^8), which is why C
implementations usually rely on precomputed lookup tables rather than
performing the calculations on demand.
Given that we already carry those tables in QEMU, we can just grab the
right value in the implementation of the RISC-V AES32 instructions. Note
that the tables in question are permuted according to the respective
Sbox, so we can omit the Sbox lookup as well in this case.
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Zewen Ye <lustrew@foxmail.com>
Cc: Weiwei Li <liweiwei@iscas.ac.cn>
Cc: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230731084043.1791984-1-ardb@kernel.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Now that we have Eager Page Split support added for ARM in the kernel,
enable it in Qemu. This adds,
-eager-split-size to -accel sub-options to set the eager page split chunk size.
-enable KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE.
The chunk size specifies how many pages to break at a time, using a
single allocation. Bigger the chunk size, more pages need to be
allocated ahead of time.
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Message-id: 20230905091246.1931-1-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce the Xilinx Configuration Frame Interface (CFI) for transmitting
CFI data packets between the Xilinx Configuration Frame Unit models
(CFU_APB, CFU_FDRO and CFU_SFR), the Xilinx CFRAME controller (CFRAME_REG)
and the Xilinx CFRAME broadcast controller (CFRAME_BCAST_REG) models (when
emulating bitstream programming and readback).
Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com>
Acked-by: Edgar E. Iglesias <edgar@zeroasic.com>
Message-id: 20230831165701.2016397-2-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration code can run both in coroutine context (the usual case) and
non-coroutine context (at least savevm/loadvm for snapshots). This also
affects the VMState callbacks, and devices must consider this. Change
the callback definition in VMStateInfo to be explicit about it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-2-kwolf@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
CoMutex has poor performance when lock contention is high. The tracked
requests list is accessed frequently and performance suffers in QEMU
multi-queue block layer scenarios.
It is not necessary to use CoMutex for the requests lock. The lock is
always released across coroutine yield operations. It is held for
relatively short periods of time and it is not beneficial to yield when
the lock is held by another coroutine.
Change the lock type from CoMutex to QemuMutex to improve multi-queue
block layer performance. fio randread bs=4k iodepth=64 with 4 IOThreads
handling a virtio-blk device with 8 virtqueues improves from 254k to
517k IOPS (+203%). Full benchmark results and configuration details are
available here:
980c40845d
In the future we may wish to introduce thread-local tracked requests
lists to avoid lock contention completely. That would be much more
involved though.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230808155852.2745350-3-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>