When DMA memory can't be directly accessed, as is the case when
running the device model in a separate process without shareable DMA
file descriptors, bounce buffering is used.
It is not uncommon for device models to request mapping of several DMA
regions at the same time. Examples include:
* net devices, e.g. when transmitting a packet that is split across
several TX descriptors (observed with igb)
* USB host controllers, when handling a packet with multiple data TRBs
(observed with xhci)
Previously, qemu only provided a single bounce buffer per AddressSpace
and would fail DMA map requests while the buffer was already in use. In
turn, this would cause DMA failures that ultimately manifest as hardware
errors from the guest perspective.
This change allocates DMA bounce buffers dynamically instead of
supporting only a single buffer. Thus, multiple DMA mappings work
correctly also when RAM can't be mmap()-ed.
The total bounce buffer allocation size is limited individually for each
AddressSpace. The default limit is 4096 bytes, matching the previous
maximum buffer size. A new x-max-bounce-buffer-size parameter is
provided to configure the limit for PCI devices.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240819135455.2957406-1-mnissler@rivosinc.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 637b0aa139)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A user can create a SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.
A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.
A PF that user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240715-sriov-v5-5-3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie_sriov doesn't have code to restore its state after migration, but
igb, which uses pcie_sriov, naively claimed its migration capability.
Add code to register VFs after migration and fix igb migration.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240627-reuse-v10-9-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240627-reuse-v10-6-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
set/unset HostIOMMUDevice for a given PCI device.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
For types that are embedded in structs defined by pci.h, the definition
is pretty much required to be available. Remove them from typedefs.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pcie_sriov_pf_disable_vfs() is called when resetting the PF, but it only
disables VFs and does not reset SR-IOV extended capability, leaking the
state and making the VF Enable register inconsistent with the actual
state.
Replace pcie_sriov_pf_disable_vfs() with pcie_sriov_pf_reset(), which
does not only disable VFs but also resets the capability.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-3-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
This function is no longer used, as all its callers have been converted
to use pci_init_nic_devices() instead.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The loop over nd_table[] to add PCI NICs is repeated in quite a few
places. Add a helper function to do it.
Some platforms also try to instantiate a specific model in a specific
slot, to match the real hardware. Add pci_init_nic_in_slot() for that
purpose.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
target/hppa: Add emulation of a C3700 HP-PARISC workstation
This series adds a new PA-RISC machine emulation for the HP-PARISC
C3700 workstation.
The physical HP C3700 machine has a PA2.0 (64-bit) CPU, in contrast to
the existing emulation of a B160L workstation which is a 32-bit only
machine and where it's Dino PCI controller isn't 64-bit capable.
With the HP C3700 machine emulation (together with the emulated Astro
Memory controller and the Elroy PCI bridge) it's now possible to
enhance the hppa CPU emulation to support the 64-bit instruction set
in upcoming patches.
Helge
v4 changes:
- Fix testsuite error in astro by adding a realize() implementation
v3 changes:
based on feedback from BALATON Zoltan <balaton@eik.bme.hu>:
- apply paches in different order to bring them logically closer to each other
- update comments in lasips2
- rephrased title and commit message of MAINTAINERS patch
v2 changes:
suggestions by BALATON Zoltan <balaton@eik.bme.hu>:
- merged pci_ids and tulip patch
- dropped comments in lasips2
- mention additional cleanups in patch "Require at least SeaBIOS-hppa version 10"
suggestions by Philippe Mathieu-Daudé <philmd@linaro.org>:
- dropped static pci_bus variable
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZTGzDQAKCRD3ErUQojoP
# X9psAP0cHfTuJuXMiBWhrJhfp5VV0TURvaNXjCGyK8qvfbK+zgEArg3nvKhZPvnu
# jVSq6b/Ppf3eCAZIYSVIsfLITbElTQ4=
# =Esj+
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Oct 2023 15:51:57 PDT
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'C3700-pull-request' of https://github.com/hdeller/qemu-hppa:
hw/hppa: Add new HP C3700 machine
hw/hppa: Split out machine creation
hw/hppa: Provide RTC and DebugOutputPort on CPU #0
hw/hppa: Export machine name, BTLBs, power-button address via fw_cfg
MAINTAINERS: Update HP-PARISC entries
pci-host: Wire up new Astro/Elroy PCI bridge
hw/pci-host: Add Astro system bus adapter found on PA-RISC machines
lasips2: LASI PS/2 devices are not user-createable
pci_ids/tulip: Add PCI vendor ID for HP and use it in tulip
hw/hppa: Require at least SeaBIOS-hppa version 10
target/hppa: Update to SeaBIOS-hppa version 10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fix:
hw/pci/pci.c:504:54: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
MemoryRegion *address_space_io,
^
hw/pci/pci.c:533:38: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
MemoryRegion *address_space_io,
^
hw/pci/pci.c:543:40: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
MemoryRegion *address_space_io,
^
hw/pci/pci.c:590:45: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
MemoryRegion *address_space_io,
^
include/exec/address-spaces.h:35:21: note: previous declaration is here
extern AddressSpace address_space_io;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20231010115048.11856-6-philmd@linaro.org>
current code sets PCI_SEC_LATENCY_TIMER to RW, but for
pcie to pcie bridges it must be RO 0 according to
pci express spec which says:
This register does not apply to PCI Express. It must be read-only
and hardwired to 00h. For PCI Express to PCI/PCI-X Bridges, refer to the
[PCIe-to-PCI-PCI-X-Bridge] for requirements for this register.
also, fix typo in comment where it's made writeable - this typo
is likely what prevented us noticing we violate this requirement
in the 1st place.
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-Id: <de9d05366a70172e1789d10591dbe59e39c3849c.1693432039.git.mst@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Universal Flash Storage (UFS) is a high-performance mass storage device
with a serial interface. It is primarily used as a high-performance
data storage device for embedded applications.
This commit contains code for UFS device to be recognized
as a UFS PCI device.
Patches to handle UFS logical unit and Transfer Request will follow.
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 10232660d462ee5cd10cf673f1a9a1205fc8276c.1693980783.git.jeuk20.kim@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The current implementers of ARI are all SR-IOV devices. The ARI next
function number field is undefined for VF according to PCI Express Base
Specification Revision 5.0 Version 1.0 section 9.3.7.7. The PF still
requires some defined value so end the linked list formed with the field
by specifying 0 as required for any ARI implementation according to
section 7.8.7.2.
For migration, the field will keep having 1 as its value on the old
QEMU machine versions.
Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt")
Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV")
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230710153838.33917-3-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is also pci_new() which creates non-multifunction PCI devices.
Accordingly the parameter is always set to true when a multi function PCI
device is to be created.
The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be resolved by factoring out an internal helper
function.
Remove this redundant, error-prone parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is also pci_create_simple() which creates non-multifunction PCI
devices. Accordingly the parameter is always set to true when a multi
function PCI device is to be created.
The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be replaced by trivial code.
Remove this redundant, error-prone parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since it's implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK
set for machine types < 8.0 will cause migration to fail if the target
QEMU version is < 8.0.0 :
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load e1000e:parent_obj
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
qemu-system-x86_64: load of migration failed: Invalid argument
The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
with this cmdline:
./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]
In order to fix this, property x-pcie-err-unc-mask was introduced to
control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
default, but is disabled if machine type <= 7.2.
Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The lifetime of the PCIBridgeWindows instance accessed via the windows pointer
in struct PCIBridge is managed separately from the PCIBridge itself.
Triggered by ./qemu-system-x86_64 -M x-remote -display none -monitor stdio
QEMU monitor: device_add cxl-downstream
In some error handling paths (such as the above due to attaching a cxl-downstream
port anything other than a cxl-upstream port) the g_free() of the PCIBridge
windows in pci_bridge_region_cleanup() is called before the final call of
flatview_uref() in address_space_set_flatview() ultimately from
drain_call_rcu()
At one stage this resulted in a crash, currently can still be observed using
valgrind which records a use after free.
When present, only one instance is allocated. pci_bridge_update_mappings()
can operate directly on an instance rather than creating a new one and
swapping it in. Thus there appears to be no reason to not directly
couple the lifetimes of the two structures by embedding the PCIBridgeWindows
within the PCIBridge removing the need for the problematic separate free.
Patch is same as was posted deep in the discussion.
https://lore.kernel.org/qemu-devel/20230403171232.000020bb@huawei.com/
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230421122550.28234-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Previously, PXB_CXL_DEVICE, PXB_PCIE_DEVICE and PXB_DEVICE all
have PCI_DEVICE as their direct parent but share a common state
struct PXBDev. convert_to_pxb() is used to get the PXBDev
instance from which ever of these types it is called on.
This patch switches to an explicit hierarchy based on shared
functionality. To allow use of OBJECT_DECLARE_SIMPLE_TYPE()
whilst minimizing code changes, all types are renamed to have
the postfix _DEV rather than _DEVICE. The new heirarchy
has PXB_CXL_DEV with parent PXB_PCIE_DEV which in turn
has parent PXB_DEV which continues to have parent PCI_DEVICE.
This allows simple use of PXB_DEV() etc rather than a custom function
+ removal of duplicated properties and moving the CXL specific
elements out of struct PXBDev.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230420142750.6950-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch provides accessor functions as replacements for direct
access to slot_reserved_mask according to the comment at the top
of include/hw/pci/pci_bus.h which advises that data structures for
PCIBus should not be directly accessed but instead be accessed using
accessor functions in pci.h.
Three accessor functions can conveniently replace all direct accesses
of slot_reserved_mask. With this patch, the new accessor functions are
used in hw/sparc64/sun4u.c and hw/xen/xen_pt.c and pci_bus.h is removed
from the included header files of the same two files.
No functional change intended.
Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
Message-Id: <b1b7f134883cbc83e455abbe5ee225c71aa0e8d0.1678888385.git.brchuckz@aol.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [sun4u]
The CXL r3.0 specification allows for there to be no HDM decoders on CXL
Host Bridges if they have only a single root port. Instead, all accesses
directed to the host bridge (as specified in CXL Fixed Memory Windows)
are assumed to be routed to the single root port.
Linux currently assumes this implementation choice. So to simplify testing,
make QEMU emulation also default to no HDM decoders under these particular
circumstances, but provide a hdm_for_passthrough boolean option to have
HDM decoders as previously.
Technically this is breaking backwards compatibility, but given the only
known software stack used with the QEMU emulation is the Linux kernel
and this configuration did not work before this change, there are
unlikely to be any complaints that it now works. The option is retained
to allow testing of software that does allow for these HDM decoders to exist,
once someone writes it.
Reported-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
--
v2: Pick up and fix typo in tag from Fan Ni
Message-Id: <20230227153128.8164-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
These two helpers enable host bridges to operate differently depending on
the number of downstream ports, in particular if there is only a single
port.
Useful for CXL where HDM address decoders are allowed to be implicit in
the host bridge if there is only a single root port.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230227153128.8164-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We already have indicator values in
include/standard-headers/linux/pci_regs.h , no reason to reinvent them
in include/hw/pci/pcie_regs.h. (and we already have usage of
PCI_EXP_SLTCTL_PWR_IND_BLINK and PCI_EXP_SLTCTL_PWR_IND_OFF in
hw/pci/pcie.c, so let's be consistent)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru>
Message-Id: <20230216180356.156832-9-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>