Compare commits

...

276 Commits

Author SHA1 Message Date
Bernhard Beschow
d13b9e4a54 Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"
Specifying the property `-M pflash0` results in a regression:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
Revert the change for now until a solution is found.

This reverts commit 6f6ad2b245.

Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-3-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f2cb9f34ad)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-27 09:23:41 +01:00
Bernhard Beschow
9b0eb5104f Revert "hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove it"
Commit 6f6ad2b245 "hw/i386/pc: Confine system flash handling to pc_sysfw"
causes a regression when specifying the property `-M pflash0` in the PCI PC
machines:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
In order to revert the commit, the commit below must be reverted first.

This reverts commit cb05cc1602.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-2-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0fbe8d7c4c)
Referecens: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-27 09:23:02 +01:00
Xiaoyao Li
1b231e8f26 docs: Add TDX documentation
Add docs/system/i386/tdx.rst for TDX support, and add tdx in
confidential-guest-support.rst

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:17:00 +01:00
Sean Christopherson
72db2d374f i386/tdx: Don't get/put guest state for TDX VMs
Don't get/put state of TDX VMs since accessing/mutating guest state of
production TDs is not supported.

Note, it will be allowed for a debug TD. Corresponding support will be
introduced when debug TD support is implemented in the future.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:17:00 +01:00
Xiaoyao Li
4a81d67680 i386/tdx: Skip kvm_put_apicbase() for TDs
KVM doesn't allow wirting to MSR_IA32_APICBASE for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:17:00 +01:00
Xiaoyao Li
820895b892 i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
For TDs, only MSR_IA32_UCODE_REV in kvm_init_msrs() can be configured
by VMM, while the features enumerated/controlled by other MSRs except
MSR_IA32_UCODE_REV in kvm_init_msrs() are not under control of VMM.

Only configure MSR_IA32_UCODE_REV for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:17:00 +01:00
Isaku Yamahata
fa977aad67 i386/tdx: Don't synchronize guest tsc for TDs
TSC of TDs is not accessible and KVM doesn't allow access of
MSR_IA32_TSC for TDs. To avoid the assert() in kvm_get_tsc, make
kvm_synchronize_all_tsc() noop for TDs,

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:17:00 +01:00
Isaku Yamahata
e12b096972 hw/i386: add option to forcibly report edge trigger in acpi tables
When level trigger isn't supported on x86 platform,
forcibly report edge trigger in acpi tables.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:17:00 +01:00
Xiaoyao Li
321af9d3fa hw/i386: add eoi_intercept_unsupported member to X86MachineState
Add a new bool member, eoi_intercept_unsupported, to X86MachineState
with default value false. Set true for TDX VM.

Inability to intercept eoi causes impossibility to emulate level
triggered interrupt to be re-injected when level is still kept active.
which affects interrupt controller emulation.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:17:00 +01:00
Xiaoyao Li
437417dac6 i386/tdx: LMCE is not supported for TDX
LMCE is not supported TDX since KVM doesn't provide emulation for
MSR_IA32_FEAT_CTL.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:59 +01:00
Xiaoyao Li
1c882e8210 i386/tdx: Don't allow system reset for TDX VMs
TDX CPU state is protected and thus vcpu state cann't be reset by VMM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:16:59 +01:00
Xiaoyao Li
56ee0ce804 i386/tdx: Disable PIC for TDX VMs
Legacy PIC (8259) cannot be supported for TDX VMs since TDX module
doesn't allow directly interrupt injection.  Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.

Hence disable PIC for TDX VMs and error out if user wants PIC.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:16:59 +01:00
Xiaoyao Li
36fe16995c i386/tdx: Disable SMM for TDX VMs
TDX doesn't support SMM and VMM cannot emulate SMM for TDX VMs because
VMM cannot manipulate TDX VM's memory.

Disable SMM for TDX VMs and error out if user requests to enable SMM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:16:59 +01:00
Isaku Yamahata
03d11c6dca q35: Introduce smm_ranges property for q35-pci-host
Add a q35 property to check whether or not SMM ranges, e.g. SMRAM, TSEG,
etc... exist for the target platform.  TDX doesn't support SMM and doesn't
play nice with QEMU modifying related guest memory ranges.

Signed-off-by: Isaku Yamahata <isaku.yamahata@linux.intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:59 +01:00
Isaku Yamahata
860a8b50a3 pci-host/q35: Move PAM initialization above SMRAM initialization
In mch_realize(), process PAM initialization before SMRAM initialization so
that later patch can skill all the SMRAM related with a single check.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:59 +01:00
Xiaoyao Li
e03a166f4e i386/tdx: Wire TDX_REPORT_FATAL_ERROR with GuestPanic facility
Integrate TDX's TDX_REPORT_FATAL_ERROR into QEMU GuestPanic facility

Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:59 +01:00
Xiaoyao Li
819540ef60 i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR>
TD guest can use TDG.VP.VMCALL<REPORT_FATAL_ERROR> to request termination
with error message encoded in GPRs.

Parse and print the error message, and terminate the TD guest in the
handler.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:59 +01:00
Isaku Yamahata
ac2eeaf4a2 i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall
MapGPA is a hypercall to convert GPA from/to private GPA to/from shared GPA.
As the conversion function is already implemented as kvm_convert_memory,
wire it to TDX hypercall exit.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:59 +01:00
Isaku Yamahata
6f0f362e95 i386/tdx: handle TDG.VP.VMCALL<GetQuote>
Add property "quote-generation-socket" to tdx-guest, which is a property
of type SocketAddress to specify Quote Generation Service(QGS).

On request of GetQuote, it connects to the QGS socket, read request
data from shared guest memory, send the request data to the QGS,
and store the response into shared guest memory, at last notify
TD guest by interrupt.

command line example:
  qemu-system-x86_64 \
    -object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", "cid":"1","port":"1234"}}' \
    -machine confidential-guest-support=tdx0

Note, above example uses vsock type socket because the QGS we used
implements the vsock socket. It can be other types, like UNIX socket,
which depends on the implementation of QGS.

To avoid no response from QGS server, setup a timer for the transaction.
If timeout, make it an error and interrupt guest. Define the threshold of
time to 30s at present, maybe change to other value if not appropriate.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Codeveloped-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Codeveloped-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:16:57 +01:00
Isaku Yamahata
df99750b0d i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt>
For SetupEventNotifyInterrupt, record interrupt vector and the apic id
of the vcpu that received this TDVMCALL.

Later it can inject interrupt with given vector to the specific vcpu
that received SetupEventNotifyInterrupt.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:57 +01:00
Xiaoyao Li
ae1705eda4 i386/tdx: Finalize TDX VM
Invoke KVM_TDX_FINALIZE_VM to finalize the TD's measurement and make
the TD vCPUs runnable once machine initialization is complete.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:57 +01:00
Xiaoyao Li
2c9915e371 i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
TDX vcpu needs to be initialized by SEAMCALL(TDH.VP.INIT) and KVM
provides vcpu level IOCTL KVM_TDX_INIT_VCPU for it.

KVM_TDX_INIT_VCPU needs the address of the HOB as input. Invoke it for
each vcpu after HOB list is created.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:57 +01:00
Isaku Yamahata
b0886d6973 i386/tdx: Populate TDVF private memory via KVM_MEMORY_MAPPING
TDVF firmware (CODE and VARS) needs to be copied to TD's private
memory, as well as TD HOB and TEMP memory.

If the TDVF section has TDVF_SECTION_ATTRIBUTES_MR_EXTEND set in the
flag, calling KVM_TDX_EXTEND_MEMORY to extend the measurement.

After populating the TDVF memory, the original image located in shared
ramblock can be discarded.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:57 +01:00
Xiaoyao Li
3da0f0e494 i386/tdx: Setup the TD HOB list
The TD HOB list is used to pass the information from VMM to TDVF. The TD
HOB must include PHIT HOB and Resource Descriptor HOB. More details can
be found in TDVF specification and PI specification.

Build the TD HOB in TDX's machine_init_done callback.

Co-developed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:57 +01:00
Xiaoyao Li
7813aa24ad headers: Add definitions from UEFI spec for volumes, resources, etc...
Add UEFI definitions for literals, enums, structs, GUIDs, etc... that
will be used by TDX to build the UEFI Hand-Off Block (HOB) that is passed
to the Trusted Domain Virtual Firmware (TDVF).

All values come from the UEFI specification [1], PI spec [2] and TDVF
design guide[3].

[1] UEFI Specification v2.1.0 https://uefi.org/sites/default/files/resources/UEFI_Spec_2_10_Aug29.pdf
[2] UEFI PI spec v1.8 https://uefi.org/sites/default/files/resources/UEFI_PI_Spec_1_8_March3.pdf
[3] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
9fd4ec0cb2 i386/tdx: Track RAM entries for TDX VM
The RAM of TDX VM can be classified into two types:

 - TDX_RAM_UNACCEPTED: default type of TDX memory, which needs to be
   accepted by TDX guest before it can be used and will be all-zeros
   after being accepted.

 - TDX_RAM_ADDED: the RAM that is ADD'ed to TD guest before running, and
   can be used directly. E.g., TD HOB and TEMP MEM that needed by TDVF.

Maintain TdxRamEntries[] which grabs the initial RAM info from e820 table
and mark each RAM range as default type TDX_RAM_UNACCEPTED.

Then turn the range of TD HOB and TEMP MEM to TDX_RAM_ADDED since these
ranges will be ADD'ed before TD runs and no need to be accepted runtime.

The TdxRamEntries[] are later used to setup the memory TD resource HOB
that passes memory info from QEMU to TDVF.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
11319132f6 i386/tdx: Track mem_ptr for each firmware entry of TDVF
For each TDVF sections, QEMU needs to copy the content to guest
private memory via KVM API (KVM_TDX_INIT_MEM_REGION).

Introduce a field @mem_ptr for TdxFirmwareEntry to track the memory
pointer of each TDVF sections. So that QEMU can add/copy them to guest
private memory later.

TDVF sections can be classified into two groups:
 - Firmware itself, e.g., TDVF BFV and CFV, that located separately from
   guest RAM. Its memory pointer is the bios pointer.

 - Sections located at guest RAM, e.g., TEMP_MEM and TD_HOB.
   mmap a new memory range for them.

Register a machine_init_done callback to do the stuff.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
118a3e46df i386/tdx: Don't initialize pc.rom for TDX VMs
For TDX, the address below 1MB are entirely general RAM. No need to
initialize pc.rom memory region for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
a1d39edf05 i386/tdx: Skip BIOS shadowing setup
TDX doesn't support map different GPAs to same private memory. Thus,
aliasing top 128KB of BIOS as isa-bios is not supported.

On the other hand, TDX guest cannot go to real mode, it can work fine
without isa-bios.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
3c8db2995b i386/tdx: Parse TDVF metadata for TDX VM
After TDVF is loaded to bios MemoryRegion, it needs parse TDVF metadata.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Isaku Yamahata
94b122086c i386/tdvf: Introduce function to parse TDVF metadata
TDX VM needs to boot with its specialized firmware, Trusted Domain
Virtual Firmware (TDVF). QEMU needs to parse TDVF and map it in TD
guest memory prior to running the TDX VM.

A TDVF Metadata in TDVF image describes the structure of firmware.
QEMU refers to it to setup memory for TDVF. Introduce function
tdvf_parse_metadata() to parse the metadata from TDVF image and store
the info of each TDVF section.

TDX metadata is located by a TDX metadata offset block, which is a
GUID-ed structure. The data portion of the GUID structure contains
only an 4-byte field that is the offset of TDX metadata to the end
of firmware file.

Select X86_FW_OVMF when TDX is enable to leverage existing functions
to parse and search OVMF's GUID-ed structures.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Chao Peng
3a8358b856 i386/tdx: load TDVF for TD guest
TDVF(OVMF) needs to run at private memory for TD guest. TDX cannot
support pflash device since it doesn't support read-only private memory.
Thus load TDVF(OVMF) with -bios option for TDs.

Use memory_region_init_ram_guest_memfd() to allocate the MemoryRegion
for TDVF because it needs to be located at private memory.

Also store the MemoryRegion pointer of TDVF since the shared ramblock of
it can be discared after it gets copied to private ramblock.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
a0a25c2263 memory: Introduce memory_region_init_ram_guest_memfd()
Introduce memory_region_init_ram_guest_memfd() to allocate private
guset memfd on the MemoryRegion initialization. It's for the use case of
TDVF, which must be private on TDX case.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Isaku Yamahata
433562d5e6 kvm/tdx: Ignore memory conversion to shared of unassigned region
TDX requires vMMIO region to be shared.  For KVM, MMIO region is the region
which kvm memslot isn't assigned to (except in-kernel emulation).
qemu has the memory region for vMMIO at each device level.

While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO
region, qemu doesn't find corresponding vMMIO region because it's before
PCI device allocation and memory_region_find() finds the device region, not
PCI bus region.  It's safe to ignore MapGPA(to-shared) because when guest
accesses those region they use GPA with shared bit set for vMMIO.  Ignore
memory conversion request of non-assigned region to shared and return
success.  Otherwise OVMF is confused and panics there.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Isaku Yamahata
02d624f780 kvm/tdx: Don't complain when converting vMMIO region to shared
Because vMMIO region needs to be shared region, guest TD may explicitly
convert such region from private to shared.  Don't complain such
conversion.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
18b319791d i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
TDX only supports readonly for shared memory but not for private memory.

In the view of QEMU, it has no idea whether a memslot is used as shared
memory of private. Thus just mark kvm_readonly_mem_enabled to false to
TDX VM for simplicity.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
ae1391d1da i386/tdx: Implement user specified tsc frequency
Reuse "-cpu,tsc-frequency=" to get user wanted tsc frequency and call VM
scope VM_SET_TSC_KHZ to set the tsc frequency of TD before KVM_TDX_INIT_VM.

Besides, sanity check the tsc frequency to be in the legal range and
legal granularity (required by TDX module).

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Isaku Yamahata
b2c919bcdc i386/tdx: Support user configurable mrconfigid/mrowner/mrownerconfig
Three sha384 hash values, mrconfigid, mrowner and mrownerconfig, of a TD
can be provided for TDX attestation. Detailed meaning of them can be
found: https://lore.kernel.org/qemu-devel/31d6dbc1-f453-4cef-ab08-4813f4e0ff92@intel.com/

Allow user to specify those values via property mrconfigid, mrowner and
mrownerconfig. They are all in base64 format.

example
-object tdx-guest, \
  mrconfigid=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v,\
  mrowner=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v,\
  mrownerconfig=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
aa7c69b590 i386/tdx: Validate TD attributes
Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
fixed-1 bits must be set.

Besides, sanity check the attribute bits that have not been supported by
QEMU yet. e.g., debug bit, it will be allowed in the future when debug
TD support lands in QEMU.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
2410afb463 i386/tdx: Disable pmu for TD guest
Current KVM doesn't support PMU for TD guest. It returns error if TD is
created with PMU bit being set in attributes.

Disable PMU for TD guest on QEMU side.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
ab9bc0b91a i386/tdx: Wire CPU features up with attributes of TD guest
For QEMU VMs, PKS is configured via CPUID_7_0_ECX_PKS and PMU is
configured by x86cpu->enable_pmu. Reuse the existing configuration
interface for TDX VMs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:14:56 +01:00
Isaku Yamahata
45a5878924 i386/tdx: Make sept_ve_disable set by default
For TDX KVM use case, Linux guest is the most major one.  It requires
sept_ve_disable set.  Make it default for the main use case.  For other use
case, it can be enabled/disabled via qemu command line.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
05d9e2102f i386/tdx: Add property sept-ve-disable for tdx-guest object
Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables
EPT violation conversion to #VE on guest TD access of PENDING pages.

Some guest OS (e.g., Linux TD guest) may require this bit as 1.
Otherwise refuse to boot.

Add sept-ve-disable property for tdx-guest object, for user to configure
this bit.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
2024-03-26 17:14:56 +01:00
Xiaoyao Li
5e74fb7a2c i386/tdx: Initialize TDX before creating TD vcpus
Invoke KVM_TDX_INIT in kvm_arch_pre_create_vcpu() that KVM_TDX_INIT
configures global TD configurations, e.g. the canonical CPUID config,
and must be executed prior to creating vCPUs.

Use kvm_x86_arch_cpuid() to setup the CPUID settings for TDX VM.

Note, this doesn't address the fact that QEMU may change the CPUID
configuration when creating vCPUs, i.e. punts on refactoring QEMU to
provide a stable CPUID config prior to kvm_arch_init().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
2024-03-26 17:14:54 +01:00
Xiaoyao Li
5ca74db70f kvm: Introduce kvm_arch_pre_create_vcpu()
Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
work prior to create any vcpu. This is for i386 TDX because it needs
call TDX_INIT_VM before creating any vcpu.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:13:49 +01:00
Sean Christopherson
796bcd9463 i386/kvm: Move architectural CPUID leaf generation to separate helper
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
2f3ccbe5b5 i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup
Some bits in TD attributes have corresponding CPUID feature bits. Reflect
the fixed0/1 restriction on TD attributes to their corresponding CPUID
bits in tdx_cpuid_lookup[] as well.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
83c0f0e38c i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup
KVM requires userspace to pass XFAM configuration via CPUID 0xD leaves.

Convert tdx_caps->xfam_fixed0/1 into corresponding
tdx_cpuid_lookup[].tdx_fixed0/1 field of CPUID 0xD leaves. Thus the
requirement can be applied naturally.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
6b412efd3a i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by tdx_caps.cpuid_config[]
tdx_cpuid_lookup[].tdx_fixed0/1 is QEMU maintained data which reflects
TDX restrictions regrading what bits are fixed by TDX module.

It's retrieved from TDX spec and static. However, TDX may evolve and
change some fixed fields to configurable in the future. Update
tdx_cpuid.lookup[].tdx_fixed0/1 fields by removing the bits that
reported from TDX module as configurable. This can adapt with the
updated TDX (module) automatically.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
9357a3d329 i386/tdx: Make Intel-PT unsupported for TD guest
Due to the fact that Intel-PT virtualization support has been broken in
QEMU since Sapphire Rapids generation[1], below warning is triggered when
luanching TD guest:

  warning: host doesn't support requested feature: CPUID.07H:EBX.intel-pt [bit 25]

Before Intel-pt is fixed in QEMU, just make Intel-PT unsupported for TD
guest, to avoid the confusing warning.

[1] https://lore.kernel.org/qemu-devel/20230531084311.3807277-1-xiaoyao.li@intel.com/

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
13d5ea1836 i386/tdx: Adjust the supported CPUID based on TDX restrictions
According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
bits of TD can be classified into 6 types:

------------------------------------------------------------------------
1 | As configured | configurable by VMM, independent of native value;
------------------------------------------------------------------------
2 | As configured | configurable by VMM if the bit is supported natively
    (if native)   | Otherwise it equals as native(0).
------------------------------------------------------------------------
3 | Fixed         | fixed to 0/1
------------------------------------------------------------------------
4 | Native        | reflect the native value
------------------------------------------------------------------------
5 | Calculated    | calculated by TDX module.
------------------------------------------------------------------------
6 | Inducing #VE  | get #VE exception
------------------------------------------------------------------------

Note:
1. All the configurable XFAM related features and TD attributes related
   features fall into type #2. And fixed0/1 bits of XFAM and TD
   attributes fall into type #3.

2. For CPUID leaves not listed in "CPUID virtualization Overview" table
   in TDX module spec, TDX module injects #VE to TDs when those are
   queried. For this case, TDs can request CPUID emulation from VMM via
   TDVMCALL and the values are fully controlled by VMM.

Due to TDX module has its own virtualization policy on CPUID bits, it leads
to what reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
CPUID bits for TDs. In order to keep a consistent CPUID configuration
between VMM and TDs. Adjust supported CPUID for TDs based on TDX
restrictions.

Currently only focus on the CPUID leaves recognized by QEMU's
feature_word_info[] that are indexed by a FeatureWord.

Introduce a TDX CPUID lookup table, which maintains 1 entry for each
FeatureWord. Each entry has below fields:

 - tdx_fixed0/1: The bits that are fixed as 0/1;

 - depends_on_vmm_cap: The bits that are configurable from the view of
		       TDX module. But they requires emulation of VMM
		       when configured as enabled. For those, they are
		       not supported if VMM doesn't report them as
		       supported. So they need be fixed up by checking
		       if VMM supports them.

 - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
                totally configurable by VMM.

 - supported_on_ve: It's valid only when @inducing_ve is true. It represents
		    the maximum feature set supported that be emulated
		    for TDs.

By applying TDX CPUID lookup table and TDX capabilities reported from
TDX module, the supported CPUID for TDs can be obtained from following
steps:

- get the base of VMM supported feature set;

- if the leaf is not a FeatureWord just return VMM's value without
  modification;

- if the leaf is an inducing_ve type, applying supported_on_ve mask and
  return;

- include all native bits, it covers type #2, #4, and parts of type #1.
  (it also includes some unsupported bits. The following step will
   correct it.)

- apply fixed0/1 to it (it covers #3, and rectifies the previous step);

- add configurable bits (it covers the other part of type #1);

- fix the ones in vmm_fixup;

(Calculated type is ignored since it's determined at runtime).

Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
0b0493be25 i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
It will need special handling for TDX VMs all around the QEMU.
Introduce is_tdx_vm() helper to query if it's a TDX VM.

Cache tdx_guest object thus no need to cast from ms->cgs every time.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
1f15e4c9f0 i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
KVM provides TDX capabilities via sub command KVM_TDX_CAPABILITIES of
IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing
TDX context. It will be used to validate user's setting later.

Since there is no interface reporting how many cpuid configs contains in
KVM_TDX_CAPABILITIES, QEMU chooses to try starting with a known number
and abort when it exceeds KVM_MAX_CPUID_ENTRIES.

Besides, introduce the interfaces to invoke TDX "ioctls" at different
scope (KVM, VM and VCPU) in preparation.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
a14b9df0cc i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
Implement TDX specific ConfidentialGuestSupportClass::kvm_init()
callback, tdx_kvm_init().

Set ms->require_guest_memfd to true to require private guest memfd
allocation for any memory backend.

More TDX specific initialization will be added later.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
baab233ea4 target/i386: Implement mc->kvm_type() to get VM type
TDX VM requires VM type KVM_X86_TDX_VM to be passed to
kvm_ioctl(KVM_CREATE_VM). Hence implement mc->kvm_type() for i386
architecture.

If tdx-guest object is specified to confidential-guest-support, like,

  qemu -machine ...,confidential-guest-support=tdx0 \
       -object tdx-guest,id=tdx0,...

it parses VM type as KVM_X86_TDX_VM. Otherwise, it's KVM_X86_DEFAULT_VM.

Also store the vm_type in MachineState for other code to query what the
VM type is.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2024-03-26 17:13:49 +01:00
Xiaoyao Li
27e07ada8b i386: Introduce tdx-guest object
Introduce tdx-guest object which inherits CONFIDENTIAL_GUEST_SUPPORT,
and will be used to create TDX VMs (TDs) by

  qemu -machine ...,confidential-guest-support=tdx0	\
       -object tdx-guest,id=tdx0

So far, it has no QAPI member/properety decleared and only one internal
member 'attributes' with fixed value 0 that not configurable.

QAPI properties will be added later.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
2024-03-26 17:13:44 +01:00
Xiaoyao Li
d8e31efb54 *** HACK *** linux-headers: Update headers to pull in TDX API changes
Pull in recent TDX updates, which are not backwards compatible.

It's just to make this series runnable. It will be updated by script

	scripts/update-linux-headers.sh

once TDX support is upstreamed in linux kernel

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
3152a855e2 kvm/memory: Make memory type private by default if it has guest memfd backend
KVM side leaves the memory to shared by default, while may incur the
overhead of paging conversion on the first visit of each page. Because
the expectation is that page is likely to private for the VMs that
require private memory (has guest memfd).

Explicitly set the memory to private when memory region has valid
guest memfd backend.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:03:16 +01:00
Isaku Yamahata
5e897fcf5f trace/kvm: Add trace for page convertion between shared and private
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:03:16 +01:00
Chao Peng
363c43d251 kvm: handle KVM_EXIT_MEMORY_FAULT
When geeting KVM_EXIT_MEMORY_FAULT exit, it indicates userspace needs to
do the memory conversion on the RAMBlock to turn the memory into desired
attribute, i.e., private/shared.

Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when
KVM_EXIT_MEMORY_FAULT happens.

Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has
guest_memfd memory backend.

Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is
added.

When page is converted from shared to private, the original shared
memory can be discarded via ram_block_discard_range(). Note, shared
memory can be discarded only when it's not back'ed by hugetlb because
hugetlb is supposed to be pre-allocated and no need for discarding.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
b359aee12a physmem: Introduce ram_block_discard_guest_memfd_range()
When memory page is converted from private to shared, the original
private memory is back'ed by guest_memfd. Introduce
ram_block_discard_guest_memfd_range() for discarding memory in
guest_memfd.

Originally-from: Isaku Yamahata <isaku.yamahata@intel.com>
Codeveloped-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
e1f1caf9d4 kvm: Introduce support for memory_attributes
Introduce the helper functions to set the attributes of a range of
memory to private or shared.

This is necessary to notify KVM the private/shared attribute of each gpa
range. KVM needs the information to decide the GPA needs to be mapped at
hva-based shared memory or guest_memfd based private memory.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:03:16 +01:00
Chao Peng
1b8cbfe6ee kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.

With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that
backend'ed both by hva-based shared memory and guest memfd based private
memory.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
28cf6e1fe0 trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
The upper 16 bits of kvm_userspace_memory_region::slot are
address space id. Parse it separately in trace_kvm_set_user_memory().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
cdbcc2d68c HostMem: Add mechanism to opt in kvm guest memfd via MachineState
Add a new member "guest_memfd" to memory backends. When it's set
to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
guest_memfd will be allocated during RAMBlock allocation.

Memory backend's @guest_memfd is wired with @require_guest_memfd
field of MachineState. It avoid looking up the machine in phymem.c.

MachineState::require_guest_memfd is supposed to be set by any VMs
that requires KVM guest memfd as private memory, e.g., TDX VM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
e3da9ef694 RAMBlock: Add support of KVM private guest memfd
Add KVM guest_memfd support to RAMBlock so both normal hva based memory
and kvm guest memfd based private memory can be associated in one RAMBlock.

Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to
create private guest_memfd during RAMBlock setup.

Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
is more flexible and extensible than simply relying on the VM type because
in the future we may have the case that not all the memory of a VM need
guest memfd. As a benefit, it also avoid getting MachineState in memory
subsystem.

Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as TDX VM. How and when to set it for memory
backends will be implemented in the following patches.

Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
KVM guest_memfd allocated.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
2e638a41c1 linux-headers: Update to Linux v6.8-rc5
Guest memfd support in QEMU requires corresponding KVM guest memfd APIs,
which lands in Linux from v6.8-rc1.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
2024-03-26 17:03:16 +01:00
Xiaoyao Li
2dd11a66a3 s390: Switch to use confidential_guest_kvm_init()
Use unified confidential_guest_kvm_init(), to avoid exposing specific
functions.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from rfc v1:
 - check machine->cgs not NULL before calling confidential_guest_kvm_init();
2024-03-26 17:03:12 +01:00
Xiaoyao Li
6c0cc4975e ppc/pef: switch to use confidential_guest_kvm_init/reset()
Use the unified interface to call confidential guest related kvm_init()
and kvm_reset(), to avoid exposing pef specific functions.

remove perf.h since it is now blank..

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from rfc v1:
 - check machine->cgs not NULL before callling
   confidential_guest_kvm_init/reset();
2024-03-26 17:03:05 +01:00
Xiaoyao Li
88f5e47bdc i386/sev: Switch to use confidential_guest_kvm_init()
Use confidential_guest_kvm_init() instead of calling SEV specific
sev_kvm_init(). As a bouns, it fits to future TDX when TDX implements
its own confidential_guest_support and .kvm_init().

Move the "TypeInfo sev_guest_info" definition and related functions to
the end of the file, to avoid declaring the sev_kvm_init() ahead.

Delete the sve-stub.c since it's not needed anymore.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from rfc v1:
- check ms->cgs not NULL before calling confidential_guest_kvm_init();
- delete the sev-stub.c;
2024-03-26 17:02:54 +01:00
Xiaoyao Li
b3f4d68667 confidential guest support: Add kvm_init() and kvm_reset() in class
Different confidential VMs in different architectures all have the same
needs to do their specific initialization (and maybe resetting) stuffs
with KVM. Currently each of them exposes individual *_kvm_init()
functions and let machine code or kvm code to call it.

To make it more object oriented, add two virtual functions, kvm_init()
and kvm_reset() in ConfidentialGuestSupportClass, and expose two helpers
functions for invodking them.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes since rfc v1:
- Drop the NULL check and rely the check from the caller;
2024-03-26 17:02:34 +01:00
Bernhard Beschow
cddbb95e8d hw/i386/pc_q35: Populate interrupt handlers before realizing LPC PCI function
The interrupt handlers need to be populated before the device is realized since
internal devices such as the RTC are wired during realize(). If the interrupt
handlers aren't populated, devices such as the RTC will be wired with a NULL
interrupt handler, i.e. MC146818RtcState::irq is NULL.

Fixes: fc11ca08bc "hw/i386/q35: Realize LPC PCI function before accessing it"

Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240217104644.19755-1-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 143f3fd3d8)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:53:50 +01:00
Philippe Mathieu-Daudé
20a4ef0b4b hw/i386/pc_sysfw: Use qdev_is_realized() instead of QOM API
Prefer QDev API for QDev objects, avoid the underlying QOM layer.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240216110313.17039-3-philmd@linaro.org>
(cherry picked from commit 58183abfe7)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:53:50 +01:00
Philippe Mathieu-Daudé
4e70c6cffd hw/i386/q35: Realize LPC PCI function before accessing it
We should not wire IRQs on unrealized device.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Damien Hedde <dhedde@kalrayinc.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240213130341.1793-5-philmd@linaro.org>
(cherry picked from commit fc11ca08bc)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:51:06 +01:00
Bernhard Beschow
0ad7dd2a3b hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove it
pc_system_flash_create() checked for pcmc->pci_enabled which is redundant since
its caller already checked it. The method can be turned into just two lines, so
inline and remove it.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-8-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit cb05cc1602)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:47:55 +01:00
Bernhard Beschow
a2be8f46c2 hw/i386/pc: Confine system flash handling to pc_sysfw
Rather than distributing PC system flash handling across three files, let's
confine it to one. Now, pc_system_firmware_init() creates, configures and cleans
up the system flash which makes the code easier to understand. It also avoids
the extra call to pc_system_flash_cleanup_unused() in the Xen case.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-7-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 6f6ad2b245)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:46:39 +01:00
Bernhard Beschow
f2858dc109 hw/i386/pc: Defer smbios_set_defaults() to machine_done
Handling most of smbios data generation in the machine_done notifier is similar
to how the ARM virt machine handles it which also calls smbios_set_defaults()
there. The result is that all pc machines are freed from explicitly worrying
about smbios setup.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit a0204a5ed0)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:46:21 +01:00
Bernhard Beschow
0e50e2590d hw/i386/pc: Merge pc_guest_info_init() into pc_machine_initfn()
Resolves redundant code in the piix and q35 machines.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-5-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 4d3457fef9)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:46:01 +01:00
Bernhard Beschow
cf15fc547f hw/i386/x86: Turn apic_xrupt_override into class attribute
The attribute isn't user-changeable and only true for pc-based machines. Turn it
into a class attribute which allows for inlining pc_guest_info_init() into
pc_machine_initfn().

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-4-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 6e6d59a94d)
Referencex: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:45:37 +01:00
Bernhard Beschow
4faecddc93 hw/i386/pc_piix: Share pc_cmos_init() invocation between pc and isapc machines
Both invocations are the same and either one is always executed. Avoid this
redundancy.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-3-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 16bd024bb4)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:44:05 +01:00
Bernhard Beschow
603f0584ab hw/i386/x86: Let ioapic_init_gsi() take parent as pointer
Rather than taking a QOM name which has to be resolved, let's pass the parent
directly as pointer. This simplifies the code.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240224135851.100361-2-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 9b0c44334c)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:43:18 +01:00
Philippe Mathieu-Daudé
0cc701c26d hw/i386/q35: Simplify pc_q35_init() since PCI is always enabled
We can not create the Q35 machine without PCI, so simplify
pc_q35_init() removing pointless checks.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240213041952.58840-1-philmd@linaro.org>
(cherry picked from commit 88ad980c0f)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:42:47 +01:00
Xiaoyao Li
2e8a084f49 i386/cpuid: Remove subleaf constraint on CPUID leaf 1F
No such constraint that subleaf index needs to be less than 64.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by:Yang Weijiang <weijiang.yang@intel.com>
Message-ID: <20240125024016.2521244-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a3b5376521)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 10:17:47 +01:00
Jai Arora
ee127024c2 accel/kvm: Turn DPRINTF macro use into tracepoints
Patch removes DPRINTF macro and adds multiple tracepoints
to capture different kvm events.

We also drop the DPRINTFs that don't add any additional
information than trace_kvm_run_exit already does.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1827

Signed-off-by: Jai Arora <arorajai2798@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 9cdfb1e3a5)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-26 09:44:01 +01:00
Philippe Mathieu-Daudé
12c0f00dc9 hw/pci-host/raven: Propagate error in raven_realize()
When an Error** reference is available, it is better to
propagate local errors, rather then using generic ones,
which might terminate the whole QEMU process.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-26-philmd@linaro.org>
(cherry picked from commit cb50fc6842)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:24:09 +01:00
Philippe Mathieu-Daudé
b9014ea6e7 hw/nvram: Simplify memory_region_init_rom_device() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, arg5, arg6, errp;
@@
-   memory_region_init_rom_device(mr, owner, arg3, arg4, arg5, arg6, &errp);
    if (
-       errp
+       !memory_region_init_rom_device(mr, owner, arg3, arg4, arg5, arg6, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-25-philmd@linaro.org>
(cherry picked from commit ca1b876292)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:23:48 +01:00
Philippe Mathieu-Daudé
9fbb99a0ce hw/misc: Simplify memory_region_init_ram_from_fd() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, arg5, arg6, arg7, errp;
@@
-   memory_region_init_ram_from_fd(mr, owner, arg3, arg4, arg5, arg6, arg7, &errp);
    if (
-       errp
+       !memory_region_init_ram_from_fd(mr, owner, arg3, arg4, arg5, arg6, arg7, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-24-philmd@linaro.org>
(cherry picked from commit 7493bd184e)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:23:25 +01:00
Philippe Mathieu-Daudé
8246d13768 hw/sparc: Simplify memory_region_init_ram_nomigrate() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, errp;
@@
-   memory_region_init_ram_nomigrate(mr, owner, arg3, arg4, &errp);
    if (
-       errp
+       !memory_region_init_ram_nomigrate(mr, owner, arg3, arg4, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-23-philmd@linaro.org>
(cherry picked from commit 02e0ecb42c)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:23:08 +01:00
Philippe Mathieu-Daudé
7fd1a517b6 hw/arm: Simplify memory_region_init_rom() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, errp;
@@
-   memory_region_init_rom(mr, owner, arg3, arg4, &errp);
    if (
-       errp
+       !memory_region_init_rom(mr, owner, arg3, arg4, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-22-philmd@linaro.org>
(cherry picked from commit 419d8524a2)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:22:48 +01:00
Philippe Mathieu-Daudé
f49a1505e9 hw: Simplify memory_region_init_ram() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, errp;
@@
-   memory_region_init_ram(mr, owner, arg3, arg4, &errp);
    if (
-       errp
+       !memory_region_init_ram(mr, owner, arg3, arg4, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au> # aspeed
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-21-philmd@linaro.org>
(cherry picked from commit 2198f5f0f2)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:22:27 +01:00
Philippe Mathieu-Daudé
c7d60cec4a misc: Simplify qemu_prealloc_mem() calls
Since qemu_prealloc_mem() returns whether or not an error
occured, we don't need to check the @errp pointer. Remove
local_err uses when we can return directly.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-20-philmd@linaro.org>
(cherry picked from commit 9c878ad6fb)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:22:10 +01:00
Philippe Mathieu-Daudé
d816ca84fc util/oslib: Have qemu_prealloc_mem() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have qemu_prealloc_mem()
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-19-philmd@linaro.org>
(cherry picked from commit b622ee98bf)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:21:33 +01:00
Philippe Mathieu-Daudé
ea8f948b5b backends: Reduce variable scope in host_memory_backend_memory_complete
Reduce the &local_err variable use and remove the 'out:' label.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-18-philmd@linaro.org>
(cherry picked from commit 3961613a76)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:21:10 +01:00
Philippe Mathieu-Daudé
21eadb015d backends: Have HostMemoryBackendClass::alloc() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have HostMemoryBackendClass::alloc
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-17-philmd@linaro.org>
(cherry picked from commit fdb63cf3b5)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:20:42 +01:00
Philippe Mathieu-Daudé
455ea0ddd0 backends: Simplify host_memory_backend_memory_complete()
Return early if bc->alloc is NULL. De-indent the if() ladder.

Note, this avoids a pointless call to error_propagate() with
errp=NULL at the 'out:' label.

Change trivial when reviewed with 'git-diff --ignore-all-space'.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-16-philmd@linaro.org>
(cherry picked from commit e199f7ad4d)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:20:27 +01:00
Philippe Mathieu-Daudé
21c997995e backends: Use g_autofree in HostMemoryBackendClass::alloc() handlers
In preparation of having HostMemoryBackendClass::alloc() handlers
return a boolean, have them use g_autofree.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-15-philmd@linaro.org>
(cherry picked from commit 2d7a1eb6e6)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:20:10 +01:00
Philippe Mathieu-Daudé
6a1f916145 memory: Have memory_region_init_ram_from_fd() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_ram_from_fd
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-14-philmd@linaro.org>
(cherry picked from commit 9583a90579)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:19:51 +01:00
Philippe Mathieu-Daudé
5c723c4d4f memory: Have memory_region_init_ram_from_file() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_ram_from_file
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-13-philmd@linaro.org>
(cherry picked from commit 9b9d11ac03)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:19:35 +01:00
Philippe Mathieu-Daudé
f354f3391d memory: Have memory_region_init_resizeable_ram() return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_resizeable_ram
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-12-philmd@linaro.org>
(cherry picked from commit f25a9fbb64)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:19:14 +01:00
Philippe Mathieu-Daudé
12bc0f50e7 memory: Have memory_region_init_rom_device() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_rom_device
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-11-philmd@linaro.org>
(cherry picked from commit 62f5c1b234)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:18:56 +01:00
Philippe Mathieu-Daudé
2917525f2a memory: Simplify memory_region_init_rom_device_nomigrate() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, arg5, arg6, errp;
@@
-   memory_region_init_rom_device_nomigrate(mr, owner, arg3, arg4, arg5, arg6, &errp);
    if (
-       errp
+       !memory_region_init_rom_device_nomigrate(mr, owner, arg3, arg4, arg5, arg6, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-10-philmd@linaro.org>
(cherry picked from commit bd3aa06950)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:18:37 +01:00
Philippe Mathieu-Daudé
35d3129006 memory: Have memory_region_init_rom_device_nomigrate() return a boolean
Following the example documented since commit e3fe3988d7
("error: Document Error API usage rules"), have
memory_region_init_rom_device_nomigrate() return a boolean
indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-9-philmd@linaro.org>
(cherry picked from commit ae076b6c39)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:18:18 +01:00
Philippe Mathieu-Daudé
df6b2acc8b memory: Have memory_region_init_rom() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_rom()
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-8-philmd@linaro.org>
(cherry picked from commit b9159451d3)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:17:58 +01:00
Philippe Mathieu-Daudé
7899521f1c memory: Have memory_region_init_ram() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_ram()
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-7-philmd@linaro.org>
(cherry picked from commit fe5f33d6b0)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:17:39 +01:00
Philippe Mathieu-Daudé
ac7d60bdd8 memory: Simplify memory_region_init_ram_from_fd() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, arg5, arg6, arg7, errp;
@@
-   memory_region_init_ram_from_fd(mr, owner, arg3, arg4, arg5, arg6, arg7, &errp);
    if (
-       errp
+       !memory_region_init_ram_from_fd(mr, owner, arg3, arg4, arg5, arg6, arg7, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-6-philmd@linaro.org>
(cherry picked from commit d3143bd531)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:17:21 +01:00
Philippe Mathieu-Daudé
6aaba1df5d memory: Simplify memory_region_init_rom_nomigrate() calls
Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, errp;
@@
-   memory_region_init_rom_nomigrate(mr, owner, arg3, arg4, &errp);
    if (
-       errp
+       !memory_region_init_rom_nomigrate(mr, owner, arg3, arg4, &errp)
    ) {
        ...
        return;
    }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-5-philmd@linaro.org>
(cherry picked from commit fd7549ee13)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:16:58 +01:00
Philippe Mathieu-Daudé
7d33a9126a memory: Have memory_region_init_rom_nomigrate() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_rom_nomigrate
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-4-philmd@linaro.org>
[PMD: Only update 'readonly' field on success (Manos Pitsidianakis)]
Message-Id: <af352e7d-3346-4705-be77-6eed86858d18@linaro.org>
(cherry picked from commit 197faa7006)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:16:37 +01:00
Philippe Mathieu-Daudé
40e2695f58 memory: Have memory_region_init_ram_nomigrate() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_ram_nomigrate
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-3-philmd@linaro.org>
(cherry picked from commit 62c19b72c7)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:16:18 +01:00
Philippe Mathieu-Daudé
e10e0a7eb4 memory: Have memory_region_init_ram_flags_nomigrate() return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have memory_region_init_ram_nomigrate
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-2-philmd@linaro.org>
(cherry picked from commit cbbc434023)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 22:15:59 +01:00
William Roche
476810b8e9 migration: prevent migration when VM has poisoned memory
A memory page poisoned from the hypervisor level is no longer readable.
The migration of a VM will crash Qemu when it tries to read the
memory address space and stumbles on the poisoned page with a similar
stack trace:

Program terminated with signal SIGBUS, Bus error.

To avoid this VM crash during the migration, prevent the migration
when a known hardware poison exists on the VM.

Signed-off-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20240130190640.139364-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 06152b89db)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 18:11:59 +01:00
Tianrui Zhao
e3df38cf80 linux-headers: Synchronize linux headers from linux v6.7.0-rc8
Use the scripts/update-linux-headers.sh to synchronize linux
headers from linux v6.7.0-rc8. We mainly want to add the
loongarch linux headers and then add the loongarch kvm support
based on it.

Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Acked-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240105075804.1228596-2-zhaotianrui@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
(cherry picked from commit 5817db6890)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 17:55:41 +01:00
Daniel Henrique Barboza
625bf5ceef linux-headers: riscv: add ptrace.h
KVM vector support for RISC-V requires the linux-header ptrace.h.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218204321.75757-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 1583ca8aa6)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 17:54:52 +01:00
Daniel Henrique Barboza
18e4e0be1f linux-headers: Update to Linux v6.7-rc5
We'll add a new RISC-V linux-header file, but first let's update all
headers.

Headers for 'asm-loongarch' were added in this update.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231218204321.75757-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit efb91426af)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 17:52:56 +01:00
Akihiko Odaki
158c1626b7 hw/nvme: Use pcie_sriov_num_vfs() (bsc#1220065, CVE-2024-26328)
nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
configurations to know the number of VFs being disabled due to SR-IOV
configuration writes, but the logic was flawed and resulted in
out-of-bound memory access.

It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
VFs, but it actually doesn't in the following cases:
- PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
- PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
- VFs were only partially enabled because of realization failure.

It is a responsibility of pcie_sriov to interpret SR-IOV configurations
and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
provides, to get the number of enabled VFs before and after SR-IOV
configuration writes.

Cc: qemu-stable@nongnu.org
Fixes: CVE-2024-26328
Fixes: 11871f53ef ("hw/nvme: Add support for the Virtualization Management command")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-1-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 91bb64a8d2)
References: bsc#1220065
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-25 17:40:04 +01:00
aad9841219 [openSUSE] Update version to 8.2.2
Update to latest stable release (8.2.2).

Full changelog here:
 https://lore.kernel.org/qemu-devel/1709577077.783602.1474596.nullmailer@tls.msk.ru/

Upstream backports:
 chardev/char-socket: Fix TLS io channels sending too much data to the backend
 tests/unit/test-util-sockets: Remove temporary file after test
 hw/usb/bus.c: PCAP adding 0xA in Windows version
 hw/intc/Kconfig: Fix GIC settings when using "--without-default-devices"
 gitlab: force allow use of pip in Cirrus jobs
 tests/vm: avoid re-building the VM images all the time
 tests/vm: update openbsd image to 7.4
 target/i386: leave the A20 bit set in the final NPT walk
 target/i386: remove unnecessary/wrong application of the A20 mask
 target/i386: Fix physical address truncation
 target/i386: check validity of VMCB addresses
 target/i386: mask high bits of CR3 in 32-bit mode
 pl031: Update last RTCLR value on write in case it's read back
 hw/nvme: fix invalid endian conversion
 update edk2 binaries to edk2-stable202402
 update edk2 submodule to edk2-stable202402
 target/ppc: Fix crash on machine check caused by ifetch
 target/ppc: Fix lxv/stxv MSR facility check
 .gitlab-ci.d/windows.yml: Drop msys2-32bit job
 system/vl: Update description for input grab key
 docs/system: Update description for input grab key
 hw/hppa/Kconfig: Fix building with "configure --without-default-devices"
 tests/qtest: Depend on dbus_display1_dep
 meson: Explicitly specify dbus-display1.h dependency
 audio: Depend on dbus_display1_dep
 ui/console: Fix console resize with placeholder surface
 ui/clipboard: add asserts for update and request
 ui/clipboard: mark type as not available when there is no data
 ui: reject extended clipboard message if not activated
 target/i386: Generate an illegal opcode exception on cmp instructions with lock prefix
 i386/cpuid: Move leaf 7 to correct group
 i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F
 i386/cpu: Mask with XCR0/XSS mask for FEAT_XSAVE_XCR0_HI and FEAT_XSAVE_XSS_HI leafs
 i386/cpu: Clear FEAT_XSAVE_XSS_LO/HI leafs when CPUID_EXT_XSAVE is not available
 .gitlab-ci/windows.yml: Don't install libusb or spice packages on 32-bit
 iotests: Make 144 deterministic again
 target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
 target/arm: Fix SVE/SME gross MTE suppression checks
 target/arm: Handle mte in do_ldrq, do_ldro
 target/arm: Split out make_svemte_desc
 target/arm: Adjust and validate mtedesc sizem1
 target/arm: Fix nregs computation in do_{ld,st}_zpa
 linux-user/aarch64: Choose SYNC as the preferred MTE mode
 tests/acpi: Update DSDT.cxl to reflect change _STA return value.
 hw/i386: Fix _STA return value for ACPI0017
 tests/acpi: Allow update of DSDT.cxl
 smmu: Clear SMMUPciBus pointer cache when system reset
 virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
 virtio-gpu: Correct virgl_renderer_resource_get_info() error check
 hw/cxl: Pass CXLComponentState to cache_mem_ops
 hw/cxl/device: read from register values in mdev_reg_read()
 cxl/cdat: Fix header sum value in CDAT checksum
 cxl/cdat: Handle cdat table build errors
 vhost-user.rst: Fix vring address description
 tcg/arm: Fix goto_tb for large translation blocks
 tcg: Increase width of temp_subindex
 hw/net/tulip: add chip status register values
 hw/smbios: Fix port connector option validation
 hw/smbios: Fix OEM strings table option validation
 configure: run plugin TCG tests again
 tests/docker: Add sqlite3 module to openSUSE Leap container
 hw/riscv/virt-acpi-build.c: fix leak in build_rhct()
 migration: Fix logic of channels and transport compatibility check
 virtio-blk: avoid using ioeventfd state in irqfd conditional
 virtio: Re-enable notifications after drain
 virtio-scsi: Attach event vq notifier with no_poll
 iotests: give tempdir an identifying name
 iotests: fix leak of tmpdir in dry-run mode
 hw/scsi/lsi53c895a: add missing decrement of reentrancy counter
 linux-user/aarch64: Add padding before __kernel_rt_sigreturn
 tcg/loongarch64: Set vector registers call clobbered
 pci-host: designware: Limit value range of iATU viewport register
 target/arm: Reinstate "vfp" property on AArch32 CPUs
 qemu-options.hx: Improve -serial option documentation
 system/vl.c: Fix handling of '-serial none -serial something'
 target/arm: fix exception syndrome for AArch32 bkpt insn
 block/blkio: Make s->mem_region_alignment be 64 bits
 qemu-docs: Update options for graphical frontends
 Make 'uri' optional for migrate QAPI
 vfio/pci: Clear MSI-X IRQ index always
 migration: Fix use-after-free of migration state object
 migration: Plug memory leak on HMP migrate error path

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 21:21:17 +01:00
Harsh Prateek Bora
921de6788a ppc/spapr: Initialize max_cpus limit to SPAPR_IRQ_NR_IPIS.
Initialize the machine specific max_cpus limit as per the maximum range
of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
assert in tcg_region_init or spapr_xive_claim_irq.

Logs:

Without patch fix:

[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: IRQ 4096 is not free
[root@host build]#

On LPAR:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
**
ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Aborted (core dumped)
[root@host build]#

On x86:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
Assertion `lisn < xive->nr_irqs' failed.
Aborted (core dumped)
[root@host build]#

With patch fix:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
machine 'pseries-8.2' is 4096
[root@host build]#

Reported-by: Kowshik Jois <kowsjois@linux.ibm.com>
Tested-by: Kowshik Jois <kowsjois@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
(cherry picked from commit c4f91d7b7b)
References: bsc#1220310
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
Harsh Prateek Bora
278898483e ppc/spapr: Introduce SPAPR_IRQ_NR_IPIS to refer IRQ range for CPU IPIs.
spapr_irq_init currently uses existing macro SPAPR_XIRQ_BASE to refer to
the range of CPU IPIs during initialization of nr-irqs property.
It is more appropriate to have its own define which can be further
reused as appropriate for correct interpretation.

Suggested-by: Cedric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Kowshik Jois <kowsjois@linux.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
(cherry picked from commit 2df5c1f5b0)
References: bsc#1220310
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
25e7bc99cc [openSUSE]: Increase default phys bits to 42, if host supports that
We wanted QEMU to support larger VMs (in therm of RAM size) by default
and we therefore introduced patch "[openSUSE] increase x86_64 physical
bits to 42". This, however, means that we create VMs with 42 bits of
physical address space even on hosts that only has, say, 40. And that
can't work.

In fact, it has been a problem since a long time (e.g., bsc#1205978) and
it's also the actual root cause of bsc#1219977.

Get rid of that old patch, in favor of a new one that still raise the
default number of address bits to 42, but only on hosts that supports
that.

This means that we can also use the proper SeaBIOS version, without
reverting commits that were only a problem due to our broken downstream
patch.

We probably aslo don't need to ship some of the custom ACPI tables (for
passing tests), but we'll actually remove them later, after double
checking properly that all the tests do work.

References: bsc#1205978
References: bsc#1219977
References: bsc#1220799
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
adabbbb494 [openSUSE][RPM] Cosmetic fixes to spec files (copyright, sorting, etc)
Update the copyright year to 2024, sort dependencies etc.

This way, 'osc' does not have to do these changes all the times (they're
automatic, so no big deal, but it's annoying to see them in the diffs of
all the requests).

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
c10c66077b [openSUSE] roms/seabios: Drop an old (and no longer necessary) downstream patch
Drop the patch "[openSUSE] build: be explicit about -mx86-used-note=no"
from SeaBIOS.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
4deb258e2e [openSUSE][RPM] Update to latest stable versio (8.2.1)
Backported commits:
 * Update version for 8.2.1 release
 * target/arm: Fix incorrect aa64_tidcp1 feature check
 * target/arm: Fix A64 scalar SQSHRN and SQRSHRN
 * target/xtensa: fix OOB TLB entry access
 * qtest: bump aspeed_smc-test timeout to 6 minutes
 * monitor: only run coroutine commands in qemu_aio_context
 * iotests: port 141 to Python for reliable QMP testing
 * iotests: add filter_qmp_generated_node_ids()
 * block/blklogwrites: Fix a bug when logging "write zeroes" operations.
 * virtio-net: correctly copy vnet header when flushing TX (bsc#1218484, CVE-2023-6693)
 * tcg/arm: Fix SIGILL in tcg_out_qemu_st_direct
 * linux-user/riscv: Adjust vdso signal frame cfa offsets
 * linux-user: Fixed cpu restore with pc 0 on SIGBUS
 * block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_status
 * coroutine-ucontext: Save fake stack for pooled coroutine
 * tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
 * accel/tcg: Revert mapping of PCREL translation block to multiple virtual addresses
 * acpi/tests/avocado/bits: wait for 200 seconds for SHUTDOWN event from bits VM
 * s390x/pci: drive ISM reset from subsystem reset
 * s390x/pci: refresh fh before disabling aif
 * s390x/pci: avoid double enable/disable of aif
 * hw/scsi/esp-pci: set DMA_STAT_BCMBLT when BLAST command issued
 * hw/scsi/esp-pci: synchronise setting of DMA_STAT_DONE with ESP completion interrupt
 * hw/scsi/esp-pci: generate PCI interrupt from separate ESP and PCI sources
 * hw/scsi/esp-pci: use correct address register for PCI DMA transfers
 * migration/rdma: define htonll/ntohll only if not predefined
 * hw/pflash: implement update buffer for block writes
 * hw/pflash: use ldn_{be,le}_p and stn_{be,le}_p
 * hw/pflash: refactor pflash_data_write()
 * backends/cryptodev: Do not ignore throttle/backends Errors
 * target/i386: pcrel: store low bits of physical address in data[0]
 * target/i386: fix incorrect EIP in PC-relative translation blocks
 * target/i386: Do not re-compute new pc with CF_PCREL
 * load_elf: fix iterator's type for elf file processing
 * target/hppa: Update SeaBIOS-hppa to version 15
 * target/hppa: Fix IOR and ISR on error in probe
 * target/hppa: Fix IOR and ISR on unaligned access trap
 * target/hppa: Export function hppa_set_ior_and_isr()
 * target/hppa: Avoid accessing %gr0 when raising exception
 * hw/hppa: Move software power button address back into PDC
 * target/hppa: Fix PDC address translation on PA2.0 with PSW.W=0
 * hw/pci-host/astro: Add missing astro & elroy registers for NetBSD
 * hw/hppa/machine: Disable default devices with --nodefaults option
 * hw/hppa/machine: Allow up to 3840 MB total memory
 * readthodocs: fully specify a build environment
 * .gitlab-ci.d/buildtest.yml: Work around htags bug when environment is large
 * target/s390x: Fix LAE setting a wrong access register
 * tests/qtest/virtio-ccw: Fix device presence checking
 * tests/acpi: disallow tests/data/acpi/virt/SSDT.memhp changes
 * tests/acpi: update expected data files
 * edk2: update binaries to git snapshot
 * edk2: update build config, set PcdUninstallMemAttrProtocol = TRUE.
 * edk2: update to git snapshot
 * tests/acpi: allow tests/data/acpi/virt/SSDT.memhp changes
 * util: fix build with musl libc on ppc64le
 * tcg/ppc: Use new registers for LQ destination
 * hw/intc/arm_gicv3_cpuif: handle LPIs in in the list registers
 * hw/vfio: fix iteration over global VFIODevice list
 * vfio/container: Replace basename with g_path_get_basename
 * edu: fix DMA range upper bound check
 * hw/net: cadence_gem: Fix MDIO_OP_xxx values
 * audio/audio.c: remove trailing newline in error_setg
 * chardev/char.c: fix "abstract device type" error message
 * target/riscv: Fix mcycle/minstret increment behavior
 * hw/net/can/sja1000: fix bug for single acceptance filter and standard frame
 * target/i386: the sgx_epc_get_section stub is reachable
 * configure: use a native non-cross compiler for linux-user
 * include/ui/rect.h: fix qemu_rect_init() mis-assignment
 * target/riscv/kvm: do not use non-portable strerrorname_np()
 * iotests: Basic tests for internal snapshots
 * vl: Improve error message for conflicting -incoming and -loadvm
 * block: Fix crash when loading snapshot on inactive node

References: bsc#1218484 (CVE-2023-6693)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
7399ffaeab [openSUSE][RPM] factor common definitions between qemu and qemu-linux-user spec files
Simplify both the spec files, by factoring common definitions.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
942d785c74 [openSUSE][RPM] Install the VGA module "more often" (bsc#1219164)
Depending on the VM configuration (both at the VM definition level and
on the guest itself) a VGA console might be necessary, or weird lockup
will occur. Since the VGA module package is smalle enough, add a
dependency for it, from other display modules, to act as a workaround.

While there, make more explicit and precise the dependencies between all
the various modules, by specifying that they should all have the same
version and release.

References: bsc#1219164
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
d133e7e5fe [openSUSE][RPM] Create the legacy qemu-kvm symlink for all arches
Historically, KVM was available only for x86 and s390, and was invoked
via a binary called 'kvm' or 'qemu-kvm'. For a while, we've shipped a
package that was making it possible to invoke QEMU like that, but only
for these two arches. This, however, created a lot of confusion and
dependencies issues.

Fix them by creating a symlink from 'qemu-kvm' to the proper binary on
all arches and by making the main QEMU package Providing and Obsoleting
(also on all arches) the old qemu-kvm one.

Note that, for RISCV, the qemu-system-riscv64 binary, to which the symlink
should point, is in the qemu-extra package. However, if we are on RISCV,
qemu-extra is an hard dependency of qemu. Therefore, it's fine to ship
the link and also set the Provides: and Obsoletes: tag in the qemu
package itself. It'd be more correct to do that in the qemu-extra
package, of course, but this would complicate the spec file and it's not
worth it, considering this is all legacy and should very well go away
soon.

References: bsc#1218684
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
1a0440ded9 [openSUSE][RPM] spec: allow building without spice
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
39ccf1d7f0 [openSUSE] Update ipxe submodule reference (bsc#1219733, bsc#1219722)
Add to the ipxe submodule the commit (and all its dependencies) for
fixing building with binutils 2.42

References: bsc#1219733
References: bsc#1219722
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
5aec4c3b9e [openSUSE][RPM] Disable test-crypto-secret in linux-user build 2024-03-14 18:26:10 +01:00
Fabian Vogt
b19143deea [openSUSE][RPM] Fix enabling features on non-x86_64
The %endif was in the wrong place, so on non-x86_64, most features were
disabled.
2024-03-14 18:26:10 +01:00
5224af8952 [openSUSE] Update submodule references for 8.2.0
Point the submodules to the repositories that host our downstream
patches:

* roms/seabios
 - [openSUSE] switch to python3 as needed
 - [openSUSE] build: enable cross compilation on ARM
 - [openSUSE] build: be explicit about -mx86-used-note=no
* roms/SLOF
 - Allow to override build date with SOURCE_DATE_EPOCH
* roms/ipxe
 - [ath5k] Add missing AR5K_EEPROM_READ in ath5k_eeprom_read_turbo_modes
 - [openSUSE] [build] Makefile: fix issues of build reproducibility
 - [openSUSE] [test] help compiler out by initializing array[openSUSE]
 - [openSUSE] [build] Silence GCC 12 spurious warnings
 - [librm] Use explicit operand size when pushing a label address
* roms/skiboot
 - [openSUSE] Makefile: define endianess for cross-building on aarch64
 - [openSUSE] Make Sphinx build reproducible (boo#1102408)
* roms/qboot
 - [openSUSE] add cross.ini file to handle aarch64 based build

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
f1eb2880a5 [openSUSE][RPM] Update version to 8.2
Update to latest upstream release.

The full list of changes are available at:

  https://wiki.qemu.org/ChangeLog/8.2

Highlights include:
 * New virtio-sound device emulation
 * New virtio-gpu rutabaga device emulation used by Android emulator
 * New hv-balloon for dynamic memory protocol device for Hyper-V guests
 * New Universal Flash Storage device emulation
 * Network Block Device (NBD) 64-bit offsets for improved performance
 * dump-guest-memory now supports the standard kdump format
 * ARM: Xilinx Versal board now models the CFU/CFI, and the TRNG device
 * ARM: CPU emulation support for cortex-a710 and neoverse-n2
 * ARM: architectural feature support for PACQARMA3, EPAC, Pauth2, FPAC,
   FPACCOMBINE, TIDCP1, MOPS, HBC, and HPMN0
 * HPPA: CPU emulation support for 64-bit PA-RISC 2.0
 * HPPA: machine emulation support for C3700, including Astro memory
   controller and four Elroy PCI bridges
 * LoongArch: ISA support for LASX extension and PRELDX instruction
 * LoongArch: CPU emulation support for la132
 * RISC-V: ISA/extension support for AIA virtualization support via KVM,
   and vector cryptographic instructions
 * RISC-V: Numerous extension/instruction cleanups, fixes, and reworks
 * s390x: support for vfio-ap passthrough of crypto adapter for
   protected
   virtualization guests
 * Tricore: support for TC37x CPU which implements ISA v1.6.2
 * Tricore: support for CRCN, FTOU, FTOHP, and HPTOF instructions
 * x86: Zen support for PV console and network devices

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
51bf974b6a [openSUSE] Downstream note about the graph lock
In 8.2, the upstream added graph lock annotations that conflict with
our coroutine patches. The GRAPH_RDLOCK_GUARD_MAINLOOP is now asserted
at some places where we turned the code into a coroutine.

My understanding is that the annotation merely asserts that the code
is running in the main loop and serves effectively as a marker of
where the graph lock should be taken if the code were to be moved out
of the main loop. That's exactly what we're doing so it should be safe
to replace the main loop check with an actual rdlock/rdunlock pair.

Commits e7e801a2a7, 94d03a7425 and c89144c058 replace
GRAPH_RDLOCK_GUARD_MAINLOOP() with GRAPH_RDLOCK_GUARD() at the
bdrv_snapshot_list, bdrv_named_nodes_list and qmp_query_block
functions. The changes are split into those patches instead of here to
preserve bisectability.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
João Silva
a0aa455b00 [openSUSE] block: Add a thread-pool version of fstat (bsc#1211000)
The fstat call can take a long time to finish when running over
NFS. Add a version of it that runs in the thread pool.

Adapt one of its users, raw_co_get_allocated_file size to use the new
version. That function is called via QMP under the qemu_global_mutex
so it has a large chance of blocking VCPU threads in case it takes too
long to finish.

Signed-off-by: João Silva <jsilva@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
bee1f08f69 [openSUSE] block: Convert qmp_query_block() to coroutine_fn (bsc#1211000)
This is another caller of bdrv_get_allocated_file_size() that needs to
be converted to a coroutine because that function will be made
asynchronous when called (indirectly) from the QMP dispatcher.

This QMP command is a candidate because it calls bdrv_do_query_node_info(),
which in turn calls bdrv_get_allocated_file_size().

We've determined bdrv_do_query_node_info() to be coroutine-safe (see
previous commits), so we can just put this QMP command in a coroutine.

Since qmp_query_block() now expects to run in a coroutine, its callers
need to be converted as well. Convert hmp_info_block(), which calls
only coroutine-safe code, including qmp_query_named_block_nodes()
which has been converted to coroutine in the previous patches.

Now that all callers of bdrv_[co_]block_device_info() are using the
coroutine version, a few things happen:

 - we can return to using bdrv_block_device_info() without a wrapper;

 - bdrv_get_allocated_file_size() can stop being mixed;

 - bdrv_co_get_allocated_file_size() needs to be put under the graph
   lock because it is being called wthout the wrapper;

 - bdrv_do_query_node_info() doesn't need to acquire the AioContext
   because it doesn't call aio_poll anymore;

Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
[relax main loop requirement at qmp_query_block]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
fdf023d52d [openSUSE] block: Don't query all block devices at hmp_nbd_server_start (bsc#1211000)
We're currently doing a full query-block just to enumerate the devices
for qmp_nbd_server_add and then discarding the BlockInfoList
afterwards. Alter hmp_nbd_server_start to instead iterate explicitly
over the block_backends list.

This allows the removal of the dependency on qmp_query_block from
hmp_nbd_server_start. This is desirable because we're about to move
qmp_query_block into a coroutine and don't need to change the NBD code
at the same time.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
84e049d7ae [openSUSE] block: Convert qmp_query_named_block_nodes to coroutine (bsc#1211000)
We're converting callers of bdrv_get_allocated_file_size() to run in
coroutines because that function will be made asynchronous when called
(indirectly) from the QMP dispatcher.

This QMP command is a candidate because it indirectly calls
bdrv_get_allocated_file_size() through bdrv_block_device_info() ->
bdrv_query_image_info() -> bdrv_query_image_info().

The previous patches have determined that bdrv_query_image_info() and
bdrv_do_query_node_info() are coroutine-safe so we can just make the
QMP command run in a coroutine.

Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
[relax the main loop requirement at bdrv_named_nodes_list]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
52015d0078 [openSUSE] block: Convert bdrv_block_device_info into co_wrapper (bsc#1211000)
We're converting callers of bdrv_get_allocated_file_size() to run in
coroutines because that function will be made asynchronous when called
(indirectly) from the QMP dispatcher.

This function is a candidate because it calls bdrv_query_image_info()
-> bdrv_do_query_node_info() -> bdrv_get_allocated_file_size().

It is safe to turn this is a coroutine because the code it calls is
made up of either simple accessors and string manipulation functions
[1] or it has already been determined to be safe [2].

1) bdrv_refresh_filename(), bdrv_is_read_only(),
   blk_enable_write_cache(), bdrv_cow_bs(), blk_get_public(),
   throttle_group_get_name(), bdrv_write_threshold_get(),
   bdrv_query_dirty_bitmaps(), throttle_group_get_config(),
   bdrv_filter_or_cow_bs(), bdrv_skip_implicit_filters()

2) bdrv_do_query_node_info() (see previous commit);

Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
9eef8908d2 [openSUSE] block: Convert bdrv_query_block_graph_info to coroutine (bsc#1211000)
We're converting callers of bdrv_get_allocated_file_size() to run in
coroutines because that function will be made asynchronous when called
(indirectly) from the QMP dispatcher.

This function is a candidate because it calls bdrv_do_query_node_info(),
which in turn calls bdrv_get_allocated_file_size().

All the functions called from bdrv_do_query_node_info() onwards are
coroutine-safe, either have a coroutine version themselves[1] or are
mostly simple code/string manipulation[2].

1) bdrv_getlength(), bdrv_get_allocated_file_size(), bdrv_get_info(),
   bdrv_get_specific_info();

2) bdrv_refresh_filename(), bdrv_get_format_name(),
   bdrv_get_full_backing_filename(), bdrv_query_snapshot_info_list();

Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
[relax the main loop requirement at bdrv_snapshot_list]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
ef1a23f0a5 [openSUSE] block: Temporarily mark bdrv_co_get_allocated_file_size as mixed (bsc#1211000)
Some callers of this function are about to be converted to run in
coroutines, so allow it to be executed both inside and outside a
coroutine while we convert all the callers.

This will be reverted once all callers of bdrv_do_query_node_info run
in a coroutine.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
Fabiano Rosas
b29e7f9aeb [openSUSE] block: Allow the wrapper script to see functions declared in qapi.h (bsc#1211000)
The following patches will add co_wrapper annotations to functions
declared in qapi.h. Add that header to the set of files used by
block-coroutine-wrapper.py.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-03-14 18:26:10 +01:00
83389eed7e [openSUSE][RPM] Restrict canokey to openSUSE only
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:10 +01:00
86ffcc5e3a [openSUSE][RPM] Fix virtiofsd dependency on 32 bit systems
And make the switch more general, as we now have multiple
instances of it.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Ludwig Nussel
bfa9615104 [openSUSE][RPM] Add support for canokeys (boo#1217520) 2024-03-14 18:26:09 +01:00
ab15e51a01 [openSUSE][RPM] Disable Xen support in ALP-based distros
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
06877cdc3d [openSUSE][RPM] Some more refinements of inter-subpackage dependencies
Add some block drivers and virtiofsd as hard dependencies of the
qemu-headless package, to make sure it's really useful for headless
server environments (even when recommended packages are not installed).

Singed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
cd664a9abb [openSUSE][RPM] Normalize hostname, for reproducible builds
Use a fixed USER value (in case someone builds outside of OBS/osc).

References: boo#1084909
Signed-off-by: Bernhard M. Wiedemann <githubbmwprimary@lsmod.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
33d19cf4e1 [openSUSE][RPM] New subpackage, for SPICE
Define a new sub-(meta-)package that can be installed for having
all the other modules and packages necessary for SPICE to work.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
37cc99737a [openSUSE] Update version to 8.1.3
Align to upstream stable release. It includes many of the patches we had
backported ourself, to fix bugs and issues, plus more.

See here for details:
- https://lore.kernel.org/qemu-devel/1700589639.257680.3420728.nullmailer@tls.msk.ru/
- https://gitlab.com/qemu-project/qemu/-/commits/stable-8.1?ref_type=heads

An (incomplete!) list of such backports is:
 * Update version for 8.1.3 release
 * hw/mips: LOONGSON3V depends on UNIMP device
 * target/arm: HVC at EL3 should go to EL3, not EL2
 * s390x/pci: only limit DMA aperture if vfio DMA limit reported
 * target/riscv/kvm: support KVM_GET_REG_LIST
 * target/riscv/kvm: improve 'init_multiext_cfg' error msg
 * tracetool: avoid invalid escape in Python string
 * tests/tcg/s390x: Test LAALG with negative cc_src
 * target/s390x: Fix LAALG not updating cc_src
 * tests/tcg/s390x: Test CLC with inaccessible second operand
 * target/s390x: Fix CLC corrupting cc_src
 * tests/qtest: ahci-test: add test exposing reset issue with pending callback
 * hw/ide: reset: cancel async DMA operation before resetting state
 * target/mips: Fix TX79 LQ/SQ opcodes
 * target/mips: Fix MSA BZ/BNZ opcodes displacement
 * ui/gtk-egl: apply scale factor when calculating window's dimension
 * ui/gtk: force realization of drawing area
 * ati-vga: Implement fallback for pixman routines
 * ...

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
31572a8b57 [openSUSE] Make Sphinx build reproducible (boo#1102408)
Avoid parallel processing in sphinx because that causes variations in
generated files

This is addressed here, with a downstream patch, until a proper solution
is found upstream.

Signed-off-by: Bernhard Wiedemann <bwiedemann@suse.com>
References: boo#1102408
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
caaba8677b [openSUSE] supportconfig: Adapt plugin to modern supportconfig
The supportconfig 'scplugin.rc' file is deprecated in favor of
supportconfig.rc'. Adapt the qemu plugin to the new scheme.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
5d3d60ccc5 [openSUSE] Add -p1 to autosetup in spec files
Our workflow does not include patches in the spec files. Still, it could
be useful to add some there, during development and/or debugging issues.

Make sure that they are applied properly, by adding -p1 to the
%autosetup directive (it's a nop if there are no patches, so both cases
are ok).

Suggested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
71c72823d0 [openSUSE] Update version to 8.1.2
This fixes the following upstream issues:
 * https://gitlab.com/qemu-project/qemu/-/issues/1826
 * https://gitlab.com/qemu-project/qemu/-/issues/1834
 * https://gitlab.com/qemu-project/qemu/-/issues/1846

It also contains a fix for:
 * CVE-2023-42467 (bsc#1215192)

As well as several upstream backports:
 * target/riscv: Fix vfwmaccbf16.vf
 * disas/riscv: Fix the typo of inverted order of pmpaddr13 and pmpaddr14
 * roms: use PYTHON to invoke python
 * hw/audio/es1370: reset current sample counter
 * migration/qmp: Fix crash on setting tls-authz with null
 * util/log: re-allow switching away from stderr log file
 * vfio/display: Fix missing update to set backing fields
 * amd_iommu: Fix APIC address check
 * vdpa net: follow VirtIO initialization properly at cvq isolation probing
 * vdpa net: stop probing if cannot set features
 * vdpa net: fix error message setting virtio status
 * vdpa net: zero vhost_vdpa iova_tree pointer at cleanup
 * linux-user/hppa: Fix struct target_sigcontext layout
 * chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
 * hw/display/ramfb: plug slight guest-triggerable leak on mode setting
 * win32: avoid discarding the exception handler
 * target/i386: fix memory operand size for CVTPS2PD
 * target/i386: generalize operand size "ph" for use in CVTPS2PD
 * subprojects/berkeley-testfloat-3: Update to fix a problem with compiler warnings
 * scsi-disk: ensure that FORMAT UNIT commands are terminated
 * esp: restrict non-DMA transfer length to that of available data
 * esp: use correct type for esp_dma_enable() in sysbus_esp_gpio_demux()
 * optionrom: Remove build-id section
 * target/tricore: Fix RCPW/RRPW_INSERT insns for width = 0
 * accel/tcg: Always require can_do_io
 * accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
 * accel/tcg: Improve setting of can_do_io at start of TB
 * accel/tcg: Track current value of can_do_io in the TB
 * accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
 * accel/tcg: Avoid load of icount_decr if unused
 * softmmu: Use async_run_on_cpu in tcg_commit
 * migration: Move return path cleanup to main migration thread
 * migration: Replace the return path retry logic
 * migration: Consolidate return path closing code
 * migration: Remove redundant cleanup of postcopy_qemufile_src
 * migration: Fix possible race when shutting down to_dst_file
 * migration: Fix possible races when shutting down the return path
 * migration: Fix possible race when setting rp_state.error
 * migration: Fix race that dest preempt thread close too early
 * ui/vnc: fix handling of VNC_FEATURE_XVP
 * ui/vnc: fix debug output for invalid audio message
 * hw/scsi/scsi-disk: Disallow block sizes smaller than 512 [CVE-2023-42467]
 * accel/tcg: mttcg remove false-negative halted assertion
 * meson.build: Make keyutils independent from keyring
 * target/arm: Don't skip MTE checks for LDRT/STRT at EL0
 * hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel
 * include/exec: Widen tlb_hit/tlb_hit_page()
 * tests/file-io-error: New test
 * file-posix: Simplify raw_co_prw's 'out' zone code
 * file-posix: Fix zone update in I/O error path
 * file-posix: Check bs->bl.zoned for zone info
 * file-posix: Clear bs->bl.zoned on error
 * hw/cxl: Fix out of bound array access
 * hw/cxl: Fix CFMW config memory leak
 * linux-user/hppa: lock both words of function descriptor
 * linux-user/hppa: clear the PSW 'N' bit when delivering signals
 * hw/ppc: Read time only once to perform decrementer write
 * hw/ppc: Reset timebase facilities on machine reset
 * hw/ppc: Always store the decrementer value
 * target/ppc: Sign-extend large decrementer to 64-bits
 * hw/ppc: Avoid decrementer rounding errors
 * hw/ppc: Round up the decrementer interval when converting to ns
 * host-utils: Add muldiv64_round_up

Signed-of-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
ed0d90bb5f [openSUSE] Update to version 8.1.1
This includes the following commits:

 * tpm: fix crash when FD >= 1024 and unnecessary errors due to EINTR (Marc-André Lureau)
 * meson: Fix targetos match for illumos and Solaris. (Jonathan Perkin)
 * s390x/ap: fix missing subsystem reset registration (Janosch Frank)
 * ui: fix crash when there are no active_console (Marc-André Lureau)
 * virtio-gpu/win32: set the destroy function on load (Marc-André Lureau)
 * target/riscv: Allocate itrigger timers only once (Akihiko Odaki)
 * target/riscv/pmp.c: respect mseccfg.RLB for pmpaddrX changes (Leon Schuermann)
 * target/riscv: fix satp_mode_finalize() when satp_mode.supported = 0 (Daniel Henrique Barboza)
 * hw/riscv: virt: Fix riscv,pmu DT node path (Conor Dooley)
 * linux-user/riscv: Use abi type for target_ucontext (LIU Zhiwei)
 * hw/intc: Make rtc variable names consistent (Jason Chien)
 * hw/intc: Fix upper/lower mtime write calculation (Jason Chien)
 * target/riscv: Fix zfa fleq.d and fltq.d (LIU Zhiwei)
 * target/riscv: Fix page_check_range use in fault-only-first (LIU Zhiwei)
 * target/riscv/cpu.c: add zmmul isa string (Daniel Henrique Barboza)
 * hw/char/riscv_htif: Fix the console syscall on big endian hosts (Thomas Huth)
 * hw/char/riscv_htif: Fix printing of console characters on big endian hosts (Thomas Huth)
 * arm64: Restore trapless ptimer access (Colton Lewis)
 * virtio: Drop out of coroutine context in virtio_load() (Kevin Wolf)
 * qxl: don't assert() if device isn't yet initialized (Marc-André Lureau)
 * hw/net/vmxnet3: Fix guest-triggerable assert() (Thomas Huth)
 * docs tests: Fix use of migrate_set_parameter (Markus Armbruster)
 * qemu-options.hx: Rephrase the descriptions of the -hd* and -cdrom options (Thomas Huth)
 * hw/i2c/aspeed: Fix TXBUF transmission start position error (Hang Yu)
 * hw/i2c/aspeed: Fix Tx count and Rx size error in buffer pool mode (Hang Yu)
 * hw/ide/ahci: fix broken SError handling (Niklas Cassel)
 * hw/ide/ahci: fix ahci_write_fis_sdb() (Niklas Cassel)
 * hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set (Niklas Cassel)
 * hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared (Niklas Cassel)
 * hw/ide/ahci: simplify and document PxCI handling (Niklas Cassel)
 * hw/ide/ahci: write D2H FIS when processing NCQ command (Niklas Cassel)
 * hw/ide/core: set ERR_STAT in unsupported command completion (Niklas Cassel)
 * target/ppc: Fix LQ, STQ register-pair order for big-endian (Nicholas Piggin)
 * target/ppc: Flush inputs to zero with NJ in ppc_store_vscr (Richard Henderson)
 * hw/ppc/e500: fix broken snapshot replay (Maksim Kostin)
 * ppc/vof: Fix missed fields in VOF cleanup (Nicholas Piggin)
 * ui/dbus: Properly dispose touch/mouse dbus objects (Bilal Elmoussaoui)
 * target/i386: raise FERR interrupt with iothread locked (Paolo Bonzini)
 * linux-user: Adjust brk for load_bias (Richard Henderson)
 * target/arm: properly document FEAT_CRC32 (Alex Bennée)
 * block-migration: Ensure we don't crash during migration cleanup (Fabiano Rosas)
 * softmmu: Assert data in bounds in iotlb_to_section (Richard Henderson)
 * docs/about/license: Update LICENSE URL (Philippe Mathieu-Daudé)
 * target/arm: Fix 64-bit SSRA (Richard Henderson)
 * target/arm: Fix SME ST1Q (Richard Henderson)
 * accel/kvm: Specify default IPA size for arm64 (Akihiko Odaki)
 * kvm: Introduce kvm_arch_get_default_type hook (Akihiko Odaki)
 * include/hw/virtio/virtio-gpu: Fix virtio-gpu with blob on big endian hosts (Thomas Huth)
 * target/s390x: Check reserved bits of VFMIN/VFMAX's M5 (Ilya Leoshkevich)
 * target/s390x: Fix VSTL with a large length (Ilya Leoshkevich)
 * target/s390x: Use a 16-bit immediate in VREP (Ilya Leoshkevich)
 * target/s390x: Fix the "ignored match" case in VSTRS (Ilya Leoshkevich)

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
5a27b86dc6 [openSUSE][RPM] spec: enable the Pipewire audio backend (bsc#1215486)
Enable the Pipewire audio backend (available since 8.1), in the
appropriate subpackage.

References: bsc#1215486
Signed-off-by: Dario Faggioli
2024-03-14 18:26:09 +01:00
b7d4f949b3 [openSUSE][RPM] Use discount instead of perl-Text-Markdown
perl-Text-Markdown is not always available (e.g., in SLE/Leap).
Use discount instead, as the provider of the 'markdown' binary.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
fbc4756925 [openSUSE][RPM] Transform meson subproject in git submodules
OBS SCM bridge can handle git submodule, while it can't handle (yet?)
meson subprojects. The (ugly, I know!) solution, for now, is to turn
the latter into the former, with commands like the followings:

git submodule add -f https://gitlab.com/qemu-project/berkeley-testfloat-3 subprojects/berkeley-testfloat-3
git -C subprojects/berkeley-testfloat-3 reset --hard 40619cbb3bf32872df8c53cc457039229428a263

(the hash used comes from the subprojects/berkeley-testfloat-3.wrap file)

It's also necessary to manually apply the layering of the packagefiles,
and that is done in the specfile.

Longer term and better solutions could be:
- Make SCM support meson subprojects
- Create standalone packages for the subprojects (and instruct
  QEMU to pick stuff from there)

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
527a5d3ed7 [openSUSE][RPM] Update to version 8.1.0
Full list of changes are available at:

  https://wiki.qemu.org/ChangeLog/8.1

Highlights:
 * VFIO: improved live migration support, no longer an experimental feature
 * GTK GUI now supports multi-touch events
 * ARM, PowerPC, and RISC-V can now use AES acceleration on host processor
 * PCIe: new QMP commands to inject CXL General Media events, DRAM
   events and Memory Module events
 * ARM: KVM VMs on a host which supports MTE (the Memory Tagging Extension)
   can now use MTE in the guest
 * ARM: emulation support for bpim2u (Banana Pi BPI-M2 Ultra) board and
   neoverse-v1 (Cortex Neoverse-V1) CPU
 * ARM: new architectural feature support for: FEAT_PAN3 (SCTLR_ELx.EPAN),
   FEAT_LSE2 (Large System Extensions v2), and experimental support for
   FEAT_RME (Realm Management Extensions)
 * Hexagon: new instruction support for v68/v73 scalar, and v68/v69 HVX
 * Hexagon: gdbstub support for HVX
 * MIPS: emulation support for Ingenic XBurstR1/XBurstR2 CPUs, and MXU
   instructions
 * PowerPC: TCG SMT support, allowing pseries and powernv to run with up
   to 8 threads per core
 * PowerPC: emulation support for Power9 DD2.2 CPU model, and perf
   sampling support for POWER CPUs
 * RISC-V: ISA extension support for BF16/Zfa, and disassembly support
   for Zcm*/Z*inx/XVentanaCondOps/Xthead
 * RISC-V: CPU emulation support for Veyron V1
 * RISC-V: numerous KVM/emulation fixes and enhancements
 * s390: instruction emulation fixes for LDER, LCBB, LOCFHR, MXDB, MXDBR,
   EPSW, MDEB, MDEBR, MVCRL, LRA, CKSM, CLM, ICM, MC, STIDP, EXECUTE, and
   CLGEBR(A)
 * SPARC: updated target/sparc to use tcg_gen_lookup_and_goto_ptr() for
   improved performance
 * Tricore: emulation support for TC37x CPU that supports ISA v1.6.2
   instructions
 * Tricore: instruction emulation of POPCNT.W, LHA, CRC32L.W, CRC32.B,
   SHUFFLE, SYSCALL, and DISABLE
 * x86: CPU model support for GraniteRapids
 * and lots more...

This also (automatically) fixes:
 - bsc#1212850 (CVE-2023-3354)
 - bsc#1213001 (CVE-2023-3255)
 - bsc#1213925 (CVE-2023-3180)
 - bsc#1213414 (CVE-2023-3301)
 - bsc#1207205 (CVE-2023-0330)
 - bsc#1212968 (CVE-2023-2861)
 - bsc#1179993, bsc#1181740, bsc#1211697

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
e4a5f14170 [openSUSE][RPM] Use --preserve-argv0 in qemu-linux-user (boo#1197298, bsc#1212768)
By default try to preserve argv[0].

Original report is boo#1197298, which also became relevant recently again in bsc#1212768.

Signed-off-by: Fabian Vogt <fabian@ritter-vogt.de>
References: boo#1197298
References: bsc#1212768
Signed-off-by: Fabian Vogt <fabian@ritter-vogt.de>
2024-03-14 18:26:09 +01:00
39ef4b454c [openSUSE][RPM] Split qemu-tools package (#31)
Create separate packages for qemu-img and qemu-pr-helper.

Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de>
2024-03-14 18:26:09 +01:00
3f35213205 [openSUSE][RPM] Fix deps for virtiofsd and improve spec files
Address the comments from Factory Submission
https://build.opensuse.org/request/show/1088674?notification_id=40890530:
- remove the various '%defattr()'
- make sure that we depend on virtiofsd only on arch-es
  where it can actually be built

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
85343d8ccd [openSUSE][RPM] spec: require virtiofsd, now that it is a sep package (#27)
Since version 8.0.0, virtiofsd is not part of QEMU sources any longer.
We therefore have also moved it to a separate package. To retain
compatibility and consistency of behavior, require such a package as an
hard dependency.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
150b5e071c [openSUSE][RPM] Try to avoid recommending too many packages (bsc#1205680)
For example, let's try to avoid recommending GUI UI stuff, unless GTK is
already installed. This way we avoid things like bringing in an entire
graphic stack on servers.

References: bsc#1205680
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
06617e52e2 [openSUSE][RPM] Move documentation to a subpackage and fix qemu-headless (bsc#1209629)
- The qemu-headless subpackage was defined but never build, because it
  had no files. Fix that by putting there just a simple README.

- Move the docs in a dedicated subpackage

Resolves: bsc#1209629
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Gerd Hoffmann
890a31ceaa [openSUSE] roms: add back edk2-basetools target
The efi nic boot rom builds depend on this, they need the
EfiRom utility from edk2 BaseTools.

Fixes: 22e11539e1 ("edk2: replace build scripts")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
References: https://lore.kernel.org/qemu-devel/20230411101709.445259-1-kraxel@redhat.com/
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
40eb837e9b [openSUSE][OBS] Limit the workflow runs to the factory branch (#25)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
067e6183b0 [openSUSE] pc: q35: Allow 1024 cpus for old machine types (bsc#1202282, jsc#PED-2592)
In SUSE/openSUSE, we bumped up the number of maximum vcpus since
machine type q35-7.1. Make sure that this continue to be true, for
backward compatibility.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
References: https://lore.kernel.org/qemu-devel/166876173513.24238.8968021290016401421.stgit@tumbleweed.Wayrath/
References: bsc#1202282, jsc#PED-2592
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
59d94e7ffc [openSUSE] meson: remove $pkgversion from CONFIG_STAMP input to broaden compatibility
As part of the effort to close the gap with Leap I think we are fine
removing the $pkgversion component to creating a unique CONFIG_STAMP.
This stamp is only used in creating a unique symbol used in ensuring the
dynamically loaded modules correspond correctly to the loading qemu.
The default inputs to producing this unique symbol are somewhat reasonable
as a generic mechanism, but specific packaging and maintenance practices
might require the default to be modified for best use. This is an example
of that.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
386d4950ea [openSUSE] meson: install ivshmem-client and ivshmem-server
Turn on the meson install flag for these executables

Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
63428b19f0 [openSUSE] Make installed scripts explicitly python3 (bsc#1077564)
We want to explicitly reference python3 in the scripts we install.

References: bsc#1077564
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
a8776466de [openSUSE] Disable some tests that have problems in OBS
We are disabling the following tests:

qemu-system-ppc64 / display-vga-test

They are failing due to some memory corruption errors. We believe that
this might be due to the combination of the compiler version and of LTO,
and will take up the investigation within the upstream community.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
23964dfca0 [openSUSE] tests/qemu-iotests: Triple timeout of i/o tests due to obs environment
Executing tests in obs is very fickle, since you aren't guaranteed
reliable cpu time. Triple the timeout for each test to help ensure
we don't fail a test because the stars align against us.

Signed-off-by: Bruce Rogers <brogers@suse.com>
[DF: Small tweaks necessary for rebasing on top of 6.2.0]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
3724ff19db [openSUSE] tests: change error message in test 162
Since we have a quite restricted execution environment, as far as
networking is concerned, we need to change the error message we expect
in test 162. There is actually no routing set up so the error we get is
"Network is unreachable". Change the expected output accordingly.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
9f24a3c6cc [openSUSE] Revert "tests/qtest: enable more vhost-user tests by default"
Revert commit "tests/qtest: enable more vhost-user tests by default"
(8dcb404bff), as it causes prooblem when building with GCC 12 and LTO
enabled.

This should be considered temporary, until the actual reason why the
code of the tests that are added in that commit breaks.

It has been reported upstream, and will be (hopefully) solved there:
https://lore.kernel.org/qemu-devel/1d3bbff9e92e7c8a24db9e140dcf3f428c2df103.camel@suse.com/

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Hannes Reinecke
f30c826058 [openSUSE] scsi-generic: check for additional SG_IO status on completion (bsc#1178049)
SG_IO may return additional status in the 'status', 'driver_status',
and 'host_status' fields. When either of these fields are set the
command has not been executed normally, so we should not continue
processing this command but rather return an error.
scsi_read_complete() already checks for these errors,
scsi_write_complete() does not.

References: bsc#1178049
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Mauro Matteo Cascella
9b8a507ab1 [openSUSE] hw/scsi/megasas: check for NULL frame in megasas_command_cancelled() (bsc#1180432, CVE-2020-35503)
Ensure that 'cmd->frame' is not NULL before accessing the 'header' field.
This check prevents a potential NULL pointer dereference issue.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1910346
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
References: bsc#1180432, CVE-2020-35503
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
766dcf5a0b [openSUSE] scsi-generic: replace logical block count of response of READ CAPACITY (SLE-20965)
While using SCSI passthrough, Following scenario makes qemu doesn't
realized the capacity change of remote scsi target:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'rescan-scsi-bus.sh -s ...' in vm.

In above scenario I used to experienced errors while accessing the
additional disk space in vm. I think the reasonable operations should
be:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'block_resize' via qmp to notify qemu.
4. issue 'rescan-scsi-bus.sh -s ...' in vm.

The errors disappear once I notify qemu by block_resize via qmp.

So this patch replaces the number of logical blocks of READ CAPACITY
response from scsi target by qemu's bs->total_sectors. If the user in
vm wants to access the additional disk space, The administrator of
host must notify qemu once resizeing the scsi target.

Bonus is that domblkinfo of libvirt can reflect the consistent capacity
information between host and vm in case of missing block_resize in qemu.
E.g:
...
    <disk type='block' device='lun'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdc' index='1'/>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
...

Before:
1. online resize the scsi target.
2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity:       4.000 GiB
Allocation:     0.000 B
Physical:       8.000 GiB

5. guest:~ # lsblk /dev/sda
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      8:0    0   8G  0 disk
└─sda1   8:1    0   2G  0 part

After:
1. online resize the scsi target.
2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity:       4.000 GiB
Allocation:     0.000 B
Physical:       8.000 GiB

5. guest:~ # lsblk /dev/sda
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      8:0    0   4G  0 disk
└─sda1   8:1    0   2G  0 part

References: [SUSE-JIRA] (SLE-20965)
Signed-off-by: Lin Ma <lma@suse.com>
2024-03-14 18:26:09 +01:00
Olaf Hering
81ccb369aa [openSUSE] xen: ignore live parameter from xen-save-devices-state (bsc#1079730, bsc#1101982, bsc#106399)
The final step of xl migrate|save for an HVM domU is saving the state of
qemu. This also involves releasing all block devices. While releasing
backends ought to be a separate step, such functionality is not
implemented.

Unfortunately, releasing the block devices depends on the optional
'live' option. This breaks offline migration with 'virsh migrate domU
dom0' because the sending side does not release the disks, as a result
the receiving side can not properly claim write access to the disks.

As a minimal fix, remove the dependency on the 'live' option. Upstream
may fix this in a different way, like removing the newly added 'live'
parameter entirely.

Fixes: 5d6c599fe1 ("migration, xen: Fix block image lock issue on live migration")

Signed-off-by: Olaf Hering <olaf@aepfle.de>
References: bsc#1079730, bsc#1101982, bsc#1063993
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
f9f3c4deda [openSUSE] xen: add block resize support for xen disks
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.

[BR: minor edits to pass qemu's checkpatch script]
[BR: significant rework needed due to upstream xen disk qdevification]
[BR: At this point, monitor_add_blk call is all we need to add!]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
3f367b50a9 [openSUSE] xen_disk: Add suse specific flush disable handling and map to QEMU equiv (bsc#879425)
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.

Patch taken from Xen's patch queue, Olaf Hering being the original author.
[bsc#879425]

[BR: minor edits to pass qemu's checkpatch script]
[BR: With qdevification of xen-block, code has changed significantly]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
2024-03-14 18:26:09 +01:00
Andreas Färber
69dc441378 [openSUSE] Raise soft address space limit to hard limit
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
396fb12516 [openSUSE] qemu-bridge-helper: reduce security profile (boo#988279)
Change from using glib alloc and free routines to those
from libc. Also perform safety measure of dropping privs
to user if configured no-caps.

References: boo#988279
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2024-03-14 18:26:09 +01:00
Alexander Graf
c3414ebae7 [openSUSE] Make char muxer more robust wrt small FIFOs
Virtio-Console can only process one character at a time. Using it on S390
gave me strange "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.

While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.

To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.

This patch fixes input when using -nographic on s390 for me.

[AF: Rebased for v2.7.0-rc2]
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Alexander Graf
ae5b115e1a [openSUSE] PPC: KVM: Disable mmu notifier check
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.

So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Alexander Graf
4f1e250c0d [openSUSE] linux-user: lseek: explicitly cast non-set offsets to signed
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.

When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.

Signed-off-by: Alexander Graf <agraf@suse.de>
2024-03-14 18:26:09 +01:00
Alexander Graf
635dd43cf0 [openSUSE] linux-user: use target_ulong
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.

Pass syscall arguments as ulong always.

Signed-off-by: Alexander Graf <agraf@suse.de>
[JRZ: changes from linux-user/qemu.h wass moved to linux-user/user-internals.h]
Signed-off-by: Jose R Ziviani <jziviani@suse.de>
[DF: Forward port, i.e., use ulong for do_prctl too]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Andreas Färber
a4b62322d1 [openSUSE] qemu-binfmt-conf: Modify default path
Change QEMU_PATH from /usr/local/bin to /usr/bin prefix.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2024-03-14 18:26:09 +01:00
Bruce Rogers
fe5a4218cb [openSUSE] hw/smbios: handle both file formats regardless of machine type (bsc#994082, bsc#1084316, boo#1131894)
It's easy enough to handle either per-spec or legacy smbios structures
in the smbios file input without regard to the machine type used, by
simply applying the basic smbios formatting rules. then depending on
what is detected. terminal numm bytes are added or removed for machine
type specific processing.

References: bsc#994082, bsc#1084316, boo#1131894
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
ca1c30aaca [openSUSE] roms/Makefile: add --cross-file to qboot meson setup for aarch64
We add a --cross-file reference so that we can do cross compilation
of qboot from an aarch64 build.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
Bruce Rogers
f04930b9c7 [openSUSE] roms/Makefile: pass a packaging timestamp to subpackages with date info (bsc#1011213)
Certain rom subpackages build from qemu git-submodules call the date
program to include date information in the packaged binaries. This
causes repeated builds of the package to be different, wkere the only
real difference is due to the fact that time build timestamp has
changed. To promote reproducible builds and avoid customers being
prompted to update packages needlessly, we'll use the timestamp of the
VERSION file as the packaging timestamp for all packages that build in a
timestamp for whatever reason.

References: bsc#1011213
Signed-off-by: Bruce Rogers <brogers@suse.com>
2024-03-14 18:26:09 +01:00
79113f134b [openSUSE][RPM] Spec file adjustments for 8.0.0 (and later)
The sgabios submodule is no longer there, so let's get rid of any
reference to it from our spec files.

Remove no longer supported './configure' options.

We're also not set yet for using the set_version service, so we need to
update the following manually:
- the Version: tags in the spec files
- the rpm/seabios_version and rpm/skiboot_version files (see qemu.spec
  for instructions on how to do that)
- the %{sbver} variable in rpm/common.inc

A better solution for handling this aspect is being worked on.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:09 +01:00
d5da72c920 [openSUSE][OBS] Add OBS workflow
Create a rebuild (for pushes) and a pull request workflow.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:08 +01:00
107266f5bb [openSUSE][RPM] Split qemu and qemu-linux-user spec files
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:08 +01:00
37314a98fc [openSUSE][RPM] Provide seabios and skiboot version files
In an upstream tarball there are some special files, generated by a
script that is run when the archive is prepared. Let's make our
repository look a little more like that, so we can build it properly.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:08 +01:00
daa30db454 [openSUSE][RPM] Add downstream packaging files
Stash the "packaging files" in the QEMU repository, in the rpm/
directory. During package build, they will be pulled out from there
and used as appropriate.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2024-03-14 18:26:08 +01:00
Michael Tokarev
11aa0b1ff1 Update version for 8.2.2 release
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-04 15:15:46 +03:00
Thomas Huth
21214699c2 chardev/char-socket: Fix TLS io channels sending too much data to the backend
Commit ffda5db65a ("io/channel-tls: fix handling of bigger read buffers")
changed the behavior of the TLS io channels to schedule a second reading
attempt if there is still incoming data pending. This caused a regression
with backends like the sclpconsole that check in their read function that
the sender does not try to write more bytes to it than the device can
currently handle.

The problem can be reproduced like this:

 1) In one terminal, do this:

  mkdir qemu-pki
  cd qemu-pki
  openssl genrsa 2048 > ca-key.pem
  openssl req -new -x509 -nodes -days 365000 -key ca-key.pem -out ca-cert.pem
  # enter some dummy value for the cert
  openssl genrsa 2048 > server-key.pem
  openssl req -new -x509 -nodes -days 365000 -key server-key.pem \
    -out server-cert.pem
  # enter some other dummy values for the cert

  gnutls-serv --echo --x509cafile ca-cert.pem --x509keyfile server-key.pem \
              --x509certfile server-cert.pem -p 8338

 2) In another terminal, do this:

  wget https://download.fedoraproject.org/pub/fedora-secondary/releases/39/Cloud/s390x/images/Fedora-Cloud-Base-39-1.5.s390x.qcow2

  qemu-system-s390x -nographic -nodefaults \
    -hda Fedora-Cloud-Base-39-1.5.s390x.qcow2 \
    -object tls-creds-x509,id=tls0,endpoint=client,verify-peer=false,dir=$PWD/qemu-pki \
    -chardev socket,id=tls_chardev,host=localhost,port=8338,tls-creds=tls0 \
    -device sclpconsole,chardev=tls_chardev,id=tls_serial

QEMU then aborts after a second or two with:

  qemu-system-s390x: ../hw/char/sclpconsole.c:73: chr_read: Assertion
   `size <= SIZE_BUFFER_VT220 - scon->iov_data_len' failed.
 Aborted (core dumped)

It looks like the second read does not trigger the chr_can_read() function
to be called before the second read, which should normally always be done
before sending bytes to a character device to see how much it can handle,
so the s->max_size in tcp_chr_read() still contains the old value from the
previous read. Let's make sure that we use the up-to-date value by calling
tcp_chr_read_poll() again here.

Fixes: ffda5db65a ("io/channel-tls: fix handling of bigger read buffers")
Buglink: https://issues.redhat.com/browse/RHEL-24614
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240229104339.42574-1-thuth@redhat.com>
Reviewed-by: Antoine Damhet <antoine.damhet@blade-group.com>
Tested-by: Antoine Damhet <antoine.damhet@blade-group.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 462945cd22)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-01 19:02:35 +03:00
Thomas Huth
2a97c05796 tests/unit/test-util-sockets: Remove temporary file after test
test-util-sockets leaves the temporary socket files around in the
temporary files folder. Let's better remove them at the end of the
testing.

Fixes: 4d3a329af5 ("tests/util-sockets: add abstract unix socket cases")
Message-ID: <20240226082728.249753-1-thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f0cb6828ae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-01 19:00:11 +03:00
Benjamin David Lunt
e6ce551c75 hw/usb/bus.c: PCAP adding 0xA in Windows version
Since Windows text files use CRLFs for all \n, the Windows version of QEMU
inserts a CR in the PCAP stream when a LF is encountered when using USB PCAP
files. This is due to the fact that the PCAP file is opened as TEXT instead
of BINARY.

To show an example, when using a very common protocol to USB disks, the BBB
protocol uses a 10-byte command packet. For example, the READ_CAPACITY(10)
command will have a command block length of 10 (0xA). When this 10-byte
command (part of the 31-byte CBW) is placed into the PCAP file, the Windows
file manager inserts a 0xD before the 0xA, turning the 31-byte CBW into a
32-byte CBW.

Actual CBW:
  0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0a 25 USBC...........%
  0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00       ...............

PCAP CBW
  0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0d 0a USBC............
  0050 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 %..............

I believe simply opening the PCAP file as BINARY instead of TEXT will fix
this issue.

Resolves: https://bugs.launchpad.net/qemu/+bug/2054889
Signed-off-by: Benjamin David Lunt <benlunt@fysnet.net>
Message-ID: <000101da6823$ce1bbf80$6a533e80$@fysnet.net>
[thuth: Break long line to avoid checkpatch.pl error]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 5e02a4fdeb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-01 18:59:21 +03:00
Thomas Huth
829bb27765 hw/intc/Kconfig: Fix GIC settings when using "--without-default-devices"
When using "--without-default-devices", the ARM_GICV3_TCG and ARM_GIC_KVM
settings currently get disabled, though the arm virt machine is only of
very limited use in that case. This also causes the migration-test to
fail in such builds. Let's make sure that we always keep the GIC switches
enabled in the --without-default-devices builds, too.

Message-ID: <20240221110059.152665-1-thuth@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8bd3f84d1f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-01 18:59:06 +03:00
Daniel P. Berrangé
0e33e4e78e gitlab: force allow use of pip in Cirrus jobs
Python is transitioning to a world where you're not allowed to use 'pip
install' outside of a virutal env by default. The rationale is to stop
use of pip clashing with distro provided python packages, which creates
a major headache on distro upgrades.

All our CI environments, however, are 100% disposable so the upgrade
headaches don't exist. Thus we can undo the python defaults to allow
pip to work.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240222114038.2348718-1-berrange@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a8bf9de2f4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-29 00:00:21 +03:00
Alex Bennée
6c14f93182 tests/vm: avoid re-building the VM images all the time
The main problem is that "check-venv" is a .PHONY target will always
evaluate and trigger a full re-build of the VM images. While its
tempting to drop it from the dependencies that does introduce a
breakage on freshly configured builds.

Fortunately we do have the otherwise redundant --force flag for the
script which up until now was always on. If we make the usage of
--force conditional on dependencies other than check-venv triggering
the update we can avoid the costly rebuild and still run cleanly on a
fresh checkout.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2118
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-4-alex.bennee@linaro.org>
(cherry picked from commit 151b7dba39)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-28 21:21:04 +03:00
Alex Bennée
36d50b4bde tests/vm: update openbsd image to 7.4
The old links are dead so even if we have the ISO cached we can't
finish the install. Update to the current stable and tweak the install
strings.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2192
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-5-alex.bennee@linaro.org>
(cherry picked from commit 8467ac75b3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-28 21:20:12 +03:00
Paolo Bonzini
decafac46b target/i386: leave the A20 bit set in the final NPT walk
The A20 mask is only applied to the final memory access.  Nested
page tables are always walked with the raw guest-physical address.

Unlike the previous patch, in this one the masking must be kept, but
it was done too early.

Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b5a9de3259)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-28 21:19:03 +03:00
Paolo Bonzini
6801a20ebd target/i386: remove unnecessary/wrong application of the A20 mask
If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already
applied in get_physical_address(), which is called via probe_access_full()
and x86_cpu_tlb_fill().

If ptw_translate() on the other hand does a MMU_NESTED_IDX access,
the A20 mask must not be applied to the address that is looked up in
the nested page tables; it must be applied only to the addresses that
hold the NPT entries (which is achieved via MMU_PHYS_IDX, per the
previous paragraph).

Therefore, we can remove A20 masking from the computation of the page
table entry's address, and let get_physical_address() or mmu_translate()
apply it when they know they are returning a host-physical address.

Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a28fe7dc19)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-28 21:18:45 +03:00
Paolo Bonzini
a28b6b4e74 target/i386: Fix physical address truncation
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.

The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).

Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX.  Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.

The truncation code however is not completely redundant.  Even though the
maximum address size for any executed instruction is 32 bits, helpers for
operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
to translate an address outside of the 32-bit range, if invoked with an
argument that is close to the 4G boundary.  Likewise for processor
accesses, for example TSS or IDT accesses, when EFER.LMA==0.

So, move the address truncation in get_physical_address() so that it
applies to 32-bit MMU indexes, but not to MMU_PHYS_IDX and MMU_NESTED_IDX.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Cc: qemu-stable@nongnu.org
Co-developed-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b1661801c1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: drop unrelated change in target/i386/cpu.c)
2024-02-28 21:15:46 +03:00
Paolo Bonzini
5c4091fe07 target/i386: check validity of VMCB addresses
MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the
maximum physical address width of the processor.  Setting them to
1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual).

The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE,
even though the manual is not clear on that.

Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d09c79010f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-28 21:03:19 +03:00
Paolo Bonzini
6ed8211379 target/i386: mask high bits of CR3 in 32-bit mode
CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
paging or PAE paging).  Do this in mmu_translate() to remove
the last where get_physical_address() meaningfully drops the high
bits of the address.

Cc: qemu-stable@nongnu.org
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 68fb78d7d5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-28 21:02:56 +03:00
Jessica Clarke
a0fb839d0a pl031: Update last RTCLR value on write in case it's read back
The PL031 allows you to read RTCLR, which is meant to give you the last
value written. PL031State has an lr field which is used when reading
from RTCLR, and is present in the VM migration state, but we never
actually update it, so it always reads as its initial 0 value.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240222000341.1562443-1-jrtc27@jrtc27.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4d28d57c9f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-27 20:37:06 +03:00
Klaus Jensen
e4e36e65c9 hw/nvme: fix invalid endian conversion
numcntl is one byte and so is max_vfs. Using cpu_to_le16 on big endian
hosts results in numcntl being set to 0.

Fix by dropping the endian conversion.

Fixes: 99f48ae7ae ("hw/nvme: Add support for Secondary Controller List")
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Message-ID: <20240222-fix-sriov-numcntl-v1-1-d60bea5e72d0@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit d2b5bb860e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-27 19:14:41 +03:00
Gerd Hoffmann
8c86c88cd5 update edk2 binaries to edk2-stable202402
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 658178c3d4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-27 14:05:06 +03:00
Gerd Hoffmann
cc98bd4f10 update edk2 submodule to edk2-stable202402
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 9c996f3d11)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-27 14:05:06 +03:00
Nicholas Piggin
131ed62955 target/ppc: Fix crash on machine check caused by ifetch
is_prefix_insn_excp() loads the first word of the instruction address
which caused an exception, to determine whether or not it was prefixed
so the prefix bit can be set in [H]SRR1.

This works if the instruction image can be loaded, but if the exception
was caused by an ifetch, this load could fail and cause a recursive
exception and crash. Machine checks caused by ifetch are not excluded
from the prefix check and can crash (see issue 2108 for an example).

Fix this by excluding machine checks caused by ifetch from the prefix
check.

Cc: qemu-stable@nongnu.org
Acked-by: Cédric Le Goater <clg@kaod.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2108
Fixes: 55a7fa34f8 ("target/ppc: Machine check on invalid real address access on POWER9/10")
Fixes: 5a5d3b23cb ("target/ppc: Add SRR1 prefix indication to interrupt handlers")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
(cherry picked from commit c8fd9667e5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-24 19:29:45 +03:00
Nicholas Piggin
175bdedfa9 target/ppc: Fix lxv/stxv MSR facility check
The move to decodetree flipped the inequality test for the VEC / VSX
MSR facility check.

This caused application crashes under Linux, where these facility
unavailable interrupts are used for lazy-switching of VEC/VSX register
sets. Getting the incorrect interrupt would result in wrong registers
being loaded, potentially overwriting live values and/or exposing
stale ones.

Cc: qemu-stable@nongnu.org
Reported-by: Joel Stanley <joel@jms.id.au>
Fixes: 70426b5bb7 ("target/ppc: moved stxvx and lxvx from legacy to decodtree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1769
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Tested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

(cherry picked from commit 2cc0e449d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-24 19:29:11 +03:00
Peter Maydell
01aa603fb1 .gitlab-ci.d/windows.yml: Drop msys2-32bit job
MSYS2 is dropping support for 32-bit Windows.  This shows up for us
as various packages we were using in our CI job no longer being
available to install, which causes the job to fail.  In commit
8e31b744fd we dropped the dependency on libusb and spice, but the
dtc package has also now been removed.

For us as QEMU upstream, "32 bit x86 hosts for system emulation" have
already been deprecated as of QEMU 8.0, so we are ready to drop them
anyway.

Drop the msys2-32bit CI job, as the first step in doing this.

This is cc'd to stable, because this job will also be broken for CI
on the stable branches.  We can't drop 32-bit support entirely there,
but we will still be covering at least compilation for 32-bit Windows
via the cross-win32-system job.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240220165602.135695-1-peter.maydell@linaro.org
(cherry picked from commit 5cd3ae4903)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-22 18:46:16 +03:00
Tianlan Zhou
aafe8c0d12 system/vl: Update description for input grab key
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.

Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 185311130f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-22 18:46:06 +03:00
Tianlan Zhou
2da2e679d6 docs/system: Update description for input grab key
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.

Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 4a20ac400f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-22 18:46:06 +03:00
Thomas Huth
56ee4a67cb hw/hppa/Kconfig: Fix building with "configure --without-default-devices"
When running "configure" with "--without-default-devices", building
of qemu-system-hppa currently fails with:

 /usr/bin/ld: libqemu-hppa-softmmu.fa.p/hw_hppa_machine.c.o: in function `machine_HP_common_init_tail':
 hw/hppa/machine.c:399: undefined reference to `usb_bus_find'
 /usr/bin/ld: hw/hppa/machine.c:399: undefined reference to `usb_create_simple'
 /usr/bin/ld: hw/hppa/machine.c:400: undefined reference to `usb_bus_find'
 /usr/bin/ld: hw/hppa/machine.c:400: undefined reference to `usb_create_simple'
 collect2: error: ld returned 1 exit status
 ninja: build stopped: subcommand failed.
 make: *** [Makefile:162: run-ninja] Error 1

And after fixing this, the qemu-system-hppa binary refuses to run
due to the missing 'pci-ohci' and 'pci-serial' devices. Let's add
the right config switches to fix these problems.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 04b86ccb5d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-22 18:45:41 +03:00
Akihiko Odaki
814f887430 tests/qtest: Depend on dbus_display1_dep
It ensures dbus-display1.c will not be recompiled.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240214-dbus-v7-3-7eff29f04c34@daynix.com>
(cherry picked from commit 186acfbaf7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 19:01:51 +03:00
Akihiko Odaki
fb22ee75b2 meson: Explicitly specify dbus-display1.h dependency
Explicitly specify dbus-display1.h as a dependency so that files
depending on it will not get compiled too early.

Fixes: 1222070e77 ("meson: ensure dbus-display generated code is built before other units")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240214-dbus-v7-2-7eff29f04c34@daynix.com>
(cherry picked from commit 7aee57df93)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 19:01:32 +03:00
Akihiko Odaki
1766b9360c audio: Depend on dbus_display1_dep
dbusaudio needs dbus_display1_dep.

Fixes: 739362d420 ("audio: add "dbus" audio backend")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240214-dbus-v7-1-7eff29f04c34@daynix.com>
(cherry picked from commit d676119075)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:55:19 +03:00
Tianlan Zhou
2e5c9d5462 ui/console: Fix console resize with placeholder surface
In `qemu_console_resize()`, the old surface of the console is keeped if the new
console size is the same as the old one. If the old surface is a placeholder,
and the new size of console is the same as the placeholder surface (640*480),
the surface won't be replace.
In this situation, the surface's `QEMU_PLACEHOLDER_FLAG` flag is still set, so
the console won't be displayed in SDL display mode.
This patch fixes this problem by forcing a new surface if the old one is a
placeholder.

Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240207172024.8-1-bobby825@126.com>
(cherry picked from commit 95b08fee8f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:49:44 +03:00
Fiona Ebner
7ff0d4d184 ui/clipboard: add asserts for update and request
Should an issue like CVE-2023-6683 ever appear again in the future,
it will be more obvious which assumption was violated.

Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240124105749.204610-2-f.ebner@proxmox.com>
(cherry picked from commit 9c41658261)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:49:10 +03:00
Fiona Ebner
480a6adc83 ui/clipboard: mark type as not available when there is no data
With VNC, a client can send a non-extended VNC_MSG_CLIENT_CUT_TEXT
message with len=0. In qemu_clipboard_set_data(), the clipboard info
will be updated setting data to NULL (because g_memdup(data, size)
returns NULL when size is 0). If the client does not set the
VNC_ENCODING_CLIPBOARD_EXT feature when setting up the encodings, then
the 'request' callback for the clipboard peer is not initialized.
Later, because data is NULL, qemu_clipboard_request() can be reached
via vdagent_chr_write() and vdagent_clipboard_recv_request() and
there, the clipboard owner's 'request' callback will be attempted to
be called, but that is a NULL pointer.

In particular, this can happen when using the KRDC (22.12.3) VNC
client.

Another scenario leading to the same issue is with two clients (say
noVNC and KRDC):

The noVNC client sets the extension VNC_FEATURE_CLIPBOARD_EXT and
initializes its cbpeer.

The KRDC client does not, but triggers a vnc_client_cut_text() (note
it's not the _ext variant)). There, a new clipboard info with it as
the 'owner' is created and via qemu_clipboard_set_data() is called,
which in turn calls qemu_clipboard_update() with that info.

In qemu_clipboard_update(), the notifier for the noVNC client will be
called, i.e. vnc_clipboard_notify() and also set vs->cbinfo for the
noVNC client. The 'owner' in that clipboard info is the clipboard peer
for the KRDC client, which did not initialize the 'request' function.
That sounds correct to me, it is the owner of that clipboard info.

Then when noVNC sends a VNC_MSG_CLIENT_CUT_TEXT message (it did set
the VNC_FEATURE_CLIPBOARD_EXT feature correctly, so a check for it
passes), that clipboard info is passed to qemu_clipboard_request() and
the original segfault still happens.

Fix the issue by handling updates with size 0 differently. In
particular, mark in the clipboard info that the type is not available.

While at it, switch to g_memdup2(), because g_memdup() is deprecated.

Cc: qemu-stable@nongnu.org
Fixes: CVE-2023-6683
Reported-by: Markus Frank <m.frank@proxmox.com>
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Markus Frank <m.frank@proxmox.com>
Message-ID: <20240124105749.204610-1-f.ebner@proxmox.com>
(cherry picked from commit 405484b29f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:45:27 +03:00
Daniel P. Berrangé
4fd56da337 ui: reject extended clipboard message if not activated
The extended clipboard message protocol requires that the client
activate the extension by requesting a psuedo encoding. If this
is not done, then any extended clipboard messages from the client
should be considered invalid and the client dropped.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240115095119.654271-1-berrange@redhat.com>
(cherry picked from commit 4cba838896)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:45:20 +03:00
Ziqiao Kong
0b30735d38 target/i386: Generate an illegal opcode exception on cmp instructions with lock prefix
target/i386: As specified by Intel Manual Vol2 3-180, cmp instructions
are not allowed to have lock prefix and a `UD` should be raised. Without
this patch, s1->T0 will be uninitialized and used in the case OP_CMPL.

Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
Message-ID: <20240215095015.570748-2-ziqiaokong@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 99d0dcd7f1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:44:21 +03:00
Xiaoyao Li
f5dddb856c i386/cpuid: Move leaf 7 to correct group
CPUID leaf 7 was grouped together with SGX leaf 0x12 by commit
b9edbadefb ("i386: Propagate SGX CPUID sub-leafs to KVM") by mistake.

SGX leaf 0x12 has its specific logic to check if subleaf (starting from 2)
is valid or not by checking the bit 0:3 of corresponding EAX is 1 or
not.

Leaf 7 follows the logic that EAX of subleaf 0 enumerates the maximum
valid subleaf.

Fixes: b9edbadefb ("i386: Propagate SGX CPUID sub-leafs to KVM")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240125024016.2521244-4-xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0729857c70)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:43:03 +03:00
Xiaoyao Li
e8d27721cb i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F
Existing code misses a decrement of cpuid_i when skip leaf 0x1F.
There's a blank CPUID entry(with leaf, subleaf as 0, and all fields
stuffed 0s) left in the CPUID array.

It conflicts with correct CPUID leaf 0.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by:Yang Weijiang <weijiang.yang@intel.com>
Message-ID: <20240125024016.2521244-2-xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 10f92799af)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:42:24 +03:00
Xiaoyao Li
72c4ef9da0 i386/cpu: Mask with XCR0/XSS mask for FEAT_XSAVE_XCR0_HI and FEAT_XSAVE_XSS_HI leafs
The value of FEAT_XSAVE_XCR0_HI leaf and FEAT_XSAVE_XSS_HI leaf also
need to be masked by XCR0 and XSS mask respectively, to make it
logically correct.

Fixes: 301e90675c ("target/i386: Enable support for XSAVES based features")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Yang Weijiang <weijiang.yang@intel.com>
Message-ID: <20240115091325.1904229-3-xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a11a365159)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:41:43 +03:00
Xiaoyao Li
0766f137f5 i386/cpu: Clear FEAT_XSAVE_XSS_LO/HI leafs when CPUID_EXT_XSAVE is not available
Leaf FEAT_XSAVE_XSS_LO and FEAT_XSAVE_XSS_HI also need to be cleared
when CPUID_EXT_XSAVE is not set.

Fixes: 301e90675c ("target/i386: Enable support for XSAVES based features")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Yang Weijiang <weijiang.yang@intel.com>
Message-ID: <20240115091325.1904229-2-xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 81f5cad385)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-20 18:41:25 +03:00
Peter Maydell
4d9dc117ea .gitlab-ci/windows.yml: Don't install libusb or spice packages on 32-bit
When msys2 updated their libusb packages to libusb 1.0.27, they
dropped support for building them for mingw32, leaving only mingw64
packages.  This broke our CI job, as the 'pacman' package install now
fails with:

error: target not found: mingw-w64-i686-libusb
error: target not found: mingw-w64-i686-usbredir

(both these binary packages are from the libusb source package).

Similarly, spice is now 64-bit only:
error: target not found: mingw-w64-i686-spice

Fix this by dropping these packages from the list we install for our
msys2-32bit build.  We do this with a simple mechanism for the
msys2-64bit and msys2-32bit jobs to specify a list of extra packages
to install on top of the common ones we install for both jobs.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2160
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20240215155009.2422335-1-peter.maydell@linaro.org
(cherry picked from commit 8e31b744fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 18:51:32 +03:00
Kevin Wolf
d5bc76fa20 iotests: Make 144 deterministic again
Since commit effd60c8 changed how QMP commands are processed, the order
of the block-commit return value and job events in iotests 144 wasn't
fixed and more and caused the test to fail intermittently.

Change the test to cache events first and then print them in a
predefined order.

Waiting three times for JOB_STATUS_CHANGE is a bit uglier than just
waiting for the JOB_STATUS_CHANGE that has "status": "ready", but the
tooling we have doesn't seem to allow the latter easily.

Fixes: effd60c878
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2126
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20240209173103.239994-1-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit cc29c12ec6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:28:18 +03:00
Peter Maydell
f030e96d27 target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
It doesn't make sense to read the value of MDCR_EL2 on a non-A-profile
CPU, and in fact if you try to do it we will assert:

#6  0x00007ffff4b95e96 in __GI___assert_fail
    (assertion=0x5555565a8c70 "!arm_feature(env, ARM_FEATURE_M)", file=0x5555565a6e5c "../../target/arm/helper.c", line=12600, function=0x5555565a9560 <__PRETTY_FUNCTION__.0> "arm_security_space_below_el3") at ./assert/assert.c:101
#7  0x0000555555ebf412 in arm_security_space_below_el3 (env=0x555557bc8190) at ../../target/arm/helper.c:12600
#8  0x0000555555ea6f89 in arm_is_el2_enabled (env=0x555557bc8190) at ../../target/arm/cpu.h:2595
#9  0x0000555555ea942f in arm_mdcr_el2_eff (env=0x555557bc8190) at ../../target/arm/internals.h:1512

We might call pmu_counter_enabled() on an M-profile CPU (for example
from the migration pre/post hooks in machine.c); this should always
return false because these CPUs don't set ARM_FEATURE_PMU.

Avoid the assertion by not calling arm_mdcr_el2_eff() before we
have done the early return for "PMU not present".

This fixes an assertion failure if you try to do a loadvm or
savevm for an M-profile board.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2155
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240208153346.970021-1-peter.maydell@linaro.org
(cherry picked from commit ac1d88e9e7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:26:24 +03:00
Richard Henderson
429c11c726 target/arm: Fix SVE/SME gross MTE suppression checks
The TBI and TCMA bits are located within mtedesc, not desc.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240207025210.8837-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 855f94eca8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:19:15 +03:00
Richard Henderson
2d1a29e3b2 target/arm: Handle mte in do_ldrq, do_ldro
These functions "use the standard load helpers", but
fail to clean_data_tbi or populate mtedesc.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240207025210.8837-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 623507ccfc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:18:38 +03:00
Richard Henderson
da804717a5 target/arm: Split out make_svemte_desc
Share code that creates mtedesc and embeds within simd_desc.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240207025210.8837-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 96fcc9982b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:17:50 +03:00
Richard Henderson
8da74af970 target/arm: Adjust and validate mtedesc sizem1
When we added SVE_MTEDESC_SHIFT, we effectively limited the
maximum size of MTEDESC.  Adjust SIZEM1 to consume the remaining
bits (32 - 10 - 5 - 12 == 5).  Assert that the data to be stored
fits within the field (expecting 8 * 4 - 1 == 31, exact fit).

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240207025210.8837-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit b12a7671b6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:01:14 +03:00
Richard Henderson
5e6e09baa5 target/arm: Fix nregs computation in do_{ld,st}_zpa
The field is encoded as [0-3], which is convenient for
indexing our array of function pointers, but the true
value is [1-4].  Adjust before calling do_mem_zpa.

Add an assert, and move the comment re passing ZT to
the helper back next to the relevant code.

Cc: qemu-stable@nongnu.org
Fixes: 206adacfb8 ("target/arm: Add mte helpers for sve scalar + int loads")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240207025210.8837-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 64c6e7444d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:00:48 +03:00
Richard Henderson
7950913ece linux-user/aarch64: Choose SYNC as the preferred MTE mode
The API does not generate an error for setting ASYNC | SYNC; that merely
constrains the selection vs the per-cpu default.  For qemu linux-user,
choose SYNC as the default.

Cc: qemu-stable@nongnu.org
Reported-by: Gustavo Romero <gustavo.romero@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-id: 20240207025210.8837-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 681dfc0d55)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-16 14:00:15 +03:00
Jonathan Cameron
803f1e70ec tests/acpi: Update DSDT.cxl to reflect change _STA return value.
_STA will now return 0xB (in common with most other devices)
rather than not setting the bits to indicate this fake device
has not been enabled, and self tests haven't passed.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-13-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b24a981b9f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-15 11:29:04 +03:00
Jonathan Cameron
02d9979ba8 hw/i386: Fix _STA return value for ACPI0017
Found whilst testing a series for the linux kernel that actually
bothers to check if enabled is set. 0xB is the option used
for vast majority of DSDT entries in QEMU.
It is a little odd for a device that doesn't really exist and
is simply a hook to tell the OS there is a CEDT table but 0xB
seems a reasonable choice and avoids need to special case
this device in the OS.

Means:
* Device present.
* Device enabled and decoding it's resources.
* Not shown in UI
* Functioning properly
* No battery (on this device!)

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-12-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d9ae5802f6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-15 11:29:04 +03:00
Jonathan Cameron
47df9ca585 tests/acpi: Allow update of DSDT.cxl
The _STA value returned currently indicates the ACPI0017 device
is not enabled.  Whilst this isn't a real device, setting _STA
like this may prevent an OS from enumerating it correctly and
hence from parsing the CEDT table.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-11-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 14ec4ff3e4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-15 11:29:04 +03:00
Zhenzhong Duan
d4157195bd smmu: Clear SMMUPciBus pointer cache when system reset
s->smmu_pcibus_by_bus_num is a SMMUPciBus pointer cache indexed
by bus number, bus number may not always be a fixed value,
i.e., guest reboot to different kernel which set bus number with
different algorithm.

This could lead to smmu_iommu_mr() providing the wrong iommu MR.

Suggested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240125073706.339369-3-zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 8a6b3f4dc9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-15 11:13:38 +03:00
Zhenzhong Duan
721c3ceaef virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
s->iommu_pcibus_by_bus_num is a IOMMUPciBus pointer cache indexed
by bus number, bus number may not always be a fixed value,
i.e., guest reboot to different kernel which set bus number with
different algorithm.

This could lead to endpoint binding to wrong iommu MR in
virtio_iommu_get_endpoint(), then vfio device setup wrong
mapping from other device.

Remove the memset in virtio_iommu_device_realize() to avoid
redundancy with memset in system reset.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240125073706.339369-2-zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9a457383ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-15 11:12:21 +03:00
Dmitry Osipenko
1c38c8a24a virtio-gpu: Correct virgl_renderer_resource_get_info() error check
virgl_renderer_resource_get_info() returns errno and not -1 on error.
Correct the return-value check.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20240129073921.446869-1-dmitry.osipenko@collabora.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 574b64aa67)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:44:10 +03:00
Li Zhijian
bbe51d6ea3 hw/cxl: Pass CXLComponentState to cache_mem_ops
cache_mem_ops.{read,write}() interprets opaque as
CXLComponentState(cxl_cstate) instead of ComponentRegisters(cregs).

Fortunately, cregs is the first member of cxl_cstate, so their values are
the same.

Fixes: 9e58f52d3f ("hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)")
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 729d45a6af)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:42:04 +03:00
Hyeonggon Yoo
bdd3159ad7 hw/cxl/device: read from register values in mdev_reg_read()
In the current mdev_reg_read() implementation, it consistently returns
that the Media Status is Ready (01b). This was fine until commit
25a52959f9 ("hw/cxl: Add support for device sanitation") because the
media was presumed to be ready.

However, as per the CXL 3.0 spec "8.2.9.8.5.1 Sanitize (Opcode 4400h)",
during sanitation, the Media State should be set to Disabled (11b). The
mentioned commit correctly sets it to Disabled, but mdev_reg_read()
still returns Media Status as Ready.

To address this, update mdev_reg_read() to read register values instead
of returning dummy values.

Note that __toggle_media() managed to not only write something
that no one read, it did it to the wrong register storage and
so changed the reported mailbox size which was definitely not
the intent. That gets fixed as a side effect of allocating
separate state storage for this register.

Fixes: commit 25a52959f9 ("hw/cxl: Add support for device sanitation")
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f7509f462c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:41:31 +03:00
Ira Weiny
9d8a2a8aaf cxl/cdat: Fix header sum value in CDAT checksum
The addition of the DCD support for CXL type-3 devices extended the CDAT
table large enough that the checksum being returned was incorrect.[1]

This was because the checksum value was using the header length field
rather than each of the 4 bytes of the length field.  This was
previously not seen because the length of the CDAT data was less than
256 thus resulting in an equivalent checksum value.

Properly calculate the checksum for the CDAT header.

[1] https://lore.kernel.org/all/20231116-fix-cdat-devm-free-v1-1-b148b40707d7@intel.com/

Fixes: aba578bdac ("hw/cxl/cdat: CXL CDAT Data Object Exchange implementation")
Cc: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Message-Id: <20240126120132.24248-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 64fdad5e67)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:40:17 +03:00
Ira Weiny
8997083118 cxl/cdat: Handle cdat table build errors
The callback for building CDAT tables may return negative error codes.
This was previously unhandled and will result in potentially huge
allocations later on in ct3_build_cdat()

Detect the negative error code and defer cdat building.

Fixes: f5ee7413d5 ("hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange")
Cc: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit c62926f730)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:38:38 +03:00
Andrey Ignatov
17ae7ebedc vhost-user.rst: Fix vring address description
There is no "size" field in vring address structure. Remove it.

Fixes: 5fc0e00291 ("Add vhost-user protocol documentation")
Signed-off-by: Andrey Ignatov <rdna@apple.com>
Message-Id: <20240112004555.64900-1-rdna@apple.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit aa05bd9ef4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:37:34 +03:00
Richard Henderson
181e548715 tcg/arm: Fix goto_tb for large translation blocks
Correct arithmetic for separating high and low
on a large negative number.

Cc: qemu-stable@nongnu.org
Fixes: 79ffece444 ("tcg/arm: Implement direct branch for goto_tb")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1714
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit e41f1825b4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:17:26 +03:00
Richard Henderson
e5f105655c tcg: Increase width of temp_subindex
We need values 0-3 for TCG_TYPE_I128 on 32-bit hosts.

Cc: qemu-stable@nongnu.org
Fixes: 43eef72f41 ("tcg: Add temp allocation for TCGv_i128")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2159
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit c0e688153f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:17:14 +03:00
Sven Schnelle
281fea01d6 hw/net/tulip: add chip status register values
Netbsd isn't able to detect a link on the emulated tulip card. That's
because netbsd reads the Chip Status Register of the Phy (address
0x14). The default phy data in the qemu tulip driver is all zero,
which means no link is established and autonegotation isn't complete.

Therefore set the register to 0x3b40, which means:

Link is up, Autonegotation complete, Full Duplex, 100MBit/s Link
speed.

Also clear the mask because this register is read only.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 9b60a3ed55)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14 21:12:57 +03:00
Akihiko Odaki
9ab476c3de hw/smbios: Fix port connector option validation
qemu_smbios_type8_opts did not have the list terminator and that
resulted in out-of-bound memory access. It also needs to have an element
for the type option.

Cc: qemu-stable@nongnu.org
Fixes: fd8caa253c ("hw/smbios: support for type 8 (port connector)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 196578c9d0)
2024-02-13 21:06:20 +03:00
Akihiko Odaki
d6e07d5916 hw/smbios: Fix OEM strings table option validation
qemu_smbios_type11_opts did not have the list terminator and that
resulted in out-of-bound memory access. It also needs to have an element
for the type option.

Cc: qemu-stable@nongnu.org
Fixes: 2d6dcbf93f ("smbios: support setting OEM strings table")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit cd8a35b913)
2024-02-13 21:06:00 +03:00
Paolo Bonzini
6eeeb87331 configure: run plugin TCG tests again
Commit 39fb3cfc28 ("configure: clean up plugin option handling", 2023-10-18)
dropped the CONFIG_PLUGIN line from tests/tcg/config-host.mak, due to confusion
caused by the shadowing of $config_host_mak.  However, TCG tests were still
expecting it.  Oops.

Put it back, in the meanwhile the shadowing is gone so it's clear that it goes
in the tests/tcg configuration.

Cc:  <alex.bennee@linaro.org>
Fixes: 39fb3cfc28 ("configure: clean up plugin option handling", 2023-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20240124115332.612162-1-pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-4-alex.bennee@linaro.org>
(cherry picked from commit 15cc103362)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup)
2024-02-13 08:53:44 +03:00
Fabiano Rosas
cefca32a24 tests/docker: Add sqlite3 module to openSUSE Leap container
Avocado needs sqlite3:

  Failed to load plugin from module "avocado.plugins.journal":
  ImportError("Module 'sqlite3' is not installed.
  Use: sudo zypper install python311 to install it")

>From 'zypper info python311':
  "This package supplies rich command line features provided by
  readline, and sqlite3 support for the interpreter core, thus forming
  a so called "extended" runtime."

Include the appropriate package in the lcitool mappings which will
guarantee the dockerfile gets properly updated when lcitool is
run. Also include the updated dockerfile.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Suggested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240117164227.32143-1-farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-2-alex.bennee@linaro.org>
(cherry picked from commit 7485508341)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-13 08:48:56 +03:00
Daniel Henrique Barboza
eca4e19914 hw/riscv/virt-acpi-build.c: fix leak in build_rhct()
The 'isa' char pointer isn't being freed after use.

Issue detected by Valgrind:

==38752== 128 bytes in 1 blocks are definitely lost in loss record 3,190 of 3,884
==38752==    at 0x484280F: malloc (vg_replace_malloc.c:442)
==38752==    by 0x5189619: g_malloc (gmem.c:130)
==38752==    by 0x51A5BF2: g_strconcat (gstrfuncs.c:628)
==38752==    by 0x6C1E3E: riscv_isa_string_ext (cpu.c:2321)
==38752==    by 0x6C1E3E: riscv_isa_string (cpu.c:2343)
==38752==    by 0x6BD2EA: build_rhct (virt-acpi-build.c:232)
==38752==    by 0x6BD2EA: virt_acpi_build (virt-acpi-build.c:556)
==38752==    by 0x6BDC86: virt_acpi_setup (virt-acpi-build.c:662)
==38752==    by 0x9C8DC6: notifier_list_notify (notify.c:39)
==38752==    by 0x4A595A: qdev_machine_creation_done (machine.c:1589)
==38752==    by 0x61E052: qemu_machine_creation_done (vl.c:2680)
==38752==    by 0x61E052: qmp_x_exit_preconfig.part.0 (vl.c:2709)
==38752==    by 0x6220C6: qmp_x_exit_preconfig (vl.c:2702)
==38752==    by 0x6220C6: qemu_init (vl.c:3758)
==38752==    by 0x425858: main (main.c:47)

Fixes: ebfd392893 ("hw/riscv/virt: virt-acpi-build.c: Add RHCT Table")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122221529.86562-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 1a49762c07)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup)
2024-02-13 08:48:07 +03:00
Avihai Horon
76c172ffbe migration: Fix logic of channels and transport compatibility check
The commit in the fixes line mistakenly modified the channels and
transport compatibility check logic so it now checks multi-channel
support only for socket transport type.

Thus, running multifd migration using a transport other than socket that
is incompatible with multi-channels (such as "exec") would lead to a
segmentation fault instead of an error message.
For example:
  (qemu) migrate_set_capability multifd on
  (qemu) migrate -d "exec:cat > /tmp/vm_state"
  Segmentation fault (core dumped)

Fix it by checking multi-channel compatibility for all transport types.

Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: d95533e1cd ("migration: modify migration_channels_and_uri_compatible() for new QAPI syntax")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240125162528.7552-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 3205bebd4f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-12 19:26:04 +03:00
Stefan Hajnoczi
c36d4d3cee virtio-blk: avoid using ioeventfd state in irqfd conditional
Requests that complete in an IOThread use irqfd to notify the guest
while requests that complete in the main loop thread use the traditional
qdev irq code path. The reason for this conditional is that the irq code
path requires the BQL:

  if (s->ioeventfd_started && !s->ioeventfd_disabled) {
      virtio_notify_irqfd(vdev, req->vq);
  } else {
      virtio_notify(vdev, req->vq);
  }

There is a corner case where the conditional invokes the irq code path
instead of the irqfd code path:

  static void virtio_blk_stop_ioeventfd(VirtIODevice *vdev)
  {
      ...
      /*
       * Set ->ioeventfd_started to false before draining so that host notifiers
       * are not detached/attached anymore.
       */
      s->ioeventfd_started = false;

      /* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
      blk_drain(s->conf.conf.blk);

During blk_drain() the conditional produces the wrong result because
ioeventfd_started is false.

Use qemu_in_iothread() instead of checking the ioeventfd state.

Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-15394
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240122172625.415386-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bfa36802d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup for v8.2.0-809-g3cdaf3dd4a
 "virtio-blk: rename dataplane to ioeventfd")
2024-02-12 19:26:03 +03:00
Hanna Czenczek
00e50cb429 virtio: Re-enable notifications after drain
During drain, we do not care about virtqueue notifications, which is why
we remove the handlers on it.  When removing those handlers, whether vq
notifications are enabled or not depends on whether we were in polling
mode or not; if not, they are enabled (by default); if so, they have
been disabled by the io_poll_start callback.

Because we do not care about those notifications after removing the
handlers, this is fine.  However, we have to explicitly ensure they are
enabled when re-attaching the handlers, so we will resume receiving
notifications.  We do this in virtio_queue_aio_attach_host_notifier*().
If such a function is called while we are in a polling section,
attaching the notifiers will then invoke the io_poll_start callback,
re-disabling notifications.

Because we will always miss virtqueue updates in the drained section, we
also need to poll the virtqueue once after attaching the notifiers.

Buglink: https://issues.redhat.com/browse/RHEL-3934
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-3-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5bdbaebcce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-12 19:26:03 +03:00
Hanna Czenczek
feb2073c86 virtio-scsi: Attach event vq notifier with no_poll
As of commit 38738f7dbb ("virtio-scsi:
don't waste CPU polling the event virtqueue"), we only attach an io_read
notifier for the virtio-scsi event virtqueue instead, and no polling
notifiers.  During operation, the event virtqueue is typically
non-empty, but none of the buffers are intended to be used immediately.
Instead, they only get used when certain events occur.  Therefore, it
makes no sense to continuously poll it when non-empty, because it is
supposed to be and stay non-empty.

We do this by using virtio_queue_aio_attach_host_notifier_no_poll()
instead of virtio_queue_aio_attach_host_notifier() for the event
virtqueue.

Commit 766aa2de0f ("virtio-scsi: implement
BlockDevOps->drained_begin()") however has virtio_scsi_drained_end() use
virtio_queue_aio_attach_host_notifier() for all virtqueues, including
the event virtqueue.  This can lead to it being polled again, undoing
the benefit of commit 38738f7dbb.

Fix it by using virtio_queue_aio_attach_host_notifier_no_poll() for the
event virtqueue.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Fixes: 766aa2de0f
       ("virtio-scsi: implement BlockDevOps->drained_begin()")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-2-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c42c3833e0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-12 19:26:03 +03:00
Daniel P. Berrangé
84c54eaeff iotests: give tempdir an identifying name
If something goes wrong causing the iotests not to cleanup their
temporary directory, it is useful if the dir had an identifying
name to show what is to blame.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240205155158.1843304-1-berrange@redhat.com>
Revieved-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7d2faf0ce2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-12 19:26:03 +03:00
Daniel P. Berrangé
88555e3607 iotests: fix leak of tmpdir in dry-run mode
Creating an instance of the 'TestEnv' class will create a temporary
directory. This dir is only deleted, however, in the __exit__ handler
invoked by a context manager.

In dry-run mode, we don't use the TestEnv via a context manager, so
were leaking the temporary directory. Since meson invokes 'check'
5 times on each configure run, developers /tmp was filling up with
empty temporary directories.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240205154019.1841037-1-berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c645bac4e0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-12 19:26:03 +03:00
Sven Schnelle
bbfcb0f7bc hw/scsi/lsi53c895a: add missing decrement of reentrancy counter
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.

This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.

Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240128202214.2644768-1-svens@stackframe.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8b09b7fe47)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Richard Henderson
3a970decfe linux-user/aarch64: Add padding before __kernel_rt_sigreturn
Without this padding, an unwind through the signal handler
will pick up the unwind info for the preceding syscall.

This fixes gcc's 30_threads/thread/native_handle/cancel.cc.

Cc: qemu-stable@nongnu.org
Fixes: ee95fae075 ("linux-user/aarch64: Add vdso")
Resolves: https://linaro.atlassian.net/browse/GNU-974
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240202034427.504686-1-richard.henderson@linaro.org>
(cherry picked from commit 6400be014f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Richard Henderson
8b7750c66f tcg/loongarch64: Set vector registers call clobbered
Because there are more call clobbered registers than
call saved registers, we begin with all registers as
call clobbered and then reset those that are saved.

This was missed when we introduced the LSX support.

Cc: qemu-stable@nongnu.org
Fixes: 16288ded94 ("tcg/loongarch64: Lower basic tcg vec ops to LSX")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2136
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240201233414.500588-1-richard.henderson@linaro.org>
(cherry picked from commit 45bf0e7aa6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Guenter Roeck
5f5e30229e pci-host: designware: Limit value range of iATU viewport register
The latest version of qemu (v8.2.0-869-g7a1dc45af5) crashes when booting
the mcimx7d-sabre emulation with Linux v5.11 and later.

qemu-system-arm: ../system/memory.c:2750: memory_region_set_alias_offset: Assertion `mr->alias' failed.

Problem is that the Designware PCIe emulation accepts the full value range
for the iATU Viewport Register. However, both hardware and emulation only
support four inbound and four outbound viewports.

The Linux kernel determines the number of supported viewports by writing
0xff into the viewport register and reading the value back. The expected
value when reading the register is the highest supported viewport index.
Match that code by masking the supported viewport value range when the
register is written. With this change, the Linux kernel reports

imx6q-pcie 33800000.pcie: iATU: unroll F, 4 ob, 4 ib, align 0K, limit 4G

as expected and supported.

Fixes: d64e5eabc4 ("pci: Add support for Designware IP block")
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Nikita Ostrenkov <n.ostrenkov@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20240129060055.2616989-1-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 8a73152020)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Peter Maydell
de6992d390 target/arm: Reinstate "vfp" property on AArch32 CPUs
In commit 4315f7c614 we restructured the logic for creating the
VFP related properties to avoid testing the aa32_simd_r32 feature on
AArch64 CPUs.  However in the process we accidentally stopped
exposing the "vfp" QOM property on AArch32 TCG CPUs.

This mostly hasn't had any ill effects because not many people want
to disable VFP, but it wasn't intentional.  Reinstate the property.

Cc: qemu-stable@nongnu.org
Fixes: 4315f7c614 ("target/arm: Restructure has_vfp_d32 test")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2098
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240126193432.2210558-1-peter.maydell@linaro.org
(cherry picked from commit 185e3fdf8d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Peter Maydell
2d0530abe2 qemu-options.hx: Improve -serial option documentation
The -serial option documentation is a bit brief about '-serial none'
and '-serial null'. In particular it's not very clear about the
difference between them, and it doesn't mention that it's up to
the machine model whether '-serial none' means "don't create the
serial port" or "don't wire the serial port up to anything".

Expand on these points.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240122163607.459769-3-peter.maydell@linaro.org
(cherry picked from commit 747bfaf3a9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Peter Maydell
e2a12fa4e7 system/vl.c: Fix handling of '-serial none -serial something'
Currently if the user passes multiple -serial options on the command
line, we mostly treat those as applying to the different serial
devices in order, so that for example
 -serial stdio -serial file:filename
will connect the first serial port to stdio and the second to the
named file.

The exception to this is the '-serial none' serial device type.  This
means "don't allocate this serial device", but a bug means that
following -serial options are not correctly handled, so that
 -serial none -serial stdio
has the unexpected effect that stdio is connected to the first serial
port, not the second.

This is a very long-standing bug that dates back at least as far as
commit 998bbd74b9 from 2009.

Make the 'none' serial type move forward in the indexing of serial
devices like all the other serial types, so that any subsequent
-serial options are correctly handled.

Note that if your commandline mistakenly had a '-serial none' that
was being overridden by a following '-serial something' option, you
should delete the unnecessary '-serial none'.  This will give you the
same behaviour as before, on QEMU versions both with and without this
bug fix.

Cc: qemu-stable@nongnu.org
Reported-by: Bohdan Kostiv <bohdan.kostiv@tii.ae>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240122163607.459769-2-peter.maydell@linaro.org
Fixes: 998bbd74b9 ("default devices: core code & serial lines")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit d2019a9d0c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Jan Klötzke
35a60a20f0 target/arm: fix exception syndrome for AArch32 bkpt insn
Debug exceptions that target AArch32 Hyp mode are reported differently
than on AAarch64. Internally, Qemu uses the AArch64 syndromes. Therefore
such exceptions need to be either converted to a prefetch abort
(breakpoints, vector catch) or a data abort (watchpoints).

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Klötzke <jan.kloetzke@kernkonzept.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240127202758.3326381-1-jan.kloetzke@kernkonzept.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit f670be1aad)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Richard W.M. Jones
b91715588a block/blkio: Make s->mem_region_alignment be 64 bits
With GCC 14 the code failed to compile on i686 (and was wrong for any
version of GCC):

../block/blkio.c: In function ‘blkio_file_open’:
../block/blkio.c:857:28: error: passing argument 3 of ‘blkio_get_uint64’ from incompatible pointer type [-Wincompatible-pointer-types]
  857 |                            &s->mem_region_alignment);
      |                            ^~~~~~~~~~~~~~~~~~~~~~~~
      |                            |
      |                            size_t * {aka unsigned int *}
In file included from ../block/blkio.c:12:
/usr/include/blkio.h:49:67: note: expected ‘uint64_t *’ {aka ‘long long unsigned int *’} but argument is of type ‘size_t *’ {aka ‘unsigned int *’}
   49 | int blkio_get_uint64(struct blkio *b, const char *name, uint64_t *value);
      |                                                         ~~~~~~~~~~^~~~~

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20240130122006.2977938-1-rjones@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 615eaeab3d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09 10:44:49 +03:00
Yihuan Pan
84c9704b8e qemu-docs: Update options for graphical frontends
The command line options `-ctrl-grab` and `-alt-grab` have been removed
in QEMU 7.1. Instead, use the `-display sdl,grab-mod=<modifiers>` option
to specify the grab modifiers.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2103
Signed-off-by: Yihuan Pan <xun794@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit db101376af)
2024-02-01 14:21:37 +03:00
Het Gala
3837e6dd1e Make 'uri' optional for migrate QAPI
'uri' argument should be optional, as 'uri' and 'channels'
arguments are mutally exclusive in nature.

Fixes: 074dbce5fc (migration: New migrate and migrate-incoming argument 'channels')
Signed-off-by: Het Gala <het.gala@nutanix.com>
Link: https://lore.kernel.org/r/20240123064219.40514-1-het.gala@nutanix.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 57fd4b4e10)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-29 23:26:19 +03:00
Cédric Le Goater
b79a2ef0d4 vfio/pci: Clear MSI-X IRQ index always
When doing device assignment of a physical device, MSI-X can be
enabled with no vectors enabled and this sets the IRQ index to
VFIO_PCI_MSIX_IRQ_INDEX. However, when MSI-X is disabled, the IRQ
index is left untouched if no vectors are in use. Then, when INTx
is enabled, the IRQ index value is considered incompatible (set to
MSI-X) and VFIO_DEVICE_SET_IRQS fails. QEMU complains with :

qemu-system-x86_64: vfio 0000:08:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Invalid argument

To avoid that, unconditionaly clear the IRQ index when MSI-X is
disabled.

Buglink: https://issues.redhat.com/browse/RHEL-21293
Fixes: 5ebffa4e87 ("vfio/pci: use an invalid fd to enable MSI-X")
Cc: Jing Liu <jing2.liu@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit d2b668fca5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-29 23:08:46 +03:00
Fabiano Rosas
106aa13c5b migration: Fix use-after-free of migration state object
We're currently allowing the process_incoming_migration_bh bottom-half
to run without holding a reference to the 'current_migration' object,
which leads to a segmentation fault if the BH is still live after
migration_shutdown() has dropped the last reference to
current_migration.

In my system the bug manifests as migrate_multifd() returning true
when it shouldn't and multifd_load_shutdown() calling
multifd_recv_terminate_threads() which crashes due to an uninitialized
multifd_recv_state.

Fix the issue by holding a reference to the object when scheduling the
BH and dropping it before returning from the BH. The same is already
done for the cleanup_bh at migrate_fd_cleanup_schedule().

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1969
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 27eb8499ed)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-29 23:06:53 +03:00
Markus Armbruster
e589e5ade7 migration: Plug memory leak on HMP migrate error path
hmp_migrate() leaks @caps when qmp_migrate() fails.  Plug the leak
with g_autoptr().

Fixes: 967f2de5c9 (migration: Implement MigrateChannelList to hmp migration flow.) v8.2.0-rc0
Fixes: CID 1533125
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/20240117140722.3979657-1-armbru@redhat.com
[peterx: fix CID number as reported by Peter Maydell]
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 918f620d30)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-29 23:00:39 +03:00
290 changed files with 12546 additions and 1401 deletions

View File

@@ -21,7 +21,7 @@ build_task:
install_script:
- @UPDATE_COMMAND@
- @INSTALL_COMMAND@ @PKGS@
- if test -n "@PYPI_PKGS@" ; then @PIP3@ install @PYPI_PKGS@ ; fi
- if test -n "@PYPI_PKGS@" ; then PYLIB=$(@PYTHON@ -c 'import sysconfig; print(sysconfig.get_path("stdlib"))'); rm -f $PYLIB/EXTERNALLY-MANAGED; @PIP3@ install @PYPI_PKGS@ ; fi
clone_script:
- git clone --depth 100 "$CI_REPOSITORY_URL" .
- git fetch origin "$CI_COMMIT_REF_NAME"

View File

@@ -88,7 +88,6 @@
$MINGW_TARGET-libpng
$MINGW_TARGET-libssh
$MINGW_TARGET-libtasn1
$MINGW_TARGET-libusb
$MINGW_TARGET-lzo2
$MINGW_TARGET-nettle
$MINGW_TARGET-ninja
@@ -98,9 +97,8 @@
$MINGW_TARGET-SDL2
$MINGW_TARGET-SDL2_image
$MINGW_TARGET-snappy
$MINGW_TARGET-spice
$MINGW_TARGET-usbredir
$MINGW_TARGET-zstd "
$MINGW_TARGET-zstd
$EXTRA_PACKAGES "
- Write-Output "Running build at $(Get-Date -Format u)"
- $env:CHERE_INVOKING = 'yes' # Preserve the current working directory
- $env:MSYS = 'winsymlinks:native' # Enable native Windows symlink
@@ -123,6 +121,8 @@ msys2-64bit:
variables:
MINGW_TARGET: mingw-w64-x86_64
MSYSTEM: MINGW64
# msys2 only ship these packages for 64-bit, not 32-bit
EXTRA_PACKAGES: $MINGW_TARGET-libusb $MINGW_TARGET-usbredir $MINGW_TARGET-spice
# do not remove "--without-default-devices"!
# commit 9f8e6cad65a6 ("gitlab-ci: Speed up the msys2-64bit job by using --without-default-devices"
# changed to compile QEMU with the --without-default-devices switch
@@ -131,11 +131,3 @@ msys2-64bit:
# qTests don't run successfully with "--without-default-devices",
# so let's exclude the qtests from CI for now.
TEST_ARGS: --no-suite qtest
msys2-32bit:
extends: .shared_msys2_builder
variables:
MINGW_TARGET: mingw-w64-i686
MSYSTEM: MINGW32
CONFIGURE_ARGS: --target-list=ppc64-softmmu -Ddebug=false -Doptimization=0
TEST_ARGS: --no-suite qtest

25
.gitmodules vendored
View File

@@ -1,12 +1,12 @@
[submodule "roms/seabios"]
path = roms/seabios
url = https://gitlab.com/qemu-project/seabios.git/
url = https://github.com/openSUSE/qemu-seabios.git
[submodule "roms/SLOF"]
path = roms/SLOF
url = https://gitlab.com/qemu-project/SLOF.git
url = https://github.com/openSUSE/qemu-SLOF.git
[submodule "roms/ipxe"]
path = roms/ipxe
url = https://gitlab.com/qemu-project/ipxe.git
url = https://github.com/openSUSE/qemu-ipxe.git
[submodule "roms/openbios"]
path = roms/openbios
url = https://gitlab.com/qemu-project/openbios.git
@@ -18,7 +18,7 @@
url = https://gitlab.com/qemu-project/u-boot.git
[submodule "roms/skiboot"]
path = roms/skiboot
url = https://gitlab.com/qemu-project/skiboot.git
url = https://github.com/openSUSE/qemu-skiboot.git
[submodule "roms/QemuMacDrivers"]
path = roms/QemuMacDrivers
url = https://gitlab.com/qemu-project/QemuMacDrivers.git
@@ -36,10 +36,25 @@
url = https://gitlab.com/qemu-project/opensbi.git
[submodule "roms/qboot"]
path = roms/qboot
url = https://gitlab.com/qemu-project/qboot.git
url = https://github.com/openSUSE/qemu-qboot.git
[submodule "roms/vbootrom"]
path = roms/vbootrom
url = https://gitlab.com/qemu-project/vbootrom.git
[submodule "tests/lcitool/libvirt-ci"]
path = tests/lcitool/libvirt-ci
url = https://gitlab.com/libvirt/libvirt-ci.git
[submodule "subprojects/berkeley-softfloat-3"]
path = subprojects/berkeley-softfloat-3
url = https://gitlab.com/qemu-project/berkeley-softfloat-3
[submodule "subprojects/berkeley-testfloat-3"]
path = subprojects/berkeley-testfloat-3
url = https://gitlab.com/qemu-project/berkeley-testfloat-3
[submodule "subprojects/dtc"]
path = subprojects/dtc
url = https://gitlab.com/qemu-project/dtc.git
[submodule "subprojects/libvfio-user"]
path = subprojects/libvfio-user
url = https://gitlab.com/qemu-project/libvfio-user.git
[submodule "subprojects/keycodemapdb"]
path = subprojects/keycodemapdb
url = https://gitlab.com/qemu-project/keycodemapdb.git

22
.obs/workflows.yml Normal file
View File

@@ -0,0 +1,22 @@
pr_workflow:
steps:
- branch_package:
source_project: Virtualization:Staging
source_package: qemu
target_project: Virtualization:Staging:PRs
filters:
event: pull_request
branches:
only:
- factory
rebuild_workflow:
steps:
# Will automatically rebuild the package
- trigger_services:
project: Virtualization:Staging
package: qemu
filters:
event: push
branches:
only:
- factory

View File

@@ -1 +1 @@
8.2.1
8.2.2

View File

@@ -69,16 +69,6 @@
#define KVM_GUESTDBG_BLOCKIRQ 0
#endif
//#define DEBUG_KVM
#ifdef DEBUG_KVM
#define DPRINTF(fmt, ...) \
do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
#else
#define DPRINTF(fmt, ...) \
do { } while (0)
#endif
struct KVMParkedVcpu {
unsigned long vcpu_id;
int kvm_fd;
@@ -101,6 +91,8 @@ bool kvm_msi_use_devid;
bool kvm_has_guest_debug;
static int kvm_sstep_flags;
static bool kvm_immediate_exit;
static bool kvm_guest_memfd_supported;
static uint64_t kvm_supported_memory_attributes;
static hwaddr kvm_max_slot_size = ~0;
static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -292,34 +284,69 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram,
static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot, bool new)
{
KVMState *s = kvm_state;
struct kvm_userspace_memory_region mem;
struct kvm_userspace_memory_region2 mem;
static int cap_user_memory2 = -1;
int ret;
if (cap_user_memory2 == -1) {
cap_user_memory2 = kvm_check_extension(s, KVM_CAP_USER_MEMORY2);
}
if (!cap_user_memory2 && slot->guest_memfd >= 0) {
error_report("%s, KVM doesn't support KVM_CAP_USER_MEMORY2,"
" which is required by guest memfd!", __func__);
exit(1);
}
mem.slot = slot->slot | (kml->as_id << 16);
mem.guest_phys_addr = slot->start_addr;
mem.userspace_addr = (unsigned long)slot->ram;
mem.flags = slot->flags;
mem.guest_memfd = slot->guest_memfd;
mem.guest_memfd_offset = slot->guest_memfd_offset;
if (slot->memory_size && !new && (mem.flags ^ slot->old_flags) & KVM_MEM_READONLY) {
/* Set the slot size to 0 before setting the slot to the desired
* value. This is needed based on KVM commit 75d61fbc. */
mem.memory_size = 0;
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
if (cap_user_memory2) {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
} else {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
}
if (ret < 0) {
goto err;
}
}
mem.memory_size = slot->memory_size;
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
if (cap_user_memory2) {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
} else {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
}
slot->old_flags = mem.flags;
err:
trace_kvm_set_user_memory(mem.slot, mem.flags, mem.guest_phys_addr,
mem.memory_size, mem.userspace_addr, ret);
trace_kvm_set_user_memory(mem.slot >> 16, (uint16_t)mem.slot, mem.flags,
mem.guest_phys_addr, mem.memory_size,
mem.userspace_addr, mem.guest_memfd,
mem.guest_memfd_offset, ret);
if (ret < 0) {
error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
__func__, mem.slot, slot->start_addr,
(uint64_t)mem.memory_size, strerror(errno));
if (cap_user_memory2) {
error_report("%s: KVM_SET_USER_MEMORY_REGION2 failed, slot=%d,"
" start=0x%" PRIx64 ", size=0x%" PRIx64 ","
" flags=0x%" PRIx32 ", guest_memfd=%" PRId32 ","
" guest_memfd_offset=0x%" PRIx64 ": %s",
__func__, mem.slot, slot->start_addr,
(uint64_t)mem.memory_size, mem.flags,
mem.guest_memfd, (uint64_t)mem.guest_memfd_offset,
strerror(errno));
} else {
error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
__func__, mem.slot, slot->start_addr,
(uint64_t)mem.memory_size, strerror(errno));
}
}
return ret;
}
@@ -331,7 +358,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
struct KVMParkedVcpu *vcpu = NULL;
int ret = 0;
DPRINTF("kvm_destroy_vcpu\n");
trace_kvm_destroy_vcpu();
ret = kvm_arch_destroy_vcpu(cpu);
if (ret < 0) {
@@ -341,7 +368,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
if (mmap_size < 0) {
ret = mmap_size;
DPRINTF("KVM_GET_VCPU_MMAP_SIZE failed\n");
trace_kvm_failed_get_vcpu_mmap_size();
goto err;
}
@@ -391,6 +418,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
}
int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
{
return 0;
}
int kvm_init_vcpu(CPUState *cpu, Error **errp)
{
KVMState *s = kvm_state;
@@ -399,15 +431,27 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
/*
* tdx_pre_create_vcpu() may call cpu_x86_cpuid(). It in turn may call
* kvm_vm_ioctl(). Set cpu->kvm_state in advance to avoid NULL pointer
* dereference.
*/
cpu->kvm_state = s;
ret = kvm_arch_pre_create_vcpu(cpu, errp);
if (ret < 0) {
cpu->kvm_state = NULL;
goto err;
}
ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
if (ret < 0) {
error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
kvm_arch_vcpu_id(cpu));
cpu->kvm_state = NULL;
goto err;
}
cpu->kvm_fd = ret;
cpu->kvm_state = s;
cpu->vcpu_dirty = true;
cpu->dirty_pages = 0;
cpu->throttle_us_per_full = 0;
@@ -443,7 +487,6 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
PAGE_SIZE * KVM_DIRTY_LOG_PAGE_OFFSET);
if (cpu->kvm_dirty_gfns == MAP_FAILED) {
ret = -errno;
DPRINTF("mmap'ing vcpu dirty gfns failed: %d\n", ret);
goto err;
}
}
@@ -475,6 +518,9 @@ static int kvm_mem_flags(MemoryRegion *mr)
if (readonly && kvm_readonly_mem_allowed) {
flags |= KVM_MEM_READONLY;
}
if (memory_region_has_guest_memfd(mr)) {
flags |= KVM_MEM_GUEST_MEMFD;
}
return flags;
}
@@ -1130,6 +1176,11 @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
return ret;
}
/*
* We track the poisoned pages to be able to:
* - replace them on VM reset
* - block a migration for a VM with a poisoned page
*/
typedef struct HWPoisonPage {
ram_addr_t ram_addr;
QLIST_ENTRY(HWPoisonPage) list;
@@ -1163,6 +1214,11 @@ void kvm_hwpoison_page_add(ram_addr_t ram_addr)
QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
}
bool kvm_hwpoisoned_mem(void)
{
return !QLIST_EMPTY(&hwpoison_page_list);
}
static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
{
#if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN
@@ -1266,6 +1322,46 @@ void kvm_set_max_memslot_size(hwaddr max_slot_size)
kvm_max_slot_size = max_slot_size;
}
static int kvm_set_memory_attributes(hwaddr start, hwaddr size, uint64_t attr)
{
struct kvm_memory_attributes attrs;
int r;
if (kvm_supported_memory_attributes == 0) {
error_report("No memory attribute supported by KVM\n");
return -EINVAL;
}
if ((attr & kvm_supported_memory_attributes) != attr) {
error_report("memory attribute 0x%lx not supported by KVM,"
" supported bits are 0x%lx\n",
attr, kvm_supported_memory_attributes);
return -EINVAL;
}
attrs.attributes = attr;
attrs.address = start;
attrs.size = size;
attrs.flags = 0;
r = kvm_vm_ioctl(kvm_state, KVM_SET_MEMORY_ATTRIBUTES, &attrs);
if (r) {
error_report("failed to set memory (0x%lx+%#zx) with attr 0x%lx error '%s'",
start, size, attr, strerror(errno));
}
return r;
}
int kvm_set_memory_attributes_private(hwaddr start, hwaddr size)
{
return kvm_set_memory_attributes(start, size, KVM_MEMORY_ATTRIBUTE_PRIVATE);
}
int kvm_set_memory_attributes_shared(hwaddr start, hwaddr size)
{
return kvm_set_memory_attributes(start, size, 0);
}
/* Called with KVMMemoryListener.slots_lock held */
static void kvm_set_phys_mem(KVMMemoryListener *kml,
MemoryRegionSection *section, bool add)
@@ -1362,6 +1458,9 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
mem->ram_start_offset = ram_start_offset;
mem->ram = ram;
mem->flags = kvm_mem_flags(mr);
mem->guest_memfd = mr->ram_block->guest_memfd;
mem->guest_memfd_offset = (uint8_t*)ram - mr->ram_block->host;
kvm_slot_init_dirty_bitmap(mem);
err = kvm_set_user_memory_region(kml, mem, true);
if (err) {
@@ -1369,6 +1468,16 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
strerror(-err));
abort();
}
if (memory_region_has_guest_memfd(mr)) {
err = kvm_set_memory_attributes_private(start_addr, slot_size);
if (err) {
error_report("%s: failed to set memory attribute private: %s\n",
__func__, strerror(-err));
exit(1);
}
}
start_addr += slot_size;
ram_start_offset += slot_size;
ram += slot_size;
@@ -2396,6 +2505,11 @@ static int kvm_init(MachineState *ms)
}
s->as = g_new0(struct KVMAs, s->nr_as);
kvm_guest_memfd_supported = kvm_check_extension(s, KVM_CAP_GUEST_MEMFD);
ret = kvm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
kvm_supported_memory_attributes = ret > 0 ? ret : 0;
if (object_property_find(OBJECT(current_machine), "kvm-type")) {
g_autofree char *kvm_type = object_property_get_str(OBJECT(current_machine),
"kvm-type",
@@ -2816,12 +2930,101 @@ static void kvm_eat_signals(CPUState *cpu)
} while (sigismember(&chkset, SIG_IPI));
}
int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
{
MemoryRegionSection section;
ram_addr_t offset;
MemoryRegion *mr;
RAMBlock *rb;
void *addr;
int ret = -1;
trace_kvm_convert_memory(start, size, to_private ? "shared_to_private" : "private_to_shared");
if (!QEMU_PTR_IS_ALIGNED(start, qemu_host_page_size) ||
!QEMU_PTR_IS_ALIGNED(size, qemu_host_page_size)) {
return -1;
}
if (!size) {
return -1;
}
section = memory_region_find(get_system_memory(), start, size);
mr = section.mr;
if (!mr) {
/*
* Ignore converting non-assigned region to shared.
*
* TDX requires vMMIO region to be shared to inject #VE to guest.
* OVMF issues conservatively MapGPA(shared) on 32bit PCI MMIO region,
* and vIO-APIC 0xFEC00000 4K page.
* OVMF assigns 32bit PCI MMIO region to
* [top of low memory: typically 2GB=0xC000000, 0xFC00000)
*/
if (!to_private) {
return 0;
}
return -1;
}
if (memory_region_has_guest_memfd(mr)) {
if (to_private) {
ret = kvm_set_memory_attributes_private(start, size);
} else {
ret = kvm_set_memory_attributes_shared(start, size);
}
if (ret) {
memory_region_unref(section.mr);
return ret;
}
addr = memory_region_get_ram_ptr(mr) + section.offset_within_region;
rb = qemu_ram_block_from_host(addr, false, &offset);
if (to_private) {
if (rb->page_size != qemu_host_page_size) {
/*
* shared memory is back'ed by hugetlb, which is supposed to be
* pre-allocated and doesn't need to be discarded
*/
return 0;
} else {
ret = ram_block_discard_range(rb, offset, size);
}
} else {
ret = ram_block_discard_guest_memfd_range(rb, offset, size);
}
} else {
/*
* Because vMMIO region must be shared, guest TD may convert vMMIO
* region to shared explicitly. Don't complain such case. See
* memory_region_type() for checking if the region is MMIO region.
*/
if (!to_private &&
!memory_region_is_ram(mr) &&
!memory_region_is_ram_device(mr) &&
!memory_region_is_rom(mr) &&
!memory_region_is_romd(mr)) {
ret = 0;
} else {
error_report("Convert non guest_memfd backed memory region "
"(0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
start, size, to_private ? "private" : "shared");
}
}
memory_region_unref(section.mr);
return ret;
}
int kvm_cpu_exec(CPUState *cpu)
{
struct kvm_run *run = cpu->kvm_run;
int ret, run_ret;
DPRINTF("kvm_cpu_exec()\n");
trace_kvm_cpu_exec();
if (kvm_arch_process_async_events(cpu)) {
qatomic_set(&cpu->exit_request, 0);
@@ -2848,7 +3051,7 @@ int kvm_cpu_exec(CPUState *cpu)
kvm_arch_pre_run(cpu, run);
if (qatomic_read(&cpu->exit_request)) {
DPRINTF("interrupt exit requested\n");
trace_kvm_interrupt_exit_request();
/*
* KVM requires us to reenter the kernel after IO exits to complete
* instruction emulation. This self-signal will ensure that we
@@ -2878,29 +3081,30 @@ int kvm_cpu_exec(CPUState *cpu)
if (run_ret < 0) {
if (run_ret == -EINTR || run_ret == -EAGAIN) {
DPRINTF("io window exit\n");
trace_kvm_io_window_exit();
kvm_eat_signals(cpu);
ret = EXCP_INTERRUPT;
break;
}
fprintf(stderr, "error: kvm run failed %s\n",
strerror(-run_ret));
if (!(run_ret == -EFAULT && run->exit_reason == KVM_EXIT_MEMORY_FAULT)) {
fprintf(stderr, "error: kvm run failed %s\n",
strerror(-run_ret));
#ifdef TARGET_PPC
if (run_ret == -EBUSY) {
fprintf(stderr,
"This is probably because your SMT is enabled.\n"
"VCPU can only run on primary threads with all "
"secondary threads offline.\n");
}
if (run_ret == -EBUSY) {
fprintf(stderr,
"This is probably because your SMT is enabled.\n"
"VCPU can only run on primary threads with all "
"secondary threads offline.\n");
}
#endif
ret = -1;
break;
ret = -1;
break;
}
}
trace_kvm_run_exit(cpu->cpu_index, run->exit_reason);
switch (run->exit_reason) {
case KVM_EXIT_IO:
DPRINTF("handle_io\n");
/* Called outside BQL */
kvm_handle_io(run->io.port, attrs,
(uint8_t *)run + run->io.data_offset,
@@ -2910,7 +3114,6 @@ int kvm_cpu_exec(CPUState *cpu)
ret = 0;
break;
case KVM_EXIT_MMIO:
DPRINTF("handle_mmio\n");
/* Called outside BQL */
address_space_rw(&address_space_memory,
run->mmio.phys_addr, attrs,
@@ -2920,11 +3123,9 @@ int kvm_cpu_exec(CPUState *cpu)
ret = 0;
break;
case KVM_EXIT_IRQ_WINDOW_OPEN:
DPRINTF("irq_window_open\n");
ret = EXCP_INTERRUPT;
break;
case KVM_EXIT_SHUTDOWN:
DPRINTF("shutdown\n");
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
ret = EXCP_INTERRUPT;
break;
@@ -2959,6 +3160,7 @@ int kvm_cpu_exec(CPUState *cpu)
ret = 0;
break;
case KVM_EXIT_SYSTEM_EVENT:
trace_kvm_run_exit_system_event(cpu->cpu_index, run->system_event.type);
switch (run->system_event.type) {
case KVM_SYSTEM_EVENT_SHUTDOWN:
qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
@@ -2976,13 +3178,21 @@ int kvm_cpu_exec(CPUState *cpu)
ret = 0;
break;
default:
DPRINTF("kvm_arch_handle_exit\n");
ret = kvm_arch_handle_exit(cpu, run);
break;
}
break;
case KVM_EXIT_MEMORY_FAULT:
if (run->memory_fault.flags & ~KVM_MEMORY_EXIT_FLAG_PRIVATE) {
error_report("KVM_EXIT_MEMORY_FAULT: Unknown flag 0x%" PRIx64,
(uint64_t)run->memory_fault.flags);
ret = -1;
break;
}
ret = kvm_convert_memory(run->memory_fault.gpa, run->memory_fault.size,
run->memory_fault.flags & KVM_MEMORY_EXIT_FLAG_PRIVATE);
break;
default:
DPRINTF("kvm_arch_handle_exit\n");
ret = kvm_arch_handle_exit(cpu, run);
break;
}
@@ -4077,3 +4287,25 @@ void query_stats_schemas_cb(StatsSchemaList **result, Error **errp)
query_stats_schema_vcpu(first_cpu, &stats_args);
}
}
int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
{
int fd;
struct kvm_create_guest_memfd guest_memfd = {
.size = size,
.flags = flags,
};
if (!kvm_guest_memfd_supported) {
error_setg(errp, "KVM doesn't support guest memfd\n");
return -1;
}
fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
if (fd < 0) {
error_setg_errno(errp, errno, "Error creating kvm guest memfd");
return -1;
}
return fd;
}

View File

@@ -15,7 +15,7 @@ kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
kvm_irqchip_release_virq(int virq) "virq %d"
kvm_set_ioeventfd_mmio(int fd, uint64_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) "fd: %d @0x%" PRIx64 " val=0x%x assign: %d size: %d match: %d"
kvm_set_ioeventfd_pio(int fd, uint16_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) "fd: %d @0x%x val=0x%x assign: %d size: %d match: %d"
kvm_set_user_memory(uint32_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, int ret) "Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " ret=%d"
kvm_set_user_memory(uint16_t as, uint16_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, uint32_t fd, uint64_t fd_offset, int ret) "AddrSpace#%d Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " guest_memfd=%d" " guest_memfd_offset=0x%" PRIx64 " ret=%d"
kvm_clear_dirty_log(uint32_t slot, uint64_t start, uint32_t size) "slot#%"PRId32" start 0x%"PRIx64" size 0x%"PRIx32
kvm_resample_fd_notify(int gsi) "gsi %d"
kvm_dirty_ring_full(int id) "vcpu %d"
@@ -25,4 +25,10 @@ kvm_dirty_ring_reaper(const char *s) "%s"
kvm_dirty_ring_reap(uint64_t count, int64_t t) "reaped %"PRIu64" pages (took %"PRIi64" us)"
kvm_dirty_ring_reaper_kick(const char *reason) "%s"
kvm_dirty_ring_flush(int finished) "%d"
kvm_destroy_vcpu(void) ""
kvm_failed_get_vcpu_mmap_size(void) ""
kvm_cpu_exec(void) ""
kvm_interrupt_exit_request(void) ""
kvm_io_window_exit(void) ""
kvm_run_exit_system_event(int cpu_index, uint32_t event_type) "cpu_index %d, system_even_type %"PRIu32
kvm_convert_memory(uint64_t start, uint64_t size, const char *msg) "start 0x%" PRIx64 " size 0x%" PRIx64 " %s"

View File

@@ -124,3 +124,13 @@ uint32_t kvm_dirty_ring_size(void)
{
return 0;
}
bool kvm_hwpoisoned_mem(void)
{
return false;
}
int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
{
return -ENOSYS;
}

View File

@@ -30,7 +30,8 @@ endforeach
if dbus_display
module_ss = ss.source_set()
module_ss.add(when: [gio, pixman], if_true: files('dbusaudio.c'))
module_ss.add(when: [gio, dbus_display1_dep, pixman],
if_true: files('dbusaudio.c'))
audio_modules += {'dbus': module_ss}
endif

View File

@@ -17,31 +17,29 @@
#include "sysemu/hostmem.h"
#include "hw/i386/hostmem-epc.h"
static void
static bool
sgx_epc_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
g_autofree char *name = NULL;
uint32_t ram_flags;
char *name;
int fd;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
return false;
}
fd = qemu_open_old("/dev/sgx_vepc", O_RDWR);
if (fd < 0) {
error_setg_errno(errp, errno,
"failed to open /dev/sgx_vepc to alloc SGX EPC");
return;
return false;
}
name = object_get_canonical_path(OBJECT(backend));
ram_flags = (backend->share ? RAM_SHARED : 0) | RAM_PROTECTED;
memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
name, backend->size, ram_flags,
fd, 0, errp);
g_free(name);
return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, fd, 0, errp);
}
static void sgx_epc_backend_instance_init(Object *obj)

View File

@@ -36,24 +36,25 @@ struct HostMemoryBackendFile {
OnOffAuto rom;
};
static void
static bool
file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
#ifndef CONFIG_POSIX
error_setg(errp, "backend '%s' not supported on this host",
object_get_typename(OBJECT(backend)));
return false;
#else
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
g_autofree gchar *name = NULL;
uint32_t ram_flags;
gchar *name;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
return false;
}
if (!fb->mem_path) {
error_setg(errp, "mem-path property not set");
return;
return false;
}
switch (fb->rom) {
@@ -65,18 +66,18 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
if (!fb->readonly) {
error_setg(errp, "property 'rom' = 'on' is not supported with"
" 'readonly' = 'off'");
return;
return false;
}
break;
case ON_OFF_AUTO_OFF:
if (fb->readonly && backend->share) {
error_setg(errp, "property 'rom' = 'off' is incompatible with"
" 'readonly' = 'on' and 'share' = 'on'");
return;
return false;
}
break;
default:
assert(false);
g_assert_not_reached();
}
name = host_memory_backend_get_name(backend);
@@ -84,12 +85,12 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
ram_flags |= RAM_NAMED_FILE;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
backend->size, fb->align, ram_flags,
fb->mem_path, fb->offset, errp);
g_free(name);
return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
backend->size, fb->align, ram_flags,
fb->mem_path, fb->offset, errp);
#endif
}

View File

@@ -31,17 +31,17 @@ struct HostMemoryBackendMemfd {
bool seal;
};
static void
static bool
memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(backend);
g_autofree char *name = NULL;
uint32_t ram_flags;
char *name;
int fd;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
return false;
}
fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
@@ -49,15 +49,15 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
errp);
if (fd == -1) {
return;
return false;
}
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, fd, 0, errp);
g_free(name);
ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, fd, 0, errp);
}
static bool

View File

@@ -16,23 +16,24 @@
#include "qemu/module.h"
#include "qom/object_interfaces.h"
static void
static bool
ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
g_autofree char *name = NULL;
uint32_t ram_flags;
char *name;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
return false;
}
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, errp);
g_free(name);
ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
name, backend->size,
ram_flags, errp);
}
static void

View File

@@ -219,7 +219,6 @@ static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
static void host_memory_backend_set_prealloc(Object *obj, bool value,
Error **errp)
{
Error *local_err = NULL;
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!backend->reserve && value) {
@@ -237,10 +236,8 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
backend->prealloc_context, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
backend->prealloc_context, errp)) {
return;
}
backend->prealloc = true;
@@ -279,6 +276,7 @@ static void host_memory_backend_init(Object *obj)
/* TODO: convert access to globals to compat properties */
backend->merge = machine_mem_merge(machine);
backend->dump = machine_dump_guest_core(machine);
backend->guest_memfd = machine_require_guest_memfd(machine);
backend->reserve = true;
backend->prealloc_threads = machine->smp.cpus;
}
@@ -324,91 +322,86 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(uc);
HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
Error *local_err = NULL;
void *ptr;
uint64_t sz;
if (bc->alloc) {
bc->alloc(backend, &local_err);
if (local_err) {
goto out;
}
if (!bc->alloc) {
return;
}
if (!bc->alloc(backend, errp)) {
return;
}
ptr = memory_region_get_ram_ptr(&backend->mr);
sz = memory_region_size(&backend->mr);
ptr = memory_region_get_ram_ptr(&backend->mr);
sz = memory_region_size(&backend->mr);
if (backend->merge) {
qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
}
if (!backend->dump) {
qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
}
if (backend->merge) {
qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
}
if (!backend->dump) {
qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
}
#ifdef CONFIG_NUMA
unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
/* lastbit == MAX_NODES means maxnode = 0 */
unsigned long maxnode = (lastbit + 1) % (MAX_NODES + 1);
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch hugepage case. */
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
int mode = backend->policy;
unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
/* lastbit == MAX_NODES means maxnode = 0 */
unsigned long maxnode = (lastbit + 1) % (MAX_NODES + 1);
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch hugepage case. */
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
int mode = backend->policy;
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */
if (maxnode && backend->policy == MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be empty for policy default,"
" or you should explicitly specify a policy other"
" than default");
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_str(backend->policy));
return;
}
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */
if (maxnode && backend->policy == MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be empty for policy default,"
" or you should explicitly specify a policy other"
" than default");
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_str(backend->policy));
return;
}
/* We can have up to MAX_NODES nodes, but we need to pass maxnode+1
* as argument to mbind() due to an old Linux bug (feature?) which
* cuts off the last specified node. This means backend->host_nodes
* must have MAX_NODES+1 bits available.
*/
assert(sizeof(backend->host_nodes) >=
BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
assert(maxnode <= MAX_NODES);
/* We can have up to MAX_NODES nodes, but we need to pass maxnode+1
* as argument to mbind() due to an old Linux bug (feature?) which
* cuts off the last specified node. This means backend->host_nodes
* must have MAX_NODES+1 bits available.
*/
assert(sizeof(backend->host_nodes) >=
BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
assert(maxnode <= MAX_NODES);
#ifdef HAVE_NUMA_HAS_PREFERRED_MANY
if (mode == MPOL_PREFERRED && numa_has_preferred_many() > 0) {
/*
* Replace with MPOL_PREFERRED_MANY otherwise the mbind() below
* silently picks the first node.
*/
mode = MPOL_PREFERRED_MANY;
}
if (mode == MPOL_PREFERRED && numa_has_preferred_many() > 0) {
/*
* Replace with MPOL_PREFERRED_MANY otherwise the mbind() below
* silently picks the first node.
*/
mode = MPOL_PREFERRED_MANY;
}
#endif
if (maxnode &&
mbind(ptr, sz, mode, backend->host_nodes, maxnode + 1, flags)) {
if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
error_setg_errno(errp, errno,
"cannot bind memory to host NUMA nodes");
return;
}
}
#endif
/* Preallocate memory after the NUMA policy has been instantiated.
* This is necessary to guarantee memory is allocated with
* specified NUMA policy in place.
*/
if (backend->prealloc) {
qemu_prealloc_mem(memory_region_get_fd(&backend->mr), ptr, sz,
backend->prealloc_threads,
backend->prealloc_context, &local_err);
if (local_err) {
goto out;
}
if (maxnode &&
mbind(ptr, sz, mode, backend->host_nodes, maxnode + 1, flags)) {
if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
error_setg_errno(errp, errno,
"cannot bind memory to host NUMA nodes");
return;
}
}
out:
error_propagate(errp, local_err);
#endif
/* Preallocate memory after the NUMA policy has been instantiated.
* This is necessary to guarantee memory is allocated with
* specified NUMA policy in place.
*/
if (backend->prealloc && !qemu_prealloc_mem(memory_region_get_fd(&backend->mr),
ptr, sz,
backend->prealloc_threads,
backend->prealloc_context, errp)) {
return;
}
}
static bool

View File

@@ -6344,7 +6344,7 @@ BlockDeviceInfoList *bdrv_named_nodes_list(bool flat,
BlockDriverState *bs;
GLOBAL_STATE_CODE();
GRAPH_RDLOCK_GUARD_MAINLOOP();
GRAPH_RDLOCK_GUARD();
list = NULL;
QTAILQ_FOREACH(bs, &graph_bdrv_states, node_list) {

View File

@@ -68,7 +68,7 @@ typedef struct {
CoQueue bounce_available;
/* The value of the "mem-region-alignment" property */
size_t mem_region_alignment;
uint64_t mem_region_alignment;
/* Can we skip adding/deleting blkio_mem_regions? */
bool needs_mem_regions;

View File

@@ -226,6 +226,9 @@ typedef struct RawPosixAIOData {
struct {
unsigned long op;
} zone_mgmt;
struct {
struct stat *st;
} fstat;
};
} RawPosixAIOData;
@@ -2613,6 +2616,34 @@ static void raw_close(BlockDriverState *bs)
}
}
static int handle_aiocb_fstat(void *opaque)
{
RawPosixAIOData *aiocb = opaque;
if (fstat(aiocb->aio_fildes, aiocb->fstat.st) < 0) {
return -errno;
}
return 0;
}
static int coroutine_fn raw_co_fstat(BlockDriverState *bs, struct stat *st)
{
BDRVRawState *s = bs->opaque;
RawPosixAIOData acb;
acb = (RawPosixAIOData) {
.bs = bs,
.aio_fildes = s->fd,
.aio_type = QEMU_AIO_FSTAT,
.fstat = {
.st = st,
},
};
return raw_thread_pool_submit(handle_aiocb_fstat, &acb);
}
/**
* Truncates the given regular file @fd to @offset and, when growing, fills the
* new space according to @prealloc.
@@ -2857,11 +2888,14 @@ static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs)
static int64_t coroutine_fn raw_co_get_allocated_file_size(BlockDriverState *bs)
{
struct stat st;
BDRVRawState *s = bs->opaque;
int ret;
if (fstat(s->fd, &st) < 0) {
return -errno;
ret = raw_co_fstat(bs, &st);
if (ret) {
return ret;
}
return (int64_t)st.st_blocks * 512;
}

View File

@@ -149,6 +149,7 @@ block_gen_c = custom_target('block-gen.c',
'../include/block/dirty-bitmap.h',
'../include/block/block_int-io.h',
'../include/block/block-global-state.h',
'../include/block/qapi.h',
'../include/sysemu/block-backend-global-state.h',
'../include/sysemu/block-backend-io.h',
'coroutines.h'

View File

@@ -400,7 +400,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict)
bool writable = qdict_get_try_bool(qdict, "writable", false);
bool all = qdict_get_try_bool(qdict, "all", false);
Error *local_err = NULL;
BlockInfoList *block_list, *info;
BlockBackend *blk;
SocketAddress *addr;
NbdServerAddOptions export;
@@ -425,18 +425,24 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict)
return;
}
/* Then try adding all block devices. If one fails, close all and
/*
* Then try adding all block devices. If one fails, close all and
* exit.
*/
block_list = qmp_query_block(NULL);
for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
BlockDriverState *bs = blk_bs(blk);
for (info = block_list; info; info = info->next) {
if (!info->value->inserted) {
if (!*blk_name(blk) && !blk_get_attached_dev(blk)) {
continue;
}
bs = bdrv_skip_implicit_filters(bs);
if (!bs || !bs->drv) {
continue;
}
export = (NbdServerAddOptions) {
.device = info->value->device,
.device = g_strdup(blk_name(blk)),
.has_writable = true,
.writable = writable,
};
@@ -449,8 +455,6 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict)
}
}
qapi_free_BlockInfoList(block_list);
exit:
hmp_handle_error(mon, local_err);
}
@@ -744,7 +748,7 @@ static void print_block_info(Monitor *mon, BlockInfo *info,
}
}
void hmp_info_block(Monitor *mon, const QDict *qdict)
void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict)
{
BlockInfoList *block_list, *info;
BlockDeviceInfoList *blockdev_list, *blockdev;

View File

@@ -41,10 +41,10 @@
#include "qemu/qemu-print.h"
#include "sysemu/block-backend.h"
BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
BlockDriverState *bs,
bool flat,
Error **errp)
BlockDeviceInfo *coroutine_fn bdrv_block_device_info(BlockBackend *blk,
BlockDriverState *bs,
bool flat,
Error **errp)
{
ImageInfo **p_image_info;
ImageInfo *backing_info;
@@ -234,8 +234,6 @@ bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp)
int ret;
Error *err = NULL;
aio_context_acquire(bdrv_get_aio_context(bs));
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Can't get image size '%s'",
@@ -248,7 +246,9 @@ bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp)
info->filename = g_strdup(bs->filename);
info->format = g_strdup(bdrv_get_format_name(bs));
info->virtual_size = size;
info->actual_size = bdrv_get_allocated_file_size(bs);
bdrv_graph_co_rdlock();
info->actual_size = bdrv_co_get_allocated_file_size(bs);
bdrv_graph_co_rdunlock();
info->has_actual_size = info->actual_size >= 0;
if (bs->encrypted) {
info->encrypted = true;
@@ -304,7 +304,7 @@ bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp)
}
out:
aio_context_release(bdrv_get_aio_context(bs));
return;
}
/**
@@ -374,7 +374,7 @@ fail:
}
/**
* bdrv_query_block_graph_info:
* bdrv_co_query_block_graph_info:
* @bs: root node to start from
* @p_info: location to store image information
* @errp: location to store error information
@@ -383,15 +383,17 @@ fail:
*
* @p_info will be set only on success. On error, store error in @errp.
*/
void bdrv_query_block_graph_info(BlockDriverState *bs,
BlockGraphInfo **p_info,
Error **errp)
void coroutine_fn bdrv_co_query_block_graph_info(BlockDriverState *bs,
BlockGraphInfo **p_info,
Error **errp)
{
BlockGraphInfo *info;
BlockChildInfoList **children_list_tail;
BdrvChild *c;
ERRP_GUARD();
assert_bdrv_graph_readable();
info = g_new0(BlockGraphInfo, 1);
bdrv_do_query_node_info(bs, qapi_BlockGraphInfo_base(info), errp);
if (*errp) {
@@ -407,7 +409,7 @@ void bdrv_query_block_graph_info(BlockDriverState *bs,
QAPI_LIST_APPEND(children_list_tail, c_info);
c_info->name = g_strdup(c->name);
bdrv_query_block_graph_info(c->bs, &c_info->info, errp);
bdrv_co_query_block_graph_info(c->bs, &c_info->info, errp);
if (*errp) {
goto fail;
}
@@ -665,13 +667,13 @@ bdrv_query_bds_stats(BlockDriverState *bs, bool blk_level)
return s;
}
BlockInfoList *qmp_query_block(Error **errp)
BlockInfoList *coroutine_fn qmp_query_block(Error **errp)
{
BlockInfoList *head = NULL, **p_next = &head;
BlockBackend *blk;
Error *local_err = NULL;
GRAPH_RDLOCK_GUARD_MAINLOOP();
GRAPH_RDLOCK_GUARD();
for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
BlockInfoList *info;

View File

@@ -389,7 +389,7 @@ int bdrv_snapshot_list(BlockDriverState *bs,
QEMUSnapshotInfo **psn_info)
{
GLOBAL_STATE_CODE();
GRAPH_RDLOCK_GUARD_MAINLOOP();
GRAPH_RDLOCK_GUARD();
BlockDriver *drv = bs->drv;
BlockDriverState *fallback_bs = bdrv_snapshot_fallback(bs);

View File

@@ -2871,9 +2871,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp)
blockdev_do_action(&action, errp);
}
BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
bool flat,
Error **errp)
BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat,
bool flat,
Error **errp)
{
bool return_flat = has_flat && flat;

View File

@@ -21,6 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#define HW_POISON_H /* avoid poison since we patch against rules it "enforces" */
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "qapi/error.h"

View File

@@ -22,6 +22,7 @@
* THE SOFTWARE.
*/
#define HW_POISON_H /* avoid poison since we patch against rules it "enforces" */
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/module.h"
@@ -198,6 +199,17 @@ static void mux_chr_accept_input(Chardev *chr)
be->chr_read(be->opaque,
&d->buffer[m][d->cons[m]++ & MUX_BUFFER_MASK], 1);
}
#if defined(TARGET_S390X)
/*
* We're still not able to sync producer and consumer, so let's wait a bit
* and try again by then.
*/
if (d->prod[m] != d->cons[m]) {
qemu_mod_timer(d->accept_timer, qemu_get_clock_ns(vm_clock)
+ (int64_t)100000);
}
#endif
}
static int mux_chr_can_read(void *opaque)
@@ -332,6 +344,10 @@ static void qemu_chr_open_mux(Chardev *chr,
}
d->focus = -1;
#if defined(TARGET_S390X)
d->accept_timer = qemu_new_timer_ns(vm_clock,
(QEMUTimerCB *)mux_chr_accept_input, chr);
#endif
/* only default to opened state if we've realized the initial
* set of muxes
*/

View File

@@ -492,9 +492,9 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
s->max_size <= 0) {
return TRUE;
}
len = sizeof(buf);
if (len > s->max_size) {
len = s->max_size;
len = tcp_chr_read_poll(opaque);
if (len > sizeof(buf)) {
len = sizeof(buf);
}
size = tcp_chr_recv(chr, (void *)buf, len);
if (size == 0 || (size == -1 && errno != EAGAIN)) {

View File

@@ -22,6 +22,7 @@
* THE SOFTWARE.
*/
#define HW_POISON_H /* avoid poison since we patch against rules it "enforces" */
#include "qemu/osdep.h"
#include "qemu/cutils.h"
#include "monitor/monitor.h"

View File

@@ -37,6 +37,9 @@ struct MuxChardev {
Chardev parent;
CharBackend *backends[MAX_MUX];
CharBackend chr;
#if defined(TARGET_S390X)
QEMUTimer *accept_timer;
#endif
int focus;
int mux_cnt;
int term_got_escape;

View File

@@ -18,6 +18,7 @@
#CONFIG_QXL=n
#CONFIG_SEV=n
#CONFIG_SGA=n
#CONFIG_TDX=n
#CONFIG_TEST_DEVICES=n
#CONFIG_TPM_CRB=n
#CONFIG_TPM_TIS_ISA=n

3
configure vendored
View File

@@ -1675,6 +1675,9 @@ fi
mkdir -p tests/tcg
echo "# Automatically generated by configure - do not modify" > $config_host_mak
echo "SRC_PATH=$source_path" >> $config_host_mak
if test "$plugins" = "yes" ; then
echo "CONFIG_PLUGIN=y" >> tests/tcg/$config_host_mak
fi
tcg_tests_targets=
for target in $target_list; do

View File

@@ -1,4 +1,4 @@
executable('ivshmem-client', files('ivshmem-client.c', 'main.c'), genh,
dependencies: glib,
build_by_default: targetos == 'linux',
install: false)
install: true)

View File

@@ -1,4 +1,4 @@
executable('ivshmem-server', files('ivshmem-server.c', 'main.c'), genh,
dependencies: [qemuutil, rt],
build_by_default: targetos == 'linux',
install: false)
install: true)

View File

@@ -327,7 +327,7 @@ virgl_get_resource_info_modifiers(uint32_t resource_id,
#ifdef VIRGL_RENDERER_RESOURCE_INFO_EXT_VERSION
struct virgl_renderer_resource_info_ext info_ext;
ret = virgl_renderer_resource_get_info_ext(resource_id, &info_ext);
if (ret < 0) {
if (ret) {
return ret;
}
@@ -335,7 +335,7 @@ virgl_get_resource_info_modifiers(uint32_t resource_id,
*modifiers = info_ext.modifiers;
#else
ret = virgl_renderer_resource_get_info(resource_id, info);
if (ret < 0) {
if (ret) {
return ret;
}
@@ -372,7 +372,7 @@ virgl_cmd_set_scanout(VuGpu *g,
uint64_t modifiers = 0;
ret = virgl_get_resource_info_modifiers(ss.resource_id, &info,
&modifiers);
if (ret == -1) {
if (ret) {
g_critical("%s: illegal resource specified %d\n",
__func__, ss.resource_id);
cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;

View File

@@ -148,9 +148,9 @@ Vring descriptor indices for packed virtqueues
A vring address description
^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-------+-------+------+------------+------+-----------+-----+
| index | flags | size | descriptor | used | available | log |
+-------+-------+------+------------+------+-----------+-----+
+-------+-------+------------+------+-----------+-----+
| index | flags | descriptor | used | available | log |
+-------+-------+------------+------+-----------+-----+
:index: a 32-bit vring index

View File

@@ -13,12 +13,12 @@ if sphinx_build.found()
sphinx_version = run_command(SPHINX_ARGS + ['--version'],
check: true).stdout().split()[1]
if sphinx_version.version_compare('>=1.7.0')
SPHINX_ARGS += ['-j', 'auto']
SPHINX_ARGS += ['-j', '1']
else
nproc = find_program('nproc')
if nproc.found()
jobs = run_command(nproc, check: true).stdout()
SPHINX_ARGS += ['-j', jobs]
SPHINX_ARGS += ['-j', '1']
endif
endif

View File

@@ -38,6 +38,7 @@ Supported mechanisms
Currently supported confidential guest mechanisms are:
* AMD Secure Encrypted Virtualization (SEV) (see :doc:`i386/amd-memory-encryption`)
* Intel Trust Domain Extension (TDX) (see :doc:`i386/tdx`)
* POWER Protected Execution Facility (PEF) (see :ref:`power-papr-protected-execution-facility-pef`)
* s390x Protected Virtualization (PV) (see :doc:`s390x/protvirt`)

143
docs/system/i386/tdx.rst Normal file
View File

@@ -0,0 +1,143 @@
Intel Trusted Domain eXtension (TDX)
====================================
Intel Trusted Domain eXtensions (TDX) refers to an Intel technology that extends
Virtual Machine Extensions (VMX) and Multi-Key Total Memory Encryption (MKTME)
with a new kind of virtual machine guest called a Trust Domain (TD). A TD runs
in a CPU mode that is designed to protect the confidentiality of its memory
contents and its CPU state from any other software, including the hosting
Virtual Machine Monitor (VMM), unless explicitly shared by the TD itself.
Prerequisites
-------------
To run TD, the physical machine needs to have TDX module loaded and initialized
while KVM hypervisor has TDX support and has TDX enabled. If those requirements
are met, the ``KVM_CAP_VM_TYPES`` will report the support of ``KVM_X86_TDX_VM``.
Trust Domain Virtual Firmware (TDVF)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Trust Domain Virtual Firmware (TDVF) is required to provide TD services to boot
TD Guest OS. TDVF needs to be copied to guest private memory and measured before
the TD boots.
KVM vcpu ioctl ``KVM_MEMORY_MAPPING`` can be used to populates the TDVF content
into its private memory.
Since TDX doesn't support readonly memslot, TDVF cannot be mapped as pflash
device and it actually works as RAM. "-bios" option is chosen to load TDVF.
OVMF is the opensource firmware that implements the TDVF support. Thus the
command line to specify and load TDVF is ``-bios OVMF.fd``
KVM private memory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TD's memory (RAM) needs to be able to be transformed between private and shared.
Its BIOS (OVMF/TDVF) needs to be mapped as private as well. Thus QEMU needs to
allocate private guest memfd for them via KVM's IOCTL (KVM_CREATE_GUEST_MEMFD),
which requires KVM is newer enough that reports KVM_CAP_GUEST_MEMFD.
Feature Control
---------------
Unlike non-TDX VM, the CPU features (enumerated by CPU or MSR) of a TD is not
under full control of VMM. VMM can only configure part of features of a TD on
``KVM_TDX_INIT_VM`` command of VM scope ``MEMORY_ENCRYPT_OP`` ioctl.
The configurable features have three types:
- Attributes:
- PKS (bit 30) controls whether Supervisor Protection Keys is exposed to TD,
which determines related CPUID bit and CR4 bit;
- PERFMON (bit 63) controls whether PMU is exposed to TD.
- XSAVE related features (XFAM):
XFAM is a 64b mask, which has the same format as XCR0 or IA32_XSS MSR. It
determines the set of extended features available for use by the guest TD.
- CPUID features:
Only some bits of some CPUID leaves are directly configurable by VMM.
What features can be configured is reported via TDX capabilities.
TDX capabilities
~~~~~~~~~~~~~~~~
The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides command ``KVM_TDX_CAPABILITIES``
to get the TDX capabilities from KVM. It returns a data structure of
``struct kvm_tdx_capabilites``, which tells the supported configuration of
attributes, XFAM and CPUIDs.
TD attestation
--------------
In TD guest, the attestation process is used to verify the TDX guest
trustworthiness to other entities before provisioning secrets to the guest.
TD attestation is initiated first by calling TDG.MR.REPORT inside TD to get the
REPORT. Then the REPORT data needs to be converted into a remotely verifiable
Quote by SGX Quoting Enclave (QE).
A host daemon, Quote Generation Service (QGS), provides the functionality of
SGX GE. It provides a socket address, to which a TD guest can connect via
"quote-generation-socket" property. On the request of <GETQUOTE> from TD guest,
QEMU sends the TDREPORT to QGS via "quote-generation-socket" socket, and gets
the returning Quoting and return it back to TD guest.
Though "quote-generation-socket" is optional for booting the TD guest, it's a
must for supporting TD guest atteatation.
Launching a TD (TDX VM)
-----------------------
To launch a TDX guest, below are new added and required:
.. parsed-literal::
|qemu_system_x86| \\
-object tdx-guest,id=tdx0 \\
-machine ...,kernel-irqchip=split,confidential-guest-support=tdx0 \\
-bios OVMF.fd \\
If TD attestation support is wanted:
.. parsed-literal::
|qemu_system_x86| \\
-object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", "cid":"1","port":"1234"}}' \\
-machine ...,kernel-irqchip=split,confidential-guest-support=tdx0 \\
-bios OVMF.fd \\
Debugging
---------
Bit 0 of TD attributes, is DEBUG bit, which decides if the TD runs in off-TD
debug mode. When in off-TD debug mode, TD's VCPU state and private memory are
accessible via given SEAMCALLs. This requires KVM to expose APIs to invoke those
SEAMCALLs and resonponding QEMU change.
It's targeted as future work.
restrictions
------------
- kernel-irqchip must be split;
- No readonly support for private memory;
- No SMM support: SMM support requires manipulating the guset register states
which is not allowed;
Live Migration
--------------
TODO
References
----------
- `TDX Homepage <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html>`__
- `SGX QE <https://github.com/intel/SGXDataCenterAttestationPrimitives/tree/master/QuoteGeneration>`__

View File

@@ -1,8 +1,9 @@
During the graphical emulation, you can use special key combinations to
change modes. The default key mappings are shown below, but if you use
``-alt-grab`` then the modifier is Ctrl-Alt-Shift (instead of Ctrl-Alt)
and if you use ``-ctrl-grab`` then the modifier is the right Ctrl key
(instead of Ctrl-Alt):
During the graphical emulation, you can use special key combinations from
the following table to change modes. By default the modifier is Ctrl-Alt
(used in the table below) which can be changed with ``-display`` suboption
``mod=`` where appropriate. For example, ``-display sdl,
grab-mod=lshift-lctrl-lalt`` changes the modifier key to Ctrl-Alt-Shift,
while ``-display sdl,grab-mod=rctrl`` changes it to the right Ctrl key.
Ctrl-Alt-f
Toggle full screen
@@ -28,7 +29,7 @@ Ctrl-Alt-n
*3*
Serial port
Ctrl-Alt
Ctrl-Alt-g
Toggle mouse and keyboard grab.
In the virtual consoles, you can use Ctrl-Up, Ctrl-Down, Ctrl-PageUp and

View File

@@ -29,6 +29,7 @@ Architectural features
i386/kvm-pv
i386/sgx
i386/amd-memory-encryption
i386/tdx
OS requirements
~~~~~~~~~~~~~~~

View File

@@ -65,6 +65,7 @@ ERST
.help = "show info of one block device or all block devices "
"(-n: show named nodes; -v: show details)",
.cmd = hmp_info_block,
.coroutine = true,
},
SRST

View File

@@ -247,7 +247,6 @@ static void aspeed_ast2400_soc_realize(DeviceState *dev, Error **errp)
Aspeed2400SoCState *a = ASPEED2400_SOC(dev);
AspeedSoCState *s = ASPEED_SOC(dev);
AspeedSoCClass *sc = ASPEED_SOC_GET_CLASS(s);
Error *err = NULL;
g_autofree char *sram_name = NULL;
/* Default boot region (SPI memory or ROMs) */
@@ -276,9 +275,8 @@ static void aspeed_ast2400_soc_realize(DeviceState *dev, Error **errp)
/* SRAM */
sram_name = g_strdup_printf("aspeed.sram.%d", CPU(&a->cpu[0])->cpu_index);
memory_region_init_ram(&s->sram, OBJECT(s), sram_name, sc->sram_size, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_ram(&s->sram, OBJECT(s), sram_name, sc->sram_size,
errp)) {
return;
}
memory_region_add_subregion(s->memory,

View File

@@ -282,7 +282,6 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp)
Aspeed2600SoCState *a = ASPEED2600_SOC(dev);
AspeedSoCState *s = ASPEED_SOC(dev);
AspeedSoCClass *sc = ASPEED_SOC_GET_CLASS(s);
Error *err = NULL;
qemu_irq irq;
g_autofree char *sram_name = NULL;
@@ -355,9 +354,8 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp)
/* SRAM */
sram_name = g_strdup_printf("aspeed.sram.%d", CPU(&a->cpu[0])->cpu_index);
memory_region_init_ram(&s->sram, OBJECT(s), sram_name, sc->sram_size, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_ram(&s->sram, OBJECT(s), sram_name, sc->sram_size,
errp)) {
return;
}
memory_region_add_subregion(s->memory,

View File

@@ -81,7 +81,6 @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
{
FslIMX25State *s = FSL_IMX25(dev);
uint8_t i;
Error *err = NULL;
if (!qdev_realize(DEVICE(&s->cpu), NULL, errp)) {
return;
@@ -281,28 +280,22 @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
FSL_IMX25_WDT_IRQ));
/* initialize 2 x 16 KB ROM */
memory_region_init_rom(&s->rom[0], OBJECT(dev), "imx25.rom0",
FSL_IMX25_ROM0_SIZE, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom(&s->rom[0], OBJECT(dev), "imx25.rom0",
FSL_IMX25_ROM0_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX25_ROM0_ADDR,
&s->rom[0]);
memory_region_init_rom(&s->rom[1], OBJECT(dev), "imx25.rom1",
FSL_IMX25_ROM1_SIZE, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom(&s->rom[1], OBJECT(dev), "imx25.rom1",
FSL_IMX25_ROM1_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX25_ROM1_ADDR,
&s->rom[1]);
/* initialize internal RAM (128 KB) */
memory_region_init_ram(&s->iram, NULL, "imx25.iram", FSL_IMX25_IRAM_SIZE,
&err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_ram(&s->iram, NULL, "imx25.iram",
FSL_IMX25_IRAM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX25_IRAM_ADDR,

View File

@@ -63,7 +63,6 @@ static void fsl_imx31_realize(DeviceState *dev, Error **errp)
{
FslIMX31State *s = FSL_IMX31(dev);
uint16_t i;
Error *err = NULL;
if (!qdev_realize(DEVICE(&s->cpu), NULL, errp)) {
return;
@@ -188,30 +187,24 @@ static void fsl_imx31_realize(DeviceState *dev, Error **errp)
sysbus_mmio_map(SYS_BUS_DEVICE(&s->wdt), 0, FSL_IMX31_WDT_ADDR);
/* On a real system, the first 16k is a `secure boot rom' */
memory_region_init_rom(&s->secure_rom, OBJECT(dev), "imx31.secure_rom",
FSL_IMX31_SECURE_ROM_SIZE, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom(&s->secure_rom, OBJECT(dev), "imx31.secure_rom",
FSL_IMX31_SECURE_ROM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX31_SECURE_ROM_ADDR,
&s->secure_rom);
/* There is also a 16k ROM */
memory_region_init_rom(&s->rom, OBJECT(dev), "imx31.rom",
FSL_IMX31_ROM_SIZE, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom(&s->rom, OBJECT(dev), "imx31.rom",
FSL_IMX31_ROM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX31_ROM_ADDR,
&s->rom);
/* initialize internal RAM (16 KB) */
memory_region_init_ram(&s->iram, NULL, "imx31.iram", FSL_IMX31_IRAM_SIZE,
&err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_ram(&s->iram, NULL, "imx31.iram",
FSL_IMX31_IRAM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX31_IRAM_ADDR,

View File

@@ -109,7 +109,6 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
MachineState *ms = MACHINE(qdev_get_machine());
FslIMX6State *s = FSL_IMX6(dev);
uint16_t i;
Error *err = NULL;
unsigned int smp_cpus = ms->smp.cpus;
if (smp_cpus > FSL_IMX6_NUM_CPUS) {
@@ -423,30 +422,24 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
}
/* ROM memory */
memory_region_init_rom(&s->rom, OBJECT(dev), "imx6.rom",
FSL_IMX6_ROM_SIZE, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom(&s->rom, OBJECT(dev), "imx6.rom",
FSL_IMX6_ROM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX6_ROM_ADDR,
&s->rom);
/* CAAM memory */
memory_region_init_rom(&s->caam, OBJECT(dev), "imx6.caam",
FSL_IMX6_CAAM_MEM_SIZE, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom(&s->caam, OBJECT(dev), "imx6.caam",
FSL_IMX6_CAAM_MEM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX6_CAAM_MEM_ADDR,
&s->caam);
/* OCRAM memory */
memory_region_init_ram(&s->ocram, NULL, "imx6.ocram", FSL_IMX6_OCRAM_SIZE,
&err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_ram(&s->ocram, NULL, "imx6.ocram",
FSL_IMX6_OCRAM_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), FSL_IMX6_OCRAM_ADDR,

View File

@@ -291,12 +291,9 @@ static void integratorcm_realize(DeviceState *d, Error **errp)
{
IntegratorCMState *s = INTEGRATOR_CM(d);
SysBusDevice *dev = SYS_BUS_DEVICE(d);
Error *local_err = NULL;
memory_region_init_ram(&s->flash, OBJECT(d), "integrator.flash", 0x100000,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram(&s->flash, OBJECT(d), "integrator.flash",
0x100000, errp)) {
return;
}

View File

@@ -58,7 +58,6 @@ static void nrf51_soc_realize(DeviceState *dev_soc, Error **errp)
{
NRF51State *s = NRF51_SOC(dev_soc);
MemoryRegion *mr;
Error *err = NULL;
uint8_t i = 0;
hwaddr base_addr = 0;
@@ -92,10 +91,8 @@ static void nrf51_soc_realize(DeviceState *dev_soc, Error **errp)
memory_region_add_subregion_overlap(&s->container, 0, s->board_memory, -1);
memory_region_init_ram(&s->sram, OBJECT(s), "nrf51.sram", s->sram_size,
&err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_ram(&s->sram, OBJECT(s), "nrf51.sram", s->sram_size,
errp)) {
return;
}
memory_region_add_subregion(&s->container, NRF51_SRAM_BASE, &s->sram);

View File

@@ -675,6 +675,8 @@ static void smmu_base_reset_hold(Object *obj)
{
SMMUState *s = ARM_SMMU(obj);
memset(s->smmu_pcibus_by_bus_num, 0, sizeof(s->smmu_pcibus_by_bus_num));
g_hash_table_remove_all(s->configs);
g_hash_table_remove_all(s->iotlb);
}

View File

@@ -65,7 +65,7 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
iov_discard_undo(&req->inhdr_undo);
iov_discard_undo(&req->outhdr_undo);
virtqueue_push(req->vq, &req->elem, req->in_len);
if (s->dataplane_started && !s->dataplane_disabled) {
if (qemu_in_iothread()) {
virtio_blk_data_plane_notify(s->dataplane, req->vq);
} else {
virtio_notify(vdev, req->vq);

View File

@@ -418,6 +418,9 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
xen_block_set_size(blockdev);
if (!monitor_add_blk(conf->blk, blockdev->drive->id, errp)) {
return;
}
blockdev->dataplane =
xen_block_dataplane_create(xendev, blk, conf->logical_block_size,
blockdev->props.iothread);
@@ -874,6 +877,8 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
const char *mode = qdict_get_try_str(opts, "mode");
const char *direct_io_safe = qdict_get_try_str(opts, "direct-io-safe");
const char *discard_enable = qdict_get_try_str(opts, "discard-enable");
const char *suse_diskcache_disable_flush = qdict_get_try_str(opts,
"suse-diskcache-disable-flush");
char *driver = NULL;
char *filename = NULL;
XenBlockDrive *drive = NULL;
@@ -954,6 +959,16 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
}
}
if (suse_diskcache_disable_flush) {
unsigned long value;
if (!qemu_strtoul(suse_diskcache_disable_flush, NULL, 2, &value) && !!value) {
QDict *cache_qdict = qdict_new();
qdict_put_bool(cache_qdict, "no-flush", true);
qdict_put_obj(file_layer, "cache", QOBJECT(cache_qdict));
}
}
/*
* It is necessary to turn file locking off as an emulated device
* may have already opened the same image file.

View File

@@ -1189,6 +1189,11 @@ bool machine_mem_merge(MachineState *machine)
return machine->mem_merge;
}
bool machine_require_guest_memfd(MachineState *machine)
{
return machine->require_guest_memfd;
}
static char *cpu_slot_to_string(const CPUArchId *cpu)
{
GString *s = g_string_new(NULL);

View File

@@ -49,6 +49,7 @@ static void ct3_build_cdat(CDATObject *cdat, Error **errp)
g_autofree CDATTableHeader *cdat_header = NULL;
g_autofree CDATEntry *cdat_st = NULL;
uint8_t sum = 0;
uint8_t *hdr_buf;
int ent, i;
/* Use default table if fopen == NULL */
@@ -63,7 +64,7 @@ static void ct3_build_cdat(CDATObject *cdat, Error **errp)
cdat->built_buf_len = cdat->build_cdat_table(&cdat->built_buf,
cdat->private);
if (!cdat->built_buf_len) {
if (cdat->built_buf_len <= 0) {
/* Build later as not all data available yet */
cdat->to_update = true;
return;
@@ -95,8 +96,12 @@ static void ct3_build_cdat(CDATObject *cdat, Error **errp)
/* For now, no runtime updates */
cdat_header->sequence = 0;
cdat_header->length += sizeof(CDATTableHeader);
sum += cdat_header->revision + cdat_header->sequence +
cdat_header->length;
hdr_buf = (uint8_t *)cdat_header;
for (i = 0; i < sizeof(*cdat_header); i++) {
sum += hdr_buf[i];
}
/* Sum of all bytes including checksum must be 0 */
cdat_header->checksum = ~sum + 1;

View File

@@ -199,7 +199,7 @@ void cxl_component_register_block_init(Object *obj,
/* io registers controls link which we don't care about in QEMU */
memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io",
CXL2_COMPONENT_IO_REGION_SIZE);
memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cxl_cstate,
".cache_mem", CXL2_COMPONENT_CM_REGION_SIZE);
memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);

View File

@@ -229,12 +229,9 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
{
uint64_t retval = 0;
CXLDeviceState *cxl_dstate = opaque;
retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
return retval;
return cxl_dstate->memdev_status;
}
static void ro_reg_write(void *opaque, hwaddr offset, uint64_t value,
@@ -371,7 +368,15 @@ static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
cxl_dstate->mbox_msi_n = msi_n;
}
static void memdev_reg_init_common(CXLDeviceState *cxl_dstate) { }
static void memdev_reg_init_common(CXLDeviceState *cxl_dstate)
{
uint64_t memdev_status_reg;
memdev_status_reg = FIELD_DP64(0, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
memdev_status_reg = FIELD_DP64(memdev_status_reg, CXL_MEM_DEV_STS,
MBOX_READY, 1);
cxl_dstate->memdev_status = memdev_status_reg;
}
void cxl_device_register_init_t3(CXLType3Dev *ct3d)
{

View File

@@ -181,7 +181,7 @@ static void virgl_cmd_set_scanout(VirtIOGPU *g,
memset(&info, 0, sizeof(info));
ret = virgl_renderer_resource_get_info(ss.resource_id, &info);
#endif
if (ret == -1) {
if (ret) {
qemu_log_mask(LOG_GUEST_ERROR,
"%s: illegal resource specified %d\n",
__func__, ss.resource_id);

View File

@@ -7,6 +7,7 @@ config HPPA_B160L
select DINO
select LASI
select SERIAL
select SERIAL_PCI
select ISA_BUS
select I8259
select IDE_CMD646
@@ -16,3 +17,4 @@ config HPPA_B160L
select LASIPS2
select PARALLEL
select ARTIST
select USB_OHCI_PCI

View File

@@ -10,6 +10,11 @@ config SGX
bool
depends on KVM
config TDX
bool
select X86_FW_OVMF
depends on KVM
config PC
bool
imply APPLESMC
@@ -26,6 +31,7 @@ config PC
imply QXL
imply SEV
imply SGX
imply TDX
imply TEST_DEVICES
imply TPM_CRB
imply TPM_TIS_ISA

View File

@@ -975,7 +975,8 @@ static void build_dbg_aml(Aml *table)
aml_append(table, scope);
}
static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg,
bool level_trigger_unsupported)
{
Aml *dev;
Aml *crs;
@@ -987,7 +988,10 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
crs = aml_resource_template();
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH,
AML_SHARED, irqs, ARRAY_SIZE(irqs)));
aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1011,7 +1015,8 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
return dev;
}
static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
static Aml *build_gsi_link_dev(const char *name, uint8_t uid,
uint8_t gsi, bool level_trigger_unsupported)
{
Aml *dev;
Aml *crs;
@@ -1024,7 +1029,10 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
crs = aml_resource_template();
irqs = gsi;
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH,
AML_SHARED, &irqs, 1));
aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1043,7 +1051,7 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
}
/* _CRS method - get current settings */
static Aml *build_iqcr_method(bool is_piix4)
static Aml *build_iqcr_method(bool is_piix4, bool level_trigger_unsupported)
{
Aml *if_ctx;
uint32_t irqs;
@@ -1051,7 +1059,9 @@ static Aml *build_iqcr_method(bool is_piix4)
Aml *crs = aml_resource_template();
irqs = 0;
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH, AML_SHARED, &irqs, 1));
aml_append(method, aml_name_decl("PRR0", crs));
@@ -1085,7 +1095,7 @@ static Aml *build_irq_status_method(void)
return method;
}
static void build_piix4_pci0_int(Aml *table)
static void build_piix4_pci0_int(Aml *table, bool level_trigger_unsupported)
{
Aml *dev;
Aml *crs;
@@ -1098,12 +1108,16 @@ static void build_piix4_pci0_int(Aml *table)
aml_append(sb_scope, pci0_scope);
aml_append(sb_scope, build_irq_status_method());
aml_append(sb_scope, build_iqcr_method(true));
aml_append(sb_scope, build_iqcr_method(true, level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0")));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1")));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2")));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3")));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3"),
level_trigger_unsupported));
dev = aml_device("LNKS");
{
@@ -1112,7 +1126,9 @@ static void build_piix4_pci0_int(Aml *table)
crs = aml_resource_template();
irqs = 9;
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH, AML_SHARED,
&irqs, 1));
aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1198,7 +1214,7 @@ static Aml *build_q35_routing_table(const char *str)
return pkg;
}
static void build_q35_pci0_int(Aml *table)
static void build_q35_pci0_int(Aml *table, bool level_trigger_unsupported)
{
Aml *method;
Aml *sb_scope = aml_scope("_SB");
@@ -1237,25 +1253,41 @@ static void build_q35_pci0_int(Aml *table)
aml_append(sb_scope, pci0_scope);
aml_append(sb_scope, build_irq_status_method());
aml_append(sb_scope, build_iqcr_method(false));
aml_append(sb_scope, build_iqcr_method(false, level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA")));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB")));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC")));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD")));
aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE")));
aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF")));
aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG")));
aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH")));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH"),
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10));
aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11));
aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12));
aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13));
aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14));
aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15));
aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16));
aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17));
aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17,
level_trigger_unsupported));
aml_append(table, sb_scope);
}
@@ -1415,7 +1447,7 @@ static void build_acpi0017(Aml *table)
aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0017")));
method = aml_method("_STA", 0, AML_NOTSERIALIZED);
aml_append(method, aml_return(aml_int(0x01)));
aml_append(method, aml_return(aml_int(0x0B)));
aml_append(dev, method);
build_cxl_dsm_method(dev);
@@ -1436,6 +1468,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
PCMachineState *pcms = PC_MACHINE(machine);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
X86MachineState *x86ms = X86_MACHINE(machine);
bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
AcpiMcfgInfo mcfg;
bool mcfg_valid = !!acpi_get_mcfg(&mcfg);
uint32_t nr_mem = machine->ram_slots;
@@ -1468,7 +1501,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
}
build_piix4_pci0_int(dsdt);
build_piix4_pci0_int(dsdt, level_trigger_unsupported);
} else if (q35) {
sb_scope = aml_scope("_SB");
dev = aml_device("PCI0");
@@ -1512,7 +1545,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
if (pm->pcihp_bridge_en) {
build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
}
build_q35_pci0_int(dsdt);
build_q35_pci0_int(dsdt, level_trigger_unsupported);
}
if (misc->has_hpet) {

View File

@@ -100,9 +100,11 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
int i;
bool x2apic_mode = false;
MachineClass *mc = MACHINE_GET_CLASS(x86ms);
X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(x86ms));
AcpiTable table = { .sig = "APIC", .rev = 3, .oem_id = oem_id,
.oem_table_id = oem_table_id };
bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
acpi_table_begin(&table, table_data);
/* Local APIC Address */
@@ -122,18 +124,42 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
IO_APIC_SECONDARY_ADDRESS, IO_APIC_SECONDARY_IRQBASE);
}
if (x86ms->apic_xrupt_override) {
build_xrupt_override(table_data, 0, 2,
0 /* Flags: Conforms to the specifications of the bus */);
}
for (i = 1; i < 16; i++) {
if (!(x86ms->pci_irq_mask & (1 << i))) {
/* No need for a INT source override structure. */
continue;
if (level_trigger_unsupported) {
/* Force edge trigger */
if (x86mc->apic_xrupt_override) {
build_xrupt_override(table_data, 0, 2,
/* Flags: active high, edge triggered */
1 | (1 << 2));
}
for (i = x86mc->apic_xrupt_override ? 1 : 0; i < 16; i++) {
build_xrupt_override(table_data, i, i,
/* Flags: active high, edge triggered */
1 | (1 << 2));
}
if (x86ms->ioapic2) {
for (i = 0; i < 16; i++) {
build_xrupt_override(table_data, IO_APIC_SECONDARY_IRQBASE + i,
IO_APIC_SECONDARY_IRQBASE + i,
/* Flags: active high, edge triggered */
1 | (1 << 2));
}
}
} else {
if (x86mc->apic_xrupt_override) {
build_xrupt_override(table_data, 0, 2,
0 /* Flags: Conforms to the specifications of the bus */);
}
for (i = 1; i < 16; i++) {
if (!(x86ms->pci_irq_mask & (1 << i))) {
/* No need for a INT source override structure. */
continue;
}
build_xrupt_override(table_data, i, i,
0xd /* Flags: Active high, Level Triggered */);
}
build_xrupt_override(table_data, i, i,
0xd /* Flags: Active high, Level Triggered */);
}
if (x2apic_mode) {

View File

@@ -48,15 +48,25 @@ const char *fw_cfg_arch_key_name(uint16_t key)
return NULL;
}
void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState *fw_cfg)
{
#ifdef CONFIG_SMBIOS
uint8_t *smbios_tables, *smbios_anchor;
size_t smbios_tables_len, smbios_anchor_len;
struct smbios_phys_mem_area *mem_array;
unsigned i, array_count;
MachineState *ms = MACHINE(pcms);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
MachineClass *mc = MACHINE_GET_CLASS(pcms);
X86CPU *cpu = X86_CPU(ms->possible_cpus->cpus[0].cpu);
if (pcmc->smbios_defaults) {
/* These values are guest ABI, do not change */
smbios_set_defaults("QEMU", mc->desc, mc->name,
pcmc->smbios_legacy_mode, pcmc->smbios_uuid_encoded,
pcms->smbios_entry_point_type);
}
/* tell smbios about cpuid version and features */
smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);

View File

@@ -10,6 +10,7 @@
#define HW_I386_FW_CFG_H
#include "hw/boards.h"
#include "hw/i386/pc.h"
#include "hw/nvram/fw_cfg.h"
#define FW_CFG_IO_BASE 0x510
@@ -22,7 +23,7 @@
FWCfgState *fw_cfg_arch_create(MachineState *ms,
uint16_t boot_cpus,
uint16_t apic_id_limit);
void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
void fw_cfg_build_smbios(PCMachineState *ms, FWCfgState *fw_cfg);
void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
void fw_cfg_add_acpi_dsdt(Aml *scope, FWCfgState *fw_cfg);

View File

@@ -27,6 +27,7 @@ i386_ss.add(when: 'CONFIG_PC', if_true: files(
'port92.c'))
i386_ss.add(when: 'CONFIG_X86_FW_OVMF', if_true: files('pc_sysfw_ovmf.c'),
if_false: files('pc_sysfw_ovmf-stubs.c'))
i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c', 'tdvf-hob.c'))
subdir('kvm')
subdir('xen')

View File

@@ -175,7 +175,7 @@ static void microvm_devices_init(MicrovmMachineState *mms)
&error_abort);
isa_bus_register_input_irqs(isa_bus, x86ms->gsi);
ioapic_init_gsi(gsi_state, "machine");
ioapic_init_gsi(gsi_state, OBJECT(mms));
if (ioapics > 1) {
x86ms->ioapic2 = ioapic_init_secondary(gsi_state);
}

View File

@@ -43,6 +43,7 @@
#include "sysemu/xen.h"
#include "sysemu/reset.h"
#include "kvm/kvm_i386.h"
#include "kvm/tdx.h"
#include "hw/xen/xen.h"
#include "qapi/qmp/qlist.h"
#include "qemu/error-report.h"
@@ -692,22 +693,13 @@ void pc_machine_done(Notifier *notifier, void *data)
acpi_setup();
if (x86ms->fw_cfg) {
fw_cfg_build_smbios(MACHINE(pcms), x86ms->fw_cfg);
fw_cfg_build_smbios(pcms, x86ms->fw_cfg);
fw_cfg_build_feature_control(MACHINE(pcms), x86ms->fw_cfg);
/* update FW_CFG_NB_CPUS to account for -device added CPUs */
fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
}
}
void pc_guest_info_init(PCMachineState *pcms)
{
X86MachineState *x86ms = X86_MACHINE(pcms);
x86ms->apic_xrupt_override = true;
pcms->machine_done.notify = pc_machine_done;
qemu_add_machine_init_done_notifier(&pcms->machine_done);
}
/* setup pci memory address space mapping into system address space */
void pc_pci_as_mapping_init(MemoryRegion *system_memory,
MemoryRegion *pci_address_space)
@@ -1036,16 +1028,18 @@ void pc_memory_init(PCMachineState *pcms,
/* Initialize PC system firmware */
pc_system_firmware_init(pcms, rom_memory);
option_rom_mr = g_malloc(sizeof(*option_rom_mr));
memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
&error_fatal);
if (pcmc->pci_enabled) {
memory_region_set_readonly(option_rom_mr, true);
if (!is_tdx_vm()) {
option_rom_mr = g_malloc(sizeof(*option_rom_mr));
memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
&error_fatal);
if (pcmc->pci_enabled) {
memory_region_set_readonly(option_rom_mr, true);
}
memory_region_add_subregion_overlap(rom_memory,
PC_ROM_MIN_VGA,
option_rom_mr,
1);
}
memory_region_add_subregion_overlap(rom_memory,
PC_ROM_MIN_VGA,
option_rom_mr,
1);
fw_cfg = fw_cfg_arch_create(machine,
x86ms->boot_cpus, x86ms->apic_id_limit);
@@ -1753,6 +1747,9 @@ static void pc_machine_initfn(Object *obj)
object_property_add_alias(OBJECT(pcms), "pcspk-audiodev",
OBJECT(pcms->pcspk), "audiodev");
cxl_machine_init(obj, &pcms->cxl_devices_state);
pcms->machine_done.notify = pc_machine_done;
qemu_add_machine_init_done_notifier(&pcms->machine_done);
}
int pc_machine_kvm_type(MachineState *machine, const char *kvm_type)
@@ -1806,6 +1803,7 @@ static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
static void pc_machine_class_init(ObjectClass *oc, void *data)
{
MachineClass *mc = MACHINE_CLASS(oc);
X86MachineClass *x86mc = X86_MACHINE_CLASS(oc);
PCMachineClass *pcmc = PC_MACHINE_CLASS(oc);
HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
@@ -1825,6 +1823,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
pcmc->pvh_enabled = true;
pcmc->kvmclock_create_always = true;
pcmc->resizable_acpi_blob = true;
x86mc->apic_xrupt_override = true;
assert(!mc->get_hotplug_handler);
mc->get_hotplug_handler = pc_get_hotplug_handler;
mc->hotplug_allowed = pc_hotplug_allowed;

View File

@@ -36,7 +36,6 @@
#include "hw/rtc/mc146818rtc.h"
#include "hw/southbridge/piix.h"
#include "hw/display/ramfb.h"
#include "hw/firmware/smbios.h"
#include "hw/pci/pci.h"
#include "hw/pci/pci_ids.h"
#include "hw/usb.h"
@@ -112,6 +111,7 @@ static void pc_init1(MachineState *machine,
X86MachineState *x86ms = X86_MACHINE(machine);
MemoryRegion *system_memory = get_system_memory();
MemoryRegion *system_io = get_system_io();
Object *phb = NULL;
PCIBus *pci_bus = NULL;
ISABus *isa_bus;
Object *piix4_pm = NULL;
@@ -195,8 +195,6 @@ static void pc_init1(MachineState *machine,
}
if (pcmc->pci_enabled) {
Object *phb;
pci_memory = g_new(MemoryRegion, 1);
memory_region_init(pci_memory, NULL, "pci", UINT64_MAX);
rom_memory = pci_memory;
@@ -230,17 +228,6 @@ static void pc_init1(MachineState *machine,
&error_abort);
}
pc_guest_info_init(pcms);
if (pcmc->smbios_defaults) {
MachineClass *mc = MACHINE_GET_CLASS(machine);
/* These values are guest ABI, do not change */
smbios_set_defaults("QEMU", mc->desc,
mc->name, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
pcms->smbios_entry_point_type);
}
/* allocate ram and load rom/bios */
if (!xen_enabled()) {
pc_memory_init(pcms, system_memory, rom_memory, hole64_size);
@@ -323,8 +310,8 @@ static void pc_init1(MachineState *machine,
pc_i8259_create(isa_bus, gsi_state->i8259_irq);
}
if (pcmc->pci_enabled) {
ioapic_init_gsi(gsi_state, "i440fx");
if (phb) {
ioapic_init_gsi(gsi_state, phb);
}
if (tcg_enabled()) {
@@ -344,11 +331,8 @@ static void pc_init1(MachineState *machine,
pc_nic_init(pcmc, isa_bus, pci_bus, pcms->xenbus);
if (pcmc->pci_enabled) {
pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
}
#ifdef CONFIG_IDE_ISA
else {
if (!pcmc->pci_enabled) {
DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
int i;
@@ -366,10 +350,11 @@ static void pc_init1(MachineState *machine,
busname[4] = '0' + i;
idebus[i] = qdev_get_child_bus(DEVICE(dev), busname);
}
pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
}
#endif
pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
if (piix4_pm) {
smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);

View File

@@ -45,7 +45,6 @@
#include "hw/i386/amd_iommu.h"
#include "hw/i386/intel_iommu.h"
#include "hw/display/ramfb.h"
#include "hw/firmware/smbios.h"
#include "hw/ide/pci.h"
#include "hw/ide/ahci.h"
#include "hw/intc/ioapic.h"
@@ -130,8 +129,7 @@ static void pc_q35_init(MachineState *machine)
ISADevice *rtc_state;
MemoryRegion *system_memory = get_system_memory();
MemoryRegion *system_io = get_system_io();
MemoryRegion *pci_memory;
MemoryRegion *rom_memory;
MemoryRegion *pci_memory = g_new(MemoryRegion, 1);
GSIState *gsi_state;
ISABus *isa_bus;
int i;
@@ -143,6 +141,8 @@ static void pc_q35_init(MachineState *machine)
bool keep_pci_slot_hpc;
uint64_t pci_hole64_size = 0;
assert(pcmc->pci_enabled);
/* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
* and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
* also known as MMCFG).
@@ -189,37 +189,16 @@ static void pc_q35_init(MachineState *machine)
kvmclock_create(pcmc->kvmclock_create_always);
}
/* pci enabled */
if (pcmc->pci_enabled) {
pci_memory = g_new(MemoryRegion, 1);
memory_region_init(pci_memory, NULL, "pci", UINT64_MAX);
rom_memory = pci_memory;
} else {
pci_memory = NULL;
rom_memory = system_memory;
}
pc_guest_info_init(pcms);
if (pcmc->smbios_defaults) {
/* These values are guest ABI, do not change */
smbios_set_defaults("QEMU", mc->desc,
mc->name, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
pcms->smbios_entry_point_type);
}
/* create pci host bus */
phb = OBJECT(qdev_new(TYPE_Q35_HOST_DEVICE));
if (pcmc->pci_enabled) {
pci_hole64_size = object_property_get_uint(phb,
PCI_HOST_PROP_PCI_HOLE64_SIZE,
&error_abort);
}
pci_hole64_size = object_property_get_uint(phb,
PCI_HOST_PROP_PCI_HOLE64_SIZE,
&error_abort);
/* allocate ram and load rom/bios */
pc_memory_init(pcms, system_memory, rom_memory, pci_hole64_size);
memory_region_init(pci_memory, NULL, "pci", UINT64_MAX);
pc_memory_init(pcms, system_memory, pci_memory, pci_hole64_size);
object_property_add_child(OBJECT(machine), "q35", phb);
object_property_set_link(phb, PCI_HOST_PROP_RAM_MEM,
@@ -236,6 +215,8 @@ static void pc_q35_init(MachineState *machine)
x86ms->above_4g_mem_size, NULL);
object_property_set_bool(phb, PCI_HOST_BYPASS_IOMMU,
pcms->default_bus_bypass_iommu, NULL);
object_property_set_bool(phb, PCI_HOST_PROP_SMM_RANGES,
x86_machine_is_smm_enabled(x86ms), NULL);
sysbus_realize_and_unref(SYS_BUS_DEVICE(phb), &error_fatal);
/* pci */
@@ -243,14 +224,14 @@ static void pc_q35_init(MachineState *machine)
pcms->bus = host_bus;
/* irq lines */
gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
gsi_state = pc_gsi_create(&x86ms->gsi, true);
/* create ISA bus */
lpc = pci_new_multifunction(PCI_DEVFN(ICH9_LPC_DEV, ICH9_LPC_FUNC),
TYPE_ICH9_LPC_DEVICE);
qdev_prop_set_bit(DEVICE(lpc), "smm-enabled",
x86_machine_is_smm_enabled(x86ms));
lpc_dev = DEVICE(lpc);
qdev_prop_set_bit(lpc_dev, "smm-enabled",
x86_machine_is_smm_enabled(x86ms));
for (i = 0; i < IOAPIC_NUM_PINS; i++) {
qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, x86ms->gsi[i]);
}
@@ -286,9 +267,7 @@ static void pc_q35_init(MachineState *machine)
pc_i8259_create(isa_bus, gsi_state->i8259_irq);
}
if (pcmc->pci_enabled) {
ioapic_init_gsi(gsi_state, "q35");
}
ioapic_init_gsi(gsi_state, OBJECT(phb));
if (tcg_enabled()) {
x86_register_ferr_irq(x86ms->gsi[13]);
@@ -412,10 +391,6 @@ static void pc_q35_8_0_machine_options(MachineClass *m)
pc_q35_8_1_machine_options(m);
compat_props_add(m->compat_props, hw_compat_8_0, hw_compat_8_0_len);
compat_props_add(m->compat_props, pc_compat_8_0, pc_compat_8_0_len);
/* For pc-q35-8.0 and older, use SMBIOS 2.8 by default */
pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_32;
m->max_cpus = 288;
}
DEFINE_Q35_MACHINE(v8_0, "pc-q35-8.0", NULL,
@@ -448,6 +423,10 @@ static void pc_q35_7_0_machine_options(MachineClass *m)
pcmc->enforce_amd_1tb_hole = false;
compat_props_add(m->compat_props, hw_compat_7_0, hw_compat_7_0_len);
compat_props_add(m->compat_props, pc_compat_7_0, pc_compat_7_0_len);
/* For pc-q35-7.0 and older, use SMBIOS 2.8 by default */
pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_32;
m->max_cpus = 288;
}
DEFINE_Q35_MACHINE(v7_0, "pc-q35-7.0", NULL,

View File

@@ -37,6 +37,7 @@
#include "hw/block/flash.h"
#include "sysemu/kvm.h"
#include "sev.h"
#include "kvm/tdx.h"
#define FLASH_SECTOR_SIZE 4096
@@ -107,17 +108,15 @@ void pc_system_flash_cleanup_unused(PCMachineState *pcms)
{
char *prop_name;
int i;
Object *dev_obj;
assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled);
for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) {
dev_obj = OBJECT(pcms->flash[i]);
if (!object_property_get_bool(dev_obj, "realized", &error_abort)) {
if (!qdev_is_realized(DEVICE(pcms->flash[i]))) {
prop_name = g_strdup_printf("pflash%d", i);
object_property_del(OBJECT(pcms), prop_name);
g_free(prop_name);
object_unparent(dev_obj);
object_unparent(OBJECT(pcms->flash[i]));
pcms->flash[i] = NULL;
}
}
@@ -265,5 +264,11 @@ void x86_firmware_configure(void *ptr, int size)
}
sev_encrypt_flash(ptr, size, &error_fatal);
} else if (is_tdx_vm()) {
ret = tdx_parse_tdvf(ptr, size);
if (ret) {
error_report("failed to parse TDVF for TDX VM");
exit(1);
}
}
}

147
hw/i386/tdvf-hob.c Normal file
View File

@@ -0,0 +1,147 @@
/*
* SPDX-License-Identifier: GPL-2.0-or-later
* Copyright (c) 2020 Intel Corporation
* Author: Isaku Yamahata <isaku.yamahata at gmail.com>
* <isaku.yamahata at intel.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
* You should have received a copy of the GNU General Public License along
* with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu/log.h"
#include "qemu/error-report.h"
#include "e820_memory_layout.h"
#include "hw/i386/pc.h"
#include "hw/i386/x86.h"
#include "hw/pci/pcie_host.h"
#include "sysemu/kvm.h"
#include "standard-headers/uefi/uefi.h"
#include "tdvf-hob.h"
typedef struct TdvfHob {
hwaddr hob_addr;
void *ptr;
int size;
/* working area */
void *current;
void *end;
} TdvfHob;
static uint64_t tdvf_current_guest_addr(const TdvfHob *hob)
{
return hob->hob_addr + (hob->current - hob->ptr);
}
static void tdvf_align(TdvfHob *hob, size_t align)
{
hob->current = QEMU_ALIGN_PTR_UP(hob->current, align);
}
static void *tdvf_get_area(TdvfHob *hob, uint64_t size)
{
void *ret;
if (hob->current + size > hob->end) {
error_report("TD_HOB overrun, size = 0x%" PRIx64, size);
exit(1);
}
ret = hob->current;
hob->current += size;
tdvf_align(hob, 8);
return ret;
}
static void tdvf_hob_add_memory_resources(TdxGuest *tdx, TdvfHob *hob)
{
EFI_HOB_RESOURCE_DESCRIPTOR *region;
EFI_RESOURCE_ATTRIBUTE_TYPE attr;
EFI_RESOURCE_TYPE resource_type;
TdxRamEntry *e;
int i;
for (i = 0; i < tdx->nr_ram_entries; i++) {
e = &tdx->ram_entries[i];
if (e->type == TDX_RAM_UNACCEPTED) {
resource_type = EFI_RESOURCE_MEMORY_UNACCEPTED;
attr = EFI_RESOURCE_ATTRIBUTE_TDVF_UNACCEPTED;
} else if (e->type == TDX_RAM_ADDED){
resource_type = EFI_RESOURCE_SYSTEM_MEMORY;
attr = EFI_RESOURCE_ATTRIBUTE_TDVF_PRIVATE;
} else {
error_report("unknown TDX_RAM_ENTRY type %d", e->type);
exit(1);
}
region = tdvf_get_area(hob, sizeof(*region));
*region = (EFI_HOB_RESOURCE_DESCRIPTOR) {
.Header = {
.HobType = EFI_HOB_TYPE_RESOURCE_DESCRIPTOR,
.HobLength = cpu_to_le16(sizeof(*region)),
.Reserved = cpu_to_le32(0),
},
.Owner = EFI_HOB_OWNER_ZERO,
.ResourceType = cpu_to_le32(resource_type),
.ResourceAttribute = cpu_to_le32(attr),
.PhysicalStart = cpu_to_le64(e->address),
.ResourceLength = cpu_to_le64(e->length),
};
}
}
void tdvf_hob_create(TdxGuest *tdx, TdxFirmwareEntry *td_hob)
{
TdvfHob hob = {
.hob_addr = td_hob->address,
.size = td_hob->size,
.ptr = td_hob->mem_ptr,
.current = td_hob->mem_ptr,
.end = td_hob->mem_ptr + td_hob->size,
};
EFI_HOB_GENERIC_HEADER *last_hob;
EFI_HOB_HANDOFF_INFO_TABLE *hit;
/* Note, Efi{Free}Memory{Bottom,Top} are ignored, leave 'em zeroed. */
hit = tdvf_get_area(&hob, sizeof(*hit));
*hit = (EFI_HOB_HANDOFF_INFO_TABLE) {
.Header = {
.HobType = EFI_HOB_TYPE_HANDOFF,
.HobLength = cpu_to_le16(sizeof(*hit)),
.Reserved = cpu_to_le32(0),
},
.Version = cpu_to_le32(EFI_HOB_HANDOFF_TABLE_VERSION),
.BootMode = cpu_to_le32(0),
.EfiMemoryTop = cpu_to_le64(0),
.EfiMemoryBottom = cpu_to_le64(0),
.EfiFreeMemoryTop = cpu_to_le64(0),
.EfiFreeMemoryBottom = cpu_to_le64(0),
.EfiEndOfHobList = cpu_to_le64(0), /* initialized later */
};
tdvf_hob_add_memory_resources(tdx, &hob);
last_hob = tdvf_get_area(&hob, sizeof(*last_hob));
*last_hob = (EFI_HOB_GENERIC_HEADER) {
.HobType = EFI_HOB_TYPE_END_OF_HOB_LIST,
.HobLength = cpu_to_le16(sizeof(*last_hob)),
.Reserved = cpu_to_le32(0),
};
hit->EfiEndOfHobList = tdvf_current_guest_addr(&hob);
}

24
hw/i386/tdvf-hob.h Normal file
View File

@@ -0,0 +1,24 @@
#ifndef HW_I386_TD_HOB_H
#define HW_I386_TD_HOB_H
#include "hw/i386/tdvf.h"
#include "target/i386/kvm/tdx.h"
void tdvf_hob_create(TdxGuest *tdx, TdxFirmwareEntry *td_hob);
#define EFI_RESOURCE_ATTRIBUTE_TDVF_PRIVATE \
(EFI_RESOURCE_ATTRIBUTE_PRESENT | \
EFI_RESOURCE_ATTRIBUTE_INITIALIZED | \
EFI_RESOURCE_ATTRIBUTE_TESTED)
#define EFI_RESOURCE_ATTRIBUTE_TDVF_UNACCEPTED \
(EFI_RESOURCE_ATTRIBUTE_PRESENT | \
EFI_RESOURCE_ATTRIBUTE_INITIALIZED | \
EFI_RESOURCE_ATTRIBUTE_TESTED)
#define EFI_RESOURCE_ATTRIBUTE_TDVF_MMIO \
(EFI_RESOURCE_ATTRIBUTE_PRESENT | \
EFI_RESOURCE_ATTRIBUTE_INITIALIZED | \
EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE)
#endif

200
hw/i386/tdvf.c Normal file
View File

@@ -0,0 +1,200 @@
/*
* SPDX-License-Identifier: GPL-2.0-or-later
* Copyright (c) 2020 Intel Corporation
* Author: Isaku Yamahata <isaku.yamahata at gmail.com>
* <isaku.yamahata at intel.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
* You should have received a copy of the GNU General Public License along
* with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "hw/i386/pc.h"
#include "hw/i386/tdvf.h"
#include "sysemu/kvm.h"
#define TDX_METADATA_OFFSET_GUID "e47a6535-984a-4798-865e-4685a7bf8ec2"
#define TDX_METADATA_VERSION 1
#define TDVF_SIGNATURE 0x46564454 /* TDVF as little endian */
typedef struct {
uint32_t DataOffset;
uint32_t RawDataSize;
uint64_t MemoryAddress;
uint64_t MemoryDataSize;
uint32_t Type;
uint32_t Attributes;
} TdvfSectionEntry;
typedef struct {
uint32_t Signature;
uint32_t Length;
uint32_t Version;
uint32_t NumberOfSectionEntries;
TdvfSectionEntry SectionEntries[];
} TdvfMetadata;
struct tdx_metadata_offset {
uint32_t offset;
};
static TdvfMetadata *tdvf_get_metadata(void *flash_ptr, int size)
{
TdvfMetadata *metadata;
uint32_t offset = 0;
uint8_t *data;
if ((uint32_t) size != size) {
return NULL;
}
if (pc_system_ovmf_table_find(TDX_METADATA_OFFSET_GUID, &data, NULL)) {
offset = size - le32_to_cpu(((struct tdx_metadata_offset *)data)->offset);
if (offset + sizeof(*metadata) > size) {
return NULL;
}
} else {
error_report("Cannot find TDX_METADATA_OFFSET_GUID");
return NULL;
}
metadata = flash_ptr + offset;
/* Finally, verify the signature to determine if this is a TDVF image. */
metadata->Signature = le32_to_cpu(metadata->Signature);
if (metadata->Signature != TDVF_SIGNATURE) {
error_report("Invalid TDVF signature in metadata!");
return NULL;
}
/* Sanity check that the TDVF doesn't overlap its own metadata. */
metadata->Length = le32_to_cpu(metadata->Length);
if (offset + metadata->Length > size) {
return NULL;
}
/* Only version 1 is supported/defined. */
metadata->Version = le32_to_cpu(metadata->Version);
if (metadata->Version != TDX_METADATA_VERSION) {
return NULL;
}
return metadata;
}
static int tdvf_parse_and_check_section_entry(const TdvfSectionEntry *src,
TdxFirmwareEntry *entry)
{
entry->data_offset = le32_to_cpu(src->DataOffset);
entry->data_len = le32_to_cpu(src->RawDataSize);
entry->address = le64_to_cpu(src->MemoryAddress);
entry->size = le64_to_cpu(src->MemoryDataSize);
entry->type = le32_to_cpu(src->Type);
entry->attributes = le32_to_cpu(src->Attributes);
/* sanity check */
if (entry->size < entry->data_len) {
error_report("Broken metadata RawDataSize 0x%x MemoryDataSize 0x%lx",
entry->data_len, entry->size);
return -1;
}
if (!QEMU_IS_ALIGNED(entry->address, TARGET_PAGE_SIZE)) {
error_report("MemoryAddress 0x%lx not page aligned", entry->address);
return -1;
}
if (!QEMU_IS_ALIGNED(entry->size, TARGET_PAGE_SIZE)) {
error_report("MemoryDataSize 0x%lx not page aligned", entry->size);
return -1;
}
switch (entry->type) {
case TDVF_SECTION_TYPE_BFV:
case TDVF_SECTION_TYPE_CFV:
/* The sections that must be copied from firmware image to TD memory */
if (entry->data_len == 0) {
error_report("%d section with RawDataSize == 0", entry->type);
return -1;
}
break;
case TDVF_SECTION_TYPE_TD_HOB:
case TDVF_SECTION_TYPE_TEMP_MEM:
/* The sections that no need to be copied from firmware image */
if (entry->data_len != 0) {
error_report("%d section with RawDataSize 0x%x != 0",
entry->type, entry->data_len);
return -1;
}
break;
default:
error_report("TDVF contains unsupported section type %d", entry->type);
return -1;
}
return 0;
}
int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size)
{
TdvfSectionEntry *sections;
TdvfMetadata *metadata;
ssize_t entries_size;
uint32_t len, i;
metadata = tdvf_get_metadata(flash_ptr, size);
if (!metadata) {
return -EINVAL;
}
//load and parse metadata entries
fw->nr_entries = le32_to_cpu(metadata->NumberOfSectionEntries);
if (fw->nr_entries < 2) {
error_report("Invalid number of fw entries (%u) in TDVF", fw->nr_entries);
return -EINVAL;
}
len = le32_to_cpu(metadata->Length);
entries_size = fw->nr_entries * sizeof(TdvfSectionEntry);
if (len != sizeof(*metadata) + entries_size) {
error_report("TDVF metadata len (0x%x) mismatch, expected (0x%x)",
len, (uint32_t)(sizeof(*metadata) + entries_size));
return -EINVAL;
}
fw->entries = g_new(TdxFirmwareEntry, fw->nr_entries);
sections = g_new(TdvfSectionEntry, fw->nr_entries);
if (!memcpy(sections, (void *)metadata + sizeof(*metadata), entries_size)) {
error_report("Failed to read TDVF section entries");
goto err;
}
for (i = 0; i < fw->nr_entries; i++) {
if (tdvf_parse_and_check_section_entry(&sections[i], &fw->entries[i])) {
goto err;
}
}
g_free(sections);
fw->mem_ptr = flash_ptr;
return 0;
err:
g_free(sections);
fw->entries = 0;
g_free(fw->entries);
return -EINVAL;
}

View File

@@ -47,6 +47,7 @@
#include "hw/intc/i8259.h"
#include "hw/rtc/mc146818rtc.h"
#include "target/i386/sev.h"
#include "kvm/tdx.h"
#include "hw/acpi/cpu_hotplug.h"
#include "hw/irq.h"
@@ -636,20 +637,19 @@ void gsi_handler(void *opaque, int n, int level)
}
}
void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
void ioapic_init_gsi(GSIState *gsi_state, Object *parent)
{
DeviceState *dev;
SysBusDevice *d;
unsigned int i;
assert(parent_name);
assert(parent);
if (kvm_ioapic_in_kernel()) {
dev = qdev_new(TYPE_KVM_IOAPIC);
} else {
dev = qdev_new(TYPE_IOAPIC);
}
object_property_add_child(object_resolve_path(parent_name, NULL),
"ioapic", OBJECT(dev));
object_property_add_child(parent, "ioapic", OBJECT(dev));
d = SYS_BUS_DEVICE(dev);
sysbus_realize_and_unref(d, &error_fatal);
sysbus_mmio_map(d, 0, IO_APIC_DEFAULT_ADDRESS);
@@ -1154,9 +1154,17 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
(bios_size % 65536) != 0) {
goto bios_error;
}
bios = g_malloc(sizeof(*bios));
memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
if (sev_enabled()) {
if (is_tdx_vm()) {
memory_region_init_ram_guest_memfd(bios, NULL, "pc.bios", bios_size,
&error_fatal);
tdx_set_tdvf_region(bios);
} else {
memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
}
if (sev_enabled() || is_tdx_vm()) {
/*
* The concept of a "reset" simply doesn't exist for
* confidential computing guests, we have to destroy and
@@ -1178,17 +1186,20 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
}
g_free(filename);
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(bios_size, 128 * KiB);
isa_bios = g_malloc(sizeof(*isa_bios));
memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
bios_size - isa_bios_size, isa_bios_size);
memory_region_add_subregion_overlap(rom_memory,
0x100000 - isa_bios_size,
isa_bios,
1);
if (!isapc_ram_fw) {
memory_region_set_readonly(isa_bios, true);
/* For TDX, alias different GPAs to same private memory is not supported */
if (!is_tdx_vm()) {
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(bios_size, 128 * KiB);
isa_bios = g_malloc(sizeof(*isa_bios));
memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
bios_size - isa_bios_size, isa_bios_size);
memory_region_add_subregion_overlap(rom_memory,
0x100000 - isa_bios_size,
isa_bios,
1);
if (!isapc_ram_fw) {
memory_region_set_readonly(isa_bios, true);
}
}
/* map all the bios at the top of memory */
@@ -1386,6 +1397,17 @@ static void machine_set_sgx_epc(Object *obj, Visitor *v, const char *name,
qapi_free_SgxEPCList(list);
}
static int x86_kvm_type(MachineState *ms, const char *vm_type)
{
X86MachineState *x86ms = X86_MACHINE(ms);
int kvm_type;
kvm_type = kvm_enabled() ? kvm_get_vm_type(ms, vm_type) : 0;
x86ms->vm_type = kvm_type;
return kvm_type;
}
static void x86_machine_initfn(Object *obj)
{
X86MachineState *x86ms = X86_MACHINE(obj);
@@ -1399,6 +1421,7 @@ static void x86_machine_initfn(Object *obj)
x86ms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
x86ms->bus_lock_ratelimit = 0;
x86ms->above_4g_mem_start = 4 * GiB;
x86ms->eoi_intercept_unsupported = false;
}
static void x86_machine_class_init(ObjectClass *oc, void *data)
@@ -1410,6 +1433,7 @@ static void x86_machine_class_init(ObjectClass *oc, void *data)
mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
mc->kvm_type = x86_kvm_type;
x86mc->save_tsc_khz = true;
x86mc->fwcfg_dma_enabled = true;
nc->nmi_monitor_handler = x86_nmi;

View File

@@ -12,10 +12,6 @@ config IOAPIC
bool
select I8259
config ARM_GIC
bool
select MSI_NONBROKEN
config OPENPIC
bool
select MSI_NONBROKEN
@@ -25,14 +21,18 @@ config APIC
select MSI_NONBROKEN
select I8259
config ARM_GIC
bool
select ARM_GICV3_TCG if TCG
select ARM_GIC_KVM if KVM
select MSI_NONBROKEN
config ARM_GICV3_TCG
bool
default y
depends on ARM_GIC && TCG
config ARM_GIC_KVM
bool
default y
depends on ARM_GIC && KVM
config XICS

View File

@@ -476,7 +476,6 @@ static void setup_interrupt(IVShmemState *s, int vector, Error **errp)
static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
{
Error *local_err = NULL;
struct stat buf;
size_t size;
@@ -496,10 +495,9 @@ static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
size = buf.st_size;
/* mmap the region and map into the BAR2 */
memory_region_init_ram_from_fd(&s->server_bar2, OBJECT(s), "ivshmem.bar2",
size, RAM_SHARED, fd, 0, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram_from_fd(&s->server_bar2, OBJECT(s),
"ivshmem.bar2", size, RAM_SHARED,
fd, 0, errp)) {
return;
}

View File

@@ -421,7 +421,7 @@ static uint16_t tulip_mdi_default[] = {
/* MDI Registers 8 - 15 */
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
/* MDI Registers 16 - 31 */
0x0003, 0x0000, 0x0001, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
0x0003, 0x0000, 0x0001, 0x0000, 0x3b40, 0x0000, 0x0000, 0x0000,
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
};
@@ -429,7 +429,7 @@ static uint16_t tulip_mdi_default[] = {
static const uint16_t tulip_mdi_mask[] = {
0x0000, 0xffff, 0xffff, 0xffff, 0xc01f, 0xffff, 0xffff, 0x0000,
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
0x0fff, 0x0000, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff,
0x0fff, 0x0000, 0xffff, 0xffff, 0x0000, 0xffff, 0xffff, 0xffff,
0xffff, 0xffff, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
};

View File

@@ -7924,7 +7924,7 @@ static void nvme_init_state(NvmeCtrl *n)
n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
QTAILQ_INIT(&n->aer_queue);
list->numcntl = cpu_to_le16(max_vfs);
list->numcntl = max_vfs;
for (i = 0; i < max_vfs; i++) {
sctrl = &list->sec[i];
sctrl->pcid = cpu_to_le16(n->cntlid);
@@ -8466,36 +8466,26 @@ static void nvme_pci_reset(DeviceState *qdev)
nvme_ctrl_reset(n, NVME_RESET_FUNCTION);
}
static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address,
uint32_t val, int len)
static void nvme_sriov_post_write_config(PCIDevice *dev, uint16_t old_num_vfs)
{
NvmeCtrl *n = NVME(dev);
NvmeSecCtrlEntry *sctrl;
uint16_t sriov_cap = dev->exp.sriov_cap;
uint32_t off = address - sriov_cap;
int i, num_vfs;
int i;
if (!sriov_cap) {
return;
}
if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
if (!(val & PCI_SRIOV_CTRL_VFE)) {
num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
for (i = 0; i < num_vfs; i++) {
sctrl = &n->sec_ctrl_list.sec[i];
nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
}
}
for (i = pcie_sriov_num_vfs(dev); i < old_num_vfs; i++) {
sctrl = &n->sec_ctrl_list.sec[i];
nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
}
}
static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
uint32_t val, int len)
{
nvme_sriov_pre_write_ctrl(dev, address, val, len);
uint16_t old_num_vfs = pcie_sriov_num_vfs(dev);
pci_default_write_config(dev, address, val, len);
pcie_cap_flr_write_config(dev, address, val, len);
nvme_sriov_post_write_config(dev, old_num_vfs);
}
static const VMStateDescription nvme_vmstate = {

View File

@@ -336,12 +336,9 @@ static void nrf51_nvm_init(Object *obj)
static void nrf51_nvm_realize(DeviceState *dev, Error **errp)
{
NRF51NVMState *s = NRF51_NVM(dev);
Error *err = NULL;
memory_region_init_rom_device(&s->flash, OBJECT(dev), &flash_ops, s,
"nrf51_soc.flash", s->flash_size, &err);
if (err) {
error_propagate(errp, err);
if (!memory_region_init_rom_device(&s->flash, OBJECT(dev), &flash_ops, s,
"nrf51_soc.flash", s->flash_size, errp)) {
return;
}

View File

@@ -340,6 +340,8 @@ static void designware_pcie_root_config_write(PCIDevice *d, uint32_t address,
break;
case DESIGNWARE_PCIE_ATU_VIEWPORT:
val &= DESIGNWARE_PCIE_ATU_REGION_INBOUND |
(DESIGNWARE_PCIE_NUM_VIEWPORTS - 1);
root->atu_viewport = val;
break;

View File

@@ -179,6 +179,8 @@ static Property q35_host_props[] = {
mch.below_4g_mem_size, 0),
DEFINE_PROP_SIZE(PCI_HOST_ABOVE_4G_MEM_SIZE, Q35PCIHost,
mch.above_4g_mem_size, 0),
DEFINE_PROP_BOOL(PCI_HOST_PROP_SMM_RANGES, Q35PCIHost,
mch.has_smm_ranges, true),
DEFINE_PROP_BOOL("x-pci-hole64-fix", Q35PCIHost, pci_hole64_fix, true),
DEFINE_PROP_END_OF_LIST(),
};
@@ -214,6 +216,7 @@ static void q35_host_initfn(Object *obj)
/* mch's object_initialize resets the default value, set it again */
qdev_prop_set_uint64(DEVICE(s), PCI_HOST_PROP_PCI_HOLE64_SIZE,
Q35_PCI_HOST_HOLE64_SIZE_DEFAULT);
object_property_add(obj, PCI_HOST_PROP_PCI_HOLE_START, "uint32",
q35_host_get_pci_hole_start,
NULL, NULL, NULL);
@@ -476,6 +479,10 @@ static void mch_write_config(PCIDevice *d,
mch_update_pciexbar(mch);
}
if (!mch->has_smm_ranges) {
return;
}
if (ranges_overlap(address, len, MCH_HOST_BRIDGE_SMRAM,
MCH_HOST_BRIDGE_SMRAM_SIZE)) {
mch_update_smram(mch);
@@ -494,10 +501,13 @@ static void mch_write_config(PCIDevice *d,
static void mch_update(MCHPCIState *mch)
{
mch_update_pciexbar(mch);
mch_update_pam(mch);
mch_update_smram(mch);
mch_update_ext_tseg_mbytes(mch);
mch_update_smbase_smram(mch);
if (mch->has_smm_ranges) {
mch_update_smram(mch);
mch_update_ext_tseg_mbytes(mch);
mch_update_smbase_smram(mch);
}
/*
* pci hole goes from end-of-low-ram to io-apic.
@@ -538,19 +548,21 @@ static void mch_reset(DeviceState *qdev)
pci_set_quad(d->config + MCH_HOST_BRIDGE_PCIEXBAR,
MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT);
d->config[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_DEFAULT;
d->config[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_DEFAULT;
d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMASK;
if (mch->has_smm_ranges) {
d->config[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_DEFAULT;
d->config[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_DEFAULT;
d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMASK;
if (mch->ext_tseg_mbytes > 0) {
pci_set_word(d->config + MCH_HOST_BRIDGE_EXT_TSEG_MBYTES,
MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_QUERY);
if (mch->ext_tseg_mbytes > 0) {
pci_set_word(d->config + MCH_HOST_BRIDGE_EXT_TSEG_MBYTES,
MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_QUERY);
}
d->config[MCH_HOST_BRIDGE_F_SMBASE] = 0;
d->wmask[MCH_HOST_BRIDGE_F_SMBASE] = 0xff;
}
d->config[MCH_HOST_BRIDGE_F_SMBASE] = 0;
d->wmask[MCH_HOST_BRIDGE_F_SMBASE] = 0xff;
mch_update(mch);
}
@@ -568,6 +580,20 @@ static void mch_realize(PCIDevice *d, Error **errp)
/* setup pci memory mapping */
pc_pci_as_mapping_init(mch->system_memory, mch->pci_address_space);
/* PAM */
init_pam(&mch->pam_regions[0], OBJECT(mch), mch->ram_memory,
mch->system_memory, mch->pci_address_space,
PAM_BIOS_BASE, PAM_BIOS_SIZE);
for (i = 0; i < ARRAY_SIZE(mch->pam_regions) - 1; ++i) {
init_pam(&mch->pam_regions[i + 1], OBJECT(mch), mch->ram_memory,
mch->system_memory, mch->pci_address_space,
PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
}
if (!mch->has_smm_ranges) {
return;
}
/* if *disabled* show SMRAM to all CPUs */
memory_region_init_alias(&mch->smram_region, OBJECT(mch), "smram-region",
mch->pci_address_space, MCH_HOST_BRIDGE_SMRAM_C_BASE,
@@ -634,15 +660,6 @@ static void mch_realize(PCIDevice *d, Error **errp)
object_property_add_const_link(qdev_get_machine(), "smram",
OBJECT(&mch->smram));
init_pam(&mch->pam_regions[0], OBJECT(mch), mch->ram_memory,
mch->system_memory, mch->pci_address_space,
PAM_BIOS_BASE, PAM_BIOS_SIZE);
for (i = 0; i < ARRAY_SIZE(mch->pam_regions) - 1; ++i) {
init_pam(&mch->pam_regions[i + 1], OBJECT(mch), mch->ram_memory,
mch->system_memory, mch->pci_address_space,
PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
}
}
uint64_t mch_mcfg_base(void)

View File

@@ -345,8 +345,10 @@ static void raven_realize(PCIDevice *d, Error **errp)
d->config[PCI_LATENCY_TIMER] = 0x10;
d->config[PCI_CAPABILITY_LIST] = 0x00;
memory_region_init_rom_nomigrate(&s->bios, OBJECT(s), "bios", BIOS_SIZE,
&error_fatal);
if (!memory_region_init_rom_nomigrate(&s->bios, OBJECT(s), "bios",
BIOS_SIZE, errp)) {
return;
}
memory_region_add_subregion(get_system_memory(), (uint32_t)(-BIOS_SIZE),
&s->bios);
if (s->bios_name) {

View File

@@ -15,7 +15,6 @@
#include "sysemu/kvm.h"
#include "migration/blocker.h"
#include "exec/confidential-guest-support.h"
#include "hw/ppc/pef.h"
#define TYPE_PEF_GUEST "pef-guest"
OBJECT_DECLARE_SIMPLE_TYPE(PefGuest, PEF_GUEST)
@@ -93,7 +92,7 @@ static int kvmppc_svm_off(Error **errp)
#endif
}
int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
static int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
{
if (!object_dynamic_cast(OBJECT(cgs), TYPE_PEF_GUEST)) {
return 0;
@@ -107,7 +106,7 @@ int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
return kvmppc_svm_init(cgs, errp);
}
int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp)
static int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp)
{
if (!object_dynamic_cast(OBJECT(cgs), TYPE_PEF_GUEST)) {
return 0;
@@ -131,6 +130,10 @@ OBJECT_DEFINE_TYPE_WITH_INTERFACES(PefGuest,
static void pef_guest_class_init(ObjectClass *oc, void *data)
{
ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
klass->kvm_init = pef_kvm_init;
klass->kvm_reset = pef_kvm_reset;
}
static void pef_guest_init(Object *obj)

View File

@@ -143,7 +143,6 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
RS6000MCState *s = RS6000MC(dev);
int socket = 0;
unsigned int ram_size = s->ram_size / MiB;
Error *local_err = NULL;
while (socket < 6) {
if (ram_size >= 64) {
@@ -165,10 +164,8 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
if (s->simm_size[socket]) {
char name[] = "simm.?";
name[5] = socket + '0';
memory_region_init_ram(&s->simm[socket], OBJECT(dev), name,
s->simm_size[socket] * MiB, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram(&s->simm[socket], OBJECT(dev), name,
s->simm_size[socket] * MiB, errp)) {
return;
}
memory_region_add_subregion_overlap(get_system_memory(), 0,

View File

@@ -74,6 +74,7 @@
#include "hw/virtio/vhost-scsi-common.h"
#include "exec/ram_addr.h"
#include "exec/confidential-guest-support.h"
#include "hw/usb.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
@@ -86,7 +87,6 @@
#include "hw/ppc/spapr_tpm_proxy.h"
#include "hw/ppc/spapr_nvdimm.h"
#include "hw/ppc/spapr_numa.h"
#include "hw/ppc/pef.h"
#include "monitor/monitor.h"
@@ -1687,7 +1687,9 @@ static void spapr_machine_reset(MachineState *machine, ShutdownCause reason)
qemu_guest_getrandom_nofail(spapr->fdt_rng_seed, 32);
}
pef_kvm_reset(machine->cgs, &error_fatal);
if (machine->cgs) {
confidential_guest_kvm_reset(machine->cgs, &error_fatal);
}
spapr_caps_apply(spapr);
first_ppc_cpu = POWERPC_CPU(first_cpu);
@@ -2810,7 +2812,9 @@ static void spapr_machine_init(MachineState *machine)
/*
* if Secure VM (PEF) support is configured, then initialize it
*/
pef_kvm_init(machine->cgs, &error_fatal);
if (machine->cgs) {
confidential_guest_kvm_init(machine->cgs, &error_fatal);
}
msi_nonbroken = true;
@@ -4647,13 +4651,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
mc->block_default_type = IF_SCSI;
/*
* Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
* should be limited by the host capability instead of hardcoded.
* max_cpus for KVM guests will be checked in kvm_init(), and TCG
* guests are welcome to have as many CPUs as the host are capable
* of emulate.
* While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
* In TCG the limit is restricted by the range of CPU IPIs available.
*/
mc->max_cpus = INT32_MAX;
mc->max_cpus = SPAPR_IRQ_NR_IPIS;
mc->no_parallel = 1;
mc->default_boot_order = "";

View File

@@ -23,6 +23,8 @@
#include "trace.h"
QEMU_BUILD_BUG_ON(SPAPR_IRQ_NR_IPIS > SPAPR_XIRQ_BASE);
static const TypeInfo spapr_intc_info = {
.name = TYPE_SPAPR_INTC,
.parent = TYPE_INTERFACE,
@@ -329,7 +331,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
int i;
dev = qdev_new(TYPE_SPAPR_XIVE);
qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_IRQ_NR_IPIS);
/*
* 8 XIVE END structures per CPU. One for each available
* priority
@@ -356,7 +358,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
}
spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr,
smc->nr_xirqs + SPAPR_XIRQ_BASE);
smc->nr_xirqs + SPAPR_IRQ_NR_IPIS);
/*
* Mostly we don't actually need this until reset, except that not

View File

@@ -132,7 +132,7 @@ static void build_rhct(GArray *table_data,
size_t len, aligned_len;
uint32_t isa_offset, num_rhct_nodes;
RISCVCPU *cpu;
char *isa;
g_autofree char *isa = NULL;
AcpiTable table = { .sig = "RHCT", .rev = 1, .oem_id = s->oem_id,
.oem_table_id = s->oem_table_id };

View File

@@ -141,6 +141,7 @@ static void pl031_write(void * opaque, hwaddr offset,
g_autofree const char *qom_path = object_get_canonical_path(opaque);
struct tm tm;
s->lr = value;
s->tick_offset += value - pl031_get_count(s);
qemu_get_timedate(&tm, s->tick_offset);

View File

@@ -14,6 +14,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "exec/ram_addr.h"
#include "exec/confidential-guest-support.h"
#include "hw/s390x/s390-virtio-hcall.h"
#include "hw/s390x/sclp.h"
#include "hw/s390x/s390_flic.h"
@@ -267,7 +268,9 @@ static void ccw_init(MachineState *machine)
s390_init_cpus(machine);
/* Need CPU model to be determined before we can set up PV */
s390_pv_init(machine->cgs, &error_fatal);
if (machine->cgs) {
confidential_guest_kvm_init(machine->cgs, &error_fatal);
}
s390_flic_init();

View File

@@ -1159,6 +1159,7 @@ again:
lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
lsi_disconnect(s);
trace_lsi_execute_script_stop();
reentrancy_level--;
return;
}
insn = read_dword(s, s->dsp);

View File

@@ -1928,7 +1928,7 @@ static void megasas_command_cancelled(SCSIRequest *req)
{
MegasasCmd *cmd = req->hba_private;
if (!cmd) {
if (!cmd || !cmd->frame) {
return;
}
cmd->frame->header.cmd_status = MFI_STAT_SCSI_IO_FAILED;

View File

@@ -327,11 +327,17 @@ static void scsi_read_complete(void * opaque, int ret)
if (r->req.cmd.buf[0] == READ_CAPACITY_10 &&
(ldl_be_p(&r->buf[0]) != 0xffffffffU || s->max_lba == 0)) {
s->blocksize = ldl_be_p(&r->buf[4]);
s->max_lba = ldl_be_p(&r->buf[0]) & 0xffffffffULL;
BlockBackend *blk = s->conf.blk;
BlockDriverState *bs = blk_bs(blk);
s->max_lba = bs->total_sectors - 1;
stl_be_p(&r->buf[0], s->max_lba);
} else if (r->req.cmd.buf[0] == SERVICE_ACTION_IN_16 &&
(r->req.cmd.buf[1] & 31) == SAI_READ_CAPACITY_16) {
s->blocksize = ldl_be_p(&r->buf[8]);
s->max_lba = ldq_be_p(&r->buf[0]);
BlockBackend *blk = s->conf.blk;
BlockDriverState *bs = blk_bs(blk);
s->max_lba = bs->total_sectors - 1;
stq_be_p(&r->buf[0], s->max_lba);
}
/*
@@ -396,7 +402,10 @@ static void scsi_write_complete(void * opaque, int ret)
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
if (ret || r->req.io_canceled) {
if (ret || r->req.io_canceled ||
r->io_header.status != SCSI_HOST_OK ||
(r->io_header.driver_status & SG_ERR_DRIVER_TIMEOUT) ||
r->io_header.status != GOOD) {
scsi_command_complete_noio(r, ret);
goto done;
}

View File

@@ -1149,6 +1149,7 @@ static void virtio_scsi_drained_begin(SCSIBus *bus)
static void virtio_scsi_drained_end(SCSIBus *bus)
{
VirtIOSCSI *s = container_of(bus, VirtIOSCSI, bus);
VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
VirtIODevice *vdev = VIRTIO_DEVICE(s);
uint32_t total_queues = VIRTIO_SCSI_VQ_NUM_FIXED +
s->parent_obj.conf.num_queues;
@@ -1166,7 +1167,11 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
for (uint32_t i = 0; i < total_queues; i++) {
VirtQueue *vq = virtio_get_queue(vdev, i);
virtio_queue_aio_attach_host_notifier(vq, s->ctx);
if (vq == vs->event_vq) {
virtio_queue_aio_attach_host_notifier_no_poll(vq, s->ctx);
} else {
virtio_queue_aio_attach_host_notifier(vq, s->ctx);
}
}
}

View File

@@ -346,6 +346,11 @@ static const QemuOptDesc qemu_smbios_type4_opts[] = {
};
static const QemuOptDesc qemu_smbios_type8_opts[] = {
{
.name = "type",
.type = QEMU_OPT_NUMBER,
.help = "SMBIOS element type",
},
{
.name = "internal_reference",
.type = QEMU_OPT_STRING,
@@ -366,9 +371,15 @@ static const QemuOptDesc qemu_smbios_type8_opts[] = {
.type = QEMU_OPT_NUMBER,
.help = "port type",
},
{ /* end of list */ }
};
static const QemuOptDesc qemu_smbios_type11_opts[] = {
{
.name = "type",
.type = QEMU_OPT_NUMBER,
.help = "SMBIOS element type",
},
{
.name = "value",
.type = QEMU_OPT_STRING,
@@ -379,6 +390,7 @@ static const QemuOptDesc qemu_smbios_type11_opts[] = {
.type = QEMU_OPT_STRING,
.help = "OEM string data from file",
},
{ /* end of list */ }
};
static const QemuOptDesc qemu_smbios_type17_opts[] = {
@@ -1251,6 +1263,7 @@ void smbios_entry_add(QemuOpts *opts, Error **errp)
struct smbios_structure_header *header;
int size;
struct smbios_table *table; /* legacy mode only */
uint8_t *dbl_nulls, *orig_end;
if (!qemu_opts_validate(opts, qemu_smbios_file_opts, errp)) {
return;
@@ -1263,11 +1276,21 @@ void smbios_entry_add(QemuOpts *opts, Error **errp)
}
/*
* NOTE: standard double '\0' terminator expected, per smbios spec.
* (except in legacy mode, where the second '\0' is implicit and
* will be inserted by the BIOS).
* NOTE: standard double '\0' terminator expected, per smbios spec,
* unless the data is formatted for legacy mode, which is used by
* pc-i440fx-2.0 and earlier machine types. Legacy mode structures
* without strings have no '\0' terminators, and those with strings
* also don't have an additional '\0' terminator at the end of the
* final string '\0' terminator. The BIOS will add the '\0' terminators
* to comply with the smbios spec.
* For greater compatibility, regardless of the machine type used,
* either format is accepted.
*/
smbios_tables = g_realloc(smbios_tables, smbios_tables_len + size);
smbios_tables = g_realloc(smbios_tables, smbios_tables_len + size + 2);
orig_end = smbios_tables + smbios_tables_len + size;
/* add extra null bytes to end in case of legacy file data */
*orig_end = '\0';
*(orig_end + 1) = '\0';
header = (struct smbios_structure_header *)(smbios_tables +
smbios_tables_len);
@@ -1285,6 +1308,20 @@ void smbios_entry_add(QemuOpts *opts, Error **errp)
}
set_bit(header->type, have_binfile_bitmap);
}
for (dbl_nulls = smbios_tables + smbios_tables_len + header->length;
dbl_nulls + 2 <= orig_end; dbl_nulls++) {
if (*dbl_nulls == '\0' && *(dbl_nulls + 1) == '\0') {
break;
}
}
if (dbl_nulls + 2 < orig_end) {
error_setg(errp, "SMBIOS file data malformed");
return;
}
/* increase size by how many extra nulls were actually needed */
size += dbl_nulls + 2 - orig_end;
smbios_tables = g_realloc(smbios_tables, smbios_tables_len + size);
set_bit(header->type, have_binfile_bitmap);
if (header->type == 4) {
smbios_type4_count++;
@@ -1304,6 +1341,17 @@ void smbios_entry_add(QemuOpts *opts, Error **errp)
* delete the one we don't need from smbios_set_defaults(),
* once we know which machine version has been requested.
*/
if (dbl_nulls + 2 == orig_end) {
/* chop off nulls to get legacy format */
if (header->length + 2 == size) {
size -= 2;
} else {
size -= 1;
}
} else {
/* undo conversion from legacy format to per-spec format */
size -= dbl_nulls + 2 - orig_end;
}
if (!smbios_entries) {
smbios_entries_len = sizeof(uint16_t);
smbios_entries = g_malloc0(smbios_entries_len);

View File

@@ -577,12 +577,9 @@ static void idreg_realize(DeviceState *ds, Error **errp)
{
IDRegState *s = MACIO_ID_REGISTER(ds);
SysBusDevice *dev = SYS_BUS_DEVICE(ds);
Error *local_err = NULL;
memory_region_init_ram_nomigrate(&s->mem, OBJECT(ds), "sun4m.idreg",
sizeof(idreg_data), &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram_nomigrate(&s->mem, OBJECT(ds), "sun4m.idreg",
sizeof(idreg_data), errp)) {
return;
}
@@ -631,12 +628,9 @@ static void afx_realize(DeviceState *ds, Error **errp)
{
AFXState *s = TCX_AFX(ds);
SysBusDevice *dev = SYS_BUS_DEVICE(ds);
Error *local_err = NULL;
memory_region_init_ram_nomigrate(&s->mem, OBJECT(ds), "sun4m.afx", 4,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram_nomigrate(&s->mem, OBJECT(ds), "sun4m.afx",
4, errp)) {
return;
}
@@ -715,12 +709,9 @@ static void prom_realize(DeviceState *ds, Error **errp)
{
PROMState *s = OPENPROM(ds);
SysBusDevice *dev = SYS_BUS_DEVICE(ds);
Error *local_err = NULL;
memory_region_init_ram_nomigrate(&s->prom, OBJECT(ds), "sun4m.prom",
PROM_SIZE_MAX, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram_nomigrate(&s->prom, OBJECT(ds), "sun4m.prom",
PROM_SIZE_MAX, errp)) {
return;
}

View File

@@ -454,12 +454,9 @@ static void prom_realize(DeviceState *ds, Error **errp)
{
PROMState *s = OPENPROM(ds);
SysBusDevice *dev = SYS_BUS_DEVICE(ds);
Error *local_err = NULL;
memory_region_init_ram_nomigrate(&s->prom, OBJECT(ds), "sun4u.prom",
PROM_SIZE_MAX, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!memory_region_init_ram_nomigrate(&s->prom, OBJECT(ds), "sun4u.prom",
PROM_SIZE_MAX, errp)) {
return;
}

View File

@@ -273,13 +273,14 @@ static void usb_qdev_realize(DeviceState *qdev, Error **errp)
}
if (dev->pcap_filename) {
int fd = qemu_open_old(dev->pcap_filename, O_CREAT | O_WRONLY | O_TRUNC, 0666);
int fd = qemu_open_old(dev->pcap_filename,
O_CREAT | O_WRONLY | O_TRUNC | O_BINARY, 0666);
if (fd < 0) {
error_setg(errp, "open %s failed", dev->pcap_filename);
usb_qdev_unrealize(qdev);
return;
}
dev->pcap = fdopen(fd, "w");
dev->pcap = fdopen(fd, "wb");
usb_pcap_init(dev->pcap);
}
}

View File

@@ -824,9 +824,11 @@ static void vfio_msix_disable(VFIOPCIDevice *vdev)
}
}
if (vdev->nr_vectors) {
vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
}
/*
* Always clear MSI-X IRQ index. A PF device could have enabled
* MSI-X with no vectors. See vfio_msix_enable().
*/
vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
vfio_msi_disable_common(vdev);
vfio_intx_enable(vdev, &err);

View File

@@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque)
trace_virtio_iommu_system_reset();
memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num));
/*
* config.bypass is sticky across device reset, but should be restored on
* system reset
@@ -1302,8 +1304,6 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
virtio_init(vdev, VIRTIO_ID_IOMMU, sizeof(struct virtio_iommu_config));
memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num));
s->req_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE,
virtio_iommu_handle_command);
s->event_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE, NULL);

View File

@@ -605,8 +605,7 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
int fd = memory_region_get_fd(&vmem->memdev->mr);
Error *local_err = NULL;
qemu_prealloc_mem(fd, area, size, 1, NULL, &local_err);
if (local_err) {
if (!qemu_prealloc_mem(fd, area, size, 1, NULL, &local_err)) {
static bool warned;
/*
@@ -1249,8 +1248,7 @@ static int virtio_mem_prealloc_range_cb(VirtIOMEM *vmem, void *arg,
int fd = memory_region_get_fd(&vmem->memdev->mr);
Error *local_err = NULL;
qemu_prealloc_mem(fd, area, size, 1, NULL, &local_err);
if (local_err) {
if (!qemu_prealloc_mem(fd, area, size, 1, NULL, &local_err)) {
error_report_err(local_err);
return -ENOMEM;
}

View File

@@ -3556,6 +3556,17 @@ static void virtio_queue_host_notifier_aio_poll_end(EventNotifier *n)
void virtio_queue_aio_attach_host_notifier(VirtQueue *vq, AioContext *ctx)
{
/*
* virtio_queue_aio_detach_host_notifier() can leave notifications disabled.
* Re-enable them. (And if detach has not been used before, notifications
* being enabled is still the default state while a notifier is attached;
* see virtio_queue_host_notifier_aio_poll_end(), which will always leave
* notifications enabled once the polling section is left.)
*/
if (!virtio_queue_get_notification(vq)) {
virtio_queue_set_notification(vq, 1);
}
aio_set_event_notifier(ctx, &vq->host_notifier,
virtio_queue_host_notifier_read,
virtio_queue_host_notifier_aio_poll,
@@ -3563,6 +3574,13 @@ void virtio_queue_aio_attach_host_notifier(VirtQueue *vq, AioContext *ctx)
aio_set_event_notifier_poll(ctx, &vq->host_notifier,
virtio_queue_host_notifier_aio_poll_begin,
virtio_queue_host_notifier_aio_poll_end);
/*
* We will have ignored notifications about new requests from the guest
* while no notifiers were attached, so "kick" the virt queue to process
* those requests now.
*/
event_notifier_set(&vq->host_notifier);
}
/*
@@ -3573,14 +3591,38 @@ void virtio_queue_aio_attach_host_notifier(VirtQueue *vq, AioContext *ctx)
*/
void virtio_queue_aio_attach_host_notifier_no_poll(VirtQueue *vq, AioContext *ctx)
{
/* See virtio_queue_aio_attach_host_notifier() */
if (!virtio_queue_get_notification(vq)) {
virtio_queue_set_notification(vq, 1);
}
aio_set_event_notifier(ctx, &vq->host_notifier,
virtio_queue_host_notifier_read,
NULL, NULL);
/*
* See virtio_queue_aio_attach_host_notifier().
* Note that this may be unnecessary for the type of virtqueues this
* function is used for. Still, it will not hurt to have a quick look into
* whether we can/should process any of the virtqueue elements.
*/
event_notifier_set(&vq->host_notifier);
}
void virtio_queue_aio_detach_host_notifier(VirtQueue *vq, AioContext *ctx)
{
aio_set_event_notifier(ctx, &vq->host_notifier, NULL, NULL, NULL);
/*
* aio_set_event_notifier_poll() does not guarantee whether io_poll_end()
* will run after io_poll_begin(), so by removing the notifier, we do not
* know whether virtio_queue_host_notifier_aio_poll_end() has run after a
* previous virtio_queue_host_notifier_aio_poll_begin(), i.e. whether
* notifications are enabled or disabled. It does not really matter anyway;
* we just removed the notifier, so we do not care about notifications until
* we potentially re-attach it. The attach_host_notifier functions will
* ensure that notifications are enabled again when they are needed.
*/
}
void virtio_queue_host_notifier_read(EventNotifier *n)

View File

@@ -497,9 +497,14 @@ void aio_set_event_notifier(AioContext *ctx,
AioPollFn *io_poll,
EventNotifierHandler *io_poll_ready);
/* Set polling begin/end callbacks for an event notifier that has already been
/*
* Set polling begin/end callbacks for an event notifier that has already been
* registered with aio_set_event_notifier. Do nothing if the event notifier is
* not registered.
*
* Note that if the io_poll_end() callback (or the entire notifier) is removed
* during polling, it will not be called, so an io_poll_begin() is not
* necessarily always followed by an io_poll_end().
*/
void aio_set_event_notifier_poll(AioContext *ctx,
EventNotifier *notifier,

View File

@@ -48,7 +48,7 @@ void hmp_eject(Monitor *mon, const QDict *qdict);
void hmp_qemu_io(Monitor *mon, const QDict *qdict);
void hmp_info_block(Monitor *mon, const QDict *qdict);
void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict);
void hmp_info_blockstats(Monitor *mon, const QDict *qdict);
void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
void hmp_info_snapshots(Monitor *mon, const QDict *qdict);

View File

@@ -25,14 +25,14 @@
#ifndef BLOCK_QAPI_H
#define BLOCK_QAPI_H
#include "block/block-common.h"
#include "block/graph-lock.h"
#include "block/snapshot.h"
#include "qapi/qapi-types-block-core.h"
BlockDeviceInfo * GRAPH_RDLOCK
bdrv_block_device_info(BlockBackend *blk, BlockDriverState *bs,
bool flat, Error **errp);
BlockDeviceInfo *coroutine_fn GRAPH_RDLOCK
bdrv_block_device_info(BlockBackend *blk, BlockDriverState *bs, bool flat,
Error **errp);
int GRAPH_RDLOCK
bdrv_query_snapshot_info_list(BlockDriverState *bs,
SnapshotInfoList **p_list,
@@ -40,7 +40,10 @@ bdrv_query_snapshot_info_list(BlockDriverState *bs,
void GRAPH_RDLOCK
bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat,
bool skip_implicit_filters, Error **errp);
void GRAPH_RDLOCK
void coroutine_fn GRAPH_RDLOCK
bdrv_co_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info,
Error **errp);
void co_wrapper_bdrv_rdlock
bdrv_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info,
Error **errp);

Some files were not shown because too many files have changed in this diff Show More