Commit 6f6ad2b245 "hw/i386/pc: Confine system flash handling to pc_sysfw"
causes a regression when specifying the property `-M pflash0` in the PCI PC
machines:
qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
In order to revert the commit, the commit below must be reverted first.
This reverts commit cb05cc1602.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-2-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0fbe8d7c4c)
Referecens: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Don't get/put state of TDX VMs since accessing/mutating guest state of
production TDs is not supported.
Note, it will be allowed for a debug TD. Corresponding support will be
introduced when debug TD support is implemented in the future.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
For TDs, only MSR_IA32_UCODE_REV in kvm_init_msrs() can be configured
by VMM, while the features enumerated/controlled by other MSRs except
MSR_IA32_UCODE_REV in kvm_init_msrs() are not under control of VMM.
Only configure MSR_IA32_UCODE_REV for TDs.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TSC of TDs is not accessible and KVM doesn't allow access of
MSR_IA32_TSC for TDs. To avoid the assert() in kvm_get_tsc, make
kvm_synchronize_all_tsc() noop for TDs,
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Add a new bool member, eoi_intercept_unsupported, to X86MachineState
with default value false. Set true for TDX VM.
Inability to intercept eoi causes impossibility to emulate level
triggered interrupt to be re-injected when level is still kept active.
which affects interrupt controller emulation.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TDX CPU state is protected and thus vcpu state cann't be reset by VMM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Legacy PIC (8259) cannot be supported for TDX VMs since TDX module
doesn't allow directly interrupt injection. Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.
Hence disable PIC for TDX VMs and error out if user wants PIC.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TDX doesn't support SMM and VMM cannot emulate SMM for TDX VMs because
VMM cannot manipulate TDX VM's memory.
Disable SMM for TDX VMs and error out if user requests to enable SMM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Add a q35 property to check whether or not SMM ranges, e.g. SMRAM, TSEG,
etc... exist for the target platform. TDX doesn't support SMM and doesn't
play nice with QEMU modifying related guest memory ranges.
Signed-off-by: Isaku Yamahata <isaku.yamahata@linux.intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
In mch_realize(), process PAM initialization before SMRAM initialization so
that later patch can skill all the SMRAM related with a single check.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
TD guest can use TDG.VP.VMCALL<REPORT_FATAL_ERROR> to request termination
with error message encoded in GPRs.
Parse and print the error message, and terminate the TD guest in the
handler.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
MapGPA is a hypercall to convert GPA from/to private GPA to/from shared GPA.
As the conversion function is already implemented as kvm_convert_memory,
wire it to TDX hypercall exit.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Add property "quote-generation-socket" to tdx-guest, which is a property
of type SocketAddress to specify Quote Generation Service(QGS).
On request of GetQuote, it connects to the QGS socket, read request
data from shared guest memory, send the request data to the QGS,
and store the response into shared guest memory, at last notify
TD guest by interrupt.
command line example:
qemu-system-x86_64 \
-object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", "cid":"1","port":"1234"}}' \
-machine confidential-guest-support=tdx0
Note, above example uses vsock type socket because the QGS we used
implements the vsock socket. It can be other types, like UNIX socket,
which depends on the implementation of QGS.
To avoid no response from QGS server, setup a timer for the transaction.
If timeout, make it an error and interrupt guest. Define the threshold of
time to 30s at present, maybe change to other value if not appropriate.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Codeveloped-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Codeveloped-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
For SetupEventNotifyInterrupt, record interrupt vector and the apic id
of the vcpu that received this TDVMCALL.
Later it can inject interrupt with given vector to the specific vcpu
that received SetupEventNotifyInterrupt.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Invoke KVM_TDX_FINALIZE_VM to finalize the TD's measurement and make
the TD vCPUs runnable once machine initialization is complete.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TDX vcpu needs to be initialized by SEAMCALL(TDH.VP.INIT) and KVM
provides vcpu level IOCTL KVM_TDX_INIT_VCPU for it.
KVM_TDX_INIT_VCPU needs the address of the HOB as input. Invoke it for
each vcpu after HOB list is created.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TDVF firmware (CODE and VARS) needs to be copied to TD's private
memory, as well as TD HOB and TEMP memory.
If the TDVF section has TDVF_SECTION_ATTRIBUTES_MR_EXTEND set in the
flag, calling KVM_TDX_EXTEND_MEMORY to extend the measurement.
After populating the TDVF memory, the original image located in shared
ramblock can be discarded.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The TD HOB list is used to pass the information from VMM to TDVF. The TD
HOB must include PHIT HOB and Resource Descriptor HOB. More details can
be found in TDVF specification and PI specification.
Build the TD HOB in TDX's machine_init_done callback.
Co-developed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The RAM of TDX VM can be classified into two types:
- TDX_RAM_UNACCEPTED: default type of TDX memory, which needs to be
accepted by TDX guest before it can be used and will be all-zeros
after being accepted.
- TDX_RAM_ADDED: the RAM that is ADD'ed to TD guest before running, and
can be used directly. E.g., TD HOB and TEMP MEM that needed by TDVF.
Maintain TdxRamEntries[] which grabs the initial RAM info from e820 table
and mark each RAM range as default type TDX_RAM_UNACCEPTED.
Then turn the range of TD HOB and TEMP MEM to TDX_RAM_ADDED since these
ranges will be ADD'ed before TD runs and no need to be accepted runtime.
The TdxRamEntries[] are later used to setup the memory TD resource HOB
that passes memory info from QEMU to TDVF.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
For each TDVF sections, QEMU needs to copy the content to guest
private memory via KVM API (KVM_TDX_INIT_MEM_REGION).
Introduce a field @mem_ptr for TdxFirmwareEntry to track the memory
pointer of each TDVF sections. So that QEMU can add/copy them to guest
private memory later.
TDVF sections can be classified into two groups:
- Firmware itself, e.g., TDVF BFV and CFV, that located separately from
guest RAM. Its memory pointer is the bios pointer.
- Sections located at guest RAM, e.g., TEMP_MEM and TD_HOB.
mmap a new memory range for them.
Register a machine_init_done callback to do the stuff.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
For TDX, the address below 1MB are entirely general RAM. No need to
initialize pc.rom memory region for TDs.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
TDX doesn't support map different GPAs to same private memory. Thus,
aliasing top 128KB of BIOS as isa-bios is not supported.
On the other hand, TDX guest cannot go to real mode, it can work fine
without isa-bios.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
After TDVF is loaded to bios MemoryRegion, it needs parse TDVF metadata.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TDX VM needs to boot with its specialized firmware, Trusted Domain
Virtual Firmware (TDVF). QEMU needs to parse TDVF and map it in TD
guest memory prior to running the TDX VM.
A TDVF Metadata in TDVF image describes the structure of firmware.
QEMU refers to it to setup memory for TDVF. Introduce function
tdvf_parse_metadata() to parse the metadata from TDVF image and store
the info of each TDVF section.
TDX metadata is located by a TDX metadata offset block, which is a
GUID-ed structure. The data portion of the GUID structure contains
only an 4-byte field that is the offset of TDX metadata to the end
of firmware file.
Select X86_FW_OVMF when TDX is enable to leverage existing functions
to parse and search OVMF's GUID-ed structures.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
TDVF(OVMF) needs to run at private memory for TD guest. TDX cannot
support pflash device since it doesn't support read-only private memory.
Thus load TDVF(OVMF) with -bios option for TDs.
Use memory_region_init_ram_guest_memfd() to allocate the MemoryRegion
for TDVF because it needs to be located at private memory.
Also store the MemoryRegion pointer of TDVF since the shared ramblock of
it can be discared after it gets copied to private ramblock.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Introduce memory_region_init_ram_guest_memfd() to allocate private
guset memfd on the MemoryRegion initialization. It's for the use case of
TDVF, which must be private on TDX case.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region
which kvm memslot isn't assigned to (except in-kernel emulation).
qemu has the memory region for vMMIO at each device level.
While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO
region, qemu doesn't find corresponding vMMIO region because it's before
PCI device allocation and memory_region_find() finds the device region, not
PCI bus region. It's safe to ignore MapGPA(to-shared) because when guest
accesses those region they use GPA with shared bit set for vMMIO. Ignore
memory conversion request of non-assigned region to shared and return
success. Otherwise OVMF is confused and panics there.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Because vMMIO region needs to be shared region, guest TD may explicitly
convert such region from private to shared. Don't complain such
conversion.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
TDX only supports readonly for shared memory but not for private memory.
In the view of QEMU, it has no idea whether a memslot is used as shared
memory of private. Thus just mark kvm_readonly_mem_enabled to false to
TDX VM for simplicity.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reuse "-cpu,tsc-frequency=" to get user wanted tsc frequency and call VM
scope VM_SET_TSC_KHZ to set the tsc frequency of TD before KVM_TDX_INIT_VM.
Besides, sanity check the tsc frequency to be in the legal range and
legal granularity (required by TDX module).
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Three sha384 hash values, mrconfigid, mrowner and mrownerconfig, of a TD
can be provided for TDX attestation. Detailed meaning of them can be
found: https://lore.kernel.org/qemu-devel/31d6dbc1-f453-4cef-ab08-4813f4e0ff92@intel.com/
Allow user to specify those values via property mrconfigid, mrowner and
mrownerconfig. They are all in base64 format.
example
-object tdx-guest, \
mrconfigid=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v,\
mrowner=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v,\
mrownerconfig=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
fixed-1 bits must be set.
Besides, sanity check the attribute bits that have not been supported by
QEMU yet. e.g., debug bit, it will be allowed in the future when debug
TD support lands in QEMU.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Current KVM doesn't support PMU for TD guest. It returns error if TD is
created with PMU bit being set in attributes.
Disable PMU for TD guest on QEMU side.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
For QEMU VMs, PKS is configured via CPUID_7_0_ECX_PKS and PMU is
configured by x86cpu->enable_pmu. Reuse the existing configuration
interface for TDX VMs.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
For TDX KVM use case, Linux guest is the most major one. It requires
sept_ve_disable set. Make it default for the main use case. For other use
case, it can be enabled/disabled via qemu command line.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables
EPT violation conversion to #VE on guest TD access of PENDING pages.
Some guest OS (e.g., Linux TD guest) may require this bit as 1.
Otherwise refuse to boot.
Add sept-ve-disable property for tdx-guest object, for user to configure
this bit.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Invoke KVM_TDX_INIT in kvm_arch_pre_create_vcpu() that KVM_TDX_INIT
configures global TD configurations, e.g. the canonical CPUID config,
and must be executed prior to creating vCPUs.
Use kvm_x86_arch_cpuid() to setup the CPUID settings for TDX VM.
Note, this doesn't address the fact that QEMU may change the CPUID
configuration when creating vCPUs, i.e. punts on refactoring QEMU to
provide a stable CPUID config prior to kvm_arch_init().
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
work prior to create any vcpu. This is for i386 TDX because it needs
call TDX_INIT_VM before creating any vcpu.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Some bits in TD attributes have corresponding CPUID feature bits. Reflect
the fixed0/1 restriction on TD attributes to their corresponding CPUID
bits in tdx_cpuid_lookup[] as well.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
KVM requires userspace to pass XFAM configuration via CPUID 0xD leaves.
Convert tdx_caps->xfam_fixed0/1 into corresponding
tdx_cpuid_lookup[].tdx_fixed0/1 field of CPUID 0xD leaves. Thus the
requirement can be applied naturally.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
tdx_cpuid_lookup[].tdx_fixed0/1 is QEMU maintained data which reflects
TDX restrictions regrading what bits are fixed by TDX module.
It's retrieved from TDX spec and static. However, TDX may evolve and
change some fixed fields to configurable in the future. Update
tdx_cpuid.lookup[].tdx_fixed0/1 fields by removing the bits that
reported from TDX module as configurable. This can adapt with the
updated TDX (module) automatically.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Due to the fact that Intel-PT virtualization support has been broken in
QEMU since Sapphire Rapids generation[1], below warning is triggered when
luanching TD guest:
warning: host doesn't support requested feature: CPUID.07H:EBX.intel-pt [bit 25]
Before Intel-pt is fixed in QEMU, just make Intel-PT unsupported for TD
guest, to avoid the confusing warning.
[1] https://lore.kernel.org/qemu-devel/20230531084311.3807277-1-xiaoyao.li@intel.com/
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
bits of TD can be classified into 6 types:
------------------------------------------------------------------------
1 | As configured | configurable by VMM, independent of native value;
------------------------------------------------------------------------
2 | As configured | configurable by VMM if the bit is supported natively
(if native) | Otherwise it equals as native(0).
------------------------------------------------------------------------
3 | Fixed | fixed to 0/1
------------------------------------------------------------------------
4 | Native | reflect the native value
------------------------------------------------------------------------
5 | Calculated | calculated by TDX module.
------------------------------------------------------------------------
6 | Inducing #VE | get #VE exception
------------------------------------------------------------------------
Note:
1. All the configurable XFAM related features and TD attributes related
features fall into type #2. And fixed0/1 bits of XFAM and TD
attributes fall into type #3.
2. For CPUID leaves not listed in "CPUID virtualization Overview" table
in TDX module spec, TDX module injects #VE to TDs when those are
queried. For this case, TDs can request CPUID emulation from VMM via
TDVMCALL and the values are fully controlled by VMM.
Due to TDX module has its own virtualization policy on CPUID bits, it leads
to what reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
CPUID bits for TDs. In order to keep a consistent CPUID configuration
between VMM and TDs. Adjust supported CPUID for TDs based on TDX
restrictions.
Currently only focus on the CPUID leaves recognized by QEMU's
feature_word_info[] that are indexed by a FeatureWord.
Introduce a TDX CPUID lookup table, which maintains 1 entry for each
FeatureWord. Each entry has below fields:
- tdx_fixed0/1: The bits that are fixed as 0/1;
- depends_on_vmm_cap: The bits that are configurable from the view of
TDX module. But they requires emulation of VMM
when configured as enabled. For those, they are
not supported if VMM doesn't report them as
supported. So they need be fixed up by checking
if VMM supports them.
- inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
totally configurable by VMM.
- supported_on_ve: It's valid only when @inducing_ve is true. It represents
the maximum feature set supported that be emulated
for TDs.
By applying TDX CPUID lookup table and TDX capabilities reported from
TDX module, the supported CPUID for TDs can be obtained from following
steps:
- get the base of VMM supported feature set;
- if the leaf is not a FeatureWord just return VMM's value without
modification;
- if the leaf is an inducing_ve type, applying supported_on_ve mask and
return;
- include all native bits, it covers type #2, #4, and parts of type #1.
(it also includes some unsupported bits. The following step will
correct it.)
- apply fixed0/1 to it (it covers #3, and rectifies the previous step);
- add configurable bits (it covers the other part of type #1);
- fix the ones in vmm_fixup;
(Calculated type is ignored since it's determined at runtime).
Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
It will need special handling for TDX VMs all around the QEMU.
Introduce is_tdx_vm() helper to query if it's a TDX VM.
Cache tdx_guest object thus no need to cast from ms->cgs every time.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
KVM provides TDX capabilities via sub command KVM_TDX_CAPABILITIES of
IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing
TDX context. It will be used to validate user's setting later.
Since there is no interface reporting how many cpuid configs contains in
KVM_TDX_CAPABILITIES, QEMU chooses to try starting with a known number
and abort when it exceeds KVM_MAX_CPUID_ENTRIES.
Besides, introduce the interfaces to invoke TDX "ioctls" at different
scope (KVM, VM and VCPU) in preparation.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Implement TDX specific ConfidentialGuestSupportClass::kvm_init()
callback, tdx_kvm_init().
Set ms->require_guest_memfd to true to require private guest memfd
allocation for any memory backend.
More TDX specific initialization will be added later.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
TDX VM requires VM type KVM_X86_TDX_VM to be passed to
kvm_ioctl(KVM_CREATE_VM). Hence implement mc->kvm_type() for i386
architecture.
If tdx-guest object is specified to confidential-guest-support, like,
qemu -machine ...,confidential-guest-support=tdx0 \
-object tdx-guest,id=tdx0,...
it parses VM type as KVM_X86_TDX_VM. Otherwise, it's KVM_X86_DEFAULT_VM.
Also store the vm_type in MachineState for other code to query what the
VM type is.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Introduce tdx-guest object which inherits CONFIDENTIAL_GUEST_SUPPORT,
and will be used to create TDX VMs (TDs) by
qemu -machine ...,confidential-guest-support=tdx0 \
-object tdx-guest,id=tdx0
So far, it has no QAPI member/properety decleared and only one internal
member 'attributes' with fixed value 0 that not configurable.
QAPI properties will be added later.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Pull in recent TDX updates, which are not backwards compatible.
It's just to make this series runnable. It will be updated by script
scripts/update-linux-headers.sh
once TDX support is upstreamed in linux kernel
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
KVM side leaves the memory to shared by default, while may incur the
overhead of paging conversion on the first visit of each page. Because
the expectation is that page is likely to private for the VMs that
require private memory (has guest memfd).
Explicitly set the memory to private when memory region has valid
guest memfd backend.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
When geeting KVM_EXIT_MEMORY_FAULT exit, it indicates userspace needs to
do the memory conversion on the RAMBlock to turn the memory into desired
attribute, i.e., private/shared.
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when
KVM_EXIT_MEMORY_FAULT happens.
Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has
guest_memfd memory backend.
Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is
added.
When page is converted from shared to private, the original shared
memory can be discarded via ram_block_discard_range(). Note, shared
memory can be discarded only when it's not back'ed by hugetlb because
hugetlb is supposed to be pre-allocated and no need for discarding.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
When memory page is converted from private to shared, the original
private memory is back'ed by guest_memfd. Introduce
ram_block_discard_guest_memfd_range() for discarding memory in
guest_memfd.
Originally-from: Isaku Yamahata <isaku.yamahata@intel.com>
Codeveloped-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Introduce the helper functions to set the attributes of a range of
memory to private or shared.
This is necessary to notify KVM the private/shared attribute of each gpa
range. KVM needs the information to decide the GPA needs to be mapped at
hva-based shared memory or guest_memfd based private memory.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.
With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that
backend'ed both by hva-based shared memory and guest memfd based private
memory.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
The upper 16 bits of kvm_userspace_memory_region::slot are
address space id. Parse it separately in trace_kvm_set_user_memory().
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Add a new member "guest_memfd" to memory backends. When it's set
to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
guest_memfd will be allocated during RAMBlock allocation.
Memory backend's @guest_memfd is wired with @require_guest_memfd
field of MachineState. It avoid looking up the machine in phymem.c.
MachineState::require_guest_memfd is supposed to be set by any VMs
that requires KVM guest memfd as private memory, e.g., TDX VM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Add KVM guest_memfd support to RAMBlock so both normal hva based memory
and kvm guest memfd based private memory can be associated in one RAMBlock.
Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to
create private guest_memfd during RAMBlock setup.
Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
is more flexible and extensible than simply relying on the VM type because
in the future we may have the case that not all the memory of a VM need
guest memfd. As a benefit, it also avoid getting MachineState in memory
subsystem.
Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as TDX VM. How and when to set it for memory
backends will be implemented in the following patches.
Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
KVM guest_memfd allocated.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Guest memfd support in QEMU requires corresponding KVM guest memfd APIs,
which lands in Linux from v6.8-rc1.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Use unified confidential_guest_kvm_init(), to avoid exposing specific
functions.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from rfc v1:
- check machine->cgs not NULL before calling confidential_guest_kvm_init();
Use the unified interface to call confidential guest related kvm_init()
and kvm_reset(), to avoid exposing pef specific functions.
remove perf.h since it is now blank..
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from rfc v1:
- check machine->cgs not NULL before callling
confidential_guest_kvm_init/reset();
Use confidential_guest_kvm_init() instead of calling SEV specific
sev_kvm_init(). As a bouns, it fits to future TDX when TDX implements
its own confidential_guest_support and .kvm_init().
Move the "TypeInfo sev_guest_info" definition and related functions to
the end of the file, to avoid declaring the sev_kvm_init() ahead.
Delete the sve-stub.c since it's not needed anymore.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from rfc v1:
- check ms->cgs not NULL before calling confidential_guest_kvm_init();
- delete the sev-stub.c;
Different confidential VMs in different architectures all have the same
needs to do their specific initialization (and maybe resetting) stuffs
with KVM. Currently each of them exposes individual *_kvm_init()
functions and let machine code or kvm code to call it.
To make it more object oriented, add two virtual functions, kvm_init()
and kvm_reset() in ConfidentialGuestSupportClass, and expose two helpers
functions for invodking them.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes since rfc v1:
- Drop the NULL check and rely the check from the caller;
The interrupt handlers need to be populated before the device is realized since
internal devices such as the RTC are wired during realize(). If the interrupt
handlers aren't populated, devices such as the RTC will be wired with a NULL
interrupt handler, i.e. MC146818RtcState::irq is NULL.
Fixes: fc11ca08bc "hw/i386/q35: Realize LPC PCI function before accessing it"
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240217104644.19755-1-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 143f3fd3d8)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
pc_system_flash_create() checked for pcmc->pci_enabled which is redundant since
its caller already checked it. The method can be turned into just two lines, so
inline and remove it.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-8-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit cb05cc1602)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Rather than distributing PC system flash handling across three files, let's
confine it to one. Now, pc_system_firmware_init() creates, configures and cleans
up the system flash which makes the code easier to understand. It also avoids
the extra call to pc_system_flash_cleanup_unused() in the Xen case.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-7-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 6f6ad2b245)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Handling most of smbios data generation in the machine_done notifier is similar
to how the ARM virt machine handles it which also calls smbios_set_defaults()
there. The result is that all pc machines are freed from explicitly worrying
about smbios setup.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit a0204a5ed0)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
The attribute isn't user-changeable and only true for pc-based machines. Turn it
into a class attribute which allows for inlining pc_guest_info_init() into
pc_machine_initfn().
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240208220349.4948-4-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 6e6d59a94d)
Referencex: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Return early if bc->alloc is NULL. De-indent the if() ladder.
Note, this avoids a pointless call to error_propagate() with
errp=NULL at the 'out:' label.
Change trivial when reviewed with 'git-diff --ignore-all-space'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-16-philmd@linaro.org>
(cherry picked from commit e199f7ad4d)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
A memory page poisoned from the hypervisor level is no longer readable.
The migration of a VM will crash Qemu when it tries to read the
memory address space and stumbles on the poisoned page with a similar
stack trace:
Program terminated with signal SIGBUS, Bus error.
To avoid this VM crash during the migration, prevent the migration
when a known hardware poison exists on the VM.
Signed-off-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20240130190640.139364-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 06152b89db)
References: XXX
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
configurations to know the number of VFs being disabled due to SR-IOV
configuration writes, but the logic was flawed and resulted in
out-of-bound memory access.
It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
VFs, but it actually doesn't in the following cases:
- PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
- PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
- VFs were only partially enabled because of realization failure.
It is a responsibility of pcie_sriov to interpret SR-IOV configurations
and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
provides, to get the number of enabled VFs before and after SR-IOV
configuration writes.
Cc: qemu-stable@nongnu.org
Fixes: CVE-2024-26328
Fixes: 11871f53ef ("hw/nvme: Add support for the Virtualization Management command")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-1-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 91bb64a8d2)
References: bsc#1220065
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest stable release (8.2.2).
Full changelog here:
https://lore.kernel.org/qemu-devel/1709577077.783602.1474596.nullmailer@tls.msk.ru/
Upstream backports:
chardev/char-socket: Fix TLS io channels sending too much data to the backend
tests/unit/test-util-sockets: Remove temporary file after test
hw/usb/bus.c: PCAP adding 0xA in Windows version
hw/intc/Kconfig: Fix GIC settings when using "--without-default-devices"
gitlab: force allow use of pip in Cirrus jobs
tests/vm: avoid re-building the VM images all the time
tests/vm: update openbsd image to 7.4
target/i386: leave the A20 bit set in the final NPT walk
target/i386: remove unnecessary/wrong application of the A20 mask
target/i386: Fix physical address truncation
target/i386: check validity of VMCB addresses
target/i386: mask high bits of CR3 in 32-bit mode
pl031: Update last RTCLR value on write in case it's read back
hw/nvme: fix invalid endian conversion
update edk2 binaries to edk2-stable202402
update edk2 submodule to edk2-stable202402
target/ppc: Fix crash on machine check caused by ifetch
target/ppc: Fix lxv/stxv MSR facility check
.gitlab-ci.d/windows.yml: Drop msys2-32bit job
system/vl: Update description for input grab key
docs/system: Update description for input grab key
hw/hppa/Kconfig: Fix building with "configure --without-default-devices"
tests/qtest: Depend on dbus_display1_dep
meson: Explicitly specify dbus-display1.h dependency
audio: Depend on dbus_display1_dep
ui/console: Fix console resize with placeholder surface
ui/clipboard: add asserts for update and request
ui/clipboard: mark type as not available when there is no data
ui: reject extended clipboard message if not activated
target/i386: Generate an illegal opcode exception on cmp instructions with lock prefix
i386/cpuid: Move leaf 7 to correct group
i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F
i386/cpu: Mask with XCR0/XSS mask for FEAT_XSAVE_XCR0_HI and FEAT_XSAVE_XSS_HI leafs
i386/cpu: Clear FEAT_XSAVE_XSS_LO/HI leafs when CPUID_EXT_XSAVE is not available
.gitlab-ci/windows.yml: Don't install libusb or spice packages on 32-bit
iotests: Make 144 deterministic again
target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
target/arm: Fix SVE/SME gross MTE suppression checks
target/arm: Handle mte in do_ldrq, do_ldro
target/arm: Split out make_svemte_desc
target/arm: Adjust and validate mtedesc sizem1
target/arm: Fix nregs computation in do_{ld,st}_zpa
linux-user/aarch64: Choose SYNC as the preferred MTE mode
tests/acpi: Update DSDT.cxl to reflect change _STA return value.
hw/i386: Fix _STA return value for ACPI0017
tests/acpi: Allow update of DSDT.cxl
smmu: Clear SMMUPciBus pointer cache when system reset
virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
virtio-gpu: Correct virgl_renderer_resource_get_info() error check
hw/cxl: Pass CXLComponentState to cache_mem_ops
hw/cxl/device: read from register values in mdev_reg_read()
cxl/cdat: Fix header sum value in CDAT checksum
cxl/cdat: Handle cdat table build errors
vhost-user.rst: Fix vring address description
tcg/arm: Fix goto_tb for large translation blocks
tcg: Increase width of temp_subindex
hw/net/tulip: add chip status register values
hw/smbios: Fix port connector option validation
hw/smbios: Fix OEM strings table option validation
configure: run plugin TCG tests again
tests/docker: Add sqlite3 module to openSUSE Leap container
hw/riscv/virt-acpi-build.c: fix leak in build_rhct()
migration: Fix logic of channels and transport compatibility check
virtio-blk: avoid using ioeventfd state in irqfd conditional
virtio: Re-enable notifications after drain
virtio-scsi: Attach event vq notifier with no_poll
iotests: give tempdir an identifying name
iotests: fix leak of tmpdir in dry-run mode
hw/scsi/lsi53c895a: add missing decrement of reentrancy counter
linux-user/aarch64: Add padding before __kernel_rt_sigreturn
tcg/loongarch64: Set vector registers call clobbered
pci-host: designware: Limit value range of iATU viewport register
target/arm: Reinstate "vfp" property on AArch32 CPUs
qemu-options.hx: Improve -serial option documentation
system/vl.c: Fix handling of '-serial none -serial something'
target/arm: fix exception syndrome for AArch32 bkpt insn
block/blkio: Make s->mem_region_alignment be 64 bits
qemu-docs: Update options for graphical frontends
Make 'uri' optional for migrate QAPI
vfio/pci: Clear MSI-X IRQ index always
migration: Fix use-after-free of migration state object
migration: Plug memory leak on HMP migrate error path
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
spapr_irq_init currently uses existing macro SPAPR_XIRQ_BASE to refer to
the range of CPU IPIs during initialization of nr-irqs property.
It is more appropriate to have its own define which can be further
reused as appropriate for correct interpretation.
Suggested-by: Cedric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Kowshik Jois <kowsjois@linux.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
(cherry picked from commit 2df5c1f5b0)
References: bsc#1220310
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We wanted QEMU to support larger VMs (in therm of RAM size) by default
and we therefore introduced patch "[openSUSE] increase x86_64 physical
bits to 42". This, however, means that we create VMs with 42 bits of
physical address space even on hosts that only has, say, 40. And that
can't work.
In fact, it has been a problem since a long time (e.g., bsc#1205978) and
it's also the actual root cause of bsc#1219977.
Get rid of that old patch, in favor of a new one that still raise the
default number of address bits to 42, but only on hosts that supports
that.
This means that we can also use the proper SeaBIOS version, without
reverting commits that were only a problem due to our broken downstream
patch.
We probably aslo don't need to ship some of the custom ACPI tables (for
passing tests), but we'll actually remove them later, after double
checking properly that all the tests do work.
References: bsc#1205978
References: bsc#1219977
References: bsc#1220799
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update the copyright year to 2024, sort dependencies etc.
This way, 'osc' does not have to do these changes all the times (they're
automatic, so no big deal, but it's annoying to see them in the diffs of
all the requests).
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Backported commits:
* Update version for 8.2.1 release
* target/arm: Fix incorrect aa64_tidcp1 feature check
* target/arm: Fix A64 scalar SQSHRN and SQRSHRN
* target/xtensa: fix OOB TLB entry access
* qtest: bump aspeed_smc-test timeout to 6 minutes
* monitor: only run coroutine commands in qemu_aio_context
* iotests: port 141 to Python for reliable QMP testing
* iotests: add filter_qmp_generated_node_ids()
* block/blklogwrites: Fix a bug when logging "write zeroes" operations.
* virtio-net: correctly copy vnet header when flushing TX (bsc#1218484, CVE-2023-6693)
* tcg/arm: Fix SIGILL in tcg_out_qemu_st_direct
* linux-user/riscv: Adjust vdso signal frame cfa offsets
* linux-user: Fixed cpu restore with pc 0 on SIGBUS
* block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_status
* coroutine-ucontext: Save fake stack for pooled coroutine
* tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
* accel/tcg: Revert mapping of PCREL translation block to multiple virtual addresses
* acpi/tests/avocado/bits: wait for 200 seconds for SHUTDOWN event from bits VM
* s390x/pci: drive ISM reset from subsystem reset
* s390x/pci: refresh fh before disabling aif
* s390x/pci: avoid double enable/disable of aif
* hw/scsi/esp-pci: set DMA_STAT_BCMBLT when BLAST command issued
* hw/scsi/esp-pci: synchronise setting of DMA_STAT_DONE with ESP completion interrupt
* hw/scsi/esp-pci: generate PCI interrupt from separate ESP and PCI sources
* hw/scsi/esp-pci: use correct address register for PCI DMA transfers
* migration/rdma: define htonll/ntohll only if not predefined
* hw/pflash: implement update buffer for block writes
* hw/pflash: use ldn_{be,le}_p and stn_{be,le}_p
* hw/pflash: refactor pflash_data_write()
* backends/cryptodev: Do not ignore throttle/backends Errors
* target/i386: pcrel: store low bits of physical address in data[0]
* target/i386: fix incorrect EIP in PC-relative translation blocks
* target/i386: Do not re-compute new pc with CF_PCREL
* load_elf: fix iterator's type for elf file processing
* target/hppa: Update SeaBIOS-hppa to version 15
* target/hppa: Fix IOR and ISR on error in probe
* target/hppa: Fix IOR and ISR on unaligned access trap
* target/hppa: Export function hppa_set_ior_and_isr()
* target/hppa: Avoid accessing %gr0 when raising exception
* hw/hppa: Move software power button address back into PDC
* target/hppa: Fix PDC address translation on PA2.0 with PSW.W=0
* hw/pci-host/astro: Add missing astro & elroy registers for NetBSD
* hw/hppa/machine: Disable default devices with --nodefaults option
* hw/hppa/machine: Allow up to 3840 MB total memory
* readthodocs: fully specify a build environment
* .gitlab-ci.d/buildtest.yml: Work around htags bug when environment is large
* target/s390x: Fix LAE setting a wrong access register
* tests/qtest/virtio-ccw: Fix device presence checking
* tests/acpi: disallow tests/data/acpi/virt/SSDT.memhp changes
* tests/acpi: update expected data files
* edk2: update binaries to git snapshot
* edk2: update build config, set PcdUninstallMemAttrProtocol = TRUE.
* edk2: update to git snapshot
* tests/acpi: allow tests/data/acpi/virt/SSDT.memhp changes
* util: fix build with musl libc on ppc64le
* tcg/ppc: Use new registers for LQ destination
* hw/intc/arm_gicv3_cpuif: handle LPIs in in the list registers
* hw/vfio: fix iteration over global VFIODevice list
* vfio/container: Replace basename with g_path_get_basename
* edu: fix DMA range upper bound check
* hw/net: cadence_gem: Fix MDIO_OP_xxx values
* audio/audio.c: remove trailing newline in error_setg
* chardev/char.c: fix "abstract device type" error message
* target/riscv: Fix mcycle/minstret increment behavior
* hw/net/can/sja1000: fix bug for single acceptance filter and standard frame
* target/i386: the sgx_epc_get_section stub is reachable
* configure: use a native non-cross compiler for linux-user
* include/ui/rect.h: fix qemu_rect_init() mis-assignment
* target/riscv/kvm: do not use non-portable strerrorname_np()
* iotests: Basic tests for internal snapshots
* vl: Improve error message for conflicting -incoming and -loadvm
* block: Fix crash when loading snapshot on inactive node
References: bsc#1218484 (CVE-2023-6693)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Depending on the VM configuration (both at the VM definition level and
on the guest itself) a VGA console might be necessary, or weird lockup
will occur. Since the VGA module package is smalle enough, add a
dependency for it, from other display modules, to act as a workaround.
While there, make more explicit and precise the dependencies between all
the various modules, by specifying that they should all have the same
version and release.
References: bsc#1219164
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Historically, KVM was available only for x86 and s390, and was invoked
via a binary called 'kvm' or 'qemu-kvm'. For a while, we've shipped a
package that was making it possible to invoke QEMU like that, but only
for these two arches. This, however, created a lot of confusion and
dependencies issues.
Fix them by creating a symlink from 'qemu-kvm' to the proper binary on
all arches and by making the main QEMU package Providing and Obsoleting
(also on all arches) the old qemu-kvm one.
Note that, for RISCV, the qemu-system-riscv64 binary, to which the symlink
should point, is in the qemu-extra package. However, if we are on RISCV,
qemu-extra is an hard dependency of qemu. Therefore, it's fine to ship
the link and also set the Provides: and Obsoletes: tag in the qemu
package itself. It'd be more correct to do that in the qemu-extra
package, of course, but this would complicate the spec file and it's not
worth it, considering this is all legacy and should very well go away
soon.
References: bsc#1218684
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Add to the ipxe submodule the commit (and all its dependencies) for
fixing building with binutils 2.42
References: bsc#1219733
References: bsc#1219722
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Point the submodules to the repositories that host our downstream
patches:
* roms/seabios
- [openSUSE] switch to python3 as needed
- [openSUSE] build: enable cross compilation on ARM
- [openSUSE] build: be explicit about -mx86-used-note=no
* roms/SLOF
- Allow to override build date with SOURCE_DATE_EPOCH
* roms/ipxe
- [ath5k] Add missing AR5K_EEPROM_READ in ath5k_eeprom_read_turbo_modes
- [openSUSE] [build] Makefile: fix issues of build reproducibility
- [openSUSE] [test] help compiler out by initializing array[openSUSE]
- [openSUSE] [build] Silence GCC 12 spurious warnings
- [librm] Use explicit operand size when pushing a label address
* roms/skiboot
- [openSUSE] Makefile: define endianess for cross-building on aarch64
- [openSUSE] Make Sphinx build reproducible (boo#1102408)
* roms/qboot
- [openSUSE] add cross.ini file to handle aarch64 based build
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Update to latest upstream release.
The full list of changes are available at:
https://wiki.qemu.org/ChangeLog/8.2
Highlights include:
* New virtio-sound device emulation
* New virtio-gpu rutabaga device emulation used by Android emulator
* New hv-balloon for dynamic memory protocol device for Hyper-V guests
* New Universal Flash Storage device emulation
* Network Block Device (NBD) 64-bit offsets for improved performance
* dump-guest-memory now supports the standard kdump format
* ARM: Xilinx Versal board now models the CFU/CFI, and the TRNG device
* ARM: CPU emulation support for cortex-a710 and neoverse-n2
* ARM: architectural feature support for PACQARMA3, EPAC, Pauth2, FPAC,
FPACCOMBINE, TIDCP1, MOPS, HBC, and HPMN0
* HPPA: CPU emulation support for 64-bit PA-RISC 2.0
* HPPA: machine emulation support for C3700, including Astro memory
controller and four Elroy PCI bridges
* LoongArch: ISA support for LASX extension and PRELDX instruction
* LoongArch: CPU emulation support for la132
* RISC-V: ISA/extension support for AIA virtualization support via KVM,
and vector cryptographic instructions
* RISC-V: Numerous extension/instruction cleanups, fixes, and reworks
* s390x: support for vfio-ap passthrough of crypto adapter for
protected
virtualization guests
* Tricore: support for TC37x CPU which implements ISA v1.6.2
* Tricore: support for CRCN, FTOU, FTOHP, and HPTOF instructions
* x86: Zen support for PV console and network devices
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
In 8.2, the upstream added graph lock annotations that conflict with
our coroutine patches. The GRAPH_RDLOCK_GUARD_MAINLOOP is now asserted
at some places where we turned the code into a coroutine.
My understanding is that the annotation merely asserts that the code
is running in the main loop and serves effectively as a marker of
where the graph lock should be taken if the code were to be moved out
of the main loop. That's exactly what we're doing so it should be safe
to replace the main loop check with an actual rdlock/rdunlock pair.
Commits e7e801a2a7, 94d03a7425 and c89144c058 replace
GRAPH_RDLOCK_GUARD_MAINLOOP() with GRAPH_RDLOCK_GUARD() at the
bdrv_snapshot_list, bdrv_named_nodes_list and qmp_query_block
functions. The changes are split into those patches instead of here to
preserve bisectability.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The fstat call can take a long time to finish when running over
NFS. Add a version of it that runs in the thread pool.
Adapt one of its users, raw_co_get_allocated_file size to use the new
version. That function is called via QMP under the qemu_global_mutex
so it has a large chance of blocking VCPU threads in case it takes too
long to finish.
Signed-off-by: João Silva <jsilva@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This is another caller of bdrv_get_allocated_file_size() that needs to
be converted to a coroutine because that function will be made
asynchronous when called (indirectly) from the QMP dispatcher.
This QMP command is a candidate because it calls bdrv_do_query_node_info(),
which in turn calls bdrv_get_allocated_file_size().
We've determined bdrv_do_query_node_info() to be coroutine-safe (see
previous commits), so we can just put this QMP command in a coroutine.
Since qmp_query_block() now expects to run in a coroutine, its callers
need to be converted as well. Convert hmp_info_block(), which calls
only coroutine-safe code, including qmp_query_named_block_nodes()
which has been converted to coroutine in the previous patches.
Now that all callers of bdrv_[co_]block_device_info() are using the
coroutine version, a few things happen:
- we can return to using bdrv_block_device_info() without a wrapper;
- bdrv_get_allocated_file_size() can stop being mixed;
- bdrv_co_get_allocated_file_size() needs to be put under the graph
lock because it is being called wthout the wrapper;
- bdrv_do_query_node_info() doesn't need to acquire the AioContext
because it doesn't call aio_poll anymore;
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
[relax main loop requirement at qmp_query_block]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're currently doing a full query-block just to enumerate the devices
for qmp_nbd_server_add and then discarding the BlockInfoList
afterwards. Alter hmp_nbd_server_start to instead iterate explicitly
over the block_backends list.
This allows the removal of the dependency on qmp_query_block from
hmp_nbd_server_start. This is desirable because we're about to move
qmp_query_block into a coroutine and don't need to change the NBD code
at the same time.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're converting callers of bdrv_get_allocated_file_size() to run in
coroutines because that function will be made asynchronous when called
(indirectly) from the QMP dispatcher.
This QMP command is a candidate because it indirectly calls
bdrv_get_allocated_file_size() through bdrv_block_device_info() ->
bdrv_query_image_info() -> bdrv_query_image_info().
The previous patches have determined that bdrv_query_image_info() and
bdrv_do_query_node_info() are coroutine-safe so we can just make the
QMP command run in a coroutine.
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
[relax the main loop requirement at bdrv_named_nodes_list]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're converting callers of bdrv_get_allocated_file_size() to run in
coroutines because that function will be made asynchronous when called
(indirectly) from the QMP dispatcher.
This function is a candidate because it calls bdrv_query_image_info()
-> bdrv_do_query_node_info() -> bdrv_get_allocated_file_size().
It is safe to turn this is a coroutine because the code it calls is
made up of either simple accessors and string manipulation functions
[1] or it has already been determined to be safe [2].
1) bdrv_refresh_filename(), bdrv_is_read_only(),
blk_enable_write_cache(), bdrv_cow_bs(), blk_get_public(),
throttle_group_get_name(), bdrv_write_threshold_get(),
bdrv_query_dirty_bitmaps(), throttle_group_get_config(),
bdrv_filter_or_cow_bs(), bdrv_skip_implicit_filters()
2) bdrv_do_query_node_info() (see previous commit);
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're converting callers of bdrv_get_allocated_file_size() to run in
coroutines because that function will be made asynchronous when called
(indirectly) from the QMP dispatcher.
This function is a candidate because it calls bdrv_do_query_node_info(),
which in turn calls bdrv_get_allocated_file_size().
All the functions called from bdrv_do_query_node_info() onwards are
coroutine-safe, either have a coroutine version themselves[1] or are
mostly simple code/string manipulation[2].
1) bdrv_getlength(), bdrv_get_allocated_file_size(), bdrv_get_info(),
bdrv_get_specific_info();
2) bdrv_refresh_filename(), bdrv_get_format_name(),
bdrv_get_full_backing_filename(), bdrv_query_snapshot_info_list();
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
[relax the main loop requirement at bdrv_snapshot_list]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Some callers of this function are about to be converted to run in
coroutines, so allow it to be executed both inside and outside a
coroutine while we convert all the callers.
This will be reverted once all callers of bdrv_do_query_node_info run
in a coroutine.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The following patches will add co_wrapper annotations to functions
declared in qapi.h. Add that header to the set of files used by
block-coroutine-wrapper.py.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
References: bsc#1211000
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add some block drivers and virtiofsd as hard dependencies of the
qemu-headless package, to make sure it's really useful for headless
server environments (even when recommended packages are not installed).
Singed-off-by: Dario Faggioli <dfaggioli@suse.com>
Use a fixed USER value (in case someone builds outside of OBS/osc).
References: boo#1084909
Signed-off-by: Bernhard M. Wiedemann <githubbmwprimary@lsmod.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Define a new sub-(meta-)package that can be installed for having
all the other modules and packages necessary for SPICE to work.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Align to upstream stable release. It includes many of the patches we had
backported ourself, to fix bugs and issues, plus more.
See here for details:
- https://lore.kernel.org/qemu-devel/1700589639.257680.3420728.nullmailer@tls.msk.ru/
- https://gitlab.com/qemu-project/qemu/-/commits/stable-8.1?ref_type=heads
An (incomplete!) list of such backports is:
* Update version for 8.1.3 release
* hw/mips: LOONGSON3V depends on UNIMP device
* target/arm: HVC at EL3 should go to EL3, not EL2
* s390x/pci: only limit DMA aperture if vfio DMA limit reported
* target/riscv/kvm: support KVM_GET_REG_LIST
* target/riscv/kvm: improve 'init_multiext_cfg' error msg
* tracetool: avoid invalid escape in Python string
* tests/tcg/s390x: Test LAALG with negative cc_src
* target/s390x: Fix LAALG not updating cc_src
* tests/tcg/s390x: Test CLC with inaccessible second operand
* target/s390x: Fix CLC corrupting cc_src
* tests/qtest: ahci-test: add test exposing reset issue with pending callback
* hw/ide: reset: cancel async DMA operation before resetting state
* target/mips: Fix TX79 LQ/SQ opcodes
* target/mips: Fix MSA BZ/BNZ opcodes displacement
* ui/gtk-egl: apply scale factor when calculating window's dimension
* ui/gtk: force realization of drawing area
* ati-vga: Implement fallback for pixman routines
* ...
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Avoid parallel processing in sphinx because that causes variations in
generated files
This is addressed here, with a downstream patch, until a proper solution
is found upstream.
Signed-off-by: Bernhard Wiedemann <bwiedemann@suse.com>
References: boo#1102408
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
The supportconfig 'scplugin.rc' file is deprecated in favor of
supportconfig.rc'. Adapt the qemu plugin to the new scheme.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Our workflow does not include patches in the spec files. Still, it could
be useful to add some there, during development and/or debugging issues.
Make sure that they are applied properly, by adding -p1 to the
%autosetup directive (it's a nop if there are no patches, so both cases
are ok).
Suggested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
This fixes the following upstream issues:
* https://gitlab.com/qemu-project/qemu/-/issues/1826
* https://gitlab.com/qemu-project/qemu/-/issues/1834
* https://gitlab.com/qemu-project/qemu/-/issues/1846
It also contains a fix for:
* CVE-2023-42467 (bsc#1215192)
As well as several upstream backports:
* target/riscv: Fix vfwmaccbf16.vf
* disas/riscv: Fix the typo of inverted order of pmpaddr13 and pmpaddr14
* roms: use PYTHON to invoke python
* hw/audio/es1370: reset current sample counter
* migration/qmp: Fix crash on setting tls-authz with null
* util/log: re-allow switching away from stderr log file
* vfio/display: Fix missing update to set backing fields
* amd_iommu: Fix APIC address check
* vdpa net: follow VirtIO initialization properly at cvq isolation probing
* vdpa net: stop probing if cannot set features
* vdpa net: fix error message setting virtio status
* vdpa net: zero vhost_vdpa iova_tree pointer at cleanup
* linux-user/hppa: Fix struct target_sigcontext layout
* chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
* hw/display/ramfb: plug slight guest-triggerable leak on mode setting
* win32: avoid discarding the exception handler
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: generalize operand size "ph" for use in CVTPS2PD
* subprojects/berkeley-testfloat-3: Update to fix a problem with compiler warnings
* scsi-disk: ensure that FORMAT UNIT commands are terminated
* esp: restrict non-DMA transfer length to that of available data
* esp: use correct type for esp_dma_enable() in sysbus_esp_gpio_demux()
* optionrom: Remove build-id section
* target/tricore: Fix RCPW/RRPW_INSERT insns for width = 0
* accel/tcg: Always require can_do_io
* accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
* accel/tcg: Improve setting of can_do_io at start of TB
* accel/tcg: Track current value of can_do_io in the TB
* accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
* accel/tcg: Avoid load of icount_decr if unused
* softmmu: Use async_run_on_cpu in tcg_commit
* migration: Move return path cleanup to main migration thread
* migration: Replace the return path retry logic
* migration: Consolidate return path closing code
* migration: Remove redundant cleanup of postcopy_qemufile_src
* migration: Fix possible race when shutting down to_dst_file
* migration: Fix possible races when shutting down the return path
* migration: Fix possible race when setting rp_state.error
* migration: Fix race that dest preempt thread close too early
* ui/vnc: fix handling of VNC_FEATURE_XVP
* ui/vnc: fix debug output for invalid audio message
* hw/scsi/scsi-disk: Disallow block sizes smaller than 512 [CVE-2023-42467]
* accel/tcg: mttcg remove false-negative halted assertion
* meson.build: Make keyutils independent from keyring
* target/arm: Don't skip MTE checks for LDRT/STRT at EL0
* hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel
* include/exec: Widen tlb_hit/tlb_hit_page()
* tests/file-io-error: New test
* file-posix: Simplify raw_co_prw's 'out' zone code
* file-posix: Fix zone update in I/O error path
* file-posix: Check bs->bl.zoned for zone info
* file-posix: Clear bs->bl.zoned on error
* hw/cxl: Fix out of bound array access
* hw/cxl: Fix CFMW config memory leak
* linux-user/hppa: lock both words of function descriptor
* linux-user/hppa: clear the PSW 'N' bit when delivering signals
* hw/ppc: Read time only once to perform decrementer write
* hw/ppc: Reset timebase facilities on machine reset
* hw/ppc: Always store the decrementer value
* target/ppc: Sign-extend large decrementer to 64-bits
* hw/ppc: Avoid decrementer rounding errors
* hw/ppc: Round up the decrementer interval when converting to ns
* host-utils: Add muldiv64_round_up
Signed-of-by: Dario Faggioli <dfaggioli@suse.com>
perl-Text-Markdown is not always available (e.g., in SLE/Leap).
Use discount instead, as the provider of the 'markdown' binary.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
OBS SCM bridge can handle git submodule, while it can't handle (yet?)
meson subprojects. The (ugly, I know!) solution, for now, is to turn
the latter into the former, with commands like the followings:
git submodule add -f https://gitlab.com/qemu-project/berkeley-testfloat-3 subprojects/berkeley-testfloat-3
git -C subprojects/berkeley-testfloat-3 reset --hard 40619cbb3bf32872df8c53cc457039229428a263
(the hash used comes from the subprojects/berkeley-testfloat-3.wrap file)
It's also necessary to manually apply the layering of the packagefiles,
and that is done in the specfile.
Longer term and better solutions could be:
- Make SCM support meson subprojects
- Create standalone packages for the subprojects (and instruct
QEMU to pick stuff from there)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Full list of changes are available at:
https://wiki.qemu.org/ChangeLog/8.1
Highlights:
* VFIO: improved live migration support, no longer an experimental feature
* GTK GUI now supports multi-touch events
* ARM, PowerPC, and RISC-V can now use AES acceleration on host processor
* PCIe: new QMP commands to inject CXL General Media events, DRAM
events and Memory Module events
* ARM: KVM VMs on a host which supports MTE (the Memory Tagging Extension)
can now use MTE in the guest
* ARM: emulation support for bpim2u (Banana Pi BPI-M2 Ultra) board and
neoverse-v1 (Cortex Neoverse-V1) CPU
* ARM: new architectural feature support for: FEAT_PAN3 (SCTLR_ELx.EPAN),
FEAT_LSE2 (Large System Extensions v2), and experimental support for
FEAT_RME (Realm Management Extensions)
* Hexagon: new instruction support for v68/v73 scalar, and v68/v69 HVX
* Hexagon: gdbstub support for HVX
* MIPS: emulation support for Ingenic XBurstR1/XBurstR2 CPUs, and MXU
instructions
* PowerPC: TCG SMT support, allowing pseries and powernv to run with up
to 8 threads per core
* PowerPC: emulation support for Power9 DD2.2 CPU model, and perf
sampling support for POWER CPUs
* RISC-V: ISA extension support for BF16/Zfa, and disassembly support
for Zcm*/Z*inx/XVentanaCondOps/Xthead
* RISC-V: CPU emulation support for Veyron V1
* RISC-V: numerous KVM/emulation fixes and enhancements
* s390: instruction emulation fixes for LDER, LCBB, LOCFHR, MXDB, MXDBR,
EPSW, MDEB, MDEBR, MVCRL, LRA, CKSM, CLM, ICM, MC, STIDP, EXECUTE, and
CLGEBR(A)
* SPARC: updated target/sparc to use tcg_gen_lookup_and_goto_ptr() for
improved performance
* Tricore: emulation support for TC37x CPU that supports ISA v1.6.2
instructions
* Tricore: instruction emulation of POPCNT.W, LHA, CRC32L.W, CRC32.B,
SHUFFLE, SYSCALL, and DISABLE
* x86: CPU model support for GraniteRapids
* and lots more...
This also (automatically) fixes:
- bsc#1212850 (CVE-2023-3354)
- bsc#1213001 (CVE-2023-3255)
- bsc#1213925 (CVE-2023-3180)
- bsc#1213414 (CVE-2023-3301)
- bsc#1207205 (CVE-2023-0330)
- bsc#1212968 (CVE-2023-2861)
- bsc#1179993, bsc#1181740, bsc#1211697
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
By default try to preserve argv[0].
Original report is boo#1197298, which also became relevant recently again in bsc#1212768.
Signed-off-by: Fabian Vogt <fabian@ritter-vogt.de>
References: boo#1197298
References: bsc#1212768
Signed-off-by: Fabian Vogt <fabian@ritter-vogt.de>
Create separate packages for qemu-img and qemu-pr-helper.
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de>
Since version 8.0.0, virtiofsd is not part of QEMU sources any longer.
We therefore have also moved it to a separate package. To retain
compatibility and consistency of behavior, require such a package as an
hard dependency.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
For example, let's try to avoid recommending GUI UI stuff, unless GTK is
already installed. This way we avoid things like bringing in an entire
graphic stack on servers.
References: bsc#1205680
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
- The qemu-headless subpackage was defined but never build, because it
had no files. Fix that by putting there just a simple README.
- Move the docs in a dedicated subpackage
Resolves: bsc#1209629
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
As part of the effort to close the gap with Leap I think we are fine
removing the $pkgversion component to creating a unique CONFIG_STAMP.
This stamp is only used in creating a unique symbol used in ensuring the
dynamically loaded modules correspond correctly to the loading qemu.
The default inputs to producing this unique symbol are somewhat reasonable
as a generic mechanism, but specific packaging and maintenance practices
might require the default to be modified for best use. This is an example
of that.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
We are disabling the following tests:
qemu-system-ppc64 / display-vga-test
They are failing due to some memory corruption errors. We believe that
this might be due to the combination of the compiler version and of LTO,
and will take up the investigation within the upstream community.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Executing tests in obs is very fickle, since you aren't guaranteed
reliable cpu time. Triple the timeout for each test to help ensure
we don't fail a test because the stars align against us.
Signed-off-by: Bruce Rogers <brogers@suse.com>
[DF: Small tweaks necessary for rebasing on top of 6.2.0]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Since we have a quite restricted execution environment, as far as
networking is concerned, we need to change the error message we expect
in test 162. There is actually no routing set up so the error we get is
"Network is unreachable". Change the expected output accordingly.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Revert commit "tests/qtest: enable more vhost-user tests by default"
(8dcb404bff), as it causes prooblem when building with GCC 12 and LTO
enabled.
This should be considered temporary, until the actual reason why the
code of the tests that are added in that commit breaks.
It has been reported upstream, and will be (hopefully) solved there:
https://lore.kernel.org/qemu-devel/1d3bbff9e92e7c8a24db9e140dcf3f428c2df103.camel@suse.com/
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
SG_IO may return additional status in the 'status', 'driver_status',
and 'host_status' fields. When either of these fields are set the
command has not been executed normally, so we should not continue
processing this command but rather return an error.
scsi_read_complete() already checks for these errors,
scsi_write_complete() does not.
References: bsc#1178049
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
While using SCSI passthrough, Following scenario makes qemu doesn't
realized the capacity change of remote scsi target:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'rescan-scsi-bus.sh -s ...' in vm.
In above scenario I used to experienced errors while accessing the
additional disk space in vm. I think the reasonable operations should
be:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'block_resize' via qmp to notify qemu.
4. issue 'rescan-scsi-bus.sh -s ...' in vm.
The errors disappear once I notify qemu by block_resize via qmp.
So this patch replaces the number of logical blocks of READ CAPACITY
response from scsi target by qemu's bs->total_sectors. If the user in
vm wants to access the additional disk space, The administrator of
host must notify qemu once resizeing the scsi target.
Bonus is that domblkinfo of libvirt can reflect the consistent capacity
information between host and vm in case of missing block_resize in qemu.
E.g:
...
<disk type='block' device='lun'>
<driver name='qemu' type='raw'/>
<source dev='/dev/sdc' index='1'/>
<backingStore/>
<target dev='sda' bus='scsi'/>
<alias name='scsi0-0-0-0'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
...
Before:
1. online resize the scsi target.
2. host:~ # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4 host:~ # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity: 4.000 GiB
Allocation: 0.000 B
Physical: 8.000 GiB
5. guest:~ # lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 8G 0 disk
└─sda1 8:1 0 2G 0 part
After:
1. online resize the scsi target.
2. host:~ # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4 host:~ # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity: 4.000 GiB
Allocation: 0.000 B
Physical: 8.000 GiB
5. guest:~ # lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 4G 0 disk
└─sda1 8:1 0 2G 0 part
References: [SUSE-JIRA] (SLE-20965)
Signed-off-by: Lin Ma <lma@suse.com>
The final step of xl migrate|save for an HVM domU is saving the state of
qemu. This also involves releasing all block devices. While releasing
backends ought to be a separate step, such functionality is not
implemented.
Unfortunately, releasing the block devices depends on the optional
'live' option. This breaks offline migration with 'virsh migrate domU
dom0' because the sending side does not release the disks, as a result
the receiving side can not properly claim write access to the disks.
As a minimal fix, remove the dependency on the 'live' option. Upstream
may fix this in a different way, like removing the newly added 'live'
parameter entirely.
Fixes: 5d6c599fe1 ("migration, xen: Fix block image lock issue on live migration")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
References: bsc#1079730, bsc#1101982, bsc#1063993
Signed-off-by: Bruce Rogers <brogers@suse.com>
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.
[BR: minor edits to pass qemu's checkpatch script]
[BR: significant rework needed due to upstream xen disk qdevification]
[BR: At this point, monitor_add_blk call is all we need to add!]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.
Patch taken from Xen's patch queue, Olaf Hering being the original author.
[bsc#879425]
[BR: minor edits to pass qemu's checkpatch script]
[BR: With qdevification of xen-block, code has changed significantly]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Change from using glib alloc and free routines to those
from libc. Also perform safety measure of dropping privs
to user if configured no-caps.
References: boo#988279
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strange "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
[AF: Rebased for v2.7.0-rc2]
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.
So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
Signed-off-by: Bruce Rogers <brogers@suse.com>
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.
When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.
Signed-off-by: Alexander Graf <agraf@suse.de>
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.
Pass syscall arguments as ulong always.
Signed-off-by: Alexander Graf <agraf@suse.de>
[JRZ: changes from linux-user/qemu.h wass moved to linux-user/user-internals.h]
Signed-off-by: Jose R Ziviani <jziviani@suse.de>
[DF: Forward port, i.e., use ulong for do_prctl too]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
It's easy enough to handle either per-spec or legacy smbios structures
in the smbios file input without regard to the machine type used, by
simply applying the basic smbios formatting rules. then depending on
what is detected. terminal numm bytes are added or removed for machine
type specific processing.
References: bsc#994082, bsc#1084316, boo#1131894
Signed-off-by: Bruce Rogers <brogers@suse.com>
We add a --cross-file reference so that we can do cross compilation
of qboot from an aarch64 build.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Certain rom subpackages build from qemu git-submodules call the date
program to include date information in the packaged binaries. This
causes repeated builds of the package to be different, wkere the only
real difference is due to the fact that time build timestamp has
changed. To promote reproducible builds and avoid customers being
prompted to update packages needlessly, we'll use the timestamp of the
VERSION file as the packaging timestamp for all packages that build in a
timestamp for whatever reason.
References: bsc#1011213
Signed-off-by: Bruce Rogers <brogers@suse.com>
The sgabios submodule is no longer there, so let's get rid of any
reference to it from our spec files.
Remove no longer supported './configure' options.
We're also not set yet for using the set_version service, so we need to
update the following manually:
- the Version: tags in the spec files
- the rpm/seabios_version and rpm/skiboot_version files (see qemu.spec
for instructions on how to do that)
- the %{sbver} variable in rpm/common.inc
A better solution for handling this aspect is being worked on.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
In an upstream tarball there are some special files, generated by a
script that is run when the archive is prepared. Let's make our
repository look a little more like that, so we can build it properly.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Stash the "packaging files" in the QEMU repository, in the rpm/
directory. During package build, they will be pulled out from there
and used as appropriate.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Commit ffda5db65a ("io/channel-tls: fix handling of bigger read buffers")
changed the behavior of the TLS io channels to schedule a second reading
attempt if there is still incoming data pending. This caused a regression
with backends like the sclpconsole that check in their read function that
the sender does not try to write more bytes to it than the device can
currently handle.
The problem can be reproduced like this:
1) In one terminal, do this:
mkdir qemu-pki
cd qemu-pki
openssl genrsa 2048 > ca-key.pem
openssl req -new -x509 -nodes -days 365000 -key ca-key.pem -out ca-cert.pem
# enter some dummy value for the cert
openssl genrsa 2048 > server-key.pem
openssl req -new -x509 -nodes -days 365000 -key server-key.pem \
-out server-cert.pem
# enter some other dummy values for the cert
gnutls-serv --echo --x509cafile ca-cert.pem --x509keyfile server-key.pem \
--x509certfile server-cert.pem -p 8338
2) In another terminal, do this:
wget https://download.fedoraproject.org/pub/fedora-secondary/releases/39/Cloud/s390x/images/Fedora-Cloud-Base-39-1.5.s390x.qcow2
qemu-system-s390x -nographic -nodefaults \
-hda Fedora-Cloud-Base-39-1.5.s390x.qcow2 \
-object tls-creds-x509,id=tls0,endpoint=client,verify-peer=false,dir=$PWD/qemu-pki \
-chardev socket,id=tls_chardev,host=localhost,port=8338,tls-creds=tls0 \
-device sclpconsole,chardev=tls_chardev,id=tls_serial
QEMU then aborts after a second or two with:
qemu-system-s390x: ../hw/char/sclpconsole.c:73: chr_read: Assertion
`size <= SIZE_BUFFER_VT220 - scon->iov_data_len' failed.
Aborted (core dumped)
It looks like the second read does not trigger the chr_can_read() function
to be called before the second read, which should normally always be done
before sending bytes to a character device to see how much it can handle,
so the s->max_size in tcp_chr_read() still contains the old value from the
previous read. Let's make sure that we use the up-to-date value by calling
tcp_chr_read_poll() again here.
Fixes: ffda5db65a ("io/channel-tls: fix handling of bigger read buffers")
Buglink: https://issues.redhat.com/browse/RHEL-24614
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240229104339.42574-1-thuth@redhat.com>
Reviewed-by: Antoine Damhet <antoine.damhet@blade-group.com>
Tested-by: Antoine Damhet <antoine.damhet@blade-group.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 462945cd22)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since Windows text files use CRLFs for all \n, the Windows version of QEMU
inserts a CR in the PCAP stream when a LF is encountered when using USB PCAP
files. This is due to the fact that the PCAP file is opened as TEXT instead
of BINARY.
To show an example, when using a very common protocol to USB disks, the BBB
protocol uses a 10-byte command packet. For example, the READ_CAPACITY(10)
command will have a command block length of 10 (0xA). When this 10-byte
command (part of the 31-byte CBW) is placed into the PCAP file, the Windows
file manager inserts a 0xD before the 0xA, turning the 31-byte CBW into a
32-byte CBW.
Actual CBW:
0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0a 25 USBC...........%
0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...............
PCAP CBW
0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0d 0a USBC............
0050 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 %..............
I believe simply opening the PCAP file as BINARY instead of TEXT will fix
this issue.
Resolves: https://bugs.launchpad.net/qemu/+bug/2054889
Signed-off-by: Benjamin David Lunt <benlunt@fysnet.net>
Message-ID: <000101da6823$ce1bbf80$6a533e80$@fysnet.net>
[thuth: Break long line to avoid checkpatch.pl error]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 5e02a4fdeb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When using "--without-default-devices", the ARM_GICV3_TCG and ARM_GIC_KVM
settings currently get disabled, though the arm virt machine is only of
very limited use in that case. This also causes the migration-test to
fail in such builds. Let's make sure that we always keep the GIC switches
enabled in the --without-default-devices builds, too.
Message-ID: <20240221110059.152665-1-thuth@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8bd3f84d1f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Python is transitioning to a world where you're not allowed to use 'pip
install' outside of a virutal env by default. The rationale is to stop
use of pip clashing with distro provided python packages, which creates
a major headache on distro upgrades.
All our CI environments, however, are 100% disposable so the upgrade
headaches don't exist. Thus we can undo the python defaults to allow
pip to work.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20240222114038.2348718-1-berrange@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a8bf9de2f4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The main problem is that "check-venv" is a .PHONY target will always
evaluate and trigger a full re-build of the VM images. While its
tempting to drop it from the dependencies that does introduce a
breakage on freshly configured builds.
Fortunately we do have the otherwise redundant --force flag for the
script which up until now was always on. If we make the usage of
--force conditional on dependencies other than check-venv triggering
the update we can avoid the costly rebuild and still run cleanly on a
fresh checkout.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2118
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-4-alex.bennee@linaro.org>
(cherry picked from commit 151b7dba39)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The A20 mask is only applied to the final memory access. Nested
page tables are always walked with the raw guest-physical address.
Unlike the previous patch, in this one the masking must be kept, but
it was done too early.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b5a9de3259)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already
applied in get_physical_address(), which is called via probe_access_full()
and x86_cpu_tlb_fill().
If ptw_translate() on the other hand does a MMU_NESTED_IDX access,
the A20 mask must not be applied to the address that is looked up in
the nested page tables; it must be applied only to the addresses that
hold the NPT entries (which is achieved via MMU_PHYS_IDX, per the
previous paragraph).
Therefore, we can remove A20 masking from the computation of the page
table entry's address, and let get_physical_address() or mmu_translate()
apply it when they know they are returning a host-physical address.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a28fe7dc19)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).
Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.
The truncation code however is not completely redundant. Even though the
maximum address size for any executed instruction is 32 bits, helpers for
operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
to translate an address outside of the 32-bit range, if invoked with an
argument that is close to the 4G boundary. Likewise for processor
accesses, for example TSS or IDT accesses, when EFER.LMA==0.
So, move the address truncation in get_physical_address() so that it
applies to 32-bit MMU indexes, but not to MMU_PHYS_IDX and MMU_NESTED_IDX.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Cc: qemu-stable@nongnu.org
Co-developed-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b1661801c1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: drop unrelated change in target/i386/cpu.c)
MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the
maximum physical address width of the processor. Setting them to
1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual).
The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE,
even though the manual is not clear on that.
Cc: qemu-stable@nongnu.org
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d09c79010f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
paging or PAE paging). Do this in mmu_translate() to remove
the last where get_physical_address() meaningfully drops the high
bits of the address.
Cc: qemu-stable@nongnu.org
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 4a1e9d4d11 ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 68fb78d7d5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
is_prefix_insn_excp() loads the first word of the instruction address
which caused an exception, to determine whether or not it was prefixed
so the prefix bit can be set in [H]SRR1.
This works if the instruction image can be loaded, but if the exception
was caused by an ifetch, this load could fail and cause a recursive
exception and crash. Machine checks caused by ifetch are not excluded
from the prefix check and can crash (see issue 2108 for an example).
Fix this by excluding machine checks caused by ifetch from the prefix
check.
Cc: qemu-stable@nongnu.org
Acked-by: Cédric Le Goater <clg@kaod.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2108
Fixes: 55a7fa34f8 ("target/ppc: Machine check on invalid real address access on POWER9/10")
Fixes: 5a5d3b23cb ("target/ppc: Add SRR1 prefix indication to interrupt handlers")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
(cherry picked from commit c8fd9667e5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The move to decodetree flipped the inequality test for the VEC / VSX
MSR facility check.
This caused application crashes under Linux, where these facility
unavailable interrupts are used for lazy-switching of VEC/VSX register
sets. Getting the incorrect interrupt would result in wrong registers
being loaded, potentially overwriting live values and/or exposing
stale ones.
Cc: qemu-stable@nongnu.org
Reported-by: Joel Stanley <joel@jms.id.au>
Fixes: 70426b5bb7 ("target/ppc: moved stxvx and lxvx from legacy to decodtree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1769
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Tested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
(cherry picked from commit 2cc0e449d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
MSYS2 is dropping support for 32-bit Windows. This shows up for us
as various packages we were using in our CI job no longer being
available to install, which causes the job to fail. In commit
8e31b744fd we dropped the dependency on libusb and spice, but the
dtc package has also now been removed.
For us as QEMU upstream, "32 bit x86 hosts for system emulation" have
already been deprecated as of QEMU 8.0, so we are ready to drop them
anyway.
Drop the msys2-32bit CI job, as the first step in doing this.
This is cc'd to stable, because this job will also be broken for CI
on the stable branches. We can't drop 32-bit support entirely there,
but we will still be covering at least compilation for 32-bit Windows
via the cross-win32-system job.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240220165602.135695-1-peter.maydell@linaro.org
(cherry picked from commit 5cd3ae4903)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.
Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 185311130f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.
Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 4a20ac400f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When running "configure" with "--without-default-devices", building
of qemu-system-hppa currently fails with:
/usr/bin/ld: libqemu-hppa-softmmu.fa.p/hw_hppa_machine.c.o: in function `machine_HP_common_init_tail':
hw/hppa/machine.c:399: undefined reference to `usb_bus_find'
/usr/bin/ld: hw/hppa/machine.c:399: undefined reference to `usb_create_simple'
/usr/bin/ld: hw/hppa/machine.c:400: undefined reference to `usb_bus_find'
/usr/bin/ld: hw/hppa/machine.c:400: undefined reference to `usb_create_simple'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
make: *** [Makefile:162: run-ninja] Error 1
And after fixing this, the qemu-system-hppa binary refuses to run
due to the missing 'pci-ohci' and 'pci-serial' devices. Let's add
the right config switches to fix these problems.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 04b86ccb5d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In `qemu_console_resize()`, the old surface of the console is keeped if the new
console size is the same as the old one. If the old surface is a placeholder,
and the new size of console is the same as the placeholder surface (640*480),
the surface won't be replace.
In this situation, the surface's `QEMU_PLACEHOLDER_FLAG` flag is still set, so
the console won't be displayed in SDL display mode.
This patch fixes this problem by forcing a new surface if the old one is a
placeholder.
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240207172024.8-1-bobby825@126.com>
(cherry picked from commit 95b08fee8f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With VNC, a client can send a non-extended VNC_MSG_CLIENT_CUT_TEXT
message with len=0. In qemu_clipboard_set_data(), the clipboard info
will be updated setting data to NULL (because g_memdup(data, size)
returns NULL when size is 0). If the client does not set the
VNC_ENCODING_CLIPBOARD_EXT feature when setting up the encodings, then
the 'request' callback for the clipboard peer is not initialized.
Later, because data is NULL, qemu_clipboard_request() can be reached
via vdagent_chr_write() and vdagent_clipboard_recv_request() and
there, the clipboard owner's 'request' callback will be attempted to
be called, but that is a NULL pointer.
In particular, this can happen when using the KRDC (22.12.3) VNC
client.
Another scenario leading to the same issue is with two clients (say
noVNC and KRDC):
The noVNC client sets the extension VNC_FEATURE_CLIPBOARD_EXT and
initializes its cbpeer.
The KRDC client does not, but triggers a vnc_client_cut_text() (note
it's not the _ext variant)). There, a new clipboard info with it as
the 'owner' is created and via qemu_clipboard_set_data() is called,
which in turn calls qemu_clipboard_update() with that info.
In qemu_clipboard_update(), the notifier for the noVNC client will be
called, i.e. vnc_clipboard_notify() and also set vs->cbinfo for the
noVNC client. The 'owner' in that clipboard info is the clipboard peer
for the KRDC client, which did not initialize the 'request' function.
That sounds correct to me, it is the owner of that clipboard info.
Then when noVNC sends a VNC_MSG_CLIENT_CUT_TEXT message (it did set
the VNC_FEATURE_CLIPBOARD_EXT feature correctly, so a check for it
passes), that clipboard info is passed to qemu_clipboard_request() and
the original segfault still happens.
Fix the issue by handling updates with size 0 differently. In
particular, mark in the clipboard info that the type is not available.
While at it, switch to g_memdup2(), because g_memdup() is deprecated.
Cc: qemu-stable@nongnu.org
Fixes: CVE-2023-6683
Reported-by: Markus Frank <m.frank@proxmox.com>
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Markus Frank <m.frank@proxmox.com>
Message-ID: <20240124105749.204610-1-f.ebner@proxmox.com>
(cherry picked from commit 405484b29f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The extended clipboard message protocol requires that the client
activate the extension by requesting a psuedo encoding. If this
is not done, then any extended clipboard messages from the client
should be considered invalid and the client dropped.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240115095119.654271-1-berrange@redhat.com>
(cherry picked from commit 4cba838896)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
CPUID leaf 7 was grouped together with SGX leaf 0x12 by commit
b9edbadefb ("i386: Propagate SGX CPUID sub-leafs to KVM") by mistake.
SGX leaf 0x12 has its specific logic to check if subleaf (starting from 2)
is valid or not by checking the bit 0:3 of corresponding EAX is 1 or
not.
Leaf 7 follows the logic that EAX of subleaf 0 enumerates the maximum
valid subleaf.
Fixes: b9edbadefb ("i386: Propagate SGX CPUID sub-leafs to KVM")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240125024016.2521244-4-xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0729857c70)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When msys2 updated their libusb packages to libusb 1.0.27, they
dropped support for building them for mingw32, leaving only mingw64
packages. This broke our CI job, as the 'pacman' package install now
fails with:
error: target not found: mingw-w64-i686-libusb
error: target not found: mingw-w64-i686-usbredir
(both these binary packages are from the libusb source package).
Similarly, spice is now 64-bit only:
error: target not found: mingw-w64-i686-spice
Fix this by dropping these packages from the list we install for our
msys2-32bit build. We do this with a simple mechanism for the
msys2-64bit and msys2-32bit jobs to specify a list of extra packages
to install on top of the common ones we install for both jobs.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2160
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20240215155009.2422335-1-peter.maydell@linaro.org
(cherry picked from commit 8e31b744fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since commit effd60c8 changed how QMP commands are processed, the order
of the block-commit return value and job events in iotests 144 wasn't
fixed and more and caused the test to fail intermittently.
Change the test to cache events first and then print them in a
predefined order.
Waiting three times for JOB_STATUS_CHANGE is a bit uglier than just
waiting for the JOB_STATUS_CHANGE that has "status": "ready", but the
tooling we have doesn't seem to allow the latter easily.
Fixes: effd60c878
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2126
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20240209173103.239994-1-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit cc29c12ec6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It doesn't make sense to read the value of MDCR_EL2 on a non-A-profile
CPU, and in fact if you try to do it we will assert:
#6 0x00007ffff4b95e96 in __GI___assert_fail
(assertion=0x5555565a8c70 "!arm_feature(env, ARM_FEATURE_M)", file=0x5555565a6e5c "../../target/arm/helper.c", line=12600, function=0x5555565a9560 <__PRETTY_FUNCTION__.0> "arm_security_space_below_el3") at ./assert/assert.c:101
#7 0x0000555555ebf412 in arm_security_space_below_el3 (env=0x555557bc8190) at ../../target/arm/helper.c:12600
#8 0x0000555555ea6f89 in arm_is_el2_enabled (env=0x555557bc8190) at ../../target/arm/cpu.h:2595
#9 0x0000555555ea942f in arm_mdcr_el2_eff (env=0x555557bc8190) at ../../target/arm/internals.h:1512
We might call pmu_counter_enabled() on an M-profile CPU (for example
from the migration pre/post hooks in machine.c); this should always
return false because these CPUs don't set ARM_FEATURE_PMU.
Avoid the assertion by not calling arm_mdcr_el2_eff() before we
have done the early return for "PMU not present".
This fixes an assertion failure if you try to do a loadvm or
savevm for an M-profile board.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2155
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240208153346.970021-1-peter.maydell@linaro.org
(cherry picked from commit ac1d88e9e7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Found whilst testing a series for the linux kernel that actually
bothers to check if enabled is set. 0xB is the option used
for vast majority of DSDT entries in QEMU.
It is a little odd for a device that doesn't really exist and
is simply a hook to tell the OS there is a CEDT table but 0xB
seems a reasonable choice and avoids need to special case
this device in the OS.
Means:
* Device present.
* Device enabled and decoding it's resources.
* Not shown in UI
* Functioning properly
* No battery (on this device!)
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-12-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d9ae5802f6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The _STA value returned currently indicates the ACPI0017 device
is not enabled. Whilst this isn't a real device, setting _STA
like this may prevent an OS from enumerating it correctly and
hence from parsing the CEDT table.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-11-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 14ec4ff3e4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
s->iommu_pcibus_by_bus_num is a IOMMUPciBus pointer cache indexed
by bus number, bus number may not always be a fixed value,
i.e., guest reboot to different kernel which set bus number with
different algorithm.
This could lead to endpoint binding to wrong iommu MR in
virtio_iommu_get_endpoint(), then vfio device setup wrong
mapping from other device.
Remove the memset in virtio_iommu_device_realize() to avoid
redundancy with memset in system reset.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240125073706.339369-2-zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9a457383ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the current mdev_reg_read() implementation, it consistently returns
that the Media Status is Ready (01b). This was fine until commit
25a52959f9 ("hw/cxl: Add support for device sanitation") because the
media was presumed to be ready.
However, as per the CXL 3.0 spec "8.2.9.8.5.1 Sanitize (Opcode 4400h)",
during sanitation, the Media State should be set to Disabled (11b). The
mentioned commit correctly sets it to Disabled, but mdev_reg_read()
still returns Media Status as Ready.
To address this, update mdev_reg_read() to read register values instead
of returning dummy values.
Note that __toggle_media() managed to not only write something
that no one read, it did it to the wrong register storage and
so changed the reported mailbox size which was definitely not
the intent. That gets fixed as a side effect of allocating
separate state storage for this register.
Fixes: commit 25a52959f9 ("hw/cxl: Add support for device sanitation")
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240126120132.24248-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f7509f462c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Netbsd isn't able to detect a link on the emulated tulip card. That's
because netbsd reads the Chip Status Register of the Phy (address
0x14). The default phy data in the qemu tulip driver is all zero,
which means no link is established and autonegotation isn't complete.
Therefore set the register to 0x3b40, which means:
Link is up, Autonegotation complete, Full Duplex, 100MBit/s Link
speed.
Also clear the mask because this register is read only.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 9b60a3ed55)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu_smbios_type8_opts did not have the list terminator and that
resulted in out-of-bound memory access. It also needs to have an element
for the type option.
Cc: qemu-stable@nongnu.org
Fixes: fd8caa253c ("hw/smbios: support for type 8 (port connector)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 196578c9d0)
qemu_smbios_type11_opts did not have the list terminator and that
resulted in out-of-bound memory access. It also needs to have an element
for the type option.
Cc: qemu-stable@nongnu.org
Fixes: 2d6dcbf93f ("smbios: support setting OEM strings table")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit cd8a35b913)
Commit 39fb3cfc28 ("configure: clean up plugin option handling", 2023-10-18)
dropped the CONFIG_PLUGIN line from tests/tcg/config-host.mak, due to confusion
caused by the shadowing of $config_host_mak. However, TCG tests were still
expecting it. Oops.
Put it back, in the meanwhile the shadowing is gone so it's clear that it goes
in the tests/tcg configuration.
Cc: <alex.bennee@linaro.org>
Fixes: 39fb3cfc28 ("configure: clean up plugin option handling", 2023-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20240124115332.612162-1-pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-4-alex.bennee@linaro.org>
(cherry picked from commit 15cc103362)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup)
Avocado needs sqlite3:
Failed to load plugin from module "avocado.plugins.journal":
ImportError("Module 'sqlite3' is not installed.
Use: sudo zypper install python311 to install it")
>From 'zypper info python311':
"This package supplies rich command line features provided by
readline, and sqlite3 support for the interpreter core, thus forming
a so called "extended" runtime."
Include the appropriate package in the lcitool mappings which will
guarantee the dockerfile gets properly updated when lcitool is
run. Also include the updated dockerfile.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Suggested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240117164227.32143-1-farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240207163812.3231697-2-alex.bennee@linaro.org>
(cherry picked from commit 7485508341)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The 'isa' char pointer isn't being freed after use.
Issue detected by Valgrind:
==38752== 128 bytes in 1 blocks are definitely lost in loss record 3,190 of 3,884
==38752== at 0x484280F: malloc (vg_replace_malloc.c:442)
==38752== by 0x5189619: g_malloc (gmem.c:130)
==38752== by 0x51A5BF2: g_strconcat (gstrfuncs.c:628)
==38752== by 0x6C1E3E: riscv_isa_string_ext (cpu.c:2321)
==38752== by 0x6C1E3E: riscv_isa_string (cpu.c:2343)
==38752== by 0x6BD2EA: build_rhct (virt-acpi-build.c:232)
==38752== by 0x6BD2EA: virt_acpi_build (virt-acpi-build.c:556)
==38752== by 0x6BDC86: virt_acpi_setup (virt-acpi-build.c:662)
==38752== by 0x9C8DC6: notifier_list_notify (notify.c:39)
==38752== by 0x4A595A: qdev_machine_creation_done (machine.c:1589)
==38752== by 0x61E052: qemu_machine_creation_done (vl.c:2680)
==38752== by 0x61E052: qmp_x_exit_preconfig.part.0 (vl.c:2709)
==38752== by 0x6220C6: qmp_x_exit_preconfig (vl.c:2702)
==38752== by 0x6220C6: qemu_init (vl.c:3758)
==38752== by 0x425858: main (main.c:47)
Fixes: ebfd392893 ("hw/riscv/virt: virt-acpi-build.c: Add RHCT Table")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122221529.86562-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 1a49762c07)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup)
The commit in the fixes line mistakenly modified the channels and
transport compatibility check logic so it now checks multi-channel
support only for socket transport type.
Thus, running multifd migration using a transport other than socket that
is incompatible with multi-channels (such as "exec") would lead to a
segmentation fault instead of an error message.
For example:
(qemu) migrate_set_capability multifd on
(qemu) migrate -d "exec:cat > /tmp/vm_state"
Segmentation fault (core dumped)
Fix it by checking multi-channel compatibility for all transport types.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: d95533e1cd ("migration: modify migration_channels_and_uri_compatible() for new QAPI syntax")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240125162528.7552-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 3205bebd4f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Requests that complete in an IOThread use irqfd to notify the guest
while requests that complete in the main loop thread use the traditional
qdev irq code path. The reason for this conditional is that the irq code
path requires the BQL:
if (s->ioeventfd_started && !s->ioeventfd_disabled) {
virtio_notify_irqfd(vdev, req->vq);
} else {
virtio_notify(vdev, req->vq);
}
There is a corner case where the conditional invokes the irq code path
instead of the irqfd code path:
static void virtio_blk_stop_ioeventfd(VirtIODevice *vdev)
{
...
/*
* Set ->ioeventfd_started to false before draining so that host notifiers
* are not detached/attached anymore.
*/
s->ioeventfd_started = false;
/* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
blk_drain(s->conf.conf.blk);
During blk_drain() the conditional produces the wrong result because
ioeventfd_started is false.
Use qemu_in_iothread() instead of checking the ioeventfd state.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-15394
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240122172625.415386-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bfa36802d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup for v8.2.0-809-g3cdaf3dd4a
"virtio-blk: rename dataplane to ioeventfd")
During drain, we do not care about virtqueue notifications, which is why
we remove the handlers on it. When removing those handlers, whether vq
notifications are enabled or not depends on whether we were in polling
mode or not; if not, they are enabled (by default); if so, they have
been disabled by the io_poll_start callback.
Because we do not care about those notifications after removing the
handlers, this is fine. However, we have to explicitly ensure they are
enabled when re-attaching the handlers, so we will resume receiving
notifications. We do this in virtio_queue_aio_attach_host_notifier*().
If such a function is called while we are in a polling section,
attaching the notifiers will then invoke the io_poll_start callback,
re-disabling notifications.
Because we will always miss virtqueue updates in the drained section, we
also need to poll the virtqueue once after attaching the notifiers.
Buglink: https://issues.redhat.com/browse/RHEL-3934
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-3-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5bdbaebcce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As of commit 38738f7dbb ("virtio-scsi:
don't waste CPU polling the event virtqueue"), we only attach an io_read
notifier for the virtio-scsi event virtqueue instead, and no polling
notifiers. During operation, the event virtqueue is typically
non-empty, but none of the buffers are intended to be used immediately.
Instead, they only get used when certain events occur. Therefore, it
makes no sense to continuously poll it when non-empty, because it is
supposed to be and stay non-empty.
We do this by using virtio_queue_aio_attach_host_notifier_no_poll()
instead of virtio_queue_aio_attach_host_notifier() for the event
virtqueue.
Commit 766aa2de0f ("virtio-scsi: implement
BlockDevOps->drained_begin()") however has virtio_scsi_drained_end() use
virtio_queue_aio_attach_host_notifier() for all virtqueues, including
the event virtqueue. This can lead to it being polled again, undoing
the benefit of commit 38738f7dbb.
Fix it by using virtio_queue_aio_attach_host_notifier_no_poll() for the
event virtqueue.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Fixes: 766aa2de0f
("virtio-scsi: implement BlockDevOps->drained_begin()")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-2-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c42c3833e0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Creating an instance of the 'TestEnv' class will create a temporary
directory. This dir is only deleted, however, in the __exit__ handler
invoked by a context manager.
In dry-run mode, we don't use the TestEnv via a context manager, so
were leaking the temporary directory. Since meson invokes 'check'
5 times on each configure run, developers /tmp was filling up with
empty temporary directories.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240205154019.1841037-1-berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c645bac4e0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.
This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.
Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240128202214.2644768-1-svens@stackframe.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8b09b7fe47)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The latest version of qemu (v8.2.0-869-g7a1dc45af5) crashes when booting
the mcimx7d-sabre emulation with Linux v5.11 and later.
qemu-system-arm: ../system/memory.c:2750: memory_region_set_alias_offset: Assertion `mr->alias' failed.
Problem is that the Designware PCIe emulation accepts the full value range
for the iATU Viewport Register. However, both hardware and emulation only
support four inbound and four outbound viewports.
The Linux kernel determines the number of supported viewports by writing
0xff into the viewport register and reading the value back. The expected
value when reading the register is the highest supported viewport index.
Match that code by masking the supported viewport value range when the
register is written. With this change, the Linux kernel reports
imx6q-pcie 33800000.pcie: iATU: unroll F, 4 ob, 4 ib, align 0K, limit 4G
as expected and supported.
Fixes: d64e5eabc4 ("pci: Add support for Designware IP block")
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Nikita Ostrenkov <n.ostrenkov@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20240129060055.2616989-1-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 8a73152020)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The -serial option documentation is a bit brief about '-serial none'
and '-serial null'. In particular it's not very clear about the
difference between them, and it doesn't mention that it's up to
the machine model whether '-serial none' means "don't create the
serial port" or "don't wire the serial port up to anything".
Expand on these points.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240122163607.459769-3-peter.maydell@linaro.org
(cherry picked from commit 747bfaf3a9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently if the user passes multiple -serial options on the command
line, we mostly treat those as applying to the different serial
devices in order, so that for example
-serial stdio -serial file:filename
will connect the first serial port to stdio and the second to the
named file.
The exception to this is the '-serial none' serial device type. This
means "don't allocate this serial device", but a bug means that
following -serial options are not correctly handled, so that
-serial none -serial stdio
has the unexpected effect that stdio is connected to the first serial
port, not the second.
This is a very long-standing bug that dates back at least as far as
commit 998bbd74b9 from 2009.
Make the 'none' serial type move forward in the indexing of serial
devices like all the other serial types, so that any subsequent
-serial options are correctly handled.
Note that if your commandline mistakenly had a '-serial none' that
was being overridden by a following '-serial something' option, you
should delete the unnecessary '-serial none'. This will give you the
same behaviour as before, on QEMU versions both with and without this
bug fix.
Cc: qemu-stable@nongnu.org
Reported-by: Bohdan Kostiv <bohdan.kostiv@tii.ae>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240122163607.459769-2-peter.maydell@linaro.org
Fixes: 998bbd74b9 ("default devices: core code & serial lines")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit d2019a9d0c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With GCC 14 the code failed to compile on i686 (and was wrong for any
version of GCC):
../block/blkio.c: In function ‘blkio_file_open’:
../block/blkio.c:857:28: error: passing argument 3 of ‘blkio_get_uint64’ from incompatible pointer type [-Wincompatible-pointer-types]
857 | &s->mem_region_alignment);
| ^~~~~~~~~~~~~~~~~~~~~~~~
| |
| size_t * {aka unsigned int *}
In file included from ../block/blkio.c:12:
/usr/include/blkio.h:49:67: note: expected ‘uint64_t *’ {aka ‘long long unsigned int *’} but argument is of type ‘size_t *’ {aka ‘unsigned int *’}
49 | int blkio_get_uint64(struct blkio *b, const char *name, uint64_t *value);
| ~~~~~~~~~~^~~~~
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20240130122006.2977938-1-rjones@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 615eaeab3d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The command line options `-ctrl-grab` and `-alt-grab` have been removed
in QEMU 7.1. Instead, use the `-display sdl,grab-mod=<modifiers>` option
to specify the grab modifiers.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2103
Signed-off-by: Yihuan Pan <xun794@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit db101376af)
When doing device assignment of a physical device, MSI-X can be
enabled with no vectors enabled and this sets the IRQ index to
VFIO_PCI_MSIX_IRQ_INDEX. However, when MSI-X is disabled, the IRQ
index is left untouched if no vectors are in use. Then, when INTx
is enabled, the IRQ index value is considered incompatible (set to
MSI-X) and VFIO_DEVICE_SET_IRQS fails. QEMU complains with :
qemu-system-x86_64: vfio 0000:08:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Invalid argument
To avoid that, unconditionaly clear the IRQ index when MSI-X is
disabled.
Buglink: https://issues.redhat.com/browse/RHEL-21293
Fixes: 5ebffa4e87 ("vfio/pci: use an invalid fd to enable MSI-X")
Cc: Jing Liu <jing2.liu@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit d2b668fca5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We're currently allowing the process_incoming_migration_bh bottom-half
to run without holding a reference to the 'current_migration' object,
which leads to a segmentation fault if the BH is still live after
migration_shutdown() has dropped the last reference to
current_migration.
In my system the bug manifests as migrate_multifd() returning true
when it shouldn't and multifd_load_shutdown() calling
multifd_recv_terminate_threads() which crashes due to an uninitialized
multifd_recv_state.
Fix the issue by holding a reference to the object when scheduling the
BH and dropping it before returning from the BH. The same is already
done for the cleanup_bh at migrate_fd_cleanup_schedule().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1969
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 27eb8499ed)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
- if test -n "@PYPI_PKGS@" ; then @PIP3@ install @PYPI_PKGS@ ; fi
- if test -n "@PYPI_PKGS@" ; then PYLIB=$(@PYTHON@ -c 'import sysconfig; print(sysconfig.get_path("stdlib"))'); rm -f $PYLIB/EXTERNALLY-MANAGED; @PIP3@ install @PYPI_PKGS@ ; fi
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.