Compare commits

..

74 Commits

Author SHA1 Message Date
Xiaoyao Li
d20f93da31 docs: Add TDX documentation
Add docs/system/i386/tdx.rst for TDX support, and add tdx in
confidential-guest-support.rst

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>

---
Changes since v1:
 - Add prerequisite of private gmem;
 - update example command to launch TD;

Changes since RFC v4:
 - add the restriction that kernel-irqchip must be split
2023-11-15 00:57:10 -05:00
Sean Christopherson
18f152401f i386/tdx: Don't get/put guest state for TDX VMs
Don't get/put state of TDX VMs since accessing/mutating guest state of
production TDs is not supported.

Note, it will be allowed for a debug TD. Corresponding support will be
introduced when debug TD support is implemented in the future.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
5c8d76a6e4 i386/tdx: Skip kvm_put_apicbase() for TDs
KVM doesn't allow writing to MSR_IA32_APICBASE for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
f226ab5a34 i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
For TDs, only MSR_IA32_UCODE_REV among the MSRs in kvm_init_msrs() can
be configured by the VMM; the features enumerated/controlled by the
other MSRs are not under VMM control.

Only configure MSR_IA32_UCODE_REV for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
e954cd4b22 i386/tdx: Don't synchronize guest tsc for TDs
The TSC of TDs is not accessible and KVM doesn't allow access to
MSR_IA32_TSC for TDs. To avoid the assert() in kvm_get_tsc(), make
kvm_synchronize_all_tsc() a noop for TDs.
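
A minimal sketch of the idea (is_tdx_vm() comes from a later patch in
this series; illustrative, not the literal diff):

  /* Sketch: skip the whole sync for TDs, where MSR_IA32_TSC is
   * inaccessible and kvm_get_tsc() would hit its assert(). */
  void kvm_synchronize_all_tsc(void)
  {
      CPUState *cpu;

      if (kvm_enabled() && !is_tdx_vm()) {
          CPU_FOREACH(cpu) {
              run_on_cpu(cpu, do_kvm_synchronize_tsc, RUN_ON_CPU_NULL);
          }
      }
  }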

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
2dbde0df24 hw/i386: add option to forcibly report edge trigger in acpi tables
When level trigger isn't supported on the x86 platform, forcibly report
edge trigger in the ACPI tables.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
9faa2b03a9 hw/i386: add eoi_intercept_unsupported member to X86MachineState
Add a new bool member, eoi_intercept_unsupported, to X86MachineState
with default value false. Set it to true for TDX VMs.

The inability to intercept EOI makes it impossible to re-inject a
level-triggered interrupt while the level is still kept active, which
affects interrupt controller emulation.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
d04ea558aa i386/tdx: LMCE is not supported for TDX
LMCE is not supported for TDX since KVM doesn't provide emulation for
MSR_IA32_FEAT_CTL.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
378e61b145 i386/tdx: Don't allow system reset for TDX VMs
TDX CPU state is protected and thus vcpu state cannot be reset by the
VMM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
fdd53e1a81 i386/tdx: Disable PIC for TDX VMs
The legacy PIC (8259) cannot be supported for TDX VMs since the TDX
module doesn't allow direct interrupt injection.  Using posted
interrupts for the PIC is not a viable option as the guest BIOS/kernel
will not do EOI for PIC IRQs, i.e. will leave the vIRR bit set.

Hence disable the PIC for TDX VMs and error out if the user wants a PIC.
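
A hedged sketch of the check, assuming the x86 machine's OnOffAuto
"pic" property (not necessarily the exact diff):

  if (is_tdx_vm()) {
      if (x86ms->pic == ON_OFF_AUTO_AUTO) {
          x86ms->pic = ON_OFF_AUTO_OFF;     /* silently default PIC off */
      } else if (x86ms->pic == ON_OFF_AUTO_ON) {
          error_report("TDX VM doesn't support PIC");
          exit(1);
      }
  }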

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
e1ce29aa33 i386/tdx: Disable SMM for TDX VMs
TDX doesn't support SMM, and the VMM cannot emulate SMM for TDX VMs
because the VMM cannot manipulate a TDX VM's memory.

Disable SMM for TDX VMs and error out if the user requests to enable it.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
ffe9a6dccb q35: Introduce smm_ranges property for q35-pci-host
Add a q35 property to check whether or not SMM ranges, e.g. SMRAM, TSEG,
etc... exist for the target platform.  TDX doesn't support SMM and doesn't
play nice with QEMU modifying related guest memory ranges.

Signed-off-by: Isaku Yamahata <isaku.yamahata@linux.intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
7163da7797 pci-host/q35: Move PAM initialization above SMRAM initialization
In mch_realize(), process PAM initialization before SMRAM initialization
so that a later patch can skip all the SMRAM-related initialization with
a single check.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
a5368c6202 i386/tdx: Wire TDX_REPORT_FATAL_ERROR with GuestPanic facility
Integrate TDX's TDX_REPORT_FATAL_ERROR into the QEMU GuestPanic
facility.

Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes from v2:
- Add documentation of new type and struct (Daniel)
- refine the error message handling (Daniel)
2023-11-15 00:57:10 -05:00
Xiaoyao Li
f597afe309 i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR>
A TD guest can use TDG.VP.VMCALL<REPORT_FATAL_ERROR> to request
termination with an error message encoded in GPRs.

Parse and print the error message, and terminate the TD guest in the
handler.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
a23e29229e i386/tdx: Limit the range size for MapGPA
If the range for TDG.VP.VMCALL<MapGPA> is too large, process only a
limited size and return a retry error.  It's bad for the VMM to block
vcpu execution for too long, e.g. on the order of seconds, as that
results in too many missed timer interrupts.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
addb6b676d i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall
MapGPA is a hypercall to convert a GPA between private and shared.
As the conversion function is already implemented as
kvm_convert_memory(), wire it up to the TDX hypercall exit.
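
A sketch of the wiring, assuming the GHCI register layout (GPA in R12,
size in R13) and hypothetical helper/constant names:

  case TDG_VP_VMCALL_MAP_GPA: {
      hwaddr shared_bit = tdx_shared_bit(cpu);     /* hypothetical helper */
      hwaddr gpa = vmcall->in_r12 & ~shared_bit;
      uint64_t size = vmcall->in_r13;
      bool to_private = !(vmcall->in_r12 & shared_bit);

      /* kvm_convert_memory() flips the attributes and discards the
       * now-unused backing; just report the result to the TD. */
      if (kvm_convert_memory(gpa, size, to_private) == 0) {
          vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
      }
      break;
  }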

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Chenyi Qiang
824f58fdba i386/tdx: setup a timer for the qio channel
To avoid hanging on a non-responsive QGS server, set up a timer for the
transaction. On timeout, make it an error and interrupt the guest.
Define the threshold as 30s for now; it may be changed to another value
if that proves inappropriate.

Extract the common cleanup code to make it clearer.
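
A sketch of the arming with QEMU's timer API; the 30s constant is from
the message above, while getquote_timer_cb() and the struct fields are
assumptions:

  #define TDX_GET_QUOTE_TIMEOUT_MS (30 * 1000)   /* 30s threshold */

  /* One-shot timer: on expiry, getquote_timer_cb() (hypothetical) marks
   * the transaction as an error and interrupts the guest. */
  t->timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, getquote_timer_cb, t);
  timer_mod(t->timer,
            qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + TDX_GET_QUOTE_TIMEOUT_MS);
  t->timer_armed = true;   /* so the common cleanup knows to timer_del() */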

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes in v3:
 - Use t->timer_armed to track if t->timer is initialized;
2023-11-15 00:57:10 -05:00
Isaku Yamahata
9955e91601 i386/tdx: handle TDG.VP.VMCALL<GetQuote>
For GetQuote, delegate the request to the Quote Generation Service
(QGS). Add a property "quote-generation-socket" to tdx-guest, which is
a property of type SocketAddress, to specify the QGS address.

On request, connect to the QGS, read the request buffer from shared
guest memory, send the request buffer to the server, store the response
into shared guest memory and notify the TD guest by interrupt.

command line example:
  qemu-system-x86_64 \
    -object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", "cid":"2","port":"1234"}}' \
    -machine confidential-guest-support=tdx0

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes in v3:
- rename property "quote-generation-service" to "quote-generation-socket";
- change the type of "quote-generation-socket" from str to
  SocketAddress;
- squash next patch into this one;
2023-11-15 00:57:10 -05:00
Isaku Yamahata
80417c539e i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt>
For SetupEventNotifyInterrupt, record the interrupt vector and the APIC
ID of the vcpu that received this TDVMCALL.

Later, an interrupt with the given vector can be injected to the
specific vcpu that issued SetupEventNotifyInterrupt.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
511b09205f i386/tdx: Finalize TDX VM
Invoke KVM_TDX_FINALIZE_VM to finalize the TD's measurement and make
the TD vCPUs runnable once machine initialization is complete.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
85a8152691 i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
A TDX vcpu needs to be initialized by SEAMCALL(TDH.VP.INIT), and KVM
provides the vcpu-level IOCTL KVM_TDX_INIT_VCPU for it.

KVM_TDX_INIT_VCPU needs the address of the HOB as input. Invoke it for
each vcpu after the HOB list is created.
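
A sketch of the per-vcpu call, assuming the vcpu-scope ioctl wrapper
introduced earlier in this series and a hypothetical hob_addr:

  /* After the TD HOB list is built, hand its GPA to every vcpu. */
  CPU_FOREACH(cpu) {
      r = tdx_vcpu_ioctl(cpu, KVM_TDX_INIT_VCPU, 0, (void *)hob_addr);
      if (r < 0) {
          error_report("KVM_TDX_INIT_VCPU failed: %s", strerror(-r));
          exit(1);
      }
  }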

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Chao Peng
08c095f67a i386/tdx: register TDVF as private memory
Allocate private guest memfd memory for the BIOS if it's a TD VM.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
d8da77238e memory: Introduce memory_region_init_ram_guest_memfd()
Introduce memory_region_init_ram_guest_memfd() to allocate private
guest memfd at MemoryRegion initialization. It's for the use case of
TDVF, which must be private in the TDX case.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
ab857b85a5 i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
TDVF firmware (CODE and VARS) needs to be added/copied to TD's private
memory via KVM_TDX_INIT_MEM_REGION, as well as TD HOB and TEMP memory.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes in v1:
  - rename variable @metadata to @flags
2023-11-15 00:57:10 -05:00
Xiaoyao Li
517fb00637 i386/tdx: Setup the TD HOB list
The TD HOB list is used to pass the information from VMM to TDVF. The TD
HOB must include PHIT HOB and Resource Descriptor HOB. More details can
be found in TDVF specification and PI specification.

Build the TD HOB in TDX's machine_init_done callback.

Co-developed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes in v1:
  - drop the code of adding mmio resources since OVMF prepares all the
    MMIO hob itself.
2023-11-15 00:57:10 -05:00
Xiaoyao Li
359be44b10 headers: Add definitions from UEFI spec for volumes, resources, etc...
Add UEFI definitions for literals, enums, structs, GUIDs, etc... that
will be used by TDX to build the UEFI Hand-Off Block (HOB) that is passed
to the Trusted Domain Virtual Firmware (TDVF).

All values come from the UEFI specification [1], the PI spec [2] and the
TDVF design guide [3].

[1] UEFI Specification v2.10 https://uefi.org/sites/default/files/resources/UEFI_Spec_2_10_Aug29.pdf
[2] UEFI PI spec v1.8 https://uefi.org/sites/default/files/resources/UEFI_PI_Spec_1_8_March3.pdf
[3] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
0222b6f05d i386/tdx: Track RAM entries for TDX VM
The RAM of TDX VM can be classified into two types:

 - TDX_RAM_UNACCEPTED: default type of TDX memory, which needs to be
   accepted by TDX guest before it can be used and will be all-zeros
   after being accepted.

 - TDX_RAM_ADDED: the RAM that is ADD'ed to the TD guest before running,
   and can be used directly. E.g., TD HOB and TEMP MEM that are needed
   by TDVF.

Maintain TdxRamEntries[], which grabs the initial RAM info from the e820
table and marks each RAM range with the default type TDX_RAM_UNACCEPTED.

Then turn the ranges of TD HOB and TEMP MEM to TDX_RAM_ADDED, since
these ranges will be ADD'ed before the TD runs and don't need to be
accepted at runtime.

The TdxRamEntries[] are later used to setup the memory TD resource HOB
that passes memory info from QEMU to TDVF.
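
Based on the description and the v3 changelog, the bookkeeping plausibly
looks like this (field names other than the enum are assumptions):

  typedef enum TdxRamType {
      TDX_RAM_UNACCEPTED,   /* default: guest must accept before use */
      TDX_RAM_ADDED,        /* ADD'ed up front, e.g. TD HOB, TEMP MEM */
  } TdxRamType;

  typedef struct TdxRamEntry {
      uint64_t address;
      uint64_t length;
      enum TdxRamType type;
  } TdxRamEntry;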

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v3:
- use enum TdxRamType in struct TdxRamEntry; (Isaku)
- Fix the indention; (Daniel)

Changes in v1:
  - simplify the algorithm of tdx_accept_ram_range() (Suggested-by: Gerd Hoffman)
    (1) Change the existing entry to cover the accepted ram range.
    (2) If there is room before the accepted ram range add a
	TDX_RAM_UNACCEPTED entry for that.
    (3) If there is room after the accepted ram range add a
	TDX_RAM_UNACCEPTED entry for that.
2023-11-15 00:57:10 -05:00
Xiaoyao Li
8b7763f7ca i386/tdx: Track mem_ptr for each firmware entry of TDVF
For each TDVF section, QEMU needs to copy the content to guest
private memory via the KVM API (KVM_TDX_INIT_MEM_REGION).

Introduce a field @mem_ptr in TdxFirmwareEntry to track the memory
pointer of each TDVF section, so that QEMU can add/copy them to guest
private memory later.

TDVF sections can be classified into two groups:
 - The firmware itself, e.g., TDVF BFV and CFV, which is located
   separately from guest RAM. Its memory pointer is the bios pointer.

 - Sections located in guest RAM, e.g., TEMP_MEM and TD_HOB.
   mmap a new memory range for them.

Register a machine_init_done callback to do this work.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
33edcc8426 i386/tdx: Don't initialize pc.rom for TDX VMs
For TDX, the addresses below 1MB are entirely general RAM. No need to
initialize the pc.rom memory region for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
This is more of a workaround for the issue that, for the q35 machine
type, the real memslot update (which requires memslot deletion) for
pc.rom happens after tdx_init_memory_region. It causes the private
memory ADD'ed before that point to get lost. I haven't worked out a
good solution to resolve the ordering issue. So just skip the pc.rom
setup to avoid the memslot deletion.
2023-11-15 00:57:10 -05:00
Xiaoyao Li
19195c88df i386/tdx: Skip BIOS shadowing setup
TDX doesn't support mapping different GPAs to the same private memory.
Thus, aliasing the top 128KB of the BIOS as isa-bios is not supported.

On the other hand, a TDX guest cannot go to real mode, so it can work
fine without isa-bios.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v1:
 - update commit message and comment to clarify
2023-11-15 00:57:10 -05:00
Xiaoyao Li
954b9e7a42 i386/tdx: Parse TDVF metadata for TDX VM
TDX cannot support the pflash device since it supports neither
read-only memslots nor emulation. Load TDVF (OVMF) with the -bios
option for TDs.

When booting a TD, besides loading TDVF to the address below 4G, QEMU
needs to parse the TDVF metadata.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
5aad9329d2 i386/tdvf: Introduce function to parse TDVF metadata
A TDX VM needs to boot with its specialized firmware, the Trusted Domain
Virtual Firmware (TDVF). QEMU needs to parse TDVF and map it into TD
guest memory prior to running the TDX VM.

The TDVF metadata in the TDVF image describes the structure of the
firmware. QEMU refers to it to set up memory for TDVF. Introduce the
function tdvf_parse_metadata() to parse the metadata from the TDVF
image and store the info of each TDVF section.

The TDX metadata is located by a TDX metadata offset block, which is a
GUID-ed structure. The data portion of the GUID structure contains
only a 4-byte field that is the offset of the TDX metadata from the end
of the firmware file.
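
A sketch of locating the metadata, assuming the existing OVMF GUID-scan
helper; flash_ptr/flash_size (the mapped firmware image) and the struct
name are assumptions:

  uint8_t *data;
  TdvfMetadata *metadata = NULL;

  /* TDX_METADATA_OFFSET_GUID is the name from the RFC v4 changelog */
  if (pc_system_ovmf_table_find(TDX_METADATA_OFFSET_GUID, &data, NULL)) {
      /* the 4-byte payload is the offset, counted from the image end */
      uint32_t offset = le32_to_cpu(*(uint32_t *)data);
      metadata = (TdvfMetadata *)(flash_ptr + flash_size - offset);
  }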

Select X86_FW_OVMF when TDX is enabled to leverage the existing
functions to parse and search OVMF's GUID-ed structures.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes in v1:
 - rename tdvf_parse_section_entry() to
   tdvf_parse_and_check_section_entry()
Changes in RFC v4:
 - rename TDX_METADATA_GUID to TDX_METADATA_OFFSET_GUID
2023-11-15 00:57:10 -05:00
Isaku Yamahata
583aae3302 kvm/tdx: Ignore memory conversion to shared of unassigned region
TDX requires the vMMIO region to be shared.  For KVM, the MMIO region
is the region to which no kvm memslot is assigned (except for in-kernel
emulation).  qemu has the memory region for vMMIO at each device level.

While OVMF issues MapGPA(to-shared) conservatively on the 32bit PCI MMIO
region, qemu doesn't find the corresponding vMMIO region because this
happens before PCI device allocation and memory_region_find() finds the
device region, not the PCI bus region.  It's safe to ignore
MapGPA(to-shared) because when the guest accesses those regions it uses
GPAs with the shared bit set for vMMIO.  Ignore memory conversion
requests of non-assigned regions to shared and return success.
Otherwise OVMF gets confused and panics there.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Isaku Yamahata
80af3f2547 kvm/tdx: Don't complain when converting vMMIO region to shared
Because the vMMIO region needs to be a shared region, the guest TD may
explicitly convert such a region from private to shared.  Don't
complain about such conversions.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
ef90193248 i386/tdx: Make memory type private by default
By default (due to the recent UPM change), the restricted memory
attribute is shared.  Convert the memory region from shared to private
at memory slot creation time.

Add a kvm region registering function to check the flag and convert the
region, and add a memory listener to the TDX guest code to set the flag
on the possible memory regions.

Without this patch
- Secure-EPT violation on private area
- KVM_MEMORY_FAULT EXIT (kvm -> qemu)
- qemu converts the 4K page from shared to private
- Resume VCPU execution
- Secure-EPT violation again
- KVM resolves EPT Violation
This also prevents huge pages, because page conversion is done at 4K
granularity.  Although it's possible to merge 4K private mappings into
a 2M large page, doing so slows guest boot.

With this patch
- After memory slot creation, convert the region from shared to private
- Secure-EPT violation on private area
- KVM resolves the EPT violation

Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
769b8158f6 kvm/memory: Introduce the infrastructure to set the default shared/private value
Introduce a new flag RAM_DEFAULT_PRIVATE for RAMBlock. It's used to
indicate the default attribute, private or not.

Set the RAM range to private explicitly when its default is private.

Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:10 -05:00
Xiaoyao Li
d7db7d2ea1 i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
TDX only supports readonly for shared memory but not for private memory.

From QEMU's point of view, it has no idea whether a memslot is used as
shared memory or private. Thus just mark kvm_readonly_mem_enabled as
false for TDX VMs for simplicity.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
004db71f60 i386/tdx: Implement user specified tsc frequency
Reuse "-cpu,tsc-frequency=" to get user wanted tsc frequency and call VM
scope VM_SET_TSC_KHZ to set the tsc frequency of TD before KVM_TDX_INIT_VM.

Besides, sanity check the tsc frequency to be in the legal range and
legal granularity (required by TDX module).
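
A sketch of the check; the 25 MHz granularity reflects my reading of
the TDX module requirement and, like the surrounding names, is an
assumption rather than the literal diff:

  if (env->tsc_khz % (25 * 1000)) {
      error_setg(errp, "Invalid TSC %" PRId64 " KHz, "
                 "it must be a multiple of 25MHz", env->tsc_khz);
      return -EINVAL;
  }

  /* VM-scope version of KVM_SET_TSC_KHZ, issued before KVM_TDX_INIT_VM */
  r = kvm_vm_ioctl(kvm_state, KVM_SET_TSC_KHZ, env->tsc_khz);
  if (r < 0) {
      error_setg_errno(errp, -r, "Unable to set TD TSC frequency");
      return r;
  }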

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v3:
- use @errp to report error info; (Daniel)

Changes in v1:
- Use the VM scope KVM_SET_TSC_KHZ to set the TSC frequency of the TD
  since the KVM side dropped the @tsc_khz field in struct
  kvm_tdx_init_vm
2023-11-15 00:57:09 -05:00
Isaku Yamahata
bbe50409ce i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM
Three sha384 hash values of a TD, mrconfigid, mrowner and
mrownerconfig, can be provided for TDX attestation.

So far they were hard-coded as 0. Now allow the user to specify those
values via the properties mrconfigid, mrowner and mrownerconfig. They
are all in base64 format.

example
-object tdx-guest, \
  mrconfigid=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v,\
  mrowner=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v,\
  mrownerconfig=ASNFZ4mrze8BI0VniavN7wEjRWeJq83vASNFZ4mrze8BI0VniavN7wEjRWeJq83v

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes in v3:
 - use base64 encoding instead of hex-string;
2023-11-15 00:57:09 -05:00
Xiaoyao Li
2ac24a3f82 i386/tdx: Validate TD attributes
Validate the TD attributes against tdx_caps: fixed-0 bits must be zero
and fixed-1 bits must be set.

Besides, sanity check the attribute bits that have not been supported
by QEMU yet, e.g., the debug bit; it will be allowed in the future when
debug TD support lands in QEMU.
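
A sketch of the fixed-bits validation, assuming the TDX-module
convention that a bit clear in attrs_fixed0 must be 0 and a bit set in
attrs_fixed1 must be 1:

  if ((tdx->attributes & tdx_caps->attrs_fixed0) != tdx->attributes ||
      (tdx->attributes & tdx_caps->attrs_fixed1) != tdx_caps->attrs_fixed1) {
      error_setg(errp, "Invalid attributes 0x%" PRIx64 " for TDX VM "
                 "(fixed0 0x%" PRIx64 ", fixed1 0x%" PRIx64 ")",
                 tdx->attributes, tdx_caps->attrs_fixed0,
                 tdx_caps->attrs_fixed1);
      return -EINVAL;
  }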

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v3:
- using error_setg() for error report; (Daniel)
2023-11-15 00:57:09 -05:00
Xiaoyao Li
7671a8d293 i386/tdx: Wire CPU features up with attributes of TD guest
For QEMU VMs, PKS is configured via CPUID_7_0_ECX_PKS and PMU is
configured by x86cpu->enable_pmu. Reuse the existing configuration
interface for TDX VMs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:09 -05:00
Isaku Yamahata
5bf04c14d8 i386/tdx: Make sept_ve_disable set by default
For the TDX KVM use case, a Linux guest is the major one.  It requires
sept_ve_disable to be set, so make that the default for the main use
case.  For other use cases, it can be enabled/disabled via the qemu
command line.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
f503878704 i386/tdx: Add property sept-ve-disable for tdx-guest object
Bit 28 of the TD attributes is named SEPT_VE_DISABLE. When set to 1, it
disables the conversion of EPT violations to #VE on guest TD access of
PENDING pages.

Some guest OSes (e.g., a Linux TD guest) may require this bit to be 1
and otherwise refuse to boot.

Add the sept-ve-disable property to the tdx-guest object, for the user
to configure this bit.
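
A sketch of the QOM registration in the class_init; the getter/setter
names are assumptions, their bodies just test/flip the bit in
tdx_guest->attributes:

  #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE BIT_ULL(28)

  object_class_property_add_bool(oc, "sept-ve-disable",
                                 tdx_guest_get_sept_ve_disable,
                                 tdx_guest_set_sept_ve_disable);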

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v3:
- update the comment of property @sept-ve-disable to make it more
  descriptive and use new format. (Daniel and Markus)
2023-11-15 00:57:09 -05:00
Xiaoyao Li
98f599ec0b i386/tdx: Initialize TDX before creating TD vcpus
Invoke KVM_TDX_INIT_VM in kvm_arch_pre_create_vcpu(); KVM_TDX_INIT_VM
configures global TD configurations, e.g. the canonical CPUID config,
and must be executed prior to creating vCPUs.

Use kvm_x86_arch_cpuid() to set up the CPUID settings for the TDX VM.

Note, this doesn't address the fact that QEMU may change the CPUID
configuration when creating vCPUs, i.e. punts on refactoring QEMU to
provide a stable CPUID config prior to kvm_arch_init().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v3:
- Pass @errp in tdx_pre_create_vcpu() and pass error info to it. (Daniel)
2023-11-15 00:57:09 -05:00
Xiaoyao Li
a1b994d89a kvm: Introduce kvm_arch_pre_create_vcpu()
Introduce kvm_arch_pre_create_vcpu() to perform arch-dependent work
prior to creating any vcpu. This is for i386 TDX, because it needs to
call KVM_TDX_INIT_VM before creating any vcpu.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes in v3:
- pass @errp to kvm_arch_pre_create_vcpu(); (Per Daniel)
2023-11-15 00:57:09 -05:00
Sean Christopherson
04fc588ea9 i386/kvm: Move architectural CPUID leaf generation to separate helper
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
7e454d2ca4 i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup
Some bits in TD attributes have corresponding CPUID feature bits. Reflect
the fixed0/1 restriction on TD attributes to their corresponding CPUID
bits in tdx_cpuid_lookup[] as well.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
b4a0470949 i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup
KVM requires userspace to pass XFAM configuration via CPUID 0xD leaves.

Convert tdx_caps->xfam_fixed0/1 into the corresponding
tdx_cpuid_lookup[].tdx_fixed0/1 fields of the CPUID 0xD leaves. Thus
the requirement can be applied naturally.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
5fdefc08b3 i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by tdx_caps.cpuid_config[]
tdx_cpuid_lookup[].tdx_fixed0/1 is QEMU-maintained data which reflects
TDX restrictions regarding how some CPUIDs are virtualized by TDX.

It's retrieved from the TDX spec. However, TDX may change some fixed
fields to configurable in the future. Update the
tdx_cpuid_lookup[].tdx_fixed0/1 fields by removing the bits that are
reported by the TDX module as configurable. This adapts to updated TDX
modules automatically.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
d54122cefb i386/tdx: Adjust the supported CPUID based on TDX restrictions
According to the chapter "CPUID Virtualization" in the TDX module spec,
the CPUID bits of a TD can be classified into 6 types:

------------------------------------------------------------------------
1 | As configured | configurable by VMM, independent of native value;
------------------------------------------------------------------------
2 | As configured | configurable by VMM if the bit is supported natively,
  | (if native)   | otherwise it equals the native value (0);
------------------------------------------------------------------------
3 | Fixed         | fixed to 0/1;
------------------------------------------------------------------------
4 | Native        | reflects the native value;
------------------------------------------------------------------------
5 | Calculated    | calculated by the TDX module;
------------------------------------------------------------------------
6 | Inducing #VE  | gets a #VE exception;
------------------------------------------------------------------------

Note:
1. All the configurable XFAM-related features and TD-attributes-related
   features fall into type #2, and the fixed0/1 bits of XFAM and TD
   attributes fall into type #3.

2. For CPUID leaves not listed in the "CPUID virtualization Overview"
   table in the TDX module spec, the TDX module injects #VE to TDs when
   they are queried. For this case, TDs can request CPUID emulation
   from the VMM via TDVMCALL and the values are fully controlled by the
   VMM.

Because the TDX module has its own virtualization policy on CPUID bits,
what is reported via KVM_GET_SUPPORTED_CPUID diverges from the CPUID
bits actually supported for TDs. In order to keep a consistent CPUID
configuration between the VMM and TDs, adjust the supported CPUID for
TDs based on the TDX restrictions.

Currently we only focus on the CPUID leaves recognized by QEMU's
feature_word_info[], which are indexed by a FeatureWord.

Introduce a TDX CPUID lookup table, which maintains one entry for each
FeatureWord. Each entry has the fields below:

 - tdx_fixed0/1: The bits that are fixed as 0/1;

 - vmm_fixup:   The bits that are configurable from the view of the TDX
                module, but require VMM emulation when enabled. They are
                not supported if the VMM doesn't report them as
                supported, so they need to be fixed up by checking
                whether the VMM supports them;

 - inducing_ve: TD gets a #VE when querying this CPUID leaf. The result
                is totally configurable by the VMM;

 - supported_on_ve: Valid only when @inducing_ve is true. It represents
                the maximum feature set that can be emulated for TDs.
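
Put together, a plausible shape for one entry (field names follow the
description above; exact widths are assumptions):

  typedef struct TdxCpuidLookup {
      uint32_t tdx_fixed0;
      uint32_t tdx_fixed1;
      uint32_t vmm_fixup;
      bool inducing_ve;
      uint32_t supported_on_ve;   /* only meaningful when inducing_ve */
  } TdxCpuidLookup;

  /* one entry per FeatureWord, indexed like feature_word_info[] */
  static TdxCpuidLookup tdx_cpuid_lookup[FEATURE_WORDS];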

By applying the TDX CPUID lookup table and the TDX capabilities
reported by the TDX module, the supported CPUID for TDs can be obtained
with the following steps:

- get the base of the VMM supported feature set;

- if the leaf is not a FeatureWord, just return the VMM's value without
  modification;

- if the leaf is of inducing_ve type, apply the supported_on_ve mask
  and return;

- include all native bits; this covers type #2, #4, and parts of type
  #1 (it also includes some unsupported bits; the following steps will
  correct that);

- apply fixed0/1 to it (this covers type #3 and rectifies the previous
  step);

- add configurable bits (this covers the other part of type #1);

- fix up the ones in vmm_fixup;

- filter the ones that have a valid .supported field.

(The Calculated type is ignored since it's determined at runtime.)

Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
ef64621235 i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
TDX VMs will need special handling all around QEMU. Introduce the
is_tdx_vm() helper to query whether the current VM is a TDX VM.

Cache the tdx_guest object so there is no need to cast from ms->cgs
every time.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
changes in v3:
- replace object_dynamic_cast with TDX_GUEST();
2023-11-15 00:57:09 -05:00
Xiaoyao Li
575bfcd358 i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
KVM provides the TDX capabilities via the sub-command
KVM_TDX_CAPABILITIES of IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the
capabilities when initializing the TDX context. They will be used to
validate the user's settings later.

Since there is no interface reporting how many cpuid configs are
contained in KVM_TDX_CAPABILITIES, QEMU chooses to start with a known
number and abort when it exceeds KVM_MAX_CPUID_ENTRIES.

Besides, introduce the interfaces to invoke TDX "ioctls" at different
scopes (KVM, VM and VCPU) in preparation.
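
A sketch of the grow-and-retry loop; 6 is the starting guess noted in
the changelog, while the wrapper name and the -E2BIG retry convention
are assumptions following the scoped-ioctl helpers this commit
introduces:

  static struct kvm_tdx_capabilities *get_tdx_capabilities(Error **errp)
  {
      struct kvm_tdx_capabilities *caps = NULL;
      uint32_t nr_cpuid_configs = 6;    /* starting guess */
      int r;

      do {
          caps = g_realloc(caps, sizeof(*caps) + nr_cpuid_configs *
                           sizeof(struct kvm_tdx_cpuid_config));
          caps->nr_cpuid_configs = nr_cpuid_configs;
          r = tdx_platform_ioctl(KVM_TDX_CAPABILITIES, 0, caps);
          nr_cpuid_configs *= 2;
      } while (r == -E2BIG && nr_cpuid_configs <= KVM_MAX_CPUID_ENTRIES);

      if (r < 0) {
          g_free(caps);
          error_setg_errno(errp, -r, "KVM_TDX_CAPABILITIES failed");
          return NULL;
      }
      return caps;
  }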

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes in v3:
- rename __tdx_ioctl() to tdx_ioctl_internal()
- Pass errp in get_tdx_capabilities();

changes in v2:
  - Make the error message more clear;

changes in v1:
  - start from nr_cpuid_configs = 6 for the loop;
  - stop the loop when nr_cpuid_configs exceeds KVM_MAX_CPUID_ENTRIES;
2023-11-15 00:57:09 -05:00
Xiaoyao Li
38b04243ce i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
Introduce tdx_kvm_init() and invoke it in kvm_confidential_guest_init()
if it's a TDX VM.

Set ms->require_guest_memfd to require kvm guest memfd allocation for any
memory backend. More TDX specific initialization will be added later.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
c64b4c4e28 target/i386: Introduce kvm_confidential_guest_init()
Introduce a separate function kvm_confidential_guest_init(), which
dispatches specific confidential guest initialization function by
ms->cgs type.
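
A sketch of the dispatcher; TYPE_TDX_GUEST and tdx_kvm_init() are the
names used elsewhere in this series, and the SEV branch is shown for
context only:

  static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
  {
      if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
          return sev_kvm_init(ms->cgs, errp);
      } else if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
          return tdx_kvm_init(ms, errp);
      }

      error_setg(errp, "unsupported confidential guest type");
      return -EINVAL;
  }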

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
871c9f21ee target/i386: Parse TDX vm type
A TDX VM requires the VM type KVM_X86_TDX_VM to be passed to
kvm_ioctl(KVM_CREATE_VM).

If the tdx-guest object is specified as confidential-guest-support,
e.g.,

  qemu -machine ...,confidential-guest-support=tdx0 \
       -object tdx-guest,id=tdx0,...

parse the VM type as KVM_X86_TDX_VM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
3770c1f6cb target/i386: Implement mc->kvm_type() to get VM type
Implement mc->kvm_type() for i386 machines. It provides a way for the
user to create a KVM_X86_SW_PROTECTED_VM.

Also store the vm_type in MachineState so that other code can query
what the VM type is.
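
A sketch of an mc->kvm_type() hook; the constant names mirror the
KVM-side VM types, and the vm_type field placement is an assumption:

  static int x86_kvm_type(MachineState *ms, const char *vm_type)
  {
      X86MachineState *x86ms = X86_MACHINE(ms);
      int kvm_type = KVM_X86_DEFAULT_VM;

      if (vm_type && !g_ascii_strcasecmp(vm_type, "sw-protected-vm")) {
          kvm_type = KVM_X86_SW_PROTECTED_VM;
      }
      x86ms->vm_type = kvm_type;   /* stored so other code can query it */
      return kvm_type;
  }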

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
636758fe40 i386: Introduce tdx-guest object
Introduce the tdx-guest object, which implements the interface of
CONFIDENTIAL_GUEST_SUPPORT and will be used to create TDX VMs (TDs) by

  qemu -machine ...,confidential-guest-support=tdx0	\
       -object tdx-guest,id=tdx0

It has only one member, 'attributes', with a fixed value of 0; it is
not configurable so far.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
changes in v1
- make @attributes not user-settable
2023-11-15 00:57:09 -05:00
Xiaoyao Li
c9636b4bf5 *** HACK *** linux-headers: Update headers to pull in TDX API changes
Pull in recent TDX updates, which are not backwards compatible.

This is just to make the series runnable. It will be regenerated by the
script

	scripts/update-linux-headers.sh

once TDX support is upstreamed in the Linux kernel.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Isaku Yamahata
d39ad1ccfd trace/kvm: Add trace for page conversion between shared and private
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Chao Peng
552a4e57a8 kvm: handle KVM_EXIT_MEMORY_FAULT
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when
KVM_EXIT_MEMORY_FAULT happens. It indicates that userspace needs to do
the memory conversion on the RAMBlock to turn the memory into the
desired attribute, i.e., private/shared.

Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has a
guest_memfd memory backend.

Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is
added.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
105fa48cab physmem: Introduce ram_block_convert_range() for page conversion
It's used for discarding the memory of the opposite attribute after
memory conversion, for confidential guests.

When a page is converted from shared to private, the original shared
memory can be discarded via ram_block_discard_range().

When a page is converted from private to shared, the original private
memory is backed by guest_memfd. Introduce
ram_block_discard_guest_memfd_range() for discarding memory in
guest_memfd.
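
A sketch matching the description above (the signature is an
assumption): discard the now-unused side after a conversion.

  void ram_block_convert_range(RAMBlock *rb, uint64_t start,
                               size_t length, bool shared_to_private)
  {
      if (shared_to_private) {
          /* the shared (hva) backing is no longer needed */
          ram_block_discard_range(rb, start, length);
      } else {
          /* the private backing lives in guest_memfd */
          ram_block_discard_guest_memfd_range(rb, start, length);
      }
  }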

Originally-from: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
9824769418 physmem: replace function name with __func__ in ram_block_discard_range()
Use __func__ to avoid hard-coded function name.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
02fda3dd28 physmem: Relax the alignment check of host_startaddr in ram_block_discard_range()
Commit d3a5038c46 ("exec: ram_block_discard_range") introduced
ram_block_discard_range(), which grabs some code from
ram_discard_range(). However, during the code movement, it changed the
alignment check of host_startaddr from qemu_host_page_size to
rb->page_size.

When a ramblock is backed by hugepages, this requires the start address
to be huge-page-size aligned, which is overkill. E.g., TDX's
private-shared page conversion is done at 4KB granularity. A shared
page is discarded when it gets converted to private, and when the
shared page is backed by hugepages it is going to fail this check.

So change the alignment check back to qemu_host_page_size.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes in v3:
 - Newly added in v3;
2023-11-15 00:57:09 -05:00
Xiaoyao Li
609e0ca1d8 kvm: Introduce support for memory_attributes
Introduce helper functions to set the attributes of a range of memory
to private or shared.

This is necessary to notify KVM of the private/shared attribute of each
gpa range. KVM needs the information to decide whether a GPA needs to
be mapped as hva-based shared memory or guest_memfd-based private
memory.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Chao Peng
d21132fe6e kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.

With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory regions that
are backed both by hva-based shared memory and guest memfd based
private memory.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
902dc0c4a7 HostMem: Add mechanism to opt in kvm guest memfd via MachineState
Add a new member "require_guest_memfd" to memory backends. When it's
set to true, it enables RAM_GUEST_MEMFD in ram_flags, and thus a
private kvm guest_memfd will be allocated during RAMBlock allocation.

The memory backend's @require_guest_memfd is wired to the
@require_guest_memfd field of MachineState.
MachineState::require_guest_memfd is supposed to be set by any VM that
requires KVM guest memfd as private memory, e.g., a TDX VM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
7bd3bf6642 RAMBlock/guest_memfd: Enable KVM_GUEST_MEMFD_ALLOW_HUGEPAGE
KVM allows KVM_GUEST_MEMFD_ALLOW_HUGEPAGE for guest memfd. When the
flag is set, KVM tries to allocate memory with a transparent hugepage
at first and falls back to non-hugepage on failure.

However, KVM imposes one restriction: the size must be hugepage-size
aligned when KVM_GUEST_MEMFD_ALLOW_HUGEPAGE is set.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
v3:
 - New one in v3.
2023-11-15 00:57:09 -05:00
Xiaoyao Li
843bbbd03b RAMBlock: Add support of KVM private guest memfd
Add KVM guest_memfd support to RAMBlock so that both normal hva-based
memory and kvm guest memfd based private memory can be associated with
one RAMBlock.

Introduce the new flag RAM_GUEST_MEMFD. When it's set, a KVM ioctl is
called to create the private guest_memfd during RAMBlock setup.

Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as TDX VM. How and when to set it for memory
backends will be implemented in the following patches.

Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
KVM guest_memfd allocated.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes in v3:
- rename gmem to guest_memfd;
- close(guest_memfd) when RAMBlock is released; (Daniel P. Berrangé)
- Suqash the patch that introduces memory_region_has_guest_memfd().
2023-11-15 00:57:09 -05:00
Xiaoyao Li
3091ad45a5 *** HACK *** linux-headers: Update headers to pull in gmem APIs
This patch needs to be updated by script

	scripts/update-linux-headers.sh

once gmem fd support is upstreamed in the Linux kernel.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
8d635c6681 trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
The upper 16 bits of kvm_userspace_memory_region::slot are the address
space id. Parse it out separately in trace_kvm_set_user_memory().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
f5fd218755 i386/cpuid: Remove subleaf constraint on CPUID leaf 1F
There is no such constraint that the subleaf index needs to be less
than 64.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
d5591ff440 i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F
Decrease the array index cpuid_i when CPUID leaf 1F is skipped;
otherwise an all-zeroed CPUID entry with leaf 0 and subleaf 0 is left
behind, which conflicts with the correct leaf 0.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
2023-11-15 00:57:09 -05:00
Xiaoyao Li
605d572f9c i386/pc: Drop pc_machine_kvm_type()
pc_machine_kvm_type() was introduced by commit e21be724ea ("i386/xen:
add pc_machine_kvm_type to initialize XEN_EMULATE mode") to do Xen
specific initialization by utilizing the kvm_type method.

Commit eeedfe6c63 ("hw/xen: Simplify emulated Xen platform init")
moved the Xen specific initialization to pc_basic_device_init().

There is no need to keep the PC specific kvm_type() implementation
anymore. On the other hand, a later patch will implement the kvm_type()
method for all x86/i386 machines to support KVM_X86_SW_PROTECTED_VM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2023-11-15 00:57:09 -05:00
459 changed files with 6809 additions and 5762 deletions

View File

@@ -41,7 +41,7 @@ build-system-ubuntu:
variables:
IMAGE: ubuntu2204
CONFIGURE_ARGS: --enable-docs
TARGETS: alpha-softmmu microblaze-softmmu mips64el-softmmu
TARGETS: alpha-softmmu microblazeel-softmmu mips64el-softmmu
MAKE_CHECK_ARGS: check-build
check-system-ubuntu:
@@ -70,7 +70,7 @@ build-system-debian:
needs:
job: amd64-debian-container
variables:
IMAGE: debian
IMAGE: debian-amd64
CONFIGURE_ARGS: --with-coroutine=sigaltstack
TARGETS: arm-softmmu i386-softmmu riscv64-softmmu sh4eb-softmmu
sparc-softmmu xtensa-softmmu
@@ -82,7 +82,7 @@ check-system-debian:
- job: build-system-debian
artifacts: true
variables:
IMAGE: debian
IMAGE: debian-amd64
MAKE_CHECK_ARGS: check
avocado-system-debian:
@@ -91,7 +91,7 @@ avocado-system-debian:
- job: build-system-debian
artifacts: true
variables:
IMAGE: debian
IMAGE: debian-amd64
MAKE_CHECK_ARGS: check-avocado
AVOCADO_TAGS: arch:arm arch:i386 arch:riscv64 arch:sh4 arch:sparc arch:xtensa
@@ -101,7 +101,7 @@ crash-test-debian:
- job: build-system-debian
artifacts: true
variables:
IMAGE: debian
IMAGE: debian-amd64
script:
- cd build
- make NINJA=":" check-venv
@@ -217,36 +217,6 @@ avocado-system-opensuse:
MAKE_CHECK_ARGS: check-avocado
AVOCADO_TAGS: arch:s390x arch:x86_64 arch:aarch64
#
# Flaky tests. We don't run these by default and they are allow fail
# but often the CI system is the only way to trigger the failures.
#
build-system-flaky:
extends:
- .native_build_job_template
- .native_build_artifact_template
needs:
job: amd64-debian-container
variables:
IMAGE: debian
QEMU_JOB_OPTIONAL: 1
TARGETS: aarch64-softmmu arm-softmmu mips64el-softmmu
ppc64-softmmu rx-softmmu s390x-softmmu sh4-softmmu x86_64-softmmu
MAKE_CHECK_ARGS: check-build
avocado-system-flaky:
extends: .avocado_test_job_template
needs:
- job: build-system-flaky
artifacts: true
allow_failure: true
variables:
IMAGE: debian
MAKE_CHECK_ARGS: check-avocado
QEMU_JOB_OPTIONAL: 1
QEMU_TEST_FLAKY_TESTS: 1
AVOCADO_TAGS: flaky
# This jobs explicitly disable TCG (--disable-tcg), KVM is detected by
# the configure script. The container doesn't contain Xen headers so
@@ -619,7 +589,7 @@ build-tools-and-docs-debian:
# when running on 'master' we use pre-existing container
optional: true
variables:
IMAGE: debian
IMAGE: debian-amd64
MAKE_CHECK_ARGS: check-unit ctags TAGS cscope
CONFIGURE_ARGS: --disable-system --disable-user --enable-docs --enable-tools
QEMU_JOB_PUBLISH: 1
@@ -639,7 +609,7 @@ build-tools-and-docs-debian:
# of what topic branch they're currently using
pages:
extends: .base_job_template
image: $CI_REGISTRY_IMAGE/qemu/debian:$QEMU_CI_CONTAINER_TAG
image: $CI_REGISTRY_IMAGE/qemu/debian-amd64:$QEMU_CI_CONTAINER_TAG
stage: test
needs:
- job: build-tools-and-docs-debian
@@ -647,10 +617,7 @@ pages:
- mkdir -p public
# HTML-ised source tree
- make gtags
# We unset variables to work around a bug in some htags versions
# which causes it to fail when the environment is large
- CI_COMMIT_MESSAGE= CI_COMMIT_TAG_MESSAGE= htags
-anT --tree-view=filetree -m qemu_init
- htags -anT --tree-view=filetree -m qemu_init
-t "Welcome to the QEMU sourcecode"
- mv HTML public/src
# Project documentation

View File

@@ -59,13 +59,13 @@ x64-freebsd-13-build:
INSTALL_COMMAND: pkg install -y
TEST_TARGETS: check
aarch64-macos-13-base-build:
aarch64-macos-12-base-build:
extends: .cirrus_build_job
variables:
NAME: macos-13
NAME: macos-12
CIRRUS_VM_INSTANCE_TYPE: macos_instance
CIRRUS_VM_IMAGE_SELECTOR: image
CIRRUS_VM_IMAGE_NAME: ghcr.io/cirruslabs/macos-ventura-base:latest
CIRRUS_VM_IMAGE_NAME: ghcr.io/cirruslabs/macos-monterey-base:latest
CIRRUS_VM_CPUS: 12
CIRRUS_VM_RAM: 24G
UPDATE_COMMAND: brew update
@@ -74,22 +74,6 @@ aarch64-macos-13-base-build:
PKG_CONFIG_PATH: /opt/homebrew/curl/lib/pkgconfig:/opt/homebrew/ncurses/lib/pkgconfig:/opt/homebrew/readline/lib/pkgconfig
TEST_TARGETS: check-unit check-block check-qapi-schema check-softfloat check-qtest-x86_64
aarch64-macos-14-base-build:
extends: .cirrus_build_job
variables:
NAME: macos-14
CIRRUS_VM_INSTANCE_TYPE: macos_instance
CIRRUS_VM_IMAGE_SELECTOR: image
CIRRUS_VM_IMAGE_NAME: ghcr.io/cirruslabs/macos-sonoma-base:latest
CIRRUS_VM_CPUS: 12
CIRRUS_VM_RAM: 24G
UPDATE_COMMAND: brew update
INSTALL_COMMAND: brew install
PATH_EXTRA: /opt/homebrew/ccache/libexec:/opt/homebrew/gettext/bin
PKG_CONFIG_PATH: /opt/homebrew/curl/lib/pkgconfig:/opt/homebrew/ncurses/lib/pkgconfig:/opt/homebrew/readline/lib/pkgconfig
TEST_TARGETS: check-unit check-block check-qapi-schema check-softfloat check-qtest-x86_64
QEMU_JOB_OPTIONAL: 1
# The following jobs run VM-based tests via KVM on a Linux-based Cirrus-CI job
.cirrus_kvm_job:

View File

@@ -1,6 +1,6 @@
# THIS FILE WAS AUTO-GENERATED
#
# $ lcitool variables macos-13 qemu
# $ lcitool variables macos-12 qemu
#
# https://gitlab.com/libvirt/libvirt-ci

View File

@@ -1,16 +0,0 @@
# THIS FILE WAS AUTO-GENERATED
#
# $ lcitool variables macos-14 qemu
#
# https://gitlab.com/libvirt/libvirt-ci
CCACHE='/opt/homebrew/bin/ccache'
CPAN_PKGS=''
CROSS_PKGS=''
MAKE='/opt/homebrew/bin/gmake'
NINJA='/opt/homebrew/bin/ninja'
PACKAGING_COMMAND='brew'
PIP3='/opt/homebrew/bin/pip3'
PKGS='bash bc bison bzip2 capstone ccache cmocka ctags curl dbus diffutils dtc flex gcovr gettext git glib gnu-sed gnutls gtk+3 jemalloc jpeg-turbo json-c libepoxy libffi libgcrypt libiscsi libnfs libpng libslirp libssh libtasn1 libusb llvm lzo make meson mtools ncurses nettle ninja pixman pkg-config python3 rpm2cpio sdl2 sdl2_image snappy socat sparse spice-protocol swtpm tesseract usbredir vde vte3 xorriso zlib zstd'
PYPI_PKGS='PyYAML numpy pillow sphinx sphinx-rtd-theme tomli'
PYTHON='/opt/homebrew/bin/python3'

View File

@@ -46,12 +46,6 @@ loongarch-debian-cross-container:
variables:
NAME: debian-loongarch-cross
i686-debian-cross-container:
extends: .container_job_template
stage: containers
variables:
NAME: debian-i686-cross
mips64el-debian-cross-container:
extends: .container_job_template
stage: containers
@@ -101,6 +95,11 @@ cris-fedora-cross-container:
variables:
NAME: fedora-cris-cross
i386-fedora-cross-container:
extends: .container_job_template
variables:
NAME: fedora-i386-cross
win32-fedora-cross-container:
extends: .container_job_template
variables:

View File

@@ -11,7 +11,7 @@ amd64-debian-container:
extends: .container_job_template
stage: containers
variables:
NAME: debian
NAME: debian-amd64
amd64-ubuntu2204-container:
extends: .container_job_template

View File

@@ -37,25 +37,25 @@ cross-arm64-kvm-only:
IMAGE: debian-arm64-cross
EXTRA_CONFIGURE_OPTS: --disable-tcg --without-default-features
cross-i686-user:
cross-i386-user:
extends:
- .cross_user_build_job
- .cross_test_artifacts
needs:
job: i686-debian-cross-container
job: i386-fedora-cross-container
variables:
IMAGE: debian-i686-cross
IMAGE: fedora-i386-cross
MAKE_CHECK_ARGS: check
cross-i686-tci:
cross-i386-tci:
extends:
- .cross_accel_build_job
- .cross_test_artifacts
timeout: 60m
needs:
job: i686-debian-cross-container
job: i386-fedora-cross-container
variables:
IMAGE: debian-i686-cross
IMAGE: fedora-i386-cross
ACCEL: tcg-interpreter
EXTRA_CONFIGURE_OPTS: --target-list=i386-softmmu,i386-linux-user,aarch64-softmmu,aarch64-linux-user,ppc-softmmu,ppc-linux-user --disable-plugins
MAKE_CHECK_ARGS: check check-tcg
@@ -165,7 +165,7 @@ cross-win32-system:
job: win32-fedora-cross-container
variables:
IMAGE: fedora-win32-cross
EXTRA_CONFIGURE_OPTS: --enable-fdt=internal
EXTRA_CONFIGURE_OPTS: --enable-fdt=internal --disable-plugins
CROSS_SKIP_TARGETS: alpha-softmmu avr-softmmu hppa-softmmu m68k-softmmu
microblazeel-softmmu mips64el-softmmu nios2-softmmu
artifacts:

View File

@@ -5,21 +5,16 @@
# Required
version: 2
# Set the version of Python and other tools you might need
build:
os: ubuntu-22.04
tools:
python: "3.11"
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/conf.py
# We recommend specifying your dependencies to enable reproducible builds:
# https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
install:
- requirements: docs/requirements.txt
# We want all the document formats
formats: all
# For consistency, we require that QEMU's Sphinx extensions
# run with at least the same minimum version of Python that
# we require for other Python in our codebase (our conf.py
# enforces this, and some code needs it.)
python:
version: 3.6

View File

@@ -174,7 +174,6 @@ F: include/hw/core/tcg-cpu-ops.h
F: host/include/*/host/cpuinfo.h
F: util/cpuinfo-*.c
F: include/tcg/
F: tests/decode/
FPU emulation
M: Aurelien Jarno <aurelien@aurel32.net>

View File

@@ -1 +1 @@
8.2.1
8.1.90

View File

@@ -101,6 +101,8 @@ bool kvm_msi_use_devid;
bool kvm_has_guest_debug;
static int kvm_sstep_flags;
static bool kvm_immediate_exit;
static bool kvm_guest_memfd_supported;
static uint64_t kvm_supported_memory_attributes;
static hwaddr kvm_max_slot_size = ~0;
static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -292,34 +294,69 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram,
static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot, bool new)
{
KVMState *s = kvm_state;
struct kvm_userspace_memory_region mem;
struct kvm_userspace_memory_region2 mem;
static int cap_user_memory2 = -1;
int ret;
if (cap_user_memory2 == -1) {
cap_user_memory2 = kvm_check_extension(s, KVM_CAP_USER_MEMORY2);
}
if (!cap_user_memory2 && slot->guest_memfd >= 0) {
error_report("%s, KVM doesn't support KVM_CAP_USER_MEMORY2,"
" which is required by guest memfd!", __func__);
exit(1);
}
mem.slot = slot->slot | (kml->as_id << 16);
mem.guest_phys_addr = slot->start_addr;
mem.userspace_addr = (unsigned long)slot->ram;
mem.flags = slot->flags;
mem.guest_memfd = slot->guest_memfd;
mem.guest_memfd_offset = slot->guest_memfd_offset;
if (slot->memory_size && !new && (mem.flags ^ slot->old_flags) & KVM_MEM_READONLY) {
/* Set the slot size to 0 before setting the slot to the desired
* value. This is needed based on KVM commit 75d61fbc. */
mem.memory_size = 0;
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
if (cap_user_memory2) {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
} else {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
}
if (ret < 0) {
goto err;
}
}
mem.memory_size = slot->memory_size;
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
if (cap_user_memory2) {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
} else {
ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
}
slot->old_flags = mem.flags;
err:
trace_kvm_set_user_memory(mem.slot, mem.flags, mem.guest_phys_addr,
mem.memory_size, mem.userspace_addr, ret);
trace_kvm_set_user_memory(mem.slot >> 16, (uint16_t)mem.slot, mem.flags,
mem.guest_phys_addr, mem.memory_size,
mem.userspace_addr, mem.guest_memfd,
mem.guest_memfd_offset, ret);
if (ret < 0) {
error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
__func__, mem.slot, slot->start_addr,
(uint64_t)mem.memory_size, strerror(errno));
if (cap_user_memory2) {
error_report("%s: KVM_SET_USER_MEMORY_REGION2 failed, slot=%d,"
" start=0x%" PRIx64 ", size=0x%" PRIx64 ","
" flags=0x%" PRIx32 ", guest_memfd=%" PRId32 ","
" guest_memfd_offset=0x%" PRIx64 ": %s",
__func__, mem.slot, slot->start_addr,
(uint64_t)mem.memory_size, mem.flags,
mem.guest_memfd, (uint64_t)mem.guest_memfd_offset,
strerror(errno));
} else {
error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
__func__, mem.slot, slot->start_addr,
(uint64_t)mem.memory_size, strerror(errno));
}
}
return ret;
}
@@ -391,6 +428,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
}
int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
{
return 0;
}
int kvm_init_vcpu(CPUState *cpu, Error **errp)
{
KVMState *s = kvm_state;
@@ -399,15 +441,27 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
/*
* tdx_pre_create_vcpu() may call cpu_x86_cpuid(). It in turn may call
* kvm_vm_ioctl(). Set cpu->kvm_state in advance to avoid NULL pointer
* dereference.
*/
cpu->kvm_state = s;
ret = kvm_arch_pre_create_vcpu(cpu, errp);
if (ret < 0) {
cpu->kvm_state = NULL;
goto err;
}
ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
if (ret < 0) {
error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
kvm_arch_vcpu_id(cpu));
cpu->kvm_state = NULL;
goto err;
}
cpu->kvm_fd = ret;
cpu->kvm_state = s;
cpu->vcpu_dirty = true;
cpu->dirty_pages = 0;
cpu->throttle_us_per_full = 0;
@@ -475,6 +529,9 @@ static int kvm_mem_flags(MemoryRegion *mr)
if (readonly && kvm_readonly_mem_allowed) {
flags |= KVM_MEM_READONLY;
}
if (memory_region_has_guest_memfd(mr)) {
flags |= KVM_MEM_PRIVATE;
}
return flags;
}
@@ -1266,6 +1323,44 @@ void kvm_set_max_memslot_size(hwaddr max_slot_size)
kvm_max_slot_size = max_slot_size;
}
static int kvm_set_memory_attributes(hwaddr start, hwaddr size, uint64_t attr)
{
struct kvm_memory_attributes attrs;
int r;
attrs.attributes = attr;
attrs.address = start;
attrs.size = size;
attrs.flags = 0;
r = kvm_vm_ioctl(kvm_state, KVM_SET_MEMORY_ATTRIBUTES, &attrs);
if (r) {
warn_report("%s: failed to set memory (0x%lx+%#zx) with attr 0x%lx error '%s'",
__func__, start, size, attr, strerror(errno));
}
return r;
}
int kvm_set_memory_attributes_private(hwaddr start, hwaddr size)
{
if (!(kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE)) {
error_report("KVM doesn't support PRIVATE memory attribute\n");
return -EINVAL;
}
return kvm_set_memory_attributes(start, size, KVM_MEMORY_ATTRIBUTE_PRIVATE);
}
int kvm_set_memory_attributes_shared(hwaddr start, hwaddr size)
{
if (!(kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE)) {
error_report("KVM doesn't support PRIVATE memory attribute\n");
return -EINVAL;
}
return kvm_set_memory_attributes(start, size, 0);
}
/* Called with KVMMemoryListener.slots_lock held */
static void kvm_set_phys_mem(KVMMemoryListener *kml,
MemoryRegionSection *section, bool add)
@@ -1362,6 +1457,9 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
mem->ram_start_offset = ram_start_offset;
mem->ram = ram;
mem->flags = kvm_mem_flags(mr);
mem->guest_memfd = mr->ram_block->guest_memfd;
mem->guest_memfd_offset = (uint8_t*)ram - mr->ram_block->host;
kvm_slot_init_dirty_bitmap(mem);
err = kvm_set_user_memory_region(kml, mem, true);
if (err) {
@@ -1369,6 +1467,16 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
strerror(-err));
abort();
}
if (memory_region_is_default_private(mr)) {
err = kvm_set_memory_attributes_private(start_addr, slot_size);
if (err) {
error_report("%s: failed to set memory attribute private: %s\n",
__func__, strerror(-err));
exit(1);
}
}
start_addr += slot_size;
ram_start_offset += slot_size;
ram += slot_size;
@@ -2396,6 +2504,11 @@ static int kvm_init(MachineState *ms)
}
s->as = g_new0(struct KVMAs, s->nr_as);
kvm_guest_memfd_supported = kvm_check_extension(s, KVM_CAP_GUEST_MEMFD);
ret = kvm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
kvm_supported_memory_attributes = ret > 0 ? ret : 0;
if (object_property_find(OBJECT(current_machine), "kvm-type")) {
g_autofree char *kvm_type = object_property_get_str(OBJECT(current_machine),
"kvm-type",
@@ -2816,6 +2929,78 @@ static void kvm_eat_signals(CPUState *cpu)
} while (sigismember(&chkset, SIG_IPI));
}
int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
{
MemoryRegionSection section;
ram_addr_t offset;
MemoryRegion *mr;
RAMBlock *rb;
void *addr;
int ret = -1;
trace_kvm_convert_memory(start, size, to_private ? "shared_to_private" : "private_to_shared");
section = memory_region_find(get_system_memory(), start, size);
mr = section.mr;
if (!mr) {
/*
* Ignore conversion of a non-assigned region to shared.
*
* TDX requires the vMMIO region to be shared in order to inject #VE
* into the guest. OVMF conservatively issues MapGPA(shared) on the
* 32-bit PCI MMIO region and on the 4K vIO-APIC page at 0xFEC00000.
* OVMF assigns the 32-bit PCI MMIO region to
* [top of low memory (typically 2GB = 0x80000000), 0xFC000000)
*/
if (!to_private) {
ret = 0;
}
return ret;
}
if (memory_region_has_guest_memfd(mr)) {
if (to_private) {
ret = kvm_set_memory_attributes_private(start, size);
} else {
ret = kvm_set_memory_attributes_shared(start, size);
}
if (ret) {
memory_region_unref(section.mr);
return ret;
}
addr = memory_region_get_ram_ptr(section.mr) +
section.offset_within_region;
rb = qemu_ram_block_from_host(addr, false, &offset);
/*
* The attribute itself was switched by KVM_SET_MEMORY_ATTRIBUTES in
* kvm_set_memory_attributes() above; the operation on the underlying
* file descriptor here only releases pages that are no longer needed.
*/
ram_block_convert_range(rb, offset, size, to_private);
} else {
/*
* Because the vMMIO region must be shared, the guest TD may convert
* it to shared explicitly. Don't complain about such a case. See
* memory_region_type() for how to check whether a region is MMIO.
*/
if (!to_private &&
!memory_region_is_ram(mr) &&
!memory_region_is_ram_device(mr) &&
!memory_region_is_rom(mr) &&
!memory_region_is_romd(mr)) {
ret = 0;
} else {
warn_report("Convert non-guest_memfd-backed memory region "
"(0x%" HWADDR_PRIx ", +0x%" HWADDR_PRIx ") to %s",
start, size, to_private ? "private" : "shared");
}
}
memory_region_unref(section.mr);
return ret;
}
int kvm_cpu_exec(CPUState *cpu)
{
struct kvm_run *run = cpu->kvm_run;
@@ -2883,18 +3068,20 @@ int kvm_cpu_exec(CPUState *cpu)
ret = EXCP_INTERRUPT;
break;
}
fprintf(stderr, "error: kvm run failed %s\n",
strerror(-run_ret));
if (!(run_ret == -EFAULT && run->exit_reason == KVM_EXIT_MEMORY_FAULT)) {
fprintf(stderr, "error: kvm run failed %s\n",
strerror(-run_ret));
#ifdef TARGET_PPC
if (run_ret == -EBUSY) {
fprintf(stderr,
"This is probably because your SMT is enabled.\n"
"VCPU can only run on primary threads with all "
"secondary threads offline.\n");
}
if (run_ret == -EBUSY) {
fprintf(stderr,
"This is probably because your SMT is enabled.\n"
"VCPU can only run on primary threads with all "
"secondary threads offline.\n");
}
#endif
ret = -1;
break;
ret = -1;
break;
}
}
trace_kvm_run_exit(cpu->cpu_index, run->exit_reason);
@@ -2981,6 +3168,16 @@ int kvm_cpu_exec(CPUState *cpu)
break;
}
break;
case KVM_EXIT_MEMORY_FAULT:
if (run->memory_fault.flags & ~KVM_MEMORY_EXIT_FLAG_PRIVATE) {
error_report("KVM_EXIT_MEMORY_FAULT: Unknown flag 0x%" PRIx64,
(uint64_t)run->memory_fault.flags);
ret = -1;
break;
}
ret = kvm_convert_memory(run->memory_fault.gpa, run->memory_fault.size,
run->memory_fault.flags & KVM_MEMORY_EXIT_FLAG_PRIVATE);
break;
default:
DPRINTF("kvm_arch_handle_exit\n");
ret = kvm_arch_handle_exit(cpu, run);
@@ -4077,3 +4274,24 @@ void query_stats_schemas_cb(StatsSchemaList **result, Error **errp)
query_stats_schema_vcpu(first_cpu, &stats_args);
}
}
int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
{
int fd;
struct kvm_create_guest_memfd guest_memfd = {
.size = size,
.flags = flags,
};
if (!kvm_guest_memfd_supported) {
error_setg(errp, "KVM doesn't support guest memfd");
return -EOPNOTSUPP;
}
fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
if (fd < 0) {
error_setg_errno(errp, -fd, "%s: error creating KVM guest memfd", __func__);
}
return fd;
}
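A minimal caller sketch, assuming a RAMBlock with the max_length and guest_memfd fields used earlier in this diff (a flags value of 0 requests no extra behavior):
static int guest_memfd_alloc_sketch(RAMBlock *block, Error **errp)
{
/* Sketch: store the new fd in the field kvm_set_phys_mem() reads */
int fd = kvm_create_guest_memfd(block->max_length, 0, errp);
if (fd < 0) {
return fd;
}
block->guest_memfd = fd;
return 0;
}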


@@ -15,7 +15,7 @@ kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
kvm_irqchip_release_virq(int virq) "virq %d"
kvm_set_ioeventfd_mmio(int fd, uint64_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) "fd: %d @0x%" PRIx64 " val=0x%x assign: %d size: %d match: %d"
kvm_set_ioeventfd_pio(int fd, uint16_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) "fd: %d @0x%x val=0x%x assign: %d size: %d match: %d"
kvm_set_user_memory(uint32_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, int ret) "Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " ret=%d"
kvm_set_user_memory(uint16_t as, uint16_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, uint32_t fd, uint64_t fd_offset, int ret) "AddrSpace#%d Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " guest_memfd=%d" " guest_memfd_offset=0x%" PRIx64 " ret=%d"
kvm_clear_dirty_log(uint32_t slot, uint64_t start, uint32_t size) "slot#%"PRId32" start 0x%"PRIx64" size 0x%"PRIx32
kvm_resample_fd_notify(int gsi) "gsi %d"
kvm_dirty_ring_full(int id) "vcpu %d"
@@ -25,4 +25,4 @@ kvm_dirty_ring_reaper(const char *s) "%s"
kvm_dirty_ring_reap(uint64_t count, int64_t t) "reaped %"PRIu64" pages (took %"PRIi64" us)"
kvm_dirty_ring_reaper_kick(const char *reason) "%s"
kvm_dirty_ring_flush(int finished) "%d"
kvm_convert_memory(uint64_t start, uint64_t size, const char *msg) "start 0x%" PRIx64 " size 0x%" PRIx64 " %s"


@@ -183,7 +183,7 @@ static bool tb_lookup_cmp(const void *p, const void *d)
const TranslationBlock *tb = p;
const struct tb_desc *desc = d;
if (tb->pc == desc->pc &&
if ((tb_cflags(tb) & CF_PCREL || tb->pc == desc->pc) &&
tb_page_addr0(tb) == desc->page_addr0 &&
tb->cs_base == desc->cs_base &&
tb->flags == desc->flags &&
@@ -233,7 +233,7 @@ static TranslationBlock *tb_htable_lookup(CPUState *cpu, vaddr pc,
return NULL;
}
desc.page_addr0 = phys_pc;
h = tb_hash_func(phys_pc, pc,
h = tb_hash_func(phys_pc, (cflags & CF_PCREL ? 0 : pc),
flags, cs_base, cflags);
return qht_lookup_custom(&tb_ctx.htable, &desc, h, tb_lookup_cmp);
}
@@ -721,7 +721,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
&& cpu->neg.icount_decr.u16.low + cpu->icount_extra == 0) {
/* Execute just one insn to trigger exception pending in the log */
cpu->cflags_next_tb = (curr_cflags(cpu) & ~CF_USE_ICOUNT)
| CF_NOIRQ | 1;
| CF_LAST_IO | CF_NOIRQ | 1;
}
#endif
return false;


@@ -1479,8 +1479,7 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
/* Handle clean RAM pages. */
if (unlikely(flags & TLB_NOTDIRTY)) {
int dirtysize = size == 0 ? 1 : size;
notdirty_write(env_cpu(env), addr, dirtysize, *pfull, retaddr);
notdirty_write(env_cpu(env), addr, 1, *pfull, retaddr);
flags &= ~TLB_NOTDIRTY;
}
@@ -1503,8 +1502,7 @@ int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
/* Handle clean RAM pages. */
if (unlikely(flags & TLB_NOTDIRTY)) {
int dirtysize = size == 0 ? 1 : size;
notdirty_write(env_cpu(env), addr, dirtysize, *pfull, 0);
notdirty_write(env_cpu(env), addr, 1, *pfull, 0);
flags &= ~TLB_NOTDIRTY;
}
@@ -1526,8 +1524,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
/* Handle clean RAM pages. */
if (unlikely(flags & TLB_NOTDIRTY)) {
int dirtysize = size == 0 ? 1 : size;
notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
notdirty_write(env_cpu(env), addr, 1, full, retaddr);
flags &= ~TLB_NOTDIRTY;
}
@@ -1563,7 +1560,7 @@ void *probe_access(CPUArchState *env, vaddr addr, int size,
/* Handle clean RAM pages. */
if (flags & TLB_NOTDIRTY) {
notdirty_write(env_cpu(env), addr, size, full, retaddr);
notdirty_write(env_cpu(env), addr, 1, full, retaddr);
}
}


@@ -47,7 +47,7 @@ static bool tb_cmp(const void *ap, const void *bp)
const TranslationBlock *a = ap;
const TranslationBlock *b = bp;
return (a->pc == b->pc &&
return ((tb_cflags(a) & CF_PCREL || a->pc == b->pc) &&
a->cs_base == b->cs_base &&
a->flags == b->flags &&
(tb_cflags(a) & ~CF_INVALID) == (tb_cflags(b) & ~CF_INVALID) &&
@@ -916,7 +916,7 @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
/* remove the TB from the hash list */
phys_pc = tb_page_addr0(tb);
h = tb_hash_func(phys_pc, tb->pc,
h = tb_hash_func(phys_pc, (orig_cflags & CF_PCREL ? 0 : tb->pc),
tb->flags, tb->cs_base, orig_cflags);
if (!qht_remove(&tb_ctx.htable, tb, h)) {
return;
@@ -983,7 +983,7 @@ TranslationBlock *tb_link_page(TranslationBlock *tb)
tb_record(tb);
/* add in the hash table */
h = tb_hash_func(tb_page_addr0(tb), tb->pc,
h = tb_hash_func(tb_page_addr0(tb), (tb->cflags & CF_PCREL ? 0 : tb->pc),
tb->flags, tb->cs_base, tb->cflags);
qht_insert(&tb_ctx.htable, tb, h, &existing_tb);
@@ -1083,7 +1083,8 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
if (current_tb_modified) {
/* Force execution of one insn next time. */
CPUState *cpu = current_cpu;
cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
cpu->cflags_next_tb =
1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(current_cpu);
return true;
}
return false;
@@ -1153,7 +1154,8 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
if (current_tb_modified) {
page_collection_unlock(pages);
/* Force execution of one insn next time. */
current_cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
current_cpu->cflags_next_tb =
1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(current_cpu);
mmap_unlock();
cpu_loop_exit_noexc(current_cpu);
}


@@ -304,7 +304,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
if (phys_pc == -1) {
/* Generate a one-shot TB with 1 insn in it */
cflags = (cflags & ~CF_COUNT_MASK) | 1;
cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
}
max_insns = cflags & CF_COUNT_MASK;
@@ -327,7 +327,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
gen_code_buf = tcg_ctx->code_gen_ptr;
tb->tc.ptr = tcg_splitwx_to_rx(gen_code_buf);
tb->pc = pc;
if (!(cflags & CF_PCREL)) {
tb->pc = pc;
}
tb->cs_base = cs_base;
tb->flags = flags;
tb->cflags = cflags;
@@ -630,7 +632,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
* operations only (which execute after completion) so we don't
* double instrument the instruction.
*/
cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | n;
cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | CF_LAST_IO | n;
if (qemu_loglevel_mask(CPU_LOG_EXEC)) {
vaddr pc = log_pc(cpu, tb);


@@ -89,7 +89,7 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
* each translation block. The cost is minimal, plus it would be
* very easy to forget doing it in the translator.
*/
set_can_do_io(db, db->max_insns == 1);
set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
return icount_start_insn;
}
@@ -151,7 +151,13 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
if (cflags & CF_MEMI_ONLY) {
/* We should only see CF_MEMI_ONLY for io_recompile. */
assert(cflags & CF_LAST_IO);
plugin_enabled = plugin_gen_tb_start(cpu, db, true);
} else {
plugin_enabled = plugin_gen_tb_start(cpu, db, false);
}
db->plugin_enabled = plugin_enabled;
while (true) {
@@ -163,13 +169,11 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
plugin_gen_insn_start(cpu, db);
}
/*
* Disassemble one instruction. The translate_insn hook should
* update db->pc_next and db->is_jmp to indicate what should be
* done next -- either exiting this loop or locate the start of
* the next instruction.
*/
if (db->num_insns == db->max_insns) {
/* Disassemble one instruction. The translate_insn hook should
update db->pc_next and db->is_jmp to indicate what should be
done next -- either exiting this loop or locate the start of
the next instruction. */
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
set_can_do_io(db, true);
}


@@ -1744,7 +1744,7 @@ static AudioState *audio_init(Audiodev *dev, Error **errp)
if (driver) {
done = !audio_driver_init(s, driver, dev, errp);
} else {
error_setg(errp, "Unknown audio driver `%s'", drvname);
error_setg(errp, "Unknown audio driver `%s'\n", drvname);
}
if (!done) {
goto out;
@@ -1758,15 +1758,12 @@ static AudioState *audio_init(Audiodev *dev, Error **errp)
goto out;
}
s->dev = dev = e->dev;
QSIMPLEQ_REMOVE_HEAD(&default_audiodevs, next);
g_free(e);
drvname = AudiodevDriver_str(dev->driver);
driver = audio_driver_lookup(drvname);
if (!audio_driver_init(s, driver, dev, NULL)) {
break;
}
qapi_free_Audiodev(dev);
s->dev = NULL;
QSIMPLEQ_REMOVE_HEAD(&default_audiodevs, next);
}
}


@@ -398,7 +398,6 @@ static void cryptodev_backend_set_ops(Object *obj, Visitor *v,
static void
cryptodev_backend_complete(UserCreatable *uc, Error **errp)
{
ERRP_GUARD();
CryptoDevBackend *backend = CRYPTODEV_BACKEND(uc);
CryptoDevBackendClass *bc = CRYPTODEV_BACKEND_GET_CLASS(uc);
uint32_t services;
@@ -407,20 +406,11 @@ cryptodev_backend_complete(UserCreatable *uc, Error **errp)
QTAILQ_INIT(&backend->opinfos);
value = backend->tc.buckets[THROTTLE_OPS_TOTAL].avg;
cryptodev_backend_set_throttle(backend, THROTTLE_OPS_TOTAL, value, errp);
if (*errp) {
return;
}
value = backend->tc.buckets[THROTTLE_BPS_TOTAL].avg;
cryptodev_backend_set_throttle(backend, THROTTLE_BPS_TOTAL, value, errp);
if (*errp) {
return;
}
if (bc->init) {
bc->init(backend, errp);
if (*errp) {
return;
}
}
services = backend->conf.crypto_services;


@@ -84,6 +84,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= backend->require_guest_memfd ? RAM_GUEST_MEMFD : 0;
ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
ram_flags |= RAM_NAMED_FILE;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,


@@ -55,6 +55,7 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= backend->require_guest_memfd ? RAM_GUEST_MEMFD : 0;
memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, fd, 0, errp);
g_free(name);


@@ -30,6 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= backend->require_guest_memfd ? RAM_GUEST_MEMFD : 0;
memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, errp);
g_free(name);


@@ -279,6 +279,7 @@ static void host_memory_backend_init(Object *obj)
/* TODO: convert access to globals to compat properties */
backend->merge = machine_mem_merge(machine);
backend->dump = machine_dump_guest_core(machine);
backend->require_guest_memfd = machine_require_guest_memfd(machine);
backend->reserve = true;
backend->prealloc_threads = machine->smp.cpus;
}

block.c

@@ -1713,7 +1713,7 @@ open_failed:
bdrv_unref_child(bs, bs->file);
assert(!bs->file);
}
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
g_free(bs->opaque);
bs->opaque = NULL;
@@ -3577,7 +3577,7 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
bdrv_drained_begin(drain_bs);
bdrv_graph_wrlock(backing_hd);
ret = bdrv_set_backing_hd_drained(bs, backing_hd, errp);
bdrv_graph_wrunlock(backing_hd);
bdrv_graph_wrunlock();
bdrv_drained_end(drain_bs);
bdrv_unref(drain_bs);
@@ -3796,7 +3796,7 @@ BdrvChild *bdrv_open_child(const char *filename,
child = bdrv_attach_child(parent, bs, bdref_key, child_class, child_role,
errp);
aio_context_release(ctx);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
return child;
}
@@ -4652,7 +4652,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
bdrv_graph_wrlock(NULL);
tran_commit(tran);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) {
BlockDriverState *bs = bs_entry->state.bs;
@@ -4671,7 +4671,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
abort:
bdrv_graph_wrlock(NULL);
tran_abort(tran);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
if (bs_entry->prepared) {
@@ -4857,7 +4857,7 @@ bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
ret = bdrv_set_file_or_backing_noperm(bs, new_child_bs, is_backing,
tran, errp);
bdrv_graph_wrunlock_ctx(ctx);
bdrv_graph_wrunlock();
if (old_ctx != ctx) {
aio_context_release(ctx);
@@ -5216,7 +5216,7 @@ static void bdrv_close(BlockDriverState *bs)
assert(!bs->backing);
assert(!bs->file);
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
g_free(bs->opaque);
bs->opaque = NULL;
@@ -5511,7 +5511,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
bdrv_drained_begin(child_bs);
bdrv_graph_wrlock(bs);
ret = bdrv_replace_node_common(bs, child_bs, true, true, errp);
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
bdrv_drained_end(child_bs);
return ret;
@@ -5593,7 +5593,7 @@ out:
tran_finalize(tran, ret);
bdrv_refresh_limits(bs_top, NULL, NULL);
bdrv_graph_wrunlock(bs_top);
bdrv_graph_wrunlock();
bdrv_drained_end(bs_top);
bdrv_drained_end(bs_new);
@@ -5631,7 +5631,7 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
tran_finalize(tran, ret);
bdrv_graph_wrunlock(new_bs);
bdrv_graph_wrunlock();
bdrv_drained_end(old_bs);
bdrv_drained_end(new_bs);
bdrv_unref(old_bs);
@@ -5720,7 +5720,7 @@ BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
bdrv_drained_begin(new_node_bs);
bdrv_graph_wrlock(new_node_bs);
ret = bdrv_replace_node(bs, new_node_bs, errp);
bdrv_graph_wrunlock(new_node_bs);
bdrv_graph_wrunlock();
bdrv_drained_end(new_node_bs);
bdrv_drained_end(bs);
bdrv_unref(bs);
@@ -6015,7 +6015,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
* That's a FIXME.
*/
bdrv_replace_node_common(top, base, false, false, &local_err);
bdrv_graph_wrunlock(base);
bdrv_graph_wrunlock();
if (local_err) {
error_report_err(local_err);
@@ -6052,7 +6052,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
goto exit;
exit_wrlock:
bdrv_graph_wrunlock(base);
bdrv_graph_wrunlock();
exit:
bdrv_drained_end(base);
bdrv_unref(top);
@@ -7254,16 +7254,6 @@ void bdrv_unref(BlockDriverState *bs)
}
}
static void bdrv_schedule_unref_bh(void *opaque)
{
BlockDriverState *bs = opaque;
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
bdrv_unref(bs);
aio_context_release(ctx);
}
/*
* Release a BlockDriverState reference while holding the graph write lock.
*
@@ -7277,7 +7267,8 @@ void bdrv_schedule_unref(BlockDriverState *bs)
if (!bs) {
return;
}
aio_bh_schedule_oneshot(qemu_get_aio_context(), bdrv_schedule_unref_bh, bs);
aio_bh_schedule_oneshot(qemu_get_aio_context(),
(QEMUBHFunc *) bdrv_unref, bs);
}
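A usage sketch (child_bs is illustrative): this lets a write-locked section drop a reference without re-entering the graph lock, since the real bdrv_unref() runs later from a main-loop bottom half:
bdrv_graph_wrlock(NULL);
/* ... detach child_bs from the graph under the write lock ... */
bdrv_schedule_unref(child_bs);
bdrv_graph_wrunlock();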
struct BdrvOpBlocker {


@@ -499,7 +499,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
bdrv_graph_wrlock(target);
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
bdrv_graph_wrunlock(target);
bdrv_graph_wrunlock();
return &job->common;


@@ -253,7 +253,7 @@ fail_log:
if (ret < 0) {
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, s->log_file);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
s->log_file = NULL;
}
fail:
@@ -268,7 +268,7 @@ static void blk_log_writes_close(BlockDriverState *bs)
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, s->log_file);
s->log_file = NULL;
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
}
static int64_t coroutine_fn GRAPH_RDLOCK
@@ -328,39 +328,22 @@ static void coroutine_fn GRAPH_RDLOCK
blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
{
BDRVBlkLogWritesState *s = lr->bs->opaque;
/*
* Determine the offsets and sizes of different parts of the entry, and
* update the state of the driver.
*
* This needs to be done in one go, before any actual I/O is done, as the
* log entry may have to be written in two parts, and the state of the
* driver may be modified by other driver operations while waiting for the
* I/O to complete.
*/
const uint64_t entry_start_sector = s->cur_log_sector;
const uint64_t entry_offset = entry_start_sector << s->sectorbits;
const uint64_t qiov_aligned_size = ROUND_UP(lr->qiov->size, s->sectorsize);
const uint64_t entry_aligned_size = qiov_aligned_size +
ROUND_UP(lr->zero_size, s->sectorsize);
const uint64_t entry_nr_sectors = entry_aligned_size >> s->sectorbits;
uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
s->nr_entries++;
s->cur_log_sector += entry_nr_sectors;
s->cur_log_sector +=
ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
/*
* Write the log entry. Note that if this is a "write zeroes" operation,
* only the entry header is written here, with the zeroing being done
* separately below.
*/
lr->log_ret = bdrv_co_pwritev(s->log_file, entry_offset, lr->qiov->size,
lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
lr->qiov, 0);
/* Logging for the "write zeroes" operation */
if (lr->log_ret == 0 && lr->zero_size) {
const uint64_t zeroes_offset = entry_offset + qiov_aligned_size;
cur_log_offset = s->cur_log_sector << s->sectorbits;
s->cur_log_sector +=
ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, zeroes_offset,
lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
lr->zero_size, 0);
}


@@ -154,7 +154,7 @@ static void blkverify_close(BlockDriverState *bs)
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, s->test_file);
s->test_file = NULL;
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
}
static int64_t coroutine_fn GRAPH_RDLOCK


@@ -882,14 +882,11 @@ BlockBackend *blk_by_public(BlockBackendPublic *public)
/*
* Disassociates the currently associated BlockDriverState from @blk.
*
* The caller must hold the AioContext lock for the BlockBackend.
*/
void blk_remove_bs(BlockBackend *blk)
{
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
BdrvChild *root;
AioContext *ctx;
GLOBAL_STATE_CODE();
@@ -919,10 +916,9 @@ void blk_remove_bs(BlockBackend *blk)
root = blk->root;
blk->root = NULL;
ctx = bdrv_get_aio_context(root->bs);
bdrv_graph_wrlock(root->bs);
bdrv_graph_wrlock(NULL);
bdrv_root_unref_child(root);
bdrv_graph_wrunlock_ctx(ctx);
bdrv_graph_wrunlock();
}
/*
@@ -933,8 +929,6 @@ void blk_remove_bs(BlockBackend *blk)
int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
{
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
AioContext *ctx = bdrv_get_aio_context(bs);
GLOBAL_STATE_CODE();
bdrv_ref(bs);
bdrv_graph_wrlock(bs);
@@ -942,7 +936,7 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
blk->perm, blk->shared_perm,
blk, errp);
bdrv_graph_wrunlock_ctx(ctx);
bdrv_graph_wrunlock();
if (blk->root == NULL) {
return -EPERM;
}


@@ -102,7 +102,7 @@ static void commit_abort(Job *job)
bdrv_drained_begin(commit_top_backing_bs);
bdrv_graph_wrlock(commit_top_backing_bs);
bdrv_replace_node(s->commit_top_bs, commit_top_backing_bs, &error_abort);
bdrv_graph_wrunlock(commit_top_backing_bs);
bdrv_graph_wrunlock();
bdrv_drained_end(commit_top_backing_bs);
bdrv_unref(s->commit_top_bs);
@@ -370,19 +370,19 @@ void commit_start(const char *job_id, BlockDriverState *bs,
ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
iter_shared_perms, errp);
if (ret < 0) {
bdrv_graph_wrunlock(top);
bdrv_graph_wrunlock();
goto fail;
}
}
if (bdrv_freeze_backing_chain(commit_top_bs, base, errp) < 0) {
bdrv_graph_wrunlock(top);
bdrv_graph_wrunlock();
goto fail;
}
s->chain_frozen = true;
ret = block_job_add_bdrv(&s->common, "base", base, 0, BLK_PERM_ALL, errp);
bdrv_graph_wrunlock(top);
bdrv_graph_wrunlock();
if (ret < 0) {
goto fail;
@@ -436,7 +436,7 @@ fail:
bdrv_drained_begin(top);
bdrv_graph_wrlock(top);
bdrv_replace_node(commit_top_bs, top, &error_abort);
bdrv_graph_wrunlock(top);
bdrv_graph_wrunlock();
bdrv_drained_end(top);
}
}


@@ -283,7 +283,6 @@ static void vu_blk_drained_begin(void *opaque)
{
VuBlkExport *vexp = opaque;
vexp->vu_server.quiescing = true;
vhost_user_server_detach_aio_context(&vexp->vu_server);
}
@@ -292,23 +291,19 @@ static void vu_blk_drained_end(void *opaque)
{
VuBlkExport *vexp = opaque;
vexp->vu_server.quiescing = false;
vhost_user_server_attach_aio_context(&vexp->vu_server, vexp->export.ctx);
}
/*
* Ensures that bdrv_drained_begin() waits until in-flight requests complete
* and the server->co_trip coroutine has terminated. It will be restarted in
* vhost_user_server_attach_aio_context().
* Ensures that bdrv_drained_begin() waits until in-flight requests complete.
*
* Called with vexp->export.ctx acquired.
*/
static bool vu_blk_drained_poll(void *opaque)
{
VuBlkExport *vexp = opaque;
VuServer *server = &vexp->vu_server;
return server->co_trip || vhost_user_server_has_in_flight(server);
return vhost_user_server_has_in_flight(&vexp->vu_server);
}
static const BlockDevOps vu_blk_dev_ops = {


@@ -161,21 +161,11 @@ void no_coroutine_fn bdrv_graph_wrlock(BlockDriverState *bs)
}
}
void no_coroutine_fn bdrv_graph_wrunlock_ctx(AioContext *ctx)
void bdrv_graph_wrunlock(void)
{
GLOBAL_STATE_CODE();
assert(qatomic_read(&has_writer));
/*
* Release only non-mainloop AioContext. The mainloop often relies on the
* BQL and doesn't lock the main AioContext before doing things.
*/
if (ctx && ctx != qemu_get_aio_context()) {
aio_context_release(ctx);
} else {
ctx = NULL;
}
WITH_QEMU_LOCK_GUARD(&aio_context_list_lock) {
/*
* No need for memory barriers, this works in pair with
@@ -197,17 +187,6 @@ void no_coroutine_fn bdrv_graph_wrunlock_ctx(AioContext *ctx)
* progress.
*/
aio_bh_poll(qemu_get_aio_context());
if (ctx) {
aio_context_acquire(ctx);
}
}
void no_coroutine_fn bdrv_graph_wrunlock(BlockDriverState *bs)
{
AioContext *ctx = bs ? bdrv_get_aio_context(bs) : NULL;
bdrv_graph_wrunlock_ctx(ctx);
}
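Either way, call sites follow the same bracket pattern; a condensed sketch (from_bs/to_bs are illustrative), noting that the unlock also polls main-loop BHs, so unrefs scheduled under the lock finish before the caller continues:
bdrv_graph_wrlock(NULL);
bdrv_replace_node(from_bs, to_bs, &error_abort);
bdrv_graph_wrunlock();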
void coroutine_fn bdrv_graph_co_rdlock(void)


@@ -2619,16 +2619,6 @@ bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
ret |= (ret2 & BDRV_BLOCK_ZERO);
}
}
/*
* Now that the recursive search was done, clear the flag. Otherwise,
* with more complicated block graphs like snapshot-access ->
* copy-before-write -> qcow2, where the return value will be propagated
* further up to a parent bdrv_co_do_block_status() call, both the
* BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO flags would be set, which is
* not allowed.
*/
ret &= ~BDRV_BLOCK_RECURSE;
}
out:


@@ -773,7 +773,7 @@ static int mirror_exit_common(Job *job)
"would not lead to an abrupt change of visible data",
to_replace->node_name, target_bs->node_name);
}
bdrv_graph_wrunlock(target_bs);
bdrv_graph_wrunlock();
bdrv_drained_end(to_replace);
if (local_err) {
error_report_err(local_err);
@@ -798,7 +798,7 @@ static int mirror_exit_common(Job *job)
block_job_remove_all_bdrv(bjob);
bdrv_graph_wrlock(mirror_top_bs);
bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
bdrv_graph_wrunlock(mirror_top_bs);
bdrv_graph_wrunlock();
bdrv_drained_end(target_bs);
bdrv_unref(target_bs);
@@ -1920,7 +1920,7 @@ static BlockJob *mirror_start_job(
BLK_PERM_CONSISTENT_READ,
errp);
if (ret < 0) {
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
goto fail;
}
@@ -1965,17 +1965,17 @@ static BlockJob *mirror_start_job(
ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
iter_shared_perms, errp);
if (ret < 0) {
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
goto fail;
}
}
if (bdrv_freeze_backing_chain(mirror_top_bs, target, errp) < 0) {
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
goto fail;
}
}
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
QTAILQ_INIT(&s->ops_in_flight);
@@ -2006,7 +2006,7 @@ fail:
bdrv_child_refresh_perms(mirror_top_bs, mirror_top_bs->backing,
&error_abort);
bdrv_replace_node(mirror_top_bs, bs, &error_abort);
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
bdrv_drained_end(bs);
bdrv_unref(mirror_top_bs);


@@ -2809,7 +2809,7 @@ qcow2_do_close(BlockDriverState *bs, bool close_data_file)
bdrv_graph_rdunlock_main_loop();
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, s->data_file);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
s->data_file = NULL;
bdrv_graph_rdlock_main_loop();
}


@@ -1044,7 +1044,7 @@ close_exit:
}
bdrv_unref_child(bs, s->children[i]);
}
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
g_free(s->children);
g_free(opened);
exit:
@@ -1061,7 +1061,7 @@ static void quorum_close(BlockDriverState *bs)
for (i = 0; i < s->num_children; i++) {
bdrv_unref_child(bs, s->children[i]);
}
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
g_free(s->children);
}


@@ -568,7 +568,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
aio_context_release(aio_context);
return;
}
@@ -579,7 +579,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
BDRV_CHILD_DATA, &local_err);
if (local_err) {
error_propagate(errp, local_err);
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
aio_context_release(aio_context);
return;
}
@@ -592,7 +592,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
if (!top_bs || !bdrv_is_root_node(top_bs) ||
!check_top_bs(top_bs, bs)) {
error_setg(errp, "No top_bs or it is invalid");
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
reopen_backing_file(bs, false, NULL);
aio_context_release(aio_context);
return;
@@ -600,7 +600,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
bdrv_op_block_all(top_bs, s->blocker);
bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
s->backup_job = backup_job_create(
NULL, s->secondary_disk->bs, s->hidden_disk->bs,
@@ -696,7 +696,7 @@ static void replication_done(void *opaque, int ret)
s->secondary_disk = NULL;
bdrv_unref_child(bs, s->hidden_disk);
s->hidden_disk = NULL;
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
s->error = 0;
} else {


@@ -196,10 +196,8 @@ bdrv_snapshot_fallback(BlockDriverState *bs)
int bdrv_can_snapshot(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
GLOBAL_STATE_CODE();
if (!drv || !bdrv_is_inserted(bs) || !bdrv_is_writable(bs)) {
if (!drv || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
return 0;
}
@@ -294,7 +292,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
/* .bdrv_open() will re-attach it */
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, fallback);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
ret = bdrv_snapshot_goto(fallback_bs, snapshot_id, errp);
open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err);


@@ -99,9 +99,9 @@ static int stream_prepare(Job *job)
}
}
bdrv_graph_wrlock(s->target_bs);
bdrv_graph_wrlock(base);
bdrv_set_backing_hd_drained(unfiltered_bs, base, &local_err);
bdrv_graph_wrunlock(s->target_bs);
bdrv_graph_wrunlock();
/*
* This call will do I/O, so the graph can change again from here on.
@@ -369,7 +369,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
bdrv_graph_wrlock(bs);
if (block_job_add_bdrv(&s->common, "active node", bs, 0,
basic_flags | BLK_PERM_WRITE, errp)) {
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
goto fail;
}
@@ -389,11 +389,11 @@ void stream_start(const char *job_id, BlockDriverState *bs,
ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
basic_flags, errp);
if (ret < 0) {
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
goto fail;
}
}
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
s->base_overlay = base_overlay;
s->above_base = above_base;


@@ -283,7 +283,7 @@ static void vmdk_free_extents(BlockDriverState *bs)
bdrv_unref_child(bs, e->file);
}
}
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
g_free(s->extents);
}
@@ -351,41 +351,29 @@ vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
BDRVVmdkState *s = bs->opaque;
int ret = 0;
size_t desc_buf_size;
if (s->desc_offset == 0) {
desc_buf_size = bdrv_getlength(bs->file->bs);
if (desc_buf_size > 16ULL << 20) {
error_report("VMDK description file too big");
return -EFBIG;
}
} else {
desc_buf_size = DESC_SIZE;
}
desc = g_malloc0(desc_buf_size);
tmp_desc = g_malloc0(desc_buf_size);
ret = bdrv_co_pread(bs->file, s->desc_offset, desc_buf_size, desc, 0);
desc = g_malloc0(DESC_SIZE);
tmp_desc = g_malloc0(DESC_SIZE);
ret = bdrv_co_pread(bs->file, s->desc_offset, DESC_SIZE, desc, 0);
if (ret < 0) {
goto out;
}
desc[desc_buf_size - 1] = '\0';
desc[DESC_SIZE - 1] = '\0';
tmp_str = strstr(desc, "parentCID");
if (tmp_str == NULL) {
ret = -EINVAL;
goto out;
}
pstrcpy(tmp_desc, desc_buf_size, tmp_str);
pstrcpy(tmp_desc, DESC_SIZE, tmp_str);
p_name = strstr(desc, "CID");
if (p_name != NULL) {
p_name += sizeof("CID");
snprintf(p_name, desc_buf_size - (p_name - desc), "%" PRIx32 "\n", cid);
pstrcat(desc, desc_buf_size, tmp_desc);
snprintf(p_name, DESC_SIZE - (p_name - desc), "%" PRIx32 "\n", cid);
pstrcat(desc, DESC_SIZE, tmp_desc);
}
ret = bdrv_co_pwrite_sync(bs->file, s->desc_offset, desc_buf_size, desc, 0);
ret = bdrv_co_pwrite_sync(bs->file, s->desc_offset, DESC_SIZE, desc, 0);
out:
g_free(desc);
@@ -1249,7 +1237,7 @@ vmdk_parse_extents(const char *desc, BlockDriverState *bs, QDict *options,
bdrv_graph_rdunlock_main_loop();
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, extent_file);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
bdrv_graph_rdlock_main_loop();
goto out;
}
@@ -1268,7 +1256,7 @@ vmdk_parse_extents(const char *desc, BlockDriverState *bs, QDict *options,
bdrv_graph_rdunlock_main_loop();
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, extent_file);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
bdrv_graph_rdlock_main_loop();
goto out;
}
@@ -1279,7 +1267,7 @@ vmdk_parse_extents(const char *desc, BlockDriverState *bs, QDict *options,
bdrv_graph_rdunlock_main_loop();
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, extent_file);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
bdrv_graph_rdlock_main_loop();
goto out;
}
@@ -1289,7 +1277,7 @@ vmdk_parse_extents(const char *desc, BlockDriverState *bs, QDict *options,
bdrv_graph_rdunlock_main_loop();
bdrv_graph_wrlock(NULL);
bdrv_unref_child(bs, extent_file);
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
bdrv_graph_rdlock_main_loop();
ret = -ENOTSUP;
goto out;


@@ -1613,7 +1613,7 @@ static void external_snapshot_abort(void *opaque)
bdrv_drained_begin(state->new_bs);
bdrv_graph_wrlock(state->old_bs);
bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
bdrv_graph_wrunlock(state->old_bs);
bdrv_graph_wrunlock();
bdrv_drained_end(state->new_bs);
bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */
@@ -2400,9 +2400,8 @@ void coroutine_fn qmp_block_resize(const char *device, const char *node_name,
bdrv_co_lock(bs);
bdrv_drained_end(bs);
bdrv_co_unlock(bs);
blk_co_unref(blk);
bdrv_co_unlock(bs);
}
void qmp_block_stream(const char *job_id, const char *device,
@@ -3693,7 +3692,7 @@ void qmp_x_blockdev_change(const char *parent, const char *child,
}
out:
bdrv_graph_wrunlock(NULL);
bdrv_graph_wrunlock();
}
BlockJobInfoList *qmp_query_block_jobs(Error **errp)


@@ -212,7 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
g_slist_free_1(l);
}
bdrv_graph_wrunlock_ctx(job->job.aio_context);
bdrv_graph_wrunlock();
}
bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
@@ -523,7 +523,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
job = job_create(job_id, &driver->job_driver, txn, bdrv_get_aio_context(bs),
flags, cb, opaque, errp);
if (job == NULL) {
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
return NULL;
}
@@ -563,11 +563,11 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
goto fail;
}
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
return job;
fail:
bdrv_graph_wrunlock(bs);
bdrv_graph_wrunlock();
job_early_fail(&job->job);
return NULL;
}


@@ -235,7 +235,7 @@ static inline abi_long do_obreak(abi_ulong brk_val)
return target_brk;
}
/* Release heap if necessary */
/* Release heap if necesary */
if (new_brk < old_brk) {
target_munmap(new_brk, old_brk - new_brk);


@@ -115,7 +115,7 @@ abi_long freebsd_exec_common(abi_ulong path_or_fd, abi_ulong guest_argp,
}
qarg0 = argp = g_new0(char *, argc + 9);
/* save the first argument for the emulator */
/* save the first agrument for the emulator */
*argp++ = (char *)getprogname();
qargp = argp;
*argp++ = (char *)getprogname();


@@ -146,7 +146,7 @@ static inline abi_long do_freebsd_fstatat(abi_long arg1, abi_long arg2,
return ret;
}
/* undocumented nstat(char *path, struct nstat *ub) syscall */
/* undocummented nstat(char *path, struct nstat *ub) syscall */
static abi_long do_freebsd11_nstat(abi_long arg1, abi_long arg2)
{
abi_long ret;
@@ -162,7 +162,7 @@ static abi_long do_freebsd11_nstat(abi_long arg1, abi_long arg2)
return ret;
}
/* undocumented nfstat(int fd, struct nstat *sb) syscall */
/* undocummented nfstat(int fd, struct nstat *sb) syscall */
static abi_long do_freebsd11_nfstat(abi_long arg1, abi_long arg2)
{
abi_long ret;
@@ -175,7 +175,7 @@ static abi_long do_freebsd11_nfstat(abi_long arg1, abi_long arg2)
return ret;
}
/* undocumented nlstat(char *path, struct nstat *ub) syscall */
/* undocummented nlstat(char *path, struct nstat *ub) syscall */
static abi_long do_freebsd11_nlstat(abi_long arg1, abi_long arg2)
{
abi_long ret;


@@ -518,7 +518,7 @@ static const ChardevClass *char_get_class(const char *driver, Error **errp)
if (object_class_is_abstract(oc)) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "driver",
"a non-abstract device type");
"an abstract device type");
return NULL;
}


@@ -18,6 +18,7 @@
#CONFIG_QXL=n
#CONFIG_SEV=n
#CONFIG_SGA=n
#CONFIG_TDX=n
#CONFIG_TEST_DEVICES=n
#CONFIG_TPM_CRB=n
#CONFIG_TPM_TIS_ISA=n

configure

@@ -41,7 +41,12 @@ then
# This file is auto-generated by configure to support in-source tree
# 'make' command invocation
build:
ifeq ($(MAKECMDGOALS),)
recurse: all
endif
.NOTPARALLEL: %
%: force
@echo 'changing dir to build for $(MAKE) "$(MAKECMDGOALS)"...'
@$(MAKE) -C build -f Makefile $(MAKECMDGOALS)
@if test "$(MAKECMDGOALS)" = "distclean" && \
@@ -49,9 +54,8 @@ build:
then \
rm -rf build GNUmakefile ; \
fi
%: build
@
.PHONY: build
force: ;
.PHONY: force
GNUmakefile: ;
EOF
@@ -964,14 +968,14 @@ meson="$(cd pyvenv/bin; pwd)/meson"
# Conditionally ensure Sphinx is installed.
mkvenv_online_flag=""
if test "$download" = "enabled" ; then
mkvenv_online_flag=" --online"
mkvenv_flags=""
if test "$download" = "enabled" -a "$docs" = "enabled" ; then
mkvenv_flags="--online"
fi
if test "$docs" != "disabled" ; then
if ! $mkvenv ensuregroup \
$(test "$docs" = "enabled" && echo "$mkvenv_online_flag") \
$mkvenv_flags \
${source_path}/pythondeps.toml docs;
then
if test "$docs" = "enabled" ; then
@@ -1303,8 +1307,8 @@ probe_target_compiler() {
container_cross_cc=${container_cross_prefix}gcc
;;
i386)
container_image=debian-i686-cross
container_cross_prefix=i686-linux-gnu-
container_image=fedora-i386-cross
container_cross_prefix=
;;
loongarch64)
container_image=debian-loongarch-cross
@@ -1387,19 +1391,16 @@ probe_target_compiler() {
done
try=cross
# For softmmu/roms also look for a bi-endian or multilib-enabled host compiler
if [ "${1%softmmu}" != "$1" ] || test "$target_arch" = "$cpu"; then
case "$target_arch:$cpu" in
aarch64_be:aarch64 | \
armeb:arm | \
i386:x86_64 | \
mips*:mips64 | \
ppc*:ppc64 | \
sparc:sparc64 | \
"$cpu:$cpu")
try='native cross' ;;
esac
fi
case "$target_arch:$cpu" in
aarch64_be:aarch64 | \
armeb:arm | \
i386:x86_64 | \
mips*:mips64 | \
ppc*:ppc64 | \
sparc:sparc64 | \
"$cpu:$cpu")
try='native cross' ;;
esac
eval "target_cflags=\${cross_cc_cflags_$target_arch}"
for thistry in $try; do
case $thistry in
@@ -1630,7 +1631,6 @@ if test "$container" != no; then
fi
echo "SUBDIRS=$subdirs" >> $config_host_mak
echo "PYTHON=$python" >> $config_host_mak
echo "MKVENV_ENSUREGROUP=$mkvenv ensuregroup $mkvenv_online_flag" >> $config_host_mak
echo "GENISOIMAGE=$genisoimage" >> $config_host_mak
echo "MESON=$meson" >> $config_host_mak
echo "NINJA=$ninja" >> $config_host_mak


@@ -49,7 +49,7 @@ all: $(SONAMES)
$(CC) $(CFLAGS) $(PLUGIN_CFLAGS) -c -o $@ $<
ifeq ($(CONFIG_WIN32),y)
lib%$(SO_SUFFIX): %.o win32_linker.o ../../plugins/libqemu_plugin_api.a
lib%$(SO_SUFFIX): %.o win32_linker.o ../../plugins/qemu_plugin_api.lib
$(CC) -shared -o $@ $^ $(LDLIBS)
else ifeq ($(CONFIG_DARWIN),y)
lib%$(SO_SUFFIX): %.o


@@ -401,7 +401,7 @@ virgl_cmd_set_scanout(VuGpu *g,
if (g->use_modifiers) {
/*
* The message uses all the fields set in dmabuf_scanout plus
* The mesage uses all the fields set in dmabuf_scanout plus
* modifiers which is appended after VhostUserGpuDMABUFScanout.
*/
msg.request = VHOST_USER_GPU_DMABUF_SCANOUT2;


@@ -1731,10 +1731,10 @@ format_hex (unsigned long number,
unsigned (== 0). */
static char *
format_dec (long number, char *outbuffer, size_t outsize, int signedp)
format_dec (long number, char *outbuffer, int signedp)
{
last_immediate = number;
snprintf (outbuffer, outsize, signedp ? "%ld" : "%lu", number);
sprintf (outbuffer, signedp ? "%ld" : "%lu", number);
return outbuffer + strlen (outbuffer);
}
@@ -1876,12 +1876,6 @@ print_flags (struct cris_disasm_data *disdata, unsigned int insn, char *cp)
return cp;
}
#define FORMAT_DEC(number, tp, signedp) \
format_dec (number, tp, ({ \
assert(tp >= temp && tp <= temp + sizeof(temp)); \
temp + sizeof(temp) - tp; \
}), signedp)
/* Print out an insn with its operands, and update the info->insn_type
fields. The prefix_opcodep and the rest hold a prefix insn that is
supposed to be output as an address mode. */
@@ -2111,7 +2105,7 @@ print_with_operands (const struct cris_opcode *opcodep,
if ((*cs == 'z' && (insn & 0x20))
|| (opcodep->match == BDAP_QUICK_OPCODE
&& (nbytes <= 2 || buffer[1 + nbytes] == 0)))
tp = FORMAT_DEC (number, tp, signedp);
tp = format_dec (number, tp, signedp);
else
{
unsigned int highbyte = (number >> 24) & 0xff;
@@ -2247,7 +2241,7 @@ print_with_operands (const struct cris_opcode *opcodep,
with_reg_prefix);
if (number >= 0)
*tp++ = '+';
tp = FORMAT_DEC (number, tp, 1);
tp = format_dec (number, tp, 1);
info->flags |= CRIS_DIS_FLAG_MEM_TARGET_IS_REG;
info->target = (prefix_insn >> 12) & 15;
@@ -2346,7 +2340,7 @@ print_with_operands (const struct cris_opcode *opcodep,
{
if (number >= 0)
*tp++ = '+';
tp = FORMAT_DEC (number, tp, 1);
tp = format_dec (number, tp, 1);
}
}
else
@@ -2403,7 +2397,7 @@ print_with_operands (const struct cris_opcode *opcodep,
break;
case 'I':
tp = FORMAT_DEC (insn & 63, tp, 0);
tp = format_dec (insn & 63, tp, 0);
break;
case 'b':
@@ -2432,11 +2426,11 @@ print_with_operands (const struct cris_opcode *opcodep,
break;
case 'c':
tp = FORMAT_DEC (insn & 31, tp, 0);
tp = format_dec (insn & 31, tp, 0);
break;
case 'C':
tp = FORMAT_DEC (insn & 15, tp, 0);
tp = format_dec (insn & 15, tp, 0);
break;
case 'o':
@@ -2469,7 +2463,7 @@ print_with_operands (const struct cris_opcode *opcodep,
if (number > 127)
number = number - 256;
tp = FORMAT_DEC (number, tp, 1);
tp = format_dec (number, tp, 1);
*tp++ = ',';
tp = format_reg (disdata, (insn >> 12) & 15, tp, with_reg_prefix);
}
@@ -2480,7 +2474,7 @@ print_with_operands (const struct cris_opcode *opcodep,
break;
case 'i':
tp = FORMAT_DEC ((insn & 32) ? (insn & 31) | ~31L : insn & 31, tp, 1);
tp = format_dec ((insn & 32) ? (insn & 31) | ~31L : insn & 31, tp, 1);
break;
case 'P':


@@ -1968,10 +1968,6 @@ print_insn_hppa (bfd_vma memaddr, disassemble_info *info)
insn = bfd_getb32 (buffer);
info->fprintf_func(info->stream, " %02x %02x %02x %02x ",
(insn >> 24) & 0xff, (insn >> 16) & 0xff,
(insn >> 8) & 0xff, insn & 0xff);
for (i = 0; i < NUMOPCODES; ++i)
{
const struct pa_opcode *opcode = &pa_opcodes[i];
@@ -2830,6 +2826,6 @@ print_insn_hppa (bfd_vma memaddr, disassemble_info *info)
return sizeof (insn);
}
}
info->fprintf_func(info->stream, "<unknown>");
(*info->fprintf_func) (info->stream, "#%8x", insn);
return sizeof (insn);
}


@@ -236,16 +236,6 @@ it. Since all recent x86 hardware from the past >10 years is capable of the
64-bit x86 extensions, a corresponding 64-bit OS should be used instead.
System emulator CPUs
--------------------
Nios II CPU (since 8.2)
'''''''''''''''''''''''
The Nios II architecture is orphan. The ``nios2`` guest CPU support is
deprecated and will be removed in a future version of QEMU.
System emulator machines
------------------------
@@ -264,11 +254,6 @@ These old machine types are quite neglected nowadays and thus might have
various pitfalls with regards to live migration. Use a newer machine type
instead.
Nios II ``10m50-ghrd`` and ``nios2-generic-nommu`` machines (since 8.2)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
The Nios II architecture is orphan.
Backend options
---------------
@@ -529,5 +514,5 @@ old compression method (since 8.2)
Compression method fails too much. Too many races. We are going to
remove it if nobody fixes it. For starters, migration-test
compression tests are disabled because they fail randomly. If you need
compression tests are disabled becase they fail randomly. If you need
compression, use multifd compression methods.


@@ -129,9 +129,8 @@ causing most hypervisors to trap and fault on them.
.. warning::
Semihosting inherently bypasses any isolation there may be between
the guest and the host. As a result a program using semihosting can
happily trash your host system. Some semihosting calls (e.g.
``SYS_READC``) can block execution indefinitely. You should only
ever run trusted code with semihosting enabled.
happily trash your host system. You should only ever run trusted
code with semihosting enabled.
Redirection
~~~~~~~~~~~


@@ -122,78 +122,10 @@ functioning. These are performed using a few more helper functions:
indicated by $TMPC.
Python virtual environments and the build process
-------------------------------------------------
An important step in ``configure`` is to create a Python virtual
environment (venv) during the configuration phase. The Python interpreter
comes from the ``--python`` command line option, the ``$PYTHON`` variable
from the environment, or the system PATH, in this order. The venv resides
in the ``pyvenv`` directory in the build tree, and provides consistency
in how the build process runs Python code.
At this stage, ``configure`` also queries the chosen Python interpreter
about QEMU's build dependencies. Note that the build process does *not*
look for ``meson``, ``sphinx-build`` or ``avocado`` binaries in the PATH;
likewise, there are no options such as ``--meson`` or ``--sphinx-build``.
This avoids a potential mismatch, where Meson and Sphinx binaries on the
PATH might operate in a different Python environment than the one chosen
by the user during the build process. On the other hand, it introduces
a potential source of confusion where the user installs a dependency but
``configure`` is not able to find it. When this happens, the dependency
was installed in the ``site-packages`` directory of another interpreter,
or with the wrong ``pip`` program.
If a package is available for the chosen interpreter, ``configure``
prepares a small script that invokes it from the venv itself [#distlib]_.
If not, ``configure`` can also optionally install dependencies in the
virtual environment with ``pip``, either from wheels in ``python/wheels``
or by downloading the package with PyPI. Downloading can be disabled with
``--disable-download``; and anyway, it only happens when a ``configure``
option (currently, only ``--enable-docs``) is explicitly enabled but
the dependencies are not present [#pip]_.
.. [#distlib] The scripts are created based on the package's metadata,
specifically the ``console_script`` entry points. This is the
same mechanism that ``pip`` uses when installing a package.
Currently, in all cases it would be possible to use ``python -m``
instead of an entry point script, which makes this approach a
bit overkill. On the other hand, creating the scripts is
future proof and it makes the contents of the ``pyvenv/bin``
directory more informative. Portability is also not an issue,
because the Python Packaging Authority provides a package
``distlib.scripts`` to perform this task.
.. [#pip] ``pip`` might also be used when running ``make check-avocado``
if downloading is enabled, to ensure that Avocado is
available.
The required versions of the packages are stored in a configuration file
``pythondeps.toml``. The format is custom to QEMU, but it is documented
at the top of the file itself and it should be easy to understand. The
requirements should make it possible to use the version that is
provided by supported distros.
When dependencies are downloaded, instead, ``configure`` uses a "known
good" version that is also listed in ``pythondeps.toml``. In this
scenario, ``pythondeps.toml`` behaves like the "lock file" used by
``cargo``, ``poetry`` or other dependency management systems.
Bundled Python packages
-----------------------
Python packages that are **mandatory** dependencies to build QEMU,
but are not available in all supported distros, are bundled with the
QEMU sources. Currently this includes Meson (outdated in CentOS 8
and derivatives, Ubuntu 20.04 and 22.04, and openSUSE Leap) and tomli
(absent in Ubuntu 20.04).
If you need to update these, please do so by modifying and rerunning
``python/scripts/vendor.py``. This script embeds the sha256 hash of
package sources and checks it. The pypi.org web site provides an easy
way to retrieve the sha256 hash of the sources.
Python virtual environments and the QEMU build system
-----------------------------------------------------
TBD
Stage 2: Meson
==============
@@ -444,15 +376,6 @@ This is needed to obey the --python= option passed to the configure
script, which may point to something other than the first python3
binary on the path.
By the time Meson runs, Python dependencies are available in the virtual
environment and should be invoked through the scripts that ``configure``
places under ``pyvenv``. One way to do so is as follows, using Meson's
``find_program`` function::
sphinx_build = find_program(
fs.parent(python.full_path()) / 'sphinx-build',
required: get_option('docs'))
Stage 3: Make
=============
@@ -511,11 +434,6 @@ number of dynamically created files listed later.
executables. Build rules for various subdirectories are included in
other meson.build files spread throughout the QEMU source tree.
``python/scripts/mkvenv.py``
A wrapper for the Python ``venv`` and ``distlib.scripts`` packages.
It handles creating the virtual environment, creating scripts in
``pyvenv/bin``, and calling ``pip`` to install dependencies.
``tests/Makefile.include``
Rules for external test harnesses. These include the TCG tests
and the Avocado-based integration tests.


@@ -1061,7 +1061,7 @@ QEMU version, in this case pc-5.1.
4 - qemu-5.1 -M pc-5.2 -> migrates to -> qemu-5.1 -M pc-5.2
This combination is not possible as the qemu-5.1 doesn't understand
This combination is not possible as the qemu-5.1 doen't understand
pc-5.2 machine type. So nothing to worry here.
Now it comes the interesting ones, when both QEMU processes are
@@ -1214,8 +1214,8 @@ machine types to have the right value::
...
};
A device with different features on both sides
----------------------------------------------
A device with diferent features on both sides
---------------------------------------------
Let's assume that we are using the same QEMU binary on both sides,
just to make the things easier. But we have a device that has
@@ -1294,12 +1294,12 @@ Host B:
$ qemu-system-x86_64 -cpu host,taa-no=off
And you would be able to migrate between them. It is responsibility
And you would be able to migrate between them. It is responsability
of the management application or of the user to make sure that the
configuration is correct. QEMU doesn't know how to look at this kind
of features in general.
Notice that we don't recommend to use -cpu host for migration. It is
Notice that we don't recomend to use -cpu host for migration. It is
used in this example because it makes the example simpler.
Other devices have worse control about individual features. If they


@@ -15,7 +15,7 @@ have default values:
-smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
-device z14-s390x-cpu,core-id=19,entitlement=high \
-device z14-s390x-cpu,core-id=11,entitlement=low \
-device z14-s390x-cpu,core-id=12,entitlement=high \
-device z14-s390x-cpu,core-id=112,entitlement=high \
...
Additions to query-cpus-fast
@@ -78,7 +78,7 @@ modifiers for all configured vCPUs.
"dedicated": true,
"thread-id": 537005,
"props": {
"core-id": 12,
"core-id": 112,
"socket-id": 0,
"drawer-id": 3,
"book-id": 2
@@ -86,7 +86,7 @@ modifiers for all configured vCPUs.
"cpu-state": "operating",
"entitlement": "high",
"qom-path": "/machine/peripheral-anon/device[2]",
"cpu-index": 12,
"cpu-index": 112,
"target": "s390x"
}
]


@@ -62,6 +62,12 @@ To deal with this case, when an I/O access is made we:
- re-compile a single [1]_ instruction block for the current PC
- exit the cpu loop and execute the re-compiled block
The new block is created with the CF_LAST_IO compile flag which
ensures the final instruction translation starts with a call to
gen_io_start() so we don't enter a perpetual loop constantly
recompiling a single instruction block. For translators using the
common translator_loop this is done automatically.
.. [1] sometimes two instructions if dealing with delay slots
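Condensed from cpu_io_recompile() as seen earlier in this comparison, the request for such a block boils down to (n is 1, or 2 when a delay slot is involved):
/* One-shot TB whose final insn may perform I/O */
cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | CF_LAST_IO | n;
cpu_loop_exit_noexc(cpu);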
Other I/O operations


@@ -1016,7 +1016,7 @@ class. Here's a simple usage example:
self.vm.launch()
res = self.vm.cmd('human-monitor-command',
command_line='info version')
self.assertRegex(res, r'^(\d+\.\d+\.\d)')
self.assertRegexpMatches(res, r'^(\d+\.\d+\.\d)')
To execute your test, run:
@@ -1077,7 +1077,7 @@ and hypothetical example follows:
'human-monitor-command',
command_line='info version')
self.assertEqual(first_res, second_res, third_res)
self.assertEquals(first_res, second_res, third_res)
At test "tear down", ``avocado_qemu.Test`` handles all the QEMUMachines
shutdown.
@@ -1371,33 +1371,23 @@ conditions. For example, tests that take longer to execute when QEMU is
compiled with debug flags. Therefore, the ``AVOCADO_TIMEOUT_EXPECTED`` variable
has been used to determine whether those tests should run or not.
QEMU_TEST_FLAKY_TESTS
^^^^^^^^^^^^^^^^^^^^^
Some tests are not working reliably and thus are disabled by default.
This includes tests that don't run reliably on GitLab's CI which
usually expose real issues that are rarely seen on developer machines
due to the constraints of the CI environment. If you encounter a
similar situation then raise a bug and then mark the test as shown on
the code snippet below:
GITLAB_CI
^^^^^^^^^
A number of tests are flagged to not run on the GitLab CI. Usually because
they proved to the flaky or there are constraints on the CI environment which
would make them fail. If you encounter a similar situation then use that
variable as shown on the code snippet below to skip the test:
.. code::
# See https://gitlab.com/qemu-project/qemu/-/issues/nnnn
@skipUnless(os.getenv('QEMU_TEST_FLAKY_TESTS'), 'Test is unstable on GitLab')
@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
def test(self):
do_something()
You can also add ``:avocado: tags=flaky`` to the test meta-data so
only the flaky tests can be run as a group:
.. code::
env QEMU_TEST_FLAKY_TESTS=1 ./pyvenv/bin/avocado \
run tests/avocado -filter-by-tags=flaky
Tests should not live in this state forever and should either be fixed
or eventually removed.
QEMU_TEST_FLAKY_TESTS
^^^^^^^^^^^^^^^^^^^^^
Some tests are not working reliably and thus are disabled by default.
Set this environment variable to enable them.
Uninstalling Avocado
~~~~~~~~~~~~~~~~~~~~


@@ -1,2 +0,0 @@
sphinx==5.3.0
sphinx_rtd_theme==1.1.1

View File

@@ -1,5 +1,3 @@
.. _tpm-device:
===============
QEMU TPM Device
===============

View File

@@ -70,7 +70,7 @@ the following architecture extensions:
- FEAT_PAN2 (AT S1E1R and AT S1E1W instruction variants affected by PSTATE.PAN)
- FEAT_PAN3 (Support for SCTLR_ELx.EPAN)
- FEAT_PAuth (Pointer authentication)
- FEAT_PAuth2 (Enhancements to pointer authentication)
- FEAT_PAuth2 (Enhacements to pointer authentication)
- FEAT_PMULL (PMULL, PMULL2 instructions)
- FEAT_PMUv3p1 (PMU Extensions v3.1)
- FEAT_PMUv3p4 (PMU Extensions v3.4)

View File

@@ -1,39 +1,34 @@
Xen Device Emulation Backend (``xenpvh``)
XENPVH (``xenpvh``)
=========================================
This machine creates an IOREQ server to register/connect with the Xen Hypervisor.
This machine is a little unusual compared to others as QEMU just acts
as an IOREQ server to register/connect with Xen Hypervisor. Control of
the VMs themselves is left to the Xen tooling.
When TPM is enabled, this machine also creates a tpm-tis-device at a user-provided
TPM base address, adds a TPM emulator and connects to a swtpm application running
on the host machine via a chardev socket. This enables xenpvh to support TPM
functionalities for a guest domain.
When TPM is enabled, this machine also creates a tpm-tis-device at a
user-provided TPM base address, adds a TPM emulator and connects to a
swtpm application running on the host machine via a chardev socket. This
enables xenpvh to support TPM functionalities for a guest domain.
More information about TPM use and installing swtpm linux application
can be found in the :ref:`tpm-device` section.
More information about TPM use and installing swtpm linux application can be
found at: docs/specs/tpm.rst.
Example for starting swtpm on the host machine:
.. code-block:: console
mkdir /tmp/vtpm2
swtpm socket --tpmstate dir=/tmp/vtpm2 \
--ctrl type=unixio,path=/tmp/vtpm2/swtpm-sock &
--ctrl type=unixio,path=/tmp/vtpm2/swtpm-sock &
Sample QEMU xenpvh commands for running and connecting with Xen:
.. code-block:: console
qemu-system-aarch64 -xen-domid 1 \
-chardev socket,id=libxl-cmd,path=qmp-libxl-1,server=on,wait=off \
-mon chardev=libxl-cmd,mode=control \
-chardev socket,id=libxenstat-cmd,path=qmp-libxenstat-1,server=on,wait=off \
-mon chardev=libxenstat-cmd,mode=control \
-xen-attach -name guest0 -vnc none -display none -nographic \
-machine xenpvh -m 1301 \
-chardev socket,id=chrtpm,path=/tmp/vtpm2/swtpm-sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm -machine tpm-base-addr=0x0C000000
-chardev socket,id=libxl-cmd,path=qmp-libxl-1,server=on,wait=off \
-mon chardev=libxl-cmd,mode=control \
-chardev socket,id=libxenstat-cmd,path=qmp-libxenstat-1,server=on,wait=off \
-mon chardev=libxenstat-cmd,mode=control \
-xen-attach -name guest0 -vnc none -display none -nographic \
-machine xenpvh -m 1301 \
-chardev socket,id=chrtpm,path=/tmp/vtpm2/swtpm-sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm -machine tpm-base-addr=0x0C000000
In the above QEMU command, the last two lines connect the xenpvh QEMU to swtpm
via a chardev socket.

View File

@@ -38,6 +38,7 @@ Supported mechanisms
Currently supported confidential guest mechanisms are:
* AMD Secure Encrypted Virtualization (SEV) (see :doc:`i386/amd-memory-encryption`)
* Intel Trust Domain Extension (TDX) (see :doc:`i386/tdx`)
* POWER Protected Execution Facility (PEF) (see :ref:`power-papr-protected-execution-facility-pef`)
* s390x Protected Virtualization (PV) (see :doc:`s390x/protvirt`)

View File

@@ -60,7 +60,7 @@ As TCG cannot track all memory accesses in user-mode there is no
support for watchpoints.
Relocating code
===============
---------------
On modern kernels confusion can be caused by code being relocated by
features such as address space layout randomisation. To avoid
@@ -68,17 +68,6 @@ confusion when debugging such things you either need to update gdb's
view of where things are in memory or perhaps more trivially disable
ASLR when booting the system.
Debugging user-space in system emulation
========================================
While it is technically possible to debug a user-space program running
inside a system image, it does present challenges. Kernel preemption
and execution mode changes between kernel and user mode can make it
hard to follow what's going on. Unless you are specifically trying to
debug some interaction between kernel and user-space you are better
off running your guest program with gdb either in the guest or using
a gdbserver exposed via a port to the outside world.
Debugging multicore machines
============================

docs/system/i386/tdx.rst Normal file
View File

@@ -0,0 +1,113 @@
Intel Trust Domain Extensions (TDX)
===================================
Intel Trust Domain Extensions (TDX) refers to an Intel technology that extends
Virtual Machine Extensions (VMX) and Multi-Key Total Memory Encryption (MKTME)
with a new kind of virtual machine guest called a Trust Domain (TD). A TD runs
in a CPU mode that is designed to protect the confidentiality of its memory
contents and its CPU state from any other software, including the hosting
Virtual Machine Monitor (VMM), unless explicitly shared by the TD itself.
Prerequisites
-------------
To run a TD, the physical machine needs to have the TDX module loaded and
initialized, and the KVM hypervisor must support TDX and have it enabled. If
those requirements are met, ``KVM_CAP_VM_TYPES`` will report support for
``KVM_X86_TDX_VM``.
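As an illustration only, userspace could probe for that as in the following C
sketch. It assumes kernel headers from a TDX-enabled KVM that define
``KVM_CAP_VM_TYPES`` and ``KVM_X86_TDX_VM``, and that the capability returns a
bitmask of supported VM types:

.. code-block:: c

   #include <fcntl.h>
   #include <stdio.h>
   #include <sys/ioctl.h>
   #include <linux/kvm.h>

   int main(void)
   {
       int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
       if (kvm < 0) {
           perror("open /dev/kvm");
           return 1;
       }
       /* KVM_CHECK_EXTENSION on the system fd reports the VM type mask */
       int types = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_VM_TYPES);
       if (types > 0 && (types & (1 << KVM_X86_TDX_VM))) {
           printf("KVM reports KVM_X86_TDX_VM support\n");
       }
       return 0;
   }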
Trust Domain Virtual Firmware (TDVF)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Trust Domain Virtual Firmware (TDVF) is required to provide TD services to
boot the TD guest OS. TDVF needs to be copied to guest private memory and
measured before a TD boots.
The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides command ``KVM_TDX_INIT_MEM_REGION``
to copy the TDVF image to the TD's private memory space.
Since TDX doesn't support read-only memslots, TDVF cannot be mapped as a pflash
device and actually works as RAM. The ``-bios`` option is therefore used to
load TDVF. OVMF is the open-source firmware that implements TDVF support, so
the command line to specify and load TDVF is ``-bios OVMF.fd``.
KVM private gmem
~~~~~~~~~~~~~~~~
A TD's memory (RAM) needs to be convertible between private and shared, and
its BIOS (OVMF/TDVF) needs to be mapped as private. Thus QEMU needs to
allocate private gmem for them via KVM's ``KVM_CREATE_GUEST_MEMFD`` ioctl,
which requires a KVM new enough to report ``KVM_CAP_GUEST_MEMFD``.
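As a sketch (not QEMU's actual code path), creating such a guest memfd from
userspace looks roughly like this, assuming headers that provide
``struct kvm_create_guest_memfd``:

.. code-block:: c

   #include <string.h>
   #include <sys/ioctl.h>
   #include <linux/kvm.h>

   /* vm_fd: VM file descriptor obtained from KVM_CREATE_VM */
   static int create_private_gmem(int vm_fd, unsigned long long size)
   {
       struct kvm_create_guest_memfd gmem;

       memset(&gmem, 0, sizeof(gmem));
       gmem.size = size;    /* bytes of guest-private memory to back */
       gmem.flags = 0;

       /* on success, returns a new fd referring to the private memory */
       return ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
   }

QEMU performs the equivalent allocation internally when a machine requires
guest memfd; the helper name above is made up for the example.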
Feature Control
---------------
Unlike a non-TDX VM, the CPU features (enumerated by CPUID or MSRs) of a TD
are not under full control of the VMM. The VMM can only configure part of the
features of a TD via the ``KVM_TDX_INIT_VM`` command of the VM scope
``MEMORY_ENCRYPT_OP`` ioctl.
The configurable features have three types (see the sketch after this list):
- Attributes:
  - PKS (bit 30) controls whether Supervisor Protection Keys is exposed to the
    TD, which determines the related CPUID bit and CR4 bit;
  - PERFMON (bit 63) controls whether the PMU is exposed to the TD.
- XSAVE related features (XFAM):
  XFAM is a 64-bit mask, which has the same format as XCR0 or the IA32_XSS MSR.
  It determines the set of extended features available for use by the guest TD.
- CPUID features:
  Only some bits of some CPUID leaves are directly configurable by the VMM.
  What features can be configured is reported via TDX capabilities.
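For orientation, the bit positions named above can be written down as C masks.
This is purely illustrative: the attribute bits are those named in this
document (the DEBUG bit is described under Debugging below), and the XFAM
example just uses the architectural XCR0 layout in which bits 0, 1 and 2 are
x87, SSE and AVX:

.. code-block:: c

   #include <stdint.h>

   /* TD attribute bits named in this document */
   #define TDX_TD_ATTR_DEBUG    (1ULL << 0)   /* off-TD debug, see below */
   #define TDX_TD_ATTR_PKS      (1ULL << 30)  /* Supervisor Protection Keys */
   #define TDX_TD_ATTR_PERFMON  (1ULL << 63)  /* expose PMU to the TD */

   /* XFAM follows the XCR0/IA32_XSS layout; e.g. x87 + SSE + AVX: */
   static const uint64_t xfam_example = (1ULL << 0)   /* x87 */
                                      | (1ULL << 1)   /* SSE */
                                      | (1ULL << 2);  /* AVX */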
TDX capabilities
~~~~~~~~~~~~~~~~
The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides command ``KVM_TDX_CAPABILITIES``
to get the TDX capabilities from KVM. It returns a data structure of
``struct kvm_tdx_capabilities``, which tells the supported configuration of
attributes, XFAM and CPUIDs.
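A hedged sketch of issuing this command follows. The layouts of
``struct kvm_tdx_cmd``, ``struct kvm_tdx_capabilities`` and
``struct kvm_tdx_cpuid_config`` are taken from the KVM TDX patch series and
may differ between kernel versions, so treat the field names as assumptions:

.. code-block:: c

   #include <stdint.h>
   #include <stdlib.h>
   #include <string.h>
   #include <sys/ioctl.h>
   #include <linux/kvm.h>   /* assumed to carry the TDX uAPI definitions */

   static struct kvm_tdx_capabilities *query_tdx_caps(int vm_fd,
                                                      uint32_t nr_cpuid)
   {
       struct kvm_tdx_capabilities *caps;
       struct kvm_tdx_cmd cmd;

       /* the capabilities structure has a trailing CPUID config array */
       caps = calloc(1, sizeof(*caps) +
                        nr_cpuid * sizeof(struct kvm_tdx_cpuid_config));
       if (!caps) {
           return NULL;
       }
       caps->nr_cpuid_configs = nr_cpuid;

       memset(&cmd, 0, sizeof(cmd));
       cmd.id = KVM_TDX_CAPABILITIES;
       cmd.data = (__u64)(unsigned long)caps;

       if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0) {
           free(caps);
           return NULL;
       }
       return caps;   /* attrs_fixed0/1, xfam_fixed0/1, CPUID configs */
   }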
Launching a TD (TDX VM)
-----------------------
To launch a TDX guest, the following options are newly added and required:
.. parsed-literal::
|qemu_system_x86| \\
-object tdx-guest,id=tdx0 \\
-machine ...,kernel-irqchip=split,confidential-guest-support=tdx0 \\
-bios OVMF.fd \\
Debugging
---------
Bit 0 of the TD attributes is the DEBUG bit, which decides whether the TD runs
in off-TD debug mode. When in off-TD debug mode, a TD's VCPU state and private
memory are accessible via given SEAMCALLs. This requires KVM to expose APIs to
invoke those SEAMCALLs and corresponding QEMU changes.
It's targeted as future work.
Restrictions
------------
- kernel-irqchip must be split;
- No read-only support for private memory;
- No SMM support: SMM support requires manipulating the guest register states,
  which is not allowed;
Live Migration
--------------
TODO
References
----------
- `TDX Homepage <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html>`__

View File

@@ -29,6 +29,7 @@ Architectural features
i386/kvm-pv
i386/sgx
i386/amd-memory-encryption
i386/tdx
OS requirements
~~~~~~~~~~~~~~~

View File

@@ -692,7 +692,7 @@ static int gdb_handle_vcont(const char *p)
/*
* target_count and last_target keep track of how many CPUs we are going to
* step or resume, and a pointer to the state structure of one of them,
* respectively
* respectivelly
*/
int target_count = 0;
CPUState *last_target = NULL;

View File

@@ -24,7 +24,6 @@ enum {
GDB_SIGNAL_TRAP = 5,
GDB_SIGNAL_ABRT = 6,
GDB_SIGNAL_ALRM = 14,
GDB_SIGNAL_STOP = 17,
GDB_SIGNAL_IO = 23,
GDB_SIGNAL_XCPU = 24,
GDB_SIGNAL_UNKNOWN = 143

View File

@@ -183,7 +183,7 @@ static void gdb_vm_state_change(void *opaque, bool running, RunState state)
break;
case RUN_STATE_IO_ERROR:
trace_gdbstub_hit_io_error();
ret = GDB_SIGNAL_STOP;
ret = GDB_SIGNAL_IO;
break;
case RUN_STATE_WATCHDOG:
trace_gdbstub_hit_watchdog();

View File

@@ -947,7 +947,6 @@ static const VMStateDescription erst_vmstate = {
static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
{
ERRP_GUARD();
ERSTDeviceState *s = ACPIERST(pci_dev);
trace_acpi_erst_realizefn_in();
@@ -965,15 +964,9 @@ static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
/* HostMemoryBackend size will be multiple of PAGE_SIZE */
s->storage_size = object_property_get_int(OBJECT(s->hostmem), "size", errp);
if (*errp) {
return;
}
/* Initialize backend storage and record_count */
check_erst_backend_storage(s, errp);
if (*errp) {
return;
}
/* BAR 0: Programming registers */
memory_region_init_io(&s->iomem_mr, OBJECT(pci_dev), &erst_reg_ops, s,
@@ -984,9 +977,6 @@ static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
memory_region_init_ram(&s->exchange_mr, OBJECT(pci_dev),
"erst.exchange",
le32_to_cpu(s->header->record_size), errp);
if (*errp) {
return;
}
pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY,
&s->exchange_mr);

View File

@@ -169,8 +169,7 @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
epit_table[i].irq));
}
object_property_set_uint(OBJECT(&s->fec), "phy-num", s->phy_num,
&error_abort);
object_property_set_uint(OBJECT(&s->fec), "phy-num", s->phy_num, &err);
qdev_set_nic_properties(DEVICE(&s->fec), &nd_table[0]);
if (!sysbus_realize(SYS_BUS_DEVICE(&s->fec), errp)) {

View File

@@ -379,8 +379,7 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
spi_table[i].irq));
}
object_property_set_uint(OBJECT(&s->eth), "phy-num", s->phy_num,
&error_abort);
object_property_set_uint(OBJECT(&s->eth), "phy-num", s->phy_num, &err);
qdev_set_nic_properties(DEVICE(&s->eth), &nd_table[0]);
if (!sysbus_realize(SYS_BUS_DEVICE(&s->eth), errp)) {
return;

View File

@@ -44,6 +44,7 @@ static void netduino2_init(MachineState *machine)
clock_set_hz(sysclk, SYSCLK_FRQ);
dev = qdev_new(TYPE_STM32F205_SOC);
qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
qdev_connect_clock_in(dev, "sysclk", sysclk);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
@@ -53,14 +54,8 @@ static void netduino2_init(MachineState *machine)
static void netduino2_machine_init(MachineClass *mc)
{
static const char * const valid_cpu_types[] = {
ARM_CPU_TYPE_NAME("cortex-m3"),
NULL
};
mc->desc = "Netduino 2 Machine (Cortex-M3)";
mc->init = netduino2_init;
mc->valid_cpu_types = valid_cpu_types;
mc->ignore_memory_transaction_failures = true;
}

View File

@@ -44,6 +44,7 @@ static void netduinoplus2_init(MachineState *machine)
clock_set_hz(sysclk, SYSCLK_FRQ);
dev = qdev_new(TYPE_STM32F405_SOC);
qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
qdev_connect_clock_in(dev, "sysclk", sysclk);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
@@ -54,14 +55,8 @@ static void netduinoplus2_init(MachineState *machine)
static void netduinoplus2_machine_init(MachineClass *mc)
{
static const char * const valid_cpu_types[] = {
ARM_CPU_TYPE_NAME("cortex-m4"),
NULL
};
mc->desc = "Netduino Plus 2 Machine (Cortex-M4)";
mc->init = netduinoplus2_init;
mc->valid_cpu_types = valid_cpu_types;
}
DEFINE_MACHINE("netduinoplus2", netduinoplus2_machine_init)

View File

@@ -47,6 +47,7 @@ static void olimex_stm32_h405_init(MachineState *machine)
clock_set_hz(sysclk, SYSCLK_FRQ);
dev = qdev_new(TYPE_STM32F405_SOC);
qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
qdev_connect_clock_in(dev, "sysclk", sysclk);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
@@ -57,14 +58,9 @@ static void olimex_stm32_h405_init(MachineState *machine)
static void olimex_stm32_h405_machine_init(MachineClass *mc)
{
static const char * const valid_cpu_types[] = {
ARM_CPU_TYPE_NAME("cortex-m4"),
NULL
};
mc->desc = "Olimex STM32-H405 (Cortex-M4)";
mc->init = olimex_stm32_h405_init;
mc->valid_cpu_types = valid_cpu_types;
mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-m4");
/* SRAM pre-allocated as part of the SoC instantiation */
mc->default_ram_size = 0;

View File

@@ -115,7 +115,7 @@ static void stm32f100_soc_realize(DeviceState *dev_soc, Error **errp)
/* Init ARMv7m */
armv7m = DEVICE(&s->armv7m);
qdev_prop_set_uint32(armv7m, "num-irq", 61);
qdev_prop_set_string(armv7m, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
qdev_prop_set_string(armv7m, "cpu-type", s->cpu_type);
qdev_prop_set_bit(armv7m, "enable-bitband", true);
qdev_connect_clock_in(armv7m, "cpuclk", s->sysclk);
qdev_connect_clock_in(armv7m, "refclk", s->refclk);
@@ -180,12 +180,17 @@ static void stm32f100_soc_realize(DeviceState *dev_soc, Error **errp)
create_unimplemented_device("CRC", 0x40023000, 0x400);
}
static Property stm32f100_soc_properties[] = {
DEFINE_PROP_STRING("cpu-type", STM32F100State, cpu_type),
DEFINE_PROP_END_OF_LIST(),
};
static void stm32f100_soc_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
dc->realize = stm32f100_soc_realize;
/* No vmstate or reset required: device has no internal state */
device_class_set_props(dc, stm32f100_soc_properties);
}
static const TypeInfo stm32f100_soc_info = {

View File

@@ -127,7 +127,7 @@ static void stm32f205_soc_realize(DeviceState *dev_soc, Error **errp)
armv7m = DEVICE(&s->armv7m);
qdev_prop_set_uint32(armv7m, "num-irq", 96);
qdev_prop_set_string(armv7m, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
qdev_prop_set_string(armv7m, "cpu-type", s->cpu_type);
qdev_prop_set_bit(armv7m, "enable-bitband", true);
qdev_connect_clock_in(armv7m, "cpuclk", s->sysclk);
qdev_connect_clock_in(armv7m, "refclk", s->refclk);
@@ -201,12 +201,17 @@ static void stm32f205_soc_realize(DeviceState *dev_soc, Error **errp)
}
}
static Property stm32f205_soc_properties[] = {
DEFINE_PROP_STRING("cpu-type", STM32F205State, cpu_type),
DEFINE_PROP_END_OF_LIST(),
};
static void stm32f205_soc_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
dc->realize = stm32f205_soc_realize;
/* No vmstate or reset required: device has no internal state */
device_class_set_props(dc, stm32f205_soc_properties);
}
static const TypeInfo stm32f205_soc_info = {

View File

@@ -149,7 +149,7 @@ static void stm32f405_soc_realize(DeviceState *dev_soc, Error **errp)
armv7m = DEVICE(&s->armv7m);
qdev_prop_set_uint32(armv7m, "num-irq", 96);
qdev_prop_set_string(armv7m, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
qdev_prop_set_string(armv7m, "cpu-type", s->cpu_type);
qdev_prop_set_bit(armv7m, "enable-bitband", true);
qdev_connect_clock_in(armv7m, "cpuclk", s->sysclk);
qdev_connect_clock_in(armv7m, "refclk", s->refclk);
@@ -287,11 +287,17 @@ static void stm32f405_soc_realize(DeviceState *dev_soc, Error **errp)
create_unimplemented_device("RNG", 0x50060800, 0x400);
}
static Property stm32f405_soc_properties[] = {
DEFINE_PROP_STRING("cpu-type", STM32F405State, cpu_type),
DEFINE_PROP_END_OF_LIST(),
};
static void stm32f405_soc_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
dc->realize = stm32f405_soc_realize;
device_class_set_props(dc, stm32f405_soc_properties);
/* No vmstate or reset required: device has no internal state */
}

View File

@@ -47,6 +47,7 @@ static void stm32vldiscovery_init(MachineState *machine)
clock_set_hz(sysclk, SYSCLK_FRQ);
dev = qdev_new(TYPE_STM32F100_SOC);
qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
qdev_connect_clock_in(dev, "sysclk", sysclk);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
@@ -57,14 +58,8 @@ static void stm32vldiscovery_init(MachineState *machine)
static void stm32vldiscovery_machine_init(MachineClass *mc)
{
static const char * const valid_cpu_types[] = {
ARM_CPU_TYPE_NAME("cortex-m3"),
NULL
};
mc->desc = "ST STM32VLDISCOVERY (Cortex-M3)";
mc->init = stm32vldiscovery_init;
mc->valid_cpu_types = valid_cpu_types;
}
DEFINE_MACHINE("stm32vldiscovery", stm32vldiscovery_machine_init)

View File

@@ -22,7 +22,6 @@
#include "hw/qdev-properties.h"
#include "intel-hda.h"
#include "migration/vmstate.h"
#include "qemu/host-utils.h"
#include "qemu/module.h"
#include "intel-hda-defs.h"
#include "audio/audio.h"
@@ -190,9 +189,9 @@ struct HDAAudioState {
bool use_timer;
};
static inline uint32_t hda_bytes_per_second(HDAAudioStream *st)
static inline int64_t hda_bytes_per_second(HDAAudioStream *st)
{
return 2 * (uint32_t)st->as.nchannels * (uint32_t)st->as.freq;
return 2LL * st->as.nchannels * st->as.freq;
}
static inline void hda_timer_sync_adjust(HDAAudioStream *st, int64_t target_pos)
@@ -223,18 +222,12 @@ static void hda_audio_input_timer(void *opaque)
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t uptime = now - st->buft_start;
int64_t buft_start = st->buft_start;
int64_t wpos = st->wpos;
int64_t rpos = st->rpos;
int64_t wanted_rpos;
if (uptime <= 0) {
/* wanted_rpos <= 0 */
goto out_timer;
}
wanted_rpos = muldiv64(uptime, hda_bytes_per_second(st),
NANOSECONDS_PER_SECOND);
int64_t wanted_rpos = hda_bytes_per_second(st) * (now - buft_start)
/ NANOSECONDS_PER_SECOND;
wanted_rpos &= -4; /* IMPORTANT! clip to frames */
if (wanted_rpos <= rpos) {
@@ -293,18 +286,12 @@ static void hda_audio_output_timer(void *opaque)
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t uptime = now - st->buft_start;
int64_t buft_start = st->buft_start;
int64_t wpos = st->wpos;
int64_t rpos = st->rpos;
int64_t wanted_wpos;
if (uptime <= 0) {
/* wanted_wpos <= 0 */
goto out_timer;
}
wanted_wpos = muldiv64(uptime, hda_bytes_per_second(st),
NANOSECONDS_PER_SECOND);
int64_t wanted_wpos = hda_bytes_per_second(st) * (now - buft_start)
/ NANOSECONDS_PER_SECOND;
wanted_wpos &= -4; /* IMPORTANT! clip to frames */
if (wanted_wpos <= wpos) {
@@ -868,10 +855,10 @@ static Property hda_audio_properties[] = {
static void hda_audio_init_output(HDACodecDevice *hda, Error **errp)
{
HDAAudioState *a = HDA_AUDIO(hda);
const struct desc_codec *desc = &output_mixemu;
const struct desc_codec *desc = &output_nomixemu;
if (!a->mixer) {
desc = &output_nomixemu;
desc = &output_mixemu;
}
hda_audio_init(hda, desc, errp);
@@ -880,10 +867,10 @@ static void hda_audio_init_output(HDACodecDevice *hda, Error **errp)
static void hda_audio_init_duplex(HDACodecDevice *hda, Error **errp)
{
HDAAudioState *a = HDA_AUDIO(hda);
const struct desc_codec *desc = &duplex_mixemu;
const struct desc_codec *desc = &duplex_nomixemu;
if (!a->mixer) {
desc = &duplex_nomixemu;
desc = &duplex_mixemu;
}
hda_audio_init(hda, desc, errp);
@@ -892,10 +879,10 @@ static void hda_audio_init_duplex(HDACodecDevice *hda, Error **errp)
static void hda_audio_init_micro(HDACodecDevice *hda, Error **errp)
{
HDAAudioState *a = HDA_AUDIO(hda);
const struct desc_codec *desc = &micro_mixemu;
const struct desc_codec *desc = &micro_nomixemu;
if (!a->mixer) {
desc = &micro_nomixemu;
desc = &micro_mixemu;
}
hda_audio_init(hda, desc, errp);

View File

@@ -211,14 +211,14 @@ static void out_cb(void *opaque, int avail)
AUD_set_active_out(s->vo, 0);
}
if (c->type & STAT_EOL) {
via_isa_set_irq(&s->dev, 0, 1);
pci_set_irq(&s->dev, 1);
}
}
if (CLEN_IS_FLAG(c)) {
c->stat |= STAT_FLAG;
c->stat |= STAT_PAUSED;
if (c->type & STAT_FLAG) {
via_isa_set_irq(&s->dev, 0, 1);
pci_set_irq(&s->dev, 1);
}
}
if (CLEN_IS_STOP(c)) {
@@ -305,13 +305,13 @@ static void sgd_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
if (val & STAT_EOL) {
s->aur.stat &= ~(STAT_EOL | STAT_PAUSED);
if (s->aur.type & STAT_EOL) {
via_isa_set_irq(&s->dev, 0, 0);
pci_set_irq(&s->dev, 0);
}
}
if (val & STAT_FLAG) {
s->aur.stat &= ~(STAT_FLAG | STAT_PAUSED);
if (s->aur.type & STAT_FLAG) {
via_isa_set_irq(&s->dev, 0, 0);
pci_set_irq(&s->dev, 0);
}
}
break;

View File

@@ -47,14 +47,12 @@ static void virtio_snd_pci_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
VirtioPCIClass *vpciklass = VIRTIO_PCI_CLASS(klass);
PCIDeviceClass *pcidevklass = PCI_DEVICE_CLASS(klass);
device_class_set_props(dc, virtio_snd_pci_properties);
dc->desc = "Virtio Sound";
set_bit(DEVICE_CATEGORY_SOUND, dc->categories);
vpciklass->realize = virtio_snd_pci_realize;
pcidevklass->class_id = PCI_CLASS_MULTIMEDIA_AUDIO;
}
static void virtio_snd_pci_instance_init(Object *obj)

View File

@@ -36,7 +36,6 @@ static void virtio_snd_pcm_out_cb(void *data, int available);
static void virtio_snd_process_cmdq(VirtIOSound *s);
static void virtio_snd_pcm_flush(VirtIOSoundPCMStream *stream);
static void virtio_snd_pcm_in_cb(void *data, int available);
static void virtio_snd_unrealize(DeviceState *dev);
static uint32_t supported_formats = BIT(VIRTIO_SND_PCM_FMT_S8)
| BIT(VIRTIO_SND_PCM_FMT_U8)
@@ -69,7 +68,6 @@ static const VMStateDescription vmstate_virtio_snd_device = {
static const VMStateDescription vmstate_virtio_snd = {
.name = TYPE_VIRTIO_SND,
.unmigratable = 1,
.minimum_version_id = VIRTIO_SOUND_VM_VERSION,
.version_id = VIRTIO_SOUND_VM_VERSION,
.fields = (VMStateField[]) {
@@ -1067,9 +1065,23 @@ static void virtio_snd_realize(DeviceState *dev, Error **errp)
virtio_snd_pcm_set_params default_params = { 0 };
uint32_t status;
vsnd->pcm = NULL;
vsnd->vmstate =
qemu_add_vm_change_state_handler(virtio_snd_vm_state_change, vsnd);
trace_virtio_snd_realize(vsnd);
/* check number of jacks and streams */
vsnd->pcm = g_new0(VirtIOSoundPCM, 1);
vsnd->pcm->snd = vsnd;
vsnd->pcm->streams =
g_new0(VirtIOSoundPCMStream *, vsnd->snd_conf.streams);
vsnd->pcm->pcm_params =
g_new0(virtio_snd_pcm_set_params, vsnd->snd_conf.streams);
virtio_init(vdev, VIRTIO_ID_SOUND, sizeof(virtio_snd_config));
virtio_add_feature(&vsnd->features, VIRTIO_F_VERSION_1);
/* set number of jacks and streams */
if (vsnd->snd_conf.jacks > 8) {
error_setg(errp,
"Invalid number of jacks: %"PRIu32,
@@ -1090,22 +1102,7 @@ static void virtio_snd_realize(DeviceState *dev, Error **errp)
return;
}
if (!AUD_register_card("virtio-sound", &vsnd->card, errp)) {
return;
}
vsnd->vmstate =
qemu_add_vm_change_state_handler(virtio_snd_vm_state_change, vsnd);
vsnd->pcm = g_new0(VirtIOSoundPCM, 1);
vsnd->pcm->snd = vsnd;
vsnd->pcm->streams =
g_new0(VirtIOSoundPCMStream *, vsnd->snd_conf.streams);
vsnd->pcm->pcm_params =
g_new0(virtio_snd_pcm_set_params, vsnd->snd_conf.streams);
virtio_init(vdev, VIRTIO_ID_SOUND, sizeof(virtio_snd_config));
virtio_add_feature(&vsnd->features, VIRTIO_F_VERSION_1);
AUD_register_card("virtio-sound", &vsnd->card, errp);
/* set default params for all streams */
default_params.features = 0;
@@ -1129,23 +1126,18 @@ static void virtio_snd_realize(DeviceState *dev, Error **errp)
status = virtio_snd_set_pcm_params(vsnd, i, &default_params);
if (status != cpu_to_le32(VIRTIO_SND_S_OK)) {
error_setg(errp,
"Can't initialize stream params, device responded with %s.",
"Can't initalize stream params, device responded with %s.",
print_code(status));
goto error_cleanup;
return;
}
status = virtio_snd_pcm_prepare(vsnd, i);
if (status != cpu_to_le32(VIRTIO_SND_S_OK)) {
error_setg(errp,
"Can't prepare streams, device responded with %s.",
print_code(status));
goto error_cleanup;
return;
}
}
return;
error_cleanup:
virtio_snd_unrealize(dev);
}
static inline void return_tx_buffer(VirtIOSoundPCMStream *stream,

View File

@@ -233,10 +233,6 @@ static void atmega_realize(DeviceState *dev, Error **errp)
/* CPU */
object_initialize_child(OBJECT(dev), "cpu", &s->cpu, mc->cpu_type);
object_property_set_uint(OBJECT(&s->cpu), "init-sp",
mc->io_size + mc->sram_size - 1, &error_abort);
qdev_realize(DEVICE(&s->cpu), NULL, &error_abort);
cpudev = DEVICE(&s->cpu);

View File

@@ -80,39 +80,16 @@ struct PFlashCFI01 {
uint16_t ident3;
uint8_t cfi_table[0x52];
uint64_t counter;
uint32_t writeblock_size;
unsigned int writeblock_size;
MemoryRegion mem;
char *name;
void *storage;
VMChangeStateEntry *vmstate;
bool old_multiple_chip_handling;
/* block update buffer */
unsigned char *blk_bytes;
uint32_t blk_offset;
};
static int pflash_post_load(void *opaque, int version_id);
static bool pflash_blk_write_state_needed(void *opaque)
{
PFlashCFI01 *pfl = opaque;
return (pfl->blk_offset != -1);
}
static const VMStateDescription vmstate_pflash_blk_write = {
.name = "pflash_cfi01_blk_write",
.version_id = 1,
.minimum_version_id = 1,
.needed = pflash_blk_write_state_needed,
.fields = (const VMStateField[]) {
VMSTATE_VBUFFER_UINT32(blk_bytes, PFlashCFI01, 0, NULL, writeblock_size),
VMSTATE_UINT32(blk_offset, PFlashCFI01),
VMSTATE_END_OF_LIST()
}
};
static const VMStateDescription vmstate_pflash = {
.name = "pflash_cfi01",
.version_id = 1,
@@ -124,10 +101,6 @@ static const VMStateDescription vmstate_pflash = {
VMSTATE_UINT8(status, PFlashCFI01),
VMSTATE_UINT64(counter, PFlashCFI01),
VMSTATE_END_OF_LIST()
},
.subsections = (const VMStateDescription * []) {
&vmstate_pflash_blk_write,
NULL
}
};
@@ -252,10 +225,34 @@ static uint32_t pflash_data_read(PFlashCFI01 *pfl, hwaddr offset,
uint32_t ret;
p = pfl->storage;
if (be) {
ret = ldn_be_p(p + offset, width);
} else {
ret = ldn_le_p(p + offset, width);
switch (width) {
case 1:
ret = p[offset];
break;
case 2:
if (be) {
ret = p[offset] << 8;
ret |= p[offset + 1];
} else {
ret = p[offset];
ret |= p[offset + 1] << 8;
}
break;
case 4:
if (be) {
ret = p[offset] << 24;
ret |= p[offset + 1] << 16;
ret |= p[offset + 2] << 8;
ret |= p[offset + 3];
} else {
ret = p[offset];
ret |= p[offset + 1] << 8;
ret |= p[offset + 2] << 16;
ret |= p[offset + 3] << 24;
}
break;
default:
abort();
}
trace_pflash_data_read(pfl->name, offset, width, ret);
return ret;
@@ -403,61 +400,40 @@ static void pflash_update(PFlashCFI01 *pfl, int offset,
}
}
/* copy current flash content to block update buffer */
static void pflash_blk_write_start(PFlashCFI01 *pfl, hwaddr offset)
{
hwaddr mask = ~(pfl->writeblock_size - 1);
trace_pflash_write_block_start(pfl->name, pfl->counter);
pfl->blk_offset = offset & mask;
memcpy(pfl->blk_bytes, pfl->storage + pfl->blk_offset,
pfl->writeblock_size);
}
/* commit block update buffer changes */
static void pflash_blk_write_flush(PFlashCFI01 *pfl)
{
g_assert(pfl->blk_offset != -1);
trace_pflash_write_block_flush(pfl->name);
memcpy(pfl->storage + pfl->blk_offset, pfl->blk_bytes,
pfl->writeblock_size);
pflash_update(pfl, pfl->blk_offset, pfl->writeblock_size);
pfl->blk_offset = -1;
}
/* discard block update buffer changes */
static void pflash_blk_write_abort(PFlashCFI01 *pfl)
{
trace_pflash_write_block_abort(pfl->name);
pfl->blk_offset = -1;
}
static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset,
uint32_t value, int width, int be)
{
uint8_t *p;
uint8_t *p = pfl->storage;
if (pfl->blk_offset != -1) {
/* block write: redirect writes to block update buffer */
if ((offset < pfl->blk_offset) ||
(offset + width > pfl->blk_offset + pfl->writeblock_size)) {
pfl->status |= 0x10; /* Programming error */
return;
trace_pflash_data_write(pfl->name, offset, width, value, pfl->counter);
switch (width) {
case 1:
p[offset] = value;
break;
case 2:
if (be) {
p[offset] = value >> 8;
p[offset + 1] = value;
} else {
p[offset] = value;
p[offset + 1] = value >> 8;
}
trace_pflash_data_write_block(pfl->name, offset, width, value,
pfl->counter);
p = pfl->blk_bytes + (offset - pfl->blk_offset);
} else {
/* write directly to storage */
trace_pflash_data_write(pfl->name, offset, width, value);
p = pfl->storage + offset;
break;
case 4:
if (be) {
p[offset] = value >> 24;
p[offset + 1] = value >> 16;
p[offset + 2] = value >> 8;
p[offset + 3] = value;
} else {
p[offset] = value;
p[offset + 1] = value >> 8;
p[offset + 2] = value >> 16;
p[offset + 3] = value >> 24;
}
break;
}
if (be) {
stn_be_p(p, width, value);
} else {
stn_le_p(p, width, value);
}
}
static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
@@ -572,9 +548,9 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
} else {
value = extract32(value, 0, pfl->bank_width * 8);
}
trace_pflash_write_block(pfl->name, value);
pfl->counter = value;
pfl->wcycle++;
pflash_blk_write_start(pfl, offset);
break;
case 0x60:
if (cmd == 0xd0) {
@@ -605,7 +581,12 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
switch (pfl->cmd) {
case 0xe8: /* Block write */
/* FIXME check @offset, @width */
if (!pfl->ro && (pfl->blk_offset != -1)) {
if (!pfl->ro) {
/*
* FIXME writing straight to memory is *wrong*. We
* should write to a buffer, and flush it to memory
* only on confirm command (see below).
*/
pflash_data_write(pfl, offset, value, width, be);
} else {
pfl->status |= 0x10; /* Programming error */
@@ -614,8 +595,18 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
pfl->status |= 0x80;
if (!pfl->counter) {
hwaddr mask = pfl->writeblock_size - 1;
mask = ~mask;
trace_pflash_write(pfl->name, "block write finished");
pfl->wcycle++;
if (!pfl->ro) {
/* Flush the entire write buffer onto backing storage. */
/* FIXME premature! */
pflash_update(pfl, offset & mask, pfl->writeblock_size);
} else {
pfl->status |= 0x10; /* Programming error */
}
}
pfl->counter--;
@@ -627,17 +618,20 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
case 3: /* Confirm mode */
switch (pfl->cmd) {
case 0xe8: /* Block write */
if ((cmd == 0xd0) && !(pfl->status & 0x10)) {
pflash_blk_write_flush(pfl);
if (cmd == 0xd0) {
/* FIXME this is where we should write out the buffer */
pfl->wcycle = 0;
pfl->status |= 0x80;
} else {
pflash_blk_write_abort(pfl);
qemu_log_mask(LOG_UNIMP,
"%s: Aborting write to buffer not implemented,"
" the data is already written to storage!\n"
"Flash device reset into READ mode.\n",
__func__);
goto mode_read_array;
}
break;
default:
pflash_blk_write_abort(pfl);
goto error_flash;
}
break;
@@ -871,9 +865,6 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
pfl->cmd = 0x00;
pfl->status = 0x80; /* WSM ready */
pflash_cfi01_fill_cfi_table(pfl);
pfl->blk_bytes = g_malloc(pfl->writeblock_size);
pfl->blk_offset = -1;
}
static void pflash_cfi01_system_reset(DeviceState *dev)
@@ -893,8 +884,6 @@ static void pflash_cfi01_system_reset(DeviceState *dev)
* This model deliberately ignores this delay.
*/
pfl->status = 0x80;
pfl->blk_offset = -1;
}
static Property pflash_cfi01_properties[] = {

View File

@@ -546,7 +546,7 @@ static void pflash_write(void *opaque, hwaddr offset, uint64_t value,
}
goto reset_flash;
}
trace_pflash_data_write(pfl->name, offset, width, value);
trace_pflash_data_write(pfl->name, offset, width, value, 0);
if (!pfl->ro) {
p = (uint8_t *)pfl->storage + offset;
if (pfl->be) {

View File

@@ -12,8 +12,7 @@ fdctrl_tc_pulse(int level) "TC pulse: %u"
pflash_chip_erase_invalid(const char *name, uint64_t offset) "%s: chip erase: invalid address 0x%" PRIx64
pflash_chip_erase_start(const char *name) "%s: start chip erase"
pflash_data_read(const char *name, uint64_t offset, unsigned size, uint32_t value) "%s: data offset:0x%04"PRIx64" size:%u value:0x%04x"
pflash_data_write(const char *name, uint64_t offset, unsigned size, uint32_t value) "%s: data offset:0x%04"PRIx64" size:%u value:0x%04x"
pflash_data_write_block(const char *name, uint64_t offset, unsigned size, uint32_t value, uint64_t counter) "%s: data offset:0x%04"PRIx64" size:%u value:0x%04x counter:0x%016"PRIx64
pflash_data_write(const char *name, uint64_t offset, unsigned size, uint32_t value, uint64_t counter) "%s: data offset:0x%04"PRIx64" size:%u value:0x%04x counter:0x%016"PRIx64
pflash_device_id(const char *name, uint16_t id) "%s: read device ID: 0x%04x"
pflash_device_info(const char *name, uint64_t offset) "%s: read device information offset:0x%04" PRIx64
pflash_erase_complete(const char *name) "%s: sector erase complete"
@@ -33,9 +32,7 @@ pflash_unlock0_failed(const char *name, uint64_t offset, uint8_t cmd, uint16_t a
pflash_unlock1_failed(const char *name, uint64_t offset, uint8_t cmd) "%s: unlock0 failed 0x%" PRIx64 " 0x%02x"
pflash_unsupported_device_configuration(const char *name, uint8_t width, uint8_t max) "%s: unsupported device configuration: device_width:%d max_device_width:%d"
pflash_write(const char *name, const char *str) "%s: %s"
pflash_write_block_start(const char *name, uint32_t value) "%s: block write start: bytes:0x%x"
pflash_write_block_flush(const char *name) "%s: block write flush"
pflash_write_block_abort(const char *name) "%s: block write abort"
pflash_write_block(const char *name, uint32_t value) "%s: block write: bytes:0x%x"
pflash_write_block_erase(const char *name, uint64_t offset, uint64_t len) "%s: block erase offset:0x%" PRIx64 " bytes:0x%" PRIx64
pflash_write_failed(const char *name, uint64_t offset, uint8_t cmd) "%s: command failed 0x%" PRIx64 " 0x%02x"
pflash_write_invalid(const char *name, uint8_t cmd) "%s: invalid write for command 0x%02x"

View File

@@ -326,6 +326,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
if (s->connected) {
return 0;
}
s->connected = true;
s->dev.num_queues = s->num_queues;
s->dev.nvqs = s->num_queues;
@@ -342,14 +343,15 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
return ret;
}
s->connected = true;
/* restore vhost state */
if (virtio_device_started(vdev, vdev->status)) {
ret = vhost_user_blk_start(vdev, errp);
if (ret < 0) {
return ret;
}
}
return ret;
return 0;
}
static void vhost_user_blk_disconnect(DeviceState *dev)

View File

@@ -91,27 +91,9 @@ static bool xen_block_find_free_vdev(XenBlockDevice *blockdev, Error **errp)
existing_frontends = qemu_xen_xs_directory(xenbus->xsh, XBT_NULL, fe_path,
&nr_existing);
if (!existing_frontends) {
if (errno == ENOENT) {
/*
* If the frontend directory doesn't exist because there are
* no existing vbd devices, that's fine. Just ensure that we
* don't dereference the NULL existing_frontends pointer, by
* checking that nr_existing is zero so the loop below is not
* entered.
*
* In fact this is redundant since nr_existing is initialized
* to zero, but setting it again here makes it abundantly clear
* to Coverity, and to the human reader who doesn't know the
* semantics of qemu_xen_xs_directory() off the top of their
* head.
*/
nr_existing = 0;
} else {
/* All other errors accessing the frontend directory are fatal. */
error_setg_errno(errp, errno, "cannot read %s", fe_path);
return false;
}
if (!existing_frontends && errno != ENOENT) {
error_setg_errno(errp, errno, "cannot read %s", fe_path);
return false;
}
memset(used_devs, 0, sizeof(used_devs));

View File

@@ -505,7 +505,7 @@ ssize_t load_elf_ram_sym(const char *filename,
clear_lsb, data_swab, as, load_rom, sym_cb);
}
if (ret > 0) {
if (ret != ELF_LOAD_FAILED) {
debuginfo_report_elf(filename, fd, 0);
}

View File

@@ -1189,6 +1189,11 @@ bool machine_mem_merge(MachineState *machine)
return machine->mem_merge;
}
bool machine_require_guest_memfd(MachineState *machine)
{
return machine->require_guest_memfd;
}
static char *cpu_slot_to_string(const CPUArchId *cpu)
{
GString *s = g_string_new(NULL);

View File

@@ -689,36 +689,23 @@ static void get_prop_array(Object *obj, Visitor *v, const char *name,
Property *prop = opaque;
uint32_t *alenptr = object_field_prop_ptr(obj, prop);
void **arrayptr = (void *)obj + prop->arrayoffset;
char *elemptr = *arrayptr;
ArrayElementList *list = NULL, *elem;
ArrayElementList **tail = &list;
const size_t size = sizeof(*list);
char *elem = *arrayptr;
GenericList *list;
const size_t list_elem_size = sizeof(*list) + prop->arrayfieldsize;
int i;
bool ok;
/* At least the string output visitor needs a real list */
for (i = 0; i < *alenptr; i++) {
elem = g_new0(ArrayElementList, 1);
elem->value = elemptr;
elemptr += prop->arrayfieldsize;
*tail = elem;
tail = &elem->next;
}
if (!visit_start_list(v, name, (GenericList **) &list, size, errp)) {
if (!visit_start_list(v, name, &list, list_elem_size, errp)) {
return;
}
elem = list;
while (elem) {
Property elem_prop = array_elem_prop(obj, prop, name, elem->value);
for (i = 0; i < *alenptr; i++) {
Property elem_prop = array_elem_prop(obj, prop, name, elem);
prop->arrayinfo->get(obj, v, NULL, &elem_prop, errp);
if (*errp) {
goto out_obj;
}
elem = (ArrayElementList *) visit_next_list(v, (GenericList*) elem,
size);
elem += prop->arrayfieldsize;
}
/* visit_check_list() can only fail for input visitors */
@@ -727,12 +714,6 @@ static void get_prop_array(Object *obj, Visitor *v, const char *name,
out_obj:
visit_end_list(v, (void**) &list);
while (list) {
elem = list;
list = elem->next;
g_free(elem);
}
}
static void default_prop_array(ObjectProperty *op, const Property *prop)

View File

@@ -81,7 +81,7 @@ static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
return 0;
default:
/*
* In line with specification limitations on access sizes, this
* In line with specifiction limitaions on access sizes, this
* routine is not called with other sizes.
*/
g_assert_not_reached();
@@ -152,7 +152,7 @@ static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
return;
default:
/*
* In line with specification limitations on access sizes, this
* In line with specifiction limitaions on access sizes, this
* routine is not called with other sizes.
*/
g_assert_not_reached();

View File

@@ -431,7 +431,7 @@ static CXLRetCode cmd_identify_switch_device(const struct cxl_cmd *cmd,
out = (struct cxl_fmapi_ident_switch_dev_resp_pl *)payload_out;
*out = (struct cxl_fmapi_ident_switch_dev_resp_pl) {
.num_physical_ports = num_phys_ports + 1, /* 1 USP */
.num_vcss = 1, /* Not yet support multiple VCS - potentially tricky */
.num_vcss = 1, /* Not yet support multiple VCS - potentialy tricky */
.active_vcs_bitmask[0] = 0x1,
.total_vppbs = num_phys_ports + 1,
.bound_vppbs = num_phys_ports + 1,

View File

@@ -33,13 +33,13 @@
/*
* Ref: UG1087 (v1.7) February 8, 2019
* https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers
* https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html
* CSUDMA Module section
*/
REG32(ADDR, 0x0)
FIELD(ADDR, ADDR, 2, 30) /* wo */
REG32(SIZE, 0x4)
FIELD(SIZE, SIZE, 2, 27)
FIELD(SIZE, SIZE, 2, 27) /* wo */
FIELD(SIZE, LAST_WORD, 0, 1) /* rw, only exists in SRC */
REG32(STATUS, 0x8)
FIELD(STATUS, DONE_CNT, 13, 3) /* wtc */
@@ -335,14 +335,10 @@ static uint64_t addr_pre_write(RegisterInfo *reg, uint64_t val)
static uint64_t size_pre_write(RegisterInfo *reg, uint64_t val)
{
XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
uint64_t size = val & R_SIZE_SIZE_MASK;
if (s->regs[R_SIZE] != 0) {
if (size || s->is_dst) {
qemu_log_mask(LOG_GUEST_ERROR,
"%s: Starting DMA while already running.\n",
__func__);
}
qemu_log_mask(LOG_GUEST_ERROR,
"%s: Starting DMA while already running.\n", __func__);
}
if (!s->is_dst) {
@@ -350,7 +346,7 @@ static uint64_t size_pre_write(RegisterInfo *reg, uint64_t val)
}
/* Size is word aligned */
return size;
return val & R_SIZE_SIZE_MASK;
}
static uint64_t size_post_read(RegisterInfo *reg, uint64_t val)

View File

@@ -36,8 +36,8 @@
#define MIN_SEABIOS_HPPA_VERSION 12 /* require at least this fw version */
#define HPA_POWER_BUTTON (FIRMWARE_END - 0x10)
static hwaddr soft_power_reg;
/* Power button address at &PAGE0->pad[4] */
#define HPA_POWER_BUTTON (0x40 + 4 * sizeof(uint32_t))
#define enable_lasi_lan() 0
@@ -45,6 +45,7 @@ static DeviceState *lasi_dev;
static void hppa_powerdown_req(Notifier *n, void *opaque)
{
hwaddr soft_power_reg = HPA_POWER_BUTTON;
uint32_t val;
val = ldl_be_phys(&address_space_memory, soft_power_reg);
@@ -220,7 +221,7 @@ static FWCfgState *create_fw_cfg(MachineState *ms, PCIBus *pci_bus,
fw_cfg_add_file(fw_cfg, "/etc/hppa/machine",
g_memdup(mc->name, len), len);
val = cpu_to_le64(soft_power_reg);
val = cpu_to_le64(HPA_POWER_BUTTON);
fw_cfg_add_file(fw_cfg, "/etc/hppa/power-button-addr",
g_memdup(&val, sizeof(val)), sizeof(val));
@@ -275,7 +276,6 @@ static TranslateFn *machine_HP_common_init_cpus(MachineState *machine)
unsigned int smp_cpus = machine->smp.cpus;
TranslateFn *translate;
MemoryRegion *cpu_region;
uint64_t ram_max;
/* Create CPUs. */
for (unsigned int i = 0; i < smp_cpus; i++) {
@@ -288,14 +288,10 @@ static TranslateFn *machine_HP_common_init_cpus(MachineState *machine)
*/
if (hppa_is_pa20(&cpu[0]->env)) {
translate = translate_pa20;
ram_max = 0xf0000000; /* 3.75 GB (limited by 32-bit firmware) */
} else {
translate = translate_pa10;
ram_max = 0xf0000000; /* 3.75 GB (32-bit CPU) */
}
soft_power_reg = translate(NULL, HPA_POWER_BUTTON);
for (unsigned int i = 0; i < smp_cpus; i++) {
g_autofree char *name = g_strdup_printf("cpu%u-io-eir", i);
@@ -315,9 +311,9 @@ static TranslateFn *machine_HP_common_init_cpus(MachineState *machine)
cpu_region);
/* Main memory region. */
if (machine->ram_size > ram_max) {
info_report("Max RAM size limited to %" PRIu64 " MB", ram_max / MiB);
machine->ram_size = ram_max;
if (machine->ram_size > 3 * GiB) {
error_report("RAM size is currently restricted to 3GB");
exit(EXIT_FAILURE);
}
memory_region_add_subregion_overlap(addr_space, 0, machine->ram, -1);
@@ -347,10 +343,8 @@ static void machine_HP_common_init_tail(MachineState *machine, PCIBus *pci_bus,
SysBusDevice *s;
/* SCSI disk setup. */
if (drive_get_max_bus(IF_SCSI) >= 0) {
dev = DEVICE(pci_create_simple(pci_bus, -1, "lsi53c895a"));
lsi53c8xx_handle_legacy_cmdline(dev);
}
dev = DEVICE(pci_create_simple(pci_bus, -1, "lsi53c895a"));
lsi53c8xx_handle_legacy_cmdline(dev);
/* Graphics setup. */
if (machine->enable_graphics && vga_interface_type != VGA_NONE) {
@@ -363,7 +357,7 @@ static void machine_HP_common_init_tail(MachineState *machine, PCIBus *pci_bus,
}
/* Network setup. */
if (nd_table[0].used && enable_lasi_lan()) {
if (enable_lasi_lan()) {
lasi_82596_init(addr_space, translate(NULL, LASI_LAN_HPA),
qdev_get_gpio_in(lasi_dev, LASI_IRQ_LAN_HPA));
}
@@ -388,7 +382,7 @@ static void machine_HP_common_init_tail(MachineState *machine, PCIBus *pci_bus,
pci_set_word(&pci_dev->config[PCI_SUBSYSTEM_ID], 0x1227); /* Powerbar */
/* create a second serial PCI card when running Astro */
if (serial_hd(1) && !lasi_dev) {
if (!lasi_dev) {
pci_dev = pci_new(-1, "pci-serial-4x");
qdev_prop_set_chr(DEVICE(pci_dev), "chardev1", serial_hd(1));
qdev_prop_set_chr(DEVICE(pci_dev), "chardev2", serial_hd(2));
@@ -678,18 +672,19 @@ static void hppa_nmi(NMIState *n, int cpu_index, Error **errp)
}
}
static const char *HP_B160L_machine_valid_cpu_types[] = {
TYPE_HPPA_CPU,
NULL
};
static void HP_B160L_machine_init_class_init(ObjectClass *oc, void *data)
{
static const char * const valid_cpu_types[] = {
TYPE_HPPA_CPU,
NULL
};
MachineClass *mc = MACHINE_CLASS(oc);
NMIClass *nc = NMI_CLASS(oc);
mc->desc = "HP B160L workstation";
mc->default_cpu_type = TYPE_HPPA_CPU;
mc->valid_cpu_types = valid_cpu_types;
mc->valid_cpu_types = HP_B160L_machine_valid_cpu_types;
mc->init = machine_HP_B160L_init;
mc->reset = hppa_machine_reset;
mc->block_default_type = IF_SCSI;
@@ -714,18 +709,19 @@ static const TypeInfo HP_B160L_machine_init_typeinfo = {
},
};
static const char *HP_C3700_machine_valid_cpu_types[] = {
TYPE_HPPA64_CPU,
NULL
};
static void HP_C3700_machine_init_class_init(ObjectClass *oc, void *data)
{
static const char * const valid_cpu_types[] = {
TYPE_HPPA64_CPU,
NULL
};
MachineClass *mc = MACHINE_CLASS(oc);
NMIClass *nc = NMI_CLASS(oc);
mc->desc = "HP C3700 workstation";
mc->default_cpu_type = TYPE_HPPA64_CPU;
mc->valid_cpu_types = valid_cpu_types;
mc->valid_cpu_types = HP_C3700_machine_valid_cpu_types;
mc->init = machine_HP_C3700_init;
mc->reset = hppa_machine_reset;
mc->block_default_type = IF_SCSI;

View File

@@ -10,6 +10,11 @@ config SGX
bool
depends on KVM
config TDX
bool
select X86_FW_OVMF
depends on KVM
config PC
bool
imply APPLESMC
@@ -26,6 +31,7 @@ config PC
imply QXL
imply SEV
imply SGX
imply TDX
imply TEST_DEVICES
imply TPM_CRB
imply TPM_TIS_ISA

View File

@@ -975,7 +975,8 @@ static void build_dbg_aml(Aml *table)
aml_append(table, scope);
}
static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg,
bool level_trigger_unsupported)
{
Aml *dev;
Aml *crs;
@@ -987,7 +988,10 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
crs = aml_resource_template();
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH,
AML_SHARED, irqs, ARRAY_SIZE(irqs)));
aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1011,7 +1015,8 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
return dev;
}
static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
static Aml *build_gsi_link_dev(const char *name, uint8_t uid,
uint8_t gsi, bool level_trigger_unsupported)
{
Aml *dev;
Aml *crs;
@@ -1024,7 +1029,10 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
crs = aml_resource_template();
irqs = gsi;
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH,
AML_SHARED, &irqs, 1));
aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1043,7 +1051,7 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
}
/* _CRS method - get current settings */
static Aml *build_iqcr_method(bool is_piix4)
static Aml *build_iqcr_method(bool is_piix4, bool level_trigger_unsupported)
{
Aml *if_ctx;
uint32_t irqs;
@@ -1051,7 +1059,9 @@ static Aml *build_iqcr_method(bool is_piix4)
Aml *crs = aml_resource_template();
irqs = 0;
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH, AML_SHARED, &irqs, 1));
aml_append(method, aml_name_decl("PRR0", crs));
@@ -1085,7 +1095,7 @@ static Aml *build_irq_status_method(void)
return method;
}
static void build_piix4_pci0_int(Aml *table)
static void build_piix4_pci0_int(Aml *table, bool level_trigger_unsupported)
{
Aml *dev;
Aml *crs;
@@ -1098,12 +1108,16 @@ static void build_piix4_pci0_int(Aml *table)
aml_append(sb_scope, pci0_scope);
aml_append(sb_scope, build_irq_status_method());
aml_append(sb_scope, build_iqcr_method(true));
aml_append(sb_scope, build_iqcr_method(true, level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0")));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1")));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2")));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3")));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3"),
level_trigger_unsupported));
dev = aml_device("LNKS");
{
@@ -1112,7 +1126,9 @@ static void build_piix4_pci0_int(Aml *table)
crs = aml_resource_template();
irqs = 9;
aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
aml_append(crs, aml_interrupt(AML_CONSUMER,
level_trigger_unsupported ?
AML_EDGE : AML_LEVEL,
AML_ACTIVE_HIGH, AML_SHARED,
&irqs, 1));
aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1198,7 +1214,7 @@ static Aml *build_q35_routing_table(const char *str)
return pkg;
}
static void build_q35_pci0_int(Aml *table)
static void build_q35_pci0_int(Aml *table, bool level_trigger_unsupported)
{
Aml *method;
Aml *sb_scope = aml_scope("_SB");
@@ -1237,25 +1253,41 @@ static void build_q35_pci0_int(Aml *table)
aml_append(sb_scope, pci0_scope);
aml_append(sb_scope, build_irq_status_method());
aml_append(sb_scope, build_iqcr_method(false));
aml_append(sb_scope, build_iqcr_method(false, level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA")));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB")));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC")));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD")));
aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE")));
aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF")));
aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG")));
aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH")));
aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG"),
level_trigger_unsupported));
aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH"),
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10));
aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11));
aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12));
aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13));
aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14));
aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15));
aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16));
aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17));
aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16,
level_trigger_unsupported));
aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17,
level_trigger_unsupported));
aml_append(table, sb_scope);
}
@@ -1436,6 +1468,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
PCMachineState *pcms = PC_MACHINE(machine);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
X86MachineState *x86ms = X86_MACHINE(machine);
bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
AcpiMcfgInfo mcfg;
bool mcfg_valid = !!acpi_get_mcfg(&mcfg);
uint32_t nr_mem = machine->ram_slots;
@@ -1468,7 +1501,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
}
build_piix4_pci0_int(dsdt);
build_piix4_pci0_int(dsdt, level_trigger_unsupported);
} else if (q35) {
sb_scope = aml_scope("_SB");
dev = aml_device("PCI0");
@@ -1512,7 +1545,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
if (pm->pcihp_bridge_en) {
build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
}
build_q35_pci0_int(dsdt);
build_q35_pci0_int(dsdt, level_trigger_unsupported);
}
if (misc->has_hpet) {

View File

@@ -103,6 +103,7 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(x86ms));
AcpiTable table = { .sig = "APIC", .rev = 3, .oem_id = oem_id,
.oem_table_id = oem_table_id };
bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
acpi_table_begin(&table, table_data);
/* Local APIC Address */
@@ -122,18 +123,43 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
IO_APIC_SECONDARY_ADDRESS, IO_APIC_SECONDARY_IRQBASE);
}
if (x86ms->apic_xrupt_override) {
build_xrupt_override(table_data, 0, 2,
0 /* Flags: Conforms to the specifications of the bus */);
}
for (i = 1; i < 16; i++) {
if (!(x86ms->pci_irq_mask & (1 << i))) {
/* No need for a INT source override structure. */
continue;
if (level_trigger_unsupported) {
/* Force edge trigger */
if (x86ms->apic_xrupt_override) {
build_xrupt_override(table_data, 0, 2,
/* Flags: active high, edge triggered */
1 | (1 << 2));
}
for (i = x86ms->apic_xrupt_override ? 1 : 0; i < 16; i++) {
build_xrupt_override(table_data, i, i,
/* Flags: active high, edge triggered */
1 | (1 << 2));
}
if (x86ms->ioapic2) {
for (i = 0; i < 16; i++) {
build_xrupt_override(table_data, IO_APIC_SECONDARY_IRQBASE + i,
IO_APIC_SECONDARY_IRQBASE + i,
/* Flags: active high, edge triggered */
1 | (1 << 2));
}
}
} else {
if (x86ms->apic_xrupt_override) {
build_xrupt_override(table_data, 0, 2,
0 /* Flags: Conforms to the specifications of the bus */);
}
for (i = 1; i < 16; i++) {
if (!(x86ms->pci_irq_mask & (1 << i))) {
/* No need for a INT source override structure. */
continue;
}
build_xrupt_override(table_data, i, i,
0xd /* Flags: Active high, Level Triggered */);
}
build_xrupt_override(table_data, i, i,
0xd /* Flags: Active high, Level Triggered */);
}
if (x2apic_mode) {

View File

@@ -27,6 +27,7 @@ i386_ss.add(when: 'CONFIG_PC', if_true: files(
'port92.c'))
i386_ss.add(when: 'CONFIG_X86_FW_OVMF', if_true: files('pc_sysfw_ovmf.c'),
if_false: files('pc_sysfw_ovmf-stubs.c'))
i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c', 'tdvf-hob.c'))
subdir('kvm')
subdir('xen')

Some files were not shown because too many files have changed in this diff.