The semantic difference between the deprecated device_legacy_reset()
function and the newer device_cold_reset() function is that the new
function resets both the device itself and any qbuses it owns,
whereas the legacy function resets just the device itself and nothing
else.
The x86_cpu_after_reset() function uses device_legacy_reset() to reset
the APIC; this is an APICCommonState and does not have any qbuses, so
for this purpose the two functions behave identically and we can stop
using the deprecated one.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221013171926.1447899-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU
assertion failure:
hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed.
This happens both on normal guest reboot or when using "system_reset" HMP
command.
The failing assertion was introduced by commit 64ddecc88b ("hyperv: SControl is optional to enable SynIc")
to catch dangling SINT routes on SynIC reset.
The root cause of this problem is that the SynIC itself is reset before
devices using SINT routes have chance to clean up these routes.
Since there seems to be no existing mechanism to force reset callbacks (or
methods) to be executed in specific order let's use a similar method that
is already used to reset another interrupt controller (APIC) after devices
have been reset - by invoking the SynIC reset from the machine reset
handler via a new x86_cpu_after_reset() function co-located with
the existing x86_cpu_reset() in target/i386/cpu.c.
Opportunistically move the APIC reset handler there, too.
Fixes: 64ddecc88b ("hyperv: SControl is optional to enable SynIc") # exposed the bug
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add support for saving/restoring extended save states when signals
are delivered. This allows using AVX, MPX or PKRU registers in
signal handlers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The MSR_CORE_THREAD_COUNT MSR describes CPU package topology, such as number
of threads and cores for a given package. This is information that QEMU has
readily available and can provide through the new user space MSR deflection
interface.
This patch propagates the existing hvf logic from patch 027ac0cb51
("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to KVM.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-Id: <20221004225643.65036-4-agraf@csgraf.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM has grown support to deflect arbitrary MSRs to user space since
Linux 5.10. For now we don't expect to make a lot of use of this
feature, so let's expose it the easiest way possible: With up to 16
individually maskable MSRs.
This patch adds a kvm_filter_msr() function that other code can call
to install a hook on KVM MSR reads or writes.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-Id: <20221004225643.65036-3-agraf@csgraf.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel CPUs starting with Haswell-E implement a new MSR called
MSR_CORE_THREAD_COUNT which exposes the number of threads and cores
inside of a package.
This MSR is used by XNU to populate internal data structures and not
implementing it prevents virtual machines with more than 1 vCPU from
booting if the emulated CPU generation is at least Haswell-E.
This patch propagates the existing hvf logic from patch 027ac0cb51
("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to TCG.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-Id: <20221004225643.65036-2-agraf@csgraf.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are cases that malicious virtual machine can cause CPU stuck (due
to event windows don't open up), e.g., infinite loop in microcode when
nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and
IRQ) can be delivered. It leads the CPU to be unavailable to host or
other VMs. Notify VM exit is introduced to mitigate such kind of
attacks, which will generate a VM exit if no event window occurs in VM
non-root mode for a specified amount of time (notify window).
A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space
so that the user can query the capability and set the expected notify
window when creating VMs. The format of the argument when enabling this
capability is as follows:
Bit 63:32 - notify window specified in qemu command
Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to
enable the feature.)
Users can configure the feature by a new (x86 only) accel property:
qemu -accel kvm,notify-vmexit=run|internal-error|disable,notify-window=n
The default option of notify-vmexit is run, which will enable the
capability and do nothing if the exit happens. The internal-error option
raises a KVM internal error if it happens. The disable option does not
enable the capability. The default value of notify-window is 0. It is valid
only when notify-vmexit is not disabled. The valid range of notify-window
is non-negative. It is even safe to set it to zero since there's an
internal hardware threshold to be added to ensure no false positive.
Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit
qualification (no cases are anticipated that would set this bit), which
means VM context is corrupted. It would be reflected in the flags of
KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM
internal error unconditionally.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several hypervisor capabilities in KVM are target-specific. When exposed
to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they
should not be available for all targets.
Add a hook for targets to add their own properties to -accel kvm, for
now no such property is defined.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the direct triple faults, i.e. hardware detected and KVM morphed
to VM-Exit, KVM will never lose them. But for triple faults sythesized
by KVM, e.g. the RSM path, if KVM exits to userspace before the request
is serviced, userspace could migrate the VM and lose the triple fault.
A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that
the event.triple_fault_pending field contains a valid state if the
KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-2-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This helps us construct strings elsewhere before echoing to the
monitor. It avoids having to jump through hoops like:
monitor_printf(mon, "%s", s->str);
It will be useful in following patches but for now convert all
existing plain "%s" printfs to use the _puts api.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220929114231.583801-33-alex.bennee@linaro.org>
The availability of tb->pc will shortly be conditional.
Introduce accessor functions to minimize ifdefs.
Pass around a known pc to places like tcg_gen_code,
where the caller must already have the value.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
New KVM_CLOCK flags were added in the kernel.(c68dc1b577eabd5605c6c7c08f3e07ae18d30d5d)
```
+ #define KVM_CLOCK_VALID_FLAGS \
+ (KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
case KVM_CAP_ADJUST_CLOCK:
- r = KVM_CLOCK_TSC_STABLE;
+ r = KVM_CLOCK_VALID_FLAGS;
```
kvm_has_adjust_clock_stable needs to handle additional flags,
so that s->clock_is_reliable can be true and kvmclock_current_nsec doesn't need to be called.
Signed-off-by: Ray Zhang <zhanglei002@gmail.com>
Message-Id: <20220922100523.2362205-1-zhanglei002@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "O" operand type in the Intel SDM needs to load an 8- to 64-bit
unsigned value, while insn_get is limited to 32 bits. Extract the code
out of disas_insn and into a separate function.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
INSERTQ is defined to not modify any bits in the lower 64 bits of the
destination, other than the ones being replaced with bits from the
source operand. QEMU instead is using unshifted bits from the source
for those bits.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SSE4a instructions EXTRQ and INSERTQ have two bit index operands, that can be
immediates or taken from an XMM register. In both cases, the fields are
6-bit wide and the top two bits in the byte are ignored. translate.c is
doing that correctly for the immediate case, but not for the XMM case, so
fix it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>