This fixes timekeeping of x86-64 Darwin/OS X/macOS guests when using KVM.
Darwin/OS X/macOS for x86-64 uses the TSC for timekeeping; it normally calibrates this by querying various clock frequency scaling MSRs. Details depend on the exact CPU model detected. The local APIC timer frequency is extracted from (EFI) firmware.
This is problematic in the presence of virtualisation, as the MSRs in question are typically not handled by the hypervisor. VMWare (Fusion) advertises TSC and APIC frequency via a custom 0x40000010 CPUID leaf, in the eax and ebx registers respectively. This is documented at https://lwn.net/Articles/301888/ among other places.
Darwin/OS X/macOS looks for the generic 0x40000000 hypervisor leaf, and if this indicates via eax that leaf 0x40000010 might be available, that is in turn queried for the two frequencies.
This adds a CPU option "vmware-cpuid-freq" to enable the same behaviour when running Qemu with KVM acceleration, if the KVM TSC frequency can be determined, and it is stable. (invtsc or user-specified) The virtualised APIC bus cycle is hardcoded to 1GHz in KVM, so ebx of the CPUID leaf is also hardcoded to this value.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1484921496-11257-2-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch improves interrupt handling in record/replay mode.
Now "interrupt" event is saved only when cc->cpu_exec_interrupt returns true.
This patch also adds missing return to cpu_exec_interrupt function.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170124071708.4572.64023.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For M profile (unlike A profile) the reset value of R14 is specified
as 0xffffffff. (The rationale is that this is an illegal exception
return value, so if guest code tries to return to it it will result
in a helpful exception.)
Registers r0 to r12 and the flags are architecturally UNKNOWN on
reset, so we leave those at zero.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1485285380-10565-11-git-send-email-peter.maydell@linaro.org
For M profile CPUs, FAULTMASK should be 0 on reset, like PRIMASK.
QEMU stores FAULTMASK in the PSTATE F bit, so (as with PRIMASK in the
I bit) we have to clear these to undo the A profile default of 1.
Update the comment accordingly and move it so that it's closer to the
code it's referring to.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1485285380-10565-10-git-send-email-peter.maydell@linaro.org
[PMM: rewrote commit message, moved comments]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For v7M attempts to access a nonexistent coprocessor are reported
differently from plain undefined instructions (as UsageFaults of type
NOCP rather than type UNDEFINSTR). Split them out into a new
EXCP_NOCP so we can report the FSR value correctly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1485285380-10565-8-git-send-email-peter.maydell@linaro.org
The v7m CONTROL register bit 1 is SPSEL, which indicates
the stack being used. We were storing this information
not in v7m.control but in the separate v7m.other_sp
structure field. Unfortunately, the code handling reads
of the CONTROL register didn't take account of this, and
so if SPSEL was updated by an exception entry or exit then
a subsequent guest read of CONTROL would get the wrong value.
Using a separate structure field doesn't really gain us
anything in efficiency, so drop this unnecessary complexity
in favour of simply storing all the bits in v7m.control.
This is a migration compatibility break for M profile
CPUs only.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1484937883-1068-6-git-send-email-peter.maydell@linaro.org
[PMM: rewrote commit message;
use deposit32(); use FIELD to define constants for
masking and shifting of CONTROL register fields
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Give an explicit error and abort when a load
from the vector table fails. Architecturally this
should HardFault (which will then immediately
fail to load the HardFault vector and go into Lockup).
Since we don't model Lockup, just report this guest
error via cpu_abort(). This is more helpful than the
previous behaviour of reading a zero, which is the
address of the reset stack pointer and not a sensible
location to jump to.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1484937883-1068-4-git-send-email-peter.maydell@linaro.org
[PMM: expanded commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For v7m we need to catch attempts to execute from special
addresses at 0xfffffff0 and above. Previously we did this
with the aid of a hacky special purpose lump of memory
in the address space and a check in translate.c for whether
we were translating code at those addresses.
We can implement this more cleanly using a CPU
unassigned access handler which throws the exception
if the unassigned access is for one of the special addresses.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1484937883-1068-3-git-send-email-peter.maydell@linaro.org
[PMM:
* drop the deletion of the "don't interrupt if PC is magic"
code in arm_v7m_cpu_exec_interrupt() -- this is still
required
* don't generate an exception for unassigned accesses
which aren't to the magic address -- although doing
this is in theory correct in practice it will break
currently working guests which rely on the RAZ/WI
behaviour when they touch devices which we haven't
modelled.
* trigger EXCP_EXCEPTION_EXIT on is_exec, not !is_write
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The MRS and MSR instruction handling has a number of flaws:
* unprivileged accesses should only be able to read
CONTROL and the xPSR subfields, and only write APSR
(others RAZ/WI)
* privileged access should not be able to write xPSR
subfields other than APSR
* accesses to unimplemented registers should log as
guest errors, not abort QEMU
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1484937883-1068-2-git-send-email-peter.maydell@linaro.org
[PMM: rewrote commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration
1 My maintainer change
2 Jianjun's qtailq
3 Ashijeet's only-migratable
4 Zhanghailiang's re-active images
5 Pankaj's change name of migration thread
6 My PCI migration merge
7 Juan's debug to tracing
8 My tracing on save
# gpg: Signature made Tue 24 Jan 2017 18:39:35 GMT
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170124b:
migration/tracing: Add tracing on save
migration: transform remaining DPRINTF into trace_
PCI/migration merge vmstate_pci_device and vmstate_pcie_device
migration: Change name of live migration thread
migration: re-active images while migration been canceled after inactive them
migration: Fail migration blocker for --only-migratable
migration: disallow migrate_add_blocker during migration
migration: Allow "device add" options to only add migratable devices
migration: Add a new option to enable only-migratable
block/vvfat: Remove the undesirable comment
migration: add error_report
tests/migration: Add test for QTAILQ migration
migration: migrate QTAILQ
migration: extend VMStateInfo
MAINTAINERS: Add myself as a migration submaintainer
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Two s390x fixes: One for the kvm.c build failure, and one for a bug
that might cause random guest crashes with zeroed out pages on host
kernels with working cmma (< 4.6 and likely >= 4.10).
# gpg: Signature made Tue 24 Jan 2017 15:00:50 GMT
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170124:
s390x/kvm: fix cmma reset for KVM
s390x/kvm: include hw_accel.h instead of kvm.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
x86, machine, numa queue (2017-01-23)
# gpg: Signature made Mon 23 Jan 2017 23:26:59 GMT
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-and-machine-pull-request:
kvm: Allow invtsc migration if tsc-khz is set explicitly
kvm: Simplify invtsc check
hw/core/null-machine: Add the possibility to instantiate a CPU and RAM
qemu-options: Rename variables on the -numa "cpus" option
MAINTAINERS: Add an entry for hw/core/null-machine.c
machine: Make possible_cpu_arch_ids() return const pointer
pc: don't return cpu pointer from pc_new_cpu() as it's not needed anymore
pc: cleanup: move smbios_set_cpuid() into pc_build_smbios()
arch_init: Remove unnecessary default_config_files table
vl: Ensure the numa_post_machine_init func in the appropriate location
i386: Return migration-safe field on query-cpu-definitions
i386: Remove AMD feature flag aliases from Opteron models
x86: add AVX512_VPOPCNTDQ features
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We must reset the CMMA states for normal memory (when not on mem path),
but the current code does the opposite. This was unnoticed for some time
as the kernel since 4.6 also had a bug which mostly disabled the paging
optimizations.
Fixes: 07059effd1 ("s390x/kvm: let the CPU model control CMM(A)")
Cc: qemu-stable@nongnu.org # v2.8
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Commit b394662 ("kvm: move cpu synchronization code") switched
to hw_accel.h instead of kvm.h, but missed s390x, resulting in
CC s390x-softmmu/target/s390x/kvm.o
/home/cohuck/git/qemu/target/s390x/kvm.c: In function ‘kvm_sclp_service_call’:
/home/cohuck/git/qemu/target/s390x/kvm.c:1034:5: error: implicit declaration of function ‘cpu_synchronize_state’ [-Werror=implicit-function-declaration]
cpu_synchronize_state(CPU(cpu));
^
/home/cohuck/git/qemu/target/s390x/kvm.c:1034:5: error: nested extern declaration of ‘cpu_synchronize_state’ [-Werror=nested-externs]
/home/cohuck/git/qemu/target/s390x/kvm.c: In function ‘sigp_initial_cpu_reset’:
/home/cohuck/git/qemu/target/s390x/kvm.c:1628:5: error: implicit declaration of function ‘cpu_synchronize_post_reset’ [-Werror=implicit-function-declaration]
cpu_synchronize_post_reset(cs);
^
/home/cohuck/git/qemu/target/s390x/kvm.c:1628:5: error: nested extern declaration of ‘cpu_synchronize_post_reset’ [-Werror=nested-externs]
/home/cohuck/git/qemu/target/s390x/kvm.c: In function ‘sigp_set_prefix’:
/home/cohuck/git/qemu/target/s390x/kvm.c:1665:5: error: implicit declaration of function ‘cpu_synchronize_post_init’ [-Werror=implicit-function-declaration]
cpu_synchronize_post_init(cs);
^
/home/cohuck/git/qemu/target/s390x/kvm.c:1665:5: error: nested extern declaration of ‘cpu_synchronize_post_init’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
/home/cohuck/git/qemu/rules.mak:64: recipe for target 'target/s390x/kvm.o' failed
Fix this.
Fixes: b394662 ("kvm: move cpu synchronization code")
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Vincent Palatin <vpalatin@chromium.org>
We can safely allow a VM to be migrated with invtsc enabled if
tsc-khz is set explicitly, because:
* QEMU already refuses to start if it can't set the TSC frequency
to the configured value.
* Management software is already required to keep device
configuration (including CPU configuration) the same on
migration source and destination.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170108173234.25721-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When CPU vendor is set to AMD, the AMD feature alias bits on
CPUID[0x80000001].EDX are already automatically copied from CPUID[1].EDX
on x86_cpu_realizefn(). When CPU vendor is Intel, those bits are
reserved and should be zero. On either case, those bits shouldn't be set
in the CPU model table.
Commit 726a8ff686 removed those
bits from most CPU models, but the Opteron_* entries still have
them. Remove the alias bits from Opteron_* too.
Add an assert() to x86_register_cpudef_type() to ensure we don't
make the same mistake again.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170113190057.6327-1-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
For linux, page 0 is mapped as an execute-only gateway. A gateway
page is a special bit in the page table that allows a B,GATE insn
within that page to raise processor permissions. This is how system
calls are implemented for HPPA.
Rather than actually map anything here, or handle permissions at all,
implement the semantics of the actual linux syscall entry points.
Signed-off-by: Richard Henderson <rth@twiddle.net>
The HPPA cpu has a unique form of predicated execution in which
almost any instruction can set the PSW[N] (or "nullify") bit,
which suppresses execution (and even decoding) of the following
instruction. Execution of a nullified insn clears the PSW[N] bit.
This adds a generic framework for branching over nullified insns,
or for sufficiently simple insns, transforming the writeback of
the result to a conditional move. In the process, we want to be
able to represent PSW[N] as a TCG condition, which implies management
of the related tcg temps.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is just about the minimum required to enable compilation
without actually executing any instructions. This contains the
HPPACPU structure and the required callbacks, the gdbstub, the
basic translation loop, and a translate_one function that always
results in an illegal instruction.
Signed-off-by: Richard Henderson <rth@twiddle.net>
First set of s390x patches for 2.9:
- rework of the zpci code, giving us proper multibus support
- introduction of the 2.9 machine
- fixes and improvements
# gpg: Signature made Fri 20 Jan 2017 09:11:58 GMT
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170120-v2:
virtio-ccw: fix ring sizing
s390x/pci: merge msix init functions
s390x/pci: handle PCIBridge bus number
s390x/pci: use hashtable to look up zpci via fh
s390x/pci: PCI multibus bridge handling
s390x/pci: optimize calling s390_get_phb()
s390x/pci: change the device array to a list
s390x/pci: dynamically allocate iommu
s390x/pci: make S390PCIIOMMU inherit Object
s390x/kvm: use kvm_gsi_routing_enabled in flic
s390x: add compat machine for 2.9
s390x: remove double compat statement
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Enable the ARM_FEATURE_EL2 bit on Cortex-A52 and
Cortex-A57, since this is all now sufficiently implemented
to work with the GICv3. We provide the usual CPU property
to disable it for backwards compatibility with the older
virt boards.
In this commit, we disable the EL2 feature on the
virt and ZynpMP boards, so there is no overall effect.
Another commit will expose a board-level property to
allow the user to enable EL2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1483977924-14522-18-git-send-email-peter.maydell@linaro.org
The PSCI spec states that a CPU_ON call should cause the new
CPU to be started in the highest implemented Non-secure
exception level. We were incorrectly starting it at the
exception level of the caller, which happens to be correct
if EL2 is not implemented. Implement the correct logic
as described in the PSCI 1.0 spec section 6.4:
* if EL2 exists and SCR_EL3.HCE is set: start in EL2
* otherwise start in EL1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>
Message-id: 1483977924-14522-17-git-send-email-peter.maydell@linaro.org
The GICv3 support for virtualization includes an outbound
maintenance interrupt signal which is asserted when the
CPU interface wants to signal to the hypervisor that it
needs attention. Expose this as an outbound GPIO line from
the CPU object which can be wired up as a physical interrupt
line by the board code (as we do already for the CPU timers).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1483977924-14522-4-git-send-email-peter.maydell@linaro.org
The DBGVCR_EL2 system register is needed to run a 32-bit
EL1 guest under a Linux EL2 64-bit hypervisor. Its only
purpose is to provide AArch64 with access to the state of
the DBGVCR AArch32 register. Since we only have a dummy
DBGVCR, implement a corresponding dummy DBGVCR32_EL2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
To run a VM in 32-bit EL1 our AArch32 interrupt handling code
needs to be able to cope with VIRQ and VFIQ exceptions.
These behave like IRQ and FIQ except that we don't need to try
to route them to Monitor mode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
A function may recursively call device search functions or may call
serveral different device search function. Passing the S390pciState to
search functions as an argument instead of looking up it inside the
search functions lowers the number of calling s390_get_phb().
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>