a4f30719a8, way back in 2007 noted that "PowerPC hypervisor mode is not
fundamentally available only for PowerPC 64" and added a 32-bit version
of the MSR[HV] bit.
But nothing was ever really done with that; there is no meaningful support
for 32-bit hypervisor mode 13 years later. Let's stop pretending and just
remove the stubs.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Server class POWER CPUs have a "compat" property, which was obsoleted
by commit 7843c0d60d and replaced by a "max-cpu-compat" property on the
pseries machine type. A hack was introduced so that passing "compat" to
-cpu would still produce the desired effect, for the sake of backward
compatibility : it strips the "compat" option from the CPU properties
and applies internally it to the pseries machine. The accessors of the
"compat" property were updated to do nothing but warn the user about the
deprecated status when doing something like:
$ qemu-system-ppc64 -global POWER9-family-powerpc64-cpu.compat=power9
qemu-system-ppc64: warning: CPU 'compat' property is deprecated and has no
effect; use max-cpu-compat machine property instead
This was merged during the QEMU 2.10 timeframe, a few weeks before we
formalized our deprecation process. As a consequence, the "compat"
property fell through the cracks and was never listed in the officialy
deprecated features.
We are now eight QEMU versions later, it is largely time to mention it
in qemu-deprecated.texi. Also, since -global XXX-powerpc64-cpu.compat=
has been emitting warnings since QEMU 2.10 and the usual way of setting
CPU properties is with -cpu, completely remove the "compat" property.
Keep the hack so that -cpu XXX,compat= stays functional some more time,
as required by our deprecation process.
The now empty powerpc_servercpu_properties[] list which was introduced
for "compat" and never had any other use is removed on the way. We can
re-add it in the future if the need for a server class POWER CPU specific
property arises again.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158274357799.140275.12263135811731647490.stgit@bahia.lan>
[dwg: Convert from .texi to .rst to match upstream change]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Upon a machine check exception (MCE) in a guest address space,
KVM causes a guest exit to enable QEMU to build and pass the
error to the guest in the PAPR defined rtas error log format.
This patch builds the rtas error log, copies it to the rtas_addr
and then invokes the guest registered machine check handler. The
handler in the guest takes suitable action(s) depending on the type
and criticality of the error. For example, if an error is
unrecoverable memory corruption in an application inside the
guest, then the guest kernel sends a SIGBUS to the application.
For recoverable errors, the guest performs recovery actions and
logs the error.
Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[Assume SLOF has allocated enough room for rtas error log]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-5-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce fwnmi an spapr capability and add a helper function
which tries to enable it, which would be used by following patch
of the series. This patch by itself does not change the existing
behavior.
Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[eliminate cap_ppc_fwnmi, add fwnmi cap to migration state
and reprhase the commit message]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-3-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The privileged message send and clear instructions (msgsndp & msgclrp)
are privileged, but will generate a hypervisor facility unavailable
exception if not enabled in the HFSCR and executed in privileged
non-hypervisor state.
Add checks when accessing the DPDES register and when using the
msgsndp and msgclrp isntructions.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200120104935.24449-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Processor Control facility for POWER8 processors and later
provides a mechanism for the hypervisor to send messages to other
threads in the system (msgsnd instruction) and cause hypervisor-level
exceptions. Privileged non-hypervisor programs can also send messages
(msgsndp instruction) but are restricted to the threads of the same
subprocessor and cause privileged-level exceptions.
The Directed Privileged Doorbell Exception State (DPDES) register
reflects the state of pending privileged doorbell exceptions and can
be used to modify that state. The register can be used to read and
modify the state of privileged doorbell exceptions for all threads of
a subprocessor and thus is a shared facility for that subprocessor.
The register can be read/written by the hypervisor and read by the
supervisor if enabled in the HFSCR, otherwise a hypervisor facility
unavailable exception is generated.
The privileged message send and clear instructions (msgsndp & msgclrp)
are used to generate and clear the presence of a directed privileged
doorbell exception, respectively. The msgsndp instruction can be used
to target any thread of the current subprocessor, msgclrp acts on the
thread issuing the instruction. These instructions are privileged, but
will generate a hypervisor facility unavailable exception if not
enabled in the HFSCR and executed in privileged non-hypervisor
state. The HV facility unavailable exception will be addressed in
other patch.
Add and implement this register and instructions by reading or
modifying the pending interrupt state of the cpu.
Note that TCG only supports one thread per core and so we only need to
worry about the cpu making the access.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200120104935.24449-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We currently search both the root and the tcg/ directories for tcg
files:
$ git grep '#include "tcg/' | wc -l
28
$ git grep '#include "tcg[^/]' | wc -l
94
To simplify the preprocessor search path, unify by expliciting the
tcg/ directory.
Patch created mechanically by running:
$ for x in \
tcg.h tcg-mo.h tcg-op.h tcg-opc.h \
tcg-op-gvec.h tcg-gvec-desc.h; do \
sed -i "s,#include \"$x\",#include \"tcg/$x\"," \
$(git grep -l "#include \"$x\""); \
done
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200101112303.20724-2-philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There are only two uses. Within dcbz_common, the local variable
mmu_idx already contains the epid computation, and we can avoid
repeating it for the store. Within helper_icbiep, the usage is
trivially expanded using PPC_TLB_EPID_LOAD.
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Invoking KVM_SVM_OFF ioctl for TCG guests will lead to a QEMU crash.
Fix this by ensuring that we don't call KVM_SVM_OFF ioctl on TCG.
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 4930c1966249 ("ppc/spapr: Support reboot of secure pseries guest")
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20200102054155.13175-1-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A pseries guest can be run as a secure guest on Ultravisor-enabled
POWER platforms. When such a secure guest is reset, we need to
release/reset a few resources both on ultravisor and hypervisor side.
This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from the
machine reset path.
As part of this ioctl, the secure guest is essentially transitioned
back to normal mode so that it can reboot like a regular guest and
become secure again.
This ioctl has no effect when invoked for a normal guest. If this ioctl
fails for a secure guest, the guest is terminated.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20191219031445.8949-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The exception vector offset calculation was moved into a function but
the case when AIL=0 was not checked.
The reason we got away with this is that the sole caller of
ppc_excp_vector_offset checks the AIL before calling the function:
/* Handle AIL */
if (ail) {
...
vector |= ppc_excp_vector_offset(cs, ail);
}
Fixes: 2586a4d7a0 ("target/ppc: Move exception vector offset computation into a function")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20191217142512.574075-1-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* More uses of RCU_READ_LOCK_GUARD (Dave, myself)
* QOM doc improvments (Greg)
* Cleanups from the Meson conversion (Marc-André)
* Support for multiple -accel options (myself)
* Many x86 machine cleanup (Philippe, myself)
* tests/migration-test cleanup (Juan)
* PC machine removal and next round of deprecation (Thomas)
* kernel-doc integration (Peter, myself)
# gpg: Signature made Wed 18 Dec 2019 01:35:02 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (87 commits)
vga: cleanup mapping of VRAM for non-PCI VGA
hw/display: Remove "rombar" hack from vga-pci and vmware_vga
hw/pci: Remove the "command_serr_enable" property
hw/audio: Remove the "use_broken_id" hack from the AC97 device
hw/i386: Remove the deprecated machines 0.12 up to 0.15
hw/pci-host: Add Kconfig entry to select the IGD Passthrough Host Bridge
hw/pci-host/i440fx: Extract the IGD passthrough host bridge device
hw/pci-host/i440fx: Use definitions instead of magic values
hw/pci-host/i440fx: Use size_t to iterate over ARRAY_SIZE()
hw/pci-host/i440fx: Extract PCII440FXState to "hw/pci-host/i440fx.h"
hw/pci-host/i440fx: Correct the header description
Fix some comment spelling errors.
target/i386: remove unused pci-assign codes
WHPX: refactor load library
migration: check length directly to make sure the range is aligned
memory: include MemoryListener documentation and some missing function parameters
docs: add memory API reference
memory.h: Silence kernel-doc complaints
docs: Create bitops.rst as example of kernel-docs
bitops.h: Silence kernel-doc complaints
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Mostly, Error ** is for returning error from the function, so the
callee sets it. However kvmppc_hint_smt_possible gets already filled
errp parameter. It doesn't change the pointer itself, only change the
internal state of referenced Error object. So we can make it Error
*const * errp, to stress the behavior. It will also help coccinelle
script (in future) to distinguish such cases from common errp usage.
While there, rename the function to
kvmppc_error_append_smt_possible_hint().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191205174635.18758-8-vsementsov@virtuozzo.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message replaced]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This reverts commit cdcca22aab.
Commit cdcca22aab is a superseded version of the next commit that
crept in by accident. Revert it, so the final version applies.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The KVMState struct is opaque, so provide accessors for the fields
that will be moved from current_machine to the accelerator. For now
they just forward to the machine object, but this will change.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Access Segment Descriptor Register (ASDR) provides information about
the storage element when taking a hypervisor storage interrupt. When
performing nested radix address translation, this is normally the guest
real address. This register is present on POWER9 processors and later.
Implement the ADSR, note read and write access is limited to the
hypervisor.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Processor Utilisation of Resources Register (PURR) and Scaled
Processor Utilisation of Resources Register (SPURR) provide an estimate
of the resources used by the thread, present on POWER7 and later
processors.
Currently the [S]PURR registers simply count at the rate of the
timebase.
Preserve this behaviour but rework the implementation to store an offset
like the timebase rather than doing the calculation manually. Also allow
hypervisor write access to the register along with the currently
available read access.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The virtual timebase register (VTB) is a 64-bit register which
increments at the same rate as the timebase register, present on POWER8
and later processors.
The register is able to be read/written by the hypervisor and read by
the supervisor. All other accesses are illegal.
Currently the VTB is just an alias for the timebase (TB) register.
Implement the VTB so that is can be read/written independent of the TB.
Make use of the existing method for accessing timebase facilities where
by the compensation is stored and used to compute the value on reads/is
updated on writes.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This includes in QEMU a new CPU model for the POWER10 processor with
the same capabilities of a POWER9 process. The model will be extended
when support is completed.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The power7_set_irq() and power9_set_irq() functions set this but it is
never used actually. Modern Book3s compatible CPUs are only supported
by the pnv and spapr machines. They have an interrupt controller, XICS
for POWER7/8 and XIVE for POWER9, whose models don't require to track
IRQ input states at the CPU level.
Drop these lines to avoid confusion.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548862861.3650476.16622818876928044450.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU is reset, QEMU makes sure no interrupt is pending by clearing
CPUPPCstate::pending_interrupts in ppc_cpu_reset(). In the case of a
complete machine emulation, eg. a sPAPR machine, an external interrupt
request could still be pending in KVM though, eg. an IPI. It will be
eventually presented to the guest, which is supposed to acknowledge it at
the interrupt controller. If the interrupt controller is emulated in QEMU,
either XICS or XIVE, ppc_set_irq() won't deassert the external interrupt
pin in KVM since it isn't pending anymore for QEMU. When the vCPU re-enters
the guest, the interrupt request is still pending and the vCPU will try
again to acknowledge it. This causes an infinite loop and eventually hangs
the guest.
The code has been broken since the beginning. The issue wasn't hit before
because accel=kvm,kernel-irqchip=off is an awkward setup that never got
used until recently with the LC92x IBM systems (aka, Boston).
Add a ppc_irq_reset() function to do the necessary cleanup, ie. deassert
the IRQ pins of the CPU in QEMU and most importantly the external interrupt
pin for this vCPU in KVM.
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548861740.3650476.16879693165328764758.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We have to set the default model of all machine classes, not just for
the active one. Otherwise, "query-machines" will indicate the wrong
CPU model (e.g. "power9_v2.0-powerpc64-cpu" instead of
"host-powerpc64-cpu") as "default-cpu-type".
s390x already fixed this in de60a92e "s390x/kvm: Set default cpu model for
all machine classes". This patch applies a similar fix for the pseries-*
machine types on ppc64.
Doing a
{"execute":"query-machines"}
under KVM now results in
{
"hotpluggable-cpus": true,
"name": "pseries-4.2",
"numa-mem-supported": true,
"default-cpu-type": "host-powerpc64-cpu",
"is-default": true,
"cpu-max": 1024,
"deprecated": false,
"alias": "pseries"
},
{
"hotpluggable-cpus": true,
"name": "pseries-4.1",
"numa-mem-supported": true,
"default-cpu-type": "host-powerpc64-cpu",
"cpu-max": 1024,
"deprecated": false
},
...
Libvirt probes all machines via "-machine none,accel=kvm:tcg" and will
currently see the wrong CPU model under KVM.
Reported-by: Jiři Denemark <jdenemar@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
There are three page size in qemu:
real host page size
host page size
target page size
All of them have dedicate variable to represent. For the last two, we
use the same form in the whole qemu project, while for the first one we
use two forms: qemu_real_host_page_size and getpagesize().
qemu_real_host_page_size is defined to be a replacement of
getpagesize(), so let it serve the role.
[Note] Not fully tested for some arch or device.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In previous implementation, invocation of TCG shift function could request
shift of TCG variable by 64 bits when variable 'sh' is 0, which is not
supported in TCG (values can be shifted by 0 to 63 bits). This patch fixes
this by using two separate invocation of TCG shift functions, with maximum
shift amount of 32.
Name of variable 'shifted' is changed to 'carry' so variable naming
is similar to old helper implementation.
Variables 'avrA' and 'avrB' are replaced with variable 'avr'.
Fixes: 4e6d0920e7
Reported-by: "Paul A. Clark" <pc@us.ibm.com>
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Suggested-by: Aleksandar Markovic <aleksandar.markovic@rt-rk.com>
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Message-Id: <1570196639-7025-2-git-send-email-stefan.brankovic@rt-rk.com>
Tested-by: Paul A. Clarke <pc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Switch over all accesses to the decimal numbers held in struct PPC_DFP from
using HI_IDX and LO_IDX to using the VsrD() macro instead. Not only does this
allow the compiler to ensure that the various dfp_* functions are being passed
a ppc_vsr_t rather than an arbitrary uint64_t pointer, but also allows the
host endian-specific HI_IDX and LO_IDX to be completely removed from
dfp_helper.c.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190926185801.11176-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are several places in dfp_helper.c that access the decimal number
representations in struct PPC_DFP via HI_IDX and LO_IDX defines which are set
at the top of dfp_helper.c according to the host endian.
However we can instead switch to using ppc_vsr_t for decimal numbers and then
make subsequent use of the existing VsrD() macros to access the correct
element regardless of host endian. Note that 64-bit decimals are stored in the
LSB of ppc_vsr_t (equivalent to VsrD(1)).
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190926185801.11176-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Most of the DFP helper functions call decimal{64,128}FromNumber() just before
returning in order to convert the decNumber stored in dfp.t64 back to a
Decimal{64,128} to write back to the FP registers.
Introduce new dfp_finalize_decimal{64,128}() helper functions which both enable
the parameter list to be reduced considerably, and also help minimise the
changes required in the next patch.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190926185801.11176-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>