Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to
allow the guest to set its shared_info page.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Use the xen_overlay device, add compat support]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when
the guest writes the MSR to fill in the hypercall page, or when the guest
sets the event channel callback in HVM_PARAM_CALLBACK_IRQ.
KVM handles the former and sets the kernel's long_mode flag accordingly.
The latter will be handled in userspace. Keep them in sync by noticing
when a hypercall is made in a mode that doesn't match qemu's idea of
the guest mode, and resyncing from the kernel. Do that same sync right
before serialization too, in case the guest has set the hypercall page
but hasn't yet made a system call.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The xen_overlay device (and later similar devices for event channels and
grant tables) need to be instantiated. Do this from a kvm_type method on
the PC machine derivatives, since KVM is only way to support Xen emulation
for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
For the shared info page and for grant tables, Xen shares its own pages
from the "Xen heap" to the guest. The guest requests that a given page
from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped
to a given GPA using the XENMEM_add_to_physmap hypercall.
To support that in qemu when *emulating* Xen, create a memory region
(migratable) and allow it to be mapped as an overlay when requested.
Xen theoretically allows the same page to be mapped multiple times
into the guest, but that's hard to track and reinstate over migration,
so we automatically *unmap* any previous mapping when creating a new
one. This approach has been used in production with.... a non-trivial
number of guests expecting true Xen, without any problems yet being
noticed.
This adds just the shared info page for now. The grant tables will be
a larger region, and will need to be overlaid one page at a time. I
think that means I need to create separate aliases for each page of
the overall grant_frames region, so that they can be mapped individually.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
They both do the same thing and just call sched_yield. This is enough to
stop the Linux guest panicking when running on a host kernel which doesn't
intercept SCHEDOP_poll and lets it reach userspace.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
It allows to shutdown itself via hypercall with any of the 3 reasons:
1) self-reboot
2) shutdown
3) crash
Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather
than leading to triple faults if it remains unimplemented.
In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for the
new kernel without the hypervisor helpfully writing information at
unexpected addresses.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
Add SCHEDOP_soft_reset]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is just meant to serve as an example on how we can implement
hypercalls. xen_version specifically since Qemu does all kind of
feature controllability. So handling that here seems appropriate.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Implement kvm_gva_rw() safely]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This means handling the new exit reason for Xen but still
crashing on purpose. As we implement each of the hypercalls
we will then return the right return code.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The only thing we need to fix to make this build is the PIO hack which
sets the BIOS memory areas to R/W v.s. R/O. Theoretically we could hook
that up to the PAM registers on the emulated PIIX, but in practice
nobody cares, so just leave it doing nothing.
Now it builds without actual Xen, move it to CONFIG_XEN_BUS to include it
in the KVM-only builds.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There are (at least) three different vCPU ID number spaces. One is the
internal KVM vCPU index, based purely on which vCPU was chronologically
created in the kernel first. If userspace threads are all spawned and
create their KVM vCPUs in essentially random order, then the KVM indices
are basically random too.
The second number space is the APIC ID space, which is consistent and
useful for referencing vCPUs. MSIs will specify the target vCPU using
the APIC ID, for example, and the KVM Xen APIs also take an APIC ID
from userspace whenever a vCPU needs to be specified (as opposed to
just using the appropriate vCPU fd).
The third number space is not normally relevant to the kernel, and is
the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But
Xen timer hypercalls use it, and Xen timer hypercalls *really* want
to be accelerated in the kernel rather than handled in userspace, so
the kernel needs to be told.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Introduce support for emulating CPUID for Xen HVM guests. It doesn't make
sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally
when the xen-version machine property is set.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Obtain xen_version from KVM property, make it automatic]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.
Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Also set XEN_ATTACH mode in xen_init() to reflect the truth; not that
anyone ever cared before. It was *only* ever checked in xen_init_pv()
before.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The XEN_EMU option will cover core Xen support in target/, which exists
only for x86 with KVM today but could theoretically also be implemented
on Arm/Aarch64 and with TCG or other accelerators (if anyone wants to
run the gauntlet of struct layout compatibility, errno mapping, and the
rest of that fui).
It will also cover the support for architecture-independent grant table
and event channel support which will be added in hw/i386/kvm/ (on the
basis that the non-KVM support is very theoretical and making it not use
KVM directly seems like gratuitous overengineering at this point).
The XEN_BUS option is for the xenfv platform support, which will now be
used both by XEN_EMU and by real Xen.
The XEN option remains dependent on the Xen runtime libraries, and covers
support for real Xen. Some code which currently resides under CONFIG_XEN
will be moving to CONFIG_XEN_BUS over time as the direct dependencies on
Xen runtime libraries are eliminated. The Xen PCI platform device will
also reside under CONFIG_XEN_BUS.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There's already a partial set here; update them and pull in a more
complete set.
To start with, define __XEN_TOOLS__ in hw/xen/xen.h to ensure that any
internal definitions needed by Xen toolstack libraries are present
regardless of the order in which the headers are included. A reckoning
will come later, once we make the PV backends work in emulation and
untangle the headers for Xen-native vs. generic parts.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Update to Xen public headers from 4.16.2 release, add some in io/,
define __XEN_TOOLS__ in hw/xen/xen.h, move to hw/xen/interface/]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Change to match the recent change to probe_access_flags.
All existing callers updated to supply 0, so no change in behaviour.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
probe_access_flags() as it is today uses probe_access_full(), which in
turn uses probe_access_internal() with size = 0. probe_access_internal()
then uses the size to call the tlb_fill() callback for the given CPU.
This size param ('fault_size' as probe_access_internal() calls it) is
ignored by most existing .tlb_fill callback implementations, e.g.
arm_cpu_tlb_fill(), ppc_cpu_tlb_fill(), x86_cpu_tlb_fill() and
mips_cpu_tlb_fill() to name a few.
But RISC-V riscv_cpu_tlb_fill() actually uses it. The 'size' parameter
is used to check for PMP (Physical Memory Protection) access. This is
necessary because PMP does not make any guarantees about all the bytes
of the same page having the same permissions, i.e. the same page can
have different PMP properties, so we're forced to make sub-page range
checks. To allow RISC-V emulation to do a probe_acess_flags() that
covers PMP, we need to either add a 'size' param to the existing
probe_acess_flags() or create a new interface (e.g.
probe_access_range_flags).
There are quite a few probe_* APIs already, so let's add a 'size' param
to probe_access_flags() and re-use this API. This is done by open coding
what probe_access_full() does inside probe_acess_flags() and passing the
'size' param to probe_acess_internal(). Existing probe_access_flags()
callers use size = 0 to not change their current API usage. 'size' is
asserted to enforce single page access like probe_access() already does.
No behavioral changes intended.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230223234427.521114-2-dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We found a case where the source passed to flatview_write_continue() may
overlap with the destination when fuzzing igb, a new proposed network
device with sanitizers.
igb uses pci_dma_map() to get Tx packet, and pci_dma_write() to write Rx
buffer. While pci_dma_write() is usually used to write data from
memory not mapped to the guest, if igb is configured to perform
loopback, the data will be sourced from the guest memory. The source and
destination can overlap and the usage of memcpy() will be invalid in
such a case.
While we do not really have to deal with such an invalid request for
igb, detecting the overlap in igb code beforehand requires complex code,
and only covers this specific case. Instead, just replace memcpy() with
memmove() to tolerate overlaps. Using memmove() will slightly damage the
performance as it will need to check overlaps before using SIMD
instructions for copying, but the cost should be negligible, considering
the inherent complexity of flatview_write_continue().
The test cases generated by the fuzzer is available at:
https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/
The fixed test case is:
fuzz/crash_47dfe62d9f911bf523ff48cd441b61c0013ed805
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230131030155.18932-1-akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
'dh_ctype_f32' is defined as 'float32', itself declared
in "fpu/softfloat-types.h". Include this header to avoid
when refactoring other headers:
In file included from include/exec/helper-proto.h:7,
from include/tcg/tcg-op.h:29,
from ../../tcg/tcg-op-vec.c:22:
include/exec/helper-head.h:44:22: error: unknown type name ‘float32’; did you mean ‘_Float32’?
44 | #define dh_ctype_f32 float32
| ^~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221216225202.25664-1-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While mouse is grabbed, window title contains a hint for the user what
keyboard keys to press to release the mouse. Make that hint text a bit
more user friendly for a Mac user:
- Replace "Ctrl" and "Alt" by appropriate symbols for those keyboard
keys typically displayed for them on a Mac (encode those symbols by
using UTF-8 characters).
- Drop " + " in between the keys, as that's not common on macOS for
documenting keyboard shortcuts.
- Convert lower case "g" to upper case "G", as that's common on macOS.
- Add one additional space at start and end of key stroke set, to
visually separate the key strokes from the rest of the text.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <E1pAClj-0003Jo-OB@lizzy.crudebyte.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
ide_get_geometry() and ide_get_bios_chs_trans() are only
used by the TYPE_PC_MACHINE.
"hw/ide.h" is a mixed bag of lost IDE declarations. In order
to remove this (almost) pointless header soon, move these
declarations to "hw/ide/internal.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230220091358.17038-18-philmd@linaro.org>
idebus_active_if() operates on a IDEBus; rename it as
ide_bus_active_if() to emphasize its first argument
is a IDEBus.
Mechanical change using:
$ sed -i -e 's/idebus_active_if/ide_bus_active_if/g' \
$(git grep -l idebus_active_if)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-16-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
ide_init2() initializes a IDEBus, and set its output IRQ.
To emphasize this, rename it as ide_bus_init_output_irq().
Mechanical change using:
$ sed -i -e 's/ide_init2/ide_bus_init_output_irq/g' \
$(git grep -l ide_init2)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-15-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
ide_exec_cmd() operates on a IDEBus; rename it as
ide_bus_exec_cmd() to emphasize its first argument
is a IDEBus.
Mechanical change using:
$ sed -i -e 's/ide_exec_cmd/ide_bus_exec_cmd/g' \
$(git grep -wl ide_exec_cmd)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-14-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
ide_register_restart_cb() operates on a IDEBus; rename it as
ide_bus_register_restart_cb() to emphasize its first argument
is a IDEBus.
Mechanical change using:
$ sed -i -e 's/ide_register_restart_cb/ide_bus_register_restart_cb/g' \
$(git grep -l ide_register_restart_cb)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-13-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
ide_create_drive() operates on a IDEBus; rename it as
ide_bus_create_drive() to emphasize its first argument
is a IDEBus.
Mechanical change using:
$ sed -i -e 's/ide_create_drive/ide_bus_create_drive/g' \
$(git grep -wl ide_create_drive)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-12-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
ide_set_irq() operates on a IDEBus; rename it as
ide_bus_set_irq() to emphasize its first argument
is a IDEBus.
Mechanical change using:
$ sed -i -e 's/ide_set_irq/ide_bus_set_irq/g' \
$(git grep -l ide_set_irq)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230215112712.23110-11-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
"hw/ide.h" is a mixed bag of lost IDE declarations.
Extract isa_ide_init() and the TYPE_ISA_IDE QOM declarations
to a new "hw/ide/isa.h" header.
Rename ISAIDEState::isairq as 'irqnum' to emphasize this is
not a qemu_irq object but the number (index) of an ISA IRQ.
Message-Id: <20230215112712.23110-5-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Following docs/devel/style.rst guidelines, rename MMIOIDEState
as IdeMmioState.
Having the structure name and its typedef named equally,
we can manually convert from the old DECLARE_INSTANCE_CHECKER()
macro to the more recent OBJECT_DECLARE_SIMPLE_TYPE().
Note, due to that name mismatch, this macro wasn't automatically
converted during commit 8063396bf3 ("Use OBJECT_DECLARE_SIMPLE_TYPE
when possible").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230220091358.17038-3-philmd@linaro.org>