The hardware singlestep mechanism in POWER works via a Trace Interrupt
(0xd00) that happens after any instruction executes, whenever MSR_SE =
1 (PowerISA Section 6.5.15 - Trace Interrupt).
However, with kvm_hv, the Trace Interrupt happens inside the guest and
KVM has no visibility of it. Therefore, when the gdbstub uses the
KVM_SET_GUEST_DEBUG ioctl to enable singlestep, KVM simply ignores it.
This patch takes advantage of the Trace Interrupt to perform the step
inside the guest, but uses a breakpoint at the Trace Interrupt handler
to return control to KVM. The exit is treated by KVM as a regular
breakpoint and it returns to the host (and QEMU eventually).
Before signalling GDB, QEMU sets the Next Instruction Pointer to the
instruction following the one being stepped and restores the MSR,
SRR0, SRR1 values from before the step, effectively skipping the
interrupt handler execution and hiding the trace interrupt breakpoint
from GDB.
This approach works with both of GDB's 'scheduler-locking' options
(off, step).
Note:
- kvm_arch_set_singlestep happens after GDB asks for a single step,
while the vcpus are stopped.
- kvm_handle_singlestep executes after the step, during the handling
of the Emulation Assist Interrupt (breakpoint).
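A rough sketch of that restore step (hypothetical variable names, not
necessarily the literal patch; it assumes the pre-step MSR/SRR0/SRR1
were recorded when the step was armed):

    /* In kvm_handle_singlestep(), before reporting the stop to GDB:
     * the trace interrupt left the address of the instruction following
     * the stepped one in SRR0, so use that as the new NIP, then restore
     * the pre-step values to hide the handler and breakpoint from GDB. */
    env->nip = env->spr[SPR_SRR0];
    env->msr = saved_msr;
    env->spr[SPR_SRR0] = saved_srr0;
    env->spr[SPR_SRR1] = saved_srr1;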
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
There are four scenarios being handled in this function:
- single stepping
- hardware breakpoints
- software breakpoints
- fallback (no debug supported)
A future patch will add code to handle specific single step and
software breakpoints cases so let's split each scenario into its own
function now to avoid hurting readability.
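The resulting shape is roughly the following (helper names are
illustrative; only kvm_handle_singlestep is named by these patches):

    static int kvm_handle_debug(PowerPCCPU *cpu, struct kvm_run *run)
    {
        CPUState *cs = CPU(cpu);

        if (cs->singlestep_enabled) {
            return kvm_handle_singlestep(cpu, run);
        }
        if (kvm_handle_hw_breakpoint(cpu, run)) {
            return 1;
        }
        if (kvm_find_sw_breakpoint(cs, run->debug.arch.address)) {
            return kvm_handle_sw_breakpoint(cpu, run);
        }
        /* No debug support: fall back and inject the exception back
         * into the guest. */
        return kvm_handle_debug_fallback(cpu, run);
    }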
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
For single stepping (via KVM) of a guest vcpu to work, KVM needs not
only to support the SET_GUEST_DEBUG ioctl but to also recognize the
KVM_GUESTDBG_SINGLESTEP bit in the control field of the
kvm_guest_debug struct.
This patch adds support for querying the single step capability so
that QEMU can decide what to do for the platforms that do not have
such support.
This will allow architecture-specific implementations of a fallback
mechanism for single stepping in cases where KVM does not support it.
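A sketch of how the query could look on the QEMU side (treat the helper
and the choice of capability constant as illustrative):

    /* Illustrative: probe once at init time whether this KVM honours
     * KVM_GUESTDBG_SINGLESTEP, so QEMU can pick a fallback otherwise. */
    static bool kvmppc_has_singlestep(KVMState *s)
    {
        return kvm_check_extension(s, KVM_CAP_PPC_GUEST_DEBUG_SSTEP) > 0;
    }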
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
ppc patch queue 2019-02-26
Next set of patches for ppc and spapr. There's a lot in this one:
* Support "STOP light" states on POWER9
* Add support for HVI interrupts on POWER9 (powernv machine)
* CVE-2019-8934: Don't leak host model and serial information to the guest
* Tests and cleanups for various hot unplug options
* Hash and radix MMU implementation on POWER9 for powernv machine
* PCI Host Bridge hotplug support for pseries machine
* Allow larger kernels and initrds for powernv machine
Plus a handful of miscellaneous fixes and cleanups.
The cpu hotplug tests and cleanups from David Hildenbrand aren't
solely power related. However the consensus amongst Michael Tsirkin,
David Hildenbrand, Cornelia Huck and myself was that it made most
sense to come in via my tree.
# gpg: Signature made Tue 26 Feb 2019 03:37:46 GMT
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.0-20190226: (50 commits)
ppc/pnv: use IEC binary prefixes to represent sizes
ppc/pnv: add INITRD_MAX_SIZE constant
ppc/pnv: increase kernel size limit to 256MiB
hw/ppc: Use object_initialize_child for correct reference counting
ppc/xive: xive does not have a POWER7 interrupt model
tests/device-plug: Add PHB unplug request test for spapr
spapr: enable PHB hotplug for default pseries machine type
spapr: add hotplug hooks for PHB hotplug
spapr_pci: add ibm,my-drc-index property for PHB hotplug
spapr_pci: provide node start offset via spapr_populate_pci_dt()
spapr_events: add support for phb hotplug events
spapr: populate PHB DRC entries for root DT node
spapr: create DR connectors for PHBs
spapr_pci: add PHB unrealize
spapr_irq: Expose the phandle of the interrupt controller
spapr: Expose the name of the interrupt controller node
xics: Write source state to KVM at claim time
spapr/drc: Drop spapr_drc_attach() fdt argument
spapr/pci: Generate FDT fragment at configure connector time
spapr: Generate FDT fragment for CPUs at configure connector time
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block layer patches:
- Block graph change fixes (avoid loops, cope with non-tree graphs)
- bdrv_set_aio_context() related fixes
- HMP snapshot commands: Use only tag, not the ID to identify snapshots
- qemu-img, commit: Error path fixes
- block/nvme: Build fix for gcc 9
- MAINTAINERS updates
- Fix various issues with bdrv_refresh_filename()
- Fix various iotests
- Include LUKS overhead in qemu-img measure for qcow2
- A fix for vmdk's image creation interface
# gpg: Signature made Mon 25 Feb 2019 14:18:15 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (71 commits)
iotests: Skip 211 on insufficient memory
vmdk: false positive of compat6 with hwversion not set
iotests: add LUKS payload overhead to 178 qemu-img measure test
qcow2: include LUKS payload overhead in qemu-img measure
iotests.py: s/_/-/g on keys in qmp_log()
iotests: Let 045 be run concurrently
iotests: Filter SSH paths
iotests.py: Filter filename in any string value
iotests.py: Add is_str()
iotests: Fix 207 to use QMP filters for qmp_log
iotests: Fix 232 for LUKS
iotests: Remove superfluous rm from 232
iotests: Fix 237 for Python 2.x
iotests: Re-add filename filters
iotests: Test json:{} filenames of internal BDSs
block: BDS options may lack the "driver" option
block/null: Generate filename even with latency-ns
block/curl: Implement bdrv_refresh_filename()
block/curl: Harmonize option defaults
block/nvme: Fix bdrv_refresh_filename()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a standard authorization framework
The current network services now support encryption via TLS and in some
cases support authentication via SASL. In cases where SASL is not
available, x509 client certificates can be used as a crude authorization
scheme, by using a sub-CA and controlling to whom certs are issued. In
general this is not very flexible, though, so this series introduces a
new standard authorization framework.
It comes with four initial authorization mechanisms:
- Simple - an exact username match. This is useful when there is
exactly one user that is known to connect. For example when live
migrating from one QEMU to another with TLS, libvirt would use
the simple scheme to whitelist the TLS cert of the source QEMU.
- List - a full access control list, with optional regex matching.
This is more flexible and is used to provide 100% backcompat with
the existing HMP ACL commands. The caveat is that we can't create
these via the CLI -object arg yet.
- ListFile - the same as List, but with the rules stored in JSON
format in an external file. This avoids the -object limitation
while also allowing the admin to change list entries in the file.
QEMU uses inotify to notice these changes and auto-reload the
file contents. This is likely a good default choice for most
network services, if the "simple" mechanism isn't sufficient.
- PAM - delegate the username lookup to a PAM module, which opens
the door to many options including things like SQL/LDAP lookups.
# gpg: Signature made Tue 26 Feb 2019 15:33:46 GMT
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/authz-core-pull-request:
authz: delete existing ACL implementation
authz: add QAuthZPAM object type for authorizing using PAM
authz: add QAuthZListFile object type for a file access control list
authz: add QAuthZList object type for an access control list
authz: add QAuthZSimple object type for easy whitelist auth checks
authz: add QAuthZ object as an authorization base class
hw/usb: switch MTP to use new inotify APIs
hw/usb: fix const-ness for string params in MTP driver
hw/usb: don't set IN_ISDIR for inotify watch in MTP driver
qom: don't require user creatable objects to be registered
util: add helper APIs for dealing with inotify in portable manner
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we run "certtool 2>&1 | head -1" the latter command is likely to
complete and exit before certtool has written everything it wants to
stderr. In at least the RHEL-7 gnutls 3.3.29 this causes certtool to
quit with broken pipe before it has finished writing the desired
output file to disk. This causes non-deterministic failures of the
iotest 233 because the certs are sometimes zero length files.
If certtool fails the "head -1" means we also lose any useful error
message it would have printed.
Thus this patch gets rid of the pipe and post-processes the output in a
more flexible & reliable manner.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190220145819.30969-3-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
If we abort the iotest early the server.log file might contain useful
information for diagnosing the problem. Ensure its contents are
displayed in this case.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190220145819.30969-2-berrange@redhat.com>
[eblake: fix shell quoting]
Signed-off-by: Eric Blake <eblake@redhat.com>
The 'qemu_acl' type was a previous non-QOM based attempt to provide an
authorization facility in QEMU. Because it is non-QOM based it cannot be
created via the command line and requires special monitor commands to
manipulate it.
The new QAuthZ subclasses provide a superset of the functionality in
qemu_acl, so the latter can now be deleted. The HMP 'acl_*' monitor
commands are converted to use the new QAuthZSimple data type instead
in order to provide temporary backwards compatibility.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an authorization backend that talks to PAM to check whether the user
identity is allowed. This only uses the PAM account validation facility,
which is essentially just a check to see if the provided username is permitted
access. It doesn't use the authentication or session parts of PAM, since
that's dealt with by the relevant part of QEMU (eg VNC server).
Consider starting QEMU with a VNC server and telling it to use TLS with
x509 client certificates and configuring it to use PAM to validate
the x509 distinguished name. In this example we're telling it to use PAM
for the QAuthZ impl with a service name of "qemu-vnc"
$ qemu-system-x86_64 \
-object tls-creds-x509,id=tls0,dir=/home/berrange/security/qemutls,\
endpoint=server,verify-peer=yes \
-object authz-pam,id=authz0,service=qemu-vnc \
-vnc :1,tls-creds=tls0,tls-authz=authz0
This requires an /etc/pam.d/qemu-vnc file to be created with the auth
rules. A very simple file-based whitelist can be set up using
$ cat > /etc/pam.d/qemu-vnc <<EOF
account requisite pam_listfile.so item=user sense=allow file=/etc/qemu/vnc.allow
EOF
The /etc/qemu/vnc.allow file simply contains one username per line. Any
username not in the file is denied. The usernames in this example are
the x509 distinguished name from the client's x509 cert.
$ cat > /etc/qemu/vnc.allow <<EOF
CN=laptop.berrange.com,O=Berrange Home,L=London,ST=London,C=GB
EOF
More interesting would be to configure PAM to use an LDAP backend, so
that the QEMU authorization check data can be centralized instead of
requiring each compute host to maintain its own copy of the file.
The main limitation with this PAM module is that the rules apply to all
QEMU instances on the host. Setting up different rules per VM would
require creating a separate PAM service name & config file for every
guest. An alternative approach for the future might be to not pass in
the plain username to PAM, but instead combine the VM name or UUID with
the username. This requires further consideration though.
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a QAuthZListFile object type that implements the QAuthZ interface. This
built-in implementation is a proxy around the QAuthZList object type,
initializing it from an external file, and optionally, automatically
reloading it whenever it changes.
To create an instance of this object via the QMP monitor, the syntax
used would be:
{
"execute": "object-add",
"arguments": {
"qom-type": "authz-list-file",
"id": "authz0",
"props": {
"filename": "/etc/qemu/vnc.acl",
"refresh": true
}
}
}
If "refresh" is "yes", inotify is used to monitor the file,
automatically reloading changes. If an error occurs during reloading,
all authorizations will fail until the file is next successfully
loaded.
The /etc/qemu/vnc.acl file would contain a JSON representation of a
QAuthZList object
{
"rules": [
{ "match": "fred", "policy": "allow", "format": "exact" },
{ "match": "bob", "policy": "allow", "format": "exact" },
{ "match": "danb", "policy": "deny", "format": "glob" },
{ "match": "dan*", "policy": "allow", "format": "exact" },
],
"policy": "deny"
}
This sets up an authorization rule that allows 'fred', 'bob' and anyone
whose name starts with 'dan', except for 'danb'. Everyone unmatched is
denied.
The object can be loaded on the command line using
-object authz-list-file,id=authz0,filename=/etc/qemu/vnc.acl,refresh=yes
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add a QAuthZList object type that implements the QAuthZ interface. This
built-in implementation maintains a trivial access control list with a
sequence of match rules and a final default policy. This replicates the
functionality currently provided by the qemu_acl module.
To create an instance of this object via the QMP monitor, the syntax
used would be:
{
"execute": "object-add",
"arguments": {
"qom-type": "authz-list",
"id": "authz0",
"props": {
"rules": [
{ "match": "fred", "policy": "allow", "format": "exact" },
{ "match": "bob", "policy": "allow", "format": "exact" },
{ "match": "danb", "policy": "deny", "format": "glob" },
{ "match": "dan*", "policy": "allow", "format": "exact" },
],
"policy": "deny"
}
}
}
This sets up an authorization rule that allows 'fred', 'bob' and anyone
whose name starts with 'dan', except for 'danb'. Everyone unmatched is
denied.
It is not currently possible to create this via -object, since there is
no syntax supported to specify non-scalar properties for objects. This
is likely to be addressed by later support for using JSON with -object,
or an equivalent approach.
In any case the future "authz-list-file" object can be used from the
CLI and is likely a better choice, as it allows the ACL to be refreshed
automatically on change.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In many cases a single VM will just need to whitelist a single identity
as the allowed user of network services. This is especially the case for
TLS live migration (optionally with NBD storage) where we just need to
whitelist the x509 certificate distinguished name of the source QEMU
host.
Via QMP this can be configured with:
{
"execute": "object-add",
"arguments": {
"qom-type": "authz-simple",
"id": "authz0",
"props": {
"identity": "fred"
}
}
}
Or via the command line
-object authz-simple,id=authz0,identity=fred
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current qemu_acl module provides a simple access control list
facility inside QEMU, which is used via a set of monitor commands
acl_show, acl_policy, acl_add, acl_remove & acl_reset.
Note there is no ability to create ACLs - the network services (eg VNC
server) were expected to create ACLs that they want to check.
There is also no way to define ACLs on the command line, nor potentially
integrate with external authorization systems like polkit, pam, ldap
lookup, etc.
The QAuthZ object defines a minimal abstract QOM class that can be
subclassed for creating different authorization providers.
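At its core the base class just declares one check hook for subclasses
to implement, roughly along these lines (a sketch, not the verbatim
header):

    typedef struct QAuthZ {
        Object parent_obj;
    } QAuthZ;

    typedef struct QAuthZClass {
        ObjectClass parent_class;

        /* Return true if 'identity' may use the service, false otherwise */
        bool (*is_allowed)(QAuthZ *authz,
                           const char *identity,
                           Error **errp);
    } QAuthZClass;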
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The internal inotify APIs allow a lot of conditional statements to be
cleared out, and provide a simpler callback for handling events.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Various functions accepting 'char *' string parameters were missing
'const' qualifiers.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
IN_ISDIR is not a bit that one can request when registering a
watch with inotify_add_watch. Rather it is a bit that is set
automatically when reading events from the kernel.
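A small standalone illustration: the flag is only meaningful in the mask
of events read back from the kernel, not in the mask passed to
inotify_add_watch():

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/inotify.h>

    int main(void)
    {
        int fd = inotify_init();
        /* Watch for creations/deletions; passing IN_ISDIR here is useless */
        inotify_add_watch(fd, "/tmp", IN_CREATE | IN_DELETE);

        char buf[4096] __attribute__((aligned(8)));
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len > 0) {
            struct inotify_event *ev = (struct inotify_event *)buf;
            /* The kernel sets IN_ISDIR when the object is a directory */
            printf("%s: %s\n", ev->len ? ev->name : "(watched dir)",
                   (ev->mask & IN_ISDIR) ? "directory" : "file");
        }
        return 0;
    }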
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When an object is in turn owned by another user object, it is not
desirable to expose this in the QOM object hierarchy. It is just an
internal implementation detail that we should be free to change without
exposure to apps managing QEMU.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The inotify userspace API for reading events is quite horrible, so it is
useful to wrap it in a more friendly API to avoid duplicating code
across many users in QEMU. Wrapping it also allows introduction of a
platform portability layer, so that we can add impls for non-Linux based
equivalents in future.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We missed a bug in a recent patch as we were not testing all the
rounding modes for all operations. However, enabling all rounding modes
for mulAdd does slow down the already slowest test and doesn't really
buy us much additional coverage, so let's allow the default test flags
to be overridden.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Previously this was only supported for roundAndPackFloat64.
New support in round_canonical, round_to_int, float128_round_to_int,
roundAndPackFloat32, roundAndPackInt32, roundAndPackInt64,
roundAndPackUint64. This does not include any of the floatx80 routines,
as we do not have users for that rounding mode there.
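If the mode in question is round-to-odd, its effect is easy to
illustrate: any discarded bits are ORed into the least significant kept
bit. A minimal standalone sketch on a raw significand (not the softfloat
code itself):

    #include <stdint.h>

    /* Drop 'shift' low bits (0 < shift < 64), rounding to odd: if any
     * discarded bit was set, force the LSB of the result to 1 so the
     * inexactness is preserved for a later rounding step. */
    static uint64_t shift_right_round_to_odd(uint64_t sig, unsigned shift)
    {
        uint64_t discarded = sig & ((UINT64_C(1) << shift) - 1);

        return (sig >> shift) | (discarded != 0);
    }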
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190215170225.15537-1-richard.henderson@linaro.org>
Tested-by: David Hildenbrand <david@redhat.com>
[AJB: add missing break]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Handle it just like float128_to_uint32_round_to_zero, which hopefully
is free of bugs :)
Documentation basically copied from float128_to_uint64
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Needed on s390x to test for the data class of a number, so it will
soon gain a user.
A number is considered normal if the exponent is neither 0 nor all 1's.
That can be checked by adding 1 to the exponent, and comparing against
>= 2 after dropping any overflow into the sign bit position.
While at it, convert the other floatXX_is_normal functions to use a
similar, less error-prone calculation, as suggested by Richard H.
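A self-contained illustration of the same trick for single precision
(QEMU's versions operate on its float32/float64 types, but the
arithmetic is identical):

    #include <stdbool.h>
    #include <stdint.h>

    /* binary32 exponent is bits 30..23.  Adding 1 maps an all-ones
     * exponent to 0x100 and a zero exponent to 1; masking with 0xff
     * drops the carry and the sign bit, so only those two cases fail
     * the ">= 2" test. */
    static bool f32_bits_is_normal(uint32_t bits)
    {
        return (((bits >> 23) + 1) & 0xff) >= 2;
    }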
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Commit 2cade3d wired up new tests, but did not exclude the
new *.out files produced by running the tests.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Building a kernel with CONFIG_DEBUG_INFO_REDUCED can generate a ~90MB image and
building with CONFIG_DEBUG_INFO can generate a ~225MB one, both exceeding the
current limit of 32MiB.
Increasing the kernel size limit to 256MiB should fit for now.
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-2-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Both functions, object_initialize() and object_property_add_child() increase
the reference counter of the new object, so one of the references has to be
dropped afterwards to get the reference counting right. Otherwise the child
object will not be properly cleaned up when the parent gets destroyed.
Thus let's use now object_initialize_child() instead to get the reference
counting here right.
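The pattern being replaced looks roughly like this (type and field names
illustrative; treat the exact argument lists as a sketch):

    /* Before: two references are taken, one by object_initialize() and
     * one by object_property_add_child(), so one must be dropped by hand. */
    object_initialize(&s->psi, sizeof(s->psi), TYPE_PNV_PSI);
    object_property_add_child(obj, "psi", OBJECT(&s->psi), &error_abort);
    object_unref(OBJECT(&s->psi));

    /* After: a single call that leaves exactly one reference, owned by
     * the parent, so the child is released when the parent goes away. */
    object_initialize_child(obj, "psi", &s->psi, sizeof(s->psi),
                            TYPE_PNV_PSI, &error_abort, NULL);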
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1550748288-30598-1-git-send-email-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Hotplugging PHBs is a machine-level operation, but PHBs reside on the
main system bus, so we register spapr machine as the handler for the
main system bus.
Provide the usual pre-plug, plug and unplug-request handlers.
Move the checking of the PHB index to the pre-plug handler. It is okay
to do that and assert in the realize function because the pre-plug
handler is always called, even for the oldest machine types we support.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(Fixed interrupt controller phandle in "interrupt-map" and
TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059672926.1466090.13612804072190051439.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To support PHB hotplug we need to clean up lingering references,
memory, child properties, etc. prior to the PHB object being
finalized. Generally this will be called as a result of calling
object_unparent() on the PHB object, which in turn would normally
be called as the result of an unplug() operation.
When the PHB is finalized, child objects will be unparented in
turn, and finalized if the PHB was the only reference holder, so
we don't bother to explicitly unparent child objects of the PHB,
with the notable exception of DRCs. This is needed to avoid a QEMU
crash when unplugging a PHB and resetting the machine before the
guest could handle the event. The DRCs are removed from the QOM tree
by pci_unregister_root_bus() and we must make sure we're not leaving
stale aliases under the global /dr-connector path.
The formula that gives the number of DMA windows is moved to an
inline function in the hw/pci-host/spapr.h header because it
will have other users.
The unrealize function is able to cope with partially realized PHBs.
It is hence used to implement proper rollback on the realize error
path.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059669881.1466090.13515030705986041517.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pseries machine only uses LSIs to support legacy PCI devices. Every
PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming
in-kernel XIVE), QEMU synchronizes the state of all irqs, including these
LSIs, later on at machine reset.
In order to support PHB hotplug, we need a way to tell KVM about the LSIs
that doesn't require a machine reset. An easy way to do that is to always
inform KVM when an interrupt is claimed, which really isn't a performance
path.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059668360.1466090.5969630516627776426.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current logic is to provide the FDT fragment when attaching a device
to a DRC. This works perfectly fine for our current hotplug support, but
soon we will add support for PHB hotplug which has some constraints, that
CPU, PCI and LMB devices don't seem to have.
The first constraint is that the "ibm,dma-window" property of the PHB
node requires the IOMMU to be configured, ie, spapr_tce_table_enable()
has been called, which happens during PHB reset. It is okay in the case
of hotplug since the device is reset before the hotplug handler is
called. On the contrary with coldplug, the hotplug handler is called
first and the device is only reset during the initial system reset. Trying
to create the FDT fragment on the hotplug path in this case would
result in something like this:
ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >;
This will cause Linux in the guest to panic when simply removing and
re-adding the PHB using the drmgr command:
page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
if (!page)
panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
The second and maybe more problematic constraint is that the
"interrupt-map" property needs to reference the interrupt controller
node using the very same phandle that SLOF has already exposed to the
guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall
at some point to know about this phandle. With the latest QEMU and SLOF,
this happens when SLOF gets quiesced. This means that if the PHB gets
hotplugged after CAS but before SLOF quiesce, then we're sure that the
phandle is not known when the hotplug handler is called.
The FDT is actually only needed when the guest first invokes RTAS to
configure the connector, long after SLOF quiesce. Let's postpone the
creation of FDT fragments for PHBs to rtas_ibm_configure_connector().
Since we only need this for PHBs, introduce a new method in the base
DRC class for that. DRC subtypes will be converted to use it in
subsequent patches.
Allow spapr_drc_attach() to be passed a NULL fdt argument if the method
is available. When all DRC subtypes have been converted, the fdt argument
will eventually disappear.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059665823.1466090.18358845122627355537.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
That "b" means "base address" and thus shouldn't be in the name
of actual entries and related constants.
This patch keeps the synthetic patb_entry field of the spapr
virtual hypervisor unchanged until I figure out if that has
an impact on the migration stream.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Let's use the generic helper tlb_flush_all_cpus_synced() instead
of iterating the CPUs ourselves.
We do lose the optimization of clearing the "other" CPUs "need flush"
flags but this shouldn't be a problem in practice.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
POWER9 (arch v3) slightly changes the HPTE format. The B bits move
from the first to the second half of the HPTE, and the AVPN/ARPN
are slightly shorter.
However, under SPAPR, the hypercalls still take the old format
(and probably will for the foreseeable future).
The simplest way to support this is thus to convert the HPTEs from
new to old format when reading them if the MMU model is v3 and there
is no virtual hypervisor, leaving the rest of the code unchanged.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-8-clg@kaod.org>
[dwg: Moved function to .c since there was no real need for it in the .h]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
With mttcg, we can have MMU lookups happening at the same time
as the guest modifying the page tables.
Since the HPTEs of the hash table MMU contain two words (or
doublewords on 64-bit), we need to make sure we read them
in the right order, with the correct memory barrier.
Additionally, when using emulated SPAPR mode, the hypercalls
writing to the hash table must also perform the updates in
the right order.
Note: This part is still not entirely correct.
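The constraint is the usual one for a two-word PTE whose valid bit lives
in the first doubleword; a hedged sketch of both sides (ignoring
endianness handling, not the literal QEMU code):

    /* hpte[0] holds the doubleword with the valid bit, hpte[1] the rest */
    uint64_t *hpte;

    /* Writer (e.g. H_ENTER): publish the second doubleword before the
     * first one, which carries the valid bit. */
    hpte[1] = pte1;
    smp_wmb();
    hpte[0] = pte0 | HPTE64_V_VALID;

    /* Reader (MMU lookup): read the doubleword holding the valid bit
     * first, and keep the second read from being hoisted above it. */
    pte0 = hpte[0];
    smp_rmb();
    pte1 = hpte[1];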
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Historically the 64-bit server MMU supports two ways of configuring the
guest "real mode" mapping:
- The "RMA" which is a single chunk of physically contiguous
memory remapped as guest real, and controlled by the RMLS
field in the LPCR register and the RMOR register.
- The "VRMA" which uses special PTEs inserted in the partition
hash table by the hypervisor.
POWER9 deprecates the former, which is reflected by the filtering
done in ppc_store_lpcr() which effectively prevents setting of
the RMLS field.
However, when using fully emulated SPAPR machines, our qemu code
currently only knows how to define the guest real mode memory using
RMLS.
Thus you cannot run a SPAPR machine anymore with a POWER9 CPU
model today.
This works around it with a quirk in ppc_store_lpcr() to continue
allowing the RMLS field to be set when using a virtual hypervisor.
Ultimately we will want to implement configuring a VRMA instead
which will also be necessary if we want to migrate a SPAPR guest
between TCG and KVM but this is a lot more work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The HW relies on LPCR:HR along with the PATE to determine whether
to use Radix or Hash mode. In fact it uses LPCR:HR more commonly
than the PATE.
For us, it's also more efficient to do so, especially since unlike
the HW we do not maintain a cache of the current PATE and HV PATE
in a generic place.
Prepare the grounds for that by ensuring that LPCR:HR is set
properly on SPAPR machines.
Another option would have been to use a callback to get the PATE
but this gets messy when implementing bare metal support, it's
much simpler (and faster) to use LPCR.
Since existing migration streams may not have it, fix it up in
spapr_post_load() as well based on the pseudo-PATE entry that
we keep.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We can easily test this, just like PCI. On s390x, cpu unplug is not
supported. On x86 ACPI, cpu unplug requires guest interaction to work, so
it can't be tested that easily. We might add tests for ACPI later.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218092202.26683-6-david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The issue with testing asynchronous unplug requests is that they usually
require a running guest to handle the request. However, to test if
unplug of PCI devices works, we can apply a nice little trick on some
architectures:
On system reset, x86 ACPI, s390x and spapr will perform the unplug,
resulting in the device of interest getting deleted and a DEVICE_DELETED
event getting sent.
On s390x, we still get a warning
qemu-system-s390x: -device virtio-mouse-pci,id=dev0:
warning: Plugging a PCI/zPCI device without the 'zpci' CPU feature
enabled; the guest will not be able to see/use this device
This will be fixed soon, when we enable the zpci CPU feature always
(Conny already has a patch for this queued).
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218092202.26683-4-david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Adds support for the Hypervisor directed interrupts in addition to the
OS ones.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - modified the icp_realize() and xive_tctx_realize() to take
into account explicitly the POWER9 interrupt model
- introduced a specific power9_set_irq for POWER9 ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215161648.9600-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It's very easy for the CPU specific has_work() implementation
and the logic in ppc_hw_interrupt() to be subtly out of sync.
This can occasionally allow a CPU to wake up from a PM state
and resume executing past the PM instruction when it should
resume at the 0x100 vector.
This detects if it happens and aborts, making it a lot easier
to catch such bugs when testing rather than chasing obscure
guest misbehaviour.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215161648.9600-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
STOP must act differently based on PSSCR:EC on POWER9. When set, it
acts like the P7/P8 power management instructions and wake up at 0x100
based on the wakeup conditions in LPCR.
When PSSCR:EC is clear however it will wakeup at the next instruction
after STOP (if EE is clear) or take the corresponding interrupts (if
EE is set).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215161648.9600-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When issuing a power management instruction, we set MSR:EE
to force ppc_hw_interrupt() into calling powerpc_excp()
to deal with the fact that on P7 and P8, the system reset
caused by the wakeup needs to be generated regardless of
the MSR:EE value (using LPCR only).
This however means that the OS will see a bogus SRR1:EE
value which is a problem. It also prevents properly
implementing P9 STOP "light".
So fix this by instead putting some logic in ppc_hw_interrupt()
to decide whether to deliver or not by taking into account the
fact that we are waking up from sleep.
The LPCR isn't checked as this is done in the has_work() test.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215161648.9600-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Those instructions currently raise an exception from within
the helper. This tends to result in a bogus nip value in
the env context (typically the beginning of the TB). Such
a helper needs a gen_update_nip() first.
This fixes it with a different approach which is to throw the
exception from translate.c instead of the helper using
gen_exception_nip() which does the right thing. Exception
EXCP_HLT is also used instead of POWERPC_EXCP_STOP to effectively
exit from the CPU execution loop.
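Schematically, the translation-time code for these instructions ends up
doing something like this (a sketch; the DisasContext field naming is
abbreviated):

    /* Raise the exception at translation time with a precise guest
     * address, and use EXCP_HLT so the outer CPU execution loop is
     * actually exited. */
    gen_exception_nip(ctx, EXCP_HLT, ctx->nip);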
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: modified the commit log to comment the use of EXCP_HLT instead
of POWERPC_EXCP_STOP]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215161648.9600-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Pull request
# gpg: Signature made Fri 22 Feb 2019 14:07:01 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request: (27 commits)
tests/virtio-blk: add test for DISCARD command
tests/virtio-blk: add test for WRITE_ZEROES command
tests/virtio-blk: add virtio_blk_fix_dwz_hdr() function
tests/virtio-blk: change assert on data_size in virtio_blk_request()
virtio-blk: add DISCARD and WRITE_ZEROES features
virtio-blk: set config size depending on the features enabled
virtio-net: make VirtIOFeature usable for other virtio devices
virtio-blk: add "discard" and "write-zeroes" properties
virtio-blk: add host_features field in VirtIOBlock
virtio-blk: add acct_failed param to virtio_blk_handle_rw_error()
hw/ide: drop iov field from IDEDMA
hw/ide: drop iov field from IDEBufferedRequest
hw/ide: drop iov field from IDEState
tests/test-bdrv-drain: use QEMU_IOVEC_INIT_BUF
migration/block: use qemu_iovec_init_buf
qemu-img: use qemu_iovec_init_buf
block/vmdk: use qemu_iovec_init_buf
block/qed: use qemu_iovec_init_buf
block/qcow2: use qemu_iovec_init_buf
block/qcow: use qemu_iovec_init_buf
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches:
- Fix various issues with bdrv_refresh_filename()
- Fix various iotests
- Include LUKS overhead in qemu-img measure for qcow2
- A fix for vmdk's image creation interface
# gpg: Signature made Mon Feb 25 15:13:43 2019 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2019-02-25: (45 commits)
iotests: Skip 211 on insufficient memory
vmdk: false positive of compat6 with hwversion not set
iotests: add LUKS payload overhead to 178 qemu-img measure test
qcow2: include LUKS payload overhead in qemu-img measure
iotests.py: s/_/-/g on keys in qmp_log()
iotests: Let 045 be run concurrently
iotests: Filter SSH paths
iotests.py: Filter filename in any string value
iotests.py: Add is_str()
iotests: Fix 207 to use QMP filters for qmp_log
iotests: Fix 232 for LUKS
iotests: Remove superfluous rm from 232
iotests: Fix 237 for Python 2.x
iotests: Re-add filename filters
iotests: Test json:{} filenames of internal BDSs
block: BDS options may lack the "driver" option
block/null: Generate filename even with latency-ns
block/curl: Implement bdrv_refresh_filename()
block/curl: Harmonize option defaults
block/nvme: Fix bdrv_refresh_filename()
...
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VDI keeps the whole bitmap in memory, and the maximum size (which is
tested here) is 2 GB. This may not be available on all machines, and it
rarely is available when running a 32 bit build.
Fix this by making VM.run_job() return the error string if an error
occurred, and checking whether that contains "Could not allocate bmap"
in 211. If so, the test is skipped.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190218180646.30282-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
In vmdk_co_create_opts, when it finds hw_version is undefined, it will
set it to 4, which misleads the compat6 and hwversion handling in
vmdk_co_do_create. Simply set hw_version to NULL after freeing it, and let
the logic in vmdk_co_do_create decide the value of hw_version (a sketch of
the fix follows the reproducer below).
This bug can be reproduced by:
$ qemu-img convert -O vmdk -o subformat=streamOptimized,compat6
/home/yuchenlin/syno.qcow2 /home/yuchenlin/syno.vmdk
qemu-img: /home/yuchenlin/syno.vmdk: error while converting vmdk:
compat6 cannot be enabled with hwversion set
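A minimal sketch of the fix in vmdk_co_create_opts() (the "undefined"
string is the sentinel the message above refers to; treat the snippet as
illustrative):

    if (!strcmp(hw_version, "undefined")) {
        /* Previously this forced hw_version to "4", which later tripped
         * the compat6/hwversion check.  Leave it NULL instead and let
         * vmdk_co_do_create() pick the default. */
        g_free(hw_version);
        hw_version = NULL;
    }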
Signed-off-by: yuchenlin <yuchenlin@synology.com>
Message-id: 20190221110805.28239-1-yuchenlin@synology.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
LUKS encryption reserves clusters for its own payload data. The size of
this area must be included in the qemu-img measure calculation so that
we arrive at the correct minimum required image size.
(Ab)use the qcrypto_block_create() API to determine the payload
overhead. We discard the payload data that qcrypto thinks will be
written to the image.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190218104525.23674-2-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
filter_qmp_testfiles() currently filters the filename only for specific
keys. However, there are more keys that take filenames (such as
block-commit's @top and @base, or ssh's @path), and it does not make
sense to list them all here. "$TEST_DIR/$PID-" should have enough
entropy not to appear anywhere randomly.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
With IMGOPTSSYNTAX, $TEST_IMG is useless for this test (it only tests
the file-posix protocol driver). Therefore, if $TEST_IMG_FILE is set,
use that instead.
Because this test requires the file protocol, $TEST_IMG_FILE will always
be set if $IMGOPTSSYNTAX is true.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
math.ceil() returns an integer on Python 3.x, but a float on Python 2.x.
range() always needs integers, so we need an explicit conversion on 2.x
(which does not hurt on 3.x).
It is not quite clear whether we want to support Python 2.x for any
prolonged time, but this may as well be fixed along with the other
issues some iotests have right now.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190210145736.1486-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
A previous commit removed the default filters for qmp_log with the
intention to make them explicit; but this happened only for test 206.
There are more tests (for more exotic image formats than qcow2) which
require the filename filter, though.
Note that 237 is still broken for Python 2.x, which is fixed in the next
commit.
Fixes: f8ca8609d8
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
When BDSs are created by qemu itself (e.g. as filters in block jobs),
they may not have a "driver" option in their options QDict. When
generating a json:{} filename, however, it must always be present.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-31-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Both of the defaults we currently have in the curl driver are named
based on slightly different schemes; let's unify that and call both
CURL_BLOCK_OPT_${NAME}_DEFAULT.
While at it, we can add a macro for the third option for which a default
exists, namely "sslverify".
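After the rename the grouped defaults would look something like this
(values shown are illustrative):

    #define CURL_BLOCK_OPT_READAHEAD_DEFAULT (256 * 1024)
    #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
    #define CURL_BLOCK_OPT_SSLVERIFY_DEFAULT true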
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-28-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently, nvme's bdrv_refresh_filename() is an exact copy of null's
implementation. However, for null, "null-co://" and "null-aio://" are
indeed valid filenames -- for nvme, they are not, as a device address is
still required.
The correct implementation should generate a filename of the form
"nvme://[PCI address]/[namespace]" (as the comment above
nvme_parse_filename() describes).
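A hedged sketch of the corrected implementation (struct field names are
illustrative):

    static void nvme_refresh_filename(BlockDriverState *bs)
    {
        BDRVNVMeState *s = bs->opaque;

        /* e.g. "nvme://0000:04:00.0/1" */
        snprintf(bs->exact_filename, sizeof(bs->exact_filename),
                 "nvme://%s/%i", s->device, s->nsid);
    }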
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-27-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
If a format BDS's file BDS is in turn a format BDS, we cannot simply use
the same filename, because when opening a BDS tree based on a filename
alone, qemu will create only one format node on top of one protocol node
(disregarding a potential backing file).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-26-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
Now that we have generic code in the central bdrv_refresh_filename() for
creating BDS.full_open_options, we can drop the latter part from all
BlockDriver.bdrv_refresh_filename() implementations.
This also means that we can drop all of the existing default code for
this from the global bdrv_refresh_filename() itself.
Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
after having set BDS.full_open_options, because the block driver's
implementation should now be allowed to depend on BDS.full_open_options
being set correctly.
Finally, with this patch we can drop the @options parameter from
BlockDriver.bdrv_refresh_filename(); also, add a comment on this
function's purpose in block/block_int.h while touching its interface.
This completely obsoletes blklogwrite's implementation of
.bdrv_refresh_filename().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-25-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Instead of having every block driver which implements
bdrv_refresh_filename() copy all of the strong runtime options over to
bs->full_open_options, implement this process generically in
bdrv_refresh_filename().
This patch only adds this new generic implementation, it does not remove
the old functionality. This is done in a follow-up patch.
With this patch, some superfluous information (that should never have
been there) may be removed from some JSON filenames, as can be seen in
the change to iotests 110's and 228's reference outputs.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-24-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Some follow-up patches will rework the way bs->full_open_options is
refreshed in bdrv_refresh_filename(). The new implementation will remove
the need for the block drivers' bdrv_refresh_filename() implementations
to set bs->full_open_options; instead, it will be generic and use static
information from each block driver.
However, by implementing bdrv_gather_child_options(), block drivers will
still be able to override the way the full_open_options of their
children are incorporated into their own.
We need to implement this function for VMDK because we have to prevent
the generic implementation from gathering the options of all children:
It is not possible to specify options for the extents through the
runtime options.
For quorum, the child names that would be used by the generic
implementation and the ones that we actually (currently) want to use
differ. See quorum_gather_child_options() for more information.
Note that both of these are cases which are not ideal: In case of VMDK
it would probably be nice to be able to specify options for all extents.
In case of quorum, the current runtime option structure is simply broken
and needs to be fixed (but that is left for another patch).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-23-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This new field can be set by block drivers to list the runtime options
they accept that may influence the contents of the respective BDS. As of
a follow-up patch, this list will be used by the common
bdrv_refresh_filename() implementation to decide which options to put
into BDS.full_open_options (and consequently whether a JSON filename has
to be created), thus freeing the drivers of having to implement that
logic themselves.
Additionally, this patch adds the field to all of the block drivers that
need it and sets it accordingly.
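For an individual driver the change is typically just a NULL-terminated
list plus one pointer in its BlockDriver definition, e.g. (sketch with
illustrative option names):

    static const char *const ssh_strong_runtime_opts[] = {
        "host",
        "port",
        "path",
        "user",
        NULL
    };

    static BlockDriver bdrv_ssh = {
        .format_name          = "ssh",
        /* ... */
        .strong_runtime_opts  = ssh_strong_runtime_opts,
    };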
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-22-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Test 110 tests relative backing filenames for complex BDS trees. Now
that the originally supposedly failing test passes, let us add a new
failing test: Quorum can never work automatically (without detecting
whether all child nodes have the same base directory, but that would be
rather inconsistent behavior).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-21-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
bdrv_get_full_backing_filename_from_filename() breaks down when it comes
to JSON filenames. Using bdrv_dirname() as the basis is better because
since we have BDS, we can descend through the BDS tree to the protocol
layer, which gives us a greater probability of finding a non-JSON name;
also, bdrv_dirname() is more correct as it allows block drivers to
override the generation of that directory name in a protocol-specific
way.
We still need to keep bdrv_get_full_backing_filename_from_filename(),
though, because it has valid callers which need it during image creation
when no BDS is available yet.
This makes a test case in qemu-iotest 110, which was supposed to fail,
work. That is actually good, but we need to change the reference output
(and the comment in 110) accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-20-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
While the basic idea is obvious and could be handled by the default
bdrv_dirname() implementation, we cannot generate a directory name if
the gid or uid are set, so we have to explicitly return NULL in those
cases.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-19-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The generic bdrv_dirname() implementation would be able to generate some
form of directory name for many NBD nodes, but it would be always wrong.
Therefore, we have to explicitly make it an error (until NBD has some
form of specification for export paths, if it ever will).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20190201192935.18394-18-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
While the common implementation for bdrv_dirname() should return NULL
for quorum BDSs already (because they do not have a file node and their
exact_filename field should be empty), there is no reason not to make
that explicit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-17-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
blkverify's BDSs have a file BDS, but we do not want this to be
preferred over the raw node. There is no way to decide between the two
(and not really a reason to, either), so just return NULL in blkverify's
implementation of bdrv_dirname().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-16-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This function may be implemented by block drivers to derive a directory
name from a BDS. Concatenating this g_free()-able string with a relative
filename must result in a valid (not necessarily existing) filename, so
this is a function that should generally be not implemented by format
drivers, because this is protocol-specific.
If a BDS's driver does not implement this function, bdrv_dirname() will
fall through to the BDS's file if it exists. If it does not, the
exact_filename field will be used to generate a directory name.
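As an illustration, a protocol driver whose filenames are plain paths
could implement the callback roughly as follows (sketch):

    static char *example_dirname(BlockDriverState *bs, Error **errp)
    {
        const char *sep = strrchr(bs->exact_filename, '/');

        if (!sep) {
            error_setg(errp, "Cannot generate a base directory for '%s'",
                       bs->exact_filename);
            return NULL;
        }
        /* Keep the trailing slash so a relative filename can be appended */
        return g_strndup(bs->exact_filename, sep - bs->exact_filename + 1);
    }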
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-15-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
bdrv_find_backing_image() should use bdrv_get_full_backing_filename() or
bdrv_make_absolute_filename() instead of trying to do what those
functions do by itself.
path_combine_deprecated() can now be dropped, so let's do that.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This is a general function for making a filename that is relative to a
certain BDS absolute.
It calls bdrv_get_full_backing_filename_from_filename() for now, but
that will be changed in a follow-up patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-13-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Besides being safe for arbitrary path lengths, after some follow-up
patches all callers will want a freshly allocated buffer anyway.
In the meantime, path_combine_deprecated() is added which has the same
interface as path_combine() had before this patch. All callers to that
function will be converted in follow-up patches.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20190201192935.18394-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Basically, bdrv_refresh_filename() should respect all children of a
BlockDriverState. However, generally those children are driver-specific,
so this function cannot handle the general case. On the other hand,
there are only few drivers which use other children than @file and
@backing (that being vmdk, quorum, and blkverify).
Most block drivers only use @file and/or @backing (if they use any
children at all). Both can be implemented directly in
bdrv_refresh_filename.
The user overriding the file's filename is already handled, however, the
user overriding the backing file is not. If this is done, opening the
BDS with the plain filename of its file will not be correct, so we may
not set bs->exact_filename in that case.
iotest 051 contains test cases for overriding the backing file, and so
its output changes with this patch applied.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
If the backing file is overridden, this most probably does change the
guest-visible data of a BDS. Therefore, we will need to consider this
in bdrv_refresh_filename().
To see whether it has been overridden, we might want to compare
bs->backing_file and bs->backing->bs->filename. However,
bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
to change the backing child at runtime, without modifying the image
header), so most of the time bs->backing_file simply contains a copy of
bs->backing->bs->filename anyway, which makes it useless for such a
comparison.
This patch adds an auto_backing_file BDS field which contains the
backing file path as indicated by the image header, which is not changed
by bdrv_set_backing_hd().
Because of bdrv_refresh_filename() magic, however, a BDS's filename may
differ from what has been specified during bdrv_open(). Then, the
comparison between bs->auto_backing_file and bs->backing->bs->filename
may fail even though bs->backing was opened from bs->auto_backing_file.
To mitigate this, we can copy the real BDS's filename (after the whole
bdrv_open() and bdrv_refresh_filename() process) into
bs->auto_backing_file, if we know the former has been opened based on
the latter. This is only possible if no options modifying the backing
file's behavior have been specified, though. To simplify things, this
patch only copies the filename from the backing file if no options have
been specified for it at all.
Furthermore, there are cases where an overlay is created by qemu which
already contains a BDS's filename (e.g. in blockdev-snapshot-sync). We
do not need to worry about updating the overlay's bs->auto_backing_file
there, because we actually wrote a post-bdrv_refresh_filename() filename
into the image header.
So all in all, there will be false negatives where (as of a future
patch) bdrv_refresh_filename() will assume that the backing file differs
from what was specified in the image header, even though it really does
not. However, these cases should be limited to where (1) the user
actually did override something in the backing chain (e.g. by specifying
options for the backing file), or (2) the user executed a QMP command to
change some node's backing file (e.g. change-backing-file or
block-commit with @backing-file given) where the given filename does not
happen to coincide with qemu's idea of the backing BDS's filename.
Then again, (1) really is limited to -drive. With -blockdev or
blockdev-add, you have to adhere to the schema, so a user cannot give
partial "unimportant" options (e.g. by just setting backing.node-name
and leaving the rest to the image header). Therefore, trying to fix
this would mean trying to fix something for -drive only.
To improve on (2), we would need a full infrastructure to "canonicalize"
an arbitrary filename (+ options), so it can be compared against
another. That seems a bit over the top, considering that filenames
nowadays are there mostly for the user's entertainment.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
bdrv_refresh_filename() should invoke itself recursively on all
children, not just on file.
With that change, we can remove the manual invocations in blkverify,
quorum, commit, mirror, and blklogwrites.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Before this patch, bdrv_refresh_filename() is used in a pushing manner:
Whenever the BDS graph is modified, the parents of the modified edges
are supposed to be updated (recursively upwards). However, that is
nonviable, considering that we want child changes not to concern
parents.
Also, in the long run we want a pull model anyway: Here, we would have a
bdrv_filename() function which returns a BDS's filename, freshly
constructed.
This patch is an intermediate step. It adds bdrv_refresh_filename()
calls before every place a BDS.filename value is used. The only
exceptions are protocol drivers that use their own filename, which
clearly would not profit from refreshing that filename beforehand.
Also, bdrv_get_encrypted_filename() is removed along the way (as a user
of BDS.filename), since it is completely unused.
In turn, all of the calls to bdrv_refresh_filename() before this patch
are removed, because we no longer have to call this function on graph
changes.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The QEMU_PACKED is causing a compiler warning/error with GCC 9:
CC block/nvme.o
block/nvme.c: In function ‘nvme_create_queue_pair’:
block/nvme.c:209:22: error: taking address of packed member of
‘struct <anonymous>’ may result in an unaligned pointer value
[-Werror=address-of-packed-member]
209 | q->sq.doorbell = &s->regs->doorbells[idx * 2 * s->doorbell_scale];
All members of the struct are naturally aligned, so there should be
no need for QEMU_PACKED here; the following QEMU_BUILD_BUG_ON also
ensures that there is no padding. Thus simply remove QEMU_PACKED here.
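For illustration only, the kind of compile-time layout check that makes the
packed attribute unnecessary can be written as a plain C11 static assertion
(QEMU_BUILD_BUG_ON wraps the same idea; the struct below is a made-up
stand-in, not the nvme register block):
    #include <assert.h>
    #include <stdint.h>

    /* All members naturally aligned, so the compiler inserts no padding
     * and no packed attribute is needed. */
    struct example_regs {
        uint64_t cap;
        uint32_t version;
        uint32_t intms;
        uint64_t doorbells[2];
    };

    /* Fails to compile if the layout ever gains padding. */
    static_assert(sizeof(struct example_regs) == 0x20,
                  "example_regs must not contain padding");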
Buglink: https://bugs.launchpad.net/qemu/+bug/1817525
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
L1 table entries have a field to store the offset of an L2 table.
The rest of the bits of the entry are currently reserved, except for
bit 63, which stores the COPIED flag.
The offset is always taken from the entry using L1E_OFFSET_MASK to
ensure that we only use the bits that belong to that field.
While that mask is used every time we read from the L1 table, it is
never used when we write to it. Due to the limits set elsewhere in the
code, QEMU can never produce L2 table offsets that don't fit in that
field, so any such offset seen when allocating an L2 table would indicate
a bug in QEMU.
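A minimal sketch of that kind of sanity check (the mask value and names are
illustrative, not the exact qcow2 definitions):
    #include <assert.h>
    #include <stdint.h>

    /* Illustrative: the offset lives in bits 9..55 of an L1 entry. */
    #define EXAMPLE_L1E_OFFSET_MASK 0x00fffffffffffe00ULL

    static void example_set_l1_entry(uint64_t *l1_entry, uint64_t l2_offset)
    {
        /* An offset outside the field would indicate a bug in the caller. */
        assert((l2_offset & ~EXAMPLE_L1E_OFFSET_MASK) == 0);
        *l1_entry = l2_offset;
    }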
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Various testing fixes:
- Travis updates (inc disable isapc cdrom test)
- Add gitlab control
- Fix docker image
- Keep softfloat tests short
# gpg: Signature made Fri 22 Feb 2019 09:51:36 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-next-220219-1:
tests/cdrom-test: only include isapc cdrom test when g_test_slow()
tests/softfloat: always do quick softfloat tests
Add a gitlab-ci file for Continuous Integration testing on Gitlab
tests/docker: peg netmap code to a specific version
tests/docker: squash initial update and install step for debian9
.travis.yml: Remove disable-uuid
.travis.yml: Test with disable-replication
.travis.yml: split debug builds
.travis.yml: the xcode10 image seems to be hosed
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In its recursion, bdrv_check_perm checks each node in the context of new
permissions for one parent, because of the nature of DFS. This works well
as long as the children subgraph of the top-most updated node is a tree,
i.e. it doesn't have any kind of loop. But if we have a loop (unoriented,
of course), i.e. two different paths from the top node to some child node,
then bdrv_check_perm will do the wrong thing:
top
| \
| |
v v
A B
| |
v v
node
It will check the node's new permissions once in the context of new A
permissions and old B permissions, and once vice versa. This is wrong and
may corrupt the permission system: we may start with no permissions and
all-shared for both the A->node and B->node relations and end up with a
non-shared write permission on both paths.
The following commit will add a test, which shows this bug.
To fix this situation, let's really set the BdrvChild permissions during
the bdrv_check_perm procedure. We are lucky here, as check-perm is
already written in a transactional manner, so we just need to restore the
backed-up permissions in _abort.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As the comment already says, we don't want to create loops in
parent->child relations. So, when we try to append @to to @c, we should
check that @c is not in @to's children subtree, and we should check it
recursively, not only at the first level. The patch provides a BFS-based
search to check the relations (see the sketch below).
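As a sketch only (simplified node type, not the real BdrvChild graph), such
a breadth-first subtree search could look roughly like this:
    #include <glib.h>
    #include <stdbool.h>

    typedef struct ExampleNode ExampleNode;
    struct ExampleNode {
        GSList *children;           /* list of ExampleNode * */
    };

    /* Return true if @needle is @root or lies anywhere in @root's
     * children subtree (breadth-first traversal). */
    static bool example_in_subtree(ExampleNode *root, ExampleNode *needle)
    {
        GQueue queue = G_QUEUE_INIT;
        bool found = false;

        g_queue_push_tail(&queue, root);
        while (!g_queue_is_empty(&queue)) {
            ExampleNode *node = g_queue_pop_head(&queue);
            GSList *l;

            if (node == needle) {
                found = true;
                break;
            }
            for (l = node->children; l; l = l->next) {
                g_queue_push_tail(&queue, l->data);
            }
        }
        g_queue_clear(&queue);
        return found;
    }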
This is needed for the upcoming fleecing-hook filter usage: we need to
append it to the source when the hook is already a parent of the target,
and the source may be in the target's backing chain (fleecing scheme).
So, on appending, the hook must not become a child (direct or through
the children subtree) of the target.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
aio_poll() has an existing assertion that the function is only called
from the AioContext's home thread if blocking is allowed.
This is not enough; some handlers make assumptions about the thread they
run in. Extend the assertion to non-blocking calls, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Now that bdrv_set_aio_context() works inside drained sections, it can
also use the real drain function instead of open coding something
similar.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a drained node changes its AioContext, we need to move its
aio_disable_external() to the new context, too.
Without this fix, drain_end will try to reenable the new context, which
has never been disabled, so an assertion failure is triggered.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The explicit aio_poll() call in bdrv_set_aio_context() was added in
commit c2b6428d38 as a workaround for bdrv_drain() failing to actually
quiesce everything (specifically the NBD client code switching
AioContext).
Now that the NBD client has been fixed to complete this operation during
bdrv_drain(), we don't need the workaround any more.
It was wrong anyway: aio_poll() must always be run in the home thread of
the AioContext.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
bdrv_drain() must not leave connection_co scheduled, so bs->in_flight
needs to be increased while the coroutine is waiting to be scheduled
in the new AioContext after nbd_client_attach_aio_context().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of using the convenience wrapper qio_channel_read_all_eof(), use
the lower level QIOChannel API. This means duplicating some code, but
we'll need this because this coroutine yield is special: We want it to
be interruptible so that nbd_client_attach_aio_context() can correctly
reenter the coroutine.
This moves the bdrv_dec/inc_in_flight() pair into nbd_read_eof(), so
that connection_co will always sit in this exact qio_channel_yield()
call when bdrv_drain() returns.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The only caller of nbd_read_eof() is nbd_receive_reply(), so it doesn't
have to live in the header file, but can move next to its caller.
Also add the missing coroutine_fn to the function and its caller.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qio_channel_yield() now updates ioc->read_write/coroutine and calls
qio_channel_set_aio_fd_handlers(), so the code in the handlers has
become redundant and can be removed.
This does not make a difference in intermediate states because
aio_co_wake() really enters the coroutine immediately here: These
handlers are never run in coroutine context, and we're in the right
AioContext because qio_channel_attach_aio_context() asserts that the
handlers are inactive.
To make these conditions more obvious, assert the right AioContext.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Similar to how qemu_co_sleep_ns() allows preemption from an external
coroutine entry, allow reentering qio_channel_yield() early.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nbd_client_attach_aio_context() schedules connection_co in the new
AioContext and this way reenters it in any arbitrary place that has
yielded. We can restrict this a bit to the function call where the
coroutine actually sits waiting when it's idle.
This doesn't solve any bug yet, but it shows where in the code we need
to support this random reentrance and where we don't have to care.
Add FIXME comments for the existing bugs that the rest of this series
will fix.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
virtio_blk_dma_restart_bh() submits new requests, so in order to make
sure that these requests are not started inside a drained section of the
attached BlockBackend, we need to make sure that draining the
BlockBackend waits for the BH to be executed.
This BH is still questionable because it's scheduled in the main thread
instead of the configured iothread. Leave a FIXME comment for this.
But with this fix, enabling the data plane at least waits for these
requests (in bdrv_set_aio_context()) instead of changing the AioContext
under their feet and making them run in the wrong thread, causing
crashes and failures (e.g. due to missing locking).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For some users of BlockBackends, just increasing the in_flight counter
is easier than implementing separate handlers in BlockDevOps. Make the
helper functions for this public.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Error reporting for user_creatable_add_opts_foreach was changed so that
it no longer called 'error_report_err' in:
    commit 7e1e0c1112
    Author: Markus Armbruster <armbru@redhat.com>
    Date:   Wed Oct 17 10:26:43 2018 +0200

        qom: Clean up error reporting in user_creatable_add_opts_foreach()
Some callers were updated to pass in "&error_fatal" but all the ones in
qemu-img were left passing NULL. As a result all errors went to
/dev/null instead of being reported to the user.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If there's an error in commit_start(), then the block job must be
deleted before replacing commit_top_bs, otherwise it will fail because
of lack of permissions. This has been the case since the permission
system was introduced in 8dfba27977.
Fortunately this bug doesn't seem to be possible to reproduce at the
moment without changing the code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a fast path to aio context setting that skips the context-setting
routine when it is unnecessary.
This also prevents issues with a cyclic walk of child BDSes that can
appear because of registered aio walking notifiers:
Call stack:
0 __GI_raise
1 __GI_abort
2 __assert_fail_base
3 __GI___assert_fail
4 bdrv_detach_aio_context (bs=0x55f54d65c000) <<<
5 bdrv_detach_aio_context (bs=0x55f54fc8a800)
6 bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
7 block_job_attached_aio_context
8 bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
9 bdrv_set_aio_context (bs=0x55f54d65c000)
10 blk_set_aio_context
11 virtio_blk_data_plane_stop
12 virtio_bus_stop_ioeventfd
13 virtio_vmstate_change
14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
16 vm_stop (state=RUN_STATE_SHUTDOWN)
17 main_loop_should_exit
18 main_loop
19 main
This can happen because of "new" context attachment to the VM disk BDS.
When attaching a new context, the corresponding aio context handler is
called for each of the aio_notifiers registered on the VM disk BDS context.
Among those handlers is block_job_attached_aio_context, which sets a new
aio context for the block job BDS. When doing so, the old context is
detached from all the block job BDS children, one of which is the VM disk
BDS serving as backing store for the block job BDS, even though the VM
disk BDS is actually the initiator of the whole process.
Since the VM disk BDS is protected by the walking_aio_notifiers flag
against double processing in recursive calls, the assert fires.
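A hedged sketch of the fast path itself, with stand-in types instead of the
real BlockDriverState/AioContext:
    #include <stddef.h>

    typedef struct ExampleCtx ExampleCtx;      /* stand-in for AioContext */
    typedef struct {
        ExampleCtx *aio_context;
    } ExampleBDS;

    static void example_detach_aio_context(ExampleBDS *bs)
    {
        bs->aio_context = NULL;
    }

    static void example_attach_aio_context(ExampleBDS *bs, ExampleCtx *ctx)
    {
        bs->aio_context = ctx;
    }

    /* Fast path: if the node already uses the requested context, skip the
     * detach/attach walk (and the aio notifiers it would trigger). */
    static void example_set_aio_context(ExampleBDS *bs, ExampleCtx *new_context)
    {
        if (bs->aio_context == new_context) {
            return;
        }
        example_detach_aio_context(bs);
        example_attach_aio_context(bs, new_context);
    }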
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_snapshot_create there is the following code block:
    /* Generate an ID */
    find_new_snapshot_id(bs, sn_info->id_str, sizeof(sn_info->id_str));

    /* Check that the ID is unique */
    if (find_snapshot_by_id_and_name(bs, sn_info->id_str, NULL) >= 0) {
        return -EEXIST;
    }
find_new_snapshot_id cycles through all snapshots, getting the id_str
as an unsigned long int, calculating the max id_max value of all the
existing id_strs and writing in the id_str pointer id_max + 1:
    for(i = 0; i < s->nb_snapshots; i++) {
        sn = s->snapshots + i;
        id = strtoul(sn->id_str, NULL, 10);
        if (id > id_max)
            id_max = id;
    }
    snprintf(id_str, id_str_size, "%lu", id_max + 1);
Here, sn_info->id_str will have the unique value id_max + 1. Right
after that, find_snapshot_by_id_and_name is called with
id = sn_info->id_str and name = NULL. This will cause the function
to execute the following:
    } else if (id) {
        for (i = 0; i < s->nb_snapshots; i++) {
            if (!strcmp(s->snapshots[i].id_str, id)) {
                return i;
            }
        }
    }
In short, we're searching the existing snapshots to see if sn_info->id_str
matches any existing id, right after we set in the previous line a
sn_info->id_str value that is already unique.
The first code block goes way back to commit 585f8587ad, a 2006 commit from
Fabrice Bellard that simply says "new qcow2 disk image format". No more
info is provided about this logic in any subsequent commits that moved
this code block around.
I can't speak for the original design, but the current logic is redundant:
bdrv_snapshot_create is called under the aio_context lock, which prevents
any concurrent call from accidentally creating a new snapshot between
the find_new_snapshot_id and find_snapshot_by_id_and_name calls. What
we end up doing is cycling through the snapshots twice for no good
reason.
This patch eliminates the redundancy by removing the 'id is unique'
check that calls find_snapshot_by_id_and_name.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After the previous patch, the only instance of this function left
is inside qemu-img.c.
qemu-img is using it inside the 'img_snapshot' function to delete
snapshots in the SNAPSHOT_DELETE case, based on a "snapshot_name"
string that refers to the tag, not ID, of the QEMUSnapshotInfo struct.
This can be verified by checking the SNAPSHOT_CREATE case that
comes shortly before SNAPSHOT_DELETE. In that case, the same
"snapshot_name" variable is being strcpy to the 'name' field
of the QEMUSnapshotInfo struct sn:
pstrcpy(sn.name, sizeof(sn.name), snapshot_name);
Based on that, it is unlikely that "snapshot_name" might contain
an "id" in SNAPSHOT_DELETE.
This patch changes SNAPSHOT_DELETE to use snapshot_find() and
snapshot_delete() instead of bdrv_snapshot_delete_by_id_or_name.
After that, there are no instances left of bdrv_snapshot_delete_by_id_or_name
in the code, so it is safe to remove it entirely.
Suggested-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
At this moment, QEMU attempts to create/load/delete snapshots
by using either an ID (id_str) or a name. The problem is that the code
isn't consistent about whether the entered argument is an ID or a name,
causing unexpected behaviors.
For example, when creating snapshots via savevm <arg>, what happens is that
"arg" is treated as both name and id_str. In a guest without snapshots, create
a single snapshot via savevm:
(qemu) savevm 0
(qemu) info snapshots
List of snapshots present on all disks:
ID TAG VM SIZE DATE VM CLOCK
-- 0 741M 2018-07-31 13:39:56 00:41:25.313
A snapshot with name "0" is created. ID is hidden from the user, but the
ID is a non-zero integer that starts at "1". Thus, this snapshot has
id_str=1, TAG="0". Creating a second snapshot with arg = 1, the first one
is deleted:
(qemu) savevm 1
(qemu) info snapshots
List of snapshots present on all disks:
ID TAG VM SIZE DATE VM CLOCK
-- 1 741M 2018-07-31 13:42:14 00:41:55.252
What happened?
- when creating the second snapshot, a verification is done inside
bdrv_all_delete_snapshot to delete any existing snapshots that match a
string argument. Here, the code calls bdrv_all_delete_snapshot("1", ...);
- bdrv_all_delete_snapshot calls bdrv_snapshot_find(..., "1") for each
BlockDriverState of the guest. And this is where things go wrong:
bdrv_snapshot_find does a search by both id_str and name. It finds
out that there is a snapshot that has id_str = 1, stores a reference
to the snapshot in the sn_info pointer and then returns match found;
- since a match was found, a call to bdrv_snapshot_delete_by_id_or_name() is
made. This function ignores the pointer written by bdrv_snapshot_find. Instead,
it deletes the snapshot using bdrv_snapshot_delete(), calling it first with
id_str = 1. If it fails to delete, then it calls it again with name = 1.
- after all that, QEMU creates the new snapshot, which has id_str = 1 and
name = 1. The user is left wondering what happened to the first snapshot
created. Similar bugs can be triggered when using loadvm and delvm.
Before contemplating discarding the use of ID input in these operations,
I've searched the code to see what the implications would be. My findings
are:
- the RBD and Sheepdog drivers don't care. Both use the 'name' field as
the key in their logic, making id_str = name when appropriate.
replay-snapshot.c does not make any special use of id_str;
- qcow2 uses id_str as a unique identifier, but it is automatically
calculated and not influenced by user input. Other than that, there are
no operations that are made only with id_str;
- in blockdev.c, the delete operation uses a match of both id_str AND
name. Given that id_str is either a copy of 'name' or auto-generated,
we're fine here.
This gives motivation to not consider ID as a valid user input in HMP
commands - sticking with 'name' input only is more consistent. To
accomplish that, the following changes were made in this patch:
- bdrv_snapshot_find() does not match for id_str anymore, only 'name'. The
function is called in save_snapshot(), load_snapshot(), bdrv_all_delete_snapshot()
and bdrv_all_find_snapshot(). This change makes the search function more
predictable and does not change the behavior of any underlying code that uses
these affected functions, which are related to HMP (which is fine) and the
main loop inside vl.c (which doesn't care about it anyway);
- bdrv_all_delete_snapshot() does not call bdrv_snapshot_delete_by_id_or_name
anymore. Instead, it uses the pointer returned by bdrv_snapshot_find to
erase the snapshot with the exact match of id_str and name. This function
is called in save_snapshot and hmp_delvm, thus this change produces the
intended effect;
- documentation changes to reflect the new behavior. I consider this to
be an API fix instead of an API change - the user was already creating
snapshots using 'name', but now he/she will also enjoy a consistent
behavior.
Ideally we would get rid of the id_str field entirely, but this would have
repercussions on existing snapshots. Another day perhaps.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll not be involved in day-to-day qemu development. Remove myself as
maintainer from the remainder of the network block drivers, and revert
them to the general block layer maintainership.
Move 'sheepdog' to the 'Odd Fixes' support level.
For VHDX, added my personal email address as a maintainer, as I can
answer questions or send the occasional bug fix. Leaving it as
'Supported', instead of 'Odd Fixes', because I think the rest of the
block layer maintainers and developers will upkeep it as well, if
needed.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Message-Id: <63e205cb84c8f0a10c1bc6d5d6856d72ceb56e41.1537984851.git.jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
ui: add support for -display spice-app
ui: gtk+sdl bugfixes.
# gpg: Signature made Fri 22 Feb 2019 07:53:13 GMT
# gpg: using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/ui-20190222-pull-request:
display: add -display spice-app launching a Spice client
spice: use a default name for the server
qapi: document DisplayType enum
build-sys: add gio-2.0 check
char: register spice ports after spice started
char: move SpiceChardev and open_spice_port() to spice.h header
spice: do not stop spice if VM is paused
spice: merge options lists
spice: avoid spice runtime assert
char/spice: discard write() if backend is disconnected
char/spice: trigger HUP event
ui/gtk: Fix the license information
sdl2: drop qemu_input_event_send_key_qcode call
spice: set device address and device display ID in QXL interface
kbd-state: don't block auto-repeat events
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
MIPS queue for February 21st, 2019, v2
# gpg: Signature made Thu 21 Feb 2019 18:37:04 GMT
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-feb-21-2019-v2:
target/mips: fulong2e: Dynamically generate SPD EEPROM data
target/mips: fulong2e: Fix bios flash size
hw/pci-host/bonito.c: Add PCI mem region mapped at the correct address
target/mips: implement QMP query-cpu-definitions command
tests/tcg: target/mips: Add wrappers for MSA integer compare instructions
tests/tcg: target/mips: Change directory name 'bit-counting' to 'bit-count'
tests/tcg: target/mips: Correct path to headers in some test source files
hw/misc: mips_itu: Fix 32/64 bit issue in a line involving shift operator
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds support for the DISCARD and WRITE_ZEROES commands,
which have been introduced in the virtio-blk protocol to get
better performance when using an SSD backend.
We support only one segment per request since multiple segments
are not widely used and there are no userspace APIs that allow
applications to submit multiple segments in a single call.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-7-sgarzare@redhat.com
Message-Id: <20190221103314.58500-7-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In order to use VirtIOFeature also in other virtio devices, we move
its declaration and the endof() macro (renamed to virtio_endof())
into virtio.h.
We add the virtio_feature_get_config_size() function to iterate the array
of VirtIOFeature and return the config size depending on the features
enabled (as virtio_net_set_config_size() did).
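Roughly, such a helper walks a feature/size table and keeps the largest
config size whose feature bit is enabled; a standalone sketch (not the exact
QEMU signature or types) could be:
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint64_t flags;     /* feature bit(s) guarding the fields */
        size_t   end;       /* config size required when enabled  */
    } ExampleVirtIOFeature;

    static size_t example_config_size(const ExampleVirtIOFeature *features,
                                      size_t n, uint64_t host_features)
    {
        size_t config_size = 0;

        for (size_t i = 0; i < n; i++) {
            if (host_features & features[i].flags) {
                config_size = config_size > features[i].end
                              ? config_size : features[i].end;
            }
        }
        return config_size;
    }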
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-5-sgarzare@redhat.com
Message-Id: <20190221103314.58500-5-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since the configurable features for virtio-blk are growing, this patch
adds a host_features field in the struct VirtIOBlock (as in virtio-net).
In this way, we can avoid adding new fields for new properties and
we can directly set VIRTIO_BLK_F* flags in host_features.
We update the "config-wce" and "scsi" property definitions to use the new
host_features field without changing the behaviour.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-3-sgarzare@redhat.com
Message-Id: <20190221103314.58500-3-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We are seeing instability on our CI runs which has been there since
the test was introduced. I suspect it triggers more on Travis due to
their heavy load.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Thomas Huth <thuth@redhat.com>
Some operations take a long time and enabling "-l 2 -r all" can take
more than a day, which is stretching the definition of a "slow" test.
Let's default to the quick tests and leave a note for those who wish to
run them by hand.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This is very convenient for people like me who store their QEMU git trees
on gitlab.com: Automatic CI pipelines are now run for each branch that is
pushed to the server - useful for some extra testing before sending pull
requests, for example. Since the runtime of the jobs is limited to 1h, the
jobs are distributed into multiple pipelines - this way everything finishes
in time (ca. 30 minutes currently).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1550058881-16351-1-git-send-email-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tracking head is always going to be at the whims of the upstream.
Let's use a defined release so things don't magically change under us.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The builds are reaching the magic 50 minute limit with regularity, so
let's split them up. Rather than doing a full debug build on both, just
enable debug TCG for linux-user.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
It fails to install homebrew. Unfortunately we cannot mark
it as an expected failure because Travis does not match
allow_failures rows against include rows (only against the
main test matrix, which we do not use at all), so just disable
it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20190220105131.23479-1-pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Add a new display backend that will configure Spice to allow a remote
client to control QEMU in a similar fashion to other QEMU display
backends/UIs like GTK.
For this to work, it will set up the Spice server with a unix socket, and
register a VC chardev that will be exposed as Spice ports. A QMP
monitor is also exposed as a Spice port; this gives the remote client
fuller QEMU control and state handling.
- doesn't handle VC set_echo() - this doesn't seem to be a strong
requirement, very few front-ends use it
- spice options can be tweaked with other -spice arguments
- Windows support shouldn't be hard to do, but will probably use a TCP
port instead
- we may want to watch the child process to quit automatically if it
crashed
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Victor Toso <victortoso@redhat.com>
Message-id: 20190221110703.5775-12-marcandre.lureau@redhat.com
[ kraxel: squash incremental fix ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch adds EDID support to the family of virtio-gpu devices. It is
turned off by default, use the new edid property to enable it.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190221081054.13853-1-kraxel@redhat.com
We introduce the vfio_init_container_type() helper.
It computes the highest usable iommu type and then
sets the container and the iommu type.
Its use in vfio_connect_container() makes the code
ready for the addition of new iommu types.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
A kernel bug was introduced in v4.15 via commit 71a7d3d78e3c which
adds a test for address space wrap-around in the vfio DMA unmap path.
Unfortunately due to overflow, the kernel detects an unmap of the last
page in the 64-bit address space as a wrap-around. In QEMU, a Q35
guest with VT-d emulation and guest IOMMU enabled will attempt to make
such an unmap request during VM system reset, triggering an error:
qemu-kvm: VFIO_UNMAP_DMA: -22
qemu-kvm: vfio_dma_unmap(0x561f059948f0, 0xfef00000, 0xffffffff01100000) = -22 (Invalid argument)
Here the IOVA start address (0xfef00000) and the size parameter
(0xffffffff01100000) add to exactly 2^64, triggering the bug. A
kernel fix is queued for the Linux v5.0 release to address this.
This patch implements a workaround to retry the unmap, excluding the
final page of the range, when we detect a failing unmap that matches
the requirements for this issue. This is expected to be a safe and
complete workaround as the VT-d address space does not extend to the
full 64-bit space and therefore the last page should never be mapped.
This workaround can be removed once all kernels with this bug are
sufficiently deprecated.
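A sketch of the overflow condition and retry size described above
(simplified; the real code issues VFIO_IOMMU_UNMAP_DMA ioctls and checks the
returned errno; a 4 KiB page size is assumed here):
    #include <stdbool.h>
    #include <stdint.h>

    /* The kernel bug triggers when iova + size wraps to exactly 0, i.e. the
     * range ends on the last page of the 64-bit address space. */
    static bool example_unmap_wraps(uint64_t iova, uint64_t size)
    {
        return size > 0 && iova + size == 0;   /* unsigned wrap at 2^64 */
    }

    /* Workaround: retry the unmap without the final page; on VT-d that page
     * can never be mapped, so nothing is left stale. */
    static uint64_t example_retry_size(uint64_t size)
    {
        return size - 4096;
    }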
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
Reported-by: Pei Zhang <pezhang@redhat.com>
Debugged-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
target-arm queue:
* Model the Arm "Musca" development boards: "musca-a" and "musca-b1"
* Implement the ARMv8.3-JSConv extension
* v8M MPU should use background region as default, not always
* Stop unintentional sign extension in pmu_init
# gpg: Signature made Thu 21 Feb 2019 18:56:32 GMT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20190221: (21 commits)
hw/arm/armsse: Make 0x5... alias region work for per-CPU devices
hw/arm/musca: Wire up PL011 UARTs
hw/arm/musca: Wire up PL031 RTC
hw/arm/musca: Add MPCs
hw/arm/musca: Add PPCs
hw/arm/musca.c: Implement models of the Musca-A and -B1 boards
hw/arm/armsse: Allow boards to specify init-svtor
hw/arm/armsse: Document SRAM_ADDR_WIDTH property in header comment
hw/char/pl011: Use '0x' prefix when logging hex numbers
hw/char/pl011: Support all interrupt lines
hw/char/pl011: Allow use as an embedded-struct device
hw/timer/pl031: Convert to using trace events
hw/timer/pl031: Allow use as an embedded-struct device
hw/misc/tz-ppc: Support having unused ports in the middle of the range
target/arm: Implement ARMv8.3-JSConv
target/arm: Rearrange Floating-point data-processing (2 regs)
target/arm: Split out vfp_helper.c
target/arm: Restructure disas_fp_int_conv
target/arm: Stop unintentional sign extension in pmu_init
target/arm: v8M MPU should use background region as default, not always
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The machine comes with a 256M memory module by default, but it's
upgradable, so it could have a different memory size. There was a TODO
comment to replace static SPD EEPROM data with dynamically generated
one to support this. Now that we have a function for that, it's easy
to do. Although this would allow larger RAM sizes, the peculiar memory
map of the machine may need some special handling to map it as low and
high memory. Because I don't know what the correct place would be for
highmem, I've left the memory size fixed at 256M for now and the TODO is
moved there instead.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
According to both the specifications on linux-mips.org referenced in a
comment at the beginning of the file and the flash chip part number, the
bios size should be 512k, not 1M.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Stop using system memory as PCI memory otherwise devices such as VGA
that have regions mapped to PCI memory clash with RAM. Use a separate
memory region for PCI memory and map it to the correct address in
system memory which allows PCI mem regions to show at the correct
address where clients expect them.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Change directory name 'bit-counting' to 'bit-count'. This is purely for
cosmetics and consistency. This was the only subdirectory in the MSA
test directory whose name ends in 'ing'.
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Correct path to headers in tests/tcg/mips/user/ase/msa/bit-counting/*
source files.
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Fix a 32/64-bit issue in a line involving a shift operator. The "1 << ..."
calculation of the size is done as a 32-bit signed integer, which may
then be unintentionally sign-extended into the 64-bit result. The
problem was discovered by Coverity (CID 1398648). Using "1ULL"
instead of "1" on the LHS of the shift fixes this problem.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Especially when dealing with out-of-line gvec helpers, it is often
helpful to specify some vector pointers as constant. E.g. when
we have two inputs and one output, marking the two inputs as const
pointers helps to avoid bugs.
Const pointers can be specified via "cptr"; however, they behave in TCG
just like ordinary pointers. We can specify helpers like:
    DEF_HELPER_FLAGS_4(gvec_vbperm, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
    void HELPER(gvec_vbperm)(void *v1, const void *v2, const void *v3,
                             uint32_t desc)
And make sure that here, only v1 will be written (as long as const is
not cast away, of course).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190221093459.22547-1-david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The last update to this file was 9 years ago. In the meantime,
4 of the 6 ideas have actually been completed. The last two do
not really make sense anymore.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The region 0x40010000 .. 0x4001ffff and its secure-only alias
at 0x50010000... are for per-CPU devices. We implement this by
giving each CPU its own container memory region, where the
per-CPU devices live. Unfortunately, the alias region which
makes devices mapped at 0x4... addresses also appear at 0x5...
is only implemented in the overall "all CPUs" container. The
effect of this bug is that the CPU_IDENTITY register block appears
only at 0x4001f000, but not at the 0x5001f000 alias where it should
also appear. Guests (like very recent Arm Trusted Firmware-M)
which try to access it at 0x5001f000 will crash.
Fix this by moving the handling for this alias from the "all CPUs"
container to the per-CPU container. (We leave the aliases for
0x1... and 0x3... in the overall container, because there are
no per-CPU devices there.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190215180500.6906-1-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
The Musca board puts its SRAM and flash behind TrustZone
Memory Protection Controllers (MPCs). Each MPC sits between
the CPU and the RAM/flash, and also has a set of memory mapped
control registers. Wire up the MPCs, and the memory behind them.
For the moment we implement the flash as simple ROM, which
cannot be reprogrammed by the guest.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Many of the devices on the Musca board live behind TrustZone
Peripheral Protection Controllers (PPCs); add models of the
PPCs, using a similar scheme to the MPS2 board models.
This commit wires up the PPCs with "unimplemented device"
stubs behind them in the correct places in the address map.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The Musca-A and Musca-B1 development boards are based on the
SSE-200 subsystem for embedded. Implement an initial skeleton
model of these boards, which are similar but not identical.
This commit creates the board model with the SSE and the IRQ
splitters to wire IRQs up to its two CPUs. As yet there
are no devices and no memory: these will be added later.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The Musca boards have DAPLink firmware that sets the initial
secure VTOR value (the location of the vector table) differently
depending on the boot mode (from flash, from RAM, etc). Export
the init-svtor as a QOM property of the ARMSSE object so that
the board can change it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
In commit 4b635cf7a9 we added a QOM property to the ARMSSE
object, but forgot to add it to the documentation comment in the
header. Correct the omission.
Fixes: 4b635cf7a9 ("hw/arm/armsse: Make SRAM bank size configurable")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The pl011 logs when the guest makes a bad access. It prints
the address offset in hex but confusingly omits the '0x'
prefix; add it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The PL011 UART has six interrupt lines:
* RX (receive data)
* TX (transmit data)
* RT (receive timeout)
* MS (modem status)
* E (errors)
* combined (logical OR of all the above)
So far we have only emulated the combined interrupt line;
add support for the others, so that boards that wire them
up to different interrupt controller inputs can do so.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Create a new include file for the pl011's device struct,
type macros, etc, so that it can be instantiated using
the "embedded struct" coding style.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the debug printing in the PL031 device to use trace events,
and augment it to cover the interesting parts of device operation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Create a new include file for the pl031's device struct,
type macros, etc, so that it can be instantiated using
the "embedded struct" coding style.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The Peripheral Protection Controller's handling of unused ports
is that if there is nothing connected to the port's downstream
then it does not create the sysbus MMIO region for the upstream
end of the port. This results in odd behaviour when there is
an unused port in the middle of the range: since sysbus MMIO
regions are implicitly consecutively allocated, any used ports
above the unused ones end up with sysbus MMIO region numbers
that don't match the port number.
Avoid this numbering mismatch by creating dummy MMIO regions
for the unused ports. This doesn't change anything for our
existing boards, which don't have any gaps in the middle of
the port ranges they use; but it will be needed for the Musca
board.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
There are lots of special cases within these insns. Split the
major argument decode/loading/saving into no_output (compares),
rd_is_dp, and rm_is_dp.
We still need to special case argument load for compare (rd as
input, rm as zero) and vcvt fixed (rd as input+output), but lots
of special cases do disappear.
Now that we have a full switch at the beginning, hoist the ISA
checks from the code generation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190215192302.27855-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The "background region" for a v8M MPU is a default which will be used
(if enabled, and if the access is privileged) if the access does
not match any specific MPU region. We were incorrectly using it
always (by putting the condition at the wrong nesting level). This
meant that we would always return the default background permissions
rather than the correct permissions for a specific region, and also
that we would not return the right information in response to a
TT instruction.
Move the check for the background region to the same place in the
logic as the equivalent v8M MPUCheck() pseudocode puts it.
This in turn means we must adjust the condition we use to detect
matches in multiple regions to avoid false-positives.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190214113408.10214-1-peter.maydell@linaro.org
Coverity points out (CID 1398632, CID 1398650) that we
leak a couple of allocated strings in the error-exit
code path for setting up the MHUs in the ARMSSE.
Fix this bug by moving the allocate-and-free of each
string to be closer to the use, so we do the free before
doing the error-exit check.
Fixes: f8574705f6 ("hw/arm/armsse: Add unimplemented-device stubs for MHUs")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190215113707.24553-1-peter.maydell@linaro.org
Some versions of HP-UX 10.20 seem to rely on the fact that DINO
strips out the lower 2 bits of the PCI configuration address.
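In isolation, that address handling is just a low-bit mask (a hedged sketch,
not the actual hw/hppa/dino.c code):
    #include <stdint.h>

    static uint32_t example_dino_config_addr(uint32_t addr)
    {
        return addr & ~3u;   /* strip the lower 2 bits: force 4-byte alignment */
    }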
Also update the distributed SeaBIOS binary to the latest version
from Helge's repository, which is required with that change.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190218183314.20157-1-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
spice_server_vm_start/stop() was added to help migration state (commit
f5bb039c6d).
However, a paused VM could keep the spice server running. This allows a
Spice client to keep sending commands to a spice chardev, and makes it
possible to stop/cont a VM from a Spice monitor port. Character
devices (vdagent/usb/smartcard/..) should not read from Spice when the
VM is paused.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Victor Toso <victortoso@redhat.com>
Message-id: 20190221110703.5775-6-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Passing several -spice options on the qemu command line, or calling
qemu_opts_set() several times, will ignore all but the first option
list. Since the spice server is a singleton, it makes sense to merge
all the options, with the last value being the one taken into account.
This changes the behaviour, for example, from:
$ qemu... -spice port=5900 -spice port=5901 -> port: 5900
to:
$ qemu... -spice port=5900 -spice port=5901 -> port: 5901
(if necessary we could instead produce an error when an option is
given twice, although this makes handling default values and such more
complicated)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Victor Toso <victortoso@redhat.com>
Message-id: 20190221110703.5775-5-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The Spice server doesn't like to be started or stopped twice. It
aborts with:
(process:6191): Spice-ERROR **: 19:29:35.912: red-worker.c:623:handle_dev_start: assertion `!worker->running' failed
It's easy to avoid that situation, since QEMU's spice_display_is_running
tracks the server state.
After the commit "spice: do not stop spice if VM is paused", it will
be possible to pause and resume the VM, and this will call
qemu_spice_display_start() twice. The easiest fix is to add a check of
spice_display_is_running in this patch to avoid the assert.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Victor Toso <victortoso@redhat.com>
Message-id: 20190221110703.5775-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Most chardev backends handle write() as discarded data if the underlying
system is disconnected. For unknown historical reasons, the Spice
backend has "reliable" write: it will wait until the client end is
reconnected before doing further successful write()s.
To decide whether it makes sense to wait until the client is
reconnected (or queue the writes), let's review Spice chardev usage
and the handling of a disconnected client:
* spice vdagent
The agents reopen the virtio port on disconnect. On the qemu side,
virtio_serial_close() will also discard pending data.
* usb redirection
A disconnect creates a device disconnection.
* smartcard emulation
Data is discarded in passthru_apdu_from_guest().
(Spice doesn't explicitly open the smartcard char device until
upcoming 0.14.2, commit 69a5cfc74131ec0459f2eb5a231139f5a69a8037)
* spice webdavd
The daemon will restart the service, and reopen the virtio port.
* spice ports (serial console, qemu monitor..)
Depends on the associated device or usage.
- serial, may be throttled or discarded on write, depending on
device
- QMP/HMP monitor have some CLOSED event handling, but want to
flush the write, which will finish when a new client connects.
On disconnect/reconnect, the client starts with fresh sessions. If it
is a seamless migration, the client disconnects after the source
migrated. The handling of source disconnect in qemu is thus irrelevant
for the Spice session migration.
For all these use cases, it is better to discard writes when the
client is disconnected, and require the vm-side device/agent to behave
correctly on CHR_EVENT_CLOSED, to stop reading from and writing to
the spice chardev.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Victor Toso <victortoso@redhat.com>
Message-id: 20190221110703.5775-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The license information in this file is very messy. A short note at
the beginning says GPL first, but the long boilerplate code then
talks about "GNU Lesser General Public License version 2.0". First,
there is no such version of the "GNU Lesser GPL", it only started with
version 2.1. In version 2.0, it was still called "GNU Library GPL"
instead. Second, you can easily get the license of this file wrong
if you only quickly glance at the long boilerplate code.
Anyway, looking at the text of the LGPL (see COPYING.LIB in the top
directory), the license clearly states in section "3." that in such a
case of a mixture one should rather replace the license information
with the GPL information. Thus let's clean up the confusing
statements and use the proper GPL text only.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 1550731902-28842-1-git-send-email-thuth@redhat.com
[ kraxel: s/v2/v2+/ as requested by Daniel ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Calls the new SPICE QXL interface function spice_qxl_set_device_info to
set the hardware address of the graphics device represented by the QXL
interface (e.g. a PCI path) and the device display IDs (the IDs of the
device's monitors that belong to this QXL interface).
Also stops using the deprecated spice_qxl_set_max_monitors; the new
interface function replaces it.
Signed-off-by: Lukáš Hrázký <lhrazky@redhat.com>
Message-Id: <20190215150919.8263-1-lhrazky@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In musb_packet(), the call to usb_find_device() can return NULL
if it doesn't find a device matching 'addr', so explicitly check
the return value before passing it to usb_ep_get(). This then
allows the subsequent calculation of 'id' to be streamlined.
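The defensive pattern, reduced to a standalone sketch (device lookup and
types simplified, names made up):
    #include <stddef.h>

    typedef struct { int addr; } ExampleDev;

    /* Returns NULL when no device on the bus matches @addr. */
    static ExampleDev *example_find_device(ExampleDev *devs, int n, int addr)
    {
        for (int i = 0; i < n; i++) {
            if (devs[i].addr == addr) {
                return &devs[i];
            }
        }
        return NULL;
    }

    static int example_packet(ExampleDev *devs, int n, int addr)
    {
        ExampleDev *dev = example_find_device(devs, n, addr);

        if (dev == NULL) {
            return -1;          /* bail out before any dereference */
        }
        return dev->addr;
    }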
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Message-id: 1549460216-25808-8-git-send-email-liam.merwick@oracle.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Most callers of xhci_port_update() and xhci_wakeup() pass in a pointer
to an array entry which can never be NULL, but add two defensive asserts
to protect against future changes (e.g. adding a new port speed, etc.)
that add a path through xhci_lookup_port() which could result in the
return of a NULL XHCIPort.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Message-id: 1549460216-25808-3-git-send-email-liam.merwick@oracle.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When bitmaps are persistent, they may incur a disk read or write when bitmaps
are added or removed. For configurations like virtio-dataplane, failing to
acquire this lock will abort QEMU when disk IO occurs.
We used to acquire the aio_context lock as part of the bitmap lookup, so
re-introduce the lock for just the cases that have an IO penalty. Commit
2119882c removed
these locks, and I failed to notice this when we committed fd5ae4cc, so this
has been broken since persistent bitmaps were introduced.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1672010
Reported-By: Aihua Liang <aliang@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20190218233154.19303-1-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Since qemu currently doesn't flush persistent bitmaps to disk until
shutdown (which might be MUCH later), it's useful if 'query-block'
at least shows WHICH bitmaps will (eventually) make it to persistent
storage. Update affected iotests.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190204210512.27458-1-eblake@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
ppc patch queue 2019-02-19
Here's the next batch of ppc and spapr patches. Highlights are:
* A bunch of improvements to TCG handling of vector instructions from
Richard Henderson and Marc Cave-Ayland
* Cleanup to the XICS interrupt controller from Greg Kurz, removing
the special KVM subclasses which were a bad idea
* Some refinements to the XIVE interrupt controller from Cédric Le
Goater
* Fix from Fabiano Rosas for a really dumb buffer overflow in the
device tree code for memory hotplug
* Code for allowing access to SPRs from the gdb stub from Fabiano
Rosas
* Assorted minor fixes and cleanups
# gpg: Signature made Mon 18 Feb 2019 13:47:54 GMT
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.0-20190219: (43 commits)
target/ppc: convert vmin* and vmax* to vector operations
target/ppc: convert vadd*s and vsub*s to vector operations
target/ppc: Split out VSCR_SAT to a vector field
target/ppc: Add set_vscr_sat
target/ppc: Use mtvscr/mfvscr for vmstate
target/ppc: Add helper_mfvscr
target/ppc: Remove vscr_nj and vscr_sat
target/ppc: Use helper_mtvscr for reset and gdb
target/ppc: Pass integer to helper_mtvscr
target/ppc: convert xxsel to vector operations
target/ppc: convert xxspltw to vector operations
target/ppc: convert xxspltib to vector operations
target/ppc: convert VSX logical operations to vector operations
target/ppc: convert vsplt[bhw] to use vector operations
target/ppc: convert vspltis[bhw] to use vector operations
target/ppc: convert vaddu[b,h,w,d] and vsubu[b,h,w,d] over to use vector operations
target/ppc: convert VMX logical instructions to use vector operations
xics: Drop the KVM ICS class
spapr/irq: Use the "simple" ICS class for KVM
xics: Handle KVM interrupt presentation from "simple" ICS code
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QAPI patches for 2019-02-18
# gpg: Signature made Mon 18 Feb 2019 13:44:30 GMT
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2019-02-18:
qapi: move RTC_CHANGE to the target schema
qmp: Deprecate query-events in favor of query-qmp-schema
Revert "qapi-events: add 'if' condition to implicit event enum"
qapi: remove qmp_unregister_command()
qapi: make query-cpu-definitions depend on specific targets
qapi: make query-cpu-model-expansion depend on s390 or x86
qapi: make query-gic-capabilities depend on TARGET_ARM
target.json: add a note about query-cpu* not being s390x-specific
qapi: make s390 commands depend on TARGET_S390X
qapi: make rtc-reset-reinjection and SEV depend on TARGET_I386
qapi: New module target.json
build: Deal with all of QAPI's .o in qapi/Makefile.objs
build-sys: move qmp-introspect per target
qapi: Generate QAPIEvent stuff into separate files
qapi: Prepare for system modules other than 'builtin'
qapi: Clean up modular built-in code generation a bit
qapi: Fix up documentation for recent commit a95291007b
qapi: Belatedly document modular code generation
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A few targets don't emit RTC_CHANGE, so we could restrict the event to
the targets that do emit it.
Note: There are many more events & commands that we could restrict
to capable targets, at the cost of some additional complexity, but
with the benefit of added correctness and better introspection.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190214152251.2073-19-armbru@redhat.com>
query-events doesn't reflect compile-time configuration. Instead of
fixing that, deprecate the command in favor of query-qmp-schema.
Libvirt prefers query-qmp-schema as of commit 22d7222ec0 "qemu: caps:
Don't call 'query-events' when we probe events from QMP schema".
It'll be in the next release.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-18-armbru@redhat.com>
This reverts commit 7bd2634905.
The commit applied the events' conditions to the members of enum
QAPIEvent. Awkward, because it renders QAPIEvent unusable in
target-independent code as soon as we make an event target-dependent.
Reverting this has the following effects:
* ui/vnc.c can remain target independent.
* monitor_qapi_event_conf[] doesn't have to muck around with #ifdef.
* query-events again doesn't reflect conditionals. I'm going to
deprecate it in favor of query-qmp-schema.
Another option would be to split target-dependent parts off enum
QAPIEvent into a target-dependent enum. Doesn't seem worthwhile right
now.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-17-armbru@redhat.com>
We can't add appropriate target-specific conditionals to misc.json,
because that would make all of misc.json unusable in
target-independent code. To keep misc.json target-independent, we
need to split off target-dependent target.json.
This commit doesn't actually split off anything, it merely creates the
empty module. The next few patches will move stuff from misc.json
there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-9-armbru@redhat.com>
Adding QAPI's .o to util-obj-y, common-obj-y and obj-y is spread over
three places: Makefile.objs takes care of target-independent generated
code, Makefile.target of target-dependent generated code, and
qapi/Makefile.objs of (target-independent) hand-written code.
Do everything in qapi/Makefile.objs.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-8-armbru@redhat.com>
The following patches are going to introduce per-target #ifdef in the
schemas.
The introspection data is statically generated once, and must thus be
built per-target to reflect target-specific configuration.
Drop "do_test_visitor_in_qmp_introspect(&qmp_schema_qlit)" since the
schema is no longer in a common object. It is covered by the per-target
query-qmp-schema test instead.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190214152251.2073-7-armbru@redhat.com>
Having to include qapi-events.h just for QAPIEvent is suboptimal, but
quite tolerable now. It'll become problematic when we have events
conditional on the target, because then qapi-events.h won't be usable
from target-independent code anymore. Avoid that by generating it
into separate files.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-6-armbru@redhat.com>
The next commit wants to generate qapi-emit-events.{c,h}. To enable
that, extend QAPISchemaModularCVisitor to support additional "system
modules", i.e. modules that don't correspond to a (user-defined) QAPI
schema module.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-5-armbru@redhat.com>
We neglect to call .visit_module() for the special module we use for
built-ins. Harmless, but clean it up anyway. The
tests/qapi-schema/*.out now show the built-in module as 'module None'.
Subclasses of QAPISchemaModularCVisitor need to ._add_module() this
special module to enable code generation for built-ins. When this
hasn't been done, QAPISchemaModularCVisitor.visit_module() does
nothing for the special module. That looks like built-ins could
accidentally be generated into the wrong module when a subclass
neglects to call ._add_module(). Can't happen, because built-ins are
all visited before any other module. But that's non-obvious. Switch
off code generation explicitly.
Rename QAPISchemaModularCVisitor._begin_module() to
._begin_user_module().
New QAPISchemaModularCVisitor._is_builtin_module(), for clarity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190214152251.2073-4-armbru@redhat.com>
s390x updates:
- tcg: implement STCK and friends for CONFIG_USER_ONLY
- add zpci to qemu cpu model, as pci is now always built
- add mepoch to default z14 cpu model
- add cpu model for z14 GA2
- various improvements
# gpg: Signature made Mon 18 Feb 2019 11:06:23 GMT
# gpg: using RSA key C3D0D66DC3624FF6A8C018CEDECF6B93C6F02FAF
# gpg: issuer "cohuck@redhat.com"
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" [unknown]
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cohuck@kernel.org>" [unknown]
# gpg: aka "Cornelia Huck <cohuck@redhat.com>" [unknown]
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20190218:
s390x: upgrade status of KVM cores to "supported"
s390x/kvm: add tracepoint to ioeventfd interface
s390x/cpumodel: add z14 GA2 model
s390x/cpumodel: default enable mepoch for z14 and later
s390x/cpumodel: mepochptff: warn when no mepoch and re-align group init
s390x: add zPCI feature to "qemu" CPU model
target/s390x: Implement STCK et al for CONFIG_USER_ONLY
target/s390x: Split out s390-tod.h
s390x: always provide pci support
s390x: Fix the confusing contributions-after-2012 license statements
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Latest systems and host kernels support mepoch, which is a
feature that was meant to be supported for z14 GA1 from the
get-go. Let's copy it to the z14 GA1 default CPU model.
Machines s390-ccw-virtio-3.1 and older will retain the old CPU
models and will not provide this bit nor the extended PTFF
functions in the default model.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Message-Id: <20190212011657.18324-2-walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The extended PTFF features (qsie, qtoue, stoe, stoue) are dependent
on the multiple-epoch facility (mepoch). Let's print a warning if these
features are enabled without mepoch.
While we're at it, let's move the FEAT_GROUP_INIT for mepochptff down
the s390_feature_groups list so it can be properly indexed with its
generated S390FeatGroup enum.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Message-Id: <20190212011657.18324-1-walling@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We tried to make pci support optional on s390x in the past;
unfortunately, we still require the s390 phb to be created
unconditionally due to backwards compatibility issues.
Instead of sinking more effort into this (including compat
handling for older machines etc.) for non-obvious gains, let's
just make CONFIG_PCI something that is always set on s390x.
Note that you can still fence off pci for the _guest_ if you
provide a cpu model without the zpci feature.
Message-Id: <20190211113255.3837-1-cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The license information in these files is rather confusing. The text
declares LGPL first, but then says that contributions after 2012 are
licensed under the GPL instead. How should the average user who just
downloaded the release tarball know which part is now GPL and which
is LGPL?
Looking at the text of the LGPL (see COPYING.LIB in the top directory),
the license clearly states how this should be done instead:
"3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License."
Thus let's clean up the confusing statements and use the proper GPL
text only.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1549456893-16589-1-git-send-email-thuth@redhat.com>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The KVM ICS reset handler simply writes the ICS state to KVM. This
doesn't need the overkill parent_reset logic we have today. Also
we want to use the same ICS type for the KVM and non-KVM case with
pseries.
Call ics_set_kvm_state() from the "simple" ICS reset function.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155023082407.1011724.1983100830860273401.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The realization of KVM ICP currently follows the parent_realize logic,
which is a bit overkill here. Also we want to get rid of the KVM ICP
class. Explicitly call icp_kvm_realize() from the base ICP realize
function.
Note that ICPStateClass::parent_realize is retained because powernv
needs it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155023080049.1011724.15423463482790260696.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The KVM ICP reset handler simply writes the ICP state to KVM. This
doesn't need the overkill parent_reset logic we have today. Call
icp_set_kvm_state() from the base ICP reset function instead.
Since there are no other users for ICPStateClass::parent_reset, and
it isn't currently expected to change, drop it as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155023079461.1011724.12644984391500635645.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that we have changed the XICS and the XIVE interrupt backends to
have different sizes for their IRQ number spaces, we do not need to
align their source numbers anymore. Remove the offset adjustment and
wire the dual 'qirq' handler to the 'qirq' handler of the current
interrupt mode in use.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190213210756.27032-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When using the 'dual' interrupt mode, the source numbers of both sPAPR
IRQ backends are aligned to share a common IRQ number space and to use
a similar mapping of the machine qemu_irq array which is indexed by
the source number.
With the XICS IRQ number range initially being [ 0x1000 - 0x2000 ], this
requires changing the XICS ICSState offset to 0 and provisioning for
an extra 4K of source numbers and qemu_irqs which will never be used
by the machine when running under the XICS interrupt mode. This is not
an optimal solution.
Change the init() method to allocate an IRQ number space of the
expected size for the XICS sPAPR IRQ backend. It breaks the interrupt
signaling when under the 'dual' mode because source numbers have
unexpected values but next patch will fix that.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190213210756.27032-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
buf_len is a uint8_t, which is not large enough to hold the result of:
nr_entries * sizeof(struct sPAPRDrconfCellV2) + sizeof(uint32_t);
for nr_entries greater than 10.
This causes the allocated buffer 'int_buf' to be smaller than expected
and we eventually overwrite some of glibc's control structures (see
"chunk" in https://sourceware.org/glibc/wiki/MallocInternals)
The following error is seen while trying to free int_buf:
"free(): invalid next size (fast)"
Fixes: a324d6f166 "spapr: Support ibm,dynamic-memory-v2 property"
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20190213172926.21740-1-farosas@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Certain devices types, like memory/CPU, are now being handled using a
hotplug interface provided by a top-level MachineClass. Hotpluggable
host bridges are another such device where it makes sense to use a
machine-level hotplug handler. However, unlike those devices,
host-bridges have a parent bus (the main system bus), and devices with
a parent bus use a different mechanism for registering their hotplug
handlers: qbus_set_hotplug_handler(). This interface currently expects
a handler to be a subclass of DeviceClass, but this is not the case
for MachineClass, which derives directly from ObjectClass.
Internally, the interface only requires an ObjectClass, so expose that
in qbus_set_hotplug_handler().
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <154999589921.690774.3640149277362188566.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
MSI is the default and LSI specific code is guarded by the
xive_source_irq_is_lsi() helper. The xive_source_irq_set()
helper is a nop for MSIs.
Simplify the code by turning xive_source_irq_set() into
xive_source_irq_set_lsi() and only call it for LSIs. The
call to xive_source_irq_set(false) in spapr_xive_irq_free()
is also a nop. Just drop it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <154999584656.690774.18352404495120358613.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PPC BRANCH exception could bubble up, but this is a QEMU-internal exception
and QEMU then crashed. Instead it should trigger a TRACE exception, according to
the PPC 2.07 book. It could only happen when using branch stepping, which is not
commonly used.
Change gen_prep_dbgex to trigger TRACE. The excp argument is now removed,
since the type of exception can be inferred from the singlestep_enabled flags.
Also remove the guards around gen_exception, since they are unnecessary.
Fixes: 0e3bf48909 ("ppc: add DBCR based debugging").
Signed-off-by: Roman Kapl <rka@sysgo.com>
Message-Id: <20190212121255.2279-1-rka@sysgo.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Split mode doesn't make sense on pseries, neither with XICS nor XIVE. But
passing kernel-irqchip=split silently behaves like kernel-irqchip=on.
Other architectures that support kernel-irqchip do terminate QEMU when
split mode is requested but not available though. Do the same with pseries
for consistency.
Similarly, passing kernel-irqchip=on,accel=tcg starts the machine with the
emulated interrupt controller, ie, behaves like kernel-irqchip=off. However,
when passing kernel-irqchip=on,accel=kvm, if we can't initialize the KVM
XICS for some reason, ie, xics_kvm_init() fails, then QEMU is terminated.
This is inconsistent. Terminate QEMU all the same when requesting the
in-kernel interrupt controller without KVM.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <154964986747.291716.2679312373018476920.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In order to handle a race condition in the MacOS 9 CUDA driver, a
delay was introduced when raising the VIA SR interrupt, inspired by
similar code in MacOnLinux.
During original testing of the MacOS 9 patches it was found that the
30us delay used in MacOnLinux did not work reliably within QEMU, and a
value of 300us was required to function correctly.
Recent experiments have shown two things: firstly, when booting Linux,
MacOS 9 and MacOS X, the fast path which bypasses the delay is never
triggered once the OS kernel is loaded, making it effectively
useless. Rather than leave this code in place where a guest could
potentially enable it by accident and break itself, we might as well
just remove it.
Secondly the previous reliability issues are no longer present, and
this value can be reduced down to 20us with no apparent ill
effects. This has the benefit of considerably improving the
responsiveness of the ADB keyboard and mouse within the guest.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that IRQ allocation has been split in two (first allocate IRQ numbers,
then claim them), if the claiming fails, we must release the IRQs.
Fixes: 4fe75a8ccd "spapr: split the IRQ allocation sequence"
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
According to the BookE docs, invalid bits (while undefined behaviour) should
not raise an exception but be ignored. This seems to be implementation
dependent though, and QEMU currently does what e500 CPUs do and raises an
exception for invalid bits. Unfortunately some versions of libstdc++
(and so all programs compiled with it) use lwsync on PPC440, which is
invalid there, but on real hardware it is just executed as msync ignoring the
invalid bits (maybe that's why it went undetected), so those programs fail
only on QEMU. This patch changes the invalid mask of msync to allow these
programs to run, but keeps generating the exception on e500 cores to follow
what the hardware does.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows reading and writing of SPRs via GDB:
(gdb) p/x $srr1
$1 = 0x8000000002803033
(gdb) p/x $pvr
$2 = 0x4b0201
(gdb) set $pvr=0x4b0000
(gdb) p/x $pvr
$3 = 0x4b0000
The `info` command can also be used:
(gdb) info registers spr
For this purpose, GDB needs to be provided with an XML description of
the registers (see the gdb-xml directory for examples) and a set of
callbacks for reading and writing the registers must be defined.
The XML file in this case is created dynamically, based on the SPRs
already defined in the machine. This way we avoid the need for several
XML files to suit each possible ppc machine.
The gdb_{get,set}_spr_reg callbacks take an index based on the order
the registers appear in the XML file. This index does not match the
actual location of the registers in the env->spr array so the
gdb_find_spr_idx function does that conversion.
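A minimal sketch of that conversion (the 'gdb_id' field name is
hypothetical; the actual SPR bookkeeping in CPUPPCState may differ):

static int gdb_find_spr_idx(CPUPPCState *env, int n)
{
    int i;

    for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
        /* 'gdb_id' stands for the position the SPR was given when the
         * dynamic XML description was generated */
        if (env->spr_cb[i].gdb_id == n) {
            return i;
        }
    }
    return -1;
}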
Note: GDB currently needs to know the guest endianness in order to
properly print the register values. This is done automatically by GDB
when provided with the ELF file or explicitly with the `set endian
<big|little>` command.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All this code is used with both the XICS and XIVE interrupt controllers.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In 47973a2dbf we split the last generic chipset out of the PC
board, but forgot to remove the include of "hw/i386/pc.h".
Since it is now unused, remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target-arm queue:
* gdbstub: Send a reply to the vKill packet
* Improve codegen for neon min/max and saturating arithmetic
* Fix a bug in clearing FPSCR exception status bits
* hw/arm/armsse: Fix miswiring of expansion IRQs
* hw/intc/armv7m_nvic: Allow byte accesses to SHPR1
* MAINTAINERS: Remove Peter Crosthwaite from various entries
* arm: Allow system registers for KVM guests to be changed by QEMU code
* linux-user: support HWCAP_CPUID which exposes ID registers to user code
* Fix bug in 128-bit cmpxchg for BE Arm guests
* Implement (no-op) HACR_EL2
* Fix CRn to be 14 for PMEVTYPER/PMEVCNTR
# gpg: Signature made Fri 15 Feb 2019 10:19:14 GMT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20190215: (25 commits)
gdbstub: Send a reply to the vKill packet.
target/arm: Add missing clear_tail calls
target/arm: Use vector operations for saturation
target/arm: Split out FPSCR.QC to a vector field
target/arm: Fix set of bits kept in xregs[ARM_VFP_FPSCR]
target/arm: Split out flags setting from vfp compares
target/arm: Fix arm_cpu_dump_state vs FPSCR
target/arm: Fix vfp_gdb_get/set_reg vs FPSCR
target/arm: Remove neon min/max helpers
target/arm: Use tcg integer min/max primitives for neon
target/arm: Use vector minmax expanders for aarch32
target/arm: Use vector minmax expanders for aarch64
target/arm: Rely on optimization within tcg_gen_gvec_or
hw/arm/armsse: Fix miswiring of expansion IRQs
hw/intc/armv7m_nvic: Allow byte accesses to SHPR1
MAINTAINERS: Remove Peter Crosthwaite from various entries
arm: Allow system registers for KVM guests to be changed by QEMU code
linux-user/elfload: enable HWCAP_CPUID for AArch64
target/arm: expose remaining CPUID registers as RAZ
target/arm: expose MPIDR_EL1 to userspace
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Given that we mask bits properly on set, there is no reason
to mask them again on get. We failed to clear the exception
status bits, 0x9f, which means that the wrong value would be
returned on get. Except in the (probably normal) case in which
the set clears all of the bits.
Simplify the code in set to also clear the RES0 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190209033847.9014-10-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit 91c1e9fcbd where we added dual-CPU support to
the ARMSSE, we set up the wiring of the expansion IRQs via nested
loops: the outer loop on 'i' loops for each CPU, and the inner loop
on 'j' loops for each interrupt. Fix a typo which meant we were
wiring every expansion IRQ line to external IRQ 0 on CPU 0 and
to external IRQ 1 on CPU 1.
Fixes: 91c1e9fcbd ("hw/arm/armsse: Support dual-CPU configuration")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The code for handling the NVIC SHPR1 register intends to permit
byte and halfword accesses (as the architecture requires). However
the 'case' line for it only lists the base address of the
register, so attempts to access bytes other than the first one
end up in the "bad write" default logic. This bug was added
accidentally when we split out the SHPR1 logic from SHPR2 and
SHPR3 to support v6M.
Fixes: 7c9140afd5 ("nvic: Handle ARMv6-M SCS reserved registers")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
The Zephyr RTOS happens to access SHPR1 byte at a time,
which is how I spotted this.
Peter Crosthwaite hasn't had the bandwidth to do code review or
other QEMU work for some time now -- remove his email address
from MAINTAINERS file entries so we don't bombard him with
patch emails.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190207181422.4907-1-peter.maydell@linaro.org
At the moment the Arm implementations of kvm_arch_{get,put}_registers()
don't support having QEMU change the values of system registers
(aka coprocessor registers for AArch32). This is because although
kvm_arch_get_registers() calls write_list_to_cpustate() to
update the CPU state struct fields (so QEMU code can read the
values in the usual way), kvm_arch_put_registers() does not
call write_cpustate_to_list(), meaning that any changes to
the CPU state struct fields will not be passed back to KVM.
The rationale for this design is documented in a comment in the
AArch32 kvm_arch_put_registers() -- writing the values in the
cpregs list into the CPU state struct is "lossy" because the
write of a register might not succeed, and so if we blindly
copy the CPU state values back again we will incorrectly
change register values for the guest. The assumption was that
no QEMU code would need to write to the registers.
However, when we implemented debug support for KVM guests, we
broke that assumption: the code to handle "set the guest up
to take a breakpoint exception" does so by updating various
guest registers including ESR_EL1.
Support this by making kvm_arch_put_registers() synchronize
CPU state back into the list. We sync only those registers
where the initial write succeeds, which should be sufficient.
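A sketch of the resulting flow in kvm_arch_put_registers() (hedged: the
exact placement and the boolean 'kvm_sync' argument are assumptions made
for the example):

/* copy QEMU-side changes into the cpreg list, keeping only registers
 * whose write-back is known to succeed */
write_cpustate_to_list(cpu, true);

if (!write_list_to_kvmstate(cpu, level)) {
    return -EINVAL;
}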
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Dongjiu Geng <gengdongjiu@huawei.com>
There are a whole bunch more registers in the CPUID space which are
currently not used but are exposed as RAZ. To avoid too much
duplication we expand ARMCPRegUserSpaceInfo to understand glob
patterns so we only need one entry to tweak whole ranges of registers.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190205190224.2198-5-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A number of CPUID registers are exposed to userspace by modern Linux
kernels thanks to the "ARM64 CPU Feature Registers" ABI. For QEMU's
user-mode emulation we don't need to emulate the kernel's trap but just
return the value the trap would have returned. To avoid too much #ifdef
hackery we process ARMCPRegInfo with a new helper (modify_arm_cp_regs)
before defining the registers. The modify routine is driven by a
simple data structure which describes which bits are exported and
which are fixed.
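The data-driven table might look roughly like this (a sketch only; the
struct and field names and the mask values are illustrative, not
authoritative):

static const ARMCPRegUserSpaceInfo user_idregs[] = {
    /* export only selected fields of one register... */
    { .name = "ID_AA64PFR0_EL1", .exported_bits = 0x000f000f00ff0000ULL },
    /* ...and make everything matching a glob pattern fully RAZ */
    { .name = "ID_AA64*_EL1_RESERVED", .is_glob = true },
};

modify_arm_cp_regs(v8_idregs, user_idregs);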
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190205190224.2198-3-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Although technically not visible to userspace, the kernel does make
them visible via a trap-and-emulate ABI. We provide a new permission
mask (PL0U_R) which maps to PL0_R for CONFIG_USER builds and adjust
the minimum permission check accordingly.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190205190224.2198-2-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The lo,hi order is different from what the comments say, and commit
1ec182c333 ("target/arm: Convert to HAVE_CMPXCHG128") changed the
original code logic. Restore the old code logic from before that
commit:
do_paired_cmpxchg64_be():
cmpv = int128_make128(env->exclusive_high, env->exclusive_val);
newv = int128_make128(new_hi, new_lo);
This fixes a bug that would only be visible for big-endian
AArch64 guest code.
Fixes: 1ec182c333 ("target/arm: Convert to HAVE_CMPXCHG128")
Signed-off-by: Catherine Ho <catherine.hecx@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1548985244-24523-1-git-send-email-catherine.hecx@gmail.com
[PMM: added note that bug only affects BE guests]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
HACR_EL2 is a register with IMPDEF behaviour, which allows
implementation specific trapping to EL2. Implement it as RAZ/WI,
since QEMU's implementation has no extra traps. This also
matches what h/w implementations like Cortex-A53 and A57 do.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190205181218.8995-1-peter.maydell@linaro.org
MIPS queue for February 14th, 2019
# gpg: Signature made Thu 14 Feb 2019 16:48:39 GMT
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-feb-14-2019:
tests/tcg: target/mips: Add tests for MSA logic instructions
tests/tcg: target/mips: Add wrappers for MSA logic instructions
tests/tcg: target/mips: Add tests for MSA interleave instructions
tests/tcg: target/mips: Add wrappers for MSA interleave instructions
tests/tcg: target/mips: Add tests for MSA bit counting instructions
tests/tcg: target/mips: Add wrappers for MSA bit counting instructions
tests/tcg: target/mips: Add a header with test utilities
tests/tcg: target/mips: Add a header with test inputs
tests/tcg: target/mips: Remove an unnecessary file
target/mips: introduce MTTCG-enabled builds
hw/mips_cpc: kick a VP when putting it into Run state
target/mips: hold BQL in mips_vpe_wake()
hw/mips_int: hold BQL for all interrupt requests
target/mips: reimplement SC instruction emulation and use cmpxchg
target/mips: compare virtual addresses in LL/SC sequence
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add tests for MSA logic instructions. This includes following
instructions:
* AND.V - logical AND
* NOR.V - logical NOR
* OR.V - logical OR
* XOR.V - logical XOR
Each test consists of 80 test cases, so altogether there are 320
test cases.
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Add tests for MSA interleave instructions. This includes following
instructions:
* ILVEV.B - interleave even (bytes)
* ILVEV.H - interleave even (halfwords)
* ILVEV.W - interleave even (words)
* ILVEV.D - interleave even (doublewords)
* ILVOD.B - interleave odd (bytes)
* ILVOD.H - interleave odd (halfwords)
* ILVOD.W - interleave odd (words)
* ILVOD.D - interleave odd (doublewords)
* ILVL.B - interleave left (bytes)
* ILVL.H - interleave left (halfwords)
* ILVL.W - interleave left (words)
* ILVL.D - interleave left (doublewords)
* ILVR.B - interleave right (bytes)
* ILVR.H - interleave right (halfwords)
* ILVR.W - interleave right (words)
* ILVR.D - interleave right (doublewords)
Each test consists of 80 test cases, so altogether there are 1280
test cases.
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Add tests for MSA bit counting instructions. This includes following
instructions:
* NLOC.B - number of leading ones (bytes)
* NLOC.H - number of leading ones (halfwords)
* NLOC.W - number of leading ones (words)
* NLOC.D - number of leading ones (doublewords)
* NLZC.B - number of leading zeros (bytes)
* NLZC.H - number of leading zeros (halfwords)
* NLZC.W - number of leading zeros (words)
* NLZC.D - number of leading zeros (doublewords)
* PCNT.B - population count / number of ones (bytes)
* PCNT.H - population count / number of ones (halfwords)
* PCNT.W - population count / number of ones (words)
* PCNT.D - population count / number of ones (doublewords)
Each test consists of 80 test cases, so altogether there are 960 test
cases.
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Add a header that contains wrappers around MSA instructions assembler
invocations. For now, only bit counting instructions (NLOC, NLZC, and
PCNT; each in four data format flavors) are supported.
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Add a header that contains test utilities. For now, it contains
only a function for checking and printing test results for bit
counting and similar MSA instructions.
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
The file tests/tcg/mips/include/test_inputs.h is planned to
contain various test inputs. For now, it contains 64 128-bit
pattern inputs (alternating groups of ones and zeroes) and
16 128-bit random inputs.
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
While testing MTTCG, VP0 could get stuck in a loop waiting for other
VPs to come up (which never actually happens). To fix this, kick VPs
while they are being powered up by the Cluster Power Controller, in an
async task which is triggered once the host thread is spawned.
Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Hold BQL whenever mips_vpe_wake() is invoked.
Without this patch, MIPS MT with MTTCG enabled triggers an abort in
tcg_handle_interrupt() due to an unlocked access to cpu_interrupt().
This patch makes sure that the BQL is held in this case.
Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Make sure BQL is held for all interrupt requests.
For MTTCG-enabled configurations, handling soft and hard interrupts
between vCPUs must be properly locked. By acquiring BQL, make sure
all paths triggering an IRQ are synchronized.
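A sketch of the locking pattern (the interrupt-raising call and the
variable names are illustrative):

bool locked = false;

/* take the BQL unless the caller already holds it */
if (!qemu_mutex_iothread_locked()) {
    qemu_mutex_lock_iothread();
    locked = true;
}

qemu_irq_raise(irq);   /* now serialized against other vCPU threads */

if (locked) {
    qemu_mutex_unlock_iothread();
}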
Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Compare only virtual addresses in LL/SC sequence emulation.
Until this patch, physical addresses had been compared in the SC part of
an LL/SC sequence, even though such comparisons could be avoided. Getting
rid of them allows throwing away the SC helpers and having a common SC
implementation in user and system mode, avoiding the need for two
separate implementations selected by #ifdef CONFIG_USER_ONLY.
Correct guest software should not rely on LL/SC if it accesses the
same physical address via different virtual addresses, or if the page
mapping gets changed between LL and SC by manipulating TLB entries.
The MIPS Instruction Set Manual clearly says that an RMW sequence must
use the same address in the LL and SC (virtual address, physical
address, cacheability and coherency attributes must be identical).
Otherwise, the result of the SC is not predictable. This patch takes
advantage of this fact and removes the virtual->physical address
translation from the SC helper.
lladdr served as the Coprocessor 0 LLAddr register which captures the
physical address of the most recent LL instruction, and lladdr was also
used for the comparison with the following SC physical address. This patch
changes the meaning of lladdr - now it only keeps the virtual address of
the most recent LL. Additionally, a CP0_LLAddr field is introduced which
is the actual Coprocessor 0 LLAddr register that the guest can access.
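Conceptually, the SC emulation now behaves like the pseudocode below
(this is a sketch, not the actual TCG translation; the helper names are
illustrative):

/* SC(vaddr, newval): succeed only if vaddr matches the LL virtual address
 * and the word still holds the LL-time value */
if (vaddr != env->lladdr) {
    return 0;                              /* SC fails */
}
old = atomic_cmpxchg(guest_word, env->llval, newval);
return old == env->llval;                  /* 1 on success, 0 on failure */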
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Chardev fixes
# gpg: Signature made Wed 13 Feb 2019 16:18:36 GMT
# gpg: using RSA key DAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/chardev-pull-request: (25 commits)
char-pty: remove write_lock usage
char-pty: remove the check for connection on write
chardev: add a note about frontend sources and context switch
terminal3270: do not use backend timer sources
char: update the mux handlers in class callback
chardev/wctablet: Fix a typo
char: allow specifying a GMainContext at opening time
chardev: ensure termios is fully initialized
tests: expand coverage of socket chardev test
chardev: fix race with client connections in tcp_chr_wait_connected
chardev: disallow TLS/telnet/websocket with tcp_chr_wait_connected
chardev: honour the reconnect setting in tcp_chr_wait_connected
chardev: use a state machine for socket connection state
chardev: split up qmp_chardev_open_socket connection code
chardev: split tcp_chr_wait_connected into two methods
chardev: remove unused 'sioc' variable & cleanup paths
chardev: ensure qemu_chr_parse_compat reports missing driver error
chardev: remove many local variables in qemu_chr_parse_socket
chardev: forbid 'wait' option with client sockets
chardev: forbid 'reconnect' option with server sockets
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
RISC-V Patches for the 4.0 Soft Freeze, Part 1
This patch set contains a handful of patches I've collected over the
last few weeks. There's nothing really fundamental, but I thought it
would be good to send these out now as there are some other patch sets
on the mailing list that are getting ready to go.
As far as the actual patches, there's:
* A set that cleans up our FS dirty-mode handling.
* Support for writing MISA.
* The removal of Michael as a maintainer.
* A fix to {m,s}counteren handling.
* A fix to make sure the kernel's start address is computed correctly on
32-bit targets.
This makes my "RISC-V Patches for 3.2, Part 3" pull request defunct, as
it contains the same patches but based on a newer master. As usual,
I've tested this using a Fedora boot on the latest Linux. This patch
set does not include Bastian's decodetree patches because there were
some merge conflicts and while I've cleaned them up I want to get a
round of review first.
# gpg: Signature made Wed 13 Feb 2019 15:37:50 GMT
# gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
# gpg: issuer "palmer@dabbelt.com"
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>" [unknown]
# gpg: aka "Palmer Dabbelt <palmer@sifive.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
* remotes/palmer/tags/riscv-for-master-4.0-sf1:
riscv: Ensure the kernel start address is correctly cast
target/riscv: fix counter-enable checks in ctr()
MAINTAINERS: Remove Michael Clark as a RISC-V Maintainer
RISC-V: Add misa runtime write support
RISC-V: Add misa.MAFD checks to translate
RISC-V: Add misa to DisasContext
RISC-V: Add priv_ver to DisasContext
RISC-V: Use riscv prefix consistently on cpu helpers
RISC-V: Implement mstatus.TSR/TW/TVM
RISC-V: Mark mstatus.fs dirty
RISC-V: Split out mstatus_fs from tb_flags
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current check to test whether usbfs support should be compiled or not
solely relies on the presence of <linux/usbdevice_fs.h>, without
actually checking that all definitions used by QEMU are provided by
this header file.
With sufficiently old kernel headers, <linux/usbdevice_fs.h> may be
present, but some of the definitions needed by QEMU may not be
available.
This commit improves the check by building a small program that
actually tests whether the necessary definitions are available.
In addition, it fixes a bug where have_usbfs was set to "yes"
regardless of the result of the test.
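The probe is along these lines (which symbols the real check exercises
is not reproduced here; USBDEVFS_GET_CAPABILITIES is just one plausible
example):

#include <linux/usbdevice_fs.h>

#ifndef USBDEVFS_GET_CAPABILITIES
#error "kernel headers too old for QEMU's usbfs support"
#endif

int main(void)
{
    return 0;
}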
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190213211827.20300-1-thomas.petazzoni@bootlin.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The sun4uv_init() function expects vga_interface_type to be either
VGA_STD or VGA_NONE and sets up a stdvga device or no vga card
accordingly.
However, the code in vl.c prefers the Cirrus VGA card to stdvga if
it is available and the user and the machine did not specify anything
else.
So far this has not been a problem, since the Cirrus VGA was not linked
into the sparc64 target. But with the upcoming Kconfig build system,
all theoretically possible PCI cards will be enabled by default, so the
Cirrus VGA card might become available on the sparc64 target, too. vl.c
then picks the wrong card, causing sun4uv_init() to abort.
Thus let's make it explicit that we always want stdvga for sparc64 and
so set default_display = "std" for these machines.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <1550041639-10232-1-git-send-email-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Always make error messages start with 'Error:' as a fallback
to make sure that anything parsing them can tell it failed.
Note: Some places don't use hmp_handle_error
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190212134758.10514-3-dgilbert@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
We have now managed to eradicate all the places in the codebase
that triggered clang's -Waddress-of-packed-member warning. Remove
the compiler flag that exempted it from our usual -Werror policy.
This will prevent any new problematic code being added in future.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190208132112.31493-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The lock usage was described with its introduction in commit
9005b2a758. It was necessary because PTY
write() shares more state (beyond the GIOChannel) with other
operations.
This made char-pty a bit different from other chardevs, which only lock
around the write operation. This was apparent in commit
7b3621f47a, which introduced an idle
source to avoid the lock.
By removing the PTY chardev state sharing on write() with previous
patch, we can remove the lock and the idle source.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190206174328.9736-7-marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This doesn't help much compared to the 1 second poll PTY
timer. I can't think of a use case where this would help.
However, we can simplify the code around chr_write(): the write lock
is no longer needed for other char-pty callbacks (see following
patch).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190206174328.9736-6-marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This will be needed by vhost-user-test, when each test switches to
its own GMainLoop and GMainContext. Otherwise, for a reconnecting
socket the initial connection will happen on the default GMainContext,
and no one will be listening on it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190202110834.24880-1-pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Commit caa1ee43 "vhost-user-blk: add discard/write zeroes features
support" added fields to struct virtio_blk_config. This changes
the size of the config space and breaks migration from QEMU 3.1
and older:
qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x10 read: 41 device: 1 cmask: ff wmask: 80 w1cmask:0
qemu-system-ppc64: Failed to load PCIDevice:config
qemu-system-ppc64: Failed to load virtio-blk:virtio
qemu-system-ppc64: error while loading state for instance 0x0 of device 'pci@800000020000000:01.0/virtio-blk'
qemu-system-ppc64: load of migration failed: Invalid argument
Since virtio-blk doesn't support the "discard" and "write zeroes"
features, it shouldn't actually expose the associated fields in the
config space at all. Just include all fields up to num_queues to
match QEMU 3.1 and older.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1550022537-27565-1-git-send-email-changpeng.liu@intel.com
Message-Id: <1550022537-27565-1-git-send-email-changpeng.liu@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QEMU wraps the socket functions in os-win32.h, but in commit
a9d8b3ec43, the header inclusion was dropped,
breaking libslirp on Windows.
Wrap the missing functions.
Rename the wrapped functions with a "slirp_" prefix and "_wrap" suffix,
for consistency and to avoid a clash with existing functions (such as
"slirp_socket").
Fixes: a9d8b3ec ("slirp: replace remaining qemu headers dependency")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190212160953.29051-3-marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Howard Spoelstra
QEMU wraps the socket functions in os-win32.h, but in commit
a9d8b3ec43, the header inclusion was dropped,
breaking libslirp on Windows.
There are already a few socket functions that are wrapped in libslirp
with a "slirp_" prefix, but many of them are missing, and we are going
to wrap the missing functions in a second patch.
Using the "slirp_" prefix avoids the conflict with the socket function
#define wrappers in QEMU's os-win32.h, but those prefixes are quite
intrusive. In the end, the functions should behave the same as the
original ones, but with errno being set. To avoid the churn and potential
confusion, remove the "slirp_" prefix. A series of #undef is necessary,
until libslirp is made standalone, to prevent the #define conflict with QEMU.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190212160953.29051-2-marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
It looks like the operands were exchanged. The HP bootrom tests the
following sequence:
0x00000000f0004064: ldil L%-66666800,r7
0x00000000f0004068: addi 19f,r7,r7
0x00000000f000406c: addi -1,r0,rp
0x00000000f0004070: addi f,r0,r4
0x00000000f0004074: addi 1,r4,r5
0x00000000f0004078: dcor rp,r6
0x00000000f000407c: cmpb,<>,n r6,r7,0xf000411
This returned 0x66666661 instead of the expected 0x9999999f in QEMU.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190211181907.2219-6-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These conditions include the signed overflow bit. See page 5-3
of the Parisc 1.1 Architecture Reference Manual for details.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
[rth: More changes for c == 3, to compute (N^V)|Z properly.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will be fixing do_cond vs signed overflow, which requires
that do_log_cond not rely on do_cond.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that the implementation is entirely within the generated
decode function, eliminate the wrapper.
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With decodetree.py, the specializations would conflict so we
must have a single entry point for all variants of OR.
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of returning DisasJumpType, immediately store it.
Return true in preparation for conversion to the decodetree script.
Tested-by: Helge Deller <deller@gmx.de>
Tested-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The current socket chardev tests try to exercise the chardev socket
driver in both server and client mode at the same time. The chardev API
is not very well designed to handle both ends of the connection being in
the same process so this approach makes the test case quite unpleasant
to deal with.
This splits the tests into distinct cases, one to test server socket
chardevs and one to test client socket chardevs. In each case the peer
is run in a background thread using the simpler QIOChannelSocket APIs.
The main test case code can now be written in a way that mirrors the
typical usage from within QEMU.
In doing this refactoring it is possible to greatly expand the test
coverage for the socket chardevs to test all combinations except for a
server operating in blocking wait mode.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-16-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
When the 'reconnect' option is given for a client connection, the
qmp_chardev_open_socket_client method will run an asynchronous
connection attempt. The QIOChannel socket executes this in a single-use
background thread, so the connection will succeed immediately (assuming
the server is listening). The chardev, however, won't get the result
from this background thread until the main loop starts running and
processes idle callbacks.
Thus when tcp_chr_wait_connected is run s->ioc will be NULL, but the
state will be TCP_CHARDEV_STATE_CONNECTING, and there may already
be an established connection that will be associated with the chardev
by the pending idle callback. tcp_chr_wait_connected doesn't check the
state, only s->ioc, so attempts to establish another connection
synchronously.
If the server allows multiple connections this is unhelpful but not a
fatal problem as the duplicate connection will get ignored by the
tcp_chr_new_client method when it sees the state is already connected.
If the server only supports a single connection, however, the
tcp_chr_wait_connected method will hang forever because the server will
not accept its synchronous connection attempt until the first connection
is closed.
To deal with this tcp_chr_wait_connected needs to synchronize with the
completion of the background connection task. To do this it needs to
create the QIOTask directly and use the qio_task_wait_thread method.
This will cancel the pending idle callback and directly dispatch the
task completion callback, allowing the connection to be associated
with the chardev. If the background connection failed, it can still
attempt a new synchronous connection.
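A sketch of the synchronization (the field holding the pending task is
an assumption for the example):

if (s->state == TCP_CHARDEV_STATE_CONNECTING) {
    /* cancel the pending idle callback and run the completion callback
     * right now, on this thread */
    qio_task_wait_thread(s->connect_task);
    /* if the background attempt failed, fall through and retry a
     * synchronous connection below */
}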
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-15-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
In the previous commit
commit 1dc8a6695c
Author: Marc-André Lureau <marcandre.lureau@redhat.com>
Date: Tue Aug 16 12:33:32 2016 +0400
char: fix waiting for TLS and telnet connection
the tcp_chr_wait_connected() method was changed to check for a non-NULL
's->ioc' as a sign that there is already a connection present, as
opposed to checking the "connected" flag to supposedly fix handling of
TLS/telnet connections.
The original code would repeatedly call tcp_chr_wait_connected creating
many connections as 'connected' would never become true. The changed
code would still repeatedly call tcp_chr_wait_connected busy waiting
because s->ioc is set but the chardev will never see CHR_EVENT_OPENED.
IOW, the code is still broken with TLS/telnet, but in a different way.
Checking for a non-NULL 's->ioc' does not mean that a CHR_EVENT_OPENED
will be ready for a TLS/telnet connection. These protocols (and the
websocket protocol) all require the main loop to be running in order
to complete the protocol handshake before emitting CHR_EVENT_OPENED.
The tcp_chr_wait_connected() method is only used during early startup
before a main loop is running, so TLS/telnet/websock connections can
never complete initialization.
Making this work would require changing tcp_chr_wait_connected to run
a main loop. This is quite complex since we must not allow GSource's
that other parts of QEMU have registered to run yet. The current callers
of tcp_chr_wait_connected do not require use of the TLS/telnet/websocket
protocols, so the simplest option is to just forbid this combination
completely for now.
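The restriction amounts to an early check of roughly this shape in
tcp_chr_wait_connected() (the field names are recalled from memory and
should be treated as illustrative):

if (s->tls_creds || s->is_telnet || s->is_websock) {
    error_setg(errp,
               "cannot wait for connection with TLS, telnet or websocket enabled");
    return -1;
}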
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-14-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
If establishing a client connection fails, the tcp_chr_wait_connected
method should sleep for the reconnect timeout and then retry the
attempt. This ensures the callers don't immediately abort with an
error when the initial connection fails.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-13-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The socket connection state is indicated via the 'bool connected' field
in the SocketChardev struct. This variable is somewhat misleading
though, as it is only set to true once the connection has completed all
required handshakes (eg for TLS, telnet or websockets). IOW there is a
period of time in which the socket is connected, but the "connected"
flag is still false.
The socket chardev really has three states that it can be in,
disconnected, connecting and connected and those should be tracked
explicitly.
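Those three states map naturally onto an enum of roughly this shape
(matching the TCP_CHARDEV_STATE_CONNECTING value already mentioned
earlier in this log):

typedef enum {
    TCP_CHARDEV_STATE_DISCONNECTED,
    TCP_CHARDEV_STATE_CONNECTING,
    TCP_CHARDEV_STATE_CONNECTED,
} TCPChardevState;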
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-12-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
In qmp_chardev_open_socket the code for connecting client chardevs is
split across two conditionals far apart with some server chardev code in
the middle. Split up the method so that code for client connection setup
is separate from code for server connection setup.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-11-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The tcp_chr_wait_connected method can deal with either server or client
chardevs, but some callers only care about one of these possibilities.
The tcp_chr_wait_connected method will also need some refactoring to
reliably deal with its primary goal of allowing a device frontend to
wait for an established connection, which will interfere with other
callers.
Split it into two methods, one responsible for server initiated
connections, the other responsible for client initiated connections.
In doing this split, the tcp_chr_connect_async() method is renamed
to be consistent with the naming of the new methods.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-10-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The 'wait'/'nowait' parameter is used to tell server sockets whether to
block until a client is accepted during initialization. Client chardevs
have always silently ignored this option. Various tests were mistakenly
passing this option for their client chardevs.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-6-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The 'reconnect' option is used to give the sleep time, in seconds,
before a client socket attempts to re-establish a connection to the
server. It does not make sense to set this for server sockets, as they
will always accept a new client connection immediately after the
previous one went away.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-5-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The TLS creds option is not valid with certain address types. The user
config was only checked for errors when parsing legacy QemuOpts, thus
the user could pass unsupported values via QMP.
Pull all code for validating options out into a new method
qmp_chardev_validate_socket, that is called from the main
qmp_chardev_open_socket method. This adds a missing check for rejecting
TLS creds with the vsock address type.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-4-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Add the ability for a caller to wait for completion of the
background thread to synchronously dispatch its result, without
needing to wait for the main loop to run the idle callback.
This method needs very careful usage to avoid a dangerous
race condition with the free'ing of the task. The completion
callback is normally invoked from an idle callback registered
with the main loop context. The qio_task_wait_thread method
must only be called if the completion callback has not yet
run. The only safe way to achieve this is to run the
qio_task_wait_thread method from the thread that executes
the main loop.
It is generally a bad idea to use this method since it will
block execution of the main loop, however, the design of
the character devices and its usage from vhostuser already
requires blocking execution.
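The mechanism boils down to "join the worker thread, then invoke the
completion callback directly in the current thread instead of via an idle
source". A self-contained sketch of that idea in plain pthreads (not the
actual io/task.c code; all names here are hypothetical):

    #include <pthread.h>
    #include <stdio.h>

    typedef void (*CompleteFn)(void *opaque);

    typedef struct {
        pthread_t thread;
        CompleteFn complete;
        void *opaque;
    } Task;

    static void *worker(void *arg)
    {
        (void)arg;
        /* background work (e.g. the socket connect) happens here */
        return NULL;
    }

    /* The "wait" primitive: join the worker, then dispatch the completion
     * callback synchronously instead of deferring it to an idle callback. */
    static void task_wait_thread(Task *t)
    {
        pthread_join(t->thread, NULL);
        t->complete(t->opaque);
    }

    static void on_complete(void *opaque)
    {
        printf("connection ready: %s\n", (const char *)opaque);
    }

    int main(void)
    {
        Task t = { .complete = on_complete, .opaque = (void *)"chardev" };
        pthread_create(&t.thread, NULL, worker, NULL);
        task_wait_thread(&t);   /* caller blocks here, as the vhost-user case requires */
        return 0;
    }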
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-3-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
When a chardev is multiplexed (mux=on) there are a lot of cases where
the CHR_EVENT_OPENED/CHR_EVENT_CLOSED event pairing expected by the
frontend side is broken: either multiple repeated or extra
CHR_EVENT_OPENED events are generated, or CHR_EVENT_CLOSED isn't
generated at all.
This is mostly because the 'qemu_chr_fe_set_handlers()' function makes its
own (and often wrong) implicit decision about the updated frontend state
and invokes the 'fd_event' callback with 'CHR_EVENT_OPENED'. Even worse,
it doesn't do the symmetric action in the opposite direction, as one might
expect (i.e. it doesn't invoke the previously set 'fd_event' with
'CHR_EVENT_CLOSED'). The muxed chardev uses a trick, calling this function
again to replace the callback handlers with its own, but it doesn't
account for this side effect.
Fix that by using an extended version of this function with an added
argument that disables the side effect, and keep the original function for
compatibility with the many frontends already using this interface and
being "tolerant" of its side effects.
One more source of event duplication is a single line of code in
char-mux.c which does far more than the comment above it says (an obvious fix).
Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <7dde6abbd21682857f8294644013173c0b9949b3.1541507990.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The 'qemu coroutine' command results in the following error output:
Python Exception <class 'gdb.error'> 'arch_prctl' has unknown return
type; cast the call to its declared return type: Error occurred in
Python command: 'arch_prctl' has unknown return type; cast the call to
its declared return type
Fix it by giving GDB what it wants: the arch_prctl return type.
Information on the topic:
https://sourceware.org/gdb/onlinedocs/gdb/Calling.html
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190206151425.105871-1-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Lukas reported a hard-to-reproduce QMP iothread hang on s390: QEMU
might hang at pthread_join() of the QMP monitor iothread before
quitting:
Thread 1
#0 0x000003ffad10932c in pthread_join
#1 0x0000000109e95750 in qemu_thread_join
at /home/thuth/devel/qemu/util/qemu-thread-posix.c:570
#2 0x0000000109c95a1c in iothread_stop
#3 0x0000000109bb0874 in monitor_cleanup
#4 0x0000000109b55042 in main
While the iothread is still in the main loop:
Thread 4
#0 0x000003ffad0010e4 in ??
#1 0x000003ffad553958 in g_main_context_iterate.isra.19
#2 0x000003ffad553d90 in g_main_loop_run
#3 0x0000000109c9585a in iothread_run
at /home/thuth/devel/qemu/iothread.c:74
#4 0x0000000109e94752 in qemu_thread_start
at /home/thuth/devel/qemu/util/qemu-thread-posix.c:502
#5 0x000003ffad10825a in start_thread
#6 0x000003ffad00dcf2 in thread_start
IMHO it's because there's a race between the main thread and the iothread
when stopping the thread, in the following sequence:
main thread                              iothread
===========                              ========
                                         aio_poll()
iothread_get_g_main_context
  set iothread->worker_context
iothread_stop
  schedule iothread_stop_bh
                                         execute iothread_stop_bh [1]
                                           set iothread->running=false
                                           (since main_loop==NULL so
                                            skip to quit main loop.
                                            Note: although main_loop is
                                            NULL but worker_context is
                                            not!)
                                         atomic_read(&iothread->worker_context) [2]
                                           create main_loop object
                                           g_main_loop_run() [3]
pthread_join() [4]
We can see that when iothread_stop_bh() executes at [1] it is possible
that main_loop is still NULL, because it is only created at the first
check of worker_context later at [2]. The iothread will then hang in
the main loop [3] and starve the main thread too [4].
The simple solution is to check the "running" variable again before
checking worker_context.
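A hedged sketch of the shape of the fix, reconstructed from the description
above (not the literal iothread.c code):

    /* In iothread_run(), before setting up the glib main loop: */
    if (iothread->running && atomic_read(&iothread->worker_context)) {
        /*
         * 'running' is re-checked here because iothread_stop_bh may
         * already have cleared it while main_loop was still NULL; in
         * that case we must not create main_loop and enter
         * g_main_loop_run(), or nothing will ever stop it.
         */

        /* create main_loop and call g_main_loop_run() as before */
    }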
CC: Thomas Huth <thuth@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Lukáš Doktor <ldoktor@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20190129051432.22023-1-peterx@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Cast the kernel start address to the target bit length.
This ensures that the initrd offset we calculate is a valid address for
the architecture.
Steps to reproduce the original problem (reported by Alex):
Build U-Boot for the virt machine for riscv32. Then run it with
$ qemu-system-riscv32 -M virt -kernel u-boot -nographic -initrd <a file>
You can find the initrd address with
U-Boot# fdt addr $fdtcontroladdr
U-Boot# fdt ls /chosen
Then take a peek at that address:
U-Boot# md.b <addr>
and you will see that there is nothing there without this patch. The
reason is that the binary was loaded to a negative address.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Suggested-by: Alexander Graf <agraf@suse.de>
Reported-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Access to a counter in U-mode is permitted only if the corresponding
bit is set in both mcounteren and scounteren. The current code
ignores mcounteren and checks scounteren only for U-mode access.
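A minimal sketch of the combined check (field and type names are
assumptions modelled on target/riscv, not verbatim code):

    /* U-mode may access counter 'ctr' only if *both* M and S levels allow it */
    static bool umode_counter_allowed(CPURISCVState *env, int ctr)
    {
        target_ulong bit = (target_ulong)1 << ctr;
        return (env->mcounteren & bit) && (env->scounteren & bit);
    }

If the check fails, the access raises an illegal instruction exception as
before.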
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This patch adds support for writing misa. misa is validated based
on rules in the ISA specification. 'E' is mutually exclusive with
all other extensions. 'D' depends on 'F' so 'D' bit is dropped
if 'F' is not present. A conservative approach to consistency is
taken by flushing the translation cache on misa writes. misa_mask
is added to the CPU struct to store the original set of extensions.
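A sketch of the validation rules described above (the helper name is
illustrative, and the RVE/RVF/RVD-style extension bits are assumed):

    /* Reduce a requested misa value to something self-consistent */
    static target_ulong validate_misa(target_ulong val, target_ulong misa_mask)
    {
        val &= misa_mask;              /* only originally present extensions */
        if (val & RVE) {
            val = RVE;                 /* one way to enforce 'E' excluding all others */
        }
        if (!(val & RVF)) {
            val &= ~(target_ulong)RVD; /* 'D' requires 'F' */
        }
        return val;
    }

A write that changes the result is then followed by a translation cache
flush, as the commit message notes.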
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Add misa checks for the M, A, F and D extensions and, if they are
not present, raise illegal instruction exceptions. This improves
emulation accuracy for harts with a limited set of extensions.
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
gen methods should access state from DisasContext. Add misa
field to the DisasContext struct and remove CPURISCVState
argument from all gen methods.
Signed-off-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
* Add riscv prefix to raise_exception function
* Add riscv prefix to CSR read/write functions
* Add riscv prefix to signal handler function
* Add riscv prefix to get fflags function
* Remove redundant declaration of riscv_cpu_init
and rename cpu_riscv_init to riscv_cpu_init
* rename riscv_set_mode to riscv_cpu_set_mode
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
In the 'Format specific information' section of the 'qemu-img info'
command output, the supplemental information about existing QCOW2
bitmaps will be shown, such as a bitmap name, flags and granularity:
image: /vz/vmprivate/VM1/harddisk.hdd
file format: qcow2
virtual size: 64G (68719476736 bytes)
disk size: 3.0M
cluster_size: 1048576
Format specific information:
    compat: 1.1
    lazy refcounts: true
    bitmaps:
        [0]:
            flags:
                [0]: in-use
                [1]: auto
            name: back-up1
            granularity: 65536
        [1]:
            flags:
                [0]: in-use
                [1]: auto
            name: back-up2
            granularity: 65536
    refcount bits: 16
    corrupt: false
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1549638368-530182-3-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
lgtm.com pointed out that commit 678ba275 introduced a shadowed
declaration of local variable 'bs'; thankfully, the inner 'bs'
obtained by 'blk_bs(blk)' matches the outer one given that we had
'blk_insert_bs(blk, bs, errp)' a few lines earlier, and there are
no later uses of 'bs' beyond the scope of the 'if (bitmap)' to
care if we change the value stored in 'bs' while traveling the
backing chain to find a bitmap. So simply get rid of the extra
declaration.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190207191357.6665-1-eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We are failing to take into account that tlb_fill() can cause a
TLB resize, which renders prior TLB entry pointers/indices stale.
Fix it by re-doing the TLB entry lookups immediately after tlb_fill.
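A sketch of the pattern, using the tlb_index()/tlb_entry() helpers from
the dynamic TLB sizing work (surrounding code abbreviated, names assumed):

    /* after a TLB miss */
    tlb_fill(cpu, addr, size, access_type, mmu_idx, retaddr);
    /*
     * tlb_fill() can trigger a TLB resize, which invalidates any entry
     * pointer or index computed before the call, so recompute them.
     */
    index = tlb_index(env, mmu_idx, addr);
    entry = tlb_entry(env, mmu_idx, addr);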
Fixes: 86e1eff8bc ("tcg: introduce dynamic TLB sizing", 2019-01-28)
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20190209162745.12668-3-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently, a jump to a label that is not defined anywhere will
be emitted but not relocated. This results in a jump to a random
target. With tcg debugging enabled, print a diagnostic to the -d op
log and abort.
This could help debug or detect errors like
c2d9644e6d ("target/arm: Fix crash on conditional instruction in an IT block")
Reported-by: Roman Kapl <code@rkapl.cz>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Testing updates:
- .travis.yml tweaks and optimisations
- .cirrus.yml enabled for FreeBSD CI
- docker.py clean-ups for binfmt_misc
- more control of vm-test builds
# gpg: Signature made Mon 11 Feb 2019 13:03:14 GMT
# gpg: using RSA key F715F7CD46F94435F4F588658E520D61289519AE
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
# Subkey fingerprint: F715 F7CD 46F9 4435 F4F5 8865 8E52 0D61 2895 19AE
* remotes/stsquad/tags/pull-testing-next-110219-1:
tests/vm: Be verbose while extracting compressed images
docs/devel/testing: Add -a option to usermod command on docker setup
scripts/qemu.py: allow arches use KVM for their 32bit cousins
tests/vm: expose BUILD_TARGET, TARGET_LIST and EXTRA_CONFIGURE_OPTS
tests/vm: add --build-target option
tests/vm: call make check directly for netbsd/freebsd/ubuntu.i386
tests/vm: move images to $HOME/.cache/qemu-vm/images
tests: PEP8 cleanup of docker.py, mostly white space
tests: docker.py be even smarter with persistent binfmt_misc
tests: make docker.py check for persistent configs
tests: make docker.py update use configured binfmt path
docker: add debian-buster-arm64-cross
archive-source.sh: Clone the submodules locally
MAINTAINERS: Add an entry for scripts/archive-source.sh
.travis.yml: fold --disable-tcg into alternate coroutine builds
.travis.yml: separate tools and docs into another entry
.travis.yml: stop requesting libffi & gettext from homebrew
.cirrus.yml: basic compile and test for FreeBSD
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Depending on the host hardware, copying and extracting VM images can
take up to a few minutes. Add verbosity so the user does not worry that
the VMs are hanging.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190129175403.18017-2-philmd@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The option -G of usermod command will remove user from other groups
not listed, i.e.: $USER will belong only to group 'docker' after
following the documentation as is.
From usermod(8) manual page:
If the user is currently a member of a group which is not listed,
the user will be removed from the group. This behaviour can be
changed via the -a option, which appends the user to the current
supplementary group list.
This patch improves the situation by adding the -a option to the
usermod command, which will just append user to the supplementary
group list.
Cc: qemu-trivial@nongnu.org
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190207184346.6840-1-muriloo@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
A lot of architectures can run their 32-bit cousins on KVM, so the
kvm_available function needs to be a little less restrictive when
deciding if KVM is available.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Now the underlying basevm support passes these along we can expose
some additional variables to our Makefile to allow more customised
tweaking of the build. For example:
make vm-build-freebsd TARGET_LIST=aarch64-softmmu \
EXTRA_CONFIGURE_OPTS="--disable-tools --disable-docs" \
BUILD_TARGET=check-softfloat
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This allows us to invoke the build with a custom target (for the VMs
that use the {target} format string specifier). Currently OpenBSD is
still hardwired due to problems running check.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The "make check" target calls check-qtest which has the appropriate
system binaries as dependencies so we shouldn't need to do two steps
of make invocation. Doing it in two steps was a hangover from when our
make check couldn't run tests in parallel.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This makes it easier to move the images around, by replacing the
subdirectory with a symlink. It also allows sharing the images between
multiple qemu checkouts, for example.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
My editor keeps putting squiggly lines under a bunch of the python
lines to remind me how non-PEP8 compliant it is. Clean that up so it's
easier to spot new errors.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
If we have a persistent mapping we don't need the QEMU binary copied
into the container as the kernel has already opened the file and will
pass the fd in. However the support libraries will still need to be
there.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
binfmt_misc configured with the "F" flag opens the interpreter at
config time. This means it can use an already open file-descriptor to
run QEMU so there is no point trying to copy the binary into a
container.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
When copying a QEMU binary into a linux-user docker image we should
check what the current configured binfmt_misc path is rather than
just assuming "/usr/bin/qemu-bin". Obviously if the user changes the
configuration afterwards they will break their images again.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
We can't build QEMU with this but we can use this image to build newer
arm64 testcases which need more up to date tools.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
We cloned the QEMU repository from the local storage. Since the
submodules are also available there, clone them too. This is
quicker and reduces network use.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[AJB: incorporated review suggestions from danpb]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
scripts/archive-source.sh is used by the VM tests, so it makes
sense to add it to the "Build and test automation" section.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The alternate coroutine builds are really only of interest to people
running KVM (although I think you could use them for TCG if you really
tried). As they tend to run long, let's kill two birds with one stone
and fold the --disable-tcg build into them.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Re-building the tools and documents by default is a little wasteful as
they are not really affected by the main build options. Split tools
and documents into their own task with a minimal softmmu and
linux-user target list just to check they don't interact badly.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The default package set installed on macOS builders from Travis already
includes libffi and gettext as shown by log messages:
Skipping install of libffi formula. It is already up-to-date.
Using libffi
Skipping install of gettext formula. It is already up-to-date.
Using gettext
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
More work towards libslirp
Marc-André Lureau (27):
slirp: generalize guestfwd with a callback based approach
net/slirp: simplify checking for cmd: prefix
net/slirp: free forwarding rules on cleanup
net/slirp: fix leaks on forwarding rule registration error
slirp: add callbacks for timer
slirp: replace trace functions with DEBUG calls
slirp: replace QEMU_PACKED with SLIRP_PACKED
slirp: replace most qemu socket utilities with slirp own version
slirp: replace qemu_set_nonblock()
slirp: add unregister_poll_fd() callback
slirp: replace qemu_notify_event() with a callback
slirp: move QEMU state saving to a separate unit
slirp: do not include qemu headers in libslirp.h public API header
slirp: improve windows headers inclusion
slirp: add slirp own version of pstrcpy
slirp: remove qemu timer.h dependency
slirp: remove now useless QEMU headers inclusions
slirp: replace net/eth.h inclusion with own defines
slirp: replace qemu qtailq with slirp own copy
slirp: replace remaining qemu headers dependency
slirp: prefer c99 types over BSD kind
slirp: improve send_packet() callback
slirp: replace global polling with per-instance & notifier
slirp: remove slirp_instances list
slirp: use polling callbacks, drop glib requirement
slirp: pass opaque to all callbacks
slirp: API is extern C
Peter Maydell (2):
slirp: Avoid marking naturally packed structs as QEMU_PACKED
slirp: Don't mark struct ipq or struct ipasfrag as packed
Samuel Thibault (3):
slirp: Avoid unaligned 16bit memory access
slirp: replace QEMU_BUILD_BUG_ON with G_STATIC_ASSERT
slirp: Move g_spawn_async_with_fds_qemu compatibility to slirp/
# gpg: Signature made Thu 07 Feb 2019 14:02:41 GMT
# gpg: using RSA key E61DBB15D4172BDEC97E92D9DB550E89F0FA54F3
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>" [unknown]
# gpg: aka "Samuel Thibault <sthibault@debian.org>" [marginal]
# gpg: aka "Samuel Thibault <samuel.thibault@gnu.org>" [unknown]
# gpg: aka "Samuel Thibault <samuel.thibault@inria.fr>" [marginal]
# gpg: aka "Samuel Thibault <samuel.thibault@labri.fr>" [marginal]
# gpg: aka "Samuel Thibault <samuel.thibault@ens-lyon.org>" [marginal]
# gpg: aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>" [unknown]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82 304B D017 8C76 7D06 9EE6
# Subkey fingerprint: E61D BB15 D417 2BDE C97E 92D9 DB55 0E89 F0FA 54F3
* remotes/thibault/tags/samuel-thibault: (32 commits)
slirp: API is extern C
slirp: pass opaque to all callbacks
slirp: use polling callbacks, drop glib requirement
slirp: remove slirp_instances list
slirp: replace global polling with per-instance & notifier
slirp: improve send_packet() callback
slirp: prefer c99 types over BSD kind
slirp: replace remaining qemu headers dependency
slirp: Move g_spawn_async_with_fds_qemu compatibility to slirp/
slirp: replace QEMU_BUILD_BUG_ON with G_STATIC_ASSERT
slirp: replace qemu qtailq with slirp own copy
slirp: replace net/eth.h inclusion with own defines
slirp: remove now useless QEMU headers inclusions
slirp: remove qemu timer.h dependency
slirp: add slirp own version of pstrcpy
slirp: improve windows headers inclusion
slirp: do not include qemu headers in libslirp.h public API header
slirp: move QEMU state saving to a separate unit
slirp: replace qemu_notify_event() with a callback
slirp: add unregister_poll_fd() callback
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Trivial patches:
* Update copyright
* Fix LGPL in target/moxie
* configure portability fix
* Drop useless inclusion of "hw/i386/pc.h"
* Mark the cpu-cluster device with user_creatable = false
* tsc210x: Fix building with no verbosity
# gpg: Signature made Wed 06 Feb 2019 15:27:35 GMT
# gpg: using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier2/tags/trivial-patches-pull-request:
hw/input/tsc210x: Fix building with no verbosity
hw/cpu/cluster: Mark the cpu-cluster device with user_creatable = false
hw/unicore32/puv3: Drop useless inclusion of "hw/i386/pc.h"
hw/sparc64/sun4u: Drop useless inclusion of "hw/i386/pc.h"
configure: Avoid non-portable 'test -o/-a'
target/moxie: Fix LGPL information in the file headers
qemu-common.h: Update copyright string for 2019
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It would be legitimate to use libslirp without glib. Let's add an
add_poll/get_revents pair of callbacks to provide the same
functionality.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Remove hard-coded dependency on slirp in main-loop, and use a "poll"
notifier instead. The notifier is registered per slirp instance.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Use a more descriptive name for the callback.
Reuse the SlirpWriteCb type. Wrap it to check that all data has been written.
Return a ssize_t for potential error handling and data-loss reporting.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Except for the migration code which is gated by WITH_QEMU, only
include our own headers, so libslirp can be built standalone.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Our API usage requires Vista; set WIN32_LEAN_AND_MEAN to fix a number
of issues (for example the winsock2.h include order, which is better to
include first for legacy reasons).
While at it, group redundant #ifndef _WIN32 blocks.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Make state saving optional: this will allow building SLIRP without
QEMU. (Eventually, the vmstate helpers will be extracted, so an
external project & process could save its state.)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Introduce a SlirpCb callback to kick the main io-thread.
Add an intermediary sodrop() function that will call SlirpCb.notify
callback when sbdrop() returns true.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Add a counter-part to register_poll_fd() for completeness.
(so far, register_poll_fd() is called only on struct socket fd)
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Replace qemu_set_nonblock() with slirp_set_nonblock().
qemu_set_nonblock() does some event registration with the main
loop, so add a new callback register_poll_fd() for that reason.
Always build the fd-register stub, to avoid #if WIN32.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Remove a dependency on QEMU. Use the existing logging facilities.
Set SLIRP_DEBUG=tftp to get tftp log.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Instead of calling into QEMU chardev directly, and mixing it with
slirp_add_exec() handling, add a new function slirp_add_guestfwd()
which takes a write callback.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
There is no reason to mark the struct ipq and struct ipasfrag as
packed: they are naturally aligned anyway, and are not representing
any on-the-wire packet format. Indeed they vary in size depending on
the size of pointers on the host system, because the 'struct qlink'
members include 'void *' fields.
Dropping the 'packed' annotation fixes clang -Waddress-of-packed-member
warnings and probably lets the compiler generate better code too.
The only thing we do care about in the layout of the struct is
that the frag_link matches up with the ipf_link of the struct
ipasfrag, as documented in the comment on that struct; assert
at build time that this is the case.
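The shape of such a compile-time check, in standard C (the field names
frag_link and ipf_link come from the commit message; the exact assertion
macro used in slirp may differ):

    #include <stddef.h>
    /* frag_link must overlay ipf_link for the fragment-queue linkage to work */
    _Static_assert(offsetof(struct ipq, frag_link) ==
                   offsetof(struct ipasfrag, ipf_link),
                   "frag_link and ipf_link must sit at the same offset");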
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Various ipv6 structs in the slirp headers are marked QEMU_PACKED,
but they are actually naturally aligned and will have no padding
in them. Instead of marking them with the 'packed' attribute,
assert at compile time that they are the size we expect. This
allows us to take the address of fields within the structs
without risking undefined behaviour, and suppresses clang
-Waddress-of-packed-member warnings.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
The pkt parameter may be unaligned, so we must access it byte-wise.
This fixes a SIGBUS on sparc64 hosts during PXE boot.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Coverity warns (CID 1390634) that open_net_route() is not
checking the return value from sscanf(), which means that
it might then use values that aren't initialized.
Errors here should in general not happen since we're passing
an assumed-good /proc/net/route from the host kernel, but
if we do fail to parse a line then just skip it in the output
we pass to the guest.
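A sketch of the kind of check involved (the real open_net_route() parses
more fields; this only shows the "skip unparsable lines" idea):

    char iface[17];
    unsigned long dest, gw, flags;

    /* Skip any /proc/net/route line that does not parse completely */
    if (sscanf(line, "%16s %lx %lx %lx", iface, &dest, &gw, &flags) != 4) {
        continue;
    }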
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20190205174207.9278-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When building with TSC_VERBOSE not defined, we get:
CC arm-softmmu/hw/input/tsc210x.o
hw/input/tsc210x.c: In function ‘tsc2102_data_register_write’:
hw/input/tsc210x.c:554:5: error: label at end of compound statement
default:
^~~~~~~
hw/input/tsc210x.c: In function ‘tsc2102_control_register_write’:
hw/input/tsc210x.c:638:5: error: label at end of compound statement
bad_reg:
^~~~~~~
hw/input/tsc210x.c: In function ‘tsc2102_audio_register_write’:
hw/input/tsc210x.c:766:5: error: label at end of compound statement
default:
^~~~~~~
make[1]: *** [rules.mak:69: hw/input/tsc210x.o] Error 1
Fix this by replacing the culprit fprintf(stderr) calls by a more
recent API: qemu_log_mask(LOG_GUEST_ERROR). Other fprintf() calls
are left untouched.
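A sketch of the fix for one of the offending switch statements (the
message text is illustrative): giving the label a statement both reports
the guest error and resolves the "label at end of compound statement"
build failure:

    default:
    bad_reg:
        qemu_log_mask(LOG_GUEST_ERROR,
                      "%s: write to unknown register 0x%02x\n",
                      __func__, reg);
        break;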
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20190204204517.23698-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The device cannot be instantiated by the user and QEMU currently
aborts when you try to use it:
$ x86_64-softmmu/qemu-system-x86_64 -device cpu-cluster
qemu-system-x86_64: hw/cpu/cluster.c:73: cpu_cluster_realize:
Assertion `cbdata.cpu_count > 0' failed.
Aborted (core dumped)
Since this is an internal device only, mark it with user_creatable = false.
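For reference, marking a device internal-only is a one-liner in its
class_init (sketch; the rest of cpu_cluster_class_init is abbreviated):

    static void cpu_cluster_class_init(ObjectClass *klass, void *data)
    {
        DeviceClass *dc = DEVICE_CLASS(klass);

        /* Internal-only device: not valid with -device / device_add */
        dc->user_creatable = false;
    }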
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <1549371525-29899-1-git-send-email-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
POSIX says that it is better to use &&/|| and two separate test
invocations than it is to try and use -a and -o (in fact, there
are some tests that are inherently ambiguous to parse if the
user passes in corner-case input like "(").
Since we cannot guarantee which shell runs configure, we cannot
rely on -o/-a always following bash's parser rules.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190205023937.18245-1-eblake@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
vaddr needs to be equal to the paddr since the dump file represents the
physical memory image.
Without setting vaddr correctly, GDB would load all the different memory
regions on top of each other at vaddr 0, thus making GDB show the wrong
memory data for a given address.
Signed-off-by: Jon Doron <arilou@gmail.com>
Message-Id: <20190109082203.27142-1-arilou@gmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
It's either "GNU *Library* General Public License version 2" or "GNU
Lesser General Public License version *2.1*", but there was no "version
2.0" of the "Lesser" license. So assume that version 2.1 is meant here.
Also the files mentioned the GPL instead of the LGPL after declaring
that the files are licensed under the LGPL, so change these spots to
use LGPL, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1549266858-5043-1-git-send-email-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
PA-RISC specification says: "Setting the PSW Q-bit, PSW{28}, to 1
with this instruction, if it was not already 1, is an undefined
operation." However, at least HP-UX 10.20 sets the Q bit from 0 to 1
with the SSM instruction. Tested this both on HP9000/712 and
HP9000/785/C3750, both machines set the Q bit from 0 to 1 without
exception. This makes HP-UX 10.20 progress a little bit further.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190129191402.29539-1-svens@stackframe.org>
[rth: Add a comment to the code as well.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In commit f7b78602fd we added the CPU cluster number to the
cflags field of the TB hash; this included adding it to the value
kept in tb->cflags, since we pass that field directly into the hash
calculation in some places. Unfortunately we forgot to check whether
other parts of the code were doing comparisons against tb->cflags
that would need to be updated.
It turns out that there is exactly one such place: the
tb_lookup__cpu_state() function checks whether the TB it has
found in the tb_jmp_cache has a tb->cflags matching the cf_mask
that is passed in. The tb->cflags has the cluster_index in it
but the cf_mask does not.
Hoist the "add cluster index to the cf_mask" code up from
tb_htable_lookup() to tb_lookup__cpu_state() so it can be considered
in the "did this TB match in the jmp cache" condition, as well as
when we do the full hash lookup by physical PC, flags, etc.
(tb_htable_lookup() is only called from tb_lookup__cpu_state(),
so this change doesn't require any further knock-on changes.)
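A sketch of the hoisted computation (CF_CLUSTER_SHIFT/CF_CLUSTER_MASK and
cluster_index are the names assumed from the cited commit; exact details
may differ):

    /* In tb_lookup__cpu_state(), before the jmp-cache check: */
    cf_mask &= ~CF_CLUSTER_MASK;
    cf_mask |= cpu->cluster_index << CF_CLUSTER_SHIFT;

so that both the tb_jmp_cache comparison and the later hash lookup see the
same cluster-qualified flags.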
Fixes: f7b78602fd ("accel/tcg: Add cluster number to TCG TB hash")
Tested-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Reported-by: Cleber Rosa <crosa@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20190205151810.571-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The iteration was stopping as soon as prev_var was set to NULL, and
therefore it skipped the first element. Fortunately, or unfortunately,
we have only one use of QTAILQ_FOREACH_REVERSE_SAFE. Thus this only
showed up as incorrect register preferences on the very first translation
block that was compiled.
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A machine using the hvf accelerator with SMP sometimes hangs at boot
because all processors are executing instructions at startup,
including early I/O emulation. We should just allow the bootstrap
processor to initialize the machine and then wake up the secondary
processors by interrupt.
Signed-off-by: Heiher <r@hev.cc>
Message-Id: <20190123073402.28465-1-r@hev.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ARM should not have an ISA bus, so this device should not be enabled.
Kconfig allows cleaning up the dependencies and removing CONFIG_ISA_BUS=y
from ARM, and it then catches the contradiction between the hardcoded
CONFIG_SERIAL_ISA=y and CONFIG_ISA_BUS=n.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190202072456.6468-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Our command line interface is really quite overcrowded, so we should avoid
duplicated options that do the same thing in just a slightly different
way. "-accel hax" is shorter and more generic than "-enable-hax", so
there is really no need for the latter option. "-enable-hax" has
been deprecated for two releases and nobody has complained so far, so
it's time to remove it now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1544790073-23049-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever the allocation length of a SCSI request is shorter than the size of the
VPD page list, page_idx is used blindly to index into r->buf. Even though
the stores in the insertion sort are protected against overflows, the same is not
true of the reads and the final store of 0xb0.
This basically does the same thing as commit 57dbb58d80 ("scsi-generic: avoid
out-of-bounds access to VPD page list", 2018-11-06), except that here the
allocation length can be chosen by the guest. Note that according to the SCSI
standard, the contents of the PAGE LENGTH field are not altered based
on the allocation length.
The code was introduced by commit 6c219fc8a1 ("scsi-generic: keep VPD
page list sorted", 2018-11-06) but the overflow was already possible before.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Fixes: a71c775b24
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The machine description we send is silently thrown on the floor
by GDB, which falls back to its default machine description, because
the XML parse fails on a <feature> nested within a <feature>.
Changes to the XML in the QEMU source code therefore have no effect.
In addition, the default machine description has fs_base, which fails to
be retrieved, which breaks the whole register window. Add it and the
other control registers.
Signed-off-by: Doug Gale <doug16k@gmail.com>
Message-Id: <20190124040457.2546-1-doug16k@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In 16-bit addressing mode, when Mod = 0 and R/M = 6, the decoded displacement
doesn't reach decode_linear_addr and gets lost. Instructions that
use this ModRM combination always get a pointer with zero offset
from the beginning of the DS segment.
The change fixes drawing in F-BIRD from day 1 of '18 advent calendar.
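A sketch of the 16-bit ModRM special case being fixed (decoder structure
simplified; fetch16() and the variable names are hypothetical):

    /* 16-bit addressing: Mod = 0, R/M = 6 means a bare disp16 operand,
     * not [BP]; the displacement must be carried into the linear
     * address calculation rather than dropped. */
    if (mod == 0 && rm == 6) {
        offset = fetch16(decoder);   /* disp16 */
    }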
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20190125154743.14498-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since Linux commit cf8fa920cb42 ("i386: handle an initrd in highmem (version 2)"),
Linux has supported an initrd up to 4 GB, but the header field
ramdisk_max is still set to 2 GB to avoid "possible bootloader bugs".
When '-kernel vmlinux -initrd initrd.cgz' is used to launch a VM,
the firmware (it could be linuxboot_dma.bin) helps to read the initrd
contents into guest memory (below ramdisk_max) and jump to the kernel;
that's similar to what a bootloader such as grub does.
In addition, initrd_max is uint32_t simply because QEMU doesn't support
the 64-bit boot protocol (specifically the ext_ramdisk_image field).
Therefore just limit initrd_max to UINT32_MAX as well, to allow the
initrd to be loaded below 4 GB.
NOTE: it's possible that the Linux protocol within [0x208, 0x20c]
supports up to a 4 GB initrd as well.
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some address/memory APIs mix types between
'hwaddr/target_ulong addr' and 'int len'. This is very unsafe, especially
since some APIs will be passed a non-int len by the caller, which might
overflow quietly.
Below is a potential overflow case:
dma_memory_read(uint32_t len)
-> dma_memory_rw(uint32_t len)
-> dma_memory_rw_relaxed(uint32_t len)
-> address_space_rw(int len) # len overflow
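As an illustration of the direction of the change, the widened prototype
would look roughly like this (a sketch, not necessarily the exact final
signature):

    /* 'len' widened from int to hwaddr so callers passing 64-bit sizes
     * cannot silently truncate it */
    MemTxResult address_space_rw(AddressSpace *as, hwaddr addr,
                                 MemTxAttrs attrs, uint8_t *buf,
                                 hwaddr len, bool is_write);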
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Peter Crosthwaite <crosthwaite.peter@gmail.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
monitor_qmp_requests_pop_any_with_lock cannot modify the monitor list
concurrently with monitor_cleanup, since the dispatch bottom half
runs in the main thread, but anyway it is a bit ugly to keep
"next" live across critical sections of monitor_lock and Coverity
complains (CID 1397072).
Replace QTAILQ_FOREACH_SAFE with a while loop and QTAILQ_FIRST,
it is cleaner and more future-proof.
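The replacement pattern, sketched (the list head, entry field and cleanup
helper are placeholders, not the actual monitor code):

    while (!QTAILQ_EMPTY(&mon_list)) {
        Monitor *mon = QTAILQ_FIRST(&mon_list);
        QTAILQ_REMOVE(&mon_list, mon, entry);
        monitor_destroy(mon);   /* whatever per-monitor cleanup is needed */
    }

This never keeps a 'next' pointer live across the loop body, so dropping
and re-taking monitor_lock inside the loop stays safe.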
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MPX support is being phased out by Intel and actually I am not sure that
OS X has ever enabled it in XCR0. Drop it from the Hypervisor.framework
acceleration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Processor tracing is not yet implemented for KVM and it will be an
opt-in feature requiring a special module parameter.
Disable it, because it is wrong to enable it by default and
it is unlikely that anyone has ever used it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to avoid migration issues, we enable PVH only for
machine types >= 4.0.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new pvh.bin option rom can be used with SeaBIOS to boot an
uncompressed kernel using the x86/HVM direct boot ABI.
pvh.S contains the entry point of the option rom. It runs
in real mode, loads the e820 table by querying the BIOS, and
then switches to 32-bit protected mode and jumps to
pvh_load_kernel(), written in pvh_main.c.
pvh_load_kernel() loads the cmdline and kernel entry_point
using fw_cfg, then it looks for the RSDP, fills in the
hvm_start_info required by the x86/HVM ABI, and finally jumps
to the kernel entry_point.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
In order to allow other option roms to use these common
useful functions and definitions, this patch puts them
in two new C header files called optrom.h and
optrom_fw_cfg.h. We also add useful out*()/in*()
functions for different sizes, and new fw_cfg functions
to use when the DMA feature is not available.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
These changes (along with corresponding Linux kernel and qboot changes)
enable a guest to be booted using the x86/HVM direct boot ABI.
This commit adds a load_elfboot() routine to pass the size and
location of the kernel entry point to qboot (which will fill in
the start_info struct information needed to boot the guest).
Having loaded the ELF binary, load_linux() will run qboot
which continues the boot.
The address for the kernel entry point is read from an ELF Note
in the uncompressed kernel binary by a helper routine passed
to load_elf().
Co-developed-by: George Kennedy <George.Kennedy@oracle.com>
Signed-off-by: George Kennedy <George.Kennedy@oracle.com>
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The x86/HVM direct boot ABI permits QEMU to boot directly
into the uncompressed Linux kernel binary with minimal firmware involvement.
https://xenbits.xen.org/docs/unstable/misc/pvh.html
This commit adds the header file that defines the start_info struct
that needs to be populated in order to use this ABI.
The canonical version of start_info.h is in the Xen codebase.
(like QEMU, the Linux kernel uses a copy as well).
Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <Konrad.Wilk@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a routine which, given a pointer to a range of ELF Notes,
searches through them looking for a note matching the type specified
and returns a pointer to the matching ELF note.
get_elf_note_type() is used by elf_load[32|64]() to find the
specified note type required by the 'elf_note_fn' parameter
added in the previous commit.
Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
This patch adds an optional function pointer, 'elf_note_fn', to
load_elf() which causes load_elf() to additionally parse any
ELF program headers of type PT_NOTE and check to see if the ELF
Note is of the type specified by the 'translate_opaque' arg.
If a matching ELF Note is found then the specified function pointer
is called to process the ELF note.
Passing a NULL function pointer results in ELF Notes being skipped.
The first consumer of this functionality is the PVHboot support
which needs to read the XEN_ELFNOTE_PHYS32_ENTRY ELF Note while
loading the uncompressed kernel binary in order to discover the
boot entry address for the x86/HVM direct boot ABI.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can have a race condition between qemu_cpu_kick_thread() and
qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
qemu_cpu_kick_thread() can try to kick a thread that is exiting.
pthread_kill() returns an error and qemu is stopped by an exit(1).
qemu:qemu_cpu_kick_thread: No such process
We can safely ignore this error.
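A sketch of the resulting error handling in qemu_cpu_kick_thread() (the
ESRCH tolerance is the point; the exact formatting of the existing error
path may differ):

    err = pthread_kill(cpu->thread->thread, SIG_IPI);
    if (err && err != ESRCH) {
        /* ESRCH just means the vCPU thread already exited (hot-unplug race) */
        fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
        exit(1);
    }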
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On Linux (and maybe some BSDs), we require libutil for the openpty()
function. However, this library is not available on some other systems, so
we currently use a fragile if-statement in the configure script to check
whether we need the library or not. Unfortunately, we also hard-coded a
"-lutil" in the tests/Makefile.include file, so this breaks the build on
Solaris, for example (see buglink below). To fix the issue, add the "-lutil"
to "libs_tools" in the configure script instead, then this gets properly
propagated to the tests, too.
And while we're at it, also replace the fragile if-statement in the
configure script with a proper link-check for the availability of this function.
Buglink: https://bugs.launchpad.net/qemu/+bug/1777252
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We forgot to add this check in faa9372c07 ("translate-all:
introduce assert_no_pages_locked", 2018-06-15); we only added
it after returning from a longjmp in cpu_exec_step_atomic. Fix it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>