When the "migratable" property was implemented, the behavior was tested
by changing the default on the code, but actually using the option on
the command-line (e.g. "-cpu host,migratable=false") doesn't work as
expected. This is a regression for a common use case of "-cpu host",
which is to enable features that are supported by the host CPU + kernel
before feature-specific code is added to QEMU.
Fix this by initializing the feature words for "-cpu host" on
x86_cpu_parse_featurestr(), right after parsing the CPU options.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 4d1b279b06)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This patch adds a subsection with exception_index field to the VMState for
correct saving the CPU state.
Without this patch, simulator could miss the pending exception in the saved
virtual machine state.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 6c3bff0ed8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When trying to print data to the pty, we first check if it is connected.
If not, we try to reconnect, but we drop the pending data even if we
have successfully reconnected; this makes us lose the first byte of the very
first transmission.
This small fix addresses the issue by checking once more if the pty is connected
after having tried to reconnect.
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cf7330c759)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Related spice-only bug. We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card. It's also used with qxl in vga mode.
When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer. In theory the guest can write,
indirectly via spice-server. The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.
Fix that by switching to dynamic allocation for the buffer.
CVE-2014-3615
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit ab9509ccea)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers. Call that
unconditionally on every register write. That way we should catch
everything, even changing one register affecting the valid range of
another register.
Some of the holes have been added by commit
e9c6149f6a. Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.
Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.
Security impact:
(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source -> host memory leak. Memory isn't leaked to
the guest but to the vnc client though.
(2) Qemu will segfault in case the memory range happens to include
unmapped areas -> Guest can DoS itself.
The guest can not modify host memory, so I don't think this can be used
by the guest to escape.
CVE-2014-3615
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
VgaState->vram_size is the size of the pci bar. In case of qxl not the
whole pci bar can be used as vga framebuffer. Add a new variable
vbe_size to handle that case. By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.
This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit 54a85d4624)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
whenever we start vhost, virtio could have outstanding packets
queued, when they complete later we'll modify the ring
while vhost is processing it.
To prevent this, purge outstanding packets on vhost start.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 086abc1ccd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
devices rely on packet callbacks eventually running,
but we violate this rule whenever we purge the queue.
To fix, invoke callbacks on all packets on purge.
Set length to 0, this way callers can detect that
this happened and re-queue if necessary.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 07d8084624)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.
If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.
To avoid this, we forbid receiving packets in generic net code when
VM is not running.
Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e1d64c084b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.
The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.
If we use cluster destination model, the problem will be solved.
Note:
This flag was introduced later than ACPI v1.0 specification while QEMU
generates v1.0 tables only, but...
linux kernel ignores this flag, so patch has no influence on it.
Tested with Win[XPsp3|Srv2003EE|Srv2008DC|Srv2008R2|Srv2012R2], there
isn't BSODs and guests boot just fine. In cases guest doesn't support
cpu-hotplug, cpu becomes visible after reboot and in case the guest
supports cpu-hotplug, it works as expected with this patch.
Cc: qemu-stable@nongnu.org
Signed-off-by: huangzhichao <huangzhichao@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit 07b81ed937)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
As vhost core can use backend_features during init, clear it earlier to
avoid using uninitialized memory.
This use would be harmless since vhost scsi ignores the result
anyway, but initializing earlier will help prevent valgrind errors,
and make scsi and net behave similarly.
Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3a1655fc53)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit a9f98bb5eb "vhost: multiqueue
support" changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.
To fix this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.
MST: fix up error handling.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <william@gandi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cd7d1d26b0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit 783e770693
virtio-net: stop/start bh when appropriate
is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.
Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e8bcf84200)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 2c7ffc414 added support for honouring the CPACR coprocessor
access control register bits which may disable access to VFP
and Neon instructions. However it failed to account for the
fact that the CPACR is only present starting from the ARMv6
architecture version, so it accidentally disabled VFP completely
for ARMv5 CPUs like the ARM926. Linux would detect this as
"no VFP present" and probably fall back to its own emulation,
but other guest OSes might crash or misbehave.
This fixes bug LP:1359930.
Reported-by: Jakub Jermar <jakub@jermar.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408714940-7192-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit ed1f13d607)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The SDM specifies (June 2014 Vol3 11.11.5):
On a hardware reset, the P6 and more recent processors clear the
valid flags in variable-range MTRRs and clear the E flag in the
IA32_MTRR_DEF_TYPE MSR to disable all MTRRs. All other bits in the
MTRRs are undefined.
We currently do none of that, so whatever MTRR settings you had prior
to reset is what you have after reset. Usually this doesn't matter
because KVM often ignores the guest mappings and uses write-back
anyway. However, if you have an assigned device and an IOMMU that
allows NoSnoop for that device, KVM defers to the guest memory
mappings which are now stale after reset. The result is that OVMF
rebooting on such a configuration takes a full minute to LZMA
decompress the firmware volume, a process that is nearly instant on
the initial boot.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9db2efd95e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The MTRR state in KVM currently runs completely independent of the
QEMU state in CPUX86State.mtrr_*. This means that on migration, the
target loses MTRR state from the source. Generally that's ok though
because KVM ignores it and maps everything as write-back anyway. The
exception to this rule is when we have an assigned device and an IOMMU
that doesn't promote NoSnoop transactions from that device to be cache
coherent. In that case KVM trusts the guest mapping of memory as
configured in the MTRR.
This patch updates kvm_get|put_msrs() so that we retrieve the actual
vCPU MTRR settings and therefore keep CPUX86State synchronized for
migration. kvm_put_msrs() is also used on vCPU reset and therefore
allows future modificaitons of MTRR state at reset to be realized.
Note that the entries array used by both functions was already
slightly undersized for holding every possible MSR, so this patch
increases it beyond the 28 new entries necessary for MTRR state.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d1ae67f626)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We currently define the number of variable range MTRR registers as 8
in the CPUX86State structure and vmstate, but use MSR_MTRRcap_VCNT
(also 8) to report to guests the number available. Change this to
use MSR_MTRRcap_VCNT consistently.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d8b5c67b05)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit e8f6d00c30 ("target-i386: raise
page fault for reserved physical address bits") added a check that the
NX bit is not set on PAE PDPEs, but it also added it to rsvd_mask for
the rest of the function. This caused any PDEs or PTEs with NX set to be
erroneously rejected, making PAE guests with NX support unusable.
Signed-off-by: William Grant <wgrant@ubuntu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1844e68eca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
QOM backends can refer to chardevs, but not vice versa. So
process -chardev and -fsdev options before -object
This fixes the rng-egd backend to virtio-rng.
Reported-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7b71758d79)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.
Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(
This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
- since each PHB instantiates its own IOMMU address space, we
can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
- no real need to keep the MSI window setup in a separate function,
the spapr_pci_msi_init() code moves to spapr_phb_realize().
2) kill the global MSI window as it is not needed in the end
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8c46f7ec85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:
If element A's callback waits for element B using aio_poll() it will
deadlock since pool->completion_bh is not marked scheduled when the
nested aio_poll() runs.
Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing. This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3c80ca158c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
EventNotifier is implemented using an eventfd or pipe. It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.
Switch from EventNotifier to QEMUBH in thread-pool.c. Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c2e50e3d11)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit 868270f23d
acpi-build: tweak acpi migration limits
broke kernel loading with -kernel/-initrd: it doubled
the size of ACPI tables but did not reserve
enough memory.
As a result, issues on boot and halt are observed.
Fix this up by doubling reserved memory for new machine types.
Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 927766c7d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with mlock=on
realtime feature.
The code comment actually says: "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 288d332202)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When running VMware ESXi under qemu-kvm the guest discards frames
that are too short. Short ARP Requests will be dropped, this prevents
guests on the same bridge as VMware ESXi from communicating. This patch
simply adds the padding on the network device itself.
Signed-off-by: Ben Draper <ben@xrsa.net>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 40a87c6c9b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8d9eb33ca0)
Conflicts:
tests/qemu-iotests/group
*fix up context mismatches due to lack of 099 and 103 tests
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The following O_DIRECT read from a <512 byte file fails:
$ truncate -s 320 test.img
$ qemu-io -n -c 'read -P 0 0 512' test.img
qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument
Note that qemu-io completes successfully without the -n (O_DIRECT)
option.
This patch fixes qemu-iotests ./check -nocache -vmdk 059.
Cc: qemu-stable@nongnu.org
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 61ed73cff4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.
CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d832fb4d66)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.
This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT. Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.
Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts. After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.
Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only). Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels. Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.
Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 863714ba6c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>. To build using these defines in non-KVM code,
QEMU defines these IDs locally and check their correctness against the
KVM headers when those are available.
However, the naming scheme used for QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux so to avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prependend and clearly identify the PSCI version as v0.1 in the
defines.
Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a65c9c17ce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.
(We do get this correct for the A32/T32 BKPT insns.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 229a138d74)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The error messages before and after patch are:
before:
qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)
after:
qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit c68233aee8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If user specifies a node number that exceeds the available numa nodes in
emulated system for pc-dimm device, the device will report an invalid _PXM
to OSPM. Fix this by checking the node property value.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cfe0ffd027)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size. However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1. So check the range only for reads and writes.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d66168ed68)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Some non-linux systems, for example a system with
FreeBSD kernel and glibc, may declare struct mmsghdr
(in glibc) but may not have linux-specific header
file linux/ip.h. The actual implementation in qemu
includes this linux-specific header file unconditionally,
so compilation fails if it is not present. Include
this header in the configure test too.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bff6cb7296)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When new MSI-X vectors are enabled we need to disable MSI-X and
re-enable it with the correct number of vectors. That means we need
to reprogram the eventfd triggers for each vector. Prior to f4d45d47
vector->use tracked whether a vector was masked or unmasked and we
could always pick the KVM path when available for unmasked vectors.
Now vfio doesn't track mask state itself and vector->use and virq
remains configured even for masked vectors. Therefore we need to ask
the MSI-X code whether a vector is masked in order to select the
correct signaling path. As noted in the comment, MSI relies on
hardware to handle masking.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # QEMU 2.1
(cherry picked from commit c048be5cc9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Update -device FOO,help to include QOM properties in addition to qdev
properties. Devices are gradually adding more QOM properties that are
not reflected as qdev properties.
It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.
This patch reuses the device-list-properties QMP machinery to avoid code
duplication.
Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
(cherry picked from commit ef523587da)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The "hotplugged" device property was not reported before commit
f4eb32b590 ("qmp: show QOM properties in
device-list-properties"). Fix this difference.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 4115dd6527)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
pl031's base address should be 0x9010000, not 0x90010000, otherwise
it sits in ram when configuring a guest with greater than 1G.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pc migration fixes
Last minute fixes for migration.
It seems that if we don't fix it now, fixing
it in the next version will be even more painful ...
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 29 Jul 2014 11:45:18 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
piix: set legacy table size for 1.7
acpi-build: tweak acpi migration limits
pc: future-proof migration-compatibility of ACPI tables
acpi-build: minor code cleanup
pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled
bios-tables-test: fix ASL normalization false positive
pc: hack for migration compatibility from QEMU 2.0
acpi-dsdt: procedurally generate _PRT
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Tweak error message for legacy machine type:
Basically if table size exceeds the limits we set all
bets are off for migration: e.g. it can start failing even
within given qemu minor version simply because of a bugfix.
- Increase table size to 128k.
- Make sure we notice it long before we start getting close to the
128k limit: warn at 64k.
- Don't fail if we exceed the limit: most people don't care about
migration, even less people care about cross version miration.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This patch avoids that similar changes break QEMU again in the future.
QEMU will now hard-code 64k as the maximum ACPI table size, which
(despite being an order of magnitude smaller than 640k) should be enough
for everyone.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes migration regression from QEMU-1.7 to a newer QEMUs.
SSDT table size in QEMU-1.7 doesn't change regardless of
a number of PCI bridge devices present at startup.
However in QEMU-2.0 since addition of hotplug on PCI bridges,
each PCI bridge adds ~1875 bytes to SSDT table, including
pc-i440fx-1.7 machine type where PCI bridge hotplug disabled
via compat property.
It breaks migration from "QEMU-1.7" to "QEMU-2.[01] -M pc-i440fx-1.7"
since RAMBlock size of ACPI tables on target becomes larger
then on source and migration fails with:
"Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000"
error.
Fix this by generating AML only for PCI0 bus if
hotplug on PCI bridges is disabled and preserves PCI brigde
description in AML as it was done in QEMU-1.7 for pc-i440fx-1.7.
It will help to maintain size of SSDT static regardless of
number of PCI bridges on startup for pc-i440fx-1.7 machine type.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
My version of IASL (from RHEL7) puts two newlines between the head comment
and the DefinitionBlock property. Kill all newlines after the comment,
so that normalize_asl works properly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Line numbers changed, and some translations were missing after commit
3d914488ae.
Update also "Show Tabs" to a more common translation, and remove some
old unused lines at the end.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes on how horribly we were breaking things in
2.0 already.
The ACPI table size is rounded to the next 4k, which one would think
gives some headroom. In practice this is not the case, because the user
can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
8 to the MADT) and so some "-smp" values will break the 4k boundary and
fail to migrate. Similarly, PCI bridges add ~1870 bytes to the SSDT.
This patch concerns itself with fixing migration from QEMU 2.0. It
computes the payload size of QEMU 2.0 and always uses that one.
The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size
should always be enough; non-AML tables can change depending on the
configuration (especially MADT, SRAT, HPET) but they remain the same
between QEMU 2.0 and 2.1, so we only compute our padding based on the
sizes of the SSDT and DSDT.
Migration from QEMU 1.7 should work for guests that have a number of CPUs
other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140. It was already
broken from QEMU 1.7 to QEMU 2.0 in the same way, though.
Even with this patch, QEMU 1.7 and 2.0 have two different ideas of
"-M pc-i440fx-2.0" when there are PCI bridges. Igor sent a patch to
adopt the QEMU 1.7 definition. I think distributions should apply
it if they move directly from QEMU 1.7 to 2.1+ without ever packaging
version 2.0.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This replaces the _PRT constant with a method that computes it.
The problem is that the DSDT+SSDT have grown from 2.0 to 2.1,
enough to cross the 8k barrier (we align the ACPI tables to 4k
before putting them in fw_cfg). This causes problems with
migration and the pc-i440fx-2.0 machine type.
The solution to the problem is to hardcode 64k as the limit,
but this doesn't solve the bug with pc-i440fx-2.0. The fix will be
for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the
ACPI tables. First, however, we must make the actual AML
equal or smaller; to do this, rewrite _PRT in a way that saves
over 1k of bytecode.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
trivial patches for 2014-07-26
# gpg: Signature made Sat 26 Jul 2014 08:16:55 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514 66A7 BEE5 9D74 A4C3 D7DB
* remotes/mjt/tags/trivial-patches-2014-07-26:
qemu-options: fix another allows-to for -net l2tpv3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Here is the serial fix for 2.1.
# gpg: Signature made Fri 25 Jul 2014 13:36:23 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
qemu-char: ignore flow control if a PTY's slave is not connected
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
After commit f702e62 (serial: change retry logic to avoid concurrency,
2014-07-11), guest boot hangs if the backend is an unconnected PTY.
The reason is that PTYs do not support G_IO_HUP, and serial_xmit is
never called. To fix this, simply invoke serial_xmit immediately
(via g_idle_source_new) when this happens.
Tested-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We need to remember has_updates for each vnc client. Otherwise it might
happen that vnc_update_client(has_dirty=1) takes the first exit due to
output buffers not being flushed yet and subsequent calls with
has_dirty=0 take the second exit, wrongly assuming there is nothing to
do because the work defered in the first call is ignored.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
If the client asks for !incremental frame updates, it has lost its content
so dirty doesn't matter - it has to see the full frame, so setting force_update
Signed-off-by: Stephan Kulow <coolo@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
The VSERPORT_CHANGE event was added in e2ae6159. The patch for
this event was prepared at a time when this file was gone, even
though it got applied immediately after dfab4892 restored this
file. Duplicate the documentation into this file, so that
anyone using this file instead of qapi will not miss out on this
new event.
* docs/qmp/qmp-events.txt (VSERPORT_CHANGE): Add.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The POWERDOWN event was first documented in 0aab9ec3. But since
dfab4892 later restored this file to the state prior to qmp events,
and we never documented it in the past, anyone using this file
instead of qapi will miss out on this event. Tweak the existing
wording of SHUTDOWN to match 84321831, and make the difference
between the two events apparent.
* docs/qmp/qmp-events.txt (POWERDOWN): Add.
(SHUTDOWN): Tweak.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The SPICE_MIGRATE_COMPLETED event was first documented in
7cfadb6b. But since dfab4892 later restored this file to the
state prior to qmp events, and we never documented it in the
past, anyone using this file instead of qapi will miss out on
this event.
* docs/qmp/qmp-events.txt (SPICE_MIGRATE_COMPLETED): Add.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
For consistency with the rest of this file, every event should be
listed in isolation. Compare how commit 7cfadb6b split
SPICE_CONNECTED and SPICE_DISCONNECTED into separate qmp events.
* docs/qmp/qmp-events.txt (SPICE_CONNECTED, SPICE_DISCONNECTED):
Split.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
When converting to qmp events, commits 7cfadb6b and a6330785
fixed some grammar as part of moving text between files. But
since dfab4892 later restored this file to the state prior to
qmp events, we have to do it again.
* docs/qmp/qmp-events.txt (RESET, SPICE_INITIALIZED): Tweak.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reason: we don't want commit to that interface yet. Possibly
the implementation will be switched over to use fsdev.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The VMStateDescription for the imx_ccm device was missing its
terminator. Found by static search of the codebase using
a regex based on one suggested by Ian Jackson:
pcregrep -rMi '(?s)VMStateField(?:(?!END_OF_LIST).)*?;' $(git grep -l 'VMStateField\[\]')
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
"vmstate_xhci_event" was introduced in commit 37352df3 ("xhci: add live
migration support"), and first released in v1.6.0. The field list in this
VMSD is not terminated with the VMSTATE_END_OF_LIST() macro.
During normal use (ie. migration), the issue is practically invisible,
because the "vmstate_xhci_event" object (with the unterminated field list)
is only ever referenced -- via "vmstate_xhci_intr" -- if xhci_er_full()
returns true, for the "ev_buffer" test. Since that field_exists() check
(apparently) almost always returns false, we almost never traverse
"vmstate_xhci_event" during migration, which hides the bug.
However, Amit's vmstate checker forces recursion into this VMSD as well,
and the lack of VMSTATE_END_OF_LIST() breaks the field list terminator
check (field->name != NULL) in dump_vmstate_vmsd(). The result is
undefined behavior, which in my case translates to infinite recursion
(because the loop happens to overflow into "vmstate_xhci_intr", which then
links back to "vmstate_xhci_event").
Add the missing terminator.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Patch queue for ppc - 2014-07-22
Only a single bug fix to make -mem-path only affect RAM regions.
# gpg: Signature made Tue 22 Jul 2014 16:38:04 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found
* remotes/agraf/tags/signed-ppc-for-upstream:
ppc: fix -mem-path failure
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
commit e938ba0c tried to enable -mem-path for ppc but breaked some ppc
boards.
The problems are:
1. it fails when allocating memory for rom, sram whose sizes are less
than huge page size:
./ppc-softmmu/qemu-system-ppc -m 512 -mem-path /hugepages/ \
-kernel /home/hutao/Downloads/vmlinux-ppc -initrd \
/home/hutao/Downloads/initrd-ppc.gz
qemu-system-ppc: /mnt/data/projects/qemu/exec.c:1184: qemu_ram_set_idstr: Assertion `new_block' failed.
2. if there is a numa node backed by memory backend object, qemu fails
with message:
./ppc-softmmu/qemu-system-ppc -m 512 \
-object memory-backend-file,size=512M,mem-path=/hugepages,id=f0 \
-numa node,nodeid=0,memdev=f0 \
-kernel /home/hutao/Downloads/vmlinux-ppc \
-initrd /home/hutao/Downloads/initrd-ppc.gz
qemu-system-ppc: memory backend f0 is used multiple times. Each -numa option must use a different memdev value.
This patch does following:
1. replaces memory_region_allocate_system_memory() with
memory_region_init_ram() for rom, sram. Then only system memory
is backed by hugepages when specifying mem-path.
2. for memory banks, allocates all ram with
one memory_region_allocate_system_memory(), and use
memory_region_init_alias() to initialize memory banks.
Tested machines: default(g3beige), mac99, taihu, bamboo, ref405ep.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
If a negative integer is used for the max_bytes parameter, QEMU currently
calls abort() and leaves behind a core dump. This patch replaces the
abort with a simple error message to make the reason for the termination
clearer. This also ensures device-hotplug with invalid input doesn't
cause qemu to quit.
There is an underlying insufficiency in the parameter parsing code of QEMU
that renders it unable to reject negative values for unsigned properties,
thus the error message "a non-negative integer below 2^63" is the most
user-friendly and correct message we can give until the underlying
insufficiency is corrected.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
One of the two pending migration fix, and a small KVM patch.
# gpg: Signature made Tue 22 Jul 2014 11:49:30 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found
* remotes/bonzini/tags/for-upstream:
kvm-all: Use 'tmpcpu' instead of 'cpu' in sub-looping to avoid 'cpu' be NULL
exec: fix migration with devices that use address_space_rw
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If kvm_arch_remove_sw_breakpoint() in CPU_FOREACH() always be fail, it
will let 'cpu' NULL. And the next kvm_arch_remove_sw_breakpoint() in
QTAILQ_FOREACH_SAFE() will get NULL parameter for 'cpu'.
And kvm_arch_remove_sw_breakpoint() can assumes 'cpu' must never be NULL,
so need define additional temporary variable for 'cpu' to avoid the case.
Cc: qemu-stable@nongnu.org
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Devices that use address_space_rw to write large areas to memory
(as opposed to address_space_map/unmap) were broken with respect
to migration since fe680d0 (exec: Limit translation limiting in
address_space_translate to xen, 2014-05-07). Such devices include
IDE CD-ROMs.
The reason is that invalidate_and_set_dirty (called by address_space_rw
but not address_space_map/unmap) was only setting the dirty bit for
the first page in the translation.
To fix this, introduce cpu_physical_memory_set_dirty_range_nocode that
is the same as cpu_physical_memory_set_dirty_range except it does not
muck with the DIRTY_MEMORY_CODE bitmap. This function can be used if
the caller invalidates translations with tb_invalidate_phys_page_range.
There is another difference between cpu_physical_memory_set_dirty_range
and cpu_physical_memory_set_dirty_flag; the former includes a call
to xen_modified_memory. This is handled separately in
invalidate_and_set_dirty, and is not needed in other callers of
cpu_physical_memory_set_dirty_range_nocode, so leave it alone.
Just one nit: now that invalidate_and_set_dirty takes care of handling
multiple pages, there is no need for address_space_unmap to wrap it
in a loop. In fact that loop would now be O(n^2).
Reported-by: Dave Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QOM and device refactorings
* Machine: Property name fixups for 2.1 ABI
# gpg: Signature made Mon 21 Jul 2014 18:00:23 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg: aka "Andreas Färber <afaerber@suse.com>"
* remotes/afaerber/tags/qom-devices-for-2.1:
machine: Replace underscores in machine's property names
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replaced '_' with '-' to comply with QOM guidelines.
Made the conversion from command line to QMP in vl.c.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Andreas's fixes to --enable-modules, two 2.1 regression fixes, and a
new qtest. Michael sent a pull request of his own, so I dropped
the vhost changes.
# gpg: Signature made Fri 18 Jul 2014 14:30:34 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found
* remotes/bonzini/tags/for-upstream:
Revert "kvmclock: Ensure time in migration never goes backward"
Revert "kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation"
module: Don't complain when a module is absent
module: Simplify module_load()
qtest: new test for wdt_ib700
target-i386: Allow execute from user mode when SMEP is enabled.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Smatch also complains about 0 used for pointers, so replace those by
NULL in test-visitor-serialization.c, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This fixes a warning from the static code analysis (smatch).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In this case, 'ret' is already '-1', so need not do it again.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If the user specified a (vlan ID, slirp stack name) tuple in a monitor
hostfwd_add/remove command and we can't find it, give the user an
error message rather than silently doing nothing.
This brings this error case in slirp_lookup() into line with the
other two.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This reverts commit 9b1786829a.
This patch fixed a hang introduced by commit a096b3a (kvmclock: Ensure
time in migration never goes backward, 2014-05-16), but it causes
a regression in migration whose cause is not quite clear.
Because of this, I'm choosing to revert both patches. This trades a
2.1 regression for a bug that's been there forever.
Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current implementation depends on a configure-time generated list of
block modules. When any of them is absent, module_load() emits a warning.
This is suboptimal because extracting code to modules was mainly done to
allow separate packaging of modules with intrusive dependencies. Absence
of optional packages then leads to absence of modules and an error
message, which users may recognize as new and report as error.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The file path is not used for error reporting, so we can free it
directly after use.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the "pause" watchdog action had a regression and it went
unnoticed for a while, let's add a test for it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block pull request
# gpg: Signature made Fri 18 Jul 2014 13:39:43 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request:
qemu-iotests: fix 028 failure due to disk image path
raw-posix: Fail gracefully if no working alignment is found
block: Add Error argument to bdrv_refresh_limits()
qcow2: Fix error path for unknown incompatible features
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The disk image path is echoed by QEMU's readline when the "drive_backup
disk ${TEST_IMG}.copy" HMP command is issued. Unfortunately it is very
hard to filter out the path due to readline's character-by-character
output (with terminal escape sequences). Just redirect this command to
/dev/null for now.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
If qemu couldn't find out what O_DIRECT alignment to use with a given
file, it would run into assert(bdrv_opt_mem_align(bs) != 0); in block.c
and confuse users. This adds a more descriptive error message for such
cases.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qcow2's report_unsupported_feature() had two bugs: A 32 bit truncation
would prevent feature table entries for bits 32-63 from being used, and
it could assign errp multiple times if there was more than one unknown
feature, resulting in an error_set() assertion failure.
Fix the truncation, make sure to set the error exactly once and add a
qemu-iotests case for it.
This fixes https://bugs.launchpad.net/qemu/+bug/1342704/
Reported-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
pc,vhost,test fixes
Minor bugfixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 18 Jul 2014 00:43:04 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
vhost-user: minor cleanups
qtest: Adapt vhost-user-test to latest vhost-user changes
vhost-user: Fix VHOST_SET_MEM_TABLE processing
qtest: fix vhost-user-test compilation with old GLib
fix typo: apci -> acpi
pc_piix: Reuse pc_compat_1_2() for pc-0.1[0123]
pc: fix qemu exiting with error when -m X < 128 with old machines types
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A new field mmap_offset was added in the vhost-user message, we need to reflect
this change in the test too.
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qemu_get_ram_fd doesn't accept a guest physical address. ram_addr_t are
opaque values that are assigned in qemu_ram_alloc.
Find the ram_addr_t corresponding to the userspace_addr using qemu_ram_addr_from_host,
and then call qemu_get_ram_fd on it.
Thanks to Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
If machine doesn't support memory hotplug then starting QEMU
with initial memory less than default will make QEMU exit with
following error message:
$QEMU -m 16 -M isapc
qemu-system-i386: "-memory 'slots|maxmem'" is not supported by: isapc
Set maxram_size to initial memory value before parsing
'maxmem' option allows to keep maxmem in sync with initial
memory size if no maxmem option was specified.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: Bruce Rogers <brogers@suse.com>
Reviewed-By: Bruce Rogers <brogers@suse.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We keep port 0 reserved for compat with older guests, where only
virtio-console was expected. Even if a system is started without a
virtio-console port, port #0 is kept aside. However, after a
virtconsole port is unplugged, port id 0 became available, and the next
hotplug of a virtserialport caused failure due to it not being a console
port.
Steps to reproduce:
$ ./x86_64-softmmu/qemu-system-x86_64 -m 512 -cpu host -enable-kvm -device virtio-serial-pci -monitor stdio -vnc :1
QEMU 2.0.91 monitor - type 'help' for more information
(qemu) device_add virtconsole,id=p1
(qemu) device_del p1
(qemu) device_add virtserialport,id=p1
Port number 0 on virtio-serial devices reserved for virtconsole devices for backward compatibility.
Device 'virtserialport' could not be initialized
(qemu) quit
Reported-by: dengmin <mdeng@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Commit 292b1634 changed the section name of "ICH9 LPC" to "ICH9-LPC",
and that causes the static checker to flag this:
Section "ICH9 LPC" does not exist in dest
This patch introduces a function that checks for section renames and
also a dictionary that maps those renames.
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
This is a small patch to a script; doesn't break qemu and helps with the
static checker, so it's a very low-risk patch for 2.1.
vhost-user-test uses the linux/vhost.h header, so it must only be
enabled if CONFIG_LINUX is defined. (Previously it was enabled
for CONFIG_POSIX, which broke 'make check' on MacOSX.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Previously, execute would be disabled for all pages with SMEP enabled,
regardless of what mode the access took place in.
Signed-off-by: Ricky Zhou <ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* remotes/riku/linux-user-for-upstream:
linux-user: use TARGET_SA_ONSTACK in get_sigframe
alloca one extra byte sockets
linux-user: handle AF_PACKET sockaddrs in target_to_host_sockaddr
qemu-user: Impl. setsockopt(SO_BINDTODEVICE)
SIOCGIFINDEX: fix typo
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Patch queue for ppc - 2014-07-15
Some more bug fixes during the RC phase:
- Fix huge page mapping regressions
- Fix Book3S thread number enumeration
- Fix Book3S VFIO permission issue
# gpg: Signature made Tue 15 Jul 2014 15:13:54 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found
* remotes/agraf/tags/signed-ppc-for-upstream:
sPAPR/IOMMU: Fix TCE entry permission
spapr: Enable use of huge pages
spapr: Move RMA memory region registration code
ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory
target-ppc: Fix number of threads per core limit
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The permission of TCE entry should exclude physical base address.
Otherwise, unmapping TCE entry can be interpreted to mapping TCE
entry wrongly for VFIO devices.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
0b183fc87 "memory: move mem_path handling to
memory_region_allocate_system_memory" disabled -mempath use for all
machines that do not use memory_region_allocate_system_memory() to
register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
support was disabled for it.
This replaces memory_region_init_ram()+vmstate_register_ram_global() with
memory_region_allocate_system_memory() to get huge pages back.
This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as
the previous patch moved RMA memory region allocation after RAM allocation
and therefore this change does not have immediate effect but simplifies
the code.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
PPC970 does not support VRMA (virtual RMA) so real memory required
for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl.
Later this memory is used as a part of the guest RAM area.
The RMA allocating code also registers a memory region for this piece
of RAM.
We are going to simplify memory regions layout: RMA memory region
will be a subregion in the RAM memory region, both starting from zero.
This way we will not have to take care of start address alignment for
the piece of RAM next to the RMA.
This moves memory region business closer to the RAM memory region
creation/allocation code.
As this is a mechanical patch, no change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on non-kvm systems]
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.
Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.
Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The number of threads per core is different for POWER6/7/8 CPUs.
Guest systems do not expect to see more threads per core than
a specific CPU supports so we need to limit this number.
This limit is implemented by ppc_get_compat_smt_threads().
However it has a problem as it checks for PCR (Processor Compatibility
Register) mask, 2.05 means 2 threads per core, 2.06 - 4 threads.
For POWER8 one would expect PCR_COMPAT_2_07 bit set and
ppc_get_compat_smt_threads() checking for it to return 8 threads
per core. But the latest PowerISA spec now is 2.07 and there is
no 2.07 compatibility mode defined, QEMU does not define it either
(will be in PowerISA 2.08).
Instead of relying on a PCR mask, this uses kvmppc_smt_threads()
which returns the maximum supported threads number for KVM or
1 for TCG.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Now requests are submitted as a batch, so it is natural
to notify guest as a batch too.
This may suppress interrupt notification to VM a lot:
- in my test, decreased by ~13K/sec
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The callback has to be saved and reset in virtio_blk_data_plane_start(),
otherwise dataplane's requests will be completed in qemu aio context.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
when hotplug virtio-scsi disks using laio, the aio_nr will
increase in laio_init() by io_setup(), we can see the number by
# cat /proc/sys/fs/aio-nr
128
if the aio_nr attach the maxnum, which found from
# cat /proc/sys/fs/aio-max-nr
65536
the hotplug process will fail because of aio context leak.
Fix it by io_destroy in laio_cleanup().
Reported-by: daifulai <daifulai@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The libqos implementation of io_read{b,w,l} and io_write{b,w,l} hooks
was relying on qtest_mem{read,write}() respectively. With d81d410 (usb:
improve ehci/uhci test) this resulted in assertion failures on ppc hosts:
ERROR:tests/usb-hcd-ehci-test.c:78:ehci_port_test: assertion failed: ((value & mask) == (expect & mask))
ERROR:tests/usb-hcd-ehci-test.c:128:pci_uhci_port_2: assertion failed: (pcibus != NULL)
ERROR:tests/usb-hcd-ehci-test.c:150:pci_ehci_port_2: assertion failed: (pcibus != NULL)
qtest_read{b,w,l,q}() and qtest_write{b,w,l,q}() had been introduced
as endian-safe replacement for qtest_mem{read,write}() in I2C in
872536b (qtest: Add MMIO support). Use them for PCI as well.
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: c4efe1c qtest: add libqos including PCI support
Fixes: d81d410 usb: improve ehci/uhci test
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Misc 2.1 fixes regarding character/serial devices and SCSI.
# gpg: Signature made Mon 14 Jul 2014 16:26:08 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found
* remotes/bonzini/tags/for-upstream:
serial-pci: remove memory regions from BAR before destroying them
virtio-scsi: fix with -M pc-i440fx-2.0
serial: change retry logic to avoid concurrency
qemu-char: fix deadlock with "-monitor pty"
scsi: Report error when lun number is in use
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Right now starting a machine with virtio-scsi and a <= 2.0 machine type
fails with:
qemu-system-x86_64: -device virtio-scsi-pci: Property .any_layout not found
This is because the any_layout bit was actually never set after
virtio-scsi was changed to support arbitrary layout for virtio buffers.
(This was just a cleanup and a preparation for virtio 1.0; no guest
actually checks the bit, but the new request parsing algorithms are
tested even with old guest).
Reported-by: David Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever serial_xmit fails to transmit a byte it adds a watch that would
call it again when the "line" becomes ready. This results in a retry
chain:
serial_xmit -> add_watch -> serial_xmit
Each chain is able to transmit one character, and for every character
passed to serial by the guest driver a new chain is spawned.
The problem lays with the fact that a new chain is spawned even when
there is one already waiting on the watch. So there can be several retry
chains waiting concurrently on one "line". Every chain tries to transmit
current character, so character order is not messed up. But also every
chain increases retry counter (tsr_retry). If there are enough
concurrent chains this counter will hit MAX_XMIT_RETRY value and
the character will be dropped.
To reproduce this bug you need to feed serial output to some program
consuming it slowly enough. A python script from bug #1335444
description is an example of such program.
This commit changes retry logic in the following way to avoid
concurrency: instead of spawning a new chain for each character being
transmitted spawn only one and make it transmit characters until FIFO is
empty.
The change consists of two parts:
- add a do {} while () loop in serial_xmit (diff is a bit erratic
for this part, diff -w will show actual change),
- do not call serial_xmit from serial_ioport_write if there is one
waiting on the watch already.
This should fix another issue causing bug #1335444.
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_chr_be_generic_open cannot be called with the write lock taken,
because it calls client code that may call qemu_chr_fe_write. This
actually happens for the monitor:
0x00007ffff27dbf79 in __GI_raise (sig=sig@entry=6)
0x00007ffff27df388 in __GI_abort ()
0x00005555555ef489 in error_exit (err=<optimized out>, msg=msg@entry=0x5555559796d0 <__func__.5959> "qemu_mutex_lock")
0x00005555558f9080 in qemu_mutex_lock (mutex=mutex@entry=0x555556248a30)
0x0000555555713936 in qemu_chr_fe_write (s=0x555556248a30, buf=buf@entry=0x5555563d8870 "QEMU 2.0.90 monitor - type 'help' for more information\r\n", len=56)
0x00005555556217fd in monitor_flush_locked (mon=mon@entry=0x555556251fd0)
0x0000555555621a12 in monitor_flush_locked (mon=0x555556251fd0)
monitor_puts (mon=mon@entry=0x555556251fd0, str=0x55555634bfa7 "", str@entry=0x55555634bf70 "QEMU 2.0.90 monitor - type 'help' for more information\n")
0x0000555555624359 in monitor_vprintf (mon=0x555556251fd0, fmt=<optimized out>, ap=<optimized out>)
0x0000555555624414 in monitor_printf (mon=<optimized out>, fmt=fmt@entry=0x5555559105a0 "QEMU %s monitor - type 'help' for more information\n")
0x0000555555629806 in monitor_event (opaque=0x555556251fd0, event=<optimized out>)
0x000055555571343c in qemu_chr_be_generic_open (s=0x555556248a30)
To avoid this, defer the call to an idle callback, which will be
called as soon as the main loop is re-entered. In order to simplify
the cleanup and do it in one place only, change pty_chr_close to
call pty_chr_state.
To reproduce, run with "-monitor pty", then try to read from the
slave /dev/pts/FOO that it creates.
Fixes: 9005b2a758
Reported-by: Li Liang <liangx.z.li@intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block patches for 2.1.0-rc2 (v2)
# gpg: Signature made Mon 14 Jul 2014 11:04:12 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (22 commits)
ide: Treat read/write beyond end as invalid
virtio-blk: Treat read/write beyond end as invalid
virtio-blk: Bypass error action and I/O accounting on invalid r/w
virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
dma-helpers: Fix too long qiov
qtest: fix vhost-user-test compilation with old GLib
tests: Fix unterminated string output visitor enum human string
AioContext: do not rely on aio_poll(ctx, true) result to end a loop
virtio-blk: embed VirtQueueElement in VirtIOBlockReq
virtio-blk: avoid g_slice_new0() for VirtIOBlockReq and VirtQueueElement
dataplane: do not free VirtQueueElement in vring_push()
virtio-blk: avoid dataplane VirtIOBlockReq early free
block: Assert qiov length matches request length
qed: Make qiov match request size until backing file EOF
qcow2: Make qiov match request size until backing file EOF
block: Make qiov match the request size until EOF
AioContext: speed up aio_notify
test-aio: fix GSource-based timer test
block: drop aio functions that operate on the main AioContext
block: prefer aio_poll to qemu_aio_wait
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A s390x/kvm bugfix for missing floating point register synchronization.
# gpg: Signature made Mon 14 Jul 2014 08:21:54 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found
* remotes/cohuck/tags/s390x-20140714:
s390x/kvm: synchronize guest floating point registers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The block layer fails such reads and writes just fine. However, they
then get treated like valid operations that fail: the error action
gets executed. Unwanted; reporting the error to the guest is the only
sensible action.
Reject them before passing them to the block layer. This bypasses the
error action and I/O accounting. Not quite correct for DMA, because
DMA can fail after some success, and when that happens, the part that
succeeded isn't counted. Tolerable, because I/O accounting is an
inconsistent mess anyway.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block layer fails such reads and writes just fine. However, they
then get treated like valid operations that fail: the error action
gets executed. Unwanted; reporting the error to the guest is the only
sensible action.
Reject them before passing them to the block layer. This bypasses the
error action and I/O accounting.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a device model's I/O operation fails, we execute the error
action. This lets layers above QEMU implement thin provisioning, or
attempt to correct errors before they reach the guest. But when the
I/O operation fails because it's invalid, reporting the error to the
guest is the only sensible action.
If the guest's read or write asks for an invalid sector range, fail
the request right away, without considering the error action. No
change with error action BDRV_ACTION_REPORT.
Furthermore, bypass I/O accounting, because we want to track only I/O
that actually reaches the block layer.
The next commit will extend "invalid sector range" to cover attempts
to read/write beyond the end of the medium.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the size of the scatter/gather list isn't a multiple of 512, the
number of sectors for the block layer request is rounded down, resulting
in a qiov that doesn't match the request length. Truncate the qiov to the
new length of the request.
This fixes the IDE qtest case /x86_64/ide/bmdma/short_prdt.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The buffer was being allocated of size string length plus two.
Around the string two quotes were being added, but no terminating NUL.
It was then compared using g_assert_cmpstr(), resulting in fairly random
assertion failures:
ERROR:tests/test-string-output-visitor.c:213:test_visitor_out_enum: assertion failed (str == str_human): ("\"value1\"" == "\"value1\"\001EEEEEEEEEEEEEE\0171")
There is no g_assert_cmpnstr() counterpart, so use g_strdup_printf()
for safely assembling the string in the first place.
Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Fixes: b4900c0 tests: add human format test for string output visitor
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, whenever aio_poll(ctx, true) has completed all pending
work it returns true *and* the next call to aio_poll(ctx, true)
will not block.
This invariant has its roots in qemu_aio_flush()'s implementation
as "while (qemu_aio_wait()) {}". However, qemu_aio_flush() does
not exist anymore and bdrv_drain_all() is implemented differently;
and this invariant is complicated to maintain and subtly different
from the return value of GMainLoop's g_main_context_iteration.
All calls to aio_poll(ctx, true) except one are guarded by a
while() loop checking for a request to be incomplete, or a
BlockDriverState to be idle. The one remaining call (in
iothread.c) uses this to delay the aio_context_release/acquire
pair until the AioContext is quiescent, however:
- we can do the same just by using non-blocking aio_poll,
similar to how vl.c invokes main_loop_wait
- it is buggy, because it does not ensure that the AioContext
is released between an aio_notify and the next time the
iothread goes to sleep. This leads to hangs when stopping
the dataplane thread.
In the end, these semantics are a bad match for the current
users of AioContext. So modify that one exception in iothread.c,
which also fixes the hangs, as well as the testcase so that
it use the same idiom as the actual QEMU code.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The memory allocation between hw/block/virtio-blk.c,
hw/block/dataplane/virtio-blk.c, and hw/virtio/dataplane/vring.c is
messy. Structs are allocated in different files than they are freed in.
This is risky and makes memory leaks easier.
Embed VirtQueueElement in VirtIOBlockReq to reduce the amount of memory
allocation we need to juggle. This also makes vring.c and virtio.c
slightly more similar.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In commit de6c8042ec ("virtio-blk: Avoid
zeroing every request structure") we avoided the 40 KB memset when
allocating VirtIOBlockReq.
The memset was reintroduced in commit
671ec3f056 ("virtio-blk: Convert
VirtIOBlockReq.elem to pointer").
It must be fixed again to avoid a performance regression.
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VirtQueueElement is allocated in vring_pop() so it seems to make sense
that vring_push() should free it. Alas, virtio-blk frees
VirtQueueElement itself in virtio_blk_free_request().
This patch solves a double-free assertion in glib's g_slice_free().
Rename vring_free_element() to vring_unmap_element() since it no longer
frees the VirtQueueElement.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VirtIOBlockReq is freed later by virtio_blk_free_request() in
hw/block/virtio-blk.c. Remove this extraneous g_slice_free().
This patch fixes the following segfault:
0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
99 bdrv_acct_done(req->dev->bs, &req->acct);
(gdb) print req
$1 = (VirtIOBlockReq *) 0x5555565ff5e0
(gdb) print req->dev
$2 = (VirtIOBlock *) 0x0
(gdb) bt
#0 0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
#1 0x0000555555840ebe in bdrv_co_em_bh (opaque=0x5555566152d0) at block.c:4675
#2 0x000055555583de77 in aio_bh_poll (ctx=ctx@entry=0x5555563a8150) at async.c:81
#3 0x000055555584b7a7 in aio_poll (ctx=0x5555563a8150, blocking=blocking@entry=true) at aio-posix.c:188
#4 0x00005555556e520e in iothread_run (opaque=0x5555563a7fd8) at iothread.c:41
#5 0x00007ffff42ba124 in start_thread () from /usr/lib/libpthread.so.0
#6 0x00007ffff16d14bd in clone () from /usr/lib/libc.so.6
Reported-by: Max Reitz <mreitz@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
At least raw-posix relies on this because it can allocate bounce buffers
based on the request length, but access it using all of the qiov entries
later.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
If a QED image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.
This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
If a qcow2 image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.
This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
If a read request goes across EOF, the block driver sees a shortened
request that stops at EOF (the rest is memsetted in block.c), however
the original qiov was used for this request.
This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
In the case that the lun number is taken by another scsi device, don't
release the existing device siliently, but report an error to user.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add code to kvm_arch_get_registers and kvm_arch_put_registers to
save/restore floating point registers. This missing sync was
unnoticed until migration of userspace that uses fprs.
Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[Update patch to latest upstream]
Cc: qemu-stable@nongnu.org
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pc-0.13 and older were missing some compat code that was present on
newer machine-types:
* x86_cpu_compat_disable_kvm_features(FEAT_1_ECX, CPUID_EXT_X2APIC);
(pc-i440fx-1.7 and older)
(added by commit ef02ef5f45)
* x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE);
(pc-i440fx-1.4 and older)
(added by commit 4458c23672
* x86_cpu_compat_set_features("Westmere", FEAT_1_ECX, 0, CPUID_EXT_PCLMULQDQ);
(pc-i440fx-1.4 and older)
(added by commit 56383703c0)
Instead of duplicating the code from the previous pc_compat_*()
functions, we can now reuse pc_compat_1_2() and fix those issues.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If machine doesn't support memory hotplug then staring QEMU
with initial memory less than default will make QEMU exit with
following error message:
$QEMU -m 16 -M isapc
qemu-system-i386: "-memory 'slots|maxmem'" is not supported by: isapc
Set maxram_size to initial memory value before parsing
'maxmem' option allows to keep maxmem in sync with initial
memory size if no maxmem option was specified.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: Bruce Rogers <brogers@suse.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Attach a name to the MTP interface (android phones have this too).
With this patch recent linux guests such as fedora 20 happily detect and
use the device. It shows up in nautilus file manager automatically, and
simple-mtpfs can mount it.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(Resending for correct email addresses via MAINTAINERS ...)
In the GTK UI, after changing focus to the qemu monitor Notebook Page,
when restoring focus to the virtual machine page, the keyboard focus is lost
to a hidden GTK widget. Focus can only be restored to the virtual machine by
pressing "tab" or any of the four directional arrow keys.
Clicking in the window or grabbing/ungrabbing input does not restore keyboard
focus to the child widget.
This patch adjusts the Notebook page switching callback to automatically
steal keyboard focus on the Page switch event, so that keyboard input
does not appear to break or disappear after tabbing to the QEMU monitor.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Commit b2eb849d4b
"CVE-2007-1320 - Cirrus LGD-54XX "bitblt" heap overflow" broke
cpu to video blits.
When the ROP function is called from cirrus_bitblt_cputovideo_next(),
we pass 0 for the pitch but only operate on one line at a time. The
added test was tripping because after the initial substraction, the
pitch becomes negative. Make the test only trip when the height is
larger than one (ie. the pitch is actually used).
This fixes HW cursor support in Windows NT4.0 (which otherwise was
a white rectangle) and general display of icons in that OS when using
8bpp mode.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
when configure a invalid vram size for cirrus card, such as less
2 MB, which will crash qemu. Follow the real hardware, the cirrus
card has 4 MB video memory. Also for backward compatibility, accept
8 MB and 16 MB vram size.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Set auth to sasl when sasl is enabled, this makes "info spice" correctly
display sasl auth. Also throw an error in case someone tries to set
a spice password via monitor without auth mode being "spice".
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* remotes/kvm/uq/master:
qtest: fix vhost-user-test compilation with old GLib
mc146818rtc: register the clock reset notifier on the right clock
oslib-posix: Fix new compiler error with -Wclobbered
target-i386: Add "kvmclock-stable-bit" feature bit name
Enforce stack protector usage
watchdog: fix deadlock with -watchdog-action pause
mips_malta: Catch kernels linked at wrong address
mips_malta: Remove incorrect KVM T&E references
mips/kvm: Disable FPU on reset with KVM
mips/kvm: Init EBase to correct KSEG0
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 884f17c (aio / timers: Convert rtc_clock to be a QEMUClockType,
2013-08-21) erroneously changed an occurrence of rtc_clock to
QEMU_CLOCK_REALTIME, which broke the RTC reset notifier in
mc146818rtc. Fix this.
I redid the patch myself since the original reporter did not sign
off on his.
Cc: qemu-stable@nongnu.org
Reported-by: Lb peace <peaceustc@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer versions of gcc report a warning (or an error with -Werror) when
compiler option -Wclobbered (or -Wextra) is active:
util/oslib-posix.c:372:12: error:
variable ‘hpagesize’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]
The rewritten code fixes this warning: variable 'hpagesize' is now set and
used in a block without any call of sigsetjmp or similar functions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is enabled by default and supported
by KVM. But not having a name defined makes QEMU treat it as an unknown
and unmigratable feature flag (as any unknown feature may possibly
require state to be migrated), and disable it by default on "-cpu host".
As a side-effect, the new name also makes the flag configurable,
allowing the user to disable it (which may be useful for testing or for
compatibility with old kernels).
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If --enable-stack-protector is used is used, configure script try to use
--fstack-protector-strong. In case it's not supported, --fstack-protector-all
is enabled. If both protectors are not supported, configure does not use
any protector at all without any notification.
This patch reports error when user requests stack protector to be used and
both protector modes are not supported. Behavior is not changed in case
user do not use any of --enable-stack-protector/--disable-stack-protector.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
[Fix non-POSIX operator in test. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The buffer was being allocated of size string length plus two.
Around the string two quotes were being added, but no terminating NUL.
It was then compared using g_assert_cmpstr(), resulting in fairly random
assertion failures:
ERROR:tests/test-string-output-visitor.c:213:test_visitor_out_enum: assertion failed (str == str_human): ("\"value1\"" == "\"value1\"\001EEEEEEEEEEEEEE\0171")
There is no g_assert_cmpnstr() counterpart, so use g_strdup_printf()
for safely assembling the string in the first place.
Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Fixes: b4900c0 tests: add human format test for string output visitor
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu_clock_enable says:
/* Disabling the clock will wait for related timerlists to stop
* executing qemu_run_timers. Thus, this functions should not
* be used from the callback of a timer that is based on @clock.
* Doing so would cause a deadlock.
*/
and it indeed does: vm_stop uses qemu_clock_enable on QEMU_CLOCK_VIRTUAL
and watchdogs are based on QEMU_CLOCK_VIRTUAL, and we get a deadlock.
Use qemu_system_vmstop_request_prepare()/qemu_system_vmstop_request()
instead; yet another alternative could be a BH.
I checked other occurrences of vm_stop and they should not have this
problem. RUN_STATE_IO_ERROR could in principle (it depends on the
code in the drivers) but it has been fixed by commit 2bd3bce, "block:
asynchronously stop the VM on I/O errors", 2014-06-05.
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add error reporting if the wrong type of kernel is provided for the
current mode of acceleration.
Currently a KVM kernel linked at 0x40000000 can't be used with TCG, and
a normal kernel linked at 0x80000000 can't be used with KVM.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the error message and code comments relating to KVM not supporting
booting from the flash mapping when no kernel is provided. The issue is
a general MIPS KVM issue and isn't specific to the Trap & Emulate
version of MIPS KVM.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM doesn't yet support the MIPS FPU, or writing to the guest's Config1
register which contains the FPU implemented bit. Clear QEMU's version of
that bit on reset and display a warning that the FPU has been disabled.
The previous incorrect Config1 CP0 register value wasn't being passed to
KVM yet, however we should ensure it is set correctly now to reduce the
risk of breaking migration/loadvm to a future version of QEMU/Linux that
does support it.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In many cases, the call to event_notifier_set in aio_notify is unnecessary.
In particular, if we are executing aio_dispatch, or if aio_poll is not
blocking, we know that we will soon get to the next loop iteration (if
necessary); the thread that hosts the AioContext's event loop does not
need any nudging.
The patch includes a Promela formal model that shows that this really
works and does not need any further complication such as generation
counts. It needs a memory barrier though.
The generation counts are not needed because any change to
ctx->dispatching after the memory barrier is okay for aio_notify.
If it changes from zero to one, it is the right thing to skip
event_notifier_set. If it changes from one to zero, the
event_notifier_set is unnecessary but harmless.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The current test depends too much on the implementation of the AioContext
GSource. Just iterate on the main loop until the callback has been invoked
the right number of times.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The main AioContext should be accessed explicitly via qemu_get_aio_context().
Most of the time, using it is not the right thing to do.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_is_allocated() should return either 0 or 1 in successful cases.
We're lucky that currently, the callers that rely on this (e.g. because
they check for ret == 1) don't seem to break badly. They just might skip
some optimisation or in the case of qemu-io 'map' print separate lines
where a single line would suffice. In theory, a wrong allocation status
could lead to image corruption with certain operations, so let's fix
this quickly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When doing a block backup of an image with an unaligned size (with
respect to the BACKUP_CLUSTER_SIZE), qemu would check the allocation
status of sectors after the end of the image. bdrv_is_allocated()
returns a result that is valid for 0 sectors in this case, so the backup
job ran into an endless loop.
Stop looping when seeing a result valid for 0 sectors, we're at EOF then.
The test case looks somewhat unrelated at first sight because I
originally tried to reproduce a different suspected bug that turned out
to not exist. Still a good test case and it accidentally found this one.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Bugfixes for s390x: set subsystem id in the lowcore when booting from the
s390-ccw bios, and set the channel-program address after I/O completion,
when applicable.
# gpg: Signature made Tue 08 Jul 2014 14:18:20 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found
* remotes/cohuck/tags/s390x-20140708:
s390x/css: reflect cpa in scsw
pc-bios/s390-ccw: update binary
pc-bios/s390-ccw: store proper subsystem information word
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We neglected to update the the channel-program-address field of the scsw
after completion of the start or the halt function: Fortunately, Linux
didn't miss it so far. Let's update it for the cases where the cpa is
expected to be valid; in some cases, the cpa is 'unpredictable', so we
leave it untouched.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
POP chapter 17 requires to store a subsystem information word at 184
during IPL. Furthermore bytes 188-191 should be zero. The bootmap might
contain data blocks that are written to the first page. We have to
write these values after we processed the bootmap and before the final
IPL.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
target-arm queue:
* fix handling of KVM reset for 32-bit ARM CPUs
* implement NOR flash alias for vexpress-a9
* make sure libvixl gets its own utils.h rather than somebody else's
# gpg: Signature made Tue 08 Jul 2014 13:12:05 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
* remotes/pmaydell/tags/pull-target-arm-20140708:
target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs
hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9
disas/libvixl: prepend the include path of libvixl header files
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement kvm_arm_vcpu_init() as a simple call to arm_arm_vcpu_init()
(which uses the KVM_ARM_VCPU_INIT vcpu ioctl to tell the kernel
to re-initialize the vCPU), rather than via the complicated code
which saves a copy of the register state on first init and then
writes it back to the kernel. This is much simpler and brings the
32-bit KVM code into line with the 64-bit code.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403802973-20841-1-git-send-email-peter.maydell@linaro.org
Currently the Makefile of disas/libvixl appends
-I$(SRC_PATH)/disas/libvixl to QEMU_CFLAGS. As a consequence C++ files
that #include "utils.h", such as disas/libvixl/a64/instructions-a64.cc,
are going to look for utils.h on all the other include paths first.
When building QEMU as part of the Xen make system, another unrelated
utils.h file is going to be chosen for inclusion, causing a build
failure:
In file included from disas/libvixl/a64/instructions-a64.cc:27:0:
/qemu/disas/libvixl/a64/instructions-a64.h:88:64: error:
'rawbits_to_float' was not declared in this scope
const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);
Fix the problem by prepending (rather than appending) the libvixl
include path to QEMU_CFLAGS.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Patch queue for ppc - 2014-07-08
A few bug fixes for 2.1:
- Fix e500* TLB emulation with qemu-system-ppc
- Update SLOF to current upstream (good number of bugfixes)
- Make POWER7 / POWER8 PVR match more agnostic (needed in 2.1 for cmdline compat)
- Fix u-boot.e500 install (how did that happen?)
- Fix H_CAS on LE hosts
- ppc64le-linux-user fixes
# gpg: Signature made Tue 08 Jul 2014 11:18:58 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found
* remotes/agraf/tags/signed-ppc-for-upstream:
PPC: e500: Actually install u-boot.e500
target-ppc: Remove POWER7+ and POWER8E families
target-ppc: Add pvr_match() callback
pseries: Update SLOF firmware image to qemu-slof-20140630
PPC: Fix booke206 TLB with phys addrs > 32bit
target-ppc: Fix gdbstub for ppc64le-linux-user
target-ppc: Change default cpu for ppc64le-linux-user
target-ppc: KVMPPC_H_CAS fix cpu-version endianess
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
POWER8E is architecturally equal to POWER8 and POWER7+ is equal to
POWER7. Also no user space tool makes any difference for CPU node name
in the device tree (such as PowerPC,POWER7@0 vs. PowerPC,POWER7+@0).
So there is no point in emulating POWER7+ and POWER8E apart from POWER7
and POWER8. Also, the previos patch implemented multiple PVR mask support
per CPU class so POWER7 class now covers both POWER7 and POWER7+ CPUs,
same is valid for POWER8/8E.
This removes POWER7+ and POWER8E classes. This replaces references
to POWER7P/POWER8E families with POWER7/POWER8 families.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
So far it was enough to have a base PVR value and mask per CPU
family such as POWER7 or POWER8. However there CPUs which are
completely architecturally compatible but have different PVRs such
as POWER7/POWER7+ and POWER8/POWER8E. For these CPUs, top 16 bits
are CPU family and low 16 bits are the version. The families have
PVR base values different enough so defining a mask which
would cover both (or potentially more) CPUs within the family is
not possible.
This adds a pvr_match() callback to PowerPCCPUClass. The default
handler simply compares PVR defined in the class.
This implements ppc_pvr_match_power7/ppc_pvr_match_power8 callbacks
for POWER7/8 families. These check for POWER7/POWER7+ and POWER8/POWER8E.
This changes ppc_cpu_compare_class_pvr_mask() not to check masks but
use the pvr_match() callback.
Since all server CPUs use the same mask, this defines one mask
value - CPU_POWERPC_POWER_SERVER_MASK - which is used everywhere now.
This removes other mask definitions.
This removes pvr_mask from PowerPCCPUClass as it is not used anymore.
This removes pvr initialization for POWER7/8 families as it is not used
to find the class, the pvr_match() callback is used instead.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The changelog is:
> Quieten the grub warning
> Add boot menu support
> boot from disk having chrp-boot file
> fat16: fix read and remove debug messages
> dhcparch define missing in compilation
> pci-scan: reserve memory for pci-bridge without devices
> pci-bridge: Fix ranges when no device beyond the bridge
> Set dhcp arch in board-qemu config file
> xhci: fix controller stop
> dhcp: support client architecture code 93
> virtio-blk: support variable block size
> usb: use common pci dma alloc/mapping routines
> Remove unused SLOF code
> pci-bridge: generic bridge needs to support pci dma functions
> pci: extract dma functions as separate file
> e1000: fix usage of multiple nics
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
We were truncating physical addresses to 32bit when using qemu-system-ppc
with a booke206 TLB implementation. This patch fixes that and makes the full
address space available.
Signed-off-by: Alexander Graf <agraf@suse.de>
The bswap that's needed for system mode isn't required for
user mode, and in fact breaks debugging.
Signed-off-by: Richard Henderson <rth@twiddle.net>
[agraf: fix apple gdbstub implementation]
Signed-off-by: Alexander Graf <agraf@suse.de>
The default, 970fx, doesn't support MSR_LE. So even though we set LE in
ppc_cpu_reset, it gets cleared again in hreg_store_msr. Error out if a
user-selected cpu model doesn't support LE.
Signed-off-by: Richard Henderson <rth@twiddle.net>
[agraf: switch to POWER7 as default for BE and LE]
Signed-off-by: Alexander Graf <agraf@suse.de>
During KVMPPC_H_CAS processing, the cpu-version updated value is stored
without taking care of the current endianess. As a consequence, the guest
may not switch to the right CPU model, leading to unexpected results.
If needed, the value is now converted.
Fixes: 6d9412ea81 ("target-ppc: Implement "compat" CPU option")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
pc,vhost,virtio fixes, test
Bugfixes all over the place.
There's a non bugfix here: re-enabling the vhost-user test,
though the patch just brings back functionality that
I disabled earlier to fix mingw build failures.
This is now sorted, and keeping the unit test enabled
seems important since the feature relies on an external
server to work, so isn't easy to test.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 06 Jul 2014 11:01:35 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
qemu-char: add chr_add_watch support in mux chardev
virtio-pci: fix MSI memory region use after free
qdev: Fix crash when using non-device class name on -global
qdev: Don't abort() in case globals can't be set
hw/virtio: enable common virtio feature for mmio device
acpi: fix typo in memory hotplug MMIO region name
pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()
Handle G_IO_HUP in tcp_chr_read for tcp chardev
virtio: move common virtio properties to bus class device
pc-dimm: error out if memory hotplug is not enabled
numa: check for busy memory backend
qtest: enable vhost-user-test
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block pull request
# gpg: Signature made Mon 07 Jul 2014 13:27:20 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request:
qmp: show QOM properties in device-list-properties
dataplane: submit I/O as a batch
linux-aio: implement io plug, unplug and flush io queue
block: block: introduce APIs for submitting IO as a batch
ahci: map memory via device's address space instead of address_space_memory
raw-posix: Fix raw_getlength() to always return -errno on error
qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin
ahci.c: mask unused flags when reading size PRDT DBC
MAINTAINERS: add Stefan Hajnoczi to IDE maintainers
mirror: Fix qiov size for short requests
Fix nocow typos in manpage
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/sstabellini/xen_arm_20140707:
xen: build on ARM
xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Collection of fixes to build QEMU with Xen support on ARM:
- use xenstore_read_fe_uint64 to retrieve the page-ref (xenfb);
- use xen_pfn_t instead of unsigned long in xenfb;
- unsigned long/xenpfn_t in xen_remove_from_physmap;
- in xen-mapcache.c use HOST_LONG_BITS to check for QEMU's address space
size.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Devices can use a mix of qdev and QOM properties. Currently only the
qdev properties are displayed by device-list-properties.
This patch extends the property enumeration algorithm to also display
QOM properties (excluding the implicit "type", "realized",
"hotpluggable", and "parent_bus" properties).
When a qdev property exists, use the qdev type name to preserve
backwards compatibility. QOM type names can be different for bool (qdev
on/off) and str (used by qdev pointers).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Before commit 580b6b2aa2(dataplane: use the QEMU block
layer for I/O), dataplane for virtio-blk submits block
I/O as a batch.
This commit 580b6b2aa2 replaces the custom linux AIO
implementation(including submit I/O as a batch) with QEMU
block layer, but this commit causes ~40% throughput regression
on virtio-blk performance, and removing submitting I/O
as a batch is one of the causes.
This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to support submitting I/O
at batch for Qemu block layer, and in my test, the change
can improve throughput by ~30% with 'aio=native'.
Following my fio test script:
[global]
direct=1
size=4G
bsrange=4k-4k
timeout=40
numjobs=4
ioengine=libaio
iodepth=64
filename=/dev/vdc
group_reporting=1
[f]
rw=randread
Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores):
- qemu master: 65K IOPS
- qemu master with these patches: 92K IOPS
- 2.0.0 release(dataplane using custom linux aio): 104K IOPS
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch implements .bdrv_io_plug, .bdrv_io_unplug and
.bdrv_flush_io_queue callbacks for linux-aio Block Drivers,
so that submitting I/O as a batch can be supported on linux-aio.
[Unprocessed requests are completed with -EIO instead of a bogus ret
value.
--Stefan]
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch introduces three APIs so that following
patches can support queuing I/O requests and submitting them
as a batch for improving I/O performance.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In map_page() in hw/ide/ahci.c, replace cpu_physical_memory_map() and
cpu_physical_memory_unmap() with dma_memory_map() and dma_memory_unmap(),
because ahci devices should not access memory directly but via their address
space. Add an AddressSpace parameter to map_page(). In order to call
map_page(), we should pass the AHCIState.as as the AddressSpace argument.
Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoid breaking tests on RHEL6 where gnutls is too old for quorum to be
built by default.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The data byte count(DBC) read from the description information is defined for
bits 21:00. Bits 30:22 are reserved and bit 31 is the Interrupt on Completion
(I) flag.
Completion interrupts are triggered after every transaction instead of on
I-flag in QEMU. tbl_entry_size is a signed integer and improperly reading the
DBC leads to a negative offset that causes sglist allocation to fail.
Signed-off-by: Reza Jelveh <reza.jelveh@tuhh.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When mirroring an image of a size that is not a multiple of the
mirror job granularity, the last request would have the right nb_sectors
argument, but a qiov that is rounded up to the next multiple of the
granularity. Don't do this.
This fixes a segfault that is caused by raw-posix being confused by this
and allocating a buffer with request length, but operating on it with
qiov length.
[s/Driver/Drive/ in qemu-iotests 041 as suggested by Eric
--Stefan]
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Forward chr_add_watch call from mux chardev to underlying
implementation.
This should fix bug #1335444
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
After memory region QOMification QEMU is stricter in detecting
wrong usage of the memory region API. Here it detected a
memory_region_destroy done before the corresponding
memory_region_del_subregion; the memory_region_destroy is
done by msix_uninit_exclusive_bar, the memory_region_del_subregion
is done by the PCI core's pci_unregister_io_regions before
pc->exit is called.
The problem was introduced by
commit 06a1307379
virtio-pci: add device_unplugged callback
As noted in that commit log, virtio device kick callbacks need to be
stopped before generic virtio is cleaned up. This is because these are
notifications from pci proxy to the generic virtio device so they need
to be stopped in the unplug call before the virtio device is unrealized.
However interrupts are notifications from the virtio device to
the pci proxy so they need to stay around while the device
is realized.
The memory API misuse caused an assertion when hot-unplugging virtio
devices. Using the API correctly fixes the assertion.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This fixes the following crash:
$ qemu-system-x86_64 -global container.xxx=y
hw/core/qdev-properties-system.c:399:qdev_add_one_global: Object 0x7f7eff234100 is not an instance of type device
Aborted (core dumped)
New behavior will be to just warn, just like when non-existing clas
names are used:
$ qemu-system-x86_64 -global container.xxx=y
qemu-system-x86_64: Warning: "-global container.xxx=y" not used
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Don Slutz <dslutz@verizon.com>
It would be much better if we didn't terminate QEMU inside
device_post_init(), but at least exiting cleanly is better than aborting
and dumping core.
Before this patch:
$ qemu-system-x86_64 -global cpu.xxx=y
qemu-system-x86_64: Property '.xxx' not found
Aborted (core dumped)
After this patch:
$ qemu-system-x86_64 -global cpu.xxx=y
qemu-system-x86_64: Property '.xxx' not found
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Both 'indirect_desc' and 'event_idx' are bus independent features,
and they should be enabled for mmio devices too.
On arm64 quad core VM(qemu-kvm), the patch can increase block I/O
performance a lot with latest linux tree:
- without the patch: 14K IOPS
- with the patch: 34K IOPS
fio script:
[global]
direct=1
bsrange=4k-4k
timeout=10
numjobs=4
ioengine=libaio
iodepth=64
filename=/dev/vdc
group_reporting=1
[f1]
rw=randread
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In function do_pci_register_device() in file hw/pci/pci.c, move the assignment
of pci_dev->devfn to the position before the call to
pci_device_iommu_address_space(pci_dev) which will use the value of
pci_dev->devfn.
Fixes: 9eda7d373e
pci: Introduce helper to retrieve a PCI device's DMA address space
Cc: qemu-stable@nongnu.org
Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since commit cdaa86a54b
("Add G_IO_HUP handler for socket chardev")
GLib limitation results in a bug on Windows host. Steps to reproduce:
Start qemu: qemu-system-i386 -qmp tcp:127.0.0.1:4444:server:nowait
Connect with telnet: telnet 127.0.0.1 4444
Try sending some data from telnet.
Expected result: answers from QEMU.
Observed result: no answers (actually tcp_chr_read is not called at all).
Due to GLib limitations it is not possible to create several watches on one
channel on Windows hosts. See bug #338943 in GNOME bugzilla for details:
https://bugzilla.gnome.org/show_bug.cgi?id=338943
This reimplements commit cdaa86a54b
("Add G_IO_HUP handler for socket chardev") using a single watch:
Handle G_IO_HUP in tcp_chr_read instead. It is already watched by a
corresponding watch. Remove the second watch with its handler.
Cc: Antonios Motakis <a.motakis@virtualopensystems.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Nikita Belov <zodiac@ispras.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The two common virtio features can be defined per bus, so move all
into bus class device to make code more clean.
As discussed with cornelia, s390-virtio-blk doesn't support
the two features at all, so keep s390-virtio as it.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> #for s390 ccw
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: rebase and resolve conflicts
fixes QEMU abort in case it's started without memory
hotplug enabled.
as result of fix it will print following messages:
"
-device pc-dimm,id=d1,memdev=m1: memory hotplug is not enabled, enable it on startup
-device pc-dimm,id=d1,memdev=m1: Device 'pc-dimm' could not be initialized
"
Also fixup assert condition to detect hotplug address
space overflow.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Specifying the same memory backend twice leads to an assert:
./x86_64-softmmu/qemu-system-x86_64 -m 512M -enable-kvm -object
memory-backend-ram,size=256M,id=ram0 -numa node,nodeid=0,memdev=ram0
-numa node,nodeid=1,memdev=ram0
qemu-system-x86_64: /scm/qemu/memory.c:1506:
memory_region_add_subregion_common: Assertion `!subregion->container'
failed.
Aborted (core dumped)
Detect and exit with an error message instead.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use qtest-obj-y to get the right library order. CONFIG_POSIX ensures
mingw compilation won't break.
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: whitespace tweak
The EBase CP0 register is initialised to 0x80000000, however with KVM
the guest's KSEG0 is at 0x40000000. The incorrect value doesn't get
passed to KVM yet as KVM doesn't implement the EBase register, however
we should set it correctly now so as not to break migration/loadvm to a
future version of QEMU that does support EBase.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Shut up Coverity's complaint about unchecked fcntl return values,
and especially make the code simpler and more efficient.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Block pull request
# gpg: Signature made Tue 01 Jul 2014 09:47:15 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request: (23 commits)
block: add backing-file option to block-stream
block: extend block-commit to accept a string for the backing file
block: add helper function to determine if a BDS is in a chain
block: add QAPI command to allow live backing file change
qapi: Change back sector-count to sectors-count in quorum QAPI events.
block/cow: Avoid use of uninitialized cow_bs in error path
block: simplify bdrv_find_base() and bdrv_find_overlay()
block: make 'top' argument to block-commit optional
iotests: Add more tests to quick group
iotests: Add qemu tests to quick group
iotests: Simplify qemu-iotests-quick.sh
qemu-img create: add 'nocow' option
virtio-blk: remove need for explicit x-data-plane=on option
qdev: drop iothread property type
virtio-blk: replace x-iothread with iothread link property
virtio-blk: move qdev properties into virtio-blk.c
virtio: fix virtio-blk child refcount in transports
virtio-blk: drop virtio_blk_set_conf()
virtio-blk: use aliases instead of duplicate qdev properties
qdev: add qdev_alias_all_properties()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
this patch makes the VNC server work correctly if the
server surface and the guest surface have different sizes.
Basically the server surface is adjusted to not exceed VNC_MAX_WIDTH
x VNC_MAX_HEIGHT and additionally the width is rounded up to multiple of
VNC_DIRTY_PIXELS_PER_BIT.
If we have a resolution whose width is not dividable by VNC_DIRTY_PIXELS_PER_BIT
we now get a small black bar on the right of the screen.
If the surface is too big to fit the limits only the upper left area is shown.
On top of that this fixes 2 memory corruption issues:
The first was actually discovered during playing
around with a Windows 7 vServer. During resolution
change in Windows 7 it happens sometimes that Windows
changes to an intermediate resolution where
server_stride % cmp_bytes != 0 (in vnc_refresh_server_surface).
This happens only if width % VNC_DIRTY_PIXELS_PER_BIT != 0.
The second is a theoretical issue, but is maybe exploitable
by the guest. If for some reason the guest surface size is bigger
than VNC_MAX_WIDTH x VNC_MAX_HEIGHT we end up in severe corruption since
this limit is nowhere enforced.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
currently a malicious client could define a payload
size of 2^32 - 1 bytes and send up to that size of
data to the vnc server. The server would allocated
that amount of memory which could easily create an
out of memory condition.
This patch limits the payload size to 1MB max.
Please note that client_cut_text messages are currently
silently ignored.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If libusb_get_device_list() fails, the uninitialized local variable
libusb_device would be passed to libusb_free_device_list(), that
will cause a crash, like:
(gdb) bt
#0 0x00007fbbb4bafc10 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1 0x00007fbbb233e653 in libusb_unref_device (dev=0x6275682d627375)
at core.c:902
#2 0x00007fbbb233e739 in libusb_free_device_list (list=0x7fbbb6e8436e,
unref_devices=<optimized out>) at core.c:653
#3 0x00007fbbb6cd80a4 in usb_host_auto_check (unused=unused@entry=0x0)
at hw/usb/host-libusb.c:1446
#4 0x00007fbbb6cd8525 in usb_host_initfn (udev=0x7fbbbd3c5670)
at hw/usb/host-libusb.c:912
#5 0x00007fbbb6cc123b in usb_device_init (dev=0x7fbbbd3c5670)
at hw/usb/bus.c:106
...
So initialize libusb_device at the begin time.
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Due to an incomplete initialization, adding a usb-bt-dongle device through HMP
or QMP will cause a segmentation fault.
Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Guest mouse pointer was jumpy, when moving host mouse in the vertical direction (see bug #1327800).
Signed-off-by: Christian Burger <christian@krikkel.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* remotes/bonzini/memory:
qdev: correctly send DEVICE_DELETED for recursively-deleted devices
memory: do not give a name to the internal exec.c regions
memory: MemoryRegion: Add size property
memory: MemoryRegion: Add may-overlap and priority props
memory: MemoryRegion: Add container and addr props
memory: MemoryRegion: replace owner field with QOM parent
memory: MemoryRegion: QOMify
memory: MemoryRegion: use /machine as default owner
libqtest: escape strings in QMP commands, fix leak
qom: object: Ignore refs/unrefs of NULL
qom: object: remove parent pointer when unparenting
mc146818rtc: add "rtc-time" link to "/machine/rtc"
qom: allow creating an alias of a child<> property
qom: add a generic mechanism to resolve paths
qom: add object_property_add_alias()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/bonzini/scsi-next:
configure: Fix -lm test, so that tools can be compiled on hosts that require -lm
virtio-scsi: scsi events must be converted to target endianness
virtio-scsi: virtio_scsi_push_event() lacks VirtIOSCSIReq parsing
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We have the experience that the guest doesn't stop successfully
though it was instructed to shut down.
The root cause may be not in QEMU mostly. However, QEMU is often
suspected at the beginning just because the issue occurred in
virtualization environment.
Therefore, we need to affirm that QEMU received the shutdown
request and raised ACPI irq from "virsh shutdown" command,
virt-manger or stopping QEMU process to the VM .
So that we can affirm the problems was belonged to the Guset OS
rather than the QEMU itself.
When we stop guests by "virsh shutdown" command or virt-manger,
or stopping QEMU process, qemu_system_powerdown_request() or
qemu_system_shutdown_request() is called. Then the below functions
in main_loop_should_exit() of Vl.c are called roughly in the
following order.
if (qemu_powerdown_requested())
qemu_system_powerdown()
monitor_protocol_event(QEVENT_POWERDOWN, NULL)
OR
if(qemu_shutdown_requested()}
monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
The tracepoint of monitor_protocol_event() already exists, but no
tracepoints are defined for qemu_system_powerdown_request() and
qemu_system_shutdown_request(). So this patch adds two tracepoints for
the two functions. We believe that it will become much easier to
isolate the problem mentioned above by these tracepoints.
Signed-off-by: Yang Zhiyong <yangzy.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block job.
For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.
In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.
With this extension to the block-stream api, the user is able to change
the backing file of the active layer as part of the block-stream
operation.
This allows the change to be 'safe', in the sense that if the attempt
to write the active image metadata fails, then the block-stream
operation returns failure, without disrupting the guest.
If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block commit.
For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.
In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.
With this extension to the block-commit api, the user is able to change
the backing file of the overlay image as part of the block-commit
operation.
This allows the change to be 'safe', in the sense that if the attempt
to write the overlay image metadata fails, then the block-commit
operation returns failure, without disrupting the guest.
If the commit top is the active layer, then specifying the backing
file string will be treated as an error (there is no overlay image
to modify in that case).
If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is a small helper function, to determine if 'base' is in the
chain of BlockDriverState 'top'. It returns true if it is in the chain,
and false otherwise.
If either argument is NULL, it will also return false.
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This allows a user to make a live change to the backing file recorded in
an open image.
The image file to modify can be specified 2 ways:
1) image filename
2) image node-name
Note: this does not cause the backing file itself to be reopened; it
merely changes the backing filename in the image file structure, and
in internal BDS structures.
It is the responsibility of the user to pass a filename string that
can be resolved when the image chain is reopened, and the filename
string is not validated.
A good analogy for this command is that it is a live version of
'qemu-img rebase -u', with respect to changing the backing file string.
[Jeff is offline so I respun this patch in his absence. Dropped image
filename since using node-name is preferred and this is a new command.
No need to introduce the limitations of finding images by filename.
--Stefan]
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The existing test whether "-lm" needs to be included or not is
insufficient as it reports false negative on Fedora20/ppc64.
This happens because sin(0.0) is a constant value which compiler
can safely throw away and therefore there is no need to add "-lm".
As the result, qemu-nbd/qemu-io/qemu-img tools cannot compile.
This adds a global variable and uses it in the test to prevent
from optimization.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[Use Peter's improvement on the test to fool LTO, and remove the
now useless -lm addition in Makefile.target. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a device is unparented (i.e. made completely hidden from management)
we want to send a DEVICE_DELETED event only if the device actually was
realized. This avoids raising DEVICE_DELETED events when device_add
fails.
However, this does not work right for recursively-deleted
devices: the whole tree is _first_ unrealized, _then_ unparented.
Then device_unparent sees realized==false and fails to trigger
the event. The solution is simply to move have_realized into
the DeviceState struct. If device_add fails, we never set the
new field to true and DEVICE_DELETED is not sent.
Fixes qemu-iotests testcase 067 (broken by commit 5942a19, though that
commit in turn fixed a possible segfault in the same test).
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To allow devices to dynamically resize the device. The motivation is
to allow devices with variable size to init their memory_region
without size early and then correctly populate size at realize() time.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QOM propertyify the .may-overlap and .priority fields. The setters
will re-add the memory as a subregion if needed (i.e. the values change
when the memory region is already contained).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose the already existing .parent and .addr fields as QOM properties.
.parent (i.e. the field describing the memory region that contains this
one in Memory hierachy) is renamed "container". This is to avoid
confusion with the QOM parent.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. Do not unref parent on releasing the property. Clean
up error propagation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QOMify memory regions as an Object. The former init() and destroy()
routines become instance_init() and instance_finalize() resp.
memory_region_init() is re-implemented to be:
object_initialize() + set fields
memory_region_destroy() is re-implemented to call unparent().
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Add newly-created MR as child, unparent on destruction. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
libqtest is using g_strdup_printf to format QMP commands, but
this does not work if the argument strings need to be escaped.
Instead, use the fancy %-formatting functionality of QObject.
The only change required in tests is that strings have to be
formatted as %s, not '%s' or \"%s\". Luckily this usage of
parameterized QMP commands is not that frequent.
The leak is in socket_sendf. Since we are extracting the send
loop to a new function, fix it now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just do nothing if passed NULL for a ref or unref. This avoids
call sites that manage a combination of NULL or non-NULL pointers
having to add iffery around every ref and unref.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Certain parts of the QOM framework test this pointer to determine if
an object is parented. Nuke it when the object is unparented to allow
for reuse of an object after unparenting.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a link to rtc under /machine providing a stable
location for management apps to query the value of the
time. The link should be added by any object that sends
RTC_TIME_CHANGE events.
{"execute":"qom-get","arguments":{"path":"/machine","property":"rtc-time"} }
Suggested by Paolo Bonzini and Andreas Faerber.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Child properties must be unique. Fix this problem by
turning their aliases into links.
The resolve function that forwards to the target property
does not have any knowledge of the target property's type,
so it works fine.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It may be desirable to have custom link<> properties that do more
than just store an object. Even the addition of a "check"
function is not enough if setting the link has side effects
or if a non-standard reference counting is preferrable.
Avoid the assumption that the opaque field of a link<> is a
LinkProperty struct, by adding a generic "resolve" callback
to ObjectProperty. This fixes aliases of link properties.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
fe069d9d had aligned code and documentation while dropping the s from the
actual JSON output. Fix that.
This also fix test/qemu-iotest/081 since the missing s was causing a permutation.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit 25814e8987 introduced an error-exit code path which does
a "goto exit" before the cow_bs variable is initialized, meaning
we would call bdrv_unref() on an uninitialized variable and
likely segfault. Fix this by moving the NULL-initialization
to the top of the function and making the exit code path handle
the case where it is NULL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This simplifies the function bdrv_find_overlay(). With this change,
bdrv_find_base() is just a subset of usage of bdrv_find_overlay(),
so this also takes advantage of that.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that active layer block-commit is supported, the 'top' argument
no longer needs to be mandatory.
Change it to optional, with the default being the active layer in the
device chain.
[kwolf: Rebased and resolved conflict in tests/qemu-iotests/040]
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
While at it, add some more tests to the quick group (those that run with
-nocache in under three seconds on my HDD).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that qemu-iotests-quick.sh supports tests using the qemu binary, we
are free to add such tests to the quick group.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
As of the "iotests: Allow out-of-tree run" series, the qemu-iotests may
(and should) be run directly in the build tree and will then guess the
binary paths themselves. Therefore, qemu-iotests-quick.sh does not need
to (and should not) enter the source path anymore; also, it does not
need to specify the binaries because "check" will guess them
automatically.
As a side-effect, tests using qemu may now be added to the quick group.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add 'nocow' option so that users could have a chance to set NOCOW flag to
newly created files. It's useful on btrfs file system to enhance performance.
Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this bad
performance is to turn off COW attributes on VM files. Generally, there are
two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then
all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.
This patch tries the second way, according to the option, it could add NOCOW
per file.
For most block drivers, since the create file step is in raw-posix.c, so we
can do setting NOCOW flag ioctl in raw-posix.c only.
But there are some exceptions, like block/vpc.c and block/vdi.c, they are
creating file by calling qemu_open directly. For them, do the same setting
NOCOW flag ioctl work in them separately.
[Fixed up 082.out due to the new 'nocow' creation option
--Stefan]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Virtio SCSI Events need to be byteswapped before being pushed
when host and guest have a different endianness. Not doing so
breaks hotplug of virtio scsi disks, with the following error
message being printed in the guest console:
virtio_scsi: Unsupport virtio scsi event 1000000
This issue got uncovered while testing disk hotplug with a PowerKVM
ppc64le guest. I have checked that this issue also affects a x86_64
guest run on a ppc64 host.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hotplug of a virtio scsi disk is currently broken: no disk appears in the
guest (verified with a fedora 20 host running a fedora 20 guest with KVM).
Bisect leeds to Paolo's patches to support any_layout, especially this
commit:
commit 36b15c79aa
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Tue Jun 10 16:21:18 2014 +0200
virtio-scsi: start preparing for any_layout
It modifies virtio_scsi_pop_req() so that it is up to the callers to parse
the virtio scsi request. It seems that virtio_scsi_push_event() was not
modified accordingly...
This patch adds a call to virtio_scsi_parse_req(). It also drops some
sanity checks that are already performed by virtio_scsi_parse_req().
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sometimes an object needs to present a property which is actually on
another object, or it needs to provide an alias name for an existing
property.
Examples:
a.foo -> b.foo
a.old_name -> a.new_name
The new object_property_add_alias() API allows objects to alias a
property on the same object or another object. The source and target
names can be different.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
The x-data-plane=on|off option is no longer useful because the
iothread=<iothread> option conveys the same information plus which
IOThread to use.
Do not delete x-data-plane=on|off yet as a convenience to people using
this legacy experimental option. We will drop it in QEMU 2.2.
Instead, turn on data-plane when either x-data-plane=on or
iothread=<iothread> are used. The following command-line uses
data-plane:
qemu -device virtio-blk-pci,iothread=foo,drive=drive0
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Up until now -device virtio-blk-pci,x-iothread=<id> was used to assign
an IOThread. This was a temporary solution while we cleaned up QOM link
properties.
This patch switches over to a QOM link property since it is now possible
to restrict the setter to unrealized instances and automatically unref
the IOThread when the virtio-blk-pci device is freed.
Since the "iothread" property is a QOM property and not a qdev property,
we must alias it explicitly for virtio-blk-pci, as well as CCW and
s390-virtio.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There is no need to make DEFINE_VIRTIO_BLK_PROPERTIES() public. Inline
it into virtio-blk.c so it cannot be used by mistake from other source
files.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-blk child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
This function is no longer used since parent objects now use child
aliases to set the VirtIOBlkConf directly.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the
qdev properties of their VirtIOBlock child. This approach does not work
well with string or pointer properties since we must be careful about
leaking or double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIOBlock child. This way no duplication is necessary.
Remember to stop calling virtio_blk_set_conf() so that we don't clobber
the values already set on the VirtIOBlock instance.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
The qdev_alias_all_properties() function creates QOM alias properties
for each qdev property on a DeviceState. This is useful for parent
objects that wish to forward property accesses to their children.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Move the x-data-plane property. Originally it was outside since not
every transport may wish to support dataplane. But that makes little
sense when we have a dedicated CONFIG_VIRTIO_BLK_DATA_PLANE ifdef
already.
This move makes it easier to switch to property aliases in the next
patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
If the virtio transport does not support notifiers (like s390-virtio),
we can't use dataplane. Bail out early and let the user know what is
wrong.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It becomes unwiedly to duplicate all virtio-blk qdev property
definitions due to an #ifdef. The C preprocessor syntax makes it a
little hard to resolve this cleanly but we can extract the #ifdef and
call a macro it defines later.
Avoiding duplication is important since it will only get worse when we
move the x-data-plane qdev property here too. We'd have a combinatorial
explosion since x-data-plane has its own #ifdef.
Suggested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
As a prequel to any big Pin refactoring plans, do an in-place conversion
of qemu_irq to an Object, so that we can reference it in link<> properties.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[ PC Changes:
* Removed array-alloctor ref counting logic (limit changes just to
* single IRQ allocator)
* Removed WIP marking from subject line
]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Allocate each IRQ individually on array allocations. This prepares for
QOMification of IRQs, where pointers to individual IRQs may be taken
and handed around for usage as QOM Links. The g_renew() scheme used here
is too fragile and would break all existing links should an IRQ list
be extended.
We now have to pass the IRQ count to qemu_free_irqs(). We have so few
call sites however, so this change is reasonably trivial.
Cc: agarcia@igalia.com
Cc: mst@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Certain parts of the QOM framework test this pointer to determine if
an object is parented. Nuke it when the object is unparented to allow
for reuse of an object after unparenting.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
io-error is for block device errors; it should always be preceded
by a BLOCK_IO_ERROR event. I think vfio wants to use
RUN_STATE_INTERNAL_ERROR instead.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Slow BAR access path is used when VFIO fails to mmap() BAR.
Since this is just a transport between the guest and a device, there is
no need to do endianness swapping.
This changes BARs to use native endianness. Since non-ROM BARs were
doing byte swapping, we need to remove it so does the patch.
As the result, this eliminates cancelling byte swaps and there is
no change in behavior for non-ROM BARs.
ROM BARs were declared little endian too but byte swapping was not
implemented for them so they never actually worked on big endian systems
as there was no cancelling byte swap. This fixes endiannes for ROM BARs
by declaring them native endian and only fixing access sizes as it is
done for non-ROM BARs.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
There are still old guests out there that over-exercise MSI-X masking.
The current code completely sets-up and tears-down an MSI-X vector on
the "use" and "release" callbacks. While this is functional, it can
slow an old guest to a crawl. We can easily skip the KVM parts of
this so that we keep the MSI route and irqfd setup. We do however
need to switch VFIO to trigger a different eventfd while masked.
Actually, we have the option of continuing to use -1 to disable the
trigger, but by using another EventNotifier we can allow the MSI-X
core to emulate pending bits and re-fire the vector once unmasked.
MSI code gets updated as well to use the same setup and teardown
structures and functions.
Prior to this change, an igbvf assigned to a RHEL5 guest gets about
20Mbps and 50 transactions/s with netperf (remote or VF->PF). With
this change, we get line rate and 3k transactions/s remote or 2Gbps
and 6k+ transactions/s to the PF. No significant change is expected
for newer guests with more well behaved MSI-X support.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* remotes/bonzini/nbd-next:
nbd: Handle NBD_OPT_LIST option.
nbd: Handle fixed new-style clients.
nbd: Shutdown socket before closing.
nbd: Don't validate from and len in NBD_CMD_DISC.
nbd: Don't export a block device with no medium.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/bonzini/small-fixes:
tests/test-qmp-event: fix for GLib < 2.31
serial: poll the serial console with G_IO_HUP
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
cocoa.next:
* Honour -show-cursor option
* Fix handling of absolute positioning devices
* Cope with first surface being same as initial window size
# gpg: Signature made Mon 30 Jun 2014 13:48:46 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
* remotes/pmaydell/tags/pull-cocoa-20140630:
ui/cocoa: Honour -show-cursor command line option
ui/cocoa: Fix handling of absolute positioning devices
ui/cocoa: Add utility method to check if point is within window
ui/cocoa: Cope with first surface being same as initial window size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm:
* provide PL031 RTC in virt board
* fix missing pxa2xx and strongarm vmstate
* convert cadence_ttc to instance_init
* fix libvixl format strings and README
# gpg: Signature made Mon 30 Jun 2014 13:44:33 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
* remotes/pmaydell/tags/pull-target-arm-20140630:
disas/libvixl: Fix wrong format strings
disas/libvixl: Update README for version base
timer: cadence_ttc: Convert to instance_init
hw/arm/pxa2xx_gpio: Correct and register vmstate
hw/arm/pxa2xx_gpio: Fix handling of GPSR/GPCR reads
hw/arm/strongarm: Wire up missing GPIO and PPC vmstate
hw/arm/strongarm: Fix handling of GPSR/GPCR reads
hw/arm/virt: Provide PL031 RTC
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On FreeBSD polling a master pty while the other end is not connected
with G_IO_OUT only results in an endless wait. This is different from
the Linux behaviour, that returns immediately. In order to demonstrate
this, I have the following example code:
http://xenbits.xen.org/people/royger/test_poll.c
When executed on Linux:
$ ./test_poll
In callback
On FreeBSD instead, the callback never gets called:
$ ./test_poll
So, in order to workaround this, poll the source with G_IO_HUP (which
makes the code behave the same way on both Linux and FreeBSD).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: xen-devel@lists.xenproject.org
[Add hw/char/cadence_uart.c too. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When this flag is set, the server tells the client that it can send another
option if the server received a request with an option that it doesn't
understand instead of directly closing the connection.
Also add link to the most up-to-date documentation.
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This forces finishing data sending to client before closing the socket like in
exports listing or replying with NBD_REP_ERR_UNSUP cases.
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the compiler is told to check the arguments of AppendToOutput,
it reports several errors of this kind:
error: format ‘%d’ expects argument of type ‘int’,
but argument 3 has type ‘int64_t {aka long int}’ [-Werror=format]
Fix those bugs by using the correct format strings with PRId64, PRIx64.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1403113751-19799-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix handling of absolute positioning devices, which were basically
unusable for two separate reasons:
(1) as soon as you pressed the left mouse button we would call
CGAssociateMouseAndMouseCursorPosition(FALSE), which means that
the absolute coordinates of the mouse events are never updated
(2) we didn't account for MacOSX coordinate origin being bottom left
rather than top right, and so all the Y values sent to the guest
were inverted
We fix (1) by aligning our behaviour with the SDL UI backend for
absolute devices:
* when the mouse moves into the window we do a grab (which means
hiding the host cursor and sending special keys to the guest)
* when the mouse moves out of the window we un-grab
and fix (2) by doing the correct transformation in the call to
qemu_input_queue_abs().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-4-git-send-email-peter.maydell@linaro.org
Do the recalculation of the content dimensions in switchSurface if the
current cdx is zero as well as if the new surface is a different size to
the current window. This catches the case where the first surface registered
happens to be 640x480 (our current window size), and fixes a bug where we
would always display a black screen until the first surface of a different
size was registered.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-2-git-send-email-peter.maydell@linaro.org
The pxa2xx-gpio device has a VMStateDescription, but it was accidentally
never actually registered, and it wasn't quite correct. Remove the
'lines' field (this is a device property, not mutable state), add the
missing 'prev_level' field, and set dc->vmsd so it actually gets used.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
The PXA2xx GPIO GPSR and GPCR registers are write-only, with reads being
undefined behaviour. Instead of having GPCR return 31337 and GPSR return
the value last written, make both log the guest error and return 0.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
The VMStateDescription structs for the GPIO and PPC devices were
accidentally never wired up. Add missing state fields and register
them via dc->vmsd.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
The StrongARM GPIO GPSR and GPCR registers are write-only, with reads being
undefined behaviour. Instead of having GPCR return 31337 and GPSR return
the value last written, make both log the guest error and return 0.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
UEFI mandates that the platform must include an RTC, so provide
one in 'virt', using the PL031. This is also useful for directly
booting Linux kernels which would otherwise have to run ntpdate.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
pc,vhost,virtio fixes, enhancements
virtio bi-endian support
new command to resync RTC
misc bugfixes and cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 29 Jun 2014 17:41:13 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream: (37 commits)
tests: add human format test for string output visitor
vhost-net: disable when cross-endian
target-ppc: enable virtio endian ambivalent support
virtio-9p: use virtio wrappers to access headers
virtio-serial-bus: use virtio wrappers to access headers
virtio-scsi: use virtio wrappers to access headers
virtio-blk: use virtio wrappers to access headers
virtio-balloon: use virtio wrappers to access page frame numbers
virtio-net: use virtio wrappers to access headers
virtio: allow byte swapping for vring
virtio: memory accessors for endian-ambivalent targets
virtio: add endian-ambivalent support to VirtIODevice
cpu: introduce CPUClass::virtio_is_big_endian()
exec: introduce target_words_bigendian() helper
virtio: add subsections to the migration stream
virtio-rng: implement per-device migration calls
virtio-balloon: implement per-device migration calls
virtio-serial: implement per-device migration calls
virtio-blk: implement per-device migration calls
virtio-net: implement per-device migration calls
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As of today, vhost assumes guest and host have the same endianness.
This is definitely not compatible with modern PPC64 and ARM that
can change endianness at runtime. Let's disable vhost-net and print
an error message when we detect such a case:
qemu-system-ppc64: vhost-net does not support cross-endian
qemu-system-ppc64: unable to start vhost net: 38: falling back on userspace virtio
This way users can continue to run VMs without changing their setup and
have a chance to know that performance will be impacted.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The device endianness is the cpu endianness at device reset time.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is the virtio-access.h header file taken from Rusty's "endian-ambivalent
targets using legacy virtio" patch. It introduces helpers that should be used
when accessing vring data or by drivers for data that contains headers.
The virtio config space is also target endian, but the current code already
handles that with the virtio_is_big_endian() helper. There is no obvious
benefit at using the virtio accessors in this case.
Now we have two distinct paths: a fast inline one for fixed endian targets,
and a slow out-of-line one for targets that define the new TARGET_IS_BIENDIAN
macro.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ relicensed virtio-access.h to GPLv2+ on Rusty's request,
pass &address_space_memory to physical memory accessors,
per-device endianness,
virtio tswap16 and tswap64 helpers,
faspath for fixed endian targets,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Cc: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.
We migrate this property in a subsection, after the device descriptor. This
means the load code must not rely on it until it is restored. As a consequence,
the vring sanity checks had to be moved after the call to vmstate_load_state().
We enforce paranoia by poisoning the property at the begining of virtio_load().
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If we want to support targets that can change endianness (modern PPC and
ARM for the moment), we need to add a per-CPU class method to be called
from the virtio code. The virtio_ prefix in the name is a hint for people
to avoid misusage (aka. anywhere but from the virtio code).
The default behaviour is to return the compile-time default target
endianness.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We currently have a virtio_is_big_endian() helper that provides the target
endianness to the virtio code. As of today, the helper returns a fixed
compile-time value. Of course, this will have to change if we want to
support target endianness changes at run-time.
Let's move the TARGET_WORDS_BIGENDIAN bits out to a new helper and have
virtio_is_big_endian() implemented on top of it.
This patch doesn't change any functionality.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio, using the VMState code.
Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:
Unknown savevm section type 5
load of migration failed
Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to migrate virtio subsections, they should be streamed after
the device itself. We need the device specific code to be called from
the common migration code to achieve this. This patch introduces load
and save methods for this purpose.
Suggested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The device configuration is set at realize time and never changes. It
should not be migrated as it is done today. For the sake of compatibility,
let's just skip them at load time.
Signed-off-by: Alexander Graf <agraf@suse.de>
[ added missing casts to uint16_t *,
added From, SoB and commit message,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
TCP connectivity fails when the guest has a different endianness.
The packets are silently dropped on the host by the tap backend
when they are read from user space because the endianness of the
virtio-net header is in the wrong order. These lines may appear
in the guest console:
[ 454.709327] skbuff: bad partial csum: csum=8704/4096 len=74
[ 455.702554] skbuff: bad partial csum: csum=8704/4096 len=74
The issue that got first spotted with a ppc64le PowerKVM guest,
but it also exists for the less common case of a x86_64 guest run
by a big-endian ppc64 TCG hypervisor.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Old code was affected by memory gaps which resulted in buffer pointers
pointing to address outside of the mapped regions.
Here we are introducing following changes:
- new function qemu_get_ram_block_host_ptr() returns host pointer
to the ram block, it is needed to calculate offset of specific
region in the host memory
- new field mmap_offset is added to the VhostUserMemoryRegion. It
contains offset where specific region starts in the mapped memory.
As there is stil no wider adoption of vhost-user agreement was made
that we will not bump version number due to this change
- other fileds in VhostUserMemoryRegion struct are not changed, as
they are all needed for usermode app implementation
- region data is not taken from ram_list.blocks anymore, instead we
use region data which is alredy calculated for use in vhost-net
- Now multiple regions can have same FD and user applicaton can call
mmap() multiple times with the same FD but with different offset
(user needs to take care for offset page alignment)
Signed-off-by: Damjan Marion <damarion@cisco.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Damjan Marion <damarion@cisco.com>
We don't support sparse NUMA node IDs yet, so this changes QEMU to
reject configs where not all nodes are present.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The same nodeid shouldn't appear multiple times in the command-line.
In addition to detecting command-line mistakes, this will fix a bug
where nb_numa_nodes may become larger than MAX_NODES (and cause
out-of-bounds access on the numa_info array).
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Based on "enable sparse node numbering" patch from Nishanth Aravamudan,
but without the code to actually support sparse node IDs. This just adds
the code to keep track of present/non-present nodes on the command-line,
without changing any behavior.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
[Rename max_numa_node to max_numa_nodeid -Eduardo]
[Initialize max_numa_nodeid to 0 -Eduardo]
[Use MAX() macro when setting max_numa_nodeid -Eduardo]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.
Unfortunately, there are cases where this isn't true, the one
we found it on was the wce addition in virtio-blk.
Allow mismatched config-lengths:
*) If the version on the wire is shorter then fine
*) If the version on the wire is longer, load what we have space
for and skip the rest.
(This is mst@redhat.com's rework of what I originally posted)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU 2.0 changed memory layout for isapc and pc-0.10 to pc-0.13.
This prevents migration from QEMU 1.7.0 for these
machine types when -m 3.5G is specified.
Paolo Bonzini asked that:
smbios_legacy_mode = true;
has_reserved_memory = false;
option_rom_has_mr = true;
rom_file_has_mr = false;
also be done.
Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: https://bugs.launchpad.net/qemu/+bug/1334307
Tested-by: "Slutz, Donald Christopher" <dslutz@verizon.com>
It is necessary to reset RTC interrupt reinjection backlog if
guest time is synchronized via a different mechanism, such as
QGA's guest-set-time command.
Failing to do so causes both corrections to be applied (summed),
resulting in an incorrect guest time.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For each compat property on PC_Q35_COMPAT_*, there are only two
possibilities:
* If the device is never instantiated when using a machine other than
pc-q35, then the compat property can be safely added to
PC_COMPAT_*;
* If the device can be instantiated when using a machine other than
pc-q35, that means the other machines also need the compat property
to be set.
That means we don't need separate PC_Q35_COMPAT_* macros at all, today.
The hpet.hpet-intcap case is interesting: piix and q35 do have something
that emulates different defaults, but the machine-specific default is
applied _after_ compat_props are applied, by simply checking if the
property is zero (which is the real default on the hpet code).
The hpet.hpet-intcap=0x4 compat property can (should?) be applied to
piix too, because 0x4 was the default on both piix and q35 before the
hpet-intcap property was introduced.
Now, if one day we change the default HPET intcap on one of the PC
machine-types again, we may want to introduce PC_{Q35,I440FX}_COMPAT
macros. But while we don't need that, we can keep the code simple.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix English in comment:
s/the each/each/
s/ \*\// \*\//
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* remotes/riku/linux-user-for-upstream:
linux-user: support the SIOCGIFINDEX ioctl
linux-user: support the KDSIGACCEPT ioctl
linux-user: allow NULL tv argument for settimeofday
linux-user: respect timezone for settimeofday
linux-user: fix struct target_epoll_event layout for MIPS
linux-user: support strace of epoll_create1
linux-user: allow NULL arguments to mount
linux-user: support SO_PASSSEC setsockopt option
linux-user: support SO_{SND, RCV}BUFFORCE setsockopt options
linux-user: support SO_ACCEPTCONN getsockopt option
linux-user: translate the result of getsockopt SO_TYPE
linux-user: added fake open() for /proc/self/cmdline
Add support for MAP_NORESERVE mmap flag.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Xtensa fixes and improvements queue 2014-06-29:
- fix FLASH mapping to boot region for KC705;
- clean up boot parameters passing;
- add uImage, DTB and initrd support.
# gpg: Signature made Sat 28 Jun 2014 23:40:32 BST using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg: aka "Max Filippov <jcmvbkbc@gmail.com>"
* remotes/xtensa/tags/20140629-xtensa:
hw/xtensa/xtfpga: implement initrd loading
hw/xtensa/xtfpga: implement DTB loading
hw/xtensa/xtfpga: implement uImage loading
hw/xtensa/xtfpga: add memory info to bootparam
hw/xtensa/xtfpga: refactor bootparameters filling
hw/xtensa/xtfpga: use symbolic constants for bootparam tags
hw/xtensa/xtfpga: retrieve parameters from machine_opts
hw/xtensa: replace fprintfs with error_report
hw/xtensa: remove extraneous xtensa_ prefix from file names
hw/xtensa/xtfpga: fix FLASH mapping to boot region for KC705
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches for 2.1.0-rc0
# gpg: Signature made Fri 27 Jun 2014 19:50:32 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (47 commits)
iotests: Fix 083 for out-of-tree builds
iotests: Drop Python version from 065's Shebang
iotests: Use $PYTHON for Python scripts
iotests: Source common.env
configure: Enable out-of-tree iotests
iotests: Allow out-of-tree run
block.c: Don't return success for bdrv_append_temp_snapshot() failure
qemu-iotests: Add TestRepairQuorum to 041 to test drive-mirror node-name mode.
block: Add replaces argument to drive-mirror
blockjob: Fix recent BLOCK_JOB_ERROR regression
blockjob: Fix recent BLOCK_JOB_READY regression
virtio-blk: Rename complete_request_early to complete_request_vring
virtio-blk: Unify {non-,}dataplane's request handlings
virtio-blk: Schedule BH in the right context
virtio-blk: Export request handling functions to dataplane
virtio-blk: Make request completion function virtual
block: acquire AioContext in qmp_query_blockstats()
block: make bdrv_query_stats() static
virtio-blk: Fix and clean up the in_sg and out_sg check
virtio-blk: Fill in VirtIOBlockReq.out in dataplane code
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/qmp-unstable/queue/qmp:
docs/qmp: Fix documentation of BLOCK_JOB_READY to match code
char: report frontend open/closed state in 'query-chardev'
virtio-serial: report frontend connection state via monitor
qmp: add qmp-events.txt back
qapi event: clean up in callers
qapi script: clean up in scripts
qapi: ignore generated event files
qapi: move event defines
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Net patches
# gpg: Signature made Fri 27 Jun 2014 14:10:57 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
hw/net/eepro100: Implement read-only bits in MDI registers
net: move queue number into NICPeers
net: L2TPv3 transport
qemu-bridge-helper: Fix fd leak in main()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a definition of the SIOCGIFINDEX ioctl, allowing its use by target
programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add a definition of the KDSIGACCEPT ioctl & allow its use by target
programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The tv argument to the settimeofday syscall is allowed to be NULL, if
the program only wishes to provide the timezone. QEMU previously
returned -EFAULT when tv was NULL. Instead, execute the syscall &
provide NULL to the kernel as the target program expected.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The settimeofday syscall accepts a tz argument indicating the desired
timezone to the kernel. QEMU previously ignored any argument provided
by the target program & always passed NULL to the kernel. Instead,
translate the argument & pass along the data userland provided.
Although this argument is described by the settimeofday man page as
obsolete, it is used by systemd as of version 213.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
MIPS requires the pad field to 64b-align the data field just as ARM
does.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add the epoll_create1 syscall to strace.list in order to display that
syscall when it occurs, rather than a message about the syscall being
unknown despite QEMU already implementing support for it.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Calls to the mount syscall can legitimately provide NULL as the value
for the source of filesystemtype arguments, which QEMU would previously
reject & return -EFAULT to the target program. An example of this is
remounting an already mounted filesystem with different properties.
Instead of rejecting such syscalls with -EFAULT, pass NULL along to the
kernel as the target program expects.
Additionally this patch fixes a potential memory leak when DEBUG_REMAP
is enabled and lock_user_string fails on the target or filesystemtype
arguments but a prior argument was non-NULL and already locked.
Since the patch already touched most lines of the TARGET_NR_mount case,
it fixes the indentation & coding style for good measure.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Translate the SO_PASSSEC option to setsockopt to the host value &
perform the syscall as expected, allowing use of the option by target
programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Translate the SO_SNDBUFFORCE & SO_RCVBUFFORCE options to setsockopt to
the host values & perform the syscall as expected, allowing use of those
options by target programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Translate the SO_ACCEPTCONN option to the host value & execute the
syscall as expected.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
QEMU previously passed the result of the host syscall directly to the
target program. This is a problem if the host & target have different
representations of socket types, as is the case when running a MIPS
target program on an x86 host. Introduce a host_to_target_sock_type
helper function mirroring the existing target_to_host_sock_type, and
call it to translate the value provided by getsockopt when called for
the SO_TYPE option.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
mmap_flags_tbl contains a list of mmap flags, and how to map them to
the target. This patch adds MAP_NORESERVE, which was missing to the
list.
Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
A series of patches to the s390-ccw bios:
- code cleanup
- improved error reporting
- most important, support to ipl (boot) from ECKD DASD (CDL, LDL or CMS
formatted)
# gpg: Signature made Fri 27 Jun 2014 12:03:30 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found
* remotes/cohuck/tags/s390x-20140627:
pc-bios/s390-ccw: update binary
pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD
pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
pc-bios/s390-ccw: factor out ipl code
pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
pc-bios/s390-ccw: Unify error handling
pc-bios/s390-ccw: add some utility code
pc-bios/s390-ccw: handle different sector sizes
pc-bios/s390-ccw: cleanup and enhance bootmap defintions
pc-bios/s390-ccw: make checkpatch happy
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add back in the support for 64-bit PPC MacOSX hosts that was
broken in the recent merge of the 32-bit and 64-bit TCG backends.
Reported-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Andreas Färber <andreas.faerber@web.de>
Provide a simple bootloader code at the reset address that jumps to the
loaded image entry point when it's not equal to the reset address. This
is needed because the old method of setting pc doesn't work due to cpu
reset done after the machine setup.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
While at it rename lx60 (named after the first board of the family) to
more generic xtfpga (the family name).
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
On KC705 bootloader area is located at FLASH offset 0x06000000, not 0 as
on older xtfpga boards.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
iotest 083 filters out debug messages from nbd, which are prefixed (and
recognized) by __FILE__. However, the current filter (/^nbd\.c…/) is
valid for in-tree builds only, as out-of-tree builds will have a path
before that filename (e.g. "/tmp/qemu/nbd.c"). Fix this by adding .*
before "nbd\.c".
While working on this, also fix the regexes: '.' should be escaped and a
single backslash is not enough for escaping when enclosed by double
quotes.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Test 065 specified python2 to be used in its Shebang; this might not
work on systems without a python2 symlink and furthermore it is now
counter-productive, as the check script compares the Shebang to
"#!/usr/bin/env python" and only uses the Python interpreter selected by
configure on an exact match.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of invoking Python scripts directly via ./, use $PYTHON to
obtain the correct Python interpreter command.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In order to allow out-of-tree iotests, create a symlink for the check
script in the build tree.
While doing so, also write configured options relevant to the iotests to
common.env in the build tree; currently, this is the command to invoke
Python 2.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As out-of-tree builds are preferred for qemu, running the qemu-iotests
in that out-of-tree build should be supported as well. To do so, a
symbolic link has to be created pointing to the check script in the
source directory. That script will check whether it has been run through
a symlink, and if so, will assume it is run in the build tree. All
output and temporary operations performed by iotests are then redirected
here and, unless specified otherwise by the user, QEMU_PROG etc. will be
set to paths appropriate for the build tree.
Also, drop making every test case executable if it is not yet, as this
would modify the source tree which is not desired for out-of-tree runs
and should be fixed in the repository anyway.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When failure occurs, 'ret' need be set, or may return 0 to indicate
success. Previously, an error was set in errp, but 0 was returned
anyway. So let bdrv_append_temp_snapshot() return an error code and
use that for the bdrv_open() return value.
Also, error_propagate() need be called only one time within a function.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The to-replace-node-name is designed to allow repairing a broken Quorum file.
This patch introduces a new class TestRepairQuorum testing that the feature
works.
Some further work will be done on QEMU to improve the robustness of the tests.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
drive-mirror will bdrv_swap the new BDS named node-name with the one
pointed by replaces when the mirroring is finished.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit bcada37 dropped the (up to now undocumented) members type, len,
offset, speed, breaking tests/qemu-iotests/040 and 041.
Restore and document them. This fixes 040, and partially fixes 041.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This drops request handling code from dataplane, and uses code from
hw/block/virtio-blk.c.
It starts to use multiwrite as non-dataplane does.
Dataplane sets VirtIOBlock.complete_request to vring version, and calls
into non-dataplane's process handling. In complete_request_early,
qiov.size is added to vring push length, because it's also called in rw
completion now.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The BH must be called in the AioContext of bs. Currently it is only the
main loop, but with coming changes, it could also be a dataplane
IOThread.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_blk_req_complete will call VirtIOBlock.complete_request() to push
data and notify guest. No functional change.
Later, this will allow dataplane to provide it's own (vring_) version.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make query-blockstats safe for dataplane by acquiring the
BlockDriverState's AioContext. This ensures that the dataplane IOThread
and the main loop's monitor code do not race.
Note the assumption that acquiring the drive's BDS AioContext also
protects ->file and ->backing_hd. This assumption is made by other
aio_context_acquire() callers too.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
out_sg is checked by iov_to_buf below, so it can be dropped.
Add assert and iov_discard_back around in_sg, as the in_sg is handled in
dataplane code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The virtio code currently assumes that the outhdr is in its own iovec.
This is not guaranteed by the spec, so we should relax this assumption.
Convert the VirtIOBlockReq.out field to structrue so that we can use
iov_to_buf and then discard the header from the beginning of iovec.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In current virtio spec, inhdr is a single byte, and is unlikely to
change for both functionality and compatibility considerations.
Non-dataplane uses .in, and we are on the way to converge them. So
let's unify it to get cleaner code.
Remove .inhdr and use .in.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Field "inhdr" is added temporarily for a more mechanical change, and
will be dropped in the next commit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will make converging with dataplane code easier.
Add virtio_blk_free_request to handle the freeing of request internal
fields.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These values aren't used in this case.
Currently, the from field in the request sent by the nbd kernel module leading
to a false error message when ending the connection with the client.
$ qemu-nbd some.img -v
// After nbd-client -d /dev/nbd0
nbd.c:nbd_trip():L1031: From: 18446744073709551104, Len: 0, Size: 20971520,
Offset: 0
nbd.c:nbd_trip():L1032: requested operation past EOF--bad client?
nbd.c:nbd_receive_request():L638: read failed
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The device is exported with erroneous values and can't be read.
Before the patch:
$ sudo nbd-client localhost -p 10809 /dev/nbd0 -name floppy0
Negotiation: ..size = 17592186044415MB
bs=1024, sz=18446744073709547520 bytes
$ sudo mount /dev/nbd0 /mnt/tmp/
mount: block device /dev/nbd0 is write-protected, mounting read-only
mount: /dev/nbd0: can't read superblock
After the patch:
(qemu) nbd_server_add ide0-hd0
(qemu) nbd_server_add floppy0
Device 'floppy0' has no medium
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In addition to the on-line reporting added in the previous patch, allow
libvirt to query frontend state independently of events.
Libvirt's path to identify the guest agent channel it cares about differs
between the event added in the previous patch and the QMP response field
added here. The event identifies the frontend device, by "id". The
'query-chardev' QMP command identifies the backend device (again by "id").
The association is under libvirt's control.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1080376
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The conversion of events to the QAPI, resulted in the removal of the
docs/qmp/qmp-events.txt file. This was done to avoid having duplicated
information between qmp-events.txt and qapi-event.json.
However, qmp-events.txt contains examples and we're still not sure
how to proper install QAPI docs in the host. To avoid harming users,
it's better to re-add qmp-events.txt for now and deal with the
duplication later.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This new argument can be used to specify the node-name of the new mirrored BDS.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On read operations when this parameter is set and some replicas are corrupted
while quorum can be reached quorum will proceed to rewrite the correct version
of the data to fix the corrupted replicas.
This will shine with SSD where the FTL will remap the same block at another
place on rewrite.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When the user specifies -nodefaults he can tell us that he doesn't want any
serial ports spawned by default. While we do honor that wish, we still create
device tree entries for those non-existent devices.
Make device tree generation depend on whether the device is actually available.
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
This makes use of new XICS ability to reuse interrupts.
This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.
This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.
This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.
This fixed traces to be more informative.
This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
There are few helpers already to support array migration. However they all
require the destination side to preallocate arrays before migration which
is not always possible due to unknown array size as it might be some
sort of dynamic state. One of the examples is an array of MSIX-enabled
devices in SPAPR PHB - this array may vary from 0 to 65536 entries and
its size depends on guest's ability to enable MSIX or do PCI hotplug.
This adds new VMSTATE_VARRAY_STRUCT_ALLOC macro which is pretty similar to
VMSTATE_STRUCT_VARRAY_POINTER_INT32 but it can alloc memory for migratign
array on the destination side.
This defines VMS_ALLOC flag for a field.
This changes vmstate_base_addr() to do the allocation when receiving
migration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.
This moves an allocator from SPAPR to XICS.
This switches IRQ users to use new API.
This uses LSI/MSI flags to know if interrupt is allocated.
The interrupt release function will be posted as a separate patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Since islsi[] array has been merged into the ICSState struct,
we must not reset flags as they tell if the interrupt is in use.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR allows having multiple interrupt sources such as PHB.
This adds a source lookup function and makes use of it.
Since at the moment QEMU only supports a single source,
no change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The existing interrupt allocation scheme in SPAPR assumes that
interrupts are allocated at the start time, continously and the config
will not change. However, there are cases when this is not going to work
such as:
1. migration - we will have to have an ability to choose interrupt
numbers for devices in the command line and this will create gaps in
interrupt space.
2. PCI hotplug - interrupts from unplugged device need to be returned
back to interrupt pool, otherwise we will quickly run out of interrupts.
This replaces a separate lslsi[] array with a byte in the ICSIRQState
struct and defines "LSI" and "MSI" flags. Neither of these flags set
signals that the descriptor is not allocated and not in use.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add support for the SPLPAR Characteristics parameter to the emulated
RTAS call ibm,get-system-parameter.
The support provides just enough information to allow "cat
/proc/powerpc/lparcfg" to succeed without generating a kernel error
message.
Without this patch the above command will produce the following kernel
message: arch/powerpc/platforms/pseries/lparcfg.c \
parse_system_parameter_string Error calling get-system-parameter \
(0xfffffffd)
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add support for the UUID parameter to the emulated RTAS call
ibm,get-system-parameter.
Return the guest's UUID as the value for the RTAS UUID system
parameter, or null (a zero length result) if it is not set.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This allows the ibm,get-system-parameter RTAS call to succeed for the
DIAGNOSTICS_RUN_MODE system parameter.
The problem can be seen with "ppc64_cpu --run-mode" from the
powerpc-utils package which fails before this patch with "Machine does
not support diagnostic run mode".
This is corrected by using the rtas_st_buffer() function to write to
the buffer.
The RTAS constants are also moved out into a header file, some new
constants added and the surrounding code slightly simplified.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[agraf: remove some commentary]
Signed-off-by: Alexander Graf <agraf@suse.de>
Add a function to write lengh + data into a buffer as required for the
emulation of the RTAS ibm,get-system-parameter call.
If the destination is smaller than the source, the write is truncated
and success is returned. This matches the behaviour of pHyp.
This will be used in following patches.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds a v2.1 machine to support backward compatibility
for newer macines in the case if they ever be implemented.
This adds a "pseries-2.1" machine as a child of the "pseries"
machine and only changes visible machine name.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Every single sPAPR QOM object has small first "s".
Most (not all yet) QOM objects have "State" suffix.
This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU
code style and removes redundant empty line.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the moment QEMU knows about one version of POWER8 CPU with
PVR 0x4B.0000. This CPU class is defined as "POWER8". The linux
kernel names it as "POWER8E" which is different from the name QEMU uses.
Now we get another version of POWER8 which is architecturally equivalent
to POWER8E but has different PVR 0x4D.0000 so QEMU fails to find
a PPC CPU class on these machines. The linux kernel names these CPUs as
"POWER8".
This renames the existing "POWER8" to "POWER8E" to be more precise and
stay in sync with the linux kernel.
This adds a new "POWER8" family which calls POWER8E class init function
and defines own PVR mask (used to match a CPU class) and desc (used to
create dynamic version-less CPU class).
This does not change CPU class fw_name attribute as the host POWER8
firmware keeps using "PowerPC,POWER8" on both POWER8 and POWER8E.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Fix PCI hole size to match that what is found on real hardware.
(OpenBIOS already uses the correct length.)
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
Change the order of creating devices for New World Mac emulation so
that devices on the motherboard are added first and PCI cards (VGA and
NIC) come later. As a side effect, this also causes OpenBIOS to map
the motherboard devices into the MMIO space to the same addresses as
on real hardware and allow clients that hardcode these addresses (e.g.
MorphOS) to find and use them until OpenBIOS is tought to map devices
to specific addresses. (On real hardware the graphics and network
cards are really on separate buses but we don't model that yet.) This
brings the memory map closer to what is found on PowerMac3,1.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
The gen_qemu_ld8s() function is unused; remove it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Remove the definition of the IMM and d extract helpers; these seem to have
been added as part of the initial PPC support in 2003 but never actually
used.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
This turns the sPAPR support on and enables VFIO container use
in the kernel.
This extends vfio_connect_container to support VFIO_SPAPR_TCE_IOMMU type
in the host kernel.
This registers a memory listener which sPAPR IOMMU will notify when
executing H_PUT_TCE/etc DMA calls. The listener then will notify the host
kernel about DMA map/unmap operation via VFIO_IOMMU_MAP_DMA/
VFIO_IOMMU_UNMAP_DMA ioctls.
This executes VFIO_IOMMU_ENABLE ioctl to make sure that the IOMMU is free
of mappings and can be exclusively given to the user. At the moment SPAPR
is the only platform requiring this call to be implemented.
Note that the host kernel function implementing VFIO_IOMMU_DISABLE
is called automatically when container's fd is closed so there is
no need to call it explicitly from QEMU. We may need to call
VFIO_IOMMU_DISABLE explicitly in the future for some sort of dynamic
reconfiguration (PCI hotplug or dynamic IOMMU group management).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).
Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.
Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB
where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun 5 12:49 iommu_group -> ../../../kernel/iommu_groups/4
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
While most operations with VFIO IOMMU driver are generic and used inside
vfio.c, there are still some operations which only specific VFIO IOMMU
drivers implement. The first example of it will be reading a DMA window
start from the host.
This adds a helper which passes an ioctl request to the container's fd.
The helper will check if @req is known. For this, stub is added. This return
-1 on any requests for now.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handle H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.
Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is requited.
So until the host kernel gets VFIO support for H_PUT_TCE, we better not
to register VFIO's TCE in the host.
This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.
This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.
This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic
is the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.
This is a preparation patch so no change in behaviour is expected
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the moment spapr_rtas_register() allocates a new token number for every
new RTAS callback so numbers are not fixed and depend on the number of
supported RTAS handlers and the exact order of spapr_rtas_register() calls.
These tokens are copied into the device tree and remain the same during
the guest lifetime.
When we start another guest to receive a migration, it calls
spapr_rtas_register() as well. If the number of RTAS handlers or their
order is different in QEMU on source and destination sides, the "/rtas"
node in the device tree will differ. Since migration overwrites the device
tree (as it overwrites the entire RAM), the actual RTAS config on
the destination side gets broken.
This defines global contant values for every RTAS token which QEMU
is using today.
This changes spapr_rtas_register() to accept a token number instead of
allocating one. This changes all users of spapr_rtas_register().
This changes XICS-KVM not to cache tokens registered with KVM as they
constant now.
This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
a base. TOKEN_MAX is moved and renamed too and its value is changed
to the last token + 1. Boundary checks for token values are adjusted.
This reserves token numbers for "os-term" handlers and PCI hotplug
which we are working on.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The Apple gdbstub protocol is different from the normal gdbstub protocol
used on PowerPC. Add support for the different variant, so that we can use
Apple's gdb to debug guest code.
Keep in mind that the switch is a compile time option. We can't detect
during runtime whether a gdb connecting to us is an upstream gdb or an
Apple gdb.
Signed-off-by: Alexander Graf <agraf@suse.de>
Fixed bug in gen_mcrxr() in target-ppc/translate.c:
The XER[SO], XER[OV], and XER[CA] flags are stored in the least
significant bit (bit 0) of their respective registers. They need
to be shifted left (by their respective offsets) to generate the final
XER value. The old translation code for the 'mcrxr' instruction
was assuming that the flags are stored in bit 2, and was shifting them
right (incorrectly)
Signed-off-by: Sorav Bansal <sbansal@cse.iitd.ernet.in>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Set bits in the AT_HWCAP2 entry of the AUXV. Specifically, detect and set bits
for bctar, ISEL and ISA 2.07.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add VSX, DFP and ISA 2.06 to the bits identified in the AT_HWCAP
entry of the AUXV.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Decimal Floating Point is emulated, so add it the mask. This will
fix the erroneous message:
Warning: Disabling some instructions which are not emulated by TCG (0x0, 0x4)
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Set the AT_ICACHEBSIZE and AT_DCACHEBSIZE entries of the AUXV to match the
CPU model's cache line sizes. This fixes memory clobbering problems on more
recent Book 3s implementations; memset(p, 0, N) will use the dcbz instruction
when N is sufficiently large and many of the newer server CPUs have cache lines
sizes of 128 bytes.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Although we defined an eepro100_mdi_mask[] array indicating which bits
in the registers are read-only, we weren't actually doing anything with
it. Make the MDI register-write code use it rather than manually making
register 1 read-only and leaving the rest as reads-as-written. (The
special-case handling of register 0 remains as before since its mask is
all-zeros and the special casing happens before we apply the masking.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402159924-13853-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add code that allows us to start from ECKD DASD using the z/OS
compatible disk layout (CDL), which is the most common format for ECKD
DASD.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Move the scsi-disk specific ipl code from zipl_load() into a new
function ipl_scsi(). This makes it easier to add ipl routines for other
disk types.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Use the virtio device's configuration to figure out the disk geometry
and use a sector size based upon the layout.
[CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g]
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Add declarations to describe structure of different dasd IPL sources
(eckd and fba). Move the structure definitions to a new header bootmap.h.
While we are at it, change structs to typedefs.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
If 'base' is smaller than the overlay image being committed into it,
then the base image will be grown in commit_run via bdrv_truncate().
This tests to make sure that this works, and the bdrv_truncate() is
not blocked when it shouldn't be.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we check for the RESIZE blocker in bdrv_truncate(), that means a
commit will fail if the overlay layer is larger than the base, due to
the backing blocker.
This is a regression in behavior from 2.0; currently, commit will try to
grow the size of the base image to match the overlay size, if the
overlay size is larger.
By moving this into the QMP command qmp_block_resize(), it allows
usage of bdrv_truncate() within block jobs.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It indicates the number of elements in ncs field and makes sense to have
int inside NICPeers. Also in parse_netdev we do not need to access
container and work with NICPeers only.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This transport allows to connect a QEMU nic to a static Ethernet
over L2TPv3 tunnel. The transport supports all options present
in the Linux kernel implementation. It allows QEMU to connect
to any Linux host running kernel 3.3+, most routers and network
devices as well as other QEMU instances.
[Fixed up net_client_init1() switch statement to support -netdev
--Stefan]
Signed-off-by: Anton Ivanov <antivano@cisco.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When running a libvirt test suite I've noticed the qemu-img is
crashing occasionally. Tracing the problem down led me to the
following valgrind output:
qemu.git $ valgrind -q ./qemu-img create -f qed -obacking_file=/dev/null,backing_fmt=raw qed
==14881== Invalid write of size 8
==14881== at 0x1D263F: qemu_opts_create (qemu-option.c:692)
==14881== by 0x130782: bdrv_img_create (block.c:5531)
==14881== by 0x118DE0: img_create (qemu-img.c:462)
==14881== by 0x11E7E4: main (qemu-img.c:2830)
==14881== Address 0x11fedd38 is 24 bytes inside a block of size 232 free'd
==14881== at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14881== by 0x592D35E: g_realloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==14881== by 0x1D38D8: qemu_opts_append (qemu-option.c:1129)
==14881== by 0x13075E: bdrv_img_create (block.c:5528)
==14881== by 0x118DE0: img_create (qemu-img.c:462)
==14881== by 0x11E7E4: main (qemu-img.c:2830)
==14881==
Formatting 'qed', fmt=qed size=0 backing_file='/dev/null' backing_fmt='raw' cluster_size=65536
==14881== Invalid write of size 8
==14881== at 0x1D28BE: qemu_opts_del (qemu-option.c:750)
==14881== by 0x130BF3: bdrv_img_create (block.c:5638)
==14881== by 0x118DE0: img_create (qemu-img.c:462)
==14881== by 0x11E7E4: main (qemu-img.c:2830)
==14881== Address 0x11fedd38 is 24 bytes inside a block of size 232 free'd
==14881== at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14881== by 0x592D35E: g_realloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==14881== by 0x1D38D8: qemu_opts_append (qemu-option.c:1129)
==14881== by 0x13075E: bdrv_img_create (block.c:5528)
==14881== by 0x118DE0: img_create (qemu-img.c:462)
==14881== by 0x11E7E4: main (qemu-img.c:2830)
==14881==
The problem is apparently in the qemu_opts_append(). Well, if it
gets called twice or more. On the first call, when @dst is NULL
some initialization is done during which @dst->head list gets
initialized. The list is initialized in a way, so that the list
tail points at the list head. However, the next time
qemu_opts_append() is called for new options to be added,
g_realloc() may move @dst to a new address making the old list tail
point at an invalid address. If that's the case, we must update the
list pointers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A gcc codegen bug in x86_64-w64-mingw32-gcc (GCC) 4.6.3 means that
non-debug builds of QEMU for Windows tend to assert when using
coroutines. Work around this by marking qemu_coroutine_switch
as noinline.
If we allow gcc to inline qemu_coroutine_switch into
coroutine_trampoline, then it hoists the code to get the
address of the TLS variable "current" out of the while() loop.
This is an invalid transformation because the SwitchToFiber()
call may be called when running thread A but return in thread B,
and so we might be in a different thread context each time
round the loop. This can happen quite often. Typically.
a coroutine is started when a VCPU thread does bdrv_aio_readv:
VCPU thread
main VCPU thread coroutine I/O coroutine
bdrv_aio_readv ----->
start I/O operation
thread_pool_submit_co
<------------ yields
back to emulation
Then I/O finishes and the thread-pool.c event notifier triggers in
the I/O thread. event_notifier_ready calls thread_pool_co_cb, and
the I/O coroutine now restarts *in another thread*:
iothread
main iothread coroutine I/O coroutine (formerly in VCPU thread)
event_notifier_ready
thread_pool_co_cb -----> current = I/O coroutine;
call AIO callback
But on Win32, because of the bug, the "current" being set here the
current coroutine of the VCPU thread, not the iothread.
noinline is a good-enough workaround, and quite unlikely to break in
the future.
(Thanks to Paolo Bonzini for assistance in diagnosing the problem
and providing the detailed example/ascii art quoted above.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403535303-14939-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
X86CPU
* Filter out MONITOR for KVM
* Fix filtering for TCG
* -cpu foo,check and -cpu foo,enforce support for TCG
* -cpu host migration support (-cpu host,migratable=no to disable)
* Add invtsc feature support
* New model: Broadwell
# gpg: Signature made Wed 25 Jun 2014 22:55:04 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg: aka "Andreas Färber <afaerber@suse.com>"
* remotes/afaerber/tags/qom-cpu-for-2.1:
target-i386: Broadwell CPU model
target-i386: Fix indentation of CPU model definitions
target-i386: Support "invariant tsc" flag
target-i386: block migration and savevm if invariant tsc is exposed
savevm: check vmsd for migratability status
target-i386: Set migratable=yes by default on "host" CPU mooel
target-i386: Add "migratable" property to "host" CPU model
target-i386: Support check/enforce flags in TCG mode, too
target-i386: Loop-based feature word filtering in TCG mode
target-i386: Loop-based copying and setting/unsetting of feature words
target-i386: Define TCG_*_FEATURES earlier in cpu.c
target-i386: Filter KVM and 0xC0000001 features on TCG
target-i386: Filter FEAT_7_0_EBX TCG features too
target-i386: Make TCG feature filtering more readable
target-i386: Isolate KVM-specific code on CPU feature filtering logic
target-i386: Pass FeatureWord argument to report_unavailable_features()
target-i386: Merge feature filtering/checking functions
target-i386: Simplify reporting of unavailable features
target-i386: kvm: Don't enable MONITOR by default on any CPU model
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The only semantic change is that bs->open_flags gets BDRV_O_PROTOCOL set
now. This isn't useful, but it doesn't hurt either. The code that was
previously skipped by 'goto done' is automatically disabled because
protocol drivers don't support backing files (and if they did, this
would probably be a fix) and can't have snapshot_flags set.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).
This is obvious nonsense and should be rejected.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This recursion was introduced in commit 505d7583 in order to allow
nesting image formats. It only ever takes effect when the user
explicitly specifies a driver name and that driver isn't suitable for
the protocol level.
We can check this earlier in bdrv_open() and if the explicitly
requested driver is a format driver, clear BDRV_O_PROTOCOL so that
another bs->file layer is opened.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
It doesn't do much any more, we can move the code to bdrv_open() now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
This moves the bdrv_open_file() call a bit down so that it can use the
bdrv_open() code that selects the right block driver.
The code between the old and the new call site is either common code
(the error message for an unknown driver has been unified now) or
doesn't run with cleared BDRV_O_PROTOCOL (added an if block in one
place, whereas the right path was already asserted in another place)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
The "driver" entry in the options QDict is now only missing if we're
opening an image with format probing.
We also catch cases now where both the drv argument and a "driver"
option is specified, e.g. by specifying -drive format=qcow2,driver=raw
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The idea of bdrv_fill_options() is to convert every parameter for
opening images, in particular the filename and flags, to entries in the
options QDict.
This patch starts with moving the filename parsing and driver probing
part from bdrv_file_open() to the new function.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
upcoming libnfs will feature internal readahead support.
Add a knob to pass the optional readahead value as a URL
parameter.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this patch fixes the incorrect usage of strncmp and
adds simple error checking by means of parse_uint_full
instead of atoi for the supplied URL parameters.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All behavior and invariant should hold for images with 0 length, so
add a class to repeat all the tests in TestSingleDrive.
Hide two unapplicable test methods that would fail with 0 image length
because it's also used as cluster size.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There should be a BLOCK_JOB_READY event with active commit, regardless
of image length. Let's test the 0 length image case, and make sure it
goes through the ready->complete process.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When mirroring or active committing a zero length image, BLOCK_JOB_READY
is not reported now, instead the job completes because we short circuit
the mirror job loop.
This is inconsistent with non-zero length images, and only confuses
management software.
Let's do the same thing when seeing a 0-length image: report ready
immediately; wait for block-job-cancel or block-job-complete; clear the
cancel flag as existing non-zero image synced case (cancelled after
ready); then jump to the exit.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will unset busy flag and put coroutine to sleep, can be used to
wait for QMP complete/cancel.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 12:12:22 +02:00
349 changed files with 10564 additions and 5025 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.