Compare commits

...

519 Commits

Author SHA1 Message Date
Michael Roth
3cb451edb2 Update version for v2.1.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 14:30:45 -05:00
Eduardo Habkost
82d80e1f0b target-i386: Support migratable=no properly
When the "migratable" property was implemented, the behavior was tested
by changing the default on the code, but actually using the option on
the command-line (e.g. "-cpu host,migratable=false") doesn't work as
expected. This is a regression for a common use case of "-cpu host",
which is to enable features that are supported by the host CPU + kernel
before feature-specific code is added to QEMU.

Fix this by initializing the feature words for "-cpu host" on
x86_cpu_parse_featurestr(), right after parsing the CPU options.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 4d1b279b06)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Pavel Dovgaluk
5dd076a9f8 exec: Save CPUState::exception_index field
This patch adds a subsection with exception_index field to the VMState for
correct saving the CPU state.
Without this patch, simulator could miss the pending exception in the saved
virtual machine state.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 6c3bff0ed8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Sebastian Tanase
257e9cfce2 pty: Fix byte loss bug when connecting to pty
When trying to print data to the pty, we first check if it is connected.
If not, we try to reconnect, but we drop the pending data even if we
have successfully reconnected; this makes us lose the first byte of the very
first transmission.
This small fix addresses the issue by checking once more if the pty is connected
after having tried to reconnect.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cf7330c759)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Gerd Hoffmann
1aa87d3689 spice: make sure we don't overflow ssd->buf
Related spice-only bug.  We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card.  It's also used with qxl in vga mode.

When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer.  In theory the guest can write,
indirectly via spice-server.  The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.

Fix that by switching to dynamic allocation for the buffer.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit ab9509ccea)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Gerd Hoffmann
7fe5418d9f vbe: rework sanity checks
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers.  Call that
unconditionally on every register write.  That way we should catch
everything, even changing one register affecting the valid range of
another register.

Some of the holes have been added by commit
e9c6149f6a.  Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.

Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.

Security impact:

(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source  ->  host memory leak.  Memory isn't leaked to
the guest but to the vnc client though.

(2) Qemu will segfault in case the memory range happens to include
unmapped areas  ->  Guest can DoS itself.

The guest can not modify host memory, so I don't think this can be used
by the guest to escape.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Gerd Hoffmann
c5042f04f7 vbe: make bochs dispi interface return the correct memory size with qxl
VgaState->vram_size is the size of the pci bar.  In case of qxl not the
whole pci bar can be used as vga framebuffer.  Add a new variable
vbe_size to handle that case.  By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.

This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit 54a85d4624)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
cf29a88391 virtio-net: purge outstanding packets when starting vhost
whenever we start vhost, virtio could have outstanding packets
queued, when they complete later we'll modify the ring
while vhost is processing it.

To prevent this, purge outstanding packets on vhost start.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 086abc1ccd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
08743db463 net: complete all queued packets on VM stop
This completes all packets, ensuring that callbacks
will not run when VM is stopped.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ca77d85e1d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
d9c06c0d79 net: invoke callback when purging queue
devices rely on packet callbacks eventually running,
but we violate this rule whenever we purge the queue.
To fix, invoke callbacks on all packets on purge.
Set length to 0, this way callers can detect that
this happened and re-queue if necessary.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 07d8084624)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
f321710cd4 virtio: don't call device on !vm_running
On vm stop, virtio changes vm_running state
too soon, so callbacks can get envoked with
vm_running = false;

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 269bd822e7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
zhanghailiang
ec48bfd57b net: Forbid dealing with packets when VM is not running
For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.

If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.

To avoid this, we forbid receiving packets in generic net code when
VM is not running.

Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e1d64c084b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
zhanghailiang
eb36f79d59 acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.

The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.

If we use cluster destination model, the problem will be solved.

Note:
This flag was introduced later than ACPI v1.0 specification while QEMU
generates v1.0 tables only, but...

linux kernel ignores this flag, so patch has no influence on it.

Tested with Win[XPsp3|Srv2003EE|Srv2008DC|Srv2008R2|Srv2012R2], there
isn't BSODs and guests boot just fine. In cases guest doesn't support
cpu-hotplug, cpu becomes visible after reboot and in case the guest
supports cpu-hotplug, it works as expected with this patch.

Cc: qemu-stable@nongnu.org
Signed-off-by: huangzhichao <huangzhichao@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit 07b81ed937)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
34d41c1a20 vhost-scsi: init backend features earlier
As vhost core can use backend_features during init, clear it earlier to
avoid using uninitialized memory.
This use would be harmless since vhost scsi ignores the result
anyway, but initializing earlier will help prevent valgrind errors,
and make scsi and net behave similarly.

Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3a1655fc53)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Jason Wang
6f8d05a8f8 vhost_net: init acked_features to backend_features
commit 2e6d46d77e (vhost: add
vhost_get_features and vhost_ack_features) removes the step that
initializes the acked_features to backend_features.

As this field is now uninitialized, vhost initialization will sometimes
fail.

To fix, initialize acked_features on each ack.

Tested-by: Andrey Korolyov <andrey@xdel.ru>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b49ae9138d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Jason Wang
5e83dae44e vhost_net: start/stop guest notifiers properly
commit a9f98bb5eb "vhost: multiqueue
support" changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.

To fix this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.

MST: fix up error handling.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <william@gandi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cd7d1d26b0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Knut Omang
ff34ca00fd pci: avoid losing config updates to MSI/MSIX cap regs
Since
commit 95d6580024
    msi: Invoke msi/msix_write_config from PCI core
msix config writes are lost, the value written is always 0.

Fix pci_default_write_config to avoid this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d7efb7e08e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Michael S. Tsirkin
e685d2abf7 virtio-net: don't run bh on vm stopped
commit 783e770693
    virtio-net: stop/start bh when appropriate

is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e8bcf84200)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Gerd Hoffmann
67cfda8776 qxl-render: add more sanity checks
Damn, the dirty rectangle values are signed integers.  So the checks
added by commit 788fbf042f are not good
enough, we also have to make sure they are not negative.

[ Note: There must be something broken in spice-server so we get
  negative values in the first place.  Bug opened:
  https://bugzilla.redhat.com/show_bug.cgi?id=1135372 ]

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 503b3b33fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Peter Maydell
4fd144f8f5 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
We implement the crypto extensions but were incorrectly reporting
ID register values for the Cortex-A57 which did not advertise
crypto. Use the correct values as described in the TRM.
With this fix Linux correctly detects presence of the crypto
features and advertises them in /proc/cpuinfo.

Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408718660-7295-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c379621451)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Peter Maydell
ea774b8dd0 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
Commit 2c7ffc414 added support for honouring the CPACR coprocessor
access control register bits which may disable access to VFP
and Neon instructions. However it failed to account for the
fact that the CPACR is only present starting from the ARMv6
architecture version, so it accidentally disabled VFP completely
for ARMv5 CPUs like the ARM926. Linux would detect this as
"no VFP present" and probably fall back to its own emulation,
but other guest OSes might crash or misbehave.

This fixes bug LP:1359930.

Reported-by: Jakub Jermar <jakub@jermar.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408714940-7192-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit ed1f13d607)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Alex Williamson
3e8966df02 x86: Clear MTRRs on vCPU reset
The SDM specifies (June 2014 Vol3 11.11.5):

    On a hardware reset, the P6 and more recent processors clear the
    valid flags in variable-range MTRRs and clear the E flag in the
    IA32_MTRR_DEF_TYPE MSR to disable all MTRRs. All other bits in the
    MTRRs are undefined.

We currently do none of that, so whatever MTRR settings you had prior
to reset is what you have after reset.  Usually this doesn't matter
because KVM often ignores the guest mappings and uses write-back
anyway.  However, if you have an assigned device and an IOMMU that
allows NoSnoop for that device, KVM defers to the guest memory
mappings which are now stale after reset.  The result is that OVMF
rebooting on such a configuration takes a full minute to LZMA
decompress the firmware volume, a process that is nearly instant on
the initial boot.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9db2efd95e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Alex Williamson
ba8576f338 x86: kvm: Add MTRR support for kvm_get|put_msrs()
The MTRR state in KVM currently runs completely independent of the
QEMU state in CPUX86State.mtrr_*.  This means that on migration, the
target loses MTRR state from the source.  Generally that's ok though
because KVM ignores it and maps everything as write-back anyway.  The
exception to this rule is when we have an assigned device and an IOMMU
that doesn't promote NoSnoop transactions from that device to be cache
coherent.  In that case KVM trusts the guest mapping of memory as
configured in the MTRR.

This patch updates kvm_get|put_msrs() so that we retrieve the actual
vCPU MTRR settings and therefore keep CPUX86State synchronized for
migration.  kvm_put_msrs() is also used on vCPU reset and therefore
allows future modificaitons of MTRR state at reset to be realized.

Note that the entries array used by both functions was already
slightly undersized for holding every possible MSR, so this patch
increases it beyond the 28 new entries necessary for MTRR state.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d1ae67f626)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Alex Williamson
07f8c97f84 x86: Use common variable range MTRR counts
We currently define the number of variable range MTRR registers as 8
in the CPUX86State structure and vmstate, but use MSR_MTRRcap_VCNT
(also 8) to report to guests the number available.  Change this to
use MSR_MTRRcap_VCNT consistently.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d8b5c67b05)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
William Grant
72c9c9a05e target-i386: Don't forbid NX bit on PAE PDEs and PTEs
Commit e8f6d00c30 ("target-i386: raise
page fault for reserved physical address bits") added a check that the
NX bit is not set on PAE PDPEs, but it also added it to rsvd_mask for
the rest of the function. This caused any PDEs or PTEs with NX set to be
erroneously rejected, making PAE guests with NX support unusable.

Signed-off-by: William Grant <wgrant@ubuntu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1844e68eca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Paolo Bonzini
3d8cc86e4f vl: process -object after other backend options
QOM backends can refer to chardevs, but not vice versa.  So
process -chardev and -fsdev options before -object

This fixes the rng-egd backend to virtio-rng.

Reported-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7b71758d79)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:56 -05:00
Greg Kurz
0824ca6bd1 spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8c46f7ec85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:28 -05:00
Stefan Hajnoczi
feb633411f thread-pool: avoid deadlock in nested aio_poll() calls
The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:

  If element A's callback waits for element B using aio_poll() it will
  deadlock since pool->completion_bh is not marked scheduled when the
  nested aio_poll() runs.

Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing.  This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3c80ca158c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:06 -05:00
Stefan Hajnoczi
75ada6b763 thread-pool: avoid per-thread-pool EventNotifier
EventNotifier is implemented using an eventfd or pipe.  It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.

Switch from EventNotifier to QEMUBH in thread-pool.c.  Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c2e50e3d11)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:06 -05:00
Michael S. Tsirkin
be3af755ac pc: reserve more memory for ACPI for new machine types
commit 868270f23d
    acpi-build: tweak acpi migration limits
broke kernel loading with -kernel/-initrd: it doubled
the size of ACPI tables but did not reserve
enough memory.

As a result, issues on boot and halt are observed.

Fix this up by doubling reserved memory for new machine types.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 927766c7d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Gonglei
bfe3e6f5e3 pcihp: fix possible array out of bounds
Prevent out-of-bounds array access on
acpi_pcihp_pci_status.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
(cherry picked from commit fa365d7cd1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Michael S. Tsirkin
cd4acff8d0 hostmem: set MPOL_MF_MOVE
When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with mlock=on
realtime feature.

The code comment actually says: "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 288d332202)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Ben Draper
4b59161253 vmxnet3: Pad short frames to minimum size (60 bytes)
When running VMware ESXi under qemu-kvm the guest discards frames
that are too short. Short ARP Requests will be dropped, this prevents
guests on the same bridge as VMware ESXi from communicating. This patch
simply adds the padding on the network device itself.

Signed-off-by: Ben Draper <ben@xrsa.net>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 40a87c6c9b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Fam Zheng
fab7560c35 blkdebug: Delete BH in bdrv_aio_cancel
Otherwise error_callback_bh will access the already released acb.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cbf95a0b11)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Stefan Hajnoczi
16c92cd629 qemu-iotests: add test case 101 for short file I/O
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8d9eb33ca0)

Conflicts:
	tests/qemu-iotests/group

*fix up context mismatches due to lack of 099 and 103 tests

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Stefan Hajnoczi
dea6efe883 raw-posix: fix O_DIRECT short reads
The following O_DIRECT read from a <512 byte file fails:

  $ truncate -s 320 test.img
  $ qemu-io -n -c 'read -P 0 0 512' test.img
  qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument

Note that qemu-io completes successfully without the -n (O_DIRECT)
option.

This patch fixes qemu-iotests ./check -nocache -vmdk 059.

Cc: qemu-stable@nongnu.org
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 61ed73cff4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Peter Lieven
8c4edd743c block/iscsi: fix memory corruption on iscsi resize
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d832fb4d66)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Christoffer Dall
504e2a7139 arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.

This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT.  Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.

Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts.  After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.

Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only).  Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels.  Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 863714ba6c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Christoffer Dall
2f6d5e1c9c target-arm: Rename QEMU PSCI v0.1 definitions
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>.  To build using these defines in non-KVM code,
QEMU defines these IDs locally and check their correctness against the
KVM headers when those are available.

However, the naming scheme used for QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux so to avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prependend and clearly identify the PSCI version as v0.1 in the
defines.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a65c9c17ce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Peter Maydell
20463dc874 target-arm: Fix return address for A64 BRK instructions
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.

(We do get this correct for the A32/T32 BKPT insns.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 229a138d74)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
zhanghailiang
2a575c450e virtio-blk: fix reference a pointer which might be freed
In function virtio_blk_handle_request, it may freed memory pointed by req,
So do not access member of req after calling this function.

Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1bdb176ac5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Michael S. Tsirkin
1ad9dcec47 acpi: align RSDP
RSDP should be aligned at a 16-byte boundary.
This would by chance at the moment, fix up acpi build
to make it robust.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit d67aadccfa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Hu Tao
ba1bc81991 numa: show hex number in error message for consistency and prefix them with 0x
The error messages before and after patch are:

before:
qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)

after:
qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit c68233aee8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Michael S. Tsirkin
948574e0d2 pc-dimm: fix up error message
- int should be printed using %d
- print actual wrong value for property

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 988eba0f68)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Hu Tao
044af98ea8 pc-dimm: validate node property
If user specifies a node number that exceeds the available numa nodes in
emulated system for pc-dimm device, the device will report an invalid _PXM
to OSPM. Fix this by checking the node property value.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cfe0ffd027)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Hu Tao
7c68c5402a hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 41d2f71376)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Michael Tokarev
bd4740621c ide: only constrain read/write requests to drive size, not other types
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size.  However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1.  So check the range only for reads and writes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d66168ed68)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:58:56 -05:00
Michael Tokarev
e22d5dc073 l2tpv3 (configure): it is linux-specific
Some non-linux systems, for example a system with
FreeBSD kernel and glibc, may declare struct mmsghdr
(in glibc) but may not have linux-specific header
file linux/ip.h.  The actual implementation in qemu
includes this linux-specific header file unconditionally,
so compilation fails if it is not present.  Include
this header in the configure test too.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bff6cb7296)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:57:28 -05:00
Alex Williamson
dfd4808222 vfio: Fix MSI-X vector expansion
When new MSI-X vectors are enabled we need to disable MSI-X and
re-enable it with the correct number of vectors.  That means we need
to reprogram the eventfd triggers for each vector.  Prior to f4d45d47
vector->use tracked whether a vector was masked or unmasked and we
could always pick the KVM path when available for unmasked vectors.
Now vfio doesn't track mask state itself and vector->use and virq
remains configured even for masked vectors.  Therefore we need to ask
the MSI-X code whether a vector is masked in order to select the
correct signaling path.  As noted in the comment, MSI relies on
hardware to handle masking.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # QEMU 2.1
(cherry picked from commit c048be5cc9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:48:12 -05:00
Stefan Hajnoczi
5f26e63b17 qdev-monitor: include QOM properties in -device FOO, help output
Update -device FOO,help to include QOM properties in addition to qdev
properties.  Devices are gradually adding more QOM properties that are
not reflected as qdev properties.

It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.

This patch reuses the device-list-properties QMP machinery to avoid code
duplication.

Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
(cherry picked from commit ef523587da)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:46:19 -05:00
Stefan Hajnoczi
42f7a13178 qmp: hide "hotplugged" device property from device-list-properties
The "hotplugged" device property was not reported before commit
f4eb32b590 ("qmp: show QOM properties in
device-list-properties").  Fix this difference.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 4115dd6527)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:46:01 -05:00
Peter Maydell
541bbb07eb Update version for v2.1.0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-01 13:31:29 +01:00
Peter Maydell
d24e780427 Update version for v2.1.0-rc5 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 18:23:34 +01:00
Andrew Jones
1373e140f0 hw/arm/virt: fix pl031 addr typo
pl031's base address should be 0x9010000, not 0x90010000, otherwise
it sits in ram when configuring a guest with greater than 1G.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 17:40:42 +01:00
Peter Maydell
8a2ca741ab Update version for v2.1.0-rc4 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 13:45:10 +01:00
Paolo Bonzini
1d80eb7a68 po: update Italian translation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 13:23:33 +01:00
Aurelien Jarno
41892faf89 po: Update French translation
Add new translations for recently added messages.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 13:23:18 +01:00
Peter Maydell
04d1d6613f Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc migration fixes

Last minute fixes for migration.
It seems that if we don't fix it now, fixing
it in the next version will be even more painful ...

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 29 Jul 2014 11:45:18 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  piix: set legacy table size for 1.7
  acpi-build: tweak acpi migration limits
  pc: future-proof migration-compatibility of ACPI tables
  acpi-build: minor code cleanup
  pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled
  bios-tables-test: fix ASL normalization false positive
  pc: hack for migration compatibility from QEMU 2.0
  acpi-dsdt: procedurally generate _PRT

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 12:04:02 +01:00
Michael S. Tsirkin
f47337cb91 piix: set legacy table size for 1.7
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Michael S. Tsirkin
868270f23d acpi-build: tweak acpi migration limits
- Tweak error message for legacy machine type:
  Basically if table size exceeds the limits we set all
  bets are off for migration: e.g. it can start failing even
  within given qemu minor version simply because of a bugfix.
- Increase table size to 128k.
- Make sure we notice it long before we start getting close to the
  128k limit: warn at 64k.
- Don't fail if we exceed the limit: most people don't care about
  migration, even less people care about cross version miration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 12:26:12 +02:00
Paolo Bonzini
18045fb9f4 pc: future-proof migration-compatibility of ACPI tables
This patch avoids that similar changes break QEMU again in the future.
QEMU will now hard-code 64k as the maximum ACPI table size, which
(despite being an order of magnitude smaller than 640k) should be enough
for everyone.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Michael S. Tsirkin
093a35e5fc acpi-build: minor code cleanup
Fix up and add  comments to clarify code, plus a trivial
code change for clarity.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Igor Mammedov
133a2da488 pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled
Fixes migration regression from QEMU-1.7 to a newer QEMUs.
SSDT table size in QEMU-1.7 doesn't change regardless of
a number of PCI bridge devices present at startup.

However in QEMU-2.0 since addition of hotplug on PCI bridges,
each PCI bridge adds ~1875 bytes to SSDT table, including
pc-i440fx-1.7 machine type where PCI bridge hotplug disabled
via compat property.
It breaks migration from "QEMU-1.7" to "QEMU-2.[01] -M pc-i440fx-1.7"
since RAMBlock size of ACPI tables on target becomes larger
then on source and migration fails with:

"Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000"

error.

Fix this by generating AML only for PCI0 bus if
hotplug on PCI bridges is disabled and preserves PCI brigde
description in AML as it was done in QEMU-1.7 for pc-i440fx-1.7.

It will help to maintain size of SSDT static regardless of
number of PCI bridges on startup for pc-i440fx-1.7 machine type.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Paolo Bonzini
cb348985ab bios-tables-test: fix ASL normalization false positive
My version of IASL (from RHEL7) puts two newlines between the head comment
and the DefinitionBlock property.  Kill all newlines after the comment,
so that normalize_asl works properly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-07-29 12:26:12 +02:00
Stefan Weil
41a1a9c42c po: Update German translation
Line numbers changed, and some translations were missing after commit
3d914488ae.

Update also "Show Tabs" to a more common translation, and remove some
old unused lines at the end.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-07-28 23:37:17 +02:00
Dongxue Zhang
62eb3b9a34 target-mips/translate.c: Free TCG in OPC_DINSV
Free t0 and t1 in opcode OPC_DINSV.

Signed-off-by: Dongxue Zhang <elta.era@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-07-28 23:37:15 +02:00
Paolo Bonzini
07fb61760c pc: hack for migration compatibility from QEMU 2.0
Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes on how horribly we were breaking things in
2.0 already.

The ACPI table size is rounded to the next 4k, which one would think
gives some headroom.  In practice this is not the case, because the user
can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
8 to the MADT) and so some "-smp" values will break the 4k boundary and
fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.

This patch concerns itself with fixing migration from QEMU 2.0.  It
computes the payload size of QEMU 2.0 and always uses that one.
The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size
should always be enough; non-AML tables can change depending on the
configuration (especially MADT, SRAT, HPET) but they remain the same
between QEMU 2.0 and 2.1, so we only compute our padding based on the
sizes of the SSDT and DSDT.

Migration from QEMU 1.7 should work for guests that have a number of CPUs
other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
broken from QEMU 1.7 to QEMU 2.0 in the same way, though.

Even with this patch, QEMU 1.7 and 2.0 have two different ideas of
"-M pc-i440fx-2.0" when there are PCI bridges.  Igor sent a patch to
adopt the QEMU 1.7 definition.  I think distributions should apply
it if they move directly from QEMU 1.7 to 2.1+ without ever packaging
version 2.0.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-28 23:02:39 +02:00
Paolo Bonzini
acd727e7cb acpi-dsdt: procedurally generate _PRT
This replaces the _PRT constant with a method that computes it.

The problem is that the DSDT+SSDT have grown from 2.0 to 2.1,
enough to cross the 8k barrier (we align the ACPI tables to 4k
before putting them in fw_cfg).  This causes problems with
migration and the pc-i440fx-2.0 machine type.

The solution to the problem is to hardcode 64k as the limit,
but this doesn't solve the bug with pc-i440fx-2.0.  The fix will be
for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the
ACPI tables.  First, however, we must make the actual AML
equal or smaller; to do this, rewrite _PRT in a way that saves
over 1k of bytecode.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-28 23:02:39 +02:00
Peter Maydell
f45c56e016 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-07-26' into staging
trivial patches for 2014-07-26

# gpg: Signature made Sat 26 Jul 2014 08:16:55 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-07-26:
  qemu-options: fix another allows-to for -net l2tpv3

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-28 11:05:14 +01:00
Michael Tokarev
2f47b403bd qemu-options: fix another allows-to for -net l2tpv3
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-26 11:16:44 +04:00
Peter Maydell
c60a57ff49 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Here is the serial fix for 2.1.

# gpg: Signature made Fri 25 Jul 2014 13:36:23 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  qemu-char: ignore flow control if a PTY's slave is not connected

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-25 16:58:41 +01:00
Paolo Bonzini
62c339c527 qemu-char: ignore flow control if a PTY's slave is not connected
After commit f702e62 (serial: change retry logic to avoid concurrency,
2014-07-11), guest boot hangs if the backend is an unconnected PTY.

The reason is that PTYs do not support G_IO_HUP, and serial_xmit is
never called.  To fix this, simply invoke serial_xmit immediately
(via g_idle_source_new) when this happens.

Tested-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-25 14:36:07 +02:00
Peter Maydell
7f0b2ff724 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140725-1' into staging
vnc: fix two vnc update issues.

# gpg: Signature made Fri 25 Jul 2014 08:44:23 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140725-1:
  vnc update fix
  fix full frame updates for VNC clients

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-25 10:32:13 +01:00
Gerd Hoffmann
6365828003 vnc update fix
We need to remember has_updates for each vnc client.  Otherwise it might
happen that vnc_update_client(has_dirty=1) takes the first exit due to
output buffers not being flushed yet and subsequent calls with
has_dirty=0 take the second exit, wrongly assuming there is nothing to
do because the work defered in the first call is ignored.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-07-25 09:43:31 +02:00
Stephan Kulow
07535a8902 fix full frame updates for VNC clients
If the client asks for !incremental frame updates, it has lost its content
so dirty doesn't matter - it has to see the full frame, so setting force_update

Signed-off-by: Stephan Kulow <coolo@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-07-25 09:42:56 +02:00
Peter Maydell
3b25748663 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  docs: document missing VSERPORT_CHANGE event
  docs: document missing POWERDOWN event
  docs: document missing SPICE_MIGRATE_COMPLETED event
  docs: split SPICE_* event docs
  docs: grammar fixes to qmp-events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-24 15:23:43 +01:00
Eric Blake
032baddea3 docs: document missing VSERPORT_CHANGE event
The VSERPORT_CHANGE event was added in e2ae6159.  The patch for
this event was prepared at a time when this file was gone, even
though it got applied immediately after dfab4892 restored this
file.  Duplicate the documentation into this file, so that
anyone using this file instead of qapi will not miss out on this
new event.

* docs/qmp/qmp-events.txt (VSERPORT_CHANGE): Add.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 10:00:33 -04:00
Eric Blake
db52658b38 docs: document missing POWERDOWN event
The POWERDOWN event was first documented in 0aab9ec3.  But since
dfab4892 later restored this file to the state prior to qmp events,
and we never documented it in the past, anyone using this file
instead of qapi will miss out on this event.  Tweak the existing
wording of SHUTDOWN to match 84321831, and make the difference
between the two events apparent.

* docs/qmp/qmp-events.txt (POWERDOWN): Add.
(SHUTDOWN): Tweak.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 10:00:33 -04:00
Eric Blake
5e255004f5 docs: document missing SPICE_MIGRATE_COMPLETED event
The SPICE_MIGRATE_COMPLETED event was first documented in
7cfadb6b.  But since dfab4892 later restored this file to the
state prior to qmp events, and we never documented it in the
past, anyone using this file instead of qapi will miss out on
this event.

* docs/qmp/qmp-events.txt (SPICE_MIGRATE_COMPLETED): Add.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 10:00:15 -04:00
Eric Blake
f8ecd94501 docs: split SPICE_* event docs
For consistency with the rest of this file, every event should be
listed in isolation.  Compare how commit 7cfadb6b split
SPICE_CONNECTED and SPICE_DISCONNECTED into separate qmp events.

* docs/qmp/qmp-events.txt (SPICE_CONNECTED, SPICE_DISCONNECTED):
Split.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 09:59:20 -04:00
Eric Blake
1454ac68af docs: grammar fixes to qmp-events
When converting to qmp events, commits 7cfadb6b and a6330785
fixed some grammar as part of moving text between files.  But
since dfab4892 later restored this file to the state prior to
qmp events, we have to do it again.

* docs/qmp/qmp-events.txt (RESET, SPICE_INITIALIZED): Tweak.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 09:58:51 -04:00
Peter Maydell
a537d373b9 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140723-1' into staging
usb: mtp: tag root property as experimental

# gpg: Signature made Wed 23 Jul 2014 07:56:21 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140723-1:
  usb: mtp: tag root property as experimental

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-24 12:49:54 +01:00
Gerd Hoffmann
cf679caf91 usb: mtp: tag root property as experimental
Reason: we don't want commit to that interface yet.  Possibly
the implementation will be switched over to use fsdev.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-23 08:55:40 +02:00
Peter Maydell
f368c33d5a Update version for v2.1.0-rc3 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 18:17:03 +01:00
Peter Maydell
ef493d5c29 hw/misc/imx_ccm.c: Add missing VMState list terminator
The VMStateDescription for the imx_ccm device was missing its
terminator. Found by static search of the codebase using
a regex based on one suggested by Ian Jackson:
  pcregrep -rMi '(?s)VMStateField(?:(?!END_OF_LIST).)*?;' $(git grep -l 'VMStateField\[\]')

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
2014-07-22 17:53:36 +01:00
Laszlo Ersek
3afca1d6d4 vmstate_xhci_event: fix unterminated field list
"vmstate_xhci_event" was introduced in commit 37352df3 ("xhci: add live
migration support"), and first released in v1.6.0. The field list in this
VMSD is not terminated with the VMSTATE_END_OF_LIST() macro.

During normal use (ie. migration), the issue is practically invisible,
because the "vmstate_xhci_event" object (with the unterminated field list)
is only ever referenced -- via "vmstate_xhci_intr" -- if xhci_er_full()
returns true, for the "ev_buffer" test. Since that field_exists() check
(apparently) almost always returns false, we almost never traverse
"vmstate_xhci_event" during migration, which hides the bug.

However, Amit's vmstate checker forces recursion into this VMSD as well,
and the lack of VMSTATE_END_OF_LIST() breaks the field list terminator
check (field->name != NULL) in dump_vmstate_vmsd(). The result is
undefined behavior, which in my case translates to infinite recursion
(because the loop happens to overflow into "vmstate_xhci_intr", which then
links back to "vmstate_xhci_event").

Add the missing terminator.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 17:34:24 +01:00
Peter Maydell
3a18d44983 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-07-22

Only a single bug fix to make -mem-path only affect RAM regions.

# gpg: Signature made Tue 22 Jul 2014 16:38:04 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream:
  ppc: fix -mem-path failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 16:40:34 +01:00
Hu Tao
e206ad4833 ppc: fix -mem-path failure
commit e938ba0c tried to enable -mem-path for ppc but breaked some ppc
boards.

The problems are:

1. it fails when allocating memory for rom, sram whose sizes are less
   than huge page size:

   ./ppc-softmmu/qemu-system-ppc  -m 512 -mem-path /hugepages/ \
   -kernel /home/hutao/Downloads/vmlinux-ppc -initrd \
   /home/hutao/Downloads/initrd-ppc.gz
   qemu-system-ppc: /mnt/data/projects/qemu/exec.c:1184: qemu_ram_set_idstr: Assertion `new_block' failed.

2. if there is a numa node backed by memory backend object, qemu fails
   with message:

   ./ppc-softmmu/qemu-system-ppc  -m 512 \
   -object memory-backend-file,size=512M,mem-path=/hugepages,id=f0 \
   -numa node,nodeid=0,memdev=f0 \
   -kernel /home/hutao/Downloads/vmlinux-ppc \
   -initrd /home/hutao/Downloads/initrd-ppc.gz
   qemu-system-ppc: memory backend f0 is used multiple times. Each -numa option must use a different memdev value.

This patch does following:

1. replaces memory_region_allocate_system_memory() with
   memory_region_init_ram() for rom, sram. Then only system memory
   is backed by hugepages when specifying mem-path.

2. for memory banks, allocates all ram with
   one memory_region_allocate_system_memory(), and use
   memory_region_init_alias() to initialize memory banks.

Tested machines: default(g3beige), mac99, taihu, bamboo, ref405ep.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-22 17:37:25 +02:00
Peter Maydell
b64c670f1d Merge remote-tracking branch 'remotes/amit-virtio-rng/for-2.1' into staging
* remotes/amit-virtio-rng/for-2.1:
  virtio-rng: Add human-readable error message for negative max-bytes parameter

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 13:16:04 +01:00
John Snow
713e8a1022 virtio-rng: Add human-readable error message for negative max-bytes parameter
If a negative integer is used for the max_bytes parameter, QEMU currently
calls abort() and leaves behind a core dump. This patch replaces the
abort with a simple error message to make the reason for the termination
clearer. This also ensures device-hotplug with invalid input doesn't
cause qemu to quit.

There is an underlying insufficiency in the parameter parsing code of QEMU
that renders it unable to reject negative values for unsigned properties,
thus the error message "a non-negative integer below 2^63" is the most
user-friendly and correct message we can give until the underlying
insufficiency is corrected.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-07-22 17:18:55 +05:30
Peter Maydell
25af8e6b61 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
One of the two pending migration fix, and a small KVM patch.

# gpg: Signature made Tue 22 Jul 2014 11:49:30 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found

* remotes/bonzini/tags/for-upstream:
  kvm-all: Use 'tmpcpu' instead of 'cpu' in sub-looping to avoid 'cpu' be NULL
  exec: fix migration with devices that use address_space_rw

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 12:03:45 +01:00
Chen Gang
dc54e25253 kvm-all: Use 'tmpcpu' instead of 'cpu' in sub-looping to avoid 'cpu' be NULL
If kvm_arch_remove_sw_breakpoint() in CPU_FOREACH() always be fail, it
will let 'cpu' NULL. And the next kvm_arch_remove_sw_breakpoint() in
QTAILQ_FOREACH_SAFE() will get NULL parameter for 'cpu'.

And kvm_arch_remove_sw_breakpoint() can assumes 'cpu' must never be NULL,
so need define additional temporary variable for 'cpu' to avoid the case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-22 12:38:17 +02:00
Paolo Bonzini
6886867e98 exec: fix migration with devices that use address_space_rw
Devices that use address_space_rw to write large areas to memory
(as opposed to address_space_map/unmap) were broken with respect
to migration since fe680d0 (exec: Limit translation limiting in
address_space_translate to xen, 2014-05-07).  Such devices include
IDE CD-ROMs.

The reason is that invalidate_and_set_dirty (called by address_space_rw
but not address_space_map/unmap) was only setting the dirty bit for
the first page in the translation.

To fix this, introduce cpu_physical_memory_set_dirty_range_nocode that
is the same as cpu_physical_memory_set_dirty_range except it does not
muck with the DIRTY_MEMORY_CODE bitmap.  This function can be used if
the caller invalidates translations with tb_invalidate_phys_page_range.

There is another difference between cpu_physical_memory_set_dirty_range
and cpu_physical_memory_set_dirty_flag; the former includes a call
to xen_modified_memory.  This is handled separately in
invalidate_and_set_dirty, and is not needed in other callers of
cpu_physical_memory_set_dirty_range_nocode, so leave it alone.

Just one nit: now that invalidate_and_set_dirty takes care of handling
multiple pages, there is no need for address_space_unmap to wrap it
in a loop.  In fact that loop would now be O(n^2).

Reported-by: Dave Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-22 10:38:50 +02:00
Peter Maydell
35858955e6 Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-2.1' into staging
QOM and device refactorings

* Machine: Property name fixups for 2.1 ABI

# gpg: Signature made Mon 21 Jul 2014 18:00:23 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-2.1:
  machine: Replace underscores in machine's property names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-21 18:06:12 +01:00
Marcel Apfelbaum
b0ddb8bf6b machine: Replace underscores in machine's property names
Replaced '_' with '-' to comply with QOM guidelines.
Made the conversion from command line to QMP in vl.c.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-07-21 18:58:36 +02:00
Peter Maydell
147fc41973 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-07-18' into staging
trivial patches for 2014-07-18

# gpg: Signature made Fri 18 Jul 2014 15:04:43 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-07-18:
  tests: Add missing 'static' attributes (fix warnings from smatch)
  migration: Add missing 'static' attribute
  qga: Add missing 'static' attribute
  hw/usb: Add missing 'static' attribute
  doc: slirp supports ICMP echo if enabled in Linux
  qemu-img: Remove redundancy "ret = -1"
  Fix new typos in comments (found by codespell)
  slirp: Give error message if hostfwd_add/remove for unrecognized vlan/stack

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 16:59:29 +01:00
Peter Maydell
50a2c45da9 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Andreas's fixes to --enable-modules, two 2.1 regression fixes, and a
new qtest.  Michael sent a pull request of his own, so I dropped
the vhost changes.

# gpg: Signature made Fri 18 Jul 2014 14:30:34 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found

* remotes/bonzini/tags/for-upstream:
  Revert "kvmclock: Ensure time in migration never goes backward"
  Revert "kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation"
  module: Don't complain when a module is absent
  module: Simplify module_load()
  qtest: new test for wdt_ib700
  target-i386: Allow execute from user mode when SMEP is enabled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 14:46:53 +01:00
Stefan Weil
748bfb4eee tests: Add missing 'static' attributes (fix warnings from smatch)
Smatch also complains about 0 used for pointers, so replace those by
NULL in test-visitor-serialization.c, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
7a46d042e0 migration: Add missing 'static' attribute
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
13a439ec40 qga: Add missing 'static' attribute
This fixes a warning from the static code analysis (smatch).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
b9b45b4a88 hw/usb: Add missing 'static' attribute
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Gernot Hillier
37cbfcce14 doc: slirp supports ICMP echo if enabled in Linux
Since QEMU 0.15, slirp (user mode networking) supports ping to the
Internet, see e6d43cfb1f

Signed-off-by: Gernot Hillier <gernot.hillier@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Chen Gang
b847ae2d60 qemu-img: Remove redundancy "ret = -1"
In this case, 'ret' is already '-1', so need not do it again.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
a9dd38db68 Fix new typos in comments (found by codespell)
arbitary -> arbitrary
basicly -> basically

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:36 +04:00
Peter Maydell
b739ef05db slirp: Give error message if hostfwd_add/remove for unrecognized vlan/stack
If the user specified a (vlan ID, slirp stack name) tuple in a monitor
hostfwd_add/remove command and we can't find it, give the user an
error message rather than silently doing nothing.

This brings this error case in slirp_lookup() into line with the
other two.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:36 +04:00
Paolo Bonzini
fa666c10f2 Revert "kvmclock: Ensure time in migration never goes backward"
This reverts commit a096b3a673.

This patch caused a hang that was fixed by commit 9b17868 (kvmclock:
Ensure proper env->tsc value for kvmclock_current_nsec calculation,
2014-06-03), and we just had to revert that commit.  Drop this one
too.

Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 15:28:03 +02:00
Paolo Bonzini
108e4c3871 Revert "kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation"
This reverts commit 9b1786829a.

This patch fixed a hang introduced by commit a096b3a (kvmclock: Ensure
time in migration never goes backward, 2014-05-16), but it causes
a regression in migration whose cause is not quite clear.

Because of this, I'm choosing to revert both patches.  This trades a
2.1 regression for a bug that's been there forever.

Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 15:15:14 +02:00
Andreas Färber
bb2eb1892d module: Don't complain when a module is absent
The current implementation depends on a configure-time generated list of
block modules. When any of them is absent, module_load() emits a warning.

This is suboptimal because extracting code to modules was mainly done to
allow separate packaging of modules with intrusive dependencies. Absence
of optional packages then leads to absence of modules and an error
message, which users may recognize as new and report as error.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 14:57:35 +02:00
Andreas Färber
f9e13f8fd8 module: Simplify module_load()
The file path is not used for error reporting, so we can free it
directly after use.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 14:57:35 +02:00
Paolo Bonzini
f52b768782 qtest: new test for wdt_ib700
Since the "pause" watchdog action had a regression and it went
unnoticed for a while, let's add a test for it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 14:57:35 +02:00
Peter Maydell
e0097ea371 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 18 Jul 2014 13:39:43 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  qemu-iotests: fix 028 failure due to disk image path
  raw-posix: Fail gracefully if no working alignment is found
  block: Add Error argument to bdrv_refresh_limits()
  qcow2: Fix error path for unknown incompatible features

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 13:47:22 +01:00
Stefan Hajnoczi
8283c5c316 qemu-iotests: fix 028 failure due to disk image path
The disk image path is echoed by QEMU's readline when the "drive_backup
disk ${TEST_IMG}.copy" HMP command is issued.  Unfortunately it is very
hard to filter out the path due to readline's character-by-character
output (with terminal escape sequences).  Just redirect this command to
/dev/null for now.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2014-07-18 13:27:11 +01:00
Kevin Wolf
df26a35025 raw-posix: Fail gracefully if no working alignment is found
If qemu couldn't find out what O_DIRECT alignment to use with a given
file, it would run into assert(bdrv_opt_mem_align(bs) != 0); in block.c
and confuse users. This adds a more descriptive error message for such
cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf
3baca89139 block: Add Error argument to bdrv_refresh_limits()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf
12ac6d3db7 qcow2: Fix error path for unknown incompatible features
qcow2's report_unsupported_feature() had two bugs: A 32 bit truncation
would prevent feature table entries for bits 32-63 from being used, and
it could assign errp multiple times if there was more than one unknown
feature, resulting in an error_set() assertion failure.

Fix the truncation, make sure to set the error exactly once and add a
qemu-iotests case for it.

This fixes https://bugs.launchpad.net/qemu/+bug/1342704/

Reported-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:12:15 +01:00
Peter Maydell
4d121a5498 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,vhost,test fixes

Minor bugfixes all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 18 Jul 2014 00:43:04 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  vhost-user: minor cleanups
  qtest: Adapt vhost-user-test to latest vhost-user changes
  vhost-user: Fix VHOST_SET_MEM_TABLE processing
  qtest: fix vhost-user-test compilation with old GLib
  fix typo: apci -> acpi
  pc_piix: Reuse pc_compat_1_2() for pc-0.1[0123]
  pc: fix qemu exiting with error when -m X < 128 with old machines types

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 09:35:51 +01:00
Michael S. Tsirkin
cd98639f67 vhost-user: minor cleanups
assert to verify cast does not discard information
minor style fixup.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-18 02:22:24 +03:00
Nikolay Nikolaev
d6970e3b00 qtest: Adapt vhost-user-test to latest vhost-user changes
A new field mmap_offset was added in the vhost-user message, we need to reflect
this change in the test too.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-18 02:14:15 +03:00
Nikolay Nikolaev
f69a28051f vhost-user: Fix VHOST_SET_MEM_TABLE processing
qemu_get_ram_fd doesn't accept a guest physical address. ram_addr_t are
opaque values that are assigned in qemu_ram_alloc.

Find the ram_addr_t corresponding to the userspace_addr using qemu_ram_addr_from_host,
and then call qemu_get_ram_fd on it.

Thanks to Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 02:14:15 +03:00
Igor Mammedov
5734d031aa pc: fix qemu exiting with error when -m X < 128 with old machine types
If machine doesn't support memory hotplug then starting QEMU
with initial memory less than default will make QEMU exit with
following error message:

$QEMU -m 16  -M isapc
qemu-system-i386: "-memory 'slots|maxmem'" is not supported by: isapc

Set maxram_size to initial memory value before parsing
'maxmem' option allows to keep maxmem in sync with initial
memory size if no maxmem option was specified.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: Bruce Rogers <brogers@suse.com>
Reviewed-By: Bruce Rogers <brogers@suse.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 17:45:45 +01:00
KONRAD Frederic
af52fe862f cadence_uart: check for serial backend before using it.
This checks that s->chr is not NULL before using it.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 16:36:17 +01:00
Peter Maydell
231f6927c8 Merge remote-tracking branch 'remotes/amit-migration/for-2.1' into staging
* remotes/amit-migration/for-2.1:
  vmstate static checker: detect section renames

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 12:17:28 +01:00
Peter Maydell
104369c8c7 Merge remote-tracking branch 'remotes/amit/for-2.1' into staging
* remotes/amit/for-2.1:
  virtio-serial-bus: keep port 0 reserved for virtconsole even on unplug

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 11:18:51 +01:00
Amit Shah
57d84cf353 virtio-serial-bus: keep port 0 reserved for virtconsole even on unplug
We keep port 0 reserved for compat with older guests, where only
virtio-console was expected.  Even if a system is started without a
virtio-console port, port #0 is kept aside.  However, after a
virtconsole port is unplugged, port id 0 became available, and the next
hotplug of a virtserialport caused failure due to it not being a console
port.

Steps to reproduce:

$ ./x86_64-softmmu/qemu-system-x86_64 -m 512 -cpu host -enable-kvm -device virtio-serial-pci -monitor stdio  -vnc :1
QEMU 2.0.91 monitor - type 'help' for more information
(qemu) device_add virtconsole,id=p1
(qemu) device_del p1
(qemu) device_add virtserialport,id=p1
Port number 0 on virtio-serial devices reserved for virtconsole devices for backward compatibility.
Device 'virtserialport' could not be initialized
(qemu) quit

Reported-by: dengmin <mdeng@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-07-16 14:32:40 +05:30
Amit Shah
79fe16c048 vmstate static checker: detect section renames
Commit 292b1634 changed the section name of "ICH9 LPC" to "ICH9-LPC",
and that causes the static checker to flag this:

Section "ICH9 LPC" does not exist in dest

This patch introduces a function that checks for section renames and
also a dictionary that maps those renames.

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>

---
This is a small patch to a script; doesn't break qemu and helps with the
static checker, so it's a very low-risk patch for 2.1.
2014-07-16 14:29:34 +05:30
Peter Maydell
5a73480450 Update version for v2.1.0-rc2 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 18:55:37 +01:00
Peter Maydell
82172b7519 tests/Makefile: Only run vhost-user-test on Linux
vhost-user-test uses the linux/vhost.h header, so it must only be
enabled if CONFIG_LINUX is defined. (Previously it was enabled
for CONFIG_POSIX, which broke 'make check' on MacOSX.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 18:36:10 +01:00
Ricky Zhou
b4bda1ae57 target-i386: Allow execute from user mode when SMEP is enabled.
Previously, execute would be disabled for all pages with SMEP enabled,
regardless of what mode the access took place in.

Signed-off-by: Ricky Zhou <ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-15 18:43:14 +02:00
Peter Maydell
cbb46f5f49 Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream:
  linux-user: use TARGET_SA_ONSTACK in get_sigframe
  alloca one extra byte sockets
  linux-user: handle AF_PACKET sockaddrs in target_to_host_sockaddr
  qemu-user: Impl. setsockopt(SO_BINDTODEVICE)
  SIOCGIFINDEX: fix typo

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 16:49:28 +01:00
Peter Maydell
146ae00192 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-07-15

Some more bug fixes during the RC phase:

  - Fix huge page mapping regressions
  - Fix Book3S thread number enumeration
  - Fix Book3S VFIO permission issue

# gpg: Signature made Tue 15 Jul 2014 15:13:54 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream:
  sPAPR/IOMMU: Fix TCE entry permission
  spapr: Enable use of huge pages
  spapr: Move RMA memory region registration code
  ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory
  target-ppc: Fix number of threads per core limit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 15:51:12 +01:00
Gavin Shan
27e27782f7 sPAPR/IOMMU: Fix TCE entry permission
The permission of TCE entry should exclude physical base address.
Otherwise, unmapping TCE entry can be interpreted to mapping TCE
entry wrongly for VFIO devices.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy
f92f5da108 spapr: Enable use of huge pages
0b183fc87 "memory: move mem_path handling to
memory_region_allocate_system_memory" disabled -mempath use for all
machines that do not use memory_region_allocate_system_memory() to
register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
support was disabled for it.

This replaces memory_region_init_ram()+vmstate_register_ram_global() with
memory_region_allocate_system_memory() to get huge pages back.

This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as
the previous patch moved RMA memory region allocation after RAM allocation
and therefore this change does not have immediate effect but simplifies
the code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy
658fa66b81 spapr: Move RMA memory region registration code
PPC970 does not support VRMA (virtual RMA) so real memory required
for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl.
Later this memory is used as a part of the guest RAM area.
The RMA allocating code also registers a memory region for this piece
of RAM.

We are going to simplify memory regions layout: RMA memory region
will be a subregion in the RAM memory region, both starting from zero.
This way we will not have to take care of start address alignment for
the piece of RAM next to the RMA.

This moves memory region business closer to the RAM memory region
creation/allocation code.

As this is a mechanical patch, no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on non-kvm systems]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Shreyas B. Prabhu
e938ba0c35 ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory
Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:58 +02:00
Alexey Kardashevskiy
063cac5326 target-ppc: Fix number of threads per core limit
The number of threads per core is different for POWER6/7/8 CPUs.
Guest systems do not expect to see more threads per core than
a specific CPU supports so we need to limit this number.
This limit is implemented by ppc_get_compat_smt_threads().

However it has a problem as it checks for PCR (Processor Compatibility
Register) mask, 2.05 means 2 threads per core, 2.06 - 4 threads.
For POWER8 one would expect PCR_COMPAT_2_07 bit set and
ppc_get_compat_smt_threads() checking for it to return 8 threads
per core. But the latest PowerISA spec now is 2.07 and there is
no 2.07 compatibility mode defined, QEMU does not define it either
(will be in PowerISA 2.08).

Instead of relying on a PCR mask, this uses kvmppc_smt_threads()
which returns the maximum supported threads number for KVM or
1 for TCG.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:58 +02:00
Riku Voipio
b545f63fa9 linux-user: use TARGET_SA_ONSTACK in get_sigframe
As reported by Laurent, which should use TARGET_SA_ONSTACK
on arm, microblaze and openrisc targets like we do on all
others. Practical matter is minimal as for almost all archs
SA_ONSTACK is 0x08000000:

http://lxr.free-electrons.com/ident?i=SA_ONSTACK

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 17:08:41 +03:00
Peter Maydell
2c65ebe646 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Tue 15 Jul 2014 14:49:01 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  virtio-blk: dataplane: notify guest as a batch
  virtio-blk: data-plane: fix save/set .complete_request in start
  linux-aio: Fix laio resource leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 15:06:17 +01:00
Ming Lei
5b2ffbe4d9 virtio-blk: dataplane: notify guest as a batch
Now requests are submitted as a batch, so it is natural
to notify guest as a batch too.

This may suppress interrupt notification to VM a lot:

        - in my test, decreased by ~13K/sec

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Ming Lei
e926d9b8c5 virtio-blk: data-plane: fix save/set .complete_request in start
The callback has to be saved and reset in virtio_blk_data_plane_start(),
otherwise dataplane's requests will be completed in qemu aio context.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Gonglei
a1abf40d6b linux-aio: Fix laio resource leak
when hotplug virtio-scsi disks using laio, the aio_nr will
increase in laio_init() by io_setup(), we can see the number by
  # cat /proc/sys/fs/aio-nr
  128
if the aio_nr attach the maxnum, which found from
  # cat /proc/sys/fs/aio-max-nr
  65536
the hotplug process will fail because of aio context leak.

Fix it by io_destroy in laio_cleanup().

Reported-by: daifulai <daifulai@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Joakim Tjernlund
2dd08dfd9a alloca one extra byte sockets
target_to_host_sockaddr() may increase the lenth with 1 byte
for AF_UNIX sockets so allocate 1 extra byte.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:28:36 +03:00
Joakim Tjernlund
33a29b51c9 linux-user: handle AF_PACKET sockaddrs in target_to_host_sockaddr
Implement conversion of the AF_PACKET sockaddr subtype
in target_to_host_sockaddr.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:28:25 +03:00
Joakim Tjernlund
451aaf688c qemu-user: Impl. setsockopt(SO_BINDTODEVICE)
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:28:20 +03:00
Joakim Tjernlund
27a07827c4 SIOCGIFINDEX: fix typo
Wrong type was used in ioctl definition.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:26:31 +03:00
Andreas Färber
0e16297461 libqos: Fix PC PCI endianness glitches
The libqos implementation of io_read{b,w,l} and io_write{b,w,l} hooks
was relying on qtest_mem{read,write}() respectively. With d81d410 (usb:
improve ehci/uhci test) this resulted in assertion failures on ppc hosts:

 ERROR:tests/usb-hcd-ehci-test.c:78:ehci_port_test: assertion failed: ((value & mask) == (expect & mask))

 ERROR:tests/usb-hcd-ehci-test.c:128:pci_uhci_port_2: assertion failed: (pcibus != NULL)

 ERROR:tests/usb-hcd-ehci-test.c:150:pci_ehci_port_2: assertion failed: (pcibus != NULL)

qtest_read{b,w,l,q}() and qtest_write{b,w,l,q}() had been introduced
as endian-safe replacement for qtest_mem{read,write}() in I2C in
872536b (qtest: Add MMIO support). Use them for PCI as well.

Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: c4efe1c qtest: add libqos including PCI support
Fixes: d81d410 usb: improve ehci/uhci test
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 14:18:15 +01:00
Peter Maydell
0a9934eef1 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Misc 2.1 fixes regarding character/serial devices and SCSI.

# gpg: Signature made Mon 14 Jul 2014 16:26:08 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found

* remotes/bonzini/tags/for-upstream:
  serial-pci: remove memory regions from BAR before destroying them
  virtio-scsi: fix with -M pc-i440fx-2.0
  serial: change retry logic to avoid concurrency
  qemu-char: fix deadlock with "-monitor pty"
  scsi: Report error when lun number is in use

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-14 17:01:45 +01:00
Paolo Bonzini
7497bce6c2 serial-pci: remove memory regions from BAR before destroying them
Otherwise, hot-unplug of pci-serial-2x trips the assertion
in memory_region_destroy:

    (qemu) device_del gg
    (qemu) qemu-system-x86_64: /work/armbru/tmp/qemu/memory.c:1021: memory_region_destroy: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
    Aborted (core dumped)

Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:14:15 +02:00
Paolo Bonzini
1f4e6a069b virtio-scsi: fix with -M pc-i440fx-2.0
Right now starting a machine with virtio-scsi and a <= 2.0 machine type
fails with:

    qemu-system-x86_64: -device virtio-scsi-pci: Property .any_layout not found

This is because the any_layout bit was actually never set after
virtio-scsi was changed to support arbitrary layout for virtio buffers.

(This was just a cleanup and a preparation for virtio 1.0; no guest
actually checks the bit, but the new request parsing algorithms are
tested even with old guest).

Reported-by: David Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:14:15 +02:00
Kirill Batuzov
f702e62a19 serial: change retry logic to avoid concurrency
Whenever serial_xmit fails to transmit a byte it adds a watch that would
call it again when the "line" becomes ready. This results in a retry
chain:
  serial_xmit -> add_watch -> serial_xmit
Each chain is able to transmit one character, and for every character
passed to serial by the guest driver a new chain is spawned.

The problem lays with the fact that a new chain is spawned even when
there is one already waiting on the watch. So there can be several retry
chains waiting concurrently on one "line". Every chain tries to transmit
current character, so character order is not messed up. But also every
chain increases retry counter (tsr_retry). If there are enough
concurrent chains this counter will hit MAX_XMIT_RETRY value and
the character will be dropped.

To reproduce this bug you need to feed serial output to some program
consuming it slowly enough. A python script from bug #1335444
description is an example of such program.

This commit changes retry logic in the following way to avoid
concurrency: instead of spawning a new chain for each character being
transmitted spawn only one and make it transmit characters until FIFO is
empty.

The change consists of two parts:
 - add a do {} while () loop in serial_xmit (diff is a bit erratic
   for this part, diff -w will show actual change),
 - do not call serial_xmit from serial_ioport_write if there is one
   waiting on the watch already.

This should fix another issue causing bug #1335444.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:14:14 +02:00
Paolo Bonzini
7b3621f47a qemu-char: fix deadlock with "-monitor pty"
qemu_chr_be_generic_open cannot be called with the write lock taken,
because it calls client code that may call qemu_chr_fe_write.  This
actually happens for the monitor:

    0x00007ffff27dbf79 in __GI_raise (sig=sig@entry=6)
    0x00007ffff27df388 in __GI_abort ()
    0x00005555555ef489 in error_exit (err=<optimized out>, msg=msg@entry=0x5555559796d0 <__func__.5959> "qemu_mutex_lock")
    0x00005555558f9080 in qemu_mutex_lock (mutex=mutex@entry=0x555556248a30)
    0x0000555555713936 in qemu_chr_fe_write (s=0x555556248a30, buf=buf@entry=0x5555563d8870 "QEMU 2.0.90 monitor - type 'help' for more information\r\n", len=56)
    0x00005555556217fd in monitor_flush_locked (mon=mon@entry=0x555556251fd0)
    0x0000555555621a12 in monitor_flush_locked (mon=0x555556251fd0)
    monitor_puts (mon=mon@entry=0x555556251fd0, str=0x55555634bfa7 "", str@entry=0x55555634bf70 "QEMU 2.0.90 monitor - type 'help' for more information\n")
    0x0000555555624359 in monitor_vprintf (mon=0x555556251fd0, fmt=<optimized out>, ap=<optimized out>)
    0x0000555555624414 in monitor_printf (mon=<optimized out>, fmt=fmt@entry=0x5555559105a0 "QEMU %s monitor - type 'help' for more information\n")
    0x0000555555629806 in monitor_event (opaque=0x555556251fd0, event=<optimized out>)
    0x000055555571343c in qemu_chr_be_generic_open (s=0x555556248a30)

To avoid this, defer the call to an idle callback, which will be
called as soon as the main loop is re-entered.  In order to simplify
the cleanup and do it in one place only, change pty_chr_close to
call pty_chr_state.

To reproduce, run with "-monitor pty", then try to read from the
slave /dev/pts/FOO that it creates.

Fixes: 9005b2a758
Reported-by: Li Liang <liangx.z.li@intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:13:58 +02:00
Peter Maydell
7a6d04e73f Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.1.0-rc2 (v2)

# gpg: Signature made Mon 14 Jul 2014 11:04:12 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (22 commits)
  ide: Treat read/write beyond end as invalid
  virtio-blk: Treat read/write beyond end as invalid
  virtio-blk: Bypass error action and I/O accounting on invalid r/w
  virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
  dma-helpers: Fix too long qiov
  qtest: fix vhost-user-test compilation with old GLib
  tests: Fix unterminated string output visitor enum human string
  AioContext: do not rely on aio_poll(ctx, true) result to end a loop
  virtio-blk: embed VirtQueueElement in VirtIOBlockReq
  virtio-blk: avoid g_slice_new0() for VirtIOBlockReq and VirtQueueElement
  dataplane: do not free VirtQueueElement in vring_push()
  virtio-blk: avoid dataplane VirtIOBlockReq early free
  block: Assert qiov length matches request length
  qed: Make qiov match request size until backing file EOF
  qcow2: Make qiov match request size until backing file EOF
  block: Make qiov match the request size until EOF
  AioContext: speed up aio_notify
  test-aio: fix GSource-based timer test
  block: drop aio functions that operate on the main AioContext
  block: prefer aio_poll to qemu_aio_wait
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-14 13:09:29 +01:00
Peter Maydell
c15a34eda0 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140714' into staging
A s390x/kvm bugfix for missing floating point register synchronization.

# gpg: Signature made Mon 14 Jul 2014 08:21:54 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140714:
  s390x/kvm: synchronize guest floating point registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-14 11:04:11 +01:00
Markus Armbruster
58ac321135 ide: Treat read/write beyond end as invalid
The block layer fails such reads and writes just fine.  However, they
then get treated like valid operations that fail: the error action
gets executed.  Unwanted; reporting the error to the guest is the only
sensible action.

Reject them before passing them to the block layer.  This bypasses the
error action and I/O accounting.  Not quite correct for DMA, because
DMA can fail after some success, and when that happens, the part that
succeeded isn't counted.  Tolerable, because I/O accounting is an
inconsistent mess anyway.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Markus Armbruster
3c2daac0b9 virtio-blk: Treat read/write beyond end as invalid
The block layer fails such reads and writes just fine.  However, they
then get treated like valid operations that fail: the error action
gets executed.  Unwanted; reporting the error to the guest is the only
sensible action.

Reject them before passing them to the block layer.  This bypasses the
error action and I/O accounting.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Markus Armbruster
42e38c1fd0 virtio-blk: Bypass error action and I/O accounting on invalid r/w
When a device model's I/O operation fails, we execute the error
action.  This lets layers above QEMU implement thin provisioning, or
attempt to correct errors before they reach the guest.  But when the
I/O operation fails because it's invalid, reporting the error to the
guest is the only sensible action.

If the guest's read or write asks for an invalid sector range, fail
the request right away, without considering the error action.  No
change with error action BDRV_ACTION_REPORT.

Furthermore, bypass I/O accounting, because we want to track only I/O
that actually reaches the block layer.

The next commit will extend "invalid sector range" to cover attempts
to read/write beyond the end of the medium.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Markus Armbruster
d0e14376ee virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Kevin Wolf
58f423fbd5 dma-helpers: Fix too long qiov
If the size of the scatter/gather list isn't a multiple of 512, the
number of sectors for the block layer request is rounded down, resulting
in a qiov that doesn't match the request length. Truncate the qiov to the
new length of the request.

This fixes the IDE qtest case /x86_64/ide/bmdma/short_prdt.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-14 12:03:21 +02:00
Nikolay Nikolaev
80504dcaa1 qtest: fix vhost-user-test compilation with old GLib
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Andreas Färber
b8864245b1 tests: Fix unterminated string output visitor enum human string
The buffer was being allocated of size string length plus two.
Around the string two quotes were being added, but no terminating NUL.
It was then compared using g_assert_cmpstr(), resulting in fairly random
assertion failures:

 ERROR:tests/test-string-output-visitor.c:213:test_visitor_out_enum: assertion failed (str == str_human): ("\"value1\"" == "\"value1\"\001EEEEEEEEEEEEEE\0171")

There is no g_assert_cmpnstr() counterpart, so use g_strdup_printf()
for safely assembling the string in the first place.

Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Fixes: b4900c0 tests: add human format test for string output visitor
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Paolo Bonzini
acfb23ad3d AioContext: do not rely on aio_poll(ctx, true) result to end a loop
Currently, whenever aio_poll(ctx, true) has completed all pending
work it returns true *and* the next call to aio_poll(ctx, true)
will not block.

This invariant has its roots in qemu_aio_flush()'s implementation
as "while (qemu_aio_wait()) {}".  However, qemu_aio_flush() does
not exist anymore and bdrv_drain_all() is implemented differently;
and this invariant is complicated to maintain and subtly different
from the return value of GMainLoop's g_main_context_iteration.

All calls to aio_poll(ctx, true) except one are guarded by a
while() loop checking for a request to be incomplete, or a
BlockDriverState to be idle.  The one remaining call (in
iothread.c) uses this to delay the aio_context_release/acquire
pair until the AioContext is quiescent, however:

- we can do the same just by using non-blocking aio_poll,
  similar to how vl.c invokes main_loop_wait

- it is buggy, because it does not ensure that the AioContext
  is released between an aio_notify and the next time the
  iothread goes to sleep.  This leads to hangs when stopping
  the dataplane thread.

In the end, these semantics are a bad match for the current
users of AioContext.  So modify that one exception in iothread.c,
which also fixes the hangs, as well as the testcase so that
it use the same idiom as the actual QEMU code.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
f897bf751f virtio-blk: embed VirtQueueElement in VirtIOBlockReq
The memory allocation between hw/block/virtio-blk.c,
hw/block/dataplane/virtio-blk.c, and hw/virtio/dataplane/vring.c is
messy.  Structs are allocated in different files than they are freed in.
This is risky and makes memory leaks easier.

Embed VirtQueueElement in VirtIOBlockReq to reduce the amount of memory
allocation we need to juggle.  This also makes vring.c and virtio.c
slightly more similar.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
869d66af53 virtio-blk: avoid g_slice_new0() for VirtIOBlockReq and VirtQueueElement
In commit de6c8042ec ("virtio-blk: Avoid
zeroing every request structure") we avoided the 40 KB memset when
allocating VirtIOBlockReq.

The memset was reintroduced in commit
671ec3f056 ("virtio-blk: Convert
VirtIOBlockReq.elem to pointer").

It must be fixed again to avoid a performance regression.

Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
abd764250f dataplane: do not free VirtQueueElement in vring_push()
VirtQueueElement is allocated in vring_pop() so it seems to make sense
that vring_push() should free it.  Alas, virtio-blk frees
VirtQueueElement itself in virtio_blk_free_request().

This patch solves a double-free assertion in glib's g_slice_free().

Rename vring_free_element() to vring_unmap_element() since it no longer
frees the VirtQueueElement.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
0a21ea3289 virtio-blk: avoid dataplane VirtIOBlockReq early free
VirtIOBlockReq is freed later by virtio_blk_free_request() in
hw/block/virtio-blk.c.  Remove this extraneous g_slice_free().

This patch fixes the following segfault:

  0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
  99          bdrv_acct_done(req->dev->bs, &req->acct);
  (gdb) print req
  $1 = (VirtIOBlockReq *) 0x5555565ff5e0
  (gdb) print req->dev
  $2 = (VirtIOBlock *) 0x0
  (gdb) bt
  #0  0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
  #1  0x0000555555840ebe in bdrv_co_em_bh (opaque=0x5555566152d0) at block.c:4675
  #2  0x000055555583de77 in aio_bh_poll (ctx=ctx@entry=0x5555563a8150) at async.c:81
  #3  0x000055555584b7a7 in aio_poll (ctx=0x5555563a8150, blocking=blocking@entry=true) at aio-posix.c:188
  #4  0x00005555556e520e in iothread_run (opaque=0x5555563a7fd8) at iothread.c:41
  #5  0x00007ffff42ba124 in start_thread () from /usr/lib/libpthread.so.0
  #6  0x00007ffff16d14bd in clone () from /usr/lib/libc.so.6

Reported-by: Max Reitz <mreitz@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
8eb029c26e block: Assert qiov length matches request length
At least raw-posix relies on this because it can allocate bounce buffers
based on the request length, but access it using all of the qiov entries
later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
f06ee3d4aa qed: Make qiov match request size until backing file EOF
If a QED image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
44deba5a52 qcow2: Make qiov match request size until backing file EOF
If a qcow2 image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
33f461e0c5 block: Make qiov match the request size until EOF
If a read request goes across EOF, the block driver sees a shortened
request that stops at EOF (the rest is memsetted in block.c), however
the original qiov was used for this request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Fam Zheng
2039511b8f scsi: Report error when lun number is in use
In the case that the lun number is taken by another scsi device, don't
release the existing device siliently, but report an error to user.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 11:54:57 +02:00
Jason J. Herne
85ad6230b3 s390x/kvm: synchronize guest floating point registers
Add code to kvm_arch_get_registers and kvm_arch_put_registers to
save/restore floating point registers. This missing sync was
unnoticed until migration of userspace that uses fprs.

Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[Update patch to latest upstream]
Cc: qemu-stable@nongnu.org
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-14 09:15:38 +02:00
Nikolay Nikolaev
0e3cd8334a qtest: fix vhost-user-test compilation with old GLib
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-14 00:42:54 +03:00
Hu Tao
75902802c2 fix typo: apci -> acpi
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: rebase
2014-07-11 21:31:55 +03:00
Eduardo Habkost
faab459797 pc_piix: Reuse pc_compat_1_2() for pc-0.1[0123]
pc-0.13 and older were missing some compat code that was present on
newer machine-types:

* x86_cpu_compat_disable_kvm_features(FEAT_1_ECX, CPUID_EXT_X2APIC);
  (pc-i440fx-1.7 and older)
  (added by commit ef02ef5f45)
* x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE);
  (pc-i440fx-1.4 and older)
  (added by commit 4458c23672
* x86_cpu_compat_set_features("Westmere", FEAT_1_ECX, 0, CPUID_EXT_PCLMULQDQ);
  (pc-i440fx-1.4 and older)
  (added by commit 56383703c0)

Instead of duplicating the code from the previous pc_compat_*()
functions, we can now reuse pc_compat_1_2() and fix those issues.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-11 21:13:00 +03:00
Igor Mammedov
4ec6ee5ace pc: fix qemu exiting with error when -m X < 128 with old machines types
If machine doesn't support memory hotplug then staring QEMU
with initial memory less than default will make QEMU exit with
following error message:

$QEMU -m 16  -M isapc
qemu-system-i386: "-memory 'slots|maxmem'" is not supported by: isapc

Set maxram_size to initial memory value before parsing
'maxmem' option allows to keep maxmem in sync with initial
memory size if no maxmem option was specified.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: Bruce Rogers <brogers@suse.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-11 21:05:14 +03:00
Peter Maydell
ab6d3749c4 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20140711-1' into staging
vga: some cirrus fixes.

# gpg: Signature made Fri 11 Jul 2014 10:38:32 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20140711-1:
  cirrus: Fix host CPU blits
  cirrus: Fix build of debug code
  cirrus_vga: adding sanity check for vram size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 17:50:38 +01:00
Peter Maydell
aee230d707 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140711-1' into staging
mtp: linux guest detection fix

# gpg: Signature made Fri 11 Jul 2014 11:32:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140711-1:
  mtp: linux guest detection fix.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 16:01:38 +01:00
Peter Maydell
42ca32f776 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140711-1' into staging
spice: auth fixes

# gpg: Signature made Fri 11 Jul 2014 10:17:15 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140711-1:
  spice: auth fixes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 14:50:18 +01:00
Peter Maydell
22df3452dc Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20140711-1' into staging
ui/gtk: Restore keyboard focus after Page change

# gpg: Signature made Fri 11 Jul 2014 09:46:21 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20140711-1:
  ui/gtk: Restore keyboard focus after Page change

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 13:48:07 +01:00
Gerd Hoffmann
13d54125a3 mtp: linux guest detection fix.
Attach a name to the MTP interface (android phones have this too).

With this patch recent linux guests such as fedora 20 happily detect and
use the device.  It shows up in nautilus file manager automatically, and
simple-mtpfs can mount it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 12:31:41 +02:00
John Snow
e72b59fa93 ui/gtk: Restore keyboard focus after Page change
(Resending for correct email addresses via MAINTAINERS ...)

In the GTK UI, after changing focus to the qemu monitor Notebook Page,
when restoring focus to the virtual machine page, the keyboard focus is lost
to a hidden GTK widget. Focus can only be restored to the virtual machine by
pressing "tab" or any of the four directional arrow keys.

Clicking in the window or grabbing/ungrabbing input does not restore keyboard
focus to the child widget.

This patch adjusts the Notebook page switching callback to automatically
steal keyboard focus on the Page switch event, so that keyboard input
does not appear to break or disappear after tabbing to the QEMU monitor.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:44:00 +02:00
Benjamin Herrenschmidt
d16136d22a cirrus: Fix host CPU blits
Commit b2eb849d4b
"CVE-2007-1320 - Cirrus LGD-54XX "bitblt" heap overflow" broke
cpu to video blits.

When the ROP function is called from cirrus_bitblt_cputovideo_next(),
we pass 0 for the pitch but only operate on one line at a time. The
added test was tripping because after the initial substraction, the
pitch becomes negative. Make the test only trip when the height is
larger than one (ie. the pitch is actually used).

This fixes HW cursor support in Windows NT4.0 (which otherwise was
a white rectangle) and general display of icons in that OS when using
8bpp mode.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:17:02 +02:00
Benjamin Herrenschmidt
e8ee4b68be cirrus: Fix build of debug code
Use PRIu64 to print uint64_t

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:17:01 +02:00
Gonglei
f61d82c2df cirrus_vga: adding sanity check for vram size
when configure a invalid vram size for cirrus card, such as less
2 MB, which will crash qemu. Follow the real hardware, the cirrus
card has 4 MB video memory. Also for backward compatibility, accept
8 MB and 16 MB vram size.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:17:01 +02:00
Gerd Hoffmann
b1ea7b79e1 spice: auth fixes
Set auth to sasl when sasl is enabled, this makes "info spice" correctly
display sasl auth.  Also throw an error in case someone tries to set
a spice password via monitor without auth mode being "spice".

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:12:47 +02:00
Peter Maydell
74aeb37de0 Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  qtest: fix vhost-user-test compilation with old GLib
  mc146818rtc: register the clock reset notifier on the right clock
  oslib-posix: Fix new compiler error with -Wclobbered
  target-i386: Add "kvmclock-stable-bit" feature bit name
  Enforce stack protector usage
  watchdog: fix deadlock with -watchdog-action pause
  mips_malta: Catch kernels linked at wrong address
  mips_malta: Remove incorrect KVM T&E references
  mips/kvm: Disable FPU on reset with KVM
  mips/kvm: Init EBase to correct KSEG0

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-10 17:37:16 +01:00
Nikolay Nikolaev
0a58991a5f qtest: fix vhost-user-test compilation with old GLib
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Paolo Bonzini
13c0cbaec5 mc146818rtc: register the clock reset notifier on the right clock
Commit 884f17c (aio / timers: Convert rtc_clock to be a QEMUClockType,
2013-08-21) erroneously changed an occurrence of rtc_clock to
QEMU_CLOCK_REALTIME, which broke the RTC reset notifier in
mc146818rtc.  Fix this.

I redid the patch myself since the original reporter did not sign
off on his.

Cc: qemu-stable@nongnu.org
Reported-by: Lb peace <peaceustc@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Stefan Weil
b7bf8f5657 oslib-posix: Fix new compiler error with -Wclobbered
Newer versions of gcc report a warning (or an error with -Werror) when
compiler option -Wclobbered (or -Wextra) is active:

util/oslib-posix.c:372:12: error:
 variable ‘hpagesize’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]

The rewritten code fixes this warning: variable 'hpagesize' is now set and
used in a block without any call of sigsetjmp or similar functions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Eduardo Habkost
8248c36a5d target-i386: Add "kvmclock-stable-bit" feature bit name
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is enabled by default and supported
by KVM. But not having a name defined makes QEMU treat it as an unknown
and unmigratable feature flag (as any unknown feature may possibly
require state to be migrated), and disable it by default on "-cpu host".

As a side-effect, the new name also makes the flag configurable,
allowing the user to disable it (which may be useful for testing or for
compatibility with old kernels).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Miroslav Rezanina
3b463a3fa8 Enforce stack protector usage
If --enable-stack-protector is used is used, configure script try to use
--fstack-protector-strong. In case it's not supported, --fstack-protector-all
is enabled. If both protectors are not supported, configure does not use
any protector at all without any notification.

This patch reports error when user requests stack protector to be used and
both protector modes are not supported. Behavior is not changed in case
user do not use any of --enable-stack-protector/--disable-stack-protector.

Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
[Fix non-POSIX operator in test. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:29 +02:00
Andreas Färber
9e99c5fd70 tests: Fix unterminated string output visitor enum human string
The buffer was being allocated of size string length plus two.
Around the string two quotes were being added, but no terminating NUL.
It was then compared using g_assert_cmpstr(), resulting in fairly random
assertion failures:

 ERROR:tests/test-string-output-visitor.c:213:test_visitor_out_enum: assertion failed (str == str_human): ("\"value1\"" == "\"value1\"\001EEEEEEEEEEEEEE\0171")

There is no g_assert_cmpnstr() counterpart, so use g_strdup_printf()
for safely assembling the string in the first place.

Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Fixes: b4900c0 tests: add human format test for string output visitor
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-10 11:53:14 +01:00
Paolo Bonzini
30e5210a70 watchdog: fix deadlock with -watchdog-action pause
qemu_clock_enable says:

/* Disabling the clock will wait for related timerlists to stop
 * executing qemu_run_timers.  Thus, this functions should not
 * be used from the callback of a timer that is based on @clock.
 * Doing so would cause a deadlock.
 */

and it indeed does: vm_stop uses qemu_clock_enable on QEMU_CLOCK_VIRTUAL
and watchdogs are based on QEMU_CLOCK_VIRTUAL, and we get a deadlock.

Use qemu_system_vmstop_request_prepare()/qemu_system_vmstop_request()
instead; yet another alternative could be a BH.

I checked other occurrences of vm_stop and they should not have this
problem.  RUN_STATE_IO_ERROR could in principle (it depends on the
code in the drivers) but it has been fixed by commit 2bd3bce, "block:
asynchronously stop the VM on I/O errors", 2014-06-05.

Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:08 +02:00
James Hogan
f7f152458e mips_malta: Catch kernels linked at wrong address
Add error reporting if the wrong type of kernel is provided for the
current mode of acceleration.

Currently a KVM kernel linked at 0x40000000 can't be used with TCG, and
a normal kernel linked at 0x80000000 can't be used with KVM.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:08 +02:00
James Hogan
fbdb1d9555 mips_malta: Remove incorrect KVM T&E references
Fix the error message and code comments relating to KVM not supporting
booting from the flash mapping when no kernel is provided. The issue is
a general MIPS KVM issue and isn't specific to the Trap & Emulate
version of MIPS KVM.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:08 +02:00
James Hogan
0e928b12c9 mips/kvm: Disable FPU on reset with KVM
KVM doesn't yet support the MIPS FPU, or writing to the guest's Config1
register which contains the FPU implemented bit. Clear QEMU's version of
that bit on reset and display a warning that the FPU has been disabled.

The previous incorrect Config1 CP0 register value wasn't being passed to
KVM yet, however we should ensure it is set correctly now to reduce the
risk of breaking migration/loadvm to a future version of QEMU/Linux that
does support it.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:04 +02:00
Paolo Bonzini
0ceb849bd3 AioContext: speed up aio_notify
In many cases, the call to event_notifier_set in aio_notify is unnecessary.
In particular, if we are executing aio_dispatch, or if aio_poll is not
blocking, we know that we will soon get to the next loop iteration (if
necessary); the thread that hosts the AioContext's event loop does not
need any nudging.

The patch includes a Promela formal model that shows that this really
works and does not need any further complication such as generation
counts.  It needs a memory barrier though.

The generation counts are not needed because any change to
ctx->dispatching after the memory barrier is okay for aio_notify.
If it changes from zero to one, it is the right thing to skip
event_notifier_set.  If it changes from one to zero, the
event_notifier_set is unnecessary but harmless.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Paolo Bonzini
ef508f427b test-aio: fix GSource-based timer test
The current test depends too much on the implementation of the AioContext
GSource.  Just iterate on the main loop until the callback has been invoked
the right number of times.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Paolo Bonzini
87f68d3182 block: drop aio functions that operate on the main AioContext
The main AioContext should be accessed explicitly via qemu_get_aio_context().
Most of the time, using it is not the right thing to do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Paolo Bonzini
b47ec2c456 block: prefer aio_poll to qemu_aio_wait
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Kevin Wolf
01fb2705bd block: Fix bdrv_is_allocated() return value
bdrv_is_allocated() should return either 0 or 1 in successful cases.
We're lucky that currently, the callers that rely on this (e.g. because
they check for ret == 1) don't seem to break badly. They just might skip
some optimisation or in the case of qemu-io 'map' print separate lines
where a single line would suffice. In theory, a wrong allocation status
could lead to image corruption with certain operations, so let's fix
this quickly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-09 15:50:11 +02:00
Kevin Wolf
d40593dd90 block/backup: Fix hang for unaligned image size
When doing a block backup of an image with an unaligned size (with
respect to the BACKUP_CLUSTER_SIZE), qemu would check the allocation
status of sectors after the end of the image. bdrv_is_allocated()
returns a result that is valid for 0 sectors in this case, so the backup
job ran into an endless loop.

Stop looping when seeing a result valid for 0 sectors, we're at EOF then.

The test case looks somewhat unrelated at first sight because I
originally tried to reproduce a different suspected bug that turned out
to not exist. Still a good test case and it accidentally found this one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-09 15:50:11 +02:00
Peter Maydell
675879f6f3 Update version for v2.1.0-rc1 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 16:53:59 +01:00
Peter Maydell
b653282ecc hw/ppc/spapr_hcall.c: Add ULL suffix to 64 bit constant
Add ULL suffix to 64 bit constant to prevent compiler warnings
on some 32 bit platforms.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 16:03:19 +01:00
Peter Maydell
d614cb68da Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140708' into staging
Bugfixes for s390x: set subsystem id in the lowcore when booting from the
s390-ccw bios, and set the channel-program address after I/O completion,
when applicable.

# gpg: Signature made Tue 08 Jul 2014 14:18:20 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140708:
  s390x/css: reflect cpa in scsw
  pc-bios/s390-ccw: update binary
  pc-bios/s390-ccw: store proper subsystem information word

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 15:10:42 +01:00
Cornelia Huck
2ed982b6a9 s390x/css: reflect cpa in scsw
We neglected to update the the channel-program-address field of the scsw
after completion of the start or the halt function: Fortunately, Linux
didn't miss it so far. Let's update it for the cases where the cpa is
expected to be valid; in some cases, the cpa is 'unpredictable', so we
leave it untouched.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-08 15:08:03 +02:00
Cornelia Huck
32a02d070b pc-bios/s390-ccw: update binary
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-08 15:08:03 +02:00
Christian Borntraeger
f2879a5c9e pc-bios/s390-ccw: store proper subsystem information word
POP chapter 17 requires to store a subsystem information word at 184
during IPL. Furthermore bytes 188-191 should be zero. The bootmap might
contain data blocks that are written to the first page. We have to
write these values after we processed the bootmap and before the final
IPL.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-08 15:08:03 +02:00
Peter Maydell
67d01fb806 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140708' into staging
target-arm queue:
 * fix handling of KVM reset for 32-bit ARM CPUs
 * implement NOR flash alias for vexpress-a9
 * make sure libvixl gets its own utils.h rather than somebody else's

# gpg: Signature made Tue 08 Jul 2014 13:12:05 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140708:
  target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs
  hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9
  disas/libvixl: prepend the include path of libvixl header files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 14:01:58 +01:00
Peter Maydell
75c9a1a047 target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs
Implement kvm_arm_vcpu_init() as a simple call to arm_arm_vcpu_init()
(which uses the KVM_ARM_VCPU_INIT vcpu ioctl to tell the kernel
to re-initialize the vCPU), rather than via the complicated code
which saves a copy of the register state on first init and then
writes it back to the kernel. This is much simpler and brings the
32-bit KVM code into line with the 64-bit code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403802973-20841-1-git-send-email-peter.maydell@linaro.org
2014-07-08 13:05:11 +01:00
Peter Maydell
6ec1588e09 hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9
Make the vexpress-a9 board alias the first NOR flash region at
address zero, like vexpress-a15. This makes "-bios" actually usable
on this board.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1404310070-3561-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
2014-07-08 13:05:10 +01:00
Stefano Stabellini
834fb1b269 disas/libvixl: prepend the include path of libvixl header files
Currently the Makefile of disas/libvixl appends
-I$(SRC_PATH)/disas/libvixl to QEMU_CFLAGS. As a consequence C++ files
that #include "utils.h", such as disas/libvixl/a64/instructions-a64.cc,
are going to look for utils.h on all the other include paths first.

When building QEMU as part of the Xen make system, another unrelated
utils.h file is going to be chosen for inclusion, causing a build
failure:

In file included from disas/libvixl/a64/instructions-a64.cc:27:0:
/qemu/disas/libvixl/a64/instructions-a64.h:88:64: error:
'rawbits_to_float' was not declared in this scope
 const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);

Fix the problem by prepending (rather than appending) the libvixl
include path to QEMU_CFLAGS.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 12:45:57 +01:00
Peter Maydell
eaa4980185 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-07-08

A few bug fixes for 2.1:

  - Fix e500* TLB emulation with qemu-system-ppc
  - Update SLOF to current upstream (good number of bugfixes)
  - Make POWER7 / POWER8 PVR match more agnostic (needed in 2.1 for cmdline compat)
  - Fix u-boot.e500 install (how did that happen?)
  - Fix H_CAS on LE hosts
  - ppc64le-linux-user fixes

# gpg: Signature made Tue 08 Jul 2014 11:18:58 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream:
  PPC: e500: Actually install u-boot.e500
  target-ppc: Remove POWER7+ and POWER8E families
  target-ppc: Add pvr_match() callback
  pseries: Update SLOF firmware image to qemu-slof-20140630
  PPC: Fix booke206 TLB with phys addrs > 32bit
  target-ppc: Fix gdbstub for ppc64le-linux-user
  target-ppc: Change default cpu for ppc64le-linux-user
  target-ppc: KVMPPC_H_CAS fix cpu-version endianess

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 11:38:12 +01:00
Cole Robinson
0c6ab8c988 PPC: e500: Actually install u-boot.e500
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:37 +02:00
Alexey Kardashevskiy
b60c60070c target-ppc: Remove POWER7+ and POWER8E families
POWER8E is architecturally equal to POWER8 and POWER7+ is equal to
POWER7. Also no user space tool makes any difference for CPU node name
in the device tree (such as PowerPC,POWER7@0 vs. PowerPC,POWER7+@0).
So there is no point in emulating POWER7+ and POWER8E apart from POWER7
and POWER8. Also, the previos patch implemented multiple PVR mask support
per CPU class so POWER7 class now covers both POWER7 and POWER7+ CPUs,
same is valid for POWER8/8E.

This removes POWER7+ and POWER8E classes. This replaces references
to POWER7P/POWER8E families with POWER7/POWER8 families.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexey Kardashevskiy
03ae4133ab target-ppc: Add pvr_match() callback
So far it was enough to have a base PVR value and mask per CPU
family such as POWER7 or POWER8. However there CPUs which are
completely architecturally compatible but have different PVRs such
as POWER7/POWER7+ and POWER8/POWER8E. For these CPUs, top 16 bits
are CPU family and low 16 bits are the version. The families have
PVR base values different enough so defining a mask which
would cover both (or potentially more) CPUs within the family is
not possible.

This adds a pvr_match() callback to PowerPCCPUClass. The default
handler simply compares PVR defined in the class.

This implements ppc_pvr_match_power7/ppc_pvr_match_power8 callbacks
for POWER7/8 families. These check for POWER7/POWER7+ and POWER8/POWER8E.

This changes ppc_cpu_compare_class_pvr_mask() not to check masks but
use the pvr_match() callback.

Since all server CPUs use the same mask, this defines one mask
value - CPU_POWERPC_POWER_SERVER_MASK - which is used everywhere now.
This removes other mask definitions.

This removes pvr_mask from PowerPCCPUClass as it is not used anymore.
This removes pvr initialization for POWER7/8 families as it is not used
to find the class, the pvr_match() callback is used instead.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexey Kardashevskiy
d6c23f8a1b pseries: Update SLOF firmware image to qemu-slof-20140630
The changelog is:
  > Quieten the grub warning
  > Add boot menu support
  > boot from disk having chrp-boot file
  > fat16: fix read and remove debug messages
  > dhcparch define missing in compilation
  > pci-scan: reserve memory for pci-bridge without devices
  > pci-bridge: Fix ranges when no device beyond the bridge
  > Set dhcp arch in board-qemu config file
  > xhci: fix controller stop
  > dhcp: support client architecture code 93
  > virtio-blk: support variable block size
  > usb: use common pci dma alloc/mapping routines
  > Remove unused SLOF code
  > pci-bridge: generic bridge needs to support pci dma functions
  > pci: extract dma functions as separate file
  > e1000: fix usage of multiple nics

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexander Graf
da89a1cf92 PPC: Fix booke206 TLB with phys addrs > 32bit
We were truncating physical addresses to 32bit when using qemu-system-ppc
with a booke206 TLB implementation. This patch fixes that and makes the full
address space available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Richard Henderson
be5c9ddabc target-ppc: Fix gdbstub for ppc64le-linux-user
The bswap that's needed for system mode isn't required for
user mode, and in fact breaks debugging.

Signed-off-by: Richard Henderson <rth@twiddle.net>
[agraf: fix apple gdbstub implementation]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Richard Henderson
a74029f6cb target-ppc: Change default cpu for ppc64le-linux-user
The default, 970fx, doesn't support MSR_LE.  So even though we set LE in
ppc_cpu_reset, it gets cleared again in hreg_store_msr.  Error out if a
user-selected cpu model doesn't support LE.

Signed-off-by: Richard Henderson <rth@twiddle.net>
[agraf: switch to POWER7 as default for BE and LE]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Laurent Dufour
4bce526ec4 target-ppc: KVMPPC_H_CAS fix cpu-version endianess
During KVMPPC_H_CAS processing, the cpu-version updated value is stored
without taking care of the current endianess. As a consequence, the guest
may not switch to the right CPU model, leading to unexpected results.

If needed, the value is now converted.

Fixes: 6d9412ea81 ("target-ppc: Implement "compat" CPU option")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Peter Maydell
128f0e6614 Merge remote-tracking branch 'remotes/afaerber/tags/prep-for-2.1' into staging
PowerPC Reference Platform (PReP)

* Update OpenHack'Ware firmware to replace QEMU-side workarounds

# gpg: Signature made Mon 07 Jul 2014 15:49:42 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/prep-for-2.1:
  prep: Update ppc_rom.bin
  prep: Remove CPU reset entry point hack related to OpenHack'Ware
  prep: Remove PCI memory hack related to OpenHack'Ware

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 19:06:55 +01:00
Peter Maydell
c6ea9b73b1 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,vhost,virtio fixes, test

Bugfixes all over the place.

There's a  non bugfix here: re-enabling the vhost-user test,
though the patch just brings back functionality that
I disabled earlier to fix mingw build failures.
This is now sorted, and keeping the unit test enabled
seems important since the feature relies on an external
server to work, so isn't easy to test.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 06 Jul 2014 11:01:35 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  qemu-char: add chr_add_watch support in mux chardev
  virtio-pci: fix MSI memory region use after free
  qdev: Fix crash when using non-device class name on -global
  qdev: Don't abort() in case globals can't be set
  hw/virtio: enable common virtio feature for mmio device
  acpi: fix typo in memory hotplug MMIO region name
  pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()
  Handle G_IO_HUP in tcp_chr_read for tcp chardev
  virtio: move common virtio properties to bus class device
  pc-dimm: error out if memory hotplug is not enabled
  numa: check for busy memory backend
  qtest: enable vhost-user-test

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 16:30:14 +01:00
Andreas Färber
ee0f2601b9 prep: Update ppc_rom.bin
This replaces QEMU-side workarounds for PCI BARs and CPU reset.

Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Hervé Poussineau
56de2e5269 prep: Remove CPU reset entry point hack related to OpenHack'Ware
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Hervé Poussineau
97db046678 prep: Remove PCI memory hack related to OpenHack'Ware
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Peter Maydell
9540d1f8d9 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Mon 07 Jul 2014 13:27:20 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  qmp: show QOM properties in device-list-properties
  dataplane: submit I/O as a batch
  linux-aio: implement io plug, unplug and flush io queue
  block: block: introduce APIs for submitting IO as a batch
  ahci: map memory via device's address space instead of address_space_memory
  raw-posix: Fix raw_getlength() to always return -errno on error
  qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin
  ahci.c: mask unused flags when reading size PRDT DBC
  MAINTAINERS: add Stefan Hajnoczi to IDE maintainers
  mirror: Fix qiov size for short requests
  Fix nocow typos in manpage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 15:02:36 +01:00
Peter Maydell
f811d4743b Merge remote-tracking branch 'remotes/sstabellini/xen_arm_20140707' into staging
* remotes/sstabellini/xen_arm_20140707:
  xen: build on ARM
  xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 13:43:03 +01:00
Stefano Stabellini
643f593224 xen: build on ARM
Collection of fixes to build QEMU with Xen support on ARM:
- use xenstore_read_fe_uint64 to retrieve the page-ref (xenfb);
- use xen_pfn_t instead of unsigned long in xenfb;
- unsigned long/xenpfn_t in xen_remove_from_physmap;
- in xen-mapcache.c use HOST_LONG_BITS to check for QEMU's address space
size.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 10:37:40 +00:00
Stefano Stabellini
4aba9eb138 xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 10:37:40 +00:00
Stefan Hajnoczi
f4eb32b590 qmp: show QOM properties in device-list-properties
Devices can use a mix of qdev and QOM properties.  Currently only the
qdev properties are displayed by device-list-properties.

This patch extends the property enumeration algorithm to also display
QOM properties (excluding the implicit "type", "realized",
"hotpluggable", and "parent_bus" properties).

When a qdev property exists, use the qdev type name to preserve
backwards compatibility.  QOM type names can be different for bool (qdev
on/off) and str (used by qdev pointers).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:10:05 +02:00
Ming Lei
dd67c1d7e7 dataplane: submit I/O as a batch
Before commit 580b6b2aa2(dataplane: use the QEMU block
layer for I/O), dataplane for virtio-blk submits block
I/O as a batch.

This commit 580b6b2aa2 replaces the custom linux AIO
implementation(including submit I/O as a batch) with QEMU
block layer, but this commit causes ~40% throughput regression
on virtio-blk performance, and removing submitting I/O
as a batch is one of the causes.

This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to support submitting I/O
at batch for Qemu block layer, and in my test, the change
can improve throughput by ~30% with 'aio=native'.

Following my fio test script:

	[global]
	direct=1
	size=4G
	bsrange=4k-4k
	timeout=40
	numjobs=4
	ioengine=libaio
	iodepth=64
	filename=/dev/vdc
	group_reporting=1

	[f]
	rw=randread

Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores):
	- qemu master: 65K IOPS
	- qemu master with these patches: 92K IOPS
	- 2.0.0 release(dataplane using custom linux aio): 104K IOPS

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Ming Lei
1b3abdcccf linux-aio: implement io plug, unplug and flush io queue
This patch implements .bdrv_io_plug, .bdrv_io_unplug and
.bdrv_flush_io_queue callbacks for linux-aio Block Drivers,
so that submitting I/O as a batch can be supported on linux-aio.

[Unprocessed requests are completed with -EIO instead of a bogus ret
value.
--Stefan]

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Ming Lei
448ad91db4 block: block: introduce APIs for submitting IO as a batch
This patch introduces three APIs so that following
patches can support queuing I/O requests and submitting them
as a batch for improving I/O performance.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Le Tan
5a18e67dfd ahci: map memory via device's address space instead of address_space_memory
In map_page() in hw/ide/ahci.c, replace cpu_physical_memory_map() and
cpu_physical_memory_unmap() with dma_memory_map() and dma_memory_unmap(),
because ahci devices should not access memory directly but via their address
space. Add an AddressSpace parameter to map_page(). In order to call
map_page(), we should pass the AHCIState.as as the AddressSpace argument.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 10:22:43 +02:00
Markus Armbruster
aa729704f4 raw-posix: Fix raw_getlength() to always return -errno on error
We got a merry mix of -1 and -errno here.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:41:29 +02:00
Benoît Canet
a42a1facb7 qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin
This avoid breaking tests on RHEL6 where gnutls is too old for quorum to be
built by default.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Reza Jelveh
d02f8adc6d ahci.c: mask unused flags when reading size PRDT DBC
The data byte count(DBC) read from the description information is defined for
bits 21:00. Bits 30:22 are reserved and bit 31 is the Interrupt on Completion
(I) flag.

Completion interrupts are triggered after every transaction instead of on
I-flag in QEMU. tbl_entry_size is a signed integer and improperly reading the
DBC leads to a negative offset that causes sglist allocation to fail.

Signed-off-by: Reza Jelveh <reza.jelveh@tuhh.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Stefan Hajnoczi
37253e1ec8 MAINTAINERS: add Stefan Hajnoczi to IDE maintainers
Make Stefan officially co-maintain hw/ide/ with Kevin.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
2014-07-07 09:15:29 +02:00
Kevin Wolf
5a0f6fd5c8 mirror: Fix qiov size for short requests
When mirroring an image of a size that is not a multiple of the
mirror job granularity, the last request would have the right nb_sectors
argument, but a qiov that is rounded up to the next multiple of the
granularity. Don't do this.

This fixes a segfault that is caused by raw-posix being confused by this
and allocating a buffer with request length, but operating on it with
qiov length.

[s/Driver/Drive/ in qemu-iotests 041 as suggested by Eric
--Stefan]

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Chunyan Liu
bc3a7f90ff Fix nocow typos in manpage
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Kirill Batuzov
3f0838ab85 qemu-char: add chr_add_watch support in mux chardev
Forward chr_add_watch call from mux chardev to underlying
implementation.

This should fix bug #1335444

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Paolo Bonzini
8b81bb3b06 virtio-pci: fix MSI memory region use after free
After memory region QOMification QEMU is stricter in detecting
wrong usage of the memory region API.  Here it detected a
memory_region_destroy done before the corresponding
memory_region_del_subregion; the memory_region_destroy is
done by msix_uninit_exclusive_bar, the memory_region_del_subregion
is done by the PCI core's pci_unregister_io_regions before
pc->exit is called.

The problem was introduced by
commit 06a1307379
    virtio-pci: add device_unplugged callback
As noted in that commit log, virtio device kick callbacks need to be
stopped before generic virtio is cleaned up. This is because these are
notifications from pci proxy to the generic virtio device so they need
to be stopped in the unplug call before the virtio device is unrealized.
However interrupts are notifications from the virtio device to
the pci proxy so they need to stay around while the device
is realized.

The memory API misuse caused an assertion when hot-unplugging virtio
devices.  Using the API correctly fixes the assertion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Eduardo Habkost
dd98b71f48 qdev: Fix crash when using non-device class name on -global
This fixes the following crash:

    $ qemu-system-x86_64 -global container.xxx=y
    hw/core/qdev-properties-system.c:399:qdev_add_one_global: Object 0x7f7eff234100 is not an instance of type device
    Aborted (core dumped)

New behavior will be to just warn, just like when non-existing clas
names are used:

    $ qemu-system-x86_64 -global container.xxx=y
    qemu-system-x86_64: Warning: "-global container.xxx=y" not used

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Don Slutz <dslutz@verizon.com>
2014-07-06 09:13:54 +03:00
Eduardo Habkost
319627006a qdev: Don't abort() in case globals can't be set
It would be much better if we didn't terminate QEMU inside
device_post_init(), but at least exiting cleanly is better than aborting
and dumping core.

Before this patch:

    $ qemu-system-x86_64 -global cpu.xxx=y
    qemu-system-x86_64: Property '.xxx' not found
    Aborted (core dumped)

After this patch:

    $ qemu-system-x86_64 -global cpu.xxx=y
    qemu-system-x86_64: Property '.xxx' not found

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2014-07-06 09:13:54 +03:00
Ming Lei
b7c9285b8d hw/virtio: enable common virtio feature for mmio device
Both 'indirect_desc' and 'event_idx' are bus independent features,
and they should be enabled for mmio devices too.

On arm64 quad core VM(qemu-kvm), the patch can increase block I/O
performance a lot with latest linux tree:
        - without the patch: 14K IOPS
        - with the patch: 34K IOPS

fio script:
        [global]
        direct=1
        bsrange=4k-4k
        timeout=10
        numjobs=4
        ioengine=libaio
        iodepth=64

        filename=/dev/vdc
        group_reporting=1

        [f1]
        rw=randread

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Igor Mammedov
22dc50d758 acpi: fix typo in memory hotplug MMIO region name
Reported-by: Sergey Fionov <fionov@gmail.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-06 09:13:54 +03:00
Le Tan
efc8188e93 pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()
In function do_pci_register_device() in file hw/pci/pci.c, move the assignment
of pci_dev->devfn to the position before the call to
pci_device_iommu_address_space(pci_dev) which will use the value of
pci_dev->devfn.

Fixes: 9eda7d373e
    pci: Introduce helper to retrieve a PCI device's DMA address space

Cc: qemu-stable@nongnu.org
Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Kirill Batuzov
812c1057f6 Handle G_IO_HUP in tcp_chr_read for tcp chardev
Since commit cdaa86a54b
("Add G_IO_HUP handler for socket chardev")
GLib limitation results in a bug on Windows host. Steps to reproduce:

Start qemu: qemu-system-i386 -qmp tcp:127.0.0.1:4444:server:nowait
Connect with telnet: telnet 127.0.0.1 4444
Try sending some data from telnet.
Expected result: answers from QEMU.
Observed result: no answers (actually tcp_chr_read is not called at all).

Due to GLib limitations it is not possible to create several watches on one
channel on Windows hosts. See bug #338943 in GNOME bugzilla for details:
https://bugzilla.gnome.org/show_bug.cgi?id=338943

This reimplements commit cdaa86a54b
("Add G_IO_HUP handler for socket chardev") using a single watch:

Handle G_IO_HUP in tcp_chr_read instead. It is already watched by a
corresponding watch.  Remove the second watch with its handler.

Cc: Antonios Motakis <a.motakis@virtualopensystems.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Nikita Belov <zodiac@ispras.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Ming Lei
85d1277e66 virtio: move common virtio properties to bus class device
The two common virtio features can be defined per bus, so move all
into bus class device to make code more clean.

As discussed with cornelia, s390-virtio-blk doesn't support
the two features at all, so keep s390-virtio as it.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> #for s390 ccw
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: rebase and resolve conflicts
2014-07-06 09:13:54 +03:00
Igor Mammedov
9b79a76cdb pc-dimm: error out if memory hotplug is not enabled
fixes QEMU abort in case it's started without memory
hotplug enabled.

as result of fix it will print following messages:
"
-device pc-dimm,id=d1,memdev=m1: memory hotplug is not enabled, enable it on startup
-device pc-dimm,id=d1,memdev=m1: Device 'pc-dimm' could not be initialized
"

Also fixup assert condition to detect hotplug address
space overflow.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by:  Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Hu Tao
0462faee67 numa: check for busy memory backend
Specifying the same memory backend twice leads to an assert:

./x86_64-softmmu/qemu-system-x86_64 -m 512M -enable-kvm -object
memory-backend-ram,size=256M,id=ram0 -numa node,nodeid=0,memdev=ram0
-numa node,nodeid=1,memdev=ram0
qemu-system-x86_64: /scm/qemu/memory.c:1506:
memory_region_add_subregion_common: Assertion `!subregion->container'
failed.
Aborted (core dumped)

Detect and exit with an error message instead.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:53 +03:00
Nikolay Nikolaev
e06cbc376e qtest: enable vhost-user-test
Use qtest-obj-y to get the right library order. CONFIG_POSIX ensures
mingw compilation won't break.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: whitespace tweak
2014-07-06 09:13:53 +03:00
James Hogan
0a2672b7ea mips/kvm: Init EBase to correct KSEG0
The EBase CP0 register is initialised to 0x80000000, however with KVM
the guest's KSEG0 is at 0x40000000. The incorrect value doesn't get
passed to KVM yet as KVM doesn't implement the EBase register, however
we should set it correctly now so as not to break migration/loadvm to a
future version of QEMU that does support EBase.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-05 11:53:07 +02:00
Eduardo Otubo
9d9de254c2 MAINTAINERS: seccomp: change email contact for Eduardo Otubo
Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-03 12:36:15 +01:00
Peter Maydell
92259b7f43 Update version for v2.1.0-rc0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 18:48:01 +01:00
Gonglei
015a33bd05 net: add mmsghdr struct check for L2TPV3
The mmsghdr struct is only introduced in Linux 2.6.32; add a
configure check for it and disable L2TPV3 on hosts which are
too old to provide it, rather than simply failing to compile.

Reported-by: chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1404219488-11196-1-git-send-email-arei.gonglei@huawei.com
[PMM: cleaned up commit message and corrected kernel version number]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 17:42:23 +01:00
Peter Maydell
596742db33 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140701-1' into staging
usb bugfixes.

# gpg: Signature made Tue 01 Jul 2014 14:51:19 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140701-1:
  ccid-card-emulated: use EventNotifier
  usb: initialize libusb_device to avoid crash
  usb: Fix usb-bt-dongle initialization.
  input: fix jumpy mouse cursor with USB mouse emulation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 16:16:19 +01:00
Peter Maydell
f9119a2572 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140701-1' into staging
vnc: two bugfixes (by Peter Lieven).

# gpg: Signature made Tue 01 Jul 2014 12:32:19 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140701-1:
  ui/vnc: fix potential memory corruption issues
  ui/vnc: limit client_cut_text msg payload size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 15:12:05 +01:00
Paolo Bonzini
c1129f6bff ccid-card-emulated: use EventNotifier
Shut up Coverity's complaint about unchecked fcntl return values,
and especially make the code simpler and more efficient.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 15:49:51 +02:00
Peter Maydell
1aa85f46b3 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Tracing pull request

# gpg: Signature made Tue 01 Jul 2014 09:56:27 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  trace: add qemu_system_powerdown_request and qemu_system_shutdown_request trace events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 14:21:50 +01:00
Peter Maydell
8593efa4fb Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Tue 01 Jul 2014 09:47:15 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (23 commits)
  block: add backing-file option to block-stream
  block: extend block-commit to accept a string for the backing file
  block: add helper function to determine if a BDS is in a chain
  block: add QAPI command to allow live backing file change
  qapi: Change back sector-count to sectors-count in quorum QAPI events.
  block/cow: Avoid use of uninitialized cow_bs in error path
  block: simplify bdrv_find_base() and bdrv_find_overlay()
  block: make 'top' argument to block-commit optional
  iotests: Add more tests to quick group
  iotests: Add qemu tests to quick group
  iotests: Simplify qemu-iotests-quick.sh
  qemu-img create: add 'nocow' option
  virtio-blk: remove need for explicit x-data-plane=on option
  qdev: drop iothread property type
  virtio-blk: replace x-iothread with iothread link property
  virtio-blk: move qdev properties into virtio-blk.c
  virtio: fix virtio-blk child refcount in transports
  virtio-blk: drop virtio_blk_set_conf()
  virtio-blk: use aliases instead of duplicate qdev properties
  qdev: add qdev_alias_all_properties()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 13:13:04 +01:00
Peter Lieven
bea60dd767 ui/vnc: fix potential memory corruption issues
this patch makes the VNC server work correctly if the
server surface and the guest surface have different sizes.

Basically the server surface is adjusted to not exceed VNC_MAX_WIDTH
x VNC_MAX_HEIGHT and additionally the width is rounded up to multiple of
VNC_DIRTY_PIXELS_PER_BIT.

If we have a resolution whose width is not dividable by VNC_DIRTY_PIXELS_PER_BIT
we now get a small black bar on the right of the screen.

If the surface is too big to fit the limits only the upper left area is shown.

On top of that this fixes 2 memory corruption issues:

The first was actually discovered during playing
around with a Windows 7 vServer. During resolution
change in Windows 7 it happens sometimes that Windows
changes to an intermediate resolution where
server_stride % cmp_bytes != 0 (in vnc_refresh_server_surface).
This happens only if width % VNC_DIRTY_PIXELS_PER_BIT != 0.

The second is a theoretical issue, but is maybe exploitable
by the guest. If for some reason the guest surface size is bigger
than VNC_MAX_WIDTH x VNC_MAX_HEIGHT we end up in severe corruption since
this limit is nowhere enforced.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:40 +02:00
Peter Lieven
f9a70e7939 ui/vnc: limit client_cut_text msg payload size
currently a malicious client could define a payload
size of 2^32 - 1 bytes and send up to that size of
data to the vnc server. The server would allocated
that amount of memory which could easily create an
out of memory condition.

This patch limits the payload size to 1MB max.

Please note that client_cut_text messages are currently
silently ignored.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:40 +02:00
Jincheng Miao
3ce2144538 usb: initialize libusb_device to avoid crash
If libusb_get_device_list() fails, the uninitialized local variable
libusb_device would be passed to libusb_free_device_list(), that
will cause a crash, like:
(gdb) bt
 #0  0x00007fbbb4bafc10 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #1  0x00007fbbb233e653 in libusb_unref_device (dev=0x6275682d627375)
     at core.c:902
 #2  0x00007fbbb233e739 in libusb_free_device_list (list=0x7fbbb6e8436e,
     unref_devices=<optimized out>) at core.c:653
 #3  0x00007fbbb6cd80a4 in usb_host_auto_check (unused=unused@entry=0x0)
     at hw/usb/host-libusb.c:1446
 #4  0x00007fbbb6cd8525 in usb_host_initfn (udev=0x7fbbbd3c5670)
     at hw/usb/host-libusb.c:912
 #5  0x00007fbbb6cc123b in usb_device_init (dev=0x7fbbbd3c5670)
     at hw/usb/bus.c:106
 ...

So initialize libusb_device at the begin time.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:37 +02:00
Hani Benhabiles
c340a284f3 usb: Fix usb-bt-dongle initialization.
Due to an incomplete initialization, adding a usb-bt-dongle device through HMP
or QMP will cause a segmentation fault.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:37 +02:00
Christian Burger
35e83d10f2 input: fix jumpy mouse cursor with USB mouse emulation
Guest mouse pointer was jumpy, when moving host mouse in the vertical direction (see bug #1327800).

Signed-off-by: Christian Burger <christian@krikkel.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:37 +02:00
Peter Maydell
c26f3a0a6d Merge remote-tracking branch 'remotes/bonzini/memory' into staging
* remotes/bonzini/memory:
  qdev: correctly send DEVICE_DELETED for recursively-deleted devices
  memory: do not give a name to the internal exec.c regions
  memory: MemoryRegion: Add size property
  memory: MemoryRegion: Add may-overlap and priority props
  memory: MemoryRegion: Add container and addr props
  memory: MemoryRegion: replace owner field with QOM parent
  memory: MemoryRegion: QOMify
  memory: MemoryRegion: use /machine as default owner
  libqtest: escape strings in QMP commands, fix leak
  qom: object: Ignore refs/unrefs of NULL
  qom: object: remove parent pointer when unparenting
  mc146818rtc: add "rtc-time" link to "/machine/rtc"
  qom: allow creating an alias of a child<> property
  qom: add a generic mechanism to resolve paths
  qom: add object_property_add_alias()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 11:55:49 +01:00
Peter Maydell
b3959efdbb Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-2.1' into staging
QOM and device refactorings

* QOM unparenting cleanup
* IRQ conversion to QOM

# gpg: Signature made Tue 01 Jul 2014 04:03:23 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-2.1:
  irq: Slim conversion of qemu_irq to QOM
  irq: Allocate IRQs individually
  hw: Fix qemu_allocate_irqs() leaks
  sdhci: Fix misuse of qemu_free_irqs()
  qom: Remove parent pointer when unparenting

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 11:00:53 +01:00
Peter Maydell
d94a658712 Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  configure: Fix -lm test, so that tools can be compiled on hosts that require -lm
  virtio-scsi: scsi events must be converted to target endianness
  virtio-scsi: virtio_scsi_push_event() lacks VirtIOSCSIReq parsing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 10:28:52 +01:00
Yang Zhiyong
bc78cff975 trace: add qemu_system_powerdown_request and qemu_system_shutdown_request trace events
We have the experience that the guest doesn't stop successfully
though it was instructed to shut down.

The root cause may be not in QEMU mostly.  However, QEMU is often
suspected at the beginning just because the issue occurred in
virtualization environment.

Therefore, we need to affirm that QEMU received the shutdown
request and raised ACPI irq from "virsh shutdown" command,
virt-manger or stopping QEMU process to the VM .
So that we can affirm the problems was belonged to the Guset OS
rather than the QEMU itself.

When we stop guests by "virsh shutdown" command or virt-manger,
or stopping QEMU process, qemu_system_powerdown_request() or
qemu_system_shutdown_request() is called. Then the below functions
in main_loop_should_exit() of Vl.c are called roughly in the
following order.

	if (qemu_powerdown_requested())
		qemu_system_powerdown()
			monitor_protocol_event(QEVENT_POWERDOWN, NULL)

	OR

	if(qemu_shutdown_requested()}
		monitor_protocol_event(QEVENT_SHUTDOWN, NULL);

The tracepoint of monitor_protocol_event() already exists, but no
tracepoints are defined for qemu_system_powerdown_request() and
qemu_system_shutdown_request(). So this patch adds two tracepoints for
the two functions. We believe that it will become much easier to
isolate the problem mentioned above by these tracepoints.

Signed-off-by: Yang Zhiyong <yangzy.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:56:13 +02:00
Jeff Cody
13d8cc515d block: add backing-file option to block-stream
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block job.

For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.

In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.

With this extension to the block-stream api, the user is able to change
the backing file of the active layer as part of the block-stream
operation.

This allows the change to be 'safe', in the sense that if the attempt
to write the active image metadata fails, then the block-stream
operation returns failure, without disrupting the guest.

If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody
54e2690090 block: extend block-commit to accept a string for the backing file
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block commit.

For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.

In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.

With this extension to the block-commit api, the user is able to change
the backing file of the overlay image as part of the block-commit
operation.

This allows the change to be 'safe', in the sense that if the attempt
to write the overlay image metadata fails, then the block-commit
operation returns failure, without disrupting the guest.

If the commit top is the active layer, then specifying the backing
file string will be treated as an error (there is no overlay image
to modify in that case).

If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody
5a6684d2b9 block: add helper function to determine if a BDS is in a chain
This is a small helper function, to determine if 'base' is in the
chain of BlockDriverState 'top'.  It returns true if it is in the chain,
and false otherwise.

If either argument is NULL, it will also return false.

Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody
fa40e65622 block: add QAPI command to allow live backing file change
This allows a user to make a live change to the backing file recorded in
an open image.

The image file to modify can be specified 2 ways:

1) image filename
2) image node-name

Note: this does not cause the backing file itself to be reopened; it
merely changes the backing filename in the image file structure, and
in internal BDS structures.

It is the responsibility of the user to pass a filename string that
can be resolved when the image chain is reopened, and the filename
string is not validated.

A good analogy for this command is that it is a live version of
'qemu-img rebase -u', with respect to changing the backing file string.

[Jeff is offline so I respun this patch in his absence.  Dropped image
filename since using node-name is preferred and this is a new command.
No need to introduce the limitations of finding images by filename.
--Stefan]

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:46:38 +02:00
Alexey Kardashevskiy
f80ea9862f configure: Fix -lm test, so that tools can be compiled on hosts that require -lm
The existing test whether "-lm" needs to be included or not is
insufficient as it reports false negative on Fedora20/ppc64.
This happens because sin(0.0) is a constant value which compiler
can safely throw away and therefore there is no need to add "-lm".
As the result, qemu-nbd/qemu-io/qemu-img tools cannot compile.

This adds a global variable and uses it in the test to prevent
from optimization.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[Use Peter's improvement on the test to fool LTO, and remove the
 now useless -lm addition in Makefile.target. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:36:28 +02:00
Paolo Bonzini
352e8da743 qdev: correctly send DEVICE_DELETED for recursively-deleted devices
When a device is unparented (i.e. made completely hidden from management)
we want to send a DEVICE_DELETED event only if the device actually was
realized.  This avoids raising DEVICE_DELETED events when device_add
fails.

However, this does not work right for recursively-deleted
devices: the whole tree is _first_ unrealized, _then_ unparented.
Then device_unparent sees realized==false and fails to trigger
the event.  The solution is simply to move have_realized into
the DeviceState struct.  If device_add fails, we never set the
new field to true and DEVICE_DELETED is not sent.

Fixes qemu-iotests testcase 067 (broken by commit 5942a19, though that
commit in turn fixed a possible segfault in the same test).

Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:42 +02:00
Paolo Bonzini
1f6245e5ab memory: do not give a name to the internal exec.c regions
There is no need to have them visible under /machine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
52aef7bba7 memory: MemoryRegion: Add size property
To allow devices to dynamically resize the device. The motivation is
to allow devices with variable size to init their memory_region
without size early and then correctly populate size at realize() time.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
d33382da9a memory: MemoryRegion: Add may-overlap and priority props
QOM propertyify the .may-overlap and .priority fields. The setters
will re-add the memory as a subregion if needed (i.e. the values change
when the memory region is already contained).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
409ddd0139 memory: MemoryRegion: Add container and addr props
Expose the already existing .parent and .addr fields as QOM properties.
.parent (i.e. the field describing the memory region that contains this
one in Memory hierachy) is renamed "container". This is to avoid
confusion with the QOM parent.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters.  Do not unref parent on releasing the property. Clean
 up error propagation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
22a893e4f5 memory: MemoryRegion: replace owner field with QOM parent
The two are now the same.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
b4fefef9d5 memory: MemoryRegion: QOMify
QOMify memory regions as an Object. The former init() and destroy()
routines become instance_init() and instance_finalize() resp.

memory_region_init() is re-implemented to be:
object_initialize() + set fields

memory_region_destroy() is re-implemented to call unparent().

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Add newly-created MR as child, unparent on destruction. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
b5c2c3d0c8 memory: MemoryRegion: use /machine as default owner
This will be added (after QOMification) as the QOM parent.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
563890c7c7 libqtest: escape strings in QMP commands, fix leak
libqtest is using g_strdup_printf to format QMP commands, but
this does not work if the argument strings need to be escaped.
Instead, use the fancy %-formatting functionality of QObject.
The only change required in tests is that strings have to be
formatted as %s, not '%s' or \"%s\".  Luckily this usage of
parameterized QMP commands is not that frequent.

The leak is in socket_sendf.  Since we are extracting the send
loop to a new function, fix it now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
8ffad850ef qom: object: Ignore refs/unrefs of NULL
Just do nothing if passed NULL for a ref or unref. This avoids
call sites that manage a combination of NULL or non-NULL pointers
having to add iffery around every ref and unref.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
c28322d10c qom: object: remove parent pointer when unparenting
Certain parts of the QOM framework test this pointer to determine if
an object is parented. Nuke it when the object is unparented to allow
for reuse of an object after unparenting.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Marcelo Tosatti
654a36d857 mc146818rtc: add "rtc-time" link to "/machine/rtc"
Add a link to rtc under /machine providing a stable
location for management apps to query the value of the
time.  The link should be added by any object that sends
RTC_TIME_CHANGE events.

{"execute":"qom-get","arguments":{"path":"/machine","property":"rtc-time"} }

Suggested by Paolo Bonzini and Andreas Faerber.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:30 +02:00
Paolo Bonzini
d190698e6f qom: allow creating an alias of a child<> property
Child properties must be unique.  Fix this problem by
turning their aliases into links.

The resolve function that forwards to the target property
does not have any knowledge of the target property's type,
so it works fine.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:17:48 +02:00
Paolo Bonzini
64607d0881 qom: add a generic mechanism to resolve paths
It may be desirable to have custom link<> properties that do more
than just store an object.  Even the addition of a "check"
function is not enough if setting the link has side effects
or if a non-standard reference counting is preferrable.

Avoid the assumption that the opaque field of a link<> is a
LinkProperty struct, by adding a generic "resolve" callback
to ObjectProperty.  This fixes aliases of link properties.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:17:48 +02:00
Benoît Canet
4e855baabf qapi: Change back sector-count to sectors-count in quorum QAPI events.
fe069d9d had aligned code and documentation while dropping the s from the
actual JSON output. Fix that.

This also fix test/qemu-iotest/081 since the missing s was causing a permutation.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Peter Maydell
6764579f89 block/cow: Avoid use of uninitialized cow_bs in error path
Commit 25814e8987 introduced an error-exit code path which does
a "goto exit" before the cow_bs variable is initialized, meaning
we would call bdrv_unref() on an uninitialized variable and
likely segfault. Fix this by moving the NULL-initialization
to the top of the function and making the exit code path handle
the case where it is NULL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Jeff Cody
4caf0fcd45 block: simplify bdrv_find_base() and bdrv_find_overlay()
This simplifies the function bdrv_find_overlay().  With this change,
bdrv_find_base() is just a subset of usage of bdrv_find_overlay(),
so this also takes advantage of that.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Jeff Cody
7676e2c597 block: make 'top' argument to block-commit optional
Now that active layer block-commit is supported, the 'top' argument
no longer needs to be mandatory.

Change it to optional, with the default being the active layer in the
device chain.

[kwolf: Rebased and resolved conflict in tests/qemu-iotests/040]

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Max Reitz
c891e3bbc5 iotests: Add more tests to quick group
While at it, add some more tests to the quick group (those that run with
-nocache in under three seconds on my HDD).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Max Reitz
ee9dd1fc90 iotests: Add qemu tests to quick group
Now that qemu-iotests-quick.sh supports tests using the qemu binary, we
are free to add such tests to the quick group.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Max Reitz
214a081a0d iotests: Simplify qemu-iotests-quick.sh
As of the "iotests: Allow out-of-tree run" series, the qemu-iotests may
(and should) be run directly in the build tree and will then guess the
binary paths themselves. Therefore, qemu-iotests-quick.sh does not need
to (and should not) enter the source path anymore; also, it does not
need to specify the binaries because "check" will guess them
automatically.

As a side-effect, tests using qemu may now be added to the quick group.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Chunyan Liu
4ab1559085 qemu-img create: add 'nocow' option
Add 'nocow' option so that users could have a chance to set NOCOW flag to
newly created files. It's useful on btrfs file system to enhance performance.

Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this bad
performance is to turn off COW attributes on VM files. Generally, there are
two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then
all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.

This patch tries the second way, according to the option, it could add NOCOW
per file.

For most block drivers, since the create file step is in raw-posix.c, so we
can do setting NOCOW flag ioctl in raw-posix.c only.

But there are some exceptions, like block/vpc.c and block/vdi.c, they are
creating file by calling qemu_open directly. For them, do the same setting
NOCOW flag ioctl work in them separately.

[Fixed up 082.out due to the new 'nocow' creation option
--Stefan]

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:12 +02:00
Cédric Le Goater
424baff549 virtio-scsi: scsi events must be converted to target endianness
Virtio SCSI Events need to be byteswapped before being pushed
when host and guest have a different endianness. Not doing so
breaks hotplug of virtio scsi disks, with the following error
message being printed in the guest console:

virtio_scsi: Unsupport virtio scsi event 1000000

This issue got uncovered while testing disk hotplug with a PowerKVM
ppc64le guest. I have checked that this issue also affects a x86_64
guest run on a ppc64 host.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 09:40:38 +02:00
Greg Kurz
dfecbb95e3 virtio-scsi: virtio_scsi_push_event() lacks VirtIOSCSIReq parsing
Hotplug of a virtio scsi disk is currently broken: no disk appears in the
guest (verified with a fedora 20 host running a fedora 20 guest with KVM).
Bisect leeds to Paolo's patches to support any_layout, especially this
commit:

commit 36b15c79aa
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Tue Jun 10 16:21:18 2014 +0200

    virtio-scsi: start preparing for any_layout

It modifies virtio_scsi_pop_req() so that it is up to the callers to parse
the virtio scsi request. It seems that virtio_scsi_push_event() was not
modified accordingly...

This patch adds a call to virtio_scsi_parse_req(). It also drops some
sanity checks that are already performed by virtio_scsi_parse_req().

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 09:40:38 +02:00
Stefan Hajnoczi
ef7c7ff6d4 qom: add object_property_add_alias()
Sometimes an object needs to present a property which is actually on
another object, or it needs to provide an alias name for an existing
property.

Examples:
  a.foo -> b.foo
  a.old_name -> a.new_name

The new object_property_add_alias() API allows objects to alias a
property on the same object or another object.  The source and target
names can be different.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
8698e110f8 virtio-blk: remove need for explicit x-data-plane=on option
The x-data-plane=on|off option is no longer useful because the
iothread=<iothread> option conveys the same information plus which
IOThread to use.

Do not delete x-data-plane=on|off yet as a convenience to people using
this legacy experimental option.  We will drop it in QEMU 2.2.

Instead, turn on data-plane when either x-data-plane=on or
iothread=<iothread> are used.  The following command-line uses
data-plane:

  qemu -device virtio-blk-pci,iothread=foo,drive=drive0

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
1351d1ec89 qdev: drop iothread property type
The iothread property type is no longer used and can be removed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
467b3f33e9 virtio-blk: replace x-iothread with iothread link property
Up until now -device virtio-blk-pci,x-iothread=<id> was used to assign
an IOThread.  This was a temporary solution while we cleaned up QOM link
properties.

This patch switches over to a QOM link property since it is now possible
to restrict the setter to unrealized instances and automatically unref
the IOThread when the virtio-blk-pci device is freed.

Since the "iothread" property is a QOM property and not a qdev property,
we must alias it explicitly for virtio-blk-pci, as well as CCW and
s390-virtio.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
32a877e405 virtio-blk: move qdev properties into virtio-blk.c
There is no need to make DEFINE_VIRTIO_BLK_PROPERTIES() public.  Inline
it into virtio-blk.c so it cannot be used by mistake from other source
files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
c5d49db446 virtio: fix virtio-blk child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-blk child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
f7fedda84a virtio-blk: drop virtio_blk_set_conf()
This function is no longer used since parent objects now use child
aliases to set the VirtIOBlkConf directly.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
caffdac363 virtio-blk: use aliases instead of duplicate qdev properties
virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the
qdev properties of their VirtIOBlock child.  This approach does not work
well with string or pointer properties since we must be careful about
leaking or double-freeing them.

Use the QOM alias property to forward property accesses to the
VirtIOBlock child.  This way no duplication is necessary.

Remember to stop calling virtio_blk_set_conf() so that we don't clobber
the values already set on the VirtIOBlock instance.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
67cc7e0aac qdev: add qdev_alias_all_properties()
The qdev_alias_all_properties() function creates QOM alias properties
for each qdev property on a DeviceState.  This is useful for parent
objects that wish to forward property accesses to their children.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
ee512c6f21 virtio-blk: move x-data-plane qdev property to virtio-blk.h
Move the x-data-plane property.  Originally it was outside since not
every transport may wish to support dataplane.  But that makes little
sense when we have a dedicated CONFIG_VIRTIO_BLK_DATA_PLANE ifdef
already.

This move makes it easier to switch to property aliases in the next
patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Cornelia Huck
a9968c77d5 dataplane: bail out on unsupported transport
If the virtio transport does not support notifiers (like s390-virtio),
we can't use dataplane. Bail out early and let the user know what is
wrong.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
dc80ca6cd6 virtio-blk: avoid qdev property definition duplication
It becomes unwiedly to duplicate all virtio-blk qdev property
definitions due to an #ifdef.  The C preprocessor syntax makes it a
little hard to resolve this cleanly but we can extract the #ifdef and
call a macro it defines later.

Avoiding duplication is important since it will only get worse when we
move the x-data-plane qdev property here too.  We'd have a combinatorial
explosion since x-data-plane has its own #ifdef.

Suggested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Andreas Färber
615c489570 irq: Slim conversion of qemu_irq to QOM
As a prequel to any big Pin refactoring plans, do an in-place conversion
of qemu_irq to an Object, so that we can reference it in link<> properties.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[ PC Changes:
 * Removed array-alloctor ref counting logic (limit changes just to
 * single IRQ allocator)
 * Removed WIP marking from subject line
]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-07-01 04:12:48 +02:00
Peter Crosthwaite
f173d57a4c irq: Allocate IRQs individually
Allocate each IRQ individually on array allocations. This prepares for
QOMification of IRQs, where pointers to individual IRQs may be taken
and handed around for usage as QOM Links. The g_renew() scheme used here
is too fragile and would break all existing links should an IRQ list
be extended.

We now have to pass the IRQ count to qemu_free_irqs(). We have so few
call sites however, so this change is reasonably trivial.

Cc: agarcia@igalia.com
Cc: mst@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-07-01 04:02:53 +02:00
Andreas Färber
f3c7d0389f hw: Fix qemu_allocate_irqs() leaks
Replace qemu_allocate_irqs(foo, bar, 1)[0]
with qemu_allocate_irq(foo, bar, 0).

This avoids leaking the dereferenced qemu_irq *.

Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PC Changes:
 * Applied change to instance in sh4/sh7750.c
]
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Kirill Batuzov <batuzovk@ispras.ru>
[AF: Fix IRQ index in sh4/sh7750.c]
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-30 21:13:30 +02:00
Andreas Färber
127a4e1a51 sdhci: Fix misuse of qemu_free_irqs()
It does a g_free() on the pointer, so don't pass a local &foo reference.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-30 21:13:30 +02:00
Peter Crosthwaite
d15ae221ea qom: Remove parent pointer when unparenting
Certain parts of the QOM framework test this pointer to determine if
an object is parented. Nuke it when the object is unparented to allow
for reuse of an object after unparenting.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-30 21:13:30 +02:00
Peter Maydell
53a259da56 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140630.0' into staging
VFIO patches: MSI-X masking performance fix, Endian fixes, fix runstate on device error

# gpg: Signature made Mon 30 Jun 2014 18:13:40 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140630.0:
  vfio: use correct runstate
  vfio: Make BARs native endian
  vfio-pci: Fix MSI-X masking performance
  vfio-pci: Fix MSI/X debug code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 18:31:07 +01:00
Paolo Bonzini
ba29776fd8 vfio: use correct runstate
io-error is for block device errors; it should always be preceded
by a BLOCK_IO_ERROR event.  I think vfio wants to use
RUN_STATE_INTERNAL_ERROR instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:56:08 -06:00
Alexey Kardashevskiy
c40708176a vfio: Make BARs native endian
Slow BAR access path is used when VFIO fails to mmap() BAR.
Since this is just a transport between the guest and a device, there is
no need to do endianness swapping.

This changes BARs to use native endianness. Since non-ROM BARs were
doing byte swapping, we need to remove it so does the patch.
As the result, this eliminates cancelling byte swaps and there is
no change in behavior for non-ROM BARs.

ROM BARs were declared little endian too but byte swapping was not
implemented for them so they never actually worked on big endian systems
as there was no cancelling byte swap. This fixes endiannes for ROM BARs
by declaring them native endian and only fixing access sizes as it is
done for non-ROM BARs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:52:58 -06:00
Alex Williamson
f4d45d4782 vfio-pci: Fix MSI-X masking performance
There are still old guests out there that over-exercise MSI-X masking.
The current code completely sets-up and tears-down an MSI-X vector on
the "use" and "release" callbacks.  While this is functional, it can
slow an old guest to a crawl.  We can easily skip the KVM parts of
this so that we keep the MSI route and irqfd setup.  We do however
need to switch VFIO to trigger a different eventfd while masked.
Actually, we have the option of continuing to use -1 to disable the
trigger, but by using another EventNotifier we can allow the MSI-X
core to emulate pending bits and re-fire the vector once unmasked.
MSI code gets updated as well to use the same setup and teardown
structures and functions.

Prior to this change, an igbvf assigned to a RHEL5 guest gets about
20Mbps and 50 transactions/s with netperf (remote or VF->PF).  With
this change, we get line rate and 3k transactions/s remote or 2Gbps
and 6k+ transactions/s to the PF.  No significant change is expected
for newer guests with more well behaved MSI-X support.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:50:33 -06:00
Alex Williamson
9035f8c09b vfio-pci: Fix MSI/X debug code
Use the correct MSI message function for debug info.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:50:33 -06:00
Peter Maydell
8954000b9e Merge remote-tracking branch 'remotes/bonzini/nbd-next' into staging
* remotes/bonzini/nbd-next:
  nbd: Handle NBD_OPT_LIST option.
  nbd: Handle fixed new-style clients.
  nbd: Shutdown socket before closing.
  nbd: Don't validate from and len in NBD_CMD_DISC.
  nbd: Don't export a block device with no medium.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 16:13:32 +01:00
Peter Maydell
ec9fe956d5 Merge remote-tracking branch 'remotes/bonzini/small-fixes' into staging
* remotes/bonzini/small-fixes:
  tests/test-qmp-event: fix for GLib < 2.31
  serial: poll the serial console with G_IO_HUP

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 15:56:00 +01:00
Peter Maydell
a4b31047c8 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-cocoa-20140630' into staging
cocoa.next:
 * Honour -show-cursor option
 * Fix handling of absolute positioning devices
 * Cope with first surface being same as initial window size

# gpg: Signature made Mon 30 Jun 2014 13:48:46 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-cocoa-20140630:
  ui/cocoa: Honour -show-cursor command line option
  ui/cocoa: Fix handling of absolute positioning devices
  ui/cocoa: Add utility method to check if point is within window
  ui/cocoa: Cope with first surface being same as initial window size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 15:42:35 +01:00
Peter Maydell
a156dd9a22 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140630' into staging
target-arm:
 * provide PL031 RTC in virt board
 * fix missing pxa2xx and strongarm vmstate
 * convert cadence_ttc to instance_init
 * fix libvixl format strings and README

# gpg: Signature made Mon 30 Jun 2014 13:44:33 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140630:
  disas/libvixl: Fix wrong format strings
  disas/libvixl: Update README for version base
  timer: cadence_ttc: Convert to instance_init
  hw/arm/pxa2xx_gpio: Correct and register vmstate
  hw/arm/pxa2xx_gpio: Fix handling of GPSR/GPCR reads
  hw/arm/strongarm: Wire up missing GPIO and PPC vmstate
  hw/arm/strongarm: Fix handling of GPSR/GPCR reads
  hw/arm/virt: Provide PL031 RTC

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 15:16:26 +01:00
Paolo Bonzini
af35e5e1fb tests/test-qmp-event: fix for GLib < 2.31
On old GLib, the test needs a g_thread_init call.

Reported-by: Wenchao Xia <wenchaoqemu@gmail.com>
Tested-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 15:06:11 +02:00
Roger Pau Monne
e02bc6de30 serial: poll the serial console with G_IO_HUP
On FreeBSD polling a master pty while the other end is not connected
with G_IO_OUT only results in an endless wait. This is different from
the Linux behaviour, that returns immediately. In order to demonstrate
this, I have the following example code:

http://xenbits.xen.org/people/royger/test_poll.c

When executed on Linux:

$ ./test_poll
In callback

On FreeBSD instead, the callback never gets called:

$ ./test_poll

So, in order to workaround this, poll the source with G_IO_HUP (which
makes the code behave the same way on both Linux and FreeBSD).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: xen-devel@lists.xenproject.org
[Add hw/char/cadence_uart.c too. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 15:04:34 +02:00
Hani Benhabiles
32d7d2e068 nbd: Handle NBD_OPT_LIST option.
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 12:50:17 +02:00
Hani Benhabiles
f5076b5a75 nbd: Handle fixed new-style clients.
When this flag is set, the server tells the client that it can send another
option if the server received a request with an option that it doesn't
understand instead of directly closing the connection.

Also add link to the most up-to-date documentation.

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 12:50:17 +02:00
Hani Benhabiles
27e5eae457 nbd: Shutdown socket before closing.
This forces finishing data sending to client before closing the socket like in
exports listing or replying with NBD_REP_ERR_UNSUP cases.

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 12:50:12 +02:00
Stefan Weil
ffebe89975 disas/libvixl: Fix wrong format strings
When the compiler is told to check the arguments of AppendToOutput,
it reports several errors of this kind:

error: format ‘%d’ expects argument of type ‘int’,
 but argument 3 has type ‘int64_t {aka long int}’ [-Werror=format]

Fix those bugs by using the correct format strings with PRId64, PRIx64.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1403113751-19799-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 22:04:28 +01:00
Richard Henderson
1ce8be7e0d disas/libvixl: Update README for version base
Signed-off-by: Richard Henderson <rth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 22:02:52 +01:00
Peter Maydell
13aefd303c ui/cocoa: Honour -show-cursor command line option
Honour the -show-cursor command line option (which forces the mouse pointer
to always be displayed even when input is grabbed) in the Cocoa UI backend.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-5-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Peter Maydell
f61c387ea6 ui/cocoa: Fix handling of absolute positioning devices
Fix handling of absolute positioning devices, which were basically
unusable for two separate reasons:
 (1) as soon as you pressed the left mouse button we would call
     CGAssociateMouseAndMouseCursorPosition(FALSE), which means that
     the absolute coordinates of the mouse events are never updated
 (2) we didn't account for MacOSX coordinate origin being bottom left
     rather than top right, and so all the Y values sent to the guest
     were inverted

We fix (1) by aligning our behaviour with the SDL UI backend for
absolute devices:
 * when the mouse moves into the window we do a grab (which means
   hiding the host cursor and sending special keys to the guest)
 * when the mouse moves out of the window we un-grab
and fix (2) by doing the correct transformation in the call to
qemu_input_queue_abs().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-4-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Peter Maydell
5dd45bee58 ui/cocoa: Add utility method to check if point is within window
Add a utility method to check whether a point is within the current window
bounds, and use it in the various places in the mouse handling code that
were opencoding the check.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-3-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Peter Maydell
381600dad9 ui/cocoa: Cope with first surface being same as initial window size
Do the recalculation of the content dimensions in switchSurface if the
current cdx is zero as well as if the new surface is a different size to
the current window. This catches the case where the first surface registered
happens to be 640x480 (our current window size), and fixes a bug where we
would always display a black screen until the first surface of a different
size was registered.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-2-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Alistair Francis
b841642daa timer: cadence_ttc: Convert to instance_init
SysBusDevice::init is deprecated. Convert to instance_init
as prescribed by QOM conventions.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1223f14833159b9ea5c57734dd2ffa88d4b15a83.1403583596.git.alistair.francis@xilinx.com
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 18:38:40 +01:00
Peter Maydell
166fa99996 hw/arm/pxa2xx_gpio: Correct and register vmstate
The pxa2xx-gpio device has a VMStateDescription, but it was accidentally
never actually registered, and it wasn't quite correct. Remove the
'lines' field (this is a device property, not mutable state), add the
missing 'prev_level' field, and set dc->vmsd so it actually gets used.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:40 +01:00
Peter Maydell
ab7a0f0b6d hw/arm/pxa2xx_gpio: Fix handling of GPSR/GPCR reads
The PXA2xx GPIO GPSR and GPCR registers are write-only, with reads being
undefined behaviour. Instead of having GPCR return 31337 and GPSR return
the value last written, make both log the guest error and return 0.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:40 +01:00
Peter Maydell
ed657d7117 hw/arm/strongarm: Wire up missing GPIO and PPC vmstate
The VMStateDescription structs for the GPIO and PPC devices were
accidentally never wired up. Add missing state fields and register
them via dc->vmsd.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:39 +01:00
Peter Maydell
92335a0d40 hw/arm/strongarm: Fix handling of GPSR/GPCR reads
The StrongARM GPIO GPSR and GPCR registers are write-only, with reads being
undefined behaviour. Instead of having GPCR return 31337 and GPSR return
the value last written, make both log the guest error and return 0.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:39 +01:00
Peter Maydell
6e411af935 hw/arm/virt: Provide PL031 RTC
UEFI mandates that the platform must include an RTC, so provide
one in 'virt', using the PL031. This is also useful for directly
booting Linux kernels which would otherwise have to run ntpdate.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-06-29 18:38:39 +01:00
Peter Maydell
9328cfd2fe Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,vhost,virtio fixes, enhancements

virtio bi-endian support
new command to resync RTC
misc bugfixes and cleanups

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 29 Jun 2014 17:41:13 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (37 commits)
  tests: add human format test for string output visitor
  vhost-net: disable when cross-endian
  target-ppc: enable virtio endian ambivalent support
  virtio-9p: use virtio wrappers to access headers
  virtio-serial-bus: use virtio wrappers to access headers
  virtio-scsi: use virtio wrappers to access headers
  virtio-blk: use virtio wrappers to access headers
  virtio-balloon: use virtio wrappers to access page frame numbers
  virtio-net: use virtio wrappers to access headers
  virtio: allow byte swapping for vring
  virtio: memory accessors for endian-ambivalent targets
  virtio: add endian-ambivalent support to VirtIODevice
  cpu: introduce CPUClass::virtio_is_big_endian()
  exec: introduce target_words_bigendian() helper
  virtio: add subsections to the migration stream
  virtio-rng: implement per-device migration calls
  virtio-balloon: implement per-device migration calls
  virtio-serial: implement per-device migration calls
  virtio-blk: implement per-device migration calls
  virtio-net: implement per-device migration calls
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 18:09:51 +01:00
Hu Tao
b4900c0e8a tests: add human format test for string output visitor
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Greg Kurz
371df9f5e0 vhost-net: disable when cross-endian
As of today, vhost assumes guest and host have the same endianness.
This is definitely not compatible with modern PPC64 and ARM that
can change endianness at runtime. Let's disable vhost-net and print
an error message when we detect such a case:

qemu-system-ppc64: vhost-net does not support cross-endian
qemu-system-ppc64: unable to start vhost net: 38: falling back on userspace virtio

This way users can continue to run VMs without changing their setup and
have a chance to know that performance will be impacted.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Greg Kurz
7826c2b2a4 target-ppc: enable virtio endian ambivalent support
The device endianness is the cpu endianness at device reset time.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Greg Kurz
d64ccb91ad virtio-9p: use virtio wrappers to access headers
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Rusty Russell
e0ab7fac65 virtio-serial-bus: use virtio wrappers to access headers
We also fix max_nr_ports at reset time as the device endianness may have
changed.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  fix max_nr_ports at reset time,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Rusty Russell
8c085dbe6d virtio-scsi: use virtio wrappers to access headers
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  converted new tswap locations to virtio_tswap,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
783d189725 virtio-blk: use virtio wrappers to access headers
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
8609d2a87a virtio-balloon: use virtio wrappers to access page frame numbers
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
1399c60d70 virtio-net: use virtio wrappers to access headers
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  converted new tswap locations to virtio_tswap,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
cee3ca0028 virtio: allow byte swapping for vring
Quoting original text from Rusty: "This is based on a simpler patch by Anthony
Liguouri".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ add VirtIODevice * argument to most helpers,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
0f5d1d2a49 virtio: memory accessors for endian-ambivalent targets
This is the virtio-access.h header file taken from Rusty's "endian-ambivalent
targets using legacy virtio" patch. It introduces helpers that should be used
when accessing vring data or by drivers for data that contains headers.
The virtio config space is also target endian, but the current code already
handles that with the virtio_is_big_endian() helper. There is no obvious
benefit at using the virtio accessors in this case.

Now we have two distinct paths: a fast inline one for fixed endian targets,
and a slow out-of-line one for targets that define the new TARGET_IS_BIENDIAN
macro.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ relicensed virtio-access.h to GPLv2+ on Rusty's request,
  pass &address_space_memory to physical memory accessors,
  per-device endianness,
  virtio tswap16 and tswap64 helpers,
  faspath for fixed endian targets,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Cc: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
616a655219 virtio: add endian-ambivalent support to VirtIODevice
Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.

We migrate this property in a subsection, after the device descriptor. This
means the load code must not rely on it until it is restored. As a consequence,
the vring sanity checks had to be moved after the call to vmstate_load_state().
We enforce paranoia by poisoning the property at the begining of virtio_load().

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
bf7663c4bd cpu: introduce CPUClass::virtio_is_big_endian()
If we want to support targets that can change endianness (modern PPC and
ARM for the moment), we need to add a per-CPU class method to be called
from the virtio code. The virtio_ prefix in the name is a hint for people
to avoid misusage (aka. anywhere but from the virtio code).

The default behaviour is to return the compile-time default target
endianness.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
98ed8ecfc9 exec: introduce target_words_bigendian() helper
We currently have a virtio_is_big_endian() helper that provides the target
endianness to the virtio code. As of today, the helper returns a fixed
compile-time value. Of course, this will have to change if we want to
support target endianness changes at run-time.

Let's move the TARGET_WORDS_BIGENDIAN bits out to a new helper and have
virtio_is_big_endian() implemented on top of it.

This patch doesn't change any functionality.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
6b321a3df5 virtio: add subsections to the migration stream
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio, using the VMState code.

Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:

Unknown savevm section type 5
load of migration failed

Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
3902d49e13 virtio-rng: implement per-device migration calls
While we are here, we also check virtio_load() return value.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
9ea2511c85 virtio-balloon: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
13c6855ab0 virtio-serial: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
b2b295a74a virtio-blk: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
037dab2fe8 virtio-net: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
1b5fc0dea4 virtio: introduce device specific migration calls
In order to migrate virtio subsections, they should be streamed after
the device itself. We need the device specific code to be called from
the common migration code to achieve this. This patch introduces load
and save methods for this purpose.

Suggested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Alexander Graf
e38e943a1f virtio-serial: don't migrate the config space
The device configuration is set at realize time and never changes. It
should not be migrated as it is done today. For the sake of compatibility,
let's just skip them at load time.

Signed-off-by: Alexander Graf <agraf@suse.de>
[ added missing casts to uint16_t *,
  added From, SoB and commit message,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Cédric Le Goater
032a74a1c0 virtio-net: byteswap virtio-net header
TCP connectivity fails when the guest has a different endianness.
The packets are silently dropped on the host by the tap backend
when they are read from user space because the endianness of the
virtio-net header is in the wrong order. These lines may appear
in the guest console:

[  454.709327] skbuff: bad partial csum: csum=8704/4096 len=74
[  455.702554] skbuff: bad partial csum: csum=8704/4096 len=74

The issue that got first spotted with a ppc64le PowerKVM guest,
but it also exists for the less common case of a x86_64 guest run
by a big-endian ppc64 TCG hypervisor.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Michael S. Tsirkin
a628fc8dae vhost-user: typo fixups
Fix typo in field name.
Strip two consequitive empty lines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Damjan Marion
3fd74b8407 vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message
Old code was affected by memory gaps which resulted in buffer pointers
pointing to address outside of the mapped regions.

Here we are introducing following changes:
 - new function qemu_get_ram_block_host_ptr() returns host pointer
   to the ram block, it is needed to calculate offset of specific
   region in the host memory
 - new field mmap_offset is added to the VhostUserMemoryRegion. It
   contains offset where specific region starts in the mapped memory.
   As there is stil no wider adoption of vhost-user agreement was made
   that we will not bump version number due to this change
 - other fileds in VhostUserMemoryRegion struct are not changed, as
   they are all needed for usermode app implementation
 - region data is not taken from ram_list.blocks anymore, instead we
   use region data which is alredy calculated for use in vhost-net
 - Now multiple regions can have same FD and user applicaton can call
   mmap() multiple times with the same FD but with different offset
   (user needs to take care for offset page alignment)

Signed-off-by: Damjan Marion <damarion@cisco.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Damjan Marion <damarion@cisco.com>
2014-06-29 19:39:40 +03:00
Eduardo Habkost
12d6e4640c numa: Reject configuration if not all node IDs are present
We don't support sparse NUMA node IDs yet, so this changes QEMU to
reject configs where not all nodes are present.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-29 18:59:42 +03:00
Eduardo Habkost
1945b9d8b0 numa: Reject duplicate node IDs
The same nodeid shouldn't appear multiple times in the command-line.

In addition to detecting command-line mistakes, this will fix a bug
where nb_numa_nodes may become larger than MAX_NODES (and cause
out-of-bounds access on the numa_info array).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-29 18:59:42 +03:00
Eduardo Habkost
1af878e049 numa: Keep track of NUMA nodes present on the command-line
Based on "enable sparse node numbering" patch from Nishanth Aravamudan,
but without the code to actually support sparse node IDs. This just adds
the code to keep track of present/non-present nodes on the command-line,
without changing any behavior.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
[Rename max_numa_node to max_numa_nodeid -Eduardo]
[Initialize max_numa_nodeid to 0 -Eduardo]
[Use MAX() macro when setting max_numa_nodeid -Eduardo]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-29 18:59:41 +03:00
Dr. David Alan Gilbert
2f5732e964 Allow mismatched virtio config-len
Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.

Unfortunately, there are cases where this isn't true, the one
we found it on was the wce addition in virtio-blk.

Allow mismatched config-lengths:
   *) If the version on the wire is shorter then fine
   *) If the version on the wire is longer, load what we have space
      for and skip the rest.

(This is mst@redhat.com's rework of what I originally posted)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:41 +03:00
Don Slutz
5f8632d3c3 pc: make isapc and pc-0.10 to pc-0.13 have 1.7.0 memory layout
QEMU 2.0 changed memory layout for isapc and pc-0.10 to pc-0.13.
This prevents migration from QEMU 1.7.0 for these
machine types when -m 3.5G is specified.

Paolo Bonzini asked that:

    smbios_legacy_mode = true;
    has_reserved_memory = false;
    option_rom_has_mr = true;
    rom_file_has_mr = false;

also be done.

Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: https://bugs.launchpad.net/qemu/+bug/1334307
Tested-by: "Slutz, Donald Christopher" <dslutz@verizon.com>
2014-06-29 18:59:41 +03:00
Damjan Marion
46e797c4d3 vhost-user: fix wrong ids in documentation
Signed-off-by: Damjan Marion <damarion@cisco.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:41 +03:00
Marcelo Tosatti
f2ae8abf1f mc146818rtc: add rtc-reset-reinjection QMP command
It is necessary to reset RTC interrupt reinjection backlog if
guest time is synchronized via a different mechanism, such as
QGA's guest-set-time command.

Failing to do so causes both corrections to be applied (summed),
resulting in an incorrect guest time.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:35 +03:00
Eduardo Habkost
fa118d1f8b pc: Fix "prog_if" typo on PC_COMPAT_2_0
The property name is "prog_if", not "prof_if".

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:07 +03:00
Eduardo Habkost
b8f5cfd682 pc: Move q35 compat props to PC_COMPAT_*
For each compat property on PC_Q35_COMPAT_*, there are only two
possibilities:

 * If the device is never instantiated when using a machine other than
   pc-q35, then the compat property can be safely added to
   PC_COMPAT_*;
 * If the device can be instantiated when using a machine other than
   pc-q35, that means the other machines also need the compat property
   to be set.

That means we don't need separate PC_Q35_COMPAT_* macros at all, today.

The hpet.hpet-intcap case is interesting: piix and q35 do have something
that emulates different defaults, but the machine-specific default is
applied _after_ compat_props are applied, by simply checking if the
property is zero (which is the real default on the hpet code).

The hpet.hpet-intcap=0x4 compat property can (should?) be applied to
piix too, because 0x4 was the default on both piix and q35 before the
hpet-intcap property was introduced.

Now, if one day we change the default HPET intcap on one of the PC
machine-types again, we may want to introduce PC_{Q35,I440FX}_COMPAT
macros. But while we don't need that, we can keep the code simple.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:06 +03:00
Michael S. Tsirkin
9851d0fe35 numa: fix comment
s/if given for/is given for/;

Reported-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:06 +03:00
Michael S. Tsirkin
29923e94e7 openrisc: fix comment
Fix English in comment:

s/the each/each/

s/  \*\// \*\//

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-06-29 18:59:06 +03:00
Michael S. Tsirkin
d75e2f6889 numa: fix comment
Fix up English in comments:
s/the each/each/

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-06-29 18:59:06 +03:00
Peter Maydell
4f9c5be919 Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream:
  linux-user: support the SIOCGIFINDEX ioctl
  linux-user: support the KDSIGACCEPT ioctl
  linux-user: allow NULL tv argument for settimeofday
  linux-user: respect timezone for settimeofday
  linux-user: fix struct target_epoll_event layout for MIPS
  linux-user: support strace of epoll_create1
  linux-user: allow NULL arguments to mount
  linux-user: support SO_PASSSEC setsockopt option
  linux-user: support SO_{SND, RCV}BUFFORCE setsockopt options
  linux-user: support SO_ACCEPTCONN getsockopt option
  linux-user: translate the result of getsockopt SO_TYPE
  linux-user: added fake open() for /proc/self/cmdline
  Add support for MAP_NORESERVE mmap flag.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 16:44:13 +01:00
Peter Maydell
4daebe014e Merge remote-tracking branch 'remotes/xtensa/tags/20140629-xtensa' into staging
Xtensa fixes and improvements queue 2014-06-29:
- fix FLASH mapping to boot region for KC705;
- clean up boot parameters passing;
- add uImage, DTB and initrd support.

# gpg: Signature made Sat 28 Jun 2014 23:40:32 BST using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"

* remotes/xtensa/tags/20140629-xtensa:
  hw/xtensa/xtfpga: implement initrd loading
  hw/xtensa/xtfpga: implement DTB loading
  hw/xtensa/xtfpga: implement uImage loading
  hw/xtensa/xtfpga: add memory info to bootparam
  hw/xtensa/xtfpga: refactor bootparameters filling
  hw/xtensa/xtfpga: use symbolic constants for bootparam tags
  hw/xtensa/xtfpga: retrieve parameters from machine_opts
  hw/xtensa: replace fprintfs with error_report
  hw/xtensa: remove extraneous xtensa_ prefix from file names
  hw/xtensa/xtfpga: fix FLASH mapping to boot region for KC705

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 16:17:50 +01:00
Peter Maydell
2d40fa6987 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.1.0-rc0

# gpg: Signature made Fri 27 Jun 2014 19:50:32 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (47 commits)
  iotests: Fix 083 for out-of-tree builds
  iotests: Drop Python version from 065's Shebang
  iotests: Use $PYTHON for Python scripts
  iotests: Source common.env
  configure: Enable out-of-tree iotests
  iotests: Allow out-of-tree run
  block.c: Don't return success for bdrv_append_temp_snapshot() failure
  qemu-iotests: Add TestRepairQuorum to 041 to test drive-mirror node-name mode.
  block: Add replaces argument to drive-mirror
  blockjob: Fix recent BLOCK_JOB_ERROR regression
  blockjob: Fix recent BLOCK_JOB_READY regression
  virtio-blk: Rename complete_request_early to complete_request_vring
  virtio-blk: Unify {non-,}dataplane's request handlings
  virtio-blk: Schedule BH in the right context
  virtio-blk: Export request handling functions to dataplane
  virtio-blk: Make request completion function virtual
  block: acquire AioContext in qmp_query_blockstats()
  block: make bdrv_query_stats() static
  virtio-blk: Fix and clean up the in_sg and out_sg check
  virtio-blk: Fill in VirtIOBlockReq.out in dataplane code
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 15:24:54 +01:00
Peter Maydell
ac8076ac86 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  docs/qmp: Fix documentation of BLOCK_JOB_READY to match code
  char: report frontend open/closed state in 'query-chardev'
  virtio-serial: report frontend connection state via monitor
  qmp: add qmp-events.txt back
  qapi event: clean up in callers
  qapi script: clean up in scripts
  qapi: ignore generated event files
  qapi: move event defines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 13:39:04 +01:00
Peter Maydell
76fbbec931 Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Net patches

# gpg: Signature made Fri 27 Jun 2014 14:10:57 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  hw/net/eepro100: Implement read-only bits in MDI registers
  net: move queue number into NICPeers
  net: L2TPv3 transport
  qemu-bridge-helper: Fix fd leak in main()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 12:45:54 +01:00
Paul Burton
f63eb01ac7 linux-user: support the SIOCGIFINDEX ioctl
Add a definition of the SIOCGIFINDEX ioctl, allowing its use by target
programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
ca56f5b596 linux-user: support the KDSIGACCEPT ioctl
Add a definition of the KDSIGACCEPT ioctl & allow its use by target
programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
b67d80311a linux-user: allow NULL tv argument for settimeofday
The tv argument to the settimeofday syscall is allowed to be NULL, if
the program only wishes to provide the timezone. QEMU previously
returned -EFAULT when tv was NULL. Instead, execute the syscall &
provide NULL to the kernel as the target program expected.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
ef4467e911 linux-user: respect timezone for settimeofday
The settimeofday syscall accepts a tz argument indicating the desired
timezone to the kernel. QEMU previously ignored any argument provided
by the target program & always passed NULL to the kernel. Instead,
translate the argument & pass along the data userland provided.

Although this argument is described by the settimeofday man page as
obsolete, it is used by systemd as of version 213.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
fd76783243 linux-user: fix struct target_epoll_event layout for MIPS
MIPS requires the pad field to 64b-align the data field just as ARM
does.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
0fa82d39c8 linux-user: support strace of epoll_create1
Add the epoll_create1 syscall to strace.list in order to display that
syscall when it occurs, rather than a message about the syscall being
unknown despite QEMU already implementing support for it.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
356d771b30 linux-user: allow NULL arguments to mount
Calls to the mount syscall can legitimately provide NULL as the value
for the source of filesystemtype arguments, which QEMU would previously
reject & return -EFAULT to the target program. An example of this is
remounting an already mounted filesystem with different properties.

Instead of rejecting such syscalls with -EFAULT, pass NULL along to the
kernel as the target program expects.

Additionally this patch fixes a potential memory leak when DEBUG_REMAP
is enabled and lock_user_string fails on the target or filesystemtype
arguments but a prior argument was non-NULL and already locked.

Since the patch already touched most lines of the TARGET_NR_mount case,
it fixes the indentation & coding style for good measure.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
82d0fe6b7a linux-user: support SO_PASSSEC setsockopt option
Translate the SO_PASSSEC option to setsockopt to the host value &
perform the syscall as expected, allowing use of the option by target
programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
d79b6cc435 linux-user: support SO_{SND, RCV}BUFFORCE setsockopt options
Translate the SO_SNDBUFFORCE & SO_RCVBUFFORCE options to setsockopt to
the host values & perform the syscall as expected, allowing use of those
options by target programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Paul Burton
aec1ca411e linux-user: support SO_ACCEPTCONN getsockopt option
Translate the SO_ACCEPTCONN option to the host value & execute the
syscall as expected.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Paul Burton
8289d11281 linux-user: translate the result of getsockopt SO_TYPE
QEMU previously passed the result of the host syscall directly to the
target program. This is a problem if the host & target have different
representations of socket types, as is the case when running a MIPS
target program on an x86 host. Introduce a host_to_target_sock_type
helper function mirroring the existing target_to_host_sock_type, and
call it to translate the value provided by getsockopt when called for
the SO_TYPE option.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Wim Vander Schelden
76b9424550 linux-user: added fake open() for /proc/self/cmdline
Signed-off-by: Wim Vander Schelden <wim@fixnum.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Christophe Lyon
e8efd8e71f Add support for MAP_NORESERVE mmap flag.
mmap_flags_tbl contains a list of mmap flags, and how to map them to
the target. This patch adds MAP_NORESERVE, which was missing to the
list.

Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Peter Maydell
2d80e0ab4b Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-06-27

Changes include:

  - instruction emulation fixes
  - linux-user fixes
  - mac99: layout fixes
  - pseries: Initial VFIO support
  - pseries: support for UUID
  - pseries: support for -boot m

# gpg: Signature made Fri 27 Jun 2014 12:51:01 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream: (32 commits)
  PPC: e500: Only create dt entries for existing serial ports
  spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
  vmstate: Add preallocation for migrating arrays (VMS_ALLOC flag)
  xics: Implement xics_ics_free()
  spapr: Remove @next_irq
  spapr: Move interrupt allocator to xics
  xics: Disable flags reset on xics reset
  xics: Add xics_find_source()
  xics: Add flags for interrupts
  spapr: Add RTAS sysparm SPLPAR Characteristics
  spapr: Add RTAS sysparm UUID
  spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE
  spapr: Add rtas_st_buffer utility function
  spapr: Define a 2.1 pseries machine
  spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)
  target-ppc: Add support for POWER8 pvr 0x4D0000
  uninorth: Fix PCI hole size
  mac99: Add motherboard devices before PCI cards
  target-ppc: Remove unused gen_qemu_ld8s()
  target-ppc: Remove unused IMM and d extract helpers
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 11:59:00 +01:00
Peter Maydell
de6793e8c2 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140627' into staging
A series of patches to the s390-ccw bios:
- code cleanup
- improved error reporting
- most important, support to ipl (boot) from ECKD DASD (CDL, LDL or CMS
  formatted)

# gpg: Signature made Fri 27 Jun 2014 12:03:30 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140627:
  pc-bios/s390-ccw: update binary
  pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD
  pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
  pc-bios/s390-ccw: factor out ipl code
  pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
  pc-bios/s390-ccw: Unify error handling
  pc-bios/s390-ccw: add some utility code
  pc-bios/s390-ccw: handle different sector sizes
  pc-bios/s390-ccw: cleanup and enhance bootmap defintions
  pc-bios/s390-ccw: make checkpatch happy

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 11:43:31 +01:00
Peter Maydell
1045fc0439 tcg/ppc: Fix support for 64-bit PPC MacOSX hosts
Add back in the support for 64-bit PPC MacOSX hosts that was
broken in the recent merge of the 32-bit and 64-bit TCG backends.

Reported-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Andreas Färber <andreas.faerber@web.de>
2014-06-29 11:38:50 +01:00
Max Filippov
f55b32e749 hw/xtensa/xtfpga: implement initrd loading
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
996dfe98ed hw/xtensa/xtfpga: implement DTB loading
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
364d480242 hw/xtensa/xtfpga: implement uImage loading
Provide a simple bootloader code at the reset address that jumps to the
loaded image entry point when it's not equal to the reset address. This
is needed because the old method of setting pc doesn't work due to cpu
reset done after the machine setup.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
b6edea8b68 hw/xtensa/xtfpga: add memory info to bootparam
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
a9a28591fb hw/xtensa/xtfpga: refactor bootparameters filling
Separate filling first/last tag and size calculation from the kernel
command line setup.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
62dbaede80 hw/xtensa/xtfpga: use symbolic constants for bootparam tags
Import bootparam tag names from linux/arch/xtensa/include/asm/bootparam.h
No functional changes.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
37b259d034 hw/xtensa/xtfpga: retrieve parameters from machine_opts
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
8488ab021b hw/xtensa: replace fprintfs with error_report
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:41 +04:00
Max Filippov
b707ab757e hw/xtensa: remove extraneous xtensa_ prefix from file names
While at it rename lx60 (named after the first board of the family) to
more generic xtfpga (the family name).

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:41 +04:00
Max Filippov
37ed7c4b24 hw/xtensa/xtfpga: fix FLASH mapping to boot region for KC705
On KC705 bootloader area is located at FLASH offset 0x06000000, not 0 as
on older xtfpga boards.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:41 +04:00
Max Reitz
f5264553c3 iotests: Fix 083 for out-of-tree builds
iotest 083 filters out debug messages from nbd, which are prefixed (and
recognized) by __FILE__. However, the current filter (/^nbd\.c…/) is
valid for in-tree builds only, as out-of-tree builds will have a path
before that filename (e.g. "/tmp/qemu/nbd.c"). Fix this by adding .*
before "nbd\.c".

While working on this, also fix the regexes: '.' should be escaped and a
single backslash is not enough for escaping when enclosed by double
quotes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:01 +02:00
Max Reitz
f99b4b5d7e iotests: Drop Python version from 065's Shebang
Test 065 specified python2 to be used in its Shebang; this might not
work on systems without a python2 symlink and furthermore it is now
counter-productive, as the check script compares the Shebang to
"#!/usr/bin/env python" and only uses the Python interpreter selected by
configure on an exact match.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:01 +02:00
Max Reitz
ea81ca9de1 iotests: Use $PYTHON for Python scripts
Instead of invoking Python scripts directly via ./, use $PYTHON to
obtain the correct Python interpreter command.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Max Reitz
7fed1a49ff iotests: Source common.env
Source common.env in the iotests' check script.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Max Reitz
76c7560ae7 configure: Enable out-of-tree iotests
In order to allow out-of-tree iotests, create a symlink for the check
script in the build tree.

While doing so, also write configured options relevant to the iotests to
common.env in the build tree; currently, this is the command to invoke
Python 2.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Max Reitz
e8f8624d3b iotests: Allow out-of-tree run
As out-of-tree builds are preferred for qemu, running the qemu-iotests
in that out-of-tree build should be supported as well. To do so, a
symbolic link has to be created pointing to the check script in the
source directory. That script will check whether it has been run through
a symlink, and if so, will assume it is run in the build tree. All
output and temporary operations performed by iotests are then redirected
here and, unless specified otherwise by the user, QEMU_PROG etc. will be
set to paths appropriate for the build tree.

Also, drop making every test case executable if it is not yet, as this
would modify the source tree which is not desired for out-of-tree runs
and should be fixed in the repository anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Chen Gang
6b8aeca574 block.c: Don't return success for bdrv_append_temp_snapshot() failure
When failure occurs, 'ret' need be set, or may return 0 to indicate
success. Previously, an error was set in errp, but 0 was returned
anyway. So let bdrv_append_temp_snapshot() return an error code and
use that for the bdrv_open() return value.

Also, error_propagate() need be called only one time within a function.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Benoît Canet
d88964aeda qemu-iotests: Add TestRepairQuorum to 041 to test drive-mirror node-name mode.
The to-replace-node-name is designed to allow repairing a broken Quorum file.
This patch introduces a new class TestRepairQuorum testing that the feature
works.
Some further work will be done on QEMU to improve the robustness of the tests.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Benoît Canet
09158f00e0 block: Add replaces argument to drive-mirror
drive-mirror will bdrv_swap the new BDS named node-name with the one
pointed by replaces when the mirroring is finished.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Markus Armbruster
823c686356 blockjob: Fix recent BLOCK_JOB_ERROR regression
Commit 5a2d2cb screwed up the the value of members device and action,
breaking tests/qemu-iotests/041.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Markus Armbruster
518848a214 blockjob: Fix recent BLOCK_JOB_READY regression
Commit bcada37 dropped the (up to now undocumented) members type, len,
offset, speed, breaking tests/qemu-iotests/040 and 041.

Restore and document them.  This fixes 040, and partially fixes 041.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Markus Armbruster
a22d8e47f7 docs/qmp: Fix documentation of BLOCK_JOB_READY to match code
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 13:40:41 -04:00
Fam Zheng
d64c60a75f virtio-blk: Rename complete_request_early to complete_request_vring
The old name is misleading in its new usage, so rename it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:42 +02:00
Fam Zheng
b002254dbd virtio-blk: Unify {non-,}dataplane's request handlings
This drops request handling code from dataplane, and uses code from
hw/block/virtio-blk.c.

It starts to use multiwrite as non-dataplane does.

Dataplane sets VirtIOBlock.complete_request to vring version, and calls
into non-dataplane's process handling. In complete_request_early,
qiov.size is added to vring push length, because it's also called in rw
completion now.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:39 +02:00
Fam Zheng
4407c1c56a virtio-blk: Schedule BH in the right context
The BH must be called in the AioContext of bs. Currently it is only the
main loop, but with coming changes, it could also be a dataplane
IOThread.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:37 +02:00
Fam Zheng
fee65db771 virtio-blk: Export request handling functions to dataplane
So that dataplane can use virtio_blk_handle_request and
virtio_submit_multiwrite.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:35 +02:00
Fam Zheng
bf4bd461b4 virtio-blk: Make request completion function virtual
virtio_blk_req_complete will call VirtIOBlock.complete_request() to push
data and notify guest. No functional change.

Later, this will allow dataplane to provide it's own (vring_) version.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:32 +02:00
Stefan Hajnoczi
13344f3a17 block: acquire AioContext in qmp_query_blockstats()
Make query-blockstats safe for dataplane by acquiring the
BlockDriverState's AioContext.  This ensures that the dataplane IOThread
and the main loop's monitor code do not race.

Note the assumption that acquiring the drive's BDS AioContext also
protects ->file and ->backing_hd.  This assumption is made by other
aio_context_acquire() callers too.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:29 +02:00
Stefan Hajnoczi
ac46821f2c block: make bdrv_query_stats() static
This function is only called from block/qapi.c.  There is no need to
keep it public.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:19:57 +02:00
Fam Zheng
ee17e84830 virtio-blk: Fix and clean up the in_sg and out_sg check
out_sg is checked by iov_to_buf below, so it can be dropped.

Add assert and iov_discard_back around in_sg, as the in_sg is handled in
dataplane code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:31 +02:00
Fam Zheng
ab2e3cd2dc virtio-blk: Fill in VirtIOBlockReq.out in dataplane code
VirtIOBlockReq is allocated in process_request, and freed in command
functions.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:28 +02:00
Fam Zheng
827805a249 virtio-blk: Convert VirtIOBlockReq.out to structrue
The virtio code currently assumes that the outhdr is in its own iovec.
This is not guaranteed by the spec, so we should relax this assumption.

Convert the VirtIOBlockReq.out field to structrue so that we can use
iov_to_buf and then discard the header from the beginning of iovec.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:25 +02:00
Fam Zheng
eddb102e86 virtio-blk: Use VirtIOBlockReq.in to drop VirtIOBlockReq.inhdr
In current virtio spec, inhdr is a single byte, and is unlikely to
change for both functionality and compatibility considerations.
Non-dataplane uses .in, and we are on the way to converge them. So
let's unify it to get cleaner code.

Remove .inhdr and use .in.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:23 +02:00
Fam Zheng
04af2d70c5 virtio-blk: Replace VirtIOBlockRequest with VirtIOBlockReq
Field "inhdr" is added temporarily for a more mechanical change, and
will be dropped in the next commit.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:20 +02:00
Fam Zheng
98e2d49241 virtio-blk: Drop VirtIOBlockRequest.read
Since it's set but not used.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:18 +02:00
Fam Zheng
0bcb34472d virtio-blk: Drop bounce buffer from dataplane code
The block layer will handle the unaligned request.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:16 +02:00
Fam Zheng
671ec3f056 virtio-blk: Convert VirtIOBlockReq.elem to pointer
This will make converging with dataplane code easier.

Add virtio_blk_free_request to handle the freeing of request internal
fields.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:13 +02:00
Fam Zheng
09f6458770 virtio-blk: Move VirtIOBlockReq to header
For later reusing by dataplane code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:17:59 +02:00
Hani Benhabiles
8c5d1abbb7 nbd: Don't validate from and len in NBD_CMD_DISC.
These values aren't used in this case.

Currently, the from field in the request sent by the nbd kernel module leading
to a false error message when ending the connection with the client.

$ qemu-nbd some.img -v
// After nbd-client -d /dev/nbd0
nbd.c:nbd_trip():L1031: From: 18446744073709551104, Len: 0, Size: 20971520,
Offset: 0
nbd.c:nbd_trip():L1032: requested operation past EOF--bad client?
nbd.c:nbd_receive_request():L638: read failed

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-27 16:06:48 +02:00
Hani Benhabiles
60fe4fac22 nbd: Don't export a block device with no medium.
The device is exported with erroneous values and can't be read.

Before the patch:
$ sudo nbd-client localhost -p 10809 /dev/nbd0 -name floppy0
Negotiation: ..size = 17592186044415MB
bs=1024, sz=18446744073709547520 bytes

$ sudo mount /dev/nbd0 /mnt/tmp/
mount: block device /dev/nbd0 is write-protected, mounting read-only
mount: /dev/nbd0: can't read superblock

After the patch:
(qemu) nbd_server_add ide0-hd0
(qemu) nbd_server_add floppy0
Device 'floppy0' has no medium

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-27 16:06:48 +02:00
Laszlo Ersek
32a97ea171 char: report frontend open/closed state in 'query-chardev'
In addition to the on-line reporting added in the previous patch, allow
libvirt to query frontend state independently of events.

Libvirt's path to identify the guest agent channel it cares about differs
between the event added in the previous patch and the QMP response field
added here. The event identifies the frontend device, by "id". The
'query-chardev' QMP command identifies the backend device (again by "id").
The association is under libvirt's control.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1080376

Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:34:00 -04:00
Laszlo Ersek
e2ae6159de virtio-serial: report frontend connection state via monitor
Libvirt wants to know about the guest-side connection state of some
virtio-serial ports (in particular the one(s) assigned to guest agent(s)).
Report such states with a new monitor event.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1080376
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:33:27 -04:00
Luiz Capitulino
dfab489214 qmp: add qmp-events.txt back
The conversion of events to the QAPI, resulted in the removal of the
docs/qmp/qmp-events.txt file. This was done to avoid having duplicated
information between qmp-events.txt and qapi-event.json.

However, qmp-events.txt contains examples and we're still not sure
how to proper install QAPI docs in the host. To avoid harming users,
it's better to re-add qmp-events.txt for now and deal with the
duplication later.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-27 09:27:56 -04:00
Wenchao Xia
2f44a08b3e qapi event: clean up in callers
This patch improves docs and address small issues in event
callers.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:56 -04:00
Wenchao Xia
d6f9c82c62 qapi script: clean up in scripts
This patch improve docs and uses c_type(argentry, is_param=True)
in script.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:56 -04:00
Wenchao Xia
1dbbe04525 qapi: ignore generated event files
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:55 -04:00
Wenchao Xia
82d72d9d23 qapi: move event defines
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:55 -04:00
Richard Henderson
d4cba13bdf tcg/ppc: Fix failure in tcg_out_mem_long
With rt != r0 on loads, we use rt for scratch.  If we need an index
register different from base, we can't use rt, but r0 is usable.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1403843160-30332-1-git-send-email-rth@twiddle.net
Tested-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-27 13:23:41 +01:00
Benoît Canet
4c828dc61a block: Add node-name argument to drive-mirror
This new argument can be used to specify the node-name of the new mirrored BDS.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 14:18:18 +02:00
Benoît Canet
cf29a570a7 quorum: Add the rewrite-corrupted parameter to quorum
On read operations when this parameter is set and some replicas are corrupted
while quorum can be reached quorum will proceed to rewrite the correct version
of the data to fix the corrupted replicas.

This will shine with SSD where the FTL will remap the same block at another
place on rewrite.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 14:18:17 +02:00
Alexander Graf
79c0ff2cae PPC: e500: Only create dt entries for existing serial ports
When the user specifies -nodefaults he can tell us that he doesn't want any
serial ports spawned by default. While we do honor that wish, we still create
device tree entries for those non-existent devices.

Make device tree generation depend on whether the device is actually available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
9a321e9234 spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.

This makes use of new XICS ability to reuse interrupts.

This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.

This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.

This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.

This fixed traces to be more informative.

This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
f32935ea22 vmstate: Add preallocation for migrating arrays (VMS_ALLOC flag)
There are few helpers already to support array migration. However they all
require the destination side to preallocate arrays before migration which
is not always possible due to unknown array size as it might be some
sort of dynamic state. One of the examples is an array of MSIX-enabled
devices in SPAPR PHB - this array may vary from 0 to 65536 entries and
its size depends on guest's ability to enable MSIX or do PCI hotplug.

This adds new VMSTATE_VARRAY_STRUCT_ALLOC macro which is pretty similar to
VMSTATE_STRUCT_VARRAY_POINTER_INT32 but it can alloc memory for migratign
array on the destination side.

This defines VMS_ALLOC flag for a field.

This changes vmstate_base_addr() to do the allocation when receiving
migration.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
51bba713fe xics: Implement xics_ics_free()
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
ba0e5bf8de spapr: Remove @next_irq
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
bee763dbfb spapr: Move interrupt allocator to xics
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.

This moves an allocator from SPAPR to XICS.

This switches IRQ users to use new API.

This uses LSI/MSI flags to know if interrupt is allocated.

The interrupt release function will be posted as a separate patch.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
a7e519a8cf xics: Disable flags reset on xics reset
Since islsi[] array has been merged into the ICSState struct,
we must not reset flags as they tell if the interrupt is in use.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
641c349352 xics: Add xics_find_source()
PAPR allows having multiple interrupt sources such as PHB.

This adds a source lookup function and makes use of it.

Since at the moment QEMU only supports a single source,
no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
4af88944d0 xics: Add flags for interrupts
The existing interrupt allocation scheme in SPAPR assumes that
interrupts are allocated at the start time, continously and the config
will not change. However, there are cases when this is not going to work
such as:

1. migration - we will have to have an ability to choose interrupt
numbers for devices in the command line and this will create gaps in
interrupt space.

2. PCI hotplug - interrupts from unplugged device need to be returned
back to interrupt pool, otherwise we will quickly run out of interrupts.

This replaces a separate lslsi[] array with a byte in the ICSIRQState
struct and defines "LSI" and "MSI" flags. Neither of these flags set
signals that the descriptor is not allocated and not in use.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff
3b50d8974b spapr: Add RTAS sysparm SPLPAR Characteristics
Add support for the SPLPAR Characteristics parameter to the emulated
RTAS call ibm,get-system-parameter.

The support provides just enough information to allow "cat
/proc/powerpc/lparcfg" to succeed without generating a kernel error
message.

Without this patch the above command will produce the following kernel
message: arch/powerpc/platforms/pseries/lparcfg.c \
parse_system_parameter_string Error calling get-system-parameter \
(0xfffffffd)

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff
b907d7b0fd spapr: Add RTAS sysparm UUID
Add support for the UUID parameter to the emulated RTAS call
ibm,get-system-parameter.

Return the guest's UUID as the value for the RTAS UUID system
parameter, or null (a zero length result) if it is not set.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff
3052d95190 spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE
This allows the ibm,get-system-parameter RTAS call to succeed for the
DIAGNOSTICS_RUN_MODE system parameter.

The problem can be seen with "ppc64_cpu --run-mode" from the
powerpc-utils package which fails before this patch with "Machine does
not support diagnostic run mode".

This is corrected by using the rtas_st_buffer() function to write to
the buffer.

The RTAS constants are also moved out into a header file, some new
constants added and the surrounding code slightly simplified.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[agraf: remove some commentary]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Sam bobroff
ce3fa1eca2 spapr: Add rtas_st_buffer utility function
Add a function to write lengh + data into a buffer as required for the
emulation of the RTAS ibm,get-system-parameter call.

If the destination is smaller than the source, the write is truncated
and success is returned. This matches the behaviour of pHyp.

This will be used in following patches.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy
6026db4501 spapr: Define a 2.1 pseries machine
This adds a v2.1 machine to support backward compatibility
for newer macines in the case if they ever be implemented.

This adds a "pseries-2.1" machine as a child of the "pseries"
machine and only changes visible machine name.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy
6ca1502e36 spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)
Every single sPAPR QOM object has small first "s".
Most (not all yet) QOM objects have "State" suffix.

This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU
code style and removes redundant empty line.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy
f6c3ebcc3b target-ppc: Add support for POWER8 pvr 0x4D0000
At the moment QEMU knows about one version of POWER8 CPU with
PVR 0x4B.0000. This CPU class is defined as "POWER8". The linux
kernel names it as "POWER8E" which is different from the name QEMU uses.

Now we get another version of POWER8 which is architecturally equivalent
to POWER8E but has different PVR 0x4D.0000 so QEMU fails to find
a PPC CPU class on these machines. The linux kernel names these CPUs as
"POWER8".

This renames the existing "POWER8" to "POWER8E" to be more precise and
stay in sync with the linux kernel.

This adds a new "POWER8" family which calls POWER8E class init function
and defines own PVR mask (used to match a CPU class) and desc (used to
create dynamic version-less CPU class).

This does not change CPU class fw_name attribute as the host POWER8
firmware keeps using "PowerPC,POWER8" on both POWER8 and POWER8E.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
BALATON Zoltan
1be88255a7 uninorth: Fix PCI hole size
Fix PCI hole size to match that what is found on real hardware.
(OpenBIOS already uses the correct length.)

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
BALATON Zoltan
a0bb2a5fa0 mac99: Add motherboard devices before PCI cards
Change the order of creating devices for New World Mac emulation so
that devices on the motherboard are added first and PCI cards (VGA and
NIC) come later. As a side effect, this also causes OpenBIOS to map
the motherboard devices into the MMIO space to the same addresses as
on real hardware and allow clients that hardcode these addresses (e.g.
MorphOS) to find and use them until OpenBIOS is tought to map devices
to specific addresses. (On real hardware the graphics and network
cards are really on separate buses but we don't model that yet.) This
brings the memory map closer to what is found on PowerMac3,1.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
Peter Maydell
c99b6f879a target-ppc: Remove unused gen_qemu_ld8s()
The gen_qemu_ld8s() function is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Peter Maydell
b247812e4a target-ppc: Remove unused IMM and d extract helpers
Remove the definition of the IMM and d extract helpers; these seem to have
been added as part of the initial PPC support in 2003 but never actually
used.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
591812634c vfio: Enable for SPAPR
This turns the sPAPR support on and enables VFIO container use
in the kernel.

This extends vfio_connect_container to support VFIO_SPAPR_TCE_IOMMU type
in the host kernel.

This registers a memory listener which sPAPR IOMMU will notify when
executing H_PUT_TCE/etc DMA calls. The listener then will notify the host
kernel about DMA map/unmap operation via VFIO_IOMMU_MAP_DMA/
VFIO_IOMMU_UNMAP_DMA ioctls.

This executes VFIO_IOMMU_ENABLE ioctl to make sure that the IOMMU is free
of mappings and can be exclusively given to the user. At the moment SPAPR
is the only platform requiring this call to be implemented.

Note that the host kernel function implementing VFIO_IOMMU_DISABLE
is called automatically when container's fd is closed so there is
no need to call it explicitly from QEMU. We may need to call
VFIO_IOMMU_DISABLE explicitly in the future for some sort of dynamic
reconfiguration (PCI hotplug or dynamic IOMMU group management).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
9fc34ada7e spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).

Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.

Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB

where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun  5 12:49 iommu_group -> ../../../kernel/iommu_groups/4

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
6d8be4c343 vfio: Add vfio_container_ioctl()
While most operations with VFIO IOMMU driver are generic and used inside
vfio.c, there are still some operations which only specific VFIO IOMMU
drivers implement. The first example of it will be reading a DMA window
start from the host.

This adds a helper which passes an ioctl request to the container's fd.

The helper will check if @req is known. For this, stub is added. This return
-1 on any requests for now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
9bb62a0702 spapr_iommu: Make in-kernel TCE table optional
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handle H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.

Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is requited.

So until the host kernel gets VFIO support for H_PUT_TCE, we better not
to register VFIO's TCE in the host.

This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.

This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.

This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic
is the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.

This is a preparation patch so no change in behaviour is expected

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
3a3b8502e6 spapr: Fix RTAS token numbers
At the moment spapr_rtas_register() allocates a new token number for every
new RTAS callback so numbers are not fixed and depend on the number of
supported RTAS handlers and the exact order of spapr_rtas_register() calls.
These tokens are copied into the device tree and remain the same during
the guest lifetime.

When we start another guest to receive a migration, it calls
spapr_rtas_register() as well. If the number of RTAS handlers or their
order is different in QEMU on source and destination sides, the "/rtas"
node in the device tree will differ. Since migration overwrites the device
tree (as it overwrites the entire RAM), the actual RTAS config on
the destination side gets broken.

This defines global contant values for every RTAS token which QEMU
is using today.

This changes spapr_rtas_register() to accept a token number instead of
allocating one. This changes all users of spapr_rtas_register().

This changes XICS-KVM not to cache tokens registered with KVM as they
constant now.

This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
a base. TOKEN_MAX is moved and renamed too and its value is changed
to the last token + 1. Boundary checks for token values are adjusted.

This reserves token numbers for "os-term" handlers and PCI hotplug
which we are working on.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Alexander Graf
b3cad3abf6 PPC: Add support for Apple gdb in gdbstub
The Apple gdbstub protocol is different from the normal gdbstub protocol
used on PowerPC. Add support for the different variant, so that we can use
Apple's gdb to debug guest code.

Keep in mind that the switch is a compile time option. We can't detect
during runtime whether a gdb connecting to us is an upstream gdb or an
Apple gdb.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Sorav Bansal
294d129289 target-ppc: fixed translation of mcrxr instruction
Fixed bug in gen_mcrxr() in target-ppc/translate.c:
The XER[SO], XER[OV], and XER[CA] flags are stored in the least
significant bit (bit 0) of their respective registers. They need
to be shifted left (by their respective offsets) to generate the final
XER value. The old translation code for the 'mcrxr' instruction
was assuming that  the flags are stored in bit 2, and was shifting them
right (incorrectly)

Signed-off-by: Sorav Bansal <sbansal@cse.iitd.ernet.in>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Avik Sil
cc84c0f357 spapr: Add "qemu, boot-menu" property to /chosen
This is required to enable boot menu display during booting

Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
a60438ddd6 linux-user: Support HWCAP2 in PowerPC
Set bits in the AT_HWCAP2 entry of the AUXV.  Specifically, detect and set bits
for bctar, ISEL and ISA 2.07.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
0e019746d7 linux-user: Identify Addition Hardware Capabilities for PowerPC
Add VSX, DFP and ISA 2.06 to the bits identified in the AT_HWCAP
entry of the AUXV.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
b2f1355020 target-ppc: Add DFP to Emulated Instructions Flag
Decimal Floating Point is emulated, so add it the mask.  This will
fix the erroneous message:

  Warning: Disabling some instructions which are not emulated by TCG (0x0, 0x4)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
623e250abd linux-user: Correct AUXV Cache Line Sizes for PowerPC
Set the AT_ICACHEBSIZE and AT_DCACHEBSIZE entries of the AUXV to match the
CPU model's cache line sizes.  This fixes memory clobbering problems on more
recent Book 3s implementations; memset(p, 0, N) will use the dcbz instruction
when N is sufficiently large and many of the newer server CPUs have cache lines
sizes of 128 bytes.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:21 +02:00
Peter Maydell
5e80dd223d hw/net/eepro100: Implement read-only bits in MDI registers
Although we defined an eepro100_mdi_mask[] array indicating which bits
in the registers are read-only, we weren't actually doing anything with
it. Make the MDI register-write code use it rather than manually making
register 1 read-only and leaving the rest as reads-as-written. (The
special-case handling of register 0 remains as before since its mask is
all-zeros and the special casing happens before we apply the masking.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402159924-13853-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 12:23:45 +02:00
Jens Freimann
77416f4075 pc-bios/s390-ccw: update binary
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:53 +02:00
Eugene (jno) Dvurechenski
564e52b96f pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD
Add code that allows us to start from two further ECKD DASD disk
layouts: LDL (Linux disk layout) and CMS (cms-formatted disk).

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:52 +02:00
Eugene (jno) Dvurechenski
e0aff4aa3f pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
Add code that allows us to start from ECKD DASD using the z/OS
compatible disk layout (CDL), which is the most common format for ECKD
DASD.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:52 +02:00
Eugene (jno) Dvurechenski
a00b33d9e2 pc-bios/s390-ccw: factor out ipl code
Move the scsi-disk specific ipl code from zipl_load() into a new
function ipl_scsi(). This makes it easier to add ipl routines for other
disk types.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:49 +02:00
Eugene (jno) Dvurechenski
058cc1f311 pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
Factor out helper function for dumping a hex value into a buffer.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:28 +02:00
Eugene (jno) Dvurechenski
60612d5cbb pc-bios/s390-ccw: Unify error handling
Convert to IPL_assert and friends

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:28 +02:00
Eugene (jno) Dvurechenski
a94b485e17 pc-bios/s390-ccw: add some utility code
IPL_assert(term,message) is introduced to handle error conditions.
ebcdic_to_ascii() to convert chars (mostly to print VOLSERs).
read_block() provision for unified block-number handling.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:28 +02:00
Eugene (jno) Dvurechenski
91a03f9b69 pc-bios/s390-ccw: handle different sector sizes
Use the virtio device's configuration to figure out the disk geometry
and use a sector size based upon the layout.

[CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g]
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:03 +02:00
Eugene (jno) Dvurechenski
26f2bbd6b1 pc-bios/s390-ccw: cleanup and enhance bootmap defintions
Add declarations to describe structure of different dasd IPL sources
(eckd and fba). Move the structure definitions to a new header bootmap.h.
While we are at it, change structs to typedefs.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 11:58:47 +02:00
Eugene (jno) Dvurechenski
abd696e4f7 pc-bios/s390-ccw: make checkpatch happy
Remove tabs, tweak whitespace and comments.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 11:57:25 +02:00
Jeff Cody
d1fde4ad3c block: add qemu-iotest for resize base during live commit
If 'base' is smaller than the overlay image being committed into it,
then the base image will be grown in commit_run via bdrv_truncate().

This tests to make sure that this works, and the bdrv_truncate() is
not blocked when it shouldn't be.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 11:37:54 +02:00
Jeff Cody
9c75e168bc block: check for RESIZE blocker in the QMP command, not bdrv_truncate()
If we check for the RESIZE blocker in bdrv_truncate(), that means a
commit will fail if the overlay layer is larger than the base, due to
the backing blocker.

This is a regression in behavior from 2.0; currently, commit will try to
grow the size of the base image to match the overlay size, if the
overlay size is larger.

By moving this into the QMP command qmp_block_resize(), it allows
usage of bdrv_truncate() within block jobs.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 11:37:35 +02:00
Jiri Pirko
575a1c0e42 net: move queue number into NICPeers
It indicates the number of elements in ncs field and makes sense to have
int inside NICPeers. Also in parse_netdev we do not need to access
container and work with NICPeers only.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 11:19:31 +02:00
Anton Ivanov
3fb69aa1d1 net: L2TPv3 transport
This transport allows to connect a QEMU nic to a static Ethernet
over L2TPv3 tunnel. The transport supports all options present
in the Linux kernel implementation. It allows QEMU to connect
to any Linux host running kernel 3.3+, most routers and network
devices as well as other QEMU instances.

[Fixed up net_client_init1() switch statement to support -netdev
--Stefan]

Signed-off-by: Anton Ivanov <antivano@cisco.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 10:39:10 +02:00
Gonglei
eb3f45c5af qemu-bridge-helper: Fix fd leak in main()
initialize fd and ctlfd, and close them at the end

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 10:39:10 +02:00
Michal Privoznik
a760715095 qemu_opts_append: Play nicely with QemuOptsList's head
When running a libvirt test suite I've noticed the qemu-img is
crashing occasionally. Tracing the problem down led me to the
following valgrind output:

qemu.git $ valgrind -q ./qemu-img create -f qed -obacking_file=/dev/null,backing_fmt=raw qed
==14881== Invalid write of size 8
==14881==    at 0x1D263F: qemu_opts_create (qemu-option.c:692)
==14881==    by 0x130782: bdrv_img_create (block.c:5531)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==  Address 0x11fedd38 is 24 bytes inside a block of size 232 free'd
==14881==    at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14881==    by 0x592D35E: g_realloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==14881==    by 0x1D38D8: qemu_opts_append (qemu-option.c:1129)
==14881==    by 0x13075E: bdrv_img_create (block.c:5528)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==
Formatting 'qed', fmt=qed size=0 backing_file='/dev/null' backing_fmt='raw' cluster_size=65536
==14881== Invalid write of size 8
==14881==    at 0x1D28BE: qemu_opts_del (qemu-option.c:750)
==14881==    by 0x130BF3: bdrv_img_create (block.c:5638)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==  Address 0x11fedd38 is 24 bytes inside a block of size 232 free'd
==14881==    at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14881==    by 0x592D35E: g_realloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==14881==    by 0x1D38D8: qemu_opts_append (qemu-option.c:1129)
==14881==    by 0x13075E: bdrv_img_create (block.c:5528)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==

The problem is apparently in the qemu_opts_append(). Well, if it
gets called twice or more. On the first call, when @dst is NULL
some initialization is done during which @dst->head list gets
initialized. The list is initialized in a way, so that the list
tail points at the list head. However, the next time
qemu_opts_append() is called for new options to be added,
g_realloc() may move @dst to a new address making the old list tail
point at an invalid address. If that's the case, we must update the
list pointers.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 15:53:52 +02:00
Peter Maydell
ff4873cb8c coroutine-win32.c: Add noinline attribute to work around gcc bug
A gcc codegen bug in x86_64-w64-mingw32-gcc (GCC) 4.6.3 means that
non-debug builds of QEMU for Windows tend to assert when using
coroutines. Work around this by marking qemu_coroutine_switch
as noinline.

If we allow gcc to inline qemu_coroutine_switch into
coroutine_trampoline, then it hoists the code to get the
address of the TLS variable "current" out of the while() loop.
This is an invalid transformation because the SwitchToFiber()
call may be called when running thread A but return in thread B,
and so we might be in a different thread context each time
round the loop. This can happen quite often.  Typically.
a coroutine is started when a VCPU thread does bdrv_aio_readv:

     VCPU thread

     main VCPU thread coroutine      I/O coroutine
        bdrv_aio_readv ----->
                                     start I/O operation
                                       thread_pool_submit_co
                       <------------ yields
        back to emulation

Then I/O finishes and the thread-pool.c event notifier triggers in
the I/O thread.  event_notifier_ready calls thread_pool_co_cb, and
the I/O coroutine now restarts *in another thread*:

     iothread

     main iothread coroutine         I/O coroutine (formerly in VCPU thread)
        event_notifier_ready
          thread_pool_co_cb ----->   current = I/O coroutine;
                                     call AIO callback

But on Win32, because of the bug, the "current" being set here the
current coroutine of the VCPU thread, not the iothread.

noinline is a good-enough workaround, and quite unlikely to break in
the future.

(Thanks to Paolo Bonzini for assistance in diagnosing the problem
and providing the detailed example/ascii art quoted above.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403535303-14939-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-06-26 14:08:14 +01:00
Peter Maydell
8589744aaf Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-2.1' into staging
X86CPU

* Filter out MONITOR for KVM
* Fix filtering for TCG
* -cpu foo,check and -cpu foo,enforce support for TCG
* -cpu host migration support (-cpu host,migratable=no to disable)
* Add invtsc feature support
* New model: Broadwell

# gpg: Signature made Wed 25 Jun 2014 22:55:04 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-cpu-for-2.1:
  target-i386: Broadwell CPU model
  target-i386: Fix indentation of CPU model definitions
  target-i386: Support "invariant tsc" flag
  target-i386: block migration and savevm if invariant tsc is exposed
  savevm: check vmsd for migratability status
  target-i386: Set migratable=yes by default on "host" CPU mooel
  target-i386: Add "migratable" property to "host" CPU model
  target-i386: Support check/enforce flags in TCG mode, too
  target-i386: Loop-based feature word filtering in TCG mode
  target-i386: Loop-based copying and setting/unsetting of feature words
  target-i386: Define TCG_*_FEATURES earlier in cpu.c
  target-i386: Filter KVM and 0xC0000001 features on TCG
  target-i386: Filter FEAT_7_0_EBX TCG features too
  target-i386: Make TCG feature filtering more readable
  target-i386: Isolate KVM-specific code on CPU feature filtering logic
  target-i386: Pass FeatureWord argument to report_unavailable_features()
  target-i386: Merge feature filtering/checking functions
  target-i386: Simplify reporting of unavailable features
  target-i386: kvm: Don't enable MONITOR by default on any CPU model

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-26 13:33:11 +01:00
Paolo Bonzini
f3db17b951 qemu-char: initialize chr_write_lock
Otherwise, Windows fails with a deadlock.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1403679897-11480-1-git-send-email-pbonzini@redhat.com
Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-26 13:13:54 +01:00
Kevin Wolf
20cca275c6 block: Remove a special case for protocols
The only semantic change is that bs->open_flags gets BDRV_O_PROTOCOL set
now. This isn't useful, but it doesn't hurt either. The code that was
previously skipped by 'goto done' is automatically disabled because
protocol drivers don't support backing files (and if they did, this
would probably be a fix) and can't have snapshot_flags set.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
8ee79e707a block: Catch backing files assigned to non-COW drivers
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).

This is obvious nonsense and should be rejected.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
76c591b013 block: Remove second bdrv_open() recursion
This recursion was introduced in commit 505d7583 in order to allow
nesting image formats. It only ever takes effect when the user
explicitly specifies a driver name and that driver isn't suitable for
the protocol level.

We can check this earlier in bdrv_open() and if the explicitly
requested driver is a format driver, clear BDRV_O_PROTOCOL so that
another bs->file layer is opened.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
b348f3311c block: Inline bdrv_file_open()
It doesn't do much any more, we can move the code to bdrv_open() now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
f4788adcb4 block: Use common driver selection code for bdrv_open_file()
This moves the bdrv_open_file() call a bit down so that it can use the
bdrv_open() code that selects the right block driver.

The code between the old and the new call site is either common code
(the error message for an unknown driver has been unified now) or
doesn't run with cleared BDRV_O_PROTOCOL (added an if block in one
place, whereas the right path was already asserted in another place)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-06-26 13:51:01 +02:00
Kevin Wolf
17b005f1d4 block: Always pass driver name through options QDict
The "driver" entry in the options QDict is now only missing if we're
opening an image with format probing.

We also catch cases now where both the drv argument and a "driver"
option is specified, e.g. by specifying -drive format=qcow2,driver=raw

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
5e5c4f63f4 block: Move json: parsing to bdrv_fill_options()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
462f5bcf69 block: Move bdrv_fill_options() call to bdrv_open()
bs->options now contains the modified version of the options.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
f54120ff1a block: Create bdrv_fill_options()
The idea of bdrv_fill_options() is to convert every parameter for
opening images, in particular the filename and flags, to entries in the
options QDict.

This patch starts with moving the filename parsing and driver probing
part from bdrv_file_open() to the new function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Peter Lieven
f42ca3cad1 block/nfs: add knob to set readahead
upcoming libnfs will feature internal readahead support.
Add a knob to pass the optional readahead value as a URL
parameter.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:01 +02:00
Peter Lieven
7c24384b3b block/nfs: fix url parameter checking
this patch fixes the incorrect usage of strncmp and
adds simple error checking by means of parse_uint_full
instead of atoi for the supplied URL parameters.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:01 +02:00
Fam Zheng
3b9f27d2b3 qemu-iotests: Test 0-length image for mirror
All behavior and invariant should hold for images with 0 length, so
add a class to repeat all the tests in TestSingleDrive.

Hide two unapplicable test methods that would fail with 0 image length
because it's also used as cluster size.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:00 +02:00
Fam Zheng
8b9a30ca5b qemu-iotests: Test BLOCK_JOB_READY event for 0Kb image active commit
There should be a BLOCK_JOB_READY event with active commit, regardless
of image length. Let's test the 0 length image case, and make sure it
goes through the ready->complete process.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:00 +02:00
Fam Zheng
9e48b02540 mirror: Go through ready -> complete process for 0 len image
When mirroring or active committing a zero length image, BLOCK_JOB_READY
is not reported now, instead the job completes because we short circuit
the mirror job loop.

This is inconsistent with non-zero length images, and only confuses
management software.

Let's do the same thing when seeing a 0-length image: report ready
immediately; wait for block-job-cancel or block-job-complete; clear the
cancel flag as existing non-zero image synced case (cancelled after
ready); then jump to the exit.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:50:57 +02:00
Igor Mammedov
0931304788 qemu-char: fix warning 'res' may be used uninitialized
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1403683241-20678-1-git-send-email-imammedo@redhat.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-26 12:34:41 +01:00
Fam Zheng
dc71ce45de blockjob: Add block_job_yield()
This will unset busy flag and put coroutine to sleep, can be used to
wait for QMP complete/cancel.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 12:12:22 +02:00
349 changed files with 10564 additions and 5025 deletions

1
.gitignore vendored
View File

@@ -26,6 +26,7 @@
/qapi-generated
/qapi-types.[ch]
/qapi-visit.[ch]
/qapi-event.[ch]
/qmp-commands.h
/qmp-marshal.c
/qemu-doc.html

View File

@@ -563,6 +563,7 @@ Devices
-------
IDE
M: Kevin Wolf <kwolf@redhat.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
S: Odd Fixes
F: include/hw/ide.h
F: hw/ide/
@@ -853,7 +854,7 @@ S: Odd Fixes
F: scripts/checkpatch.pl
Seccomp
M: Eduardo Otubo <otubo@linux.vnet.ibm.com>
M: Eduardo Otubo <eduardo.otubo@profitbricks.com>
S: Supported
F: qemu-seccomp.c
F: include/sysemu/seccomp.h

View File

@@ -248,7 +248,7 @@ $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi-event.json
$(SRC_PATH)/qapi/event.json
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
@@ -344,7 +344,8 @@ multiboot.bin linuxboot.bin kvmvapic.bin \
s390-zipl.rom \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper
palcode-clipper \
u-boot.e500
else
BLOBS=
endif

View File

@@ -163,10 +163,6 @@ dummy := $(call unnest-vars,.., \
all-obj-y += $(common-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
ifndef CONFIG_HAIKU
LIBS+=-lm
endif
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK,$^)

View File

@@ -1 +1 @@
2.0.50
2.1.1

View File

@@ -125,7 +125,7 @@ static bool aio_dispatch(AioContext *ctx)
bool progress = false;
/*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
@@ -175,27 +175,56 @@ static bool aio_dispatch(AioContext *ctx)
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool was_dispatching;
int ret;
bool progress;
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This happens
* in two cases:
*
* 1) when aio_poll is called with blocking == false
*
* 2) when we are called after poll(). If we are called before
* poll(), bottom halves will not be re-evaluated and we need
* aio_notify() if blocking == true.
*
* The first aio_dispatch() only does something when AioContext is
* running as a GSource, and in that case aio_poll is used only
* with blocking == false, so this optimization is already quite
* effective. However, the code is ugly and should be restructured
* to have a single aio_dispatch() call. To do this, we need to
* reorganize aio_poll into a prepare/poll/dispatch model like
* glib's.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Re-evaluate condition (1) above. */
aio_set_dispatching(ctx, !blocking);
if (aio_dispatch(ctx)) {
progress = true;
}
if (progress && !blocking) {
return true;
goto out;
}
ctx->walking_handlers++;
@@ -234,9 +263,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
/* Run dispatch even if there were no readable fds to run timers */
aio_set_dispatching(ctx, true);
if (aio_dispatch(ctx)) {
progress = true;
}
out:
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -102,7 +102,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
/*
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
@@ -115,7 +115,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
/*
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
@@ -177,7 +177,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
blocking = false;
/* we have to walk very carefully in case
* qemu_aio_set_fd_handler is called while we're walking */
* aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;

19
async.c
View File

@@ -26,6 +26,7 @@
#include "block/aio.h"
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
@@ -247,9 +248,25 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
return ctx->thread_pool;
}
void aio_set_dispatching(AioContext *ctx, bool dispatching)
{
ctx->dispatching = dispatching;
if (!dispatching) {
/* Write ctx->dispatching before reading e.g. bh->scheduled.
* Optimization: this is only needed when we're entering the "unsafe"
* phase where other threads must call event_notifier_set.
*/
smp_mb();
}
}
void aio_notify(AioContext *ctx)
{
event_notifier_set(&ctx->notifier);
/* Write e.g. bh->scheduled before reading ctx->dispatching. */
smp_mb();
if (!ctx->dispatching) {
event_notifier_set(&ctx->notifier);
}
}
static void aio_timerlist_notify(void *opaque)

View File

@@ -304,7 +304,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch hugepage case. */
unsigned flags = MPOL_MF_STRICT;
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */

View File

@@ -861,7 +861,7 @@ static bool block_is_active(void *opaque)
return block_mig_state.blk_enable == 1;
}
SaveVMHandlers savevm_block_handlers = {
static SaveVMHandlers savevm_block_handlers = {
.set_params = block_set_params,
.save_live_setup = block_save_setup,
.save_live_iterate = block_save_iterate,

474
block.c
View File

@@ -471,7 +471,7 @@ int bdrv_create(BlockDriver *drv, const char* filename,
co = qemu_coroutine_create(bdrv_create_co_entry);
qemu_coroutine_enter(co, &cco);
while (cco.ret == NOT_DONE) {
qemu_aio_wait();
aio_poll(qemu_get_aio_context(), true);
}
}
@@ -508,19 +508,24 @@ int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp)
return ret;
}
int bdrv_refresh_limits(BlockDriverState *bs)
void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
{
BlockDriver *drv = bs->drv;
Error *local_err = NULL;
memset(&bs->bl, 0, sizeof(bs->bl));
if (!drv) {
return 0;
return;
}
/* Take some limits from the children as a default */
if (bs->file) {
bdrv_refresh_limits(bs->file);
bdrv_refresh_limits(bs->file, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
bs->bl.opt_transfer_length = bs->file->bl.opt_transfer_length;
bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
} else {
@@ -528,7 +533,11 @@ int bdrv_refresh_limits(BlockDriverState *bs)
}
if (bs->backing_hd) {
bdrv_refresh_limits(bs->backing_hd);
bdrv_refresh_limits(bs->backing_hd, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
bs->bl.opt_transfer_length =
MAX(bs->bl.opt_transfer_length,
bs->backing_hd->bl.opt_transfer_length);
@@ -539,10 +548,8 @@ int bdrv_refresh_limits(BlockDriverState *bs)
/* Then let the driver override it */
if (drv->bdrv_refresh_limits) {
return drv->bdrv_refresh_limits(bs);
drv->bdrv_refresh_limits(bs, errp);
}
return 0;
}
/*
@@ -831,7 +838,7 @@ static int bdrv_open_flags(BlockDriverState *bs, int flags)
* Clear flags that are internal to the block layer before opening the
* image.
*/
open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_PROTOCOL);
/*
* Snapshots should be writable.
@@ -928,6 +935,7 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
bs->zero_beyond_eof = true;
open_flags = bdrv_open_flags(bs, flags);
bs->read_only = !(open_flags & BDRV_O_RDWR);
bs->growable = !!(flags & BDRV_O_PROTOCOL);
if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
error_setg(errp,
@@ -992,7 +1000,13 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
goto free_and_fail;
}
bdrv_refresh_limits(bs);
bdrv_refresh_limits(bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto free_and_fail;
}
assert(bdrv_opt_mem_align(bs) != 0);
assert((bs->request_alignment != 0) || bs->sg);
return 0;
@@ -1005,94 +1019,124 @@ free_and_fail:
return ret;
}
/*
* Opens a file using a protocol (file, host_device, nbd, ...)
*
* options is an indirect pointer to a QDict of options to pass to the block
* drivers, or pointer to NULL for an empty set of options. If this function
* takes ownership of the QDict reference, it will set *options to NULL;
* otherwise, it will contain unused/unrecognized options after this function
* returns. Then, the caller is responsible for freeing it. If it intends to
* reuse the QDict, QINCREF() should be called beforehand.
*/
static int bdrv_file_open(BlockDriverState *bs, const char *filename,
QDict **options, int flags, Error **errp)
static QDict *parse_json_filename(const char *filename, Error **errp)
{
BlockDriver *drv;
const char *drvname;
bool parse_filename = false;
Error *local_err = NULL;
QObject *options_obj;
QDict *options;
int ret;
ret = strstart(filename, "json:", &filename);
assert(ret);
options_obj = qobject_from_json(filename);
if (!options_obj) {
error_setg(errp, "Could not parse the JSON options");
return NULL;
}
if (qobject_type(options_obj) != QTYPE_QDICT) {
qobject_decref(options_obj);
error_setg(errp, "Invalid JSON object given");
return NULL;
}
options = qobject_to_qdict(options_obj);
qdict_flatten(options);
return options;
}
/*
* Fills in default options for opening images and converts the legacy
* filename/flags pair to option QDict entries.
*/
static int bdrv_fill_options(QDict **options, const char **pfilename, int flags,
BlockDriver *drv, Error **errp)
{
const char *filename = *pfilename;
const char *drvname;
bool protocol = flags & BDRV_O_PROTOCOL;
bool parse_filename = false;
Error *local_err = NULL;
/* Parse json: pseudo-protocol */
if (filename && g_str_has_prefix(filename, "json:")) {
QDict *json_options = parse_json_filename(filename, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
}
/* Options given in the filename have lower priority than options
* specified directly */
qdict_join(*options, json_options, false);
QDECREF(json_options);
*pfilename = filename = NULL;
}
/* Fetch the file name from the options QDict if necessary */
if (!filename) {
filename = qdict_get_try_str(*options, "filename");
} else if (filename && !qdict_haskey(*options, "filename")) {
qdict_put(*options, "filename", qstring_from_str(filename));
parse_filename = true;
} else {
error_setg(errp, "Can't specify 'file' and 'filename' options at the "
"same time");
ret = -EINVAL;
goto fail;
if (protocol && filename) {
if (!qdict_haskey(*options, "filename")) {
qdict_put(*options, "filename", qstring_from_str(filename));
parse_filename = true;
} else {
error_setg(errp, "Can't specify 'file' and 'filename' options at "
"the same time");
return -EINVAL;
}
}
/* Find the right block driver */
filename = qdict_get_try_str(*options, "filename");
drvname = qdict_get_try_str(*options, "driver");
if (drvname) {
drv = bdrv_find_format(drvname);
if (!drv) {
error_setg(errp, "Unknown driver '%s'", drvname);
}
qdict_del(*options, "driver");
} else if (filename) {
drv = bdrv_find_protocol(filename, parse_filename);
if (!drv) {
error_setg(errp, "Unknown protocol");
if (drv) {
if (drvname) {
error_setg(errp, "Driver specified twice");
return -EINVAL;
}
drvname = drv->format_name;
qdict_put(*options, "driver", qstring_from_str(drvname));
} else {
error_setg(errp, "Must specify either driver or file");
drv = NULL;
if (!drvname && protocol) {
if (filename) {
drv = bdrv_find_protocol(filename, parse_filename);
if (!drv) {
error_setg(errp, "Unknown protocol");
return -EINVAL;
}
drvname = drv->format_name;
qdict_put(*options, "driver", qstring_from_str(drvname));
} else {
error_setg(errp, "Must specify either driver or file");
return -EINVAL;
}
} else if (drvname) {
drv = bdrv_find_format(drvname);
if (!drv) {
error_setg(errp, "Unknown driver '%s'", drvname);
return -ENOENT;
}
}
}
if (!drv) {
/* errp has been set already */
ret = -ENOENT;
goto fail;
}
assert(drv || !protocol);
/* Parse the filename and open it */
if (drv->bdrv_parse_filename && parse_filename) {
/* Driver-specific filename parsing */
if (drv && drv->bdrv_parse_filename && parse_filename) {
drv->bdrv_parse_filename(filename, *options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
return -EINVAL;
}
if (!drv->bdrv_needs_filename) {
qdict_del(*options, "filename");
} else {
filename = qdict_get_str(*options, "filename");
}
}
if (!drv->bdrv_file_open) {
ret = bdrv_open(&bs, filename, NULL, *options, flags, drv, &local_err);
*options = NULL;
} else {
ret = bdrv_open_common(bs, NULL, *options, flags, drv, &local_err);
}
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
bs->growable = 1;
return 0;
fail:
return ret;
}
void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
@@ -1123,7 +1167,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_COMMIT,
bs->backing_blocker);
out:
bdrv_refresh_limits(bs);
bdrv_refresh_limits(bs, NULL);
}
/*
@@ -1162,6 +1206,13 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
bdrv_get_full_backing_filename(bs, backing_filename, PATH_MAX);
}
if (!bs->drv || !bs->drv->supports_backing) {
ret = -EINVAL;
error_setg(errp, "Driver doesn't support backing files");
QDECREF(options);
goto free_exit;
}
backing_hd = bdrv_new("", errp);
if (bs->backing_format[0] != '\0') {
@@ -1240,7 +1291,7 @@ done:
return ret;
}
void bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
{
/* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
char *tmp_filename = g_malloc0(PATH_MAX + 1);
@@ -1258,6 +1309,7 @@ void bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
/* Get the required size from the image */
total_size = bdrv_getlength(bs);
if (total_size < 0) {
ret = total_size;
error_setg_errno(errp, -total_size, "Could not get image size");
goto out;
}
@@ -1304,33 +1356,7 @@ void bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
out:
g_free(tmp_filename);
}
static QDict *parse_json_filename(const char *filename, Error **errp)
{
QObject *options_obj;
QDict *options;
int ret;
ret = strstart(filename, "json:", &filename);
assert(ret);
options_obj = qobject_from_json(filename);
if (!options_obj) {
error_setg(errp, "Could not parse the JSON options");
return NULL;
}
if (qobject_type(options_obj) != QTYPE_QDICT) {
qobject_decref(options_obj);
error_setg(errp, "Invalid JSON object given");
return NULL;
}
options = qobject_to_qdict(options_obj);
qdict_flatten(options);
return options;
return ret;
}
/*
@@ -1396,77 +1422,62 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
options = qdict_new();
}
if (filename && g_str_has_prefix(filename, "json:")) {
QDict *json_options = parse_json_filename(filename, &local_err);
if (local_err) {
ret = -EINVAL;
goto fail;
}
/* Options given in the filename have lower priority than options
* specified directly */
qdict_join(options, json_options, false);
QDECREF(json_options);
filename = NULL;
}
bs->options = options;
options = qdict_clone_shallow(options);
if (flags & BDRV_O_PROTOCOL) {
assert(!drv);
ret = bdrv_file_open(bs, filename, &options, flags & ~BDRV_O_PROTOCOL,
&local_err);
if (!ret) {
drv = bs->drv;
goto done;
} else if (bs->drv) {
goto close_and_fail;
} else {
goto fail;
}
}
/* Open image file without format layer */
if (flags & BDRV_O_RDWR) {
flags |= BDRV_O_ALLOW_RDWR;
}
if (flags & BDRV_O_SNAPSHOT) {
snapshot_flags = bdrv_temp_snapshot_flags(flags);
flags = bdrv_backing_flags(flags);
}
assert(file == NULL);
ret = bdrv_open_image(&file, filename, options, "file",
bdrv_inherited_flags(flags),
true, &local_err);
if (ret < 0) {
ret = bdrv_fill_options(&options, &filename, flags, drv, &local_err);
if (local_err) {
goto fail;
}
/* Find the right image format driver */
drv = NULL;
drvname = qdict_get_try_str(options, "driver");
if (drvname) {
drv = bdrv_find_format(drvname);
qdict_del(options, "driver");
if (!drv) {
error_setg(errp, "Invalid driver: '%s'", drvname);
error_setg(errp, "Unknown driver: '%s'", drvname);
ret = -EINVAL;
goto fail;
}
}
if (!drv) {
if (file) {
ret = find_image_format(file, filename, &drv, &local_err);
} else {
error_setg(errp, "Must specify either driver or file");
ret = -EINVAL;
assert(drvname || !(flags & BDRV_O_PROTOCOL));
if (drv && !drv->bdrv_file_open) {
/* If the user explicitly wants a format driver here, we'll need to add
* another layer for the protocol in bs->file */
flags &= ~BDRV_O_PROTOCOL;
}
bs->options = options;
options = qdict_clone_shallow(options);
/* Open image file without format layer */
if ((flags & BDRV_O_PROTOCOL) == 0) {
if (flags & BDRV_O_RDWR) {
flags |= BDRV_O_ALLOW_RDWR;
}
if (flags & BDRV_O_SNAPSHOT) {
snapshot_flags = bdrv_temp_snapshot_flags(flags);
flags = bdrv_backing_flags(flags);
}
assert(file == NULL);
ret = bdrv_open_image(&file, filename, options, "file",
bdrv_inherited_flags(flags),
true, &local_err);
if (ret < 0) {
goto fail;
}
}
if (!drv) {
/* Image format probing */
if (!drv && file) {
ret = find_image_format(file, filename, &drv, &local_err);
if (ret < 0) {
goto fail;
}
} else if (!drv) {
error_setg(errp, "Must specify either driver or file");
ret = -EINVAL;
goto fail;
}
@@ -1495,15 +1506,12 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
/* For snapshot=on, create a temporary qcow2 overlay. bs points to the
* temporary snapshot afterwards. */
if (snapshot_flags) {
bdrv_append_temp_snapshot(bs, snapshot_flags, &local_err);
ret = bdrv_append_temp_snapshot(bs, snapshot_flags, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto close_and_fail;
}
}
done:
/* Check if any unknown options were used */
if (options && (qdict_size(options) != 0)) {
const QDictEntry *entry = qdict_first(options);
@@ -1783,7 +1791,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
BDRV_O_CACHE_WB);
reopen_state->bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
bdrv_refresh_limits(reopen_state->bs);
bdrv_refresh_limits(reopen_state->bs, NULL);
}
/*
@@ -1910,6 +1918,7 @@ void bdrv_drain_all(void)
bool bs_busy;
aio_context_acquire(aio_context);
bdrv_flush_io_queue(bs);
bdrv_start_throttled_reqs(bs);
bs_busy = bdrv_requests_pending(bs);
bs_busy |= aio_poll(aio_context, bs_busy);
@@ -2513,32 +2522,23 @@ int bdrv_change_backing_file(BlockDriverState *bs,
*
* Returns NULL if bs is not found in active's image chain,
* or if active == bs.
*
* Returns the bottommost base image if bs == NULL.
*/
BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
BlockDriverState *bs)
{
BlockDriverState *overlay = NULL;
BlockDriverState *intermediate;
assert(active != NULL);
assert(bs != NULL);
/* if bs is the same as active, then by definition it has no overlay
*/
if (active == bs) {
return NULL;
while (active && bs != active->backing_hd) {
active = active->backing_hd;
}
intermediate = active;
while (intermediate->backing_hd) {
if (intermediate->backing_hd == bs) {
overlay = intermediate;
break;
}
intermediate = intermediate->backing_hd;
}
return active;
}
return overlay;
/* Given a BDS, searches for the base layer. */
BlockDriverState *bdrv_find_base(BlockDriverState *bs)
{
return bdrv_find_overlay(bs, NULL);
}
typedef struct BlkIntermediateStates {
@@ -2569,12 +2569,15 @@ typedef struct BlkIntermediateStates {
*
* base <- active
*
* If backing_file_str is non-NULL, it will be used when modifying top's
* overlay image metadata.
*
* Error conditions:
* if active == top, that is considered an error
*
*/
int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
BlockDriverState *base)
BlockDriverState *base, const char *backing_file_str)
{
BlockDriverState *intermediate;
BlockDriverState *base_bs = NULL;
@@ -2626,7 +2629,8 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
}
/* success - we can delete the intermediate states, and link top->base */
ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
backing_file_str = backing_file_str ? backing_file_str : base_bs->filename;
ret = bdrv_change_backing_file(new_top_bs, backing_file_str,
base_bs->drv ? base_bs->drv->format_name : "");
if (ret) {
goto exit;
@@ -3019,6 +3023,7 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
assert(!qiov || bytes == qiov->size);
/* Handle Copy on Read and associated serialisation */
if (flags & BDRV_REQ_COPY_ON_READ) {
@@ -3063,8 +3068,20 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
max_nb_sectors = ROUND_UP(MAX(0, total_sectors - sector_num),
align >> BDRV_SECTOR_BITS);
if (max_nb_sectors > 0) {
ret = drv->bdrv_co_readv(bs, sector_num,
MIN(nb_sectors, max_nb_sectors), qiov);
QEMUIOVector local_qiov;
size_t local_sectors;
max_nb_sectors = MIN(max_nb_sectors, SIZE_MAX / BDRV_SECTOR_BITS);
local_sectors = MIN(max_nb_sectors, nb_sectors);
qemu_iovec_init(&local_qiov, qiov->niov);
qemu_iovec_concat(&local_qiov, qiov, 0,
local_sectors * BDRV_SECTOR_SIZE);
ret = drv->bdrv_co_readv(bs, sector_num, local_sectors,
&local_qiov);
qemu_iovec_destroy(&local_qiov);
} else {
ret = 0;
}
@@ -3276,6 +3293,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
assert(!qiov || bytes == qiov->size);
waited = wait_serialising_requests(req);
assert(!waited || !req->serialising);
@@ -3489,9 +3507,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
return -ENOTSUP;
if (bs->read_only)
return -EACCES;
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {
return -EBUSY;
}
ret = drv->bdrv_truncate(bs, offset);
if (ret == 0) {
ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
@@ -3790,6 +3806,17 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
return NULL;
}
/* If 'base' is in the same chain as 'top', return true. Otherwise,
* return false. If either argument is NULL, return false. */
bool bdrv_chain_contains(BlockDriverState *top, BlockDriverState *base)
{
while (top && top != base) {
top = top->backing_hd;
}
return top != NULL;
}
BlockDriverState *bdrv_next(BlockDriverState *bs)
{
if (!bs) {
@@ -4040,7 +4067,7 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
if (ret < 0) {
return ret;
}
return (ret & BDRV_BLOCK_ALLOCATED);
return !!(ret & BDRV_BLOCK_ALLOCATED);
}
/*
@@ -4333,22 +4360,6 @@ int bdrv_get_backing_file_depth(BlockDriverState *bs)
return 1 + bdrv_get_backing_file_depth(bs->backing_hd);
}
BlockDriverState *bdrv_find_base(BlockDriverState *bs)
{
BlockDriverState *curr_bs = NULL;
if (!bs) {
return NULL;
}
curr_bs = bs;
while (curr_bs->backing_hd) {
curr_bs = curr_bs->backing_hd;
}
return curr_bs;
}
/**************************************************************/
/* async I/Os */
@@ -5774,3 +5785,58 @@ bool bdrv_is_first_non_filter(BlockDriverState *candidate)
return false;
}
BlockDriverState *check_to_replace_node(const char *node_name, Error **errp)
{
BlockDriverState *to_replace_bs = bdrv_find_node(node_name);
if (!to_replace_bs) {
error_setg(errp, "Node name '%s' not found", node_name);
return NULL;
}
if (bdrv_op_is_blocked(to_replace_bs, BLOCK_OP_TYPE_REPLACE, errp)) {
return NULL;
}
/* We don't want arbitrary node of the BDS chain to be replaced only the top
* most non filter in order to prevent data corruption.
* Another benefit is that this tests exclude backing files which are
* blocked by the backing blockers.
*/
if (!bdrv_is_first_non_filter(to_replace_bs)) {
error_setg(errp, "Only top most non filter can be replaced");
return NULL;
}
return to_replace_bs;
}
void bdrv_io_plug(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
if (drv && drv->bdrv_io_plug) {
drv->bdrv_io_plug(bs);
} else if (bs->file) {
bdrv_io_plug(bs->file);
}
}
void bdrv_io_unplug(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
if (drv && drv->bdrv_io_unplug) {
drv->bdrv_io_unplug(bs);
} else if (bs->file) {
bdrv_io_unplug(bs->file);
}
}
void bdrv_flush_io_queue(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
if (drv && drv->bdrv_flush_io_queue) {
drv->bdrv_flush_io_queue(bs);
} else if (bs->file) {
bdrv_flush_io_queue(bs->file);
}
}

View File

@@ -307,7 +307,7 @@ static void coroutine_fn backup_run(void *opaque)
BACKUP_SECTORS_PER_CLUSTER - i, &n);
i += n;
if (alloced == 1) {
if (alloced == 1 || n == 0) {
break;
}
}

View File

@@ -449,6 +449,10 @@ static void error_callback_bh(void *opaque)
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
if (acb->bh) {
qemu_bh_delete(acb->bh);
acb->bh = NULL;
}
qemu_aio_release(acb);
}

View File

@@ -37,6 +37,7 @@ typedef struct CommitBlockJob {
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockDriverState *bs,
@@ -141,7 +142,7 @@ wait:
if (!block_job_is_cancelled(&s->common) && sector_num == end) {
/* success */
ret = bdrv_drop_intermediate(active, top, base);
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
exit_free_buf:
@@ -158,7 +159,7 @@ exit_restore_reopen:
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
}
@@ -182,7 +183,7 @@ static const BlockJobDriver commit_job_driver = {
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
void *opaque, const char *backing_file_str, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
@@ -244,6 +245,8 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(commit_run);

View File

@@ -332,7 +332,7 @@ static int cow_create(const char *filename, QemuOpts *opts, Error **errp)
char *image_filename = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *cow_bs;
BlockDriverState *cow_bs = NULL;
/* Read out options */
image_sectors = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
@@ -344,7 +344,6 @@ static int cow_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
cow_bs = NULL;
ret = bdrv_open(&cow_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
@@ -383,7 +382,9 @@ static int cow_create(const char *filename, QemuOpts *opts, Error **errp)
exit:
g_free(image_filename);
bdrv_unref(cow_bs);
if (cow_bs) {
bdrv_unref(cow_bs);
}
return ret;
}
@@ -414,6 +415,7 @@ static BlockDriver bdrv_cow = {
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,

View File

@@ -1450,7 +1450,7 @@ static void iscsi_close(BlockDriverState *bs)
memset(iscsilun, 0, sizeof(IscsiLun));
}
static int iscsi_refresh_limits(BlockDriverState *bs)
static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
@@ -1475,7 +1475,6 @@ static int iscsi_refresh_limits(BlockDriverState *bs)
}
bs->bl.opt_transfer_length = sector_lun2qemu(iscsilun->bl.opt_xfer_len,
iscsilun);
return 0;
}
/* Since iscsi_open() ignores bdrv_flags, there is nothing to do here in
@@ -1510,7 +1509,8 @@ static int iscsi_truncate(BlockDriverState *bs, int64_t offset)
if (iscsilun->allocationmap != NULL) {
g_free(iscsilun->allocationmap);
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(bs->total_sectors,
bitmap_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
iscsilun),
iscsilun->cluster_sectors));
}

View File

@@ -25,6 +25,8 @@
*/
#define MAX_EVENTS 128
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockDriverAIOCB common;
struct qemu_laio_state *ctx;
@@ -36,9 +38,19 @@ struct qemu_laiocb {
QLIST_ENTRY(qemu_laiocb) node;
};
typedef struct {
struct iocb *iocbs[MAX_QUEUED_IO];
int plugged;
unsigned int size;
unsigned int idx;
} LaioQueue;
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
/* io queue for submit at batch */
LaioQueue io_q;
};
static inline ssize_t io_event_ret(struct io_event *ev)
@@ -135,6 +147,79 @@ static const AIOCBInfo laio_aiocb_info = {
.cancel = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
{
io_q->size = MAX_QUEUED_IO;
io_q->idx = 0;
io_q->plugged = 0;
}
static int ioq_submit(struct qemu_laio_state *s)
{
int ret, i = 0;
int len = s->io_q.idx;
do {
ret = io_submit(s->ctx, len, s->io_q.iocbs);
} while (i++ < 3 && ret == -EAGAIN);
/* empty io queue */
s->io_q.idx = 0;
if (ret < 0) {
i = 0;
} else {
i = ret;
}
for (; i < len; i++) {
struct qemu_laiocb *laiocb =
container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
laiocb->ret = (ret < 0) ? ret : -EIO;
qemu_laio_process_completion(s, laiocb);
}
return ret;
}
static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
{
unsigned int idx = s->io_q.idx;
s->io_q.iocbs[idx++] = iocb;
s->io_q.idx = idx;
/* submit immediately if queue is full */
if (idx == s->io_q.size) {
ioq_submit(s);
}
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
}
int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
struct qemu_laio_state *s = aio_ctx;
int ret = 0;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return 0;
}
if (s->io_q.idx > 0) {
ret = ioq_submit(s);
}
return ret;
}
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
@@ -168,8 +253,13 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
if (io_submit(s->ctx, 1, &iocbs) < 0)
goto out_free_aiocb;
if (!s->io_q.plugged) {
if (io_submit(s->ctx, 1, &iocbs) < 0) {
goto out_free_aiocb;
}
} else {
ioq_enqueue(s, iocbs);
}
return &laiocb->common;
out_free_aiocb:
@@ -204,6 +294,8 @@ void *laio_init(void)
goto out_close_efd;
}
ioq_init(&s->io_q);
return s;
out_close_efd:
@@ -218,5 +310,10 @@ void laio_cleanup(void *s_)
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {
fprintf(stderr, "%s: destroy AIO context %p failed\n",
__func__, &s->ctx);
}
g_free(s);
}

View File

@@ -32,6 +32,12 @@ typedef struct MirrorBlockJob {
RateLimit limit;
BlockDriverState *target;
BlockDriverState *base;
/* The name of the graph node to replace */
char *replaces;
/* The BDS to replace */
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
bool is_none_mode;
BlockdevOnError on_source_error, on_target_error;
bool synced;
@@ -259,9 +265,11 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_sector = sector_num;
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
qemu_iovec_add(&op->qiov, buf, s->granularity);
qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
/* Advance the HBitmapIter in parallel, so that we do not examine
* the same sector twice.
@@ -324,9 +332,18 @@ static void coroutine_fn mirror_run(void *opaque)
}
s->common.len = bdrv_getlength(bs);
if (s->common.len <= 0) {
if (s->common.len < 0) {
ret = s->common.len;
goto immediate_exit;
} else if (s->common.len == 0) {
/* Report BLOCK_JOB_READY and wait for complete. */
block_job_event_ready(&s->common);
s->synced = true;
while (!block_job_is_cancelled(&s->common) && !s->should_complete) {
block_job_yield(&s->common);
}
s->common.cancelled = false;
goto immediate_exit;
}
length = DIV_ROUND_UP(s->common.len, s->granularity);
@@ -491,10 +508,14 @@ immediate_exit:
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
bdrv_iostatus_disable(s->target);
if (s->should_complete && ret == 0) {
if (bdrv_get_flags(s->target) != bdrv_get_flags(s->common.bs)) {
bdrv_reopen(s->target, bdrv_get_flags(s->common.bs), NULL);
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
bdrv_swap(s->target, s->common.bs);
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
@@ -503,6 +524,12 @@ immediate_exit:
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, ret);
}
@@ -541,6 +568,20 @@ static void mirror_complete(BlockJob *job, Error **errp)
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->replaces) {
s->to_replace = check_to_replace_node(s->replaces, &local_err);
if (!s->to_replace) {
error_propagate(errp, local_err);
return;
}
error_setg(&s->replace_blocker,
"block device is in use by block-job-complete");
bdrv_op_block_all(s->to_replace, s->replace_blocker);
bdrv_ref(s->to_replace);
}
s->should_complete = true;
block_job_resume(job);
}
@@ -563,14 +604,15 @@ static const BlockJobDriver commit_active_job_driver = {
};
static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, int64_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
const char *replaces,
int64_t speed, int64_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
{
MirrorBlockJob *s;
@@ -601,6 +643,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
return;
}
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
s->target = target;
@@ -622,6 +665,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
}
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, int64_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
@@ -633,7 +677,8 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
mirror_start_job(bs, target, speed, granularity, buf_size,
mirror_start_job(bs, target, replaces,
speed, granularity, buf_size,
on_source_error, on_target_error, cb, opaque, errp,
&mirror_job_driver, is_none_mode, base);
}
@@ -681,7 +726,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
}
bdrv_ref(base);
mirror_start_job(bs, base, speed, 0, 0,
mirror_start_job(bs, base, NULL, speed, 0, 0,
on_error, on_error, cb, opaque, &local_err,
&commit_active_job_driver, false, base);
if (local_err) {

View File

@@ -304,17 +304,27 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
qp = query_params_parse(uri->query);
for (i = 0; i < qp->n; i++) {
unsigned long long val;
if (!qp->p[i].value) {
error_setg(errp, "Value for NFS parameter expected: %s",
qp->p[i].name);
goto fail;
}
if (!strncmp(qp->p[i].name, "uid", 3)) {
nfs_set_uid(client->context, atoi(qp->p[i].value));
} else if (!strncmp(qp->p[i].name, "gid", 3)) {
nfs_set_gid(client->context, atoi(qp->p[i].value));
} else if (!strncmp(qp->p[i].name, "tcp-syncnt", 10)) {
nfs_set_tcp_syncnt(client->context, atoi(qp->p[i].value));
if (parse_uint_full(qp->p[i].value, &val, 0)) {
error_setg(errp, "Illegal value for NFS parameter: %s",
qp->p[i].name);
goto fail;
}
if (!strcmp(qp->p[i].name, "uid")) {
nfs_set_uid(client->context, val);
} else if (!strcmp(qp->p[i].name, "gid")) {
nfs_set_gid(client->context, val);
} else if (!strcmp(qp->p[i].name, "tcp-syncnt")) {
nfs_set_tcp_syncnt(client->context, val);
#ifdef LIBNFS_FEATURE_READAHEAD
} else if (!strcmp(qp->p[i].name, "readahead")) {
nfs_set_readahead(client->context, val);
#endif
} else {
error_setg(errp, "Unknown NFS parameter name: %s",
qp->p[i].name);

View File

@@ -293,7 +293,7 @@ void bdrv_query_info(BlockDriverState *bs,
qapi_free_BlockInfo(info);
}
BlockStats *bdrv_query_stats(const BlockDriverState *bs)
static BlockStats *bdrv_query_stats(const BlockDriverState *bs)
{
BlockStats *s;
@@ -360,7 +360,11 @@ BlockStatsList *qmp_query_blockstats(Error **errp)
while ((bs = bdrv_next(bs))) {
BlockStatsList *info = g_malloc0(sizeof(*info));
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
info->value = bdrv_query_stats(bs);
aio_context_release(ctx);
*p_next = info;
p_next = &info->next;

View File

@@ -941,6 +941,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_reopen_prepare = qcow_reopen_prepare,
.bdrv_create = qcow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_co_readv = qcow_co_readv,
.bdrv_co_writev = qcow_co_writev,

View File

@@ -210,20 +210,31 @@ static void GCC_FMT_ATTR(3, 4) report_unsupported(BlockDriverState *bs,
static void report_unsupported_feature(BlockDriverState *bs,
Error **errp, Qcow2Feature *table, uint64_t mask)
{
char *features = g_strdup("");
char *old;
while (table && table->name[0] != '\0') {
if (table->type == QCOW2_FEAT_TYPE_INCOMPATIBLE) {
if (mask & (1 << table->bit)) {
report_unsupported(bs, errp, "%.46s", table->name);
mask &= ~(1 << table->bit);
if (mask & (1ULL << table->bit)) {
old = features;
features = g_strdup_printf("%s%s%.46s", old, *old ? ", " : "",
table->name);
g_free(old);
mask &= ~(1ULL << table->bit);
}
}
table++;
}
if (mask) {
report_unsupported(bs, errp, "Unknown incompatible feature: %" PRIx64,
mask);
old = features;
features = g_strdup_printf("%s%sUnknown incompatible feature: %" PRIx64,
old, *old ? ", " : "", mask);
g_free(old);
}
report_unsupported(bs, errp, "%s", features);
g_free(features);
}
/*
@@ -855,13 +866,11 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
}
static int qcow2_refresh_limits(BlockDriverState *bs)
static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQcowState *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->cluster_sectors;
return 0;
}
static int qcow2_set_key(BlockDriverState *bs, const char *key)
@@ -1020,11 +1029,20 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
n1 = qcow2_backing_read1(bs->backing_hd, &hd_qiov,
sector_num, cur_nr_sectors);
if (n1 > 0) {
QEMUIOVector local_qiov;
qemu_iovec_init(&local_qiov, hd_qiov.niov);
qemu_iovec_concat(&local_qiov, &hd_qiov, 0,
n1 * BDRV_SECTOR_SIZE);
BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
qemu_co_mutex_unlock(&s->lock);
ret = bdrv_co_readv(bs->backing_hd, sector_num,
n1, &hd_qiov);
n1, &local_qiov);
qemu_co_mutex_lock(&s->lock);
qemu_iovec_destroy(&local_qiov);
if (ret < 0) {
goto fail;
}
@@ -2418,6 +2436,7 @@ static BlockDriver bdrv_qcow2 = {
.bdrv_save_vmstate = qcow2_save_vmstate,
.bdrv_load_vmstate = qcow2_load_vmstate,
.supports_backing = true,
.bdrv_change_backing_file = qcow2_change_backing_file,
.bdrv_refresh_limits = qcow2_refresh_limits,

View File

@@ -528,13 +528,11 @@ out:
return ret;
}
static int bdrv_qed_refresh_limits(BlockDriverState *bs)
static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQEDState *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
return 0;
}
/* We have nothing to do for QED reopen, stubs just return
@@ -567,7 +565,7 @@ static void bdrv_qed_close(BlockDriverState *bs)
static int qed_create(const char *filename, uint32_t cluster_size,
uint64_t image_size, uint32_t table_size,
const char *backing_file, const char *backing_fmt,
Error **errp)
QemuOpts *opts, Error **errp)
{
QEDHeader header = {
.magic = QED_MAGIC,
@@ -586,7 +584,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
int ret = 0;
BlockDriverState *bs;
ret = bdrv_create_file(filename, NULL, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
@@ -682,7 +680,7 @@ static int bdrv_qed_create(const char *filename, QemuOpts *opts, Error **errp)
}
ret = qed_create(filename, cluster_size, image_size, table_size,
backing_file, backing_fmt, errp);
backing_file, backing_fmt, opts, errp);
finish:
g_free(backing_file);
@@ -761,17 +759,19 @@ static BDRVQEDState *acb_to_s(QEDAIOCB *acb)
/**
* Read from the backing file or zero-fill if no backing file
*
* @s: QED state
* @pos: Byte position in device
* @qiov: Destination I/O vector
* @cb: Completion function
* @opaque: User data for completion function
* @s: QED state
* @pos: Byte position in device
* @qiov: Destination I/O vector
* @backing_qiov: Possibly shortened copy of qiov, to be allocated here
* @cb: Completion function
* @opaque: User data for completion function
*
* This function reads qiov->size bytes starting at pos from the backing file.
* If there is no backing file then zeroes are read.
*/
static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
QEMUIOVector *qiov,
QEMUIOVector **backing_qiov,
BlockDriverCompletionFunc *cb, void *opaque)
{
uint64_t backing_length = 0;
@@ -804,15 +804,21 @@ static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
/* If the read straddles the end of the backing file, shorten it */
size = MIN((uint64_t)backing_length - pos, qiov->size);
assert(*backing_qiov == NULL);
*backing_qiov = g_new(QEMUIOVector, 1);
qemu_iovec_init(*backing_qiov, qiov->niov);
qemu_iovec_concat(*backing_qiov, qiov, 0, size);
BLKDBG_EVENT(s->bs->file, BLKDBG_READ_BACKING_AIO);
bdrv_aio_readv(s->bs->backing_hd, pos / BDRV_SECTOR_SIZE,
qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
*backing_qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
}
typedef struct {
GenericCB gencb;
BDRVQEDState *s;
QEMUIOVector qiov;
QEMUIOVector *backing_qiov;
struct iovec iov;
uint64_t offset;
} CopyFromBackingFileCB;
@@ -829,6 +835,12 @@ static void qed_copy_from_backing_file_write(void *opaque, int ret)
CopyFromBackingFileCB *copy_cb = opaque;
BDRVQEDState *s = copy_cb->s;
if (copy_cb->backing_qiov) {
qemu_iovec_destroy(copy_cb->backing_qiov);
g_free(copy_cb->backing_qiov);
copy_cb->backing_qiov = NULL;
}
if (ret) {
qed_copy_from_backing_file_cb(copy_cb, ret);
return;
@@ -866,11 +878,12 @@ static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
copy_cb = gencb_alloc(sizeof(*copy_cb), cb, opaque);
copy_cb->s = s;
copy_cb->offset = offset;
copy_cb->backing_qiov = NULL;
copy_cb->iov.iov_base = qemu_blockalign(s->bs, len);
copy_cb->iov.iov_len = len;
qemu_iovec_init_external(&copy_cb->qiov, &copy_cb->iov, 1);
qed_read_backing_file(s, pos, &copy_cb->qiov,
qed_read_backing_file(s, pos, &copy_cb->qiov, &copy_cb->backing_qiov,
qed_copy_from_backing_file_write, copy_cb);
}
@@ -1313,7 +1326,7 @@ static void qed_aio_read_data(void *opaque, int ret,
return;
} else if (ret != QED_CLUSTER_FOUND) {
qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
qed_aio_next_io, acb);
&acb->backing_qiov, qed_aio_next_io, acb);
return;
}
@@ -1339,6 +1352,12 @@ static void qed_aio_next_io(void *opaque, int ret)
trace_qed_aio_next_io(s, acb, ret, acb->cur_pos + acb->cur_qiov.size);
if (acb->backing_qiov) {
qemu_iovec_destroy(acb->backing_qiov);
g_free(acb->backing_qiov);
acb->backing_qiov = NULL;
}
/* Handle I/O error */
if (ret) {
qed_aio_complete(acb, ret);
@@ -1378,6 +1397,7 @@ static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
acb->qiov_offset = 0;
acb->cur_pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
acb->end_pos = acb->cur_pos + nb_sectors * BDRV_SECTOR_SIZE;
acb->backing_qiov = NULL;
acb->request.l2_table = NULL;
qemu_iovec_init(&acb->cur_qiov, qiov->niov);
@@ -1652,6 +1672,7 @@ static BlockDriver bdrv_qed = {
.format_name = "qed",
.instance_size = sizeof(BDRVQEDState),
.create_opts = &qed_create_opts,
.supports_backing = true,
.bdrv_probe = bdrv_qed_probe,
.bdrv_rebind = bdrv_qed_rebind,

View File

@@ -142,6 +142,7 @@ typedef struct QEDAIOCB {
/* Current cluster scatter-gather list */
QEMUIOVector cur_qiov;
QEMUIOVector *backing_qiov;
uint64_t cur_pos; /* position on block device, in bytes */
uint64_t cur_cluster; /* cluster offset in image file */
unsigned int cur_nclusters; /* number of clusters being accessed */

View File

@@ -23,6 +23,7 @@
#define QUORUM_OPT_VOTE_THRESHOLD "vote-threshold"
#define QUORUM_OPT_BLKVERIFY "blkverify"
#define QUORUM_OPT_REWRITE "rewrite-corrupted"
/* This union holds a vote hash value */
typedef union QuorumVoteValue {
@@ -70,6 +71,9 @@ typedef struct BDRVQuorumState {
* It is useful to debug other block drivers by
* comparing them with a reference one.
*/
bool rewrite_corrupted;/* true if the driver must rewrite-on-read corrupted
* block if Quorum is reached.
*/
} BDRVQuorumState;
typedef struct QuorumAIOCB QuorumAIOCB;
@@ -105,13 +109,17 @@ struct QuorumAIOCB {
int count; /* number of completed AIOCB */
int success_count; /* number of successfully completed AIOCB */
int rewrite_count; /* number of replica to rewrite: count down to
* zero once writes are fired
*/
QuorumVotes votes;
bool is_read;
int vote_ret;
};
static void quorum_vote(QuorumAIOCB *acb);
static bool quorum_vote(QuorumAIOCB *acb);
static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
{
@@ -183,6 +191,7 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
acb->qcrs = g_new0(QuorumChildRequest, s->num_children);
acb->count = 0;
acb->success_count = 0;
acb->rewrite_count = 0;
acb->votes.compare = quorum_sha256_compare;
QLIST_INIT(&acb->votes.vote_list);
acb->is_read = false;
@@ -232,11 +241,27 @@ static bool quorum_has_too_much_io_failed(QuorumAIOCB *acb)
return false;
}
static void quorum_rewrite_aio_cb(void *opaque, int ret)
{
QuorumAIOCB *acb = opaque;
/* one less rewrite to do */
acb->rewrite_count--;
/* wait until all rewrite callbacks have completed */
if (acb->rewrite_count) {
return;
}
quorum_aio_finalize(acb);
}
static void quorum_aio_cb(void *opaque, int ret)
{
QuorumChildRequest *sacb = opaque;
QuorumAIOCB *acb = sacb->parent;
BDRVQuorumState *s = acb->common.bs->opaque;
bool rewrite = false;
sacb->ret = ret;
acb->count++;
@@ -253,12 +278,15 @@ static void quorum_aio_cb(void *opaque, int ret)
/* Do the vote on read */
if (acb->is_read) {
quorum_vote(acb);
rewrite = quorum_vote(acb);
} else {
quorum_has_too_much_io_failed(acb);
}
quorum_aio_finalize(acb);
/* if no rewrite is done the code will finish right away */
if (!rewrite) {
quorum_aio_finalize(acb);
}
}
static void quorum_report_bad_versions(BDRVQuorumState *s,
@@ -278,6 +306,43 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
}
}
static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
QuorumVoteValue *value)
{
QuorumVoteVersion *version;
QuorumVoteItem *item;
int count = 0;
/* first count the number of bad versions: done first to avoid concurrency
* issues.
*/
QLIST_FOREACH(version, &acb->votes.vote_list, next) {
if (acb->votes.compare(&version->value, value)) {
continue;
}
QLIST_FOREACH(item, &version->items, next) {
count++;
}
}
/* quorum_rewrite_aio_cb will count down this to zero */
acb->rewrite_count = count;
/* now fire the correcting rewrites */
QLIST_FOREACH(version, &acb->votes.vote_list, next) {
if (acb->votes.compare(&version->value, value)) {
continue;
}
QLIST_FOREACH(item, &version->items, next) {
bdrv_aio_writev(s->bs[item->index], acb->sector_num, acb->qiov,
acb->nb_sectors, quorum_rewrite_aio_cb, acb);
}
}
/* return true if any rewrite is done else false */
return count;
}
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
{
int i;
@@ -468,16 +533,17 @@ static int quorum_vote_error(QuorumAIOCB *acb)
return ret;
}
static void quorum_vote(QuorumAIOCB *acb)
static bool quorum_vote(QuorumAIOCB *acb)
{
bool quorum = true;
bool rewrite = false;
int i, j, ret;
QuorumVoteValue hash;
BDRVQuorumState *s = acb->common.bs->opaque;
QuorumVoteVersion *winner;
if (quorum_has_too_much_io_failed(acb)) {
return;
return false;
}
/* get the index of the first successful read */
@@ -505,7 +571,7 @@ static void quorum_vote(QuorumAIOCB *acb)
/* Every successful read agrees */
if (quorum) {
quorum_copy_qiov(acb->qiov, &acb->qcrs[i].qiov);
return;
return false;
}
/* compute hashes for each successful read, also store indexes */
@@ -538,9 +604,15 @@ static void quorum_vote(QuorumAIOCB *acb)
/* some versions are bad print them */
quorum_report_bad_versions(s, acb, &winner->value);
/* corruption correction is enabled */
if (s->rewrite_corrupted) {
rewrite = quorum_rewrite_bad_versions(s, acb, &winner->value);
}
free_exit:
/* free lists */
quorum_free_vote_list(&acb->votes);
return rewrite;
}
static BlockDriverAIOCB *quorum_aio_readv(BlockDriverState *bs,
@@ -705,6 +777,11 @@ static QemuOptsList quorum_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Trigger block verify mode if set",
},
{
.name = QUORUM_OPT_REWRITE,
.type = QEMU_OPT_BOOL,
.help = "Rewrite corrupted block on read quorum",
},
{ /* end of list */ }
},
};
@@ -766,6 +843,14 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
"and using two files with vote_threshold=2\n");
}
s->rewrite_corrupted = qemu_opt_get_bool(opts, QUORUM_OPT_REWRITE, false);
if (s->rewrite_corrupted && s->is_blkverify) {
error_setg(&local_err,
"rewrite-corrupted=on cannot be used with blkverify=on");
ret = -EINVAL;
goto exit;
}
/* allocate the children BlockDriverState array */
s->bs = g_new0(BlockDriverState *, s->num_children);
opened = g_new0(bool, s->num_children);

View File

@@ -40,6 +40,8 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
BlockDriverCompletionFunc *cb, void *opaque, int type);
void laio_detach_aio_context(void *s, AioContext *old_context);
void laio_attach_aio_context(void *s, AioContext *new_context);
void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
#endif
#ifdef _WIN32

View File

@@ -55,6 +55,9 @@
#include <linux/cdrom.h>
#include <linux/fd.h>
#include <linux/fs.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#ifdef CONFIG_FIEMAP
#include <linux/fiemap.h>
@@ -218,7 +221,7 @@ static int raw_normalize_devicepath(const char **filename)
}
#endif
static void raw_probe_alignment(BlockDriverState *bs)
static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
{
BDRVRawState *s = bs->opaque;
char *buf;
@@ -237,24 +240,24 @@ static void raw_probe_alignment(BlockDriverState *bs)
s->buf_align = 0;
#ifdef BLKSSZGET
if (ioctl(s->fd, BLKSSZGET, &sector_size) >= 0) {
if (ioctl(fd, BLKSSZGET, &sector_size) >= 0) {
bs->request_alignment = sector_size;
}
#endif
#ifdef DKIOCGETBLOCKSIZE
if (ioctl(s->fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
if (ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
bs->request_alignment = sector_size;
}
#endif
#ifdef DIOCGSECTORSIZE
if (ioctl(s->fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
bs->request_alignment = sector_size;
}
#endif
#ifdef CONFIG_XFS
if (s->is_xfs) {
struct dioattr da;
if (xfsctl(NULL, s->fd, XFS_IOC_DIOINFO, &da) >= 0) {
if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, &da) >= 0) {
bs->request_alignment = da.d_miniosz;
/* The kernel returns wrong information for d_mem */
/* s->buf_align = da.d_mem; */
@@ -267,7 +270,7 @@ static void raw_probe_alignment(BlockDriverState *bs)
size_t align;
buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
if (pread(s->fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
if (pread(fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
s->buf_align = align;
break;
}
@@ -279,13 +282,18 @@ static void raw_probe_alignment(BlockDriverState *bs)
size_t align;
buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
if (pread(s->fd, buf, align, 0) >= 0) {
if (pread(fd, buf, align, 0) >= 0) {
bs->request_alignment = align;
break;
}
}
qemu_vfree(buf);
}
if (!s->buf_align || !bs->request_alignment) {
error_setg(errp, "Could not find working O_DIRECT alignment. "
"Try cache.direct=off.");
}
}
static void raw_parse_flags(int bdrv_flags, int *open_flags)
@@ -502,6 +510,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
BDRVRawState *s;
BDRVRawReopenState *raw_s;
int ret = 0;
Error *local_err = NULL;
assert(state != NULL);
assert(state->bs != NULL);
@@ -574,6 +583,19 @@ static int raw_reopen_prepare(BDRVReopenState *state,
ret = -1;
}
}
/* Fail already reopen_prepare() if we can't get a working O_DIRECT
* alignment with the new fd. */
if (raw_s->fd != -1) {
raw_probe_alignment(state->bs, raw_s->fd, &local_err);
if (local_err) {
qemu_close(raw_s->fd);
raw_s->fd = -1;
error_propagate(errp, local_err);
ret = -EINVAL;
}
}
return ret;
}
@@ -612,14 +634,12 @@ static void raw_reopen_abort(BDRVReopenState *state)
state->opaque = NULL;
}
static int raw_refresh_limits(BlockDriverState *bs)
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVRawState *s = bs->opaque;
raw_probe_alignment(bs);
raw_probe_alignment(bs, s->fd, errp);
bs->bl.opt_mem_alignment = s->buf_align;
return 0;
}
static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
@@ -727,6 +747,15 @@ static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf)
}
if (len == -1 && errno == EINTR) {
continue;
} else if (len == -1 && errno == EINVAL &&
(aiocb->bs->open_flags & BDRV_O_NOCACHE) &&
!(aiocb->aio_type & QEMU_AIO_WRITE) &&
offset > 0) {
/* O_DIRECT pread() may fail with EINVAL when offset is unaligned
* after a short read. Assume that O_DIRECT short reads only occur
* at EOF. Therefore this is a short read, not an I/O error.
*/
break;
} else if (len == -1) {
offset = -errno;
break;
@@ -787,6 +816,7 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
memcpy(p, aiocb->aio_iov[i].iov_base, aiocb->aio_iov[i].iov_len);
p += aiocb->aio_iov[i].iov_len;
}
assert(p - buf == aiocb->aio_nbytes);
}
nbytes = handle_aiocb_rw_linear(aiocb, buf);
@@ -801,9 +831,11 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
copy = aiocb->aio_iov[i].iov_len;
}
memcpy(aiocb->aio_iov[i].iov_base, p, copy);
assert(count >= copy);
p += copy;
count -= copy;
}
assert(count == 0);
}
qemu_vfree(buf);
@@ -990,12 +1022,14 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
acb->aio_type = type;
acb->aio_fildes = fd;
acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
if (qiov) {
acb->aio_iov = qiov->iov;
acb->aio_niov = qiov->niov;
assert(qiov->size == acb->aio_nbytes);
}
acb->aio_nbytes = nb_sectors * 512;
acb->aio_offset = sector_num * 512;
trace_paio_submit_co(sector_num, nb_sectors, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
@@ -1013,12 +1047,14 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
acb->aio_type = type;
acb->aio_fildes = fd;
acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
if (qiov) {
acb->aio_iov = qiov->iov;
acb->aio_niov = qiov->niov;
assert(qiov->size == acb->aio_nbytes);
}
acb->aio_nbytes = nb_sectors * 512;
acb->aio_offset = sector_num * 512;
trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
@@ -1054,6 +1090,36 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
cb, opaque, type);
}
static void raw_aio_plug(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_plug(bs, s->aio_ctx);
}
#endif
}
static void raw_aio_unplug(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_unplug(bs, s->aio_ctx, true);
}
#endif
}
static void raw_aio_flush_io_queue(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_unplug(bs, s->aio_ctx, false);
}
#endif
}
static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
@@ -1130,12 +1196,12 @@ static int64_t raw_getlength(BlockDriverState *bs)
struct stat st;
if (fstat(fd, &st))
return -1;
return -errno;
if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
struct disklabel dl;
if (ioctl(fd, DIOCGDINFO, &dl))
return -1;
return -errno;
return (uint64_t)dl.d_secsize *
dl.d_partitions[DISKPART(st.st_rdev)].p_size;
} else
@@ -1149,7 +1215,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
struct stat st;
if (fstat(fd, &st))
return -1;
return -errno;
if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
struct dkwedge_info dkw;
@@ -1159,7 +1225,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
struct disklabel dl;
if (ioctl(fd, DIOCGDINFO, &dl))
return -1;
return -errno;
return (uint64_t)dl.d_secsize *
dl.d_partitions[DISKPART(st.st_rdev)].p_size;
}
@@ -1172,6 +1238,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
BDRVRawState *s = bs->opaque;
struct dk_minfo minfo;
int ret;
int64_t size;
ret = fd_open(bs);
if (ret < 0) {
@@ -1190,7 +1257,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
* There are reports that lseek on some devices fails, but
* irc discussion said that contingency on contingency was overkill.
*/
return lseek(s->fd, 0, SEEK_END);
size = lseek(s->fd, 0, SEEK_END);
if (size < 0) {
return -errno;
}
return size;
}
#elif defined(CONFIG_BSD)
static int64_t raw_getlength(BlockDriverState *bs)
@@ -1228,6 +1299,9 @@ again:
size = LLONG_MAX;
#else
size = lseek(fd, 0LL, SEEK_END);
if (size < 0) {
return -errno;
}
#endif
#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
switch(s->type) {
@@ -1244,6 +1318,9 @@ again:
#endif
} else {
size = lseek(fd, 0, SEEK_END);
if (size < 0) {
return -errno;
}
}
return size;
}
@@ -1252,13 +1329,18 @@ static int64_t raw_getlength(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
int ret;
int64_t size;
ret = fd_open(bs);
if (ret < 0) {
return ret;
}
return lseek(s->fd, 0, SEEK_END);
size = lseek(s->fd, 0, SEEK_END);
if (size < 0) {
return -errno;
}
return size;
}
#endif
@@ -1278,12 +1360,14 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
int fd;
int result = 0;
int64_t total_size = 0;
bool nocow = false;
strstart(filename, "file:", &filename);
/* Read out options */
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
@@ -1291,6 +1375,21 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
result = -errno;
error_setg_errno(errp, -result, "Could not create file");
} else {
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
* will be ignored since any failure of this operation should not
* block the left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
@@ -1477,6 +1576,11 @@ static QemuOptsList raw_create_opts = {
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_NOCOW,
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
{ /* end of list */ }
}
};
@@ -1503,6 +1607,9 @@ static BlockDriver bdrv_file = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_aio_discard = raw_aio_discard,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -1902,6 +2009,9 @@ static BlockDriver bdrv_host_device = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_aio_discard = hdev_aio_discard,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2047,6 +2157,9 @@ static BlockDriver bdrv_host_floppy = {
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2175,6 +2288,9 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2309,6 +2425,9 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,

View File

@@ -94,10 +94,9 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return bdrv_get_info(bs->file, bdi);
}
static int raw_refresh_limits(BlockDriverState *bs)
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl = bs->file->bl;
return 0;
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)

View File

@@ -32,7 +32,7 @@ typedef struct StreamBlockJob {
RateLimit limit;
BlockDriverState *base;
BlockdevOnError on_error;
char backing_file_id[1024];
char *backing_file_str;
} StreamBlockJob;
static int coroutine_fn stream_populate(BlockDriverState *bs,
@@ -76,7 +76,7 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
bdrv_unref(unused);
}
bdrv_refresh_limits(top);
bdrv_refresh_limits(top, NULL);
}
static void coroutine_fn stream_run(void *opaque)
@@ -186,7 +186,7 @@ wait:
if (!block_job_is_cancelled(&s->common) && sector_num == end && ret == 0) {
const char *base_id = NULL, *base_fmt = NULL;
if (base) {
base_id = s->backing_file_id;
base_id = s->backing_file_str;
if (base->drv) {
base_fmt = base->drv->format_name;
}
@@ -196,6 +196,7 @@ wait:
}
qemu_vfree(buf);
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
}
@@ -217,7 +218,7 @@ static const BlockJobDriver stream_job_driver = {
};
void stream_start(BlockDriverState *bs, BlockDriverState *base,
const char *base_id, int64_t speed,
const char *backing_file_str, int64_t speed,
BlockdevOnError on_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
@@ -237,9 +238,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
}
s->base = base;
if (base_id) {
pstrcpy(s->backing_file_id, sizeof(s->backing_file_id), base_id);
}
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(stream_run);

View File

@@ -53,6 +53,13 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "migration/migration.h"
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
@@ -683,6 +690,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
VdiHeader header;
size_t i;
size_t bmap_size;
bool nocow = false;
logout("\n");
@@ -699,6 +707,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
image_type = VDI_TYPE_STATIC;
}
#endif
nocow = qemu_opt_get_bool_del(opts, BLOCK_OPT_NOCOW, false);
if (bytes > VDI_DISK_SIZE_MAX) {
result = -ENOTSUP;
@@ -716,6 +725,21 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value will
* be ignored since any failure of this operation should not block the
* left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
/* We need enough blocks to store the given disk size,
so always round up. */
blocks = (bytes + block_size - 1) / block_size;
@@ -818,6 +842,11 @@ static QemuOptsList vdi_create_opts = {
.def_value_str = "off"
},
#endif
{
.name = BLOCK_OPT_NOCOW,
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
/* TODO: An additional option to set UUID values might be useful. */
{ /* end of list */ }
}

View File

@@ -938,7 +938,7 @@ fail:
}
static int vmdk_refresh_limits(BlockDriverState *bs)
static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVVmdkState *s = bs->opaque;
int i;
@@ -950,8 +950,6 @@ static int vmdk_refresh_limits(BlockDriverState *bs)
s->extents[i].cluster_sectors);
}
}
return 0;
}
static int get_whole_cluster(BlockDriverState *bs,
@@ -1529,7 +1527,7 @@ static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
static int vmdk_create_extent(const char *filename, int64_t filesize,
bool flat, bool compress, bool zeroed_grain,
Error **errp)
QemuOpts *opts, Error **errp)
{
int ret, i;
BlockDriverState *bs = NULL;
@@ -1539,7 +1537,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
uint32_t *gd_buf = NULL;
int gd_buf_size;
ret = bdrv_create_file(filename, NULL, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
@@ -1845,7 +1843,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
path, desc_filename);
if (vmdk_create_extent(ext_filename, size,
flat, compress, zeroed_grain, errp)) {
flat, compress, zeroed_grain, opts, errp)) {
ret = -EINVAL;
goto exit;
}
@@ -2180,6 +2178,7 @@ static BlockDriver bdrv_vmdk = {
.bdrv_detach_aio_context = vmdk_detach_aio_context,
.bdrv_attach_aio_context = vmdk_attach_aio_context,
.supports_backing = true,
.create_opts = &vmdk_create_opts,
};

View File

@@ -29,6 +29,13 @@
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
#endif
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
/**************************************************************/
@@ -751,6 +758,7 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t total_size;
int disk_type;
int ret = -EIO;
bool nocow = false;
/* Read out options */
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
@@ -767,6 +775,7 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
} else {
disk_type = VHD_DYNAMIC;
}
nocow = qemu_opt_get_bool_del(opts, BLOCK_OPT_NOCOW, false);
/* Create the file */
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
@@ -775,6 +784,21 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
goto out;
}
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value will
* be ignored since any failure of this operation should not block the
* left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
/*
* Calculate matching total_size and geometry. Increase the number of
* sectors requested until we get enough (or fail). This ensures that
@@ -884,6 +908,11 @@ static QemuOptsList vpc_create_opts = {
"Type of virtual hard disk format. Supported formats are "
"{dynamic (default) | fixed} "
},
{
.name = BLOCK_OPT_NOCOW,
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
{ /* end of list */ }
}
};

View File

@@ -28,6 +28,7 @@ static void nbd_accept(void *opaque)
int fd = accept(server_fd, (struct sockaddr *)&addr, &addr_len);
if (fd >= 0 && !nbd_client_new(NULL, fd, nbd_client_put)) {
shutdown(fd, 2);
close(fd);
}
}
@@ -91,6 +92,10 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
error_set(errp, QERR_DEVICE_NOT_FOUND, device);
return;
}
if (!bdrv_is_inserted(bs)) {
error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
return;
}
if (!has_writable) {
writable = false;

View File

@@ -1819,6 +1819,11 @@ void qmp_block_resize(bool has_device, const char *device,
return;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {
error_set(errp, QERR_DEVICE_IN_USE, device);
return;
}
/* complete all in-flight operations before resizing the device */
bdrv_drain_all();
@@ -1866,14 +1871,17 @@ static void block_job_cb(void *opaque, int ret)
bdrv_put_ref_bh_schedule(bs);
}
void qmp_block_stream(const char *device, bool has_base,
const char *base, bool has_speed, int64_t speed,
void qmp_block_stream(const char *device,
bool has_base, const char *base,
bool has_backing_file, const char *backing_file,
bool has_speed, int64_t speed,
bool has_on_error, BlockdevOnError on_error,
Error **errp)
{
BlockDriverState *bs;
BlockDriverState *base_bs = NULL;
Error *local_err = NULL;
const char *base_name = NULL;
if (!has_on_error) {
on_error = BLOCKDEV_ON_ERROR_REPORT;
@@ -1889,15 +1897,27 @@ void qmp_block_stream(const char *device, bool has_base,
return;
}
if (base) {
if (has_base) {
base_bs = bdrv_find_backing_image(bs, base);
if (base_bs == NULL) {
error_set(errp, QERR_BASE_NOT_FOUND, base);
return;
}
base_name = base;
}
stream_start(bs, base_bs, base, has_speed ? speed : 0,
/* if we are streaming the entire chain, the result will have no backing
* file, and specifying one is therefore an error */
if (base_bs == NULL && has_backing_file) {
error_setg(errp, "backing file specified, but streaming the "
"entire chain");
return;
}
/* backing_file string overrides base bs filename */
base_name = has_backing_file ? backing_file : base_name;
stream_start(bs, base_bs, base_name, has_speed ? speed : 0,
on_error, block_job_cb, bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
@@ -1908,7 +1928,9 @@ void qmp_block_stream(const char *device, bool has_base,
}
void qmp_block_commit(const char *device,
bool has_base, const char *base, const char *top,
bool has_base, const char *base,
bool has_top, const char *top,
bool has_backing_file, const char *backing_file,
bool has_speed, int64_t speed,
Error **errp)
{
@@ -1927,6 +1949,11 @@ void qmp_block_commit(const char *device,
/* drain all i/o before commits */
bdrv_drain_all();
/* Important Note:
* libvirt relies on the DeviceNotFound error class in order to probe for
* live commit feature versions; for this to work, we must make sure to
* perform the device lookup before any generic errors that may occur in a
* scenario in which all optional arguments are omitted. */
bs = bdrv_find(device);
if (!bs) {
error_set(errp, QERR_DEVICE_NOT_FOUND, device);
@@ -1940,7 +1967,7 @@ void qmp_block_commit(const char *device,
/* default top_bs is the active layer */
top_bs = bs;
if (top) {
if (has_top && top) {
if (strcmp(bs->filename, top) != 0) {
top_bs = bdrv_find_backing_image(bs, top);
}
@@ -1962,12 +1989,23 @@ void qmp_block_commit(const char *device,
return;
}
/* Do not allow attempts to commit an image into itself */
if (top_bs == base_bs) {
error_setg(errp, "cannot commit an image into itself");
return;
}
if (top_bs == bs) {
if (has_backing_file) {
error_setg(errp, "'backing-file' specified,"
" but 'top' is the active layer");
return;
}
commit_active_start(bs, base_bs, speed, on_error, block_job_cb,
bs, &local_err);
} else {
commit_start(bs, base_bs, top_bs, speed, on_error, block_job_cb, bs,
&local_err);
has_backing_file ? backing_file : NULL, &local_err);
}
if (local_err != NULL) {
error_propagate(errp, local_err);
@@ -2094,6 +2132,8 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
void qmp_drive_mirror(const char *device, const char *target,
bool has_format, const char *format,
bool has_node_name, const char *node_name,
bool has_replaces, const char *replaces,
enum MirrorSyncMode sync,
bool has_mode, enum NewImageMode mode,
bool has_speed, int64_t speed,
@@ -2107,6 +2147,7 @@ void qmp_drive_mirror(const char *device, const char *target,
BlockDriverState *source, *target_bs;
BlockDriver *drv = NULL;
Error *local_err = NULL;
QDict *options = NULL;
int flags;
int64_t size;
int ret;
@@ -2180,6 +2221,29 @@ void qmp_drive_mirror(const char *device, const char *target,
return;
}
if (has_replaces) {
BlockDriverState *to_replace_bs;
if (!has_node_name) {
error_setg(errp, "a node-name must be provided when replacing a"
" named node of the graph");
return;
}
to_replace_bs = check_to_replace_node(replaces, &local_err);
if (!to_replace_bs) {
error_propagate(errp, local_err);
return;
}
if (size != bdrv_getlength(to_replace_bs)) {
error_setg(errp, "cannot replace image with a mirror image of "
"different size");
return;
}
}
if ((sync == MIRROR_SYNC_MODE_FULL || !source)
&& mode != NEW_IMAGE_MODE_EXISTING)
{
@@ -2208,18 +2272,28 @@ void qmp_drive_mirror(const char *device, const char *target,
return;
}
if (has_node_name) {
options = qdict_new();
qdict_put(options, "node-name", qstring_from_str(node_name));
}
/* Mirroring takes care of copy-on-write using the source's backing
* file.
*/
target_bs = NULL;
ret = bdrv_open(&target_bs, target, NULL, NULL, flags | BDRV_O_NO_BACKING,
drv, &local_err);
ret = bdrv_open(&target_bs, target, NULL, options,
flags | BDRV_O_NO_BACKING, drv, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return;
}
mirror_start(bs, target_bs, speed, granularity, buf_size, sync,
/* pass the node name to replace to mirror start since it's loose coupling
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(bs, target_bs,
has_replaces ? replaces : NULL,
speed, granularity, buf_size, sync,
on_source_error, on_target_error,
block_job_cb, bs, &local_err);
if (local_err != NULL) {
@@ -2314,6 +2388,85 @@ void qmp_block_job_complete(const char *device, Error **errp)
block_job_complete(job, errp);
}
void qmp_change_backing_file(const char *device,
const char *image_node_name,
const char *backing_file,
Error **errp)
{
BlockDriverState *bs = NULL;
BlockDriverState *image_bs = NULL;
Error *local_err = NULL;
bool ro;
int open_flags;
int ret;
/* find the top layer BDS of the chain */
bs = bdrv_find(device);
if (!bs) {
error_set(errp, QERR_DEVICE_NOT_FOUND, device);
return;
}
image_bs = bdrv_lookup_bs(NULL, image_node_name, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
if (!image_bs) {
error_setg(errp, "image file not found");
return;
}
if (bdrv_find_base(image_bs) == image_bs) {
error_setg(errp, "not allowing backing file change on an image "
"without a backing file");
return;
}
/* even though we are not necessarily operating on bs, we need it to
* determine if block ops are currently prohibited on the chain */
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_CHANGE, errp)) {
return;
}
/* final sanity check */
if (!bdrv_chain_contains(bs, image_bs)) {
error_setg(errp, "'%s' and image file are not in the same chain",
device);
return;
}
/* if not r/w, reopen to make r/w */
open_flags = image_bs->open_flags;
ro = bdrv_is_read_only(image_bs);
if (ro) {
bdrv_reopen(image_bs, open_flags | BDRV_O_RDWR, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
ret = bdrv_change_backing_file(image_bs, backing_file,
image_bs->drv ? image_bs->drv->format_name : "");
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not change backing file to '%s'",
backing_file);
/* don't exit here, so we can try to restore open flags if
* appropriate */
}
if (ro) {
bdrv_reopen(image_bs, open_flags, &local_err);
if (local_err) {
error_propagate(errp, local_err); /* will preserve prior errp */
}
}
}
void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
{
QmpOutputVisitor *ov = qmp_output_visitor_new();

View File

@@ -187,7 +187,7 @@ int block_job_cancel_sync(BlockJob *job)
job->opaque = &data;
block_job_cancel(job);
while (data.ret == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(bs), true);
}
return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
}
@@ -210,6 +210,20 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
job->busy = true;
}
void block_job_yield(BlockJob *job)
{
assert(job->busy);
/* Check cancellation *before* setting busy = false, too! */
if (block_job_is_cancelled(job)) {
return;
}
job->busy = false;
qemu_coroutine_yield();
job->busy = true;
}
BlockJobInfo *block_job_query(BlockJob *job)
{
BlockJobInfo *info = g_new0(BlockJobInfo, 1);
@@ -256,7 +270,11 @@ void block_job_event_completed(BlockJob *job, const char *msg)
void block_job_event_ready(BlockJob *job)
{
qapi_event_send_block_job_ready(bdrv_get_device_name(job->bs), &error_abort);
qapi_event_send_block_job_ready(job->driver->job_type,
bdrv_get_device_name(job->bs),
job->len,
job->offset,
job->speed, &error_abort);
}
BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
@@ -282,7 +300,7 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
default:
abort();
}
qapi_event_send_block_job_error(bdrv_get_device_name(bs),
qapi_event_send_block_job_error(bdrv_get_device_name(job->bs),
is_read ? IO_OPERATION_TYPE_READ :
IO_OPERATION_TYPE_WRITE,
action, &error_abort);

40
configure vendored
View File

@@ -1489,8 +1489,9 @@ for flag in $gcc_flags; do
fi
done
if test "$stack_protector" != "no" ; then
if test "$stack_protector" != "no"; then
gcc_flags="-fstack-protector-strong -fstack-protector-all"
sp_on=0
for flag in $gcc_flags; do
# We need to check both a compile and a link, since some compiler
# setups fail only on a .c->.o compile and some only at link time
@@ -1498,9 +1499,15 @@ if test "$stack_protector" != "no" ; then
compile_prog "-Werror $flag" ""; then
QEMU_CFLAGS="$QEMU_CFLAGS $flag"
LIBTOOLFLAGS="$LIBTOOLFLAGS -Wc,$flag"
sp_on=1
break
fi
done
if test "$stack_protector" = yes; then
if test $sp_on = 0; then
error_exit "Stack protector not supported"
fi
fi
fi
# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC and
@@ -1711,6 +1718,20 @@ else
echo big/little test failed
fi
##########################################
# L2TPV3 probe
cat > $TMPC <<EOF
#include <sys/socket.h>
#include <linux/ip.h>
int main(void) { return sizeof(struct mmsghdr); }
EOF
if compile_prog "" "" ; then
l2tpv3=yes
else
l2tpv3=no
fi
##########################################
# pkg-config probe
@@ -3453,7 +3474,7 @@ fi
# Do we need libm
cat > $TMPC << EOF
#include <math.h>
int main(void) { return isnan(sin(0.0)); }
int main(int argc, char **argv) { return isnan(sin((double)argc)); }
EOF
if compile_prog "" "" ; then
:
@@ -4343,6 +4364,9 @@ fi
if test "$netmap" = "yes" ; then
echo "CONFIG_NETMAP=y" >> $config_host_mak
fi
if test "$l2tpv3" = "yes" ; then
echo "CONFIG_L2TPV3=y" >> $config_host_mak
fi
if test "$cap_ng" = "yes" ; then
echo "CONFIG_LIBCAP=y" >> $config_host_mak
fi
@@ -5273,6 +5297,18 @@ if test "$docs" = "yes" ; then
mkdir -p QMP
fi
# set up qemu-iotests in this build directory
iotests_common_env="tests/qemu-iotests/common.env"
iotests_check="tests/qemu-iotests/check"
echo "# Automatically generated by configure - do not modify" > "$iotests_common_env"
echo >> "$iotests_common_env"
echo "export PYTHON='$python'" >> "$iotests_common_env"
if [ ! -e "$iotests_check" ]; then
symlink "$source_path/$iotests_check" "$iotests_check"
fi
# Save the configure command line for later reuse.
cat <<EOD >config.status
#!/bin/sh

View File

@@ -36,8 +36,17 @@ typedef struct
static __thread CoroutineWin32 leader;
static __thread Coroutine *current;
CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
CoroutineAction action)
/* This function is marked noinline to prevent GCC from inlining it
* into coroutine_trampoline(). If we allow it to do that then it
* hoists the code to get the address of the TLS variable "current"
* out of the while() loop. This is an invalid transformation because
* the SwitchToFiber() call may be called when running thread A but
* return in thread B, and so we might be in a different thread
* context each time round the loop.
*/
CoroutineAction __attribute__((noinline))
qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
CoroutineAction action)
{
CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, to_);

View File

@@ -3,6 +3,6 @@ libvixl_OBJS = utils.o \
a64/decoder-a64.o \
a64/disasm-a64.o
$(addprefix $(obj)/,$(libvixl_OBJS)): QEMU_CFLAGS += -I$(SRC_PATH)/disas/libvixl
$(addprefix $(obj)/,$(libvixl_OBJS)): QEMU_CFLAGS := -I$(SRC_PATH)/disas/libvixl $(QEMU_CFLAGS)
common-obj-$(CONFIG_ARM_A64_DIS) += $(libvixl_OBJS)

View File

@@ -2,7 +2,7 @@
The code in this directory is a subset of libvixl:
https://github.com/armvixl/vixl
(specifically, it is the set of files needed for disassembly only,
taken from libvixl 1.1).
taken from libvixl 1.4).
Bugfixes should preferably be sent upstream initially.
The disassembler does not currently support the entire A64 instruction

View File

@@ -1369,7 +1369,7 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
VIXL_ASSERT(format[5] == 'L');
AppendToOutput("#0x%" PRIx64, instr->ImmMoveWide());
if (instr->ShiftMoveWide() > 0) {
AppendToOutput(", lsl #%d", 16 * instr->ShiftMoveWide());
AppendToOutput(", lsl #%" PRId64, 16 * instr->ShiftMoveWide());
}
}
return 8;
@@ -1418,7 +1418,7 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
}
case 'F': { // IFPSingle, IFPDouble or IFPFBits.
if (format[3] == 'F') { // IFPFbits.
AppendToOutput("#%d", 64 - instr->FPScale());
AppendToOutput("#%" PRId64, 64 - instr->FPScale());
return 8;
} else {
AppendToOutput("#0x%" PRIx64 " (%.4f)", instr->ImmFP(),
@@ -1439,23 +1439,23 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
return 5;
}
case 'P': { // IP - Conditional compare.
AppendToOutput("#%d", instr->ImmCondCmp());
AppendToOutput("#%" PRId64, instr->ImmCondCmp());
return 2;
}
case 'B': { // Bitfields.
return SubstituteBitfieldImmediateField(instr, format);
}
case 'E': { // IExtract.
AppendToOutput("#%d", instr->ImmS());
AppendToOutput("#%" PRId64, instr->ImmS());
return 8;
}
case 'S': { // IS - Test and branch bit.
AppendToOutput("#%d", (instr->ImmTestBranchBit5() << 5) |
instr->ImmTestBranchBit40());
AppendToOutput("#%" PRId64, (instr->ImmTestBranchBit5() << 5) |
instr->ImmTestBranchBit40());
return 2;
}
case 'D': { // IDebug - HLT and BRK instructions.
AppendToOutput("#0x%x", instr->ImmException());
AppendToOutput("#0x%" PRIx64, instr->ImmException());
return 6;
}
default: {
@@ -1626,12 +1626,12 @@ int Disassembler::SubstituteExtendField(Instruction* instr,
(((instr->ExtendMode() == UXTW) && (instr->SixtyFourBits() == 0)) ||
(instr->ExtendMode() == UXTX))) {
if (instr->ImmExtendShift() > 0) {
AppendToOutput(", lsl #%d", instr->ImmExtendShift());
AppendToOutput(", lsl #%" PRId64, instr->ImmExtendShift());
}
} else {
AppendToOutput(", %s", extend_mode[instr->ExtendMode()]);
if (instr->ImmExtendShift() > 0) {
AppendToOutput(" #%d", instr->ImmExtendShift());
AppendToOutput(" #%" PRId64, instr->ImmExtendShift());
}
}
return 3;
@@ -1660,7 +1660,7 @@ int Disassembler::SubstituteLSRegOffsetField(Instruction* instr,
if (!((ext == UXTX) && (shift == 0))) {
AppendToOutput(", %s", extend_mode[ext]);
if (shift != 0) {
AppendToOutput(" #%d", instr->SizeLS());
AppendToOutput(" #%" PRId64, instr->SizeLS());
}
}
return 9;

View File

@@ -170,6 +170,10 @@ static void dma_bdrv_cb(void *opaque, int ret)
return;
}
if (dbs->iov.size & ~BDRV_SECTOR_MASK) {
qemu_iovec_discard_back(&dbs->iov, dbs->iov.size & ~BDRV_SECTOR_MASK);
}
dbs->acb = dbs->io_func(dbs->bs, dbs->sector_num, &dbs->iov,
dbs->iov.size / 512, dma_bdrv_cb, dbs);
assert(dbs->acb);

104
docs/aio_notify.promela Normal file
View File

@@ -0,0 +1,104 @@
/*
* This model describes the interaction between aio_set_dispatching()
* and aio_notify().
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* This file is in the public domain. If you really want a license,
* the WTFPL will do.
*
* To simulate it:
* spin -p docs/aio_notify.promela
*
* To verify it:
* spin -a docs/aio_notify.promela
* gcc -O2 pan.c
* ./a.out -a
*/
#define MAX 4
#define LAST (1 << (MAX - 1))
#define FINAL ((LAST << 1) - 1)
bool dispatching;
bool event;
int req, done;
active proctype waiter()
{
int fetch, blocking;
do
:: done != FINAL -> {
// Computing "blocking" is separate from execution of the
// "bottom half"
blocking = (req == 0);
// This is our "bottom half"
atomic { fetch = req; req = 0; }
done = done | fetch;
// Wait for a nudge from the other side
do
:: event == 1 -> { event = 0; break; }
:: !blocking -> break;
od;
dispatching = 1;
// If you are simulating this model, you may want to add
// something like this here:
//
// int foo; foo++; foo++; foo++;
//
// This only wastes some time and makes it more likely
// that the notifier process hits the "fast path".
dispatching = 0;
}
:: else -> break;
od
}
active proctype notifier()
{
int next = 1;
int sets = 0;
do
:: next <= LAST -> {
// generate a request
req = req | next;
next = next << 1;
// aio_notify
if
:: dispatching == 0 -> sets++; event = 1;
:: else -> skip;
fi;
// Test both synchronous and asynchronous delivery
if
:: 1 -> do
:: req == 0 -> break;
od;
:: 1 -> skip;
fi;
}
:: else -> break;
od;
printf("Skipped %d event_notifier_set\n", MAX - sets);
}
#define p (done == FINAL)
never {
do
:: 1 // after an arbitrarily long prefix
:: p -> break // p becomes true
od;
do
:: !p -> accept: break // it then must remains true forever after
od
}

View File

@@ -218,10 +218,10 @@ An example command is:
=== Events ===
Events are defined with the keyword 'event'. When 'data' is also specified,
additional info will be carried on. Finally there will be C API generated
in qapi-event.h; when called by QEMU code, a message with timestamp will
be emitted on the wire. If timestamp is -1, it means failure to retrieve host
time.
additional info will be included in the event. Finally there will be C API
generated in qapi-event.h; when called by QEMU code, a message with timestamp
will be emitted on the wire. If timestamp is -1, it means failure to retrieve
host time.
An example event is:

627
docs/qmp/qmp-events.txt Normal file
View File

@@ -0,0 +1,627 @@
QEMU Machine Protocol Events
============================
ACPI_DEVICE_OST
---------------
Emitted when guest executes ACPI _OST method.
- data: ACPIOSTInfo type as described in qapi-schema.json
{ "event": "ACPI_DEVICE_OST",
"data": { "device": "d1", "slot": "0", "slot-type": "DIMM", "source": 1, "status": 0 } }
BALLOON_CHANGE
--------------
Emitted when the guest changes the actual BALLOON level. This
value is equivalent to the 'actual' field return by the
'query-balloon' command
Data:
- "actual": actual level of the guest memory balloon in bytes (json-number)
Example:
{ "event": "BALLOON_CHANGE",
"data": { "actual": 944766976 },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
BLOCK_IMAGE_CORRUPTED
---------------------
Emitted when a disk image is being marked corrupt.
Data:
- "device": Device name (json-string)
- "msg": Informative message (e.g., reason for the corruption) (json-string)
- "offset": If the corruption resulted from an image access, this is the access
offset into the image (json-int)
- "size": If the corruption resulted from an image access, this is the access
size (json-int)
Example:
{ "event": "BLOCK_IMAGE_CORRUPTED",
"data": { "device": "ide0-hd0",
"msg": "Prevented active L1 table overwrite", "offset": 196608,
"size": 65536 },
"timestamp": { "seconds": 1378126126, "microseconds": 966463 } }
BLOCK_IO_ERROR
--------------
Emitted when a disk I/O error occurs.
Data:
- "device": device name (json-string)
- "operation": I/O operation (json-string, "read" or "write")
- "action": action that has been taken, it's one of the following (json-string):
"ignore": error has been ignored
"report": error has been reported to the device
"stop": the VM is going to stop because of the error
Example:
{ "event": "BLOCK_IO_ERROR",
"data": { "device": "ide0-hd1",
"operation": "write",
"action": "stop" },
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
Note: If action is "stop", a STOP event will eventually follow the
BLOCK_IO_ERROR event.
BLOCK_JOB_CANCELLED
-------------------
Emitted when a block job has been cancelled.
Data:
- "type": Job type (json-string; "stream" for image streaming
"commit" for block commit)
- "device": Device name (json-string)
- "len": Maximum progress value (json-int)
- "offset": Current progress value (json-int)
On success this is equal to len.
On failure this is less than len.
- "speed": Rate limit, bytes per second (json-int)
Example:
{ "event": "BLOCK_JOB_CANCELLED",
"data": { "type": "stream", "device": "virtio-disk0",
"len": 10737418240, "offset": 134217728,
"speed": 0 },
"timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
BLOCK_JOB_COMPLETED
-------------------
Emitted when a block job has completed.
Data:
- "type": Job type (json-string; "stream" for image streaming
"commit" for block commit)
- "device": Device name (json-string)
- "len": Maximum progress value (json-int)
- "offset": Current progress value (json-int)
On success this is equal to len.
On failure this is less than len.
- "speed": Rate limit, bytes per second (json-int)
- "error": Error message (json-string, optional)
Only present on failure. This field contains a human-readable
error message. There are no semantics other than that streaming
has failed and clients should not try to interpret the error
string.
Example:
{ "event": "BLOCK_JOB_COMPLETED",
"data": { "type": "stream", "device": "virtio-disk0",
"len": 10737418240, "offset": 10737418240,
"speed": 0 },
"timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
BLOCK_JOB_ERROR
---------------
Emitted when a block job encounters an error.
Data:
- "device": device name (json-string)
- "operation": I/O operation (json-string, "read" or "write")
- "action": action that has been taken, it's one of the following (json-string):
"ignore": error has been ignored, the job may fail later
"report": error will be reported and the job canceled
"stop": error caused job to be paused
Example:
{ "event": "BLOCK_JOB_ERROR",
"data": { "device": "ide0-hd1",
"operation": "write",
"action": "stop" },
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
BLOCK_JOB_READY
---------------
Emitted when a block job is ready to complete.
Data:
- "type": Job type (json-string; "stream" for image streaming
"commit" for block commit)
- "device": Device name (json-string)
- "len": Maximum progress value (json-int)
- "offset": Current progress value (json-int)
On success this is equal to len.
On failure this is less than len.
- "speed": Rate limit, bytes per second (json-int)
Example:
{ "event": "BLOCK_JOB_READY",
"data": { "device": "drive0", "type": "mirror", "speed": 0,
"len": 2097152, "offset": 2097152 }
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
event.
DEVICE_DELETED
--------------
Emitted whenever the device removal completion is acknowledged
by the guest.
At this point, it's safe to reuse the specified device ID.
Device removal can be initiated by the guest or by HMP/QMP commands.
Data:
- "device": device name (json-string, optional)
- "path": device path (json-string)
{ "event": "DEVICE_DELETED",
"data": { "device": "virtio-net-pci-0",
"path": "/machine/peripheral/virtio-net-pci-0" },
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
DEVICE_TRAY_MOVED
-----------------
It's emitted whenever the tray of a removable device is moved by the guest
or by HMP/QMP commands.
Data:
- "device": device name (json-string)
- "tray-open": true if the tray has been opened or false if it has been closed
(json-bool)
{ "event": "DEVICE_TRAY_MOVED",
"data": { "device": "ide1-cd0",
"tray-open": true
},
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
GUEST_PANICKED
--------------
Emitted when guest OS panic is detected.
Data:
- "action": Action that has been taken (json-string, currently always "pause").
Example:
{ "event": "GUEST_PANICKED",
"data": { "action": "pause" } }
NIC_RX_FILTER_CHANGED
---------------------
The event is emitted once until the query command is executed,
the first event will always be emitted.
Data:
- "name": net client name (json-string)
- "path": device path (json-string)
{ "event": "NIC_RX_FILTER_CHANGED",
"data": { "name": "vnet0",
"path": "/machine/peripheral/vnet0/virtio-backend" },
"timestamp": { "seconds": 1368697518, "microseconds": 326866 } }
}
POWERDOWN
---------
Emitted when the Virtual Machine is powered down through the power
control system, such as via ACPI.
Data: None.
Example:
{ "event": "POWERDOWN",
"timestamp": { "seconds": 1267040730, "microseconds": 682951 } }
QUORUM_FAILURE
--------------
Emitted by the Quorum block driver if it fails to establish a quorum.
Data:
- "reference": device name if defined else node name.
- "sector-num": Number of the first sector of the failed read operation.
- "sectors-count": Failed read operation sector count.
Example:
{ "event": "QUORUM_FAILURE",
"data": { "reference": "usr1", "sector-num": 345435, "sectors-count": 5 },
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
QUORUM_REPORT_BAD
-----------------
Emitted to report a corruption of a Quorum file.
Data:
- "error": Error message (json-string, optional)
Only present on failure. This field contains a human-readable
error message. There are no semantics other than that the
block layer reported an error and clients should not try to
interpret the error string.
- "node-name": The graph node name of the block driver state.
- "sector-num": Number of the first sector of the failed read operation.
- "sectors-count": Failed read operation sector count.
Example:
{ "event": "QUORUM_REPORT_BAD",
"data": { "node-name": "1.raw", "sector-num": 345435, "sectors-count": 5 },
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
RESET
-----
Emitted when the Virtual Machine is reset.
Data: None.
Example:
{ "event": "RESET",
"timestamp": { "seconds": 1267041653, "microseconds": 9518 } }
RESUME
------
Emitted when the Virtual Machine resumes execution.
Data: None.
Example:
{ "event": "RESUME",
"timestamp": { "seconds": 1271770767, "microseconds": 582542 } }
RTC_CHANGE
----------
Emitted when the guest changes the RTC time.
Data:
- "offset": Offset between base RTC clock (as specified by -rtc base), and
new RTC clock value (json-number)
Example:
{ "event": "RTC_CHANGE",
"data": { "offset": 78 },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
SHUTDOWN
--------
Emitted when the Virtual Machine has shut down, indicating that qemu
is about to exit.
Data: None.
Example:
{ "event": "SHUTDOWN",
"timestamp": { "seconds": 1267040730, "microseconds": 682951 } }
Note: If the command-line option "-no-shutdown" has been specified, a STOP
event will eventually follow the SHUTDOWN event.
SPICE_CONNECTED
---------------
Emitted when a SPICE client connects.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "client": Client information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 388707},
"event": "SPICE_CONNECTED",
"data": {
"server": { "port": "5920", "family": "ipv4", "host": "127.0.0.1"},
"client": {"port": "52873", "family": "ipv4", "host": "127.0.0.1"}
}}
SPICE_DISCONNECTED
------------------
Emitted when a SPICE client disconnects.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "client": Client information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 388707},
"event": "SPICE_DISCONNECTED",
"data": {
"server": { "port": "5920", "family": "ipv4", "host": "127.0.0.1"},
"client": {"port": "52873", "family": "ipv4", "host": "127.0.0.1"}
}}
SPICE_INITIALIZED
-----------------
Emitted after initial handshake and authentication takes place (if any)
and the SPICE channel is up and running
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "connection-id": spice connection id. All channels with the same id
belong to the same spice session (json-int)
- "channel-type": channel type. "1" is the main control channel, filter for
this one if you want track spice sessions only (json-int)
- "channel-id": channel id. Usually "0", might be different needed when
multiple channels of the same type exist, such as multiple
display channels in a multihead setup (json-int)
- "tls": whevener the channel is encrypted (json-bool)
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 417172},
"event": "SPICE_INITIALIZED",
"data": {"server": {"auth": "spice", "port": "5921",
"family": "ipv4", "host": "127.0.0.1"},
"client": {"port": "49004", "family": "ipv4", "channel-type": 3,
"connection-id": 1804289383, "host": "127.0.0.1",
"channel-id": 0, "tls": true}
}}
SPICE_MIGRATE_COMPLETED
-----------------------
Emitted when SPICE migration has completed
Data: None.
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 417172},
"event": "SPICE_MIGRATE_COMPLETED" }
STOP
----
Emitted when the Virtual Machine is stopped.
Data: None.
Example:
{ "event": "STOP",
"timestamp": { "seconds": 1267041730, "microseconds": 281295 } }
SUSPEND
-------
Emitted when guest enters S3 state.
Data: None.
Example:
{ "event": "SUSPEND",
"timestamp": { "seconds": 1344456160, "microseconds": 309119 } }
SUSPEND_DISK
------------
Emitted when the guest makes a request to enter S4 state.
Data: None.
Example:
{ "event": "SUSPEND_DISK",
"timestamp": { "seconds": 1344456160, "microseconds": 309119 } }
Note: QEMU shuts down when entering S4 state.
VNC_CONNECTED
-------------
Emitted when a VNC client establishes a connection.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
Example:
{ "event": "VNC_CONNECTED",
"data": {
"server": { "auth": "sasl", "family": "ipv4",
"service": "5901", "host": "0.0.0.0" },
"client": { "family": "ipv4", "service": "58425",
"host": "127.0.0.1" } },
"timestamp": { "seconds": 1262976601, "microseconds": 975795 } }
Note: This event is emitted before any authentication takes place, thus
the authentication ID is not provided.
VNC_DISCONNECTED
----------------
Emitted when the connection is closed.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "x509_dname": TLS dname (json-string, optional)
- "sasl_username": SASL username (json-string, optional)
Example:
{ "event": "VNC_DISCONNECTED",
"data": {
"server": { "auth": "sasl", "family": "ipv4",
"service": "5901", "host": "0.0.0.0" },
"client": { "family": "ipv4", "service": "58425",
"host": "127.0.0.1", "sasl_username": "luiz" } },
"timestamp": { "seconds": 1262976601, "microseconds": 975795 } }
VNC_INITIALIZED
---------------
Emitted after authentication takes place (if any) and the VNC session is
made active.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "x509_dname": TLS dname (json-string, optional)
- "sasl_username": SASL username (json-string, optional)
Example:
{ "event": "VNC_INITIALIZED",
"data": {
"server": { "auth": "sasl", "family": "ipv4",
"service": "5901", "host": "0.0.0.0"},
"client": { "family": "ipv4", "service": "46089",
"host": "127.0.0.1", "sasl_username": "luiz" } },
"timestamp": { "seconds": 1263475302, "microseconds": 150772 } }
VSERPORT_CHANGE
---------------
Emitted when the guest opens or closes a virtio-serial port.
Data:
- "id": device identifier of the virtio-serial port (json-string)
- "open": true if the guest has opened the virtio-serial port (json-bool)
Example:
{ "event": "VSERPORT_CHANGE",
"data": { "id": "channel0", "open": true },
"timestamp": { "seconds": 1401385907, "microseconds": 422329 } }
WAKEUP
------
Emitted when the guest has woken up from S3 and is running.
Data: None.
Example:
{ "event": "WAKEUP",
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
WATCHDOG
--------
Emitted when the watchdog device's timer is expired.
Data:
- "action": Action that has been taken, it's one of the following (json-string):
"reset", "shutdown", "poweroff", "pause", "debug", or "none"
Example:
{ "event": "WATCHDOG",
"data": { "action": "reset" },
"timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
Note: If action is "reset", "shutdown", or "pause" the WATCHDOG event is
followed respectively by the RESET, SHUTDOWN, or STOP events.

View File

@@ -78,14 +78,14 @@ Depending on the request type, payload can be:
Padding: 32-bit
A region is:
---------------------------------------
| guest address | size | user address |
---------------------------------------
-----------------------------------------------------
| guest address | size | user address | mmap offset |
-----------------------------------------------------
Guest address: a 64-bit guest address of the region
Size: a 64-bit size
User address: a 64-bit user address
mmap offset: 64-bit offset where region starts in the mapped memory
In QEMU the vhost-user message is implemented with the following struct:
@@ -132,7 +132,7 @@ Message types
* VHOST_USER_GET_FEATURES
Id: 2
Id: 1
Equivalent ioctl: VHOST_GET_FEATURES
Master payload: N/A
Slave payload: u64
@@ -141,7 +141,7 @@ Message types
* VHOST_USER_SET_FEATURES
Id: 3
Id: 2
Ioctl: VHOST_SET_FEATURES
Master payload: u64
@@ -149,7 +149,7 @@ Message types
* VHOST_USER_SET_OWNER
Id: 4
Id: 3
Equivalent ioctl: VHOST_SET_OWNER
Master payload: N/A
@@ -159,7 +159,7 @@ Message types
* VHOST_USER_RESET_OWNER
Id: 5
Id: 4
Equivalent ioctl: VHOST_RESET_OWNER
Master payload: N/A
@@ -168,7 +168,7 @@ Message types
* VHOST_USER_SET_MEM_TABLE
Id: 6
Id: 5
Equivalent ioctl: VHOST_SET_MEM_TABLE
Master payload: memory regions description
@@ -179,7 +179,7 @@ Message types
* VHOST_USER_SET_LOG_BASE
Id: 7
Id: 6
Equivalent ioctl: VHOST_SET_LOG_BASE
Master payload: u64
@@ -187,7 +187,7 @@ Message types
* VHOST_USER_SET_LOG_FD
Id: 8
Id: 7
Equivalent ioctl: VHOST_SET_LOG_FD
Master payload: N/A
@@ -195,7 +195,7 @@ Message types
* VHOST_USER_SET_VRING_NUM
Id: 9
Id: 8
Equivalent ioctl: VHOST_SET_VRING_NUM
Master payload: vring state description
@@ -203,7 +203,7 @@ Message types
* VHOST_USER_SET_VRING_ADDR
Id: 10
Id: 9
Equivalent ioctl: VHOST_SET_VRING_ADDR
Master payload: vring address description
Slave payload: N/A
@@ -212,7 +212,7 @@ Message types
* VHOST_USER_SET_VRING_BASE
Id: 11
Id: 10
Equivalent ioctl: VHOST_SET_VRING_BASE
Master payload: vring state description
@@ -220,7 +220,7 @@ Message types
* VHOST_USER_GET_VRING_BASE
Id: 12
Id: 11
Equivalent ioctl: VHOST_USER_GET_VRING_BASE
Master payload: vring state description
Slave payload: vring state description
@@ -229,7 +229,7 @@ Message types
* VHOST_USER_SET_VRING_KICK
Id: 13
Id: 12
Equivalent ioctl: VHOST_SET_VRING_KICK
Master payload: u64
@@ -242,7 +242,7 @@ Message types
* VHOST_USER_SET_VRING_CALL
Id: 14
Id: 13
Equivalent ioctl: VHOST_SET_VRING_CALL
Master payload: u64
@@ -255,7 +255,7 @@ Message types
* VHOST_USER_SET_VRING_ERR
Id: 15
Id: 14
Equivalent ioctl: VHOST_SET_VRING_ERR
Master payload: u64

82
exec.c
View File

@@ -430,15 +430,50 @@ static int cpu_common_post_load(void *opaque, int version_id)
return 0;
}
static int cpu_common_pre_load(void *opaque)
{
CPUState *cpu = opaque;
cpu->exception_index = 0;
return 0;
}
static bool cpu_common_exception_index_needed(void *opaque)
{
CPUState *cpu = opaque;
return cpu->exception_index != 0;
}
static const VMStateDescription vmstate_cpu_common_exception_index = {
.name = "cpu_common/exception_index",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_INT32(exception_index, CPUState),
VMSTATE_END_OF_LIST()
}
};
const VMStateDescription vmstate_cpu_common = {
.name = "cpu_common",
.version_id = 1,
.minimum_version_id = 1,
.pre_load = cpu_common_pre_load,
.post_load = cpu_common_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT32(halted, CPUState),
VMSTATE_UINT32(interrupt_request, CPUState),
VMSTATE_END_OF_LIST()
},
.subsections = (VMStateSubsection[]) {
{
.vmsd = &vmstate_cpu_common_exception_index,
.needed = cpu_common_exception_index_needed,
} , {
/* empty */
}
}
};
@@ -883,7 +918,7 @@ static void phys_section_destroy(MemoryRegion *mr)
if (mr->subpage) {
subpage_t *subpage = container_of(mr, subpage_t, iomem);
memory_region_destroy(&subpage->iomem);
object_unref(OBJECT(&subpage->iomem));
g_free(subpage);
}
}
@@ -1456,6 +1491,13 @@ int qemu_get_ram_fd(ram_addr_t addr)
return block->fd;
}
void *qemu_get_ram_block_host_ptr(ram_addr_t addr)
{
RAMBlock *block = qemu_get_ram_block(addr);
return block->host;
}
/* Return a host pointer to ram allocated with qemu_ram_alloc.
With the exception of the softmmu code in this file, this should
only be used for local memory (e.g. video ram) that the device owns,
@@ -1561,8 +1603,7 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
default:
abort();
}
cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_MIGRATION);
cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_VGA);
cpu_physical_memory_set_dirty_range_nocode(ram_addr, size);
/* we remove the notdirty callback only if the code has been
flushed */
if (!cpu_physical_memory_is_clean(ram_addr)) {
@@ -1761,7 +1802,7 @@ static subpage_t *subpage_init(AddressSpace *as, hwaddr base)
mmio->as = as;
mmio->base = base;
memory_region_init_io(&mmio->iomem, NULL, &subpage_ops, mmio,
"subpage", TARGET_PAGE_SIZE);
NULL, TARGET_PAGE_SIZE);
mmio->iomem.subpage = true;
#if defined(DEBUG_SUBPAGE)
printf("%s: %p base " TARGET_FMT_plx " len %08x\n", __func__,
@@ -1794,13 +1835,13 @@ MemoryRegion *iotlb_to_region(AddressSpace *as, hwaddr index)
static void io_mem_init(void)
{
memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, "rom", UINT64_MAX);
memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
"unassigned", UINT64_MAX);
NULL, UINT64_MAX);
memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
"notdirty", UINT64_MAX);
NULL, UINT64_MAX);
memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
"watch", UINT64_MAX);
NULL, UINT64_MAX);
}
static void mem_begin(MemoryListener *listener)
@@ -1971,8 +2012,7 @@ static void invalidate_and_set_dirty(hwaddr addr,
/* invalidate code */
tb_invalidate_phys_page_range(addr, addr + length, 0);
/* set dirty bit */
cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
cpu_physical_memory_set_dirty_range_nocode(addr, length);
}
xen_modified_memory(addr, length);
}
@@ -2328,15 +2368,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
mr = qemu_ram_addr_from_host(buffer, &addr1);
assert(mr != NULL);
if (is_write) {
while (access_len) {
unsigned l;
l = TARGET_PAGE_SIZE;
if (l > access_len)
l = access_len;
invalidate_and_set_dirty(addr1, l);
addr1 += l;
access_len -= l;
}
invalidate_and_set_dirty(addr1, access_len);
}
if (xen_enabled()) {
xen_invalidate_map_cache_entry(buffer);
@@ -2574,9 +2606,7 @@ void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val)
/* invalidate code */
tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
/* set dirty bit */
cpu_physical_memory_set_dirty_flag(addr1,
DIRTY_MEMORY_MIGRATION);
cpu_physical_memory_set_dirty_flag(addr1, DIRTY_MEMORY_VGA);
cpu_physical_memory_set_dirty_range_nocode(addr1, 4);
}
}
}
@@ -2752,14 +2782,12 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
}
#endif
#if !defined(CONFIG_USER_ONLY)
/*
* A helper function for the _utterly broken_ virtio device model to find out if
* it's running on a big endian machine. Don't do this at home kids!
*/
bool virtio_is_big_endian(void);
bool virtio_is_big_endian(void)
bool target_words_bigendian(void);
bool target_words_bigendian(void)
{
#if defined(TARGET_WORDS_BIGENDIAN)
return true;
@@ -2768,8 +2796,6 @@ bool virtio_is_big_endian(void)
#endif
}
#endif
#ifndef CONFIG_USER_ONLY
bool cpu_physical_memory_is_io(hwaddr phys_addr)
{

3
hmp.c
View File

@@ -933,6 +933,7 @@ void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
}
qmp_drive_mirror(device, filename, !!format, format,
false, NULL, false, NULL,
full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
true, mode, false, 0, false, 0, false, 0,
false, 0, false, 0, &err);
@@ -1175,7 +1176,7 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
const char *base = qdict_get_try_str(qdict, "base");
int64_t speed = qdict_get_try_int(qdict, "speed", 0);
qmp_block_stream(device, base != NULL, base,
qmp_block_stream(device, base != NULL, base, false, NULL,
qdict_haskey(qdict, "speed"), speed,
true, BLOCKDEV_ON_ERROR_REPORT, &error);

View File

@@ -19,6 +19,7 @@
#include "fsdev/qemu-fsdev.h"
#include "virtio-9p-xattr.h"
#include "virtio-9p-coth.h"
#include "hw/virtio/virtio-access.h"
static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
{
@@ -34,7 +35,7 @@ static void virtio_9p_get_config(VirtIODevice *vdev, uint8_t *config)
len = strlen(s->tag);
cfg = g_malloc0(sizeof(struct virtio_9p_config) + len);
stw_p(&cfg->tag_len, len);
virtio_stw_p(vdev, &cfg->tag_len, len);
/* We don't copy the terminating null to config space */
memcpy(cfg->tag, s->tag, len);
memcpy(config, cfg, s->config_size);

View File

@@ -232,11 +232,11 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
acpi_gpe_init(&pm->acpi_regs, ICH9_PMIO_GPE0_LEN);
memory_region_init_io(&pm->io_gpe, OBJECT(lpc_pci), &ich9_gpe_ops, pm,
"apci-gpe0", ICH9_PMIO_GPE0_LEN);
"acpi-gpe0", ICH9_PMIO_GPE0_LEN);
memory_region_add_subregion(&pm->io, ICH9_PMIO_GPE0_STS, &pm->io_gpe);
memory_region_init_io(&pm->io_smi, OBJECT(lpc_pci), &ich9_smi_ops, pm,
"apci-smi", 8);
"acpi-smi", 8);
memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
pm->irq = sci_irq;

View File

@@ -159,7 +159,7 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
state->devs = g_malloc0(sizeof(*state->devs) * state->dev_count);
memory_region_init_io(&state->io, owner, &acpi_memory_hotplug_ops, state,
"apci-mem-hotplug", ACPI_MEMORY_HOTPLUG_IO_LEN);
"acpi-mem-hotplug", ACPI_MEMORY_HOTPLUG_IO_LEN);
memory_region_add_subregion(as, ACPI_MEMORY_HOTPLUG_BASE, &state->io);
}

View File

@@ -231,7 +231,7 @@ static uint64_t pci_read(void *opaque, hwaddr addr, unsigned int size)
uint32_t val = 0;
int bsel = s->hotplug_select;
if (bsel < 0 || bsel > ACPI_PCIHP_MAX_HOTPLUG_BUS) {
if (bsel < 0 || bsel >= ACPI_PCIHP_MAX_HOTPLUG_BUS) {
return 0;
}

View File

@@ -172,7 +172,7 @@ static void omap_timer_clk_update(void *opaque, int line, int on)
static void omap_timer_clk_setup(struct omap_mpu_timer_s *timer)
{
omap_clk_adduser(timer->clk,
qemu_allocate_irqs(omap_timer_clk_update, timer, 1)[0]);
qemu_allocate_irq(omap_timer_clk_update, timer, 0));
timer->rate = omap_clk_getrate(timer->clk);
}
@@ -2098,7 +2098,7 @@ static struct omap_mpuio_s *omap_mpuio_init(MemoryRegion *memory,
"omap-mpuio", 0x800);
memory_region_add_subregion(memory, base, &s->iomem);
omap_clk_adduser(clk, qemu_allocate_irqs(omap_mpuio_onoff, s, 1)[0]);
omap_clk_adduser(clk, qemu_allocate_irq(omap_mpuio_onoff, s, 0));
return s;
}
@@ -2401,7 +2401,7 @@ static struct omap_pwl_s *omap_pwl_init(MemoryRegion *system_memory,
"omap-pwl", 0x800);
memory_region_add_subregion(system_memory, base, &s->iomem);
omap_clk_adduser(clk, qemu_allocate_irqs(omap_pwl_clk_update, s, 1)[0]);
omap_clk_adduser(clk, qemu_allocate_irq(omap_pwl_clk_update, s, 0));
return s;
}
@@ -3485,8 +3485,8 @@ static void omap_mcbsp_i2s_start(void *opaque, int line, int level)
void omap_mcbsp_i2s_attach(struct omap_mcbsp_s *s, I2SCodec *slave)
{
s->codec = slave;
slave->rx_swallow = qemu_allocate_irqs(omap_mcbsp_i2s_swallow, s, 1)[0];
slave->tx_start = qemu_allocate_irqs(omap_mcbsp_i2s_start, s, 1)[0];
slave->rx_swallow = qemu_allocate_irq(omap_mcbsp_i2s_swallow, s, 0);
slave->tx_start = qemu_allocate_irq(omap_mcbsp_i2s_start, s, 0);
}
/* LED Pulse Generators */
@@ -3634,7 +3634,7 @@ static struct omap_lpg_s *omap_lpg_init(MemoryRegion *system_memory,
memory_region_init_io(&s->iomem, NULL, &omap_lpg_ops, s, "omap-lpg", 0x800);
memory_region_add_subregion(system_memory, base, &s->iomem);
omap_clk_adduser(clk, qemu_allocate_irqs(omap_lpg_clk_update, s, 1)[0]);
omap_clk_adduser(clk, qemu_allocate_irq(omap_lpg_clk_update, s, 0));
return s;
}
@@ -3848,7 +3848,7 @@ struct omap_mpu_state_s *omap310_mpu_init(MemoryRegion *system_memory,
s->sdram_size = sdram_size;
s->sram_size = OMAP15XX_SRAM_SIZE;
s->wakeup = qemu_allocate_irqs(omap_mpu_wakeup, s, 1)[0];
s->wakeup = qemu_allocate_irq(omap_mpu_wakeup, s, 0);
/* Clocks */
omap_clk_init(s);

View File

@@ -2260,7 +2260,7 @@ struct omap_mpu_state_s *omap2420_mpu_init(MemoryRegion *sysmem,
s->sdram_size = sdram_size;
s->sram_size = OMAP242X_SRAM_SIZE;
s->wakeup = qemu_allocate_irqs(omap_mpu_wakeup, s, 1)[0];
s->wakeup = qemu_allocate_irq(omap_mpu_wakeup, s, 0);
/* Clocks */
omap_clk_init(s);

View File

@@ -2052,7 +2052,7 @@ PXA2xxState *pxa270_init(MemoryRegion *address_space,
fprintf(stderr, "Unable to find CPU definition\n");
exit(1);
}
s->reset = qemu_allocate_irqs(pxa2xx_reset, s, 1)[0];
s->reset = qemu_allocate_irq(pxa2xx_reset, s, 0);
/* SDRAM & Internal Memory Storage */
memory_region_init_ram(&s->sdram, NULL, "pxa270.sdram", sdram_size);
@@ -2183,7 +2183,7 @@ PXA2xxState *pxa255_init(MemoryRegion *address_space, unsigned int sdram_size)
fprintf(stderr, "Unable to find CPU definition\n");
exit(1);
}
s->reset = qemu_allocate_irqs(pxa2xx_reset, s, 1)[0];
s->reset = qemu_allocate_irq(pxa2xx_reset, s, 0);
/* SDRAM & Internal Memory Storage */
memory_region_init_ram(&s->sdram, NULL, "pxa255.sdram", sdram_size);

View File

@@ -36,7 +36,6 @@ struct PXA2xxGPIOInfo {
uint32_t rising[PXA2XX_GPIO_BANKS];
uint32_t falling[PXA2XX_GPIO_BANKS];
uint32_t status[PXA2XX_GPIO_BANKS];
uint32_t gpsr[PXA2XX_GPIO_BANKS];
uint32_t gafr[PXA2XX_GPIO_BANKS * 2];
uint32_t prev_level[PXA2XX_GPIO_BANKS];
@@ -162,14 +161,14 @@ static uint64_t pxa2xx_gpio_read(void *opaque, hwaddr offset,
return s->dir[bank];
case GPSR: /* GPIO Pin-Output Set registers */
printf("%s: Read from a write-only register " REG_FMT "\n",
__FUNCTION__, offset);
return s->gpsr[bank]; /* Return last written value. */
qemu_log_mask(LOG_GUEST_ERROR,
"pxa2xx GPIO: read from write only register GPSR\n");
return 0;
case GPCR: /* GPIO Pin-Output Clear registers */
printf("%s: Read from a write-only register " REG_FMT "\n",
__FUNCTION__, offset);
return 31337; /* Specified as unpredictable in the docs. */
qemu_log_mask(LOG_GUEST_ERROR,
"pxa2xx GPIO: read from write only register GPCR\n");
return 0;
case GRER: /* GPIO Rising-Edge Detect Enable registers */
return s->rising[bank];
@@ -217,7 +216,6 @@ static void pxa2xx_gpio_write(void *opaque, hwaddr offset,
case GPSR: /* GPIO Pin-Output Set registers */
s->olevel[bank] |= value;
pxa2xx_gpio_handler_update(s);
s->gpsr[bank] = value;
break;
case GPCR: /* GPIO Pin-Output Clear registers */
@@ -314,7 +312,6 @@ static const VMStateDescription vmstate_pxa2xx_gpio_regs = {
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_INT32(lines, PXA2xxGPIOInfo),
VMSTATE_UINT32_ARRAY(ilevel, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS),
VMSTATE_UINT32_ARRAY(olevel, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS),
VMSTATE_UINT32_ARRAY(dir, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS),
@@ -322,6 +319,7 @@ static const VMStateDescription vmstate_pxa2xx_gpio_regs = {
VMSTATE_UINT32_ARRAY(falling, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS),
VMSTATE_UINT32_ARRAY(status, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS),
VMSTATE_UINT32_ARRAY(gafr, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS * 2),
VMSTATE_UINT32_ARRAY(prev_level, PXA2xxGPIOInfo, PXA2XX_GPIO_BANKS),
VMSTATE_END_OF_LIST(),
},
};
@@ -340,6 +338,7 @@ static void pxa2xx_gpio_class_init(ObjectClass *klass, void *data)
k->init = pxa2xx_gpio_initfn;
dc->desc = "PXA2xx GPIO controller";
dc->props = pxa2xx_gpio_properties;
dc->vmsd = &vmstate_pxa2xx_gpio_regs;
}
static const TypeInfo pxa2xx_gpio_info = {

View File

@@ -752,7 +752,7 @@ static void spitz_i2c_setup(PXA2xxState *cpu)
spitz_wm8750_addr(wm, 0, 0);
qdev_connect_gpio_out(cpu->gpio, SPITZ_GPIO_WM,
qemu_allocate_irqs(spitz_wm8750_addr, wm, 1)[0]);
qemu_allocate_irq(spitz_wm8750_addr, wm, 0));
/* .. and to the sound interface. */
cpu->i2s->opaque = wm;
cpu->i2s->codec_out = wm8750_dac_dat;
@@ -858,7 +858,7 @@ static void spitz_gpio_setup(PXA2xxState *cpu, int slots)
* wouldn't guarantee that a guest ever exits the loop.
*/
spitz_hsync = 0;
lcd_hsync = qemu_allocate_irqs(spitz_lcd_hsync_handler, cpu, 1)[0];
lcd_hsync = qemu_allocate_irq(spitz_lcd_hsync_handler, cpu, 0);
pxa2xx_gpio_read_notifier(cpu->gpio, lcd_hsync);
pxa2xx_lcd_vsync_notifier(cpu->lcd, lcd_hsync);

View File

@@ -480,7 +480,6 @@ struct StrongARMGPIOInfo {
uint32_t rising;
uint32_t falling;
uint32_t status;
uint32_t gpsr;
uint32_t gafr;
uint32_t prev_level;
@@ -544,14 +543,14 @@ static uint64_t strongarm_gpio_read(void *opaque, hwaddr offset,
return s->dir;
case GPSR: /* GPIO Pin-Output Set registers */
DPRINTF("%s: Read from a write-only register 0x" TARGET_FMT_plx "\n",
__func__, offset);
return s->gpsr; /* Return last written value. */
qemu_log_mask(LOG_GUEST_ERROR,
"strongarm GPIO: read from write only register GPSR\n");
return 0;
case GPCR: /* GPIO Pin-Output Clear registers */
DPRINTF("%s: Read from a write-only register 0x" TARGET_FMT_plx "\n",
__func__, offset);
return 31337; /* Specified as unpredictable in the docs. */
qemu_log_mask(LOG_GUEST_ERROR,
"strongarm GPIO: read from write only register GPCR\n");
return 0;
case GRER: /* GPIO Rising-Edge Detect Enable registers */
return s->rising;
@@ -590,7 +589,6 @@ static void strongarm_gpio_write(void *opaque, hwaddr offset,
case GPSR: /* GPIO Pin-Output Set registers */
s->olevel |= value;
strongarm_gpio_handler_update(s);
s->gpsr = value;
break;
case GPCR: /* GPIO Pin-Output Clear registers */
@@ -676,6 +674,7 @@ static const VMStateDescription vmstate_strongarm_gpio_regs = {
VMSTATE_UINT32(falling, StrongARMGPIOInfo),
VMSTATE_UINT32(status, StrongARMGPIOInfo),
VMSTATE_UINT32(gafr, StrongARMGPIOInfo),
VMSTATE_UINT32(prev_level, StrongARMGPIOInfo),
VMSTATE_END_OF_LIST(),
},
};
@@ -687,6 +686,7 @@ static void strongarm_gpio_class_init(ObjectClass *klass, void *data)
k->init = strongarm_gpio_initfn;
dc->desc = "StrongARM GPIO controller";
dc->vmsd = &vmstate_strongarm_gpio_regs;
}
static const TypeInfo strongarm_gpio_info = {
@@ -846,6 +846,7 @@ static const VMStateDescription vmstate_strongarm_ppc_regs = {
VMSTATE_UINT32(ppar, StrongARMPPCInfo),
VMSTATE_UINT32(psdr, StrongARMPPCInfo),
VMSTATE_UINT32(ppfr, StrongARMPPCInfo),
VMSTATE_UINT32(prev_level, StrongARMPPCInfo),
VMSTATE_END_OF_LIST(),
},
};
@@ -857,6 +858,7 @@ static void strongarm_ppc_class_init(ObjectClass *klass, void *data)
k->init = strongarm_ppc_init;
dc->desc = "StrongARM PPC controller";
dc->vmsd = &vmstate_strongarm_ppc_regs;
}
static const TypeInfo strongarm_ppc_info = {

View File

@@ -84,6 +84,7 @@ enum {
};
static hwaddr motherboard_legacy_map[] = {
[VE_NORFLASHALIAS] = 0,
/* CS7: 0x10000000 .. 0x10020000 */
[VE_SYSREGS] = 0x10000000,
[VE_SP810] = 0x10001000,
@@ -114,7 +115,6 @@ static hwaddr motherboard_legacy_map[] = {
[VE_VIDEORAM] = 0x4c000000,
[VE_ETHERNET] = 0x4e000000,
[VE_USB] = 0x4f000000,
[VE_NORFLASHALIAS] = -1, /* not present */
};
static hwaddr motherboard_aseries_map[] = {

View File

@@ -65,6 +65,7 @@ enum {
VIRT_GIC_CPU,
VIRT_UART,
VIRT_MMIO,
VIRT_RTC,
};
typedef struct MemMapEntry {
@@ -92,6 +93,8 @@ typedef struct VirtBoardInfo {
* high memory region beyond 4GB).
* This represents a compromise between how much RAM can be given to
* a 32 bit VM and leaving space for expansion and in particular for PCI.
* Note that devices should generally be placed at multiples of 0x10000,
* to accommodate guests using 64K pages.
*/
static const MemMapEntry a15memmap[] = {
/* Space up to 0x8000000 is reserved for a boot ROM */
@@ -101,6 +104,7 @@ static const MemMapEntry a15memmap[] = {
[VIRT_GIC_DIST] = { 0x8000000, 0x10000 },
[VIRT_GIC_CPU] = { 0x8010000, 0x10000 },
[VIRT_UART] = { 0x9000000, 0x1000 },
[VIRT_RTC] = { 0x9010000, 0x1000 },
[VIRT_MMIO] = { 0xa000000, 0x200 },
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
/* 0x10000000 .. 0x40000000 reserved for PCI */
@@ -109,6 +113,7 @@ static const MemMapEntry a15memmap[] = {
static const int a15irqmap[] = {
[VIRT_UART] = 1,
[VIRT_RTC] = 2,
[VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
};
@@ -189,20 +194,41 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
/* No PSCI for TCG yet */
if (kvm_enabled()) {
uint32_t cpu_suspend_fn;
uint32_t cpu_off_fn;
uint32_t cpu_on_fn;
uint32_t migrate_fn;
qemu_fdt_add_subnode(fdt, "/psci");
if (armcpu->psci_version == 2) {
const char comp[] = "arm,psci-0.2\0arm,psci";
qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
cpu_on_fn = QEMU_PSCI_0_2_FN64_CPU_ON;
migrate_fn = QEMU_PSCI_0_2_FN64_MIGRATE;
} else {
cpu_suspend_fn = QEMU_PSCI_0_2_FN_CPU_SUSPEND;
cpu_on_fn = QEMU_PSCI_0_2_FN_CPU_ON;
migrate_fn = QEMU_PSCI_0_2_FN_MIGRATE;
}
} else {
qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
cpu_on_fn = QEMU_PSCI_0_1_FN_CPU_ON;
migrate_fn = QEMU_PSCI_0_1_FN_MIGRATE;
}
qemu_fdt_setprop_string(fdt, "/psci", "method", "hvc");
qemu_fdt_setprop_cell(fdt, "/psci", "cpu_suspend",
PSCI_FN_CPU_SUSPEND);
qemu_fdt_setprop_cell(fdt, "/psci", "cpu_off", PSCI_FN_CPU_OFF);
qemu_fdt_setprop_cell(fdt, "/psci", "cpu_on", PSCI_FN_CPU_ON);
qemu_fdt_setprop_cell(fdt, "/psci", "migrate", PSCI_FN_MIGRATE);
qemu_fdt_setprop_cell(fdt, "/psci", "cpu_suspend", cpu_suspend_fn);
qemu_fdt_setprop_cell(fdt, "/psci", "cpu_off", cpu_off_fn);
qemu_fdt_setprop_cell(fdt, "/psci", "cpu_on", cpu_on_fn);
qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
}
}
@@ -353,6 +379,29 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
g_free(nodename);
}
static void create_rtc(const VirtBoardInfo *vbi, qemu_irq *pic)
{
char *nodename;
hwaddr base = vbi->memmap[VIRT_RTC].base;
hwaddr size = vbi->memmap[VIRT_RTC].size;
int irq = vbi->irqmap[VIRT_RTC];
const char compat[] = "arm,pl031\0arm,primecell";
sysbus_create_simple("pl031", base, pic[irq]);
nodename = g_strdup_printf("/pl031@%" PRIx64, base);
qemu_fdt_add_subnode(vbi->fdt, nodename);
qemu_fdt_setprop(vbi->fdt, nodename, "compatible", compat, sizeof(compat));
qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
2, base, 2, size);
qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
GIC_FDT_IRQ_TYPE_SPI, irq,
GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
qemu_fdt_setprop_cell(vbi->fdt, nodename, "clocks", vbi->clock_phandle);
qemu_fdt_setprop_string(vbi->fdt, nodename, "clock-names", "apb_pclk");
g_free(nodename);
}
static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic)
{
int i;
@@ -469,6 +518,8 @@ static void machvirt_init(MachineState *machine)
create_uart(vbi, pic);
create_rtc(vbi, pic);
/* Create mmio transports, so the user can create virtio backends
* (which will be automatically plugged in to the transports). If
* no backend is created the transport will just sit harmlessly idle.

View File

@@ -363,7 +363,7 @@ static void z2_init(MachineState *machine)
wm8750_data_req_set(wm, mpu->i2s->data_req, mpu->i2s);
qdev_connect_gpio_out(mpu->gpio, Z2_GPIO_LCD_CS,
qemu_allocate_irqs(z2_lcd_cs, z2_lcd, 1)[0]);
qemu_allocate_irq(z2_lcd_cs, z2_lcd, 0));
z2_binfo.kernel_filename = kernel_filename;
z2_binfo.kernel_cmdline = kernel_cmdline;

View File

@@ -24,16 +24,6 @@
#include "hw/virtio/virtio-bus.h"
#include "qom/object_interfaces.h"
typedef struct {
VirtIOBlockDataPlane *s;
QEMUIOVector *inhdr; /* iovecs for virtio_blk_inhdr */
VirtQueueElement *elem; /* saved data from the virtqueue */
QEMUIOVector qiov; /* original request iovecs */
struct iovec bounce_iov; /* used if guest buffers are unaligned */
QEMUIOVector bounce_qiov; /* bounce buffer iovecs */
bool read; /* read or write? */
} VirtIOBlockRequest;
struct VirtIOBlockDataPlane {
bool started;
bool starting;
@@ -44,6 +34,7 @@ struct VirtIOBlockDataPlane {
VirtIODevice *vdev;
Vring vring; /* virtqueue vring */
EventNotifier *guest_notifier; /* irq */
QEMUBH *bh; /* bh for guest notification */
/* Note that these EventNotifiers are assigned by value. This is
* fine as long as you do not call event_notifier_cleanup on them
@@ -57,6 +48,8 @@ struct VirtIOBlockDataPlane {
/* Operation blocker on BDS */
Error *blocker;
void (*saved_complete_request)(struct VirtIOBlockReq *req,
unsigned char status);
};
/* Raise an interrupt to signal guest, if necessary */
@@ -69,248 +62,65 @@ static void notify_guest(VirtIOBlockDataPlane *s)
event_notifier_set(s->guest_notifier);
}
static void complete_rdwr(void *opaque, int ret)
static void notify_guest_bh(void *opaque)
{
VirtIOBlockRequest *req = opaque;
struct virtio_blk_inhdr hdr;
int len;
VirtIOBlockDataPlane *s = opaque;
if (likely(ret == 0)) {
hdr.status = VIRTIO_BLK_S_OK;
len = req->qiov.size;
} else {
hdr.status = VIRTIO_BLK_S_IOERR;
len = 0;
}
trace_virtio_blk_data_plane_complete_request(req->s, req->elem->index, ret);
if (req->read && req->bounce_iov.iov_base) {
qemu_iovec_from_buf(&req->qiov, 0, req->bounce_iov.iov_base, len);
}
if (req->bounce_iov.iov_base) {
qemu_vfree(req->bounce_iov.iov_base);
}
qemu_iovec_from_buf(req->inhdr, 0, &hdr, sizeof(hdr));
qemu_iovec_destroy(req->inhdr);
g_slice_free(QEMUIOVector, req->inhdr);
/* According to the virtio specification len should be the number of bytes
* written to, but for virtio-blk it seems to be the number of bytes
* transferred plus the status bytes.
*/
vring_push(&req->s->vring, req->elem, len + sizeof(hdr));
notify_guest(req->s);
g_slice_free(VirtIOBlockRequest, req);
}
static void complete_request_early(VirtIOBlockDataPlane *s, VirtQueueElement *elem,
QEMUIOVector *inhdr, unsigned char status)
{
struct virtio_blk_inhdr hdr = {
.status = status,
};
qemu_iovec_from_buf(inhdr, 0, &hdr, sizeof(hdr));
qemu_iovec_destroy(inhdr);
g_slice_free(QEMUIOVector, inhdr);
vring_push(&s->vring, elem, sizeof(hdr));
notify_guest(s);
}
/* Get disk serial number */
static void do_get_id_cmd(VirtIOBlockDataPlane *s,
struct iovec *iov, unsigned int iov_cnt,
VirtQueueElement *elem, QEMUIOVector *inhdr)
static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
{
char id[VIRTIO_BLK_ID_BYTES];
VirtIOBlockDataPlane *s = req->dev->dataplane;
stb_p(&req->in->status, status);
/* Serial number not NUL-terminated when longer than buffer */
strncpy(id, s->blk->serial ? s->blk->serial : "", sizeof(id));
iov_from_buf(iov, iov_cnt, 0, id, sizeof(id));
complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_OK);
}
vring_push(&req->dev->dataplane->vring, &req->elem,
req->qiov.size + sizeof(*req->in));
static void do_rdwr_cmd(VirtIOBlockDataPlane *s, bool read,
struct iovec *iov, unsigned iov_cnt,
int64_t sector_num, VirtQueueElement *elem,
QEMUIOVector *inhdr)
{
VirtIOBlockRequest *req = g_slice_new0(VirtIOBlockRequest);
QEMUIOVector *qiov;
int nb_sectors;
/* Fill in virtio block metadata needed for completion */
req->s = s;
req->elem = elem;
req->inhdr = inhdr;
req->read = read;
qemu_iovec_init_external(&req->qiov, iov, iov_cnt);
qiov = &req->qiov;
if (!bdrv_qiov_is_aligned(s->blk->conf.bs, qiov)) {
void *bounce_buffer = qemu_blockalign(s->blk->conf.bs, qiov->size);
/* Populate bounce buffer with data for writes */
if (!read) {
qemu_iovec_to_buf(qiov, 0, bounce_buffer, qiov->size);
}
/* Redirect I/O to aligned bounce buffer */
req->bounce_iov.iov_base = bounce_buffer;
req->bounce_iov.iov_len = qiov->size;
qemu_iovec_init_external(&req->bounce_qiov, &req->bounce_iov, 1);
qiov = &req->bounce_qiov;
}
nb_sectors = qiov->size / BDRV_SECTOR_SIZE;
if (read) {
bdrv_aio_readv(s->blk->conf.bs, sector_num, qiov, nb_sectors,
complete_rdwr, req);
} else {
bdrv_aio_writev(s->blk->conf.bs, sector_num, qiov, nb_sectors,
complete_rdwr, req);
}
}
static void complete_flush(void *opaque, int ret)
{
VirtIOBlockRequest *req = opaque;
unsigned char status;
if (ret == 0) {
status = VIRTIO_BLK_S_OK;
} else {
status = VIRTIO_BLK_S_IOERR;
}
complete_request_early(req->s, req->elem, req->inhdr, status);
g_slice_free(VirtIOBlockRequest, req);
}
static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem,
QEMUIOVector *inhdr)
{
VirtIOBlockRequest *req = g_slice_new(VirtIOBlockRequest);
req->s = s;
req->elem = elem;
req->inhdr = inhdr;
bdrv_aio_flush(s->blk->conf.bs, complete_flush, req);
}
static void do_scsi_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem,
QEMUIOVector *inhdr)
{
int status;
status = virtio_blk_handle_scsi_req(VIRTIO_BLK(s->vdev), elem);
complete_request_early(s, elem, inhdr, status);
}
static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem)
{
struct iovec *iov = elem->out_sg;
struct iovec *in_iov = elem->in_sg;
unsigned out_num = elem->out_num;
unsigned in_num = elem->in_num;
struct virtio_blk_outhdr outhdr;
QEMUIOVector *inhdr;
size_t in_size;
/* Copy in outhdr */
if (unlikely(iov_to_buf(iov, out_num, 0, &outhdr,
sizeof(outhdr)) != sizeof(outhdr))) {
error_report("virtio-blk request outhdr too short");
return -EFAULT;
}
iov_discard_front(&iov, &out_num, sizeof(outhdr));
/* Grab inhdr for later */
in_size = iov_size(in_iov, in_num);
if (in_size < sizeof(struct virtio_blk_inhdr)) {
error_report("virtio_blk request inhdr too short");
return -EFAULT;
}
inhdr = g_slice_new(QEMUIOVector);
qemu_iovec_init(inhdr, 1);
qemu_iovec_concat_iov(inhdr, in_iov, in_num,
in_size - sizeof(struct virtio_blk_inhdr),
sizeof(struct virtio_blk_inhdr));
iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
/* TODO Linux sets the barrier bit even when not advertised! */
outhdr.type &= ~VIRTIO_BLK_T_BARRIER;
switch (outhdr.type) {
case VIRTIO_BLK_T_IN:
do_rdwr_cmd(s, true, in_iov, in_num,
outhdr.sector * 512 / BDRV_SECTOR_SIZE,
elem, inhdr);
return 0;
case VIRTIO_BLK_T_OUT:
do_rdwr_cmd(s, false, iov, out_num,
outhdr.sector * 512 / BDRV_SECTOR_SIZE,
elem, inhdr);
return 0;
case VIRTIO_BLK_T_SCSI_CMD:
do_scsi_cmd(s, elem, inhdr);
return 0;
case VIRTIO_BLK_T_FLUSH:
do_flush_cmd(s, elem, inhdr);
return 0;
case VIRTIO_BLK_T_GET_ID:
do_get_id_cmd(s, in_iov, in_num, elem, inhdr);
return 0;
default:
error_report("virtio-blk unsupported request type %#x", outhdr.type);
qemu_iovec_destroy(inhdr);
g_slice_free(QEMUIOVector, inhdr);
return -EFAULT;
}
/* Suppress notification to guest by BH and its scheduled
* flag because requests are completed as a batch after io
* plug & unplug is introduced, and the BH can still be
* executed in dataplane aio context even after it is
* stopped, so needn't worry about notification loss with BH.
*/
qemu_bh_schedule(s->bh);
}
static void handle_notify(EventNotifier *e)
{
VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
host_notifier);
VirtQueueElement *elem;
int ret;
VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
event_notifier_test_and_clear(&s->host_notifier);
bdrv_io_plug(s->blk->conf.bs);
for (;;) {
MultiReqBuffer mrb = {
.num_writes = 0,
};
int ret;
/* Disable guest->host notifies to avoid unnecessary vmexits */
vring_disable_notification(s->vdev, &s->vring);
for (;;) {
ret = vring_pop(s->vdev, &s->vring, &elem);
VirtIOBlockReq *req = virtio_blk_alloc_request(vblk);
ret = vring_pop(s->vdev, &s->vring, &req->elem);
if (ret < 0) {
assert(elem == NULL);
virtio_blk_free_request(req);
break; /* no more requests */
}
trace_virtio_blk_data_plane_process_request(s, elem->out_num,
elem->in_num, elem->index);
trace_virtio_blk_data_plane_process_request(s, req->elem.out_num,
req->elem.in_num,
req->elem.index);
if (process_request(s, elem) < 0) {
vring_set_broken(&s->vring);
vring_free_element(elem);
ret = -EFAULT;
break;
}
virtio_blk_handle_request(req, &mrb);
}
virtio_submit_multiwrite(s->blk->conf.bs, &mrb);
if (likely(ret == -EAGAIN)) { /* vring emptied */
/* Re-enable guest->host notifies and stop processing the vring.
* But if the guest has snuck in more descriptors, keep processing.
@@ -322,6 +132,7 @@ static void handle_notify(EventNotifier *e)
break;
}
}
bdrv_io_unplug(s->blk->conf.bs);
}
/* Context: QEMU global mutex held */
@@ -331,10 +142,20 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
{
VirtIOBlockDataPlane *s;
Error *local_err = NULL;
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
*dataplane = NULL;
if (!blk->data_plane) {
if (!blk->data_plane && !blk->iothread) {
return;
}
/* Don't try if transport does not support notifiers. */
if (!k->set_guest_notifiers || !k->set_host_notifier) {
error_setg(errp,
"device is incompatible with x-data-plane "
"(transport does not support notifiers)");
return;
}
@@ -367,6 +188,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
s->iothread = &s->internal_iothread_obj;
}
s->ctx = iothread_get_aio_context(s->iothread);
s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
error_setg(&s->blocker, "block device is in use by data plane");
bdrv_op_block_all(blk->conf.bs, s->blocker);
@@ -385,6 +207,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
bdrv_op_unblock_all(s->blk->conf.bs, s->blocker);
error_free(s->blocker);
object_unref(OBJECT(s->iothread));
qemu_bh_delete(s->bh);
g_free(s);
}
@@ -393,6 +216,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
{
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
VirtQueue *vq;
if (s->started) {
@@ -426,6 +250,9 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
}
s->host_notifier = *virtio_queue_get_host_notifier(vq);
s->saved_complete_request = vblk->complete_request;
vblk->complete_request = complete_request_vring;
s->starting = false;
s->started = true;
trace_virtio_blk_data_plane_start(s);
@@ -446,10 +273,12 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
{
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
if (!s->started || s->stopping) {
return;
}
s->stopping = true;
vblk->complete_request = s->saved_complete_request;
trace_virtio_blk_data_plane_stop(s);
aio_context_acquire(s->ctx);

View File

@@ -12,6 +12,7 @@
*/
#include "qemu-common.h"
#include "qemu/iov.h"
#include "qemu/error-report.h"
#include "trace.h"
#include "hw/block/block.h"
@@ -26,19 +27,26 @@
# include <scsi/sg.h>
#endif
#include "hw/virtio/virtio-bus.h"
#include "hw/virtio/virtio-access.h"
typedef struct VirtIOBlockReq
VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
{
VirtIOBlock *dev;
VirtQueueElement elem;
struct virtio_blk_inhdr *in;
struct virtio_blk_outhdr *out;
QEMUIOVector qiov;
struct VirtIOBlockReq *next;
BlockAcctCookie acct;
} VirtIOBlockReq;
VirtIOBlockReq *req = g_slice_new(VirtIOBlockReq);
req->dev = s;
req->qiov.size = 0;
req->next = NULL;
return req;
}
static void virtio_blk_req_complete(VirtIOBlockReq *req, int status)
void virtio_blk_free_request(VirtIOBlockReq *req)
{
if (req) {
g_slice_free(VirtIOBlockReq, req);
}
}
static void virtio_blk_complete_request(VirtIOBlockReq *req,
unsigned char status)
{
VirtIOBlock *s = req->dev;
VirtIODevice *vdev = VIRTIO_DEVICE(s);
@@ -50,6 +58,11 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, int status)
virtio_notify(vdev, s->vq);
}
static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
{
req->dev->complete_request(req, status);
}
static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
bool is_read)
{
@@ -62,7 +75,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
} else if (action == BLOCK_ERROR_ACTION_REPORT) {
virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
bdrv_acct_done(s->bs, &req->acct);
g_free(req);
virtio_blk_free_request(req);
}
bdrv_error_action(s->bs, action, is_read, error);
@@ -76,14 +89,15 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
trace_virtio_blk_rw_complete(req, ret);
if (ret) {
bool is_read = !(ldl_p(&req->out->type) & VIRTIO_BLK_T_OUT);
int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
bool is_read = !(p & VIRTIO_BLK_T_OUT);
if (virtio_blk_handle_rw_error(req, -ret, is_read))
return;
}
virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
bdrv_acct_done(req->dev->bs, &req->acct);
g_free(req);
virtio_blk_free_request(req);
}
static void virtio_blk_flush_complete(void *opaque, int ret)
@@ -98,27 +112,16 @@ static void virtio_blk_flush_complete(void *opaque, int ret)
virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
bdrv_acct_done(req->dev->bs, &req->acct);
g_free(req);
}
static VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
{
VirtIOBlockReq *req = g_malloc(sizeof(*req));
req->dev = s;
req->qiov.size = 0;
req->next = NULL;
return req;
virtio_blk_free_request(req);
}
static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
{
VirtIOBlockReq *req = virtio_blk_alloc_request(s);
if (req != NULL) {
if (!virtqueue_pop(s->vq, &req->elem)) {
g_free(req);
return NULL;
}
if (!virtqueue_pop(s->vq, &req->elem)) {
virtio_blk_free_request(req);
return NULL;
}
return req;
@@ -129,6 +132,8 @@ int virtio_blk_handle_scsi_req(VirtIOBlock *blk,
{
int status = VIRTIO_BLK_S_OK;
struct virtio_scsi_inhdr *scsi = NULL;
VirtIODevice *vdev = VIRTIO_DEVICE(blk);
#ifdef __linux__
int i;
struct sg_io_hdr hdr;
@@ -223,12 +228,12 @@ int virtio_blk_handle_scsi_req(VirtIOBlock *blk,
hdr.status = CHECK_CONDITION;
}
stl_p(&scsi->errors,
hdr.status | (hdr.msg_status << 8) |
(hdr.host_status << 16) | (hdr.driver_status << 24));
stl_p(&scsi->residual, hdr.resid);
stl_p(&scsi->sense_len, hdr.sb_len_wr);
stl_p(&scsi->data_len, hdr.dxfer_len);
virtio_stl_p(vdev, &scsi->errors,
hdr.status | (hdr.msg_status << 8) |
(hdr.host_status << 16) | (hdr.driver_status << 24));
virtio_stl_p(vdev, &scsi->residual, hdr.resid);
virtio_stl_p(vdev, &scsi->sense_len, hdr.sb_len_wr);
virtio_stl_p(vdev, &scsi->data_len, hdr.dxfer_len);
return status;
#else
@@ -238,7 +243,7 @@ int virtio_blk_handle_scsi_req(VirtIOBlock *blk,
fail:
/* Just put anything nonzero so that the ioctl fails in the guest. */
if (scsi) {
stl_p(&scsi->errors, 255);
virtio_stl_p(vdev, &scsi->errors, 255);
}
return status;
}
@@ -249,15 +254,10 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
status = virtio_blk_handle_scsi_req(req->dev, &req->elem);
virtio_blk_req_complete(req, status);
g_free(req);
virtio_blk_free_request(req);
}
typedef struct MultiReqBuffer {
BlockRequest blkreq[32];
unsigned int num_writes;
} MultiReqBuffer;
static void virtio_submit_multiwrite(BlockDriverState *bs, MultiReqBuffer *mrb)
void virtio_submit_multiwrite(BlockDriverState *bs, MultiReqBuffer *mrb)
{
int i, ret;
@@ -288,26 +288,42 @@ static void virtio_blk_handle_flush(VirtIOBlockReq *req, MultiReqBuffer *mrb)
bdrv_aio_flush(req->dev->bs, virtio_blk_flush_complete, req);
}
static bool virtio_blk_sect_range_ok(VirtIOBlock *dev,
uint64_t sector, size_t size)
{
uint64_t nb_sectors = size >> BDRV_SECTOR_BITS;
uint64_t total_sectors;
if (sector & dev->sector_mask) {
return false;
}
if (size % dev->conf->logical_block_size) {
return false;
}
bdrv_get_geometry(dev->bs, &total_sectors);
if (sector > total_sectors || nb_sectors > total_sectors - sector) {
return false;
}
return true;
}
static void virtio_blk_handle_write(VirtIOBlockReq *req, MultiReqBuffer *mrb)
{
BlockRequest *blkreq;
uint64_t sector;
sector = ldq_p(&req->out->sector);
bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_WRITE);
sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out.sector);
trace_virtio_blk_handle_write(req, sector, req->qiov.size / 512);
if (sector & req->dev->sector_mask) {
virtio_blk_rw_complete(req, -EIO);
return;
}
if (req->qiov.size % req->dev->conf->logical_block_size) {
virtio_blk_rw_complete(req, -EIO);
if (!virtio_blk_sect_range_ok(req->dev, sector, req->qiov.size)) {
virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
virtio_blk_free_request(req);
return;
}
bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_WRITE);
if (mrb->num_writes == 32) {
virtio_submit_multiwrite(req->dev->bs, mrb);
}
@@ -327,45 +343,55 @@ static void virtio_blk_handle_read(VirtIOBlockReq *req)
{
uint64_t sector;
sector = ldq_p(&req->out->sector);
bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_READ);
sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out.sector);
trace_virtio_blk_handle_read(req, sector, req->qiov.size / 512);
if (sector & req->dev->sector_mask) {
virtio_blk_rw_complete(req, -EIO);
return;
}
if (req->qiov.size % req->dev->conf->logical_block_size) {
virtio_blk_rw_complete(req, -EIO);
if (!virtio_blk_sect_range_ok(req->dev, sector, req->qiov.size)) {
virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
virtio_blk_free_request(req);
return;
}
bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_READ);
bdrv_aio_readv(req->dev->bs, sector, &req->qiov,
req->qiov.size / BDRV_SECTOR_SIZE,
virtio_blk_rw_complete, req);
}
static void virtio_blk_handle_request(VirtIOBlockReq *req,
MultiReqBuffer *mrb)
void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
{
uint32_t type;
struct iovec *in_iov = req->elem.in_sg;
struct iovec *iov = req->elem.out_sg;
unsigned in_num = req->elem.in_num;
unsigned out_num = req->elem.out_num;
if (req->elem.out_num < 1 || req->elem.in_num < 1) {
error_report("virtio-blk missing headers");
exit(1);
}
if (req->elem.out_sg[0].iov_len < sizeof(*req->out) ||
req->elem.in_sg[req->elem.in_num - 1].iov_len < sizeof(*req->in)) {
error_report("virtio-blk header not in correct element");
if (unlikely(iov_to_buf(iov, out_num, 0, &req->out,
sizeof(req->out)) != sizeof(req->out))) {
error_report("virtio-blk request outhdr too short");
exit(1);
}
req->out = (void *)req->elem.out_sg[0].iov_base;
req->in = (void *)req->elem.in_sg[req->elem.in_num - 1].iov_base;
iov_discard_front(&iov, &out_num, sizeof(req->out));
type = ldl_p(&req->out->type);
if (in_num < 1 ||
in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
error_report("virtio-blk request inhdr too short");
exit(1);
}
req->in = (void *)in_iov[in_num - 1].iov_base
+ in_iov[in_num - 1].iov_len
- sizeof(struct virtio_blk_inhdr);
iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
if (type & VIRTIO_BLK_T_FLUSH) {
virtio_blk_handle_flush(req, mrb);
@@ -382,7 +408,7 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req,
s->blk.serial ? s->blk.serial : "",
MIN(req->elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES));
virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
g_free(req);
virtio_blk_free_request(req);
} else if (type & VIRTIO_BLK_T_OUT) {
qemu_iovec_init_external(&req->qiov, &req->elem.out_sg[1],
req->elem.out_num - 1);
@@ -394,7 +420,7 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req,
virtio_blk_handle_read(req);
} else {
virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
g_free(req);
virtio_blk_free_request(req);
}
}
@@ -443,8 +469,9 @@ static void virtio_blk_dma_restart_bh(void *opaque)
s->rq = NULL;
while (req) {
VirtIOBlockReq *next = req->next;
virtio_blk_handle_request(req, &mrb);
req = req->next;
req = next;
}
virtio_submit_multiwrite(s->bs, &mrb);
@@ -460,7 +487,8 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running,
}
if (!s->bh) {
s->bh = qemu_bh_new(virtio_blk_dma_restart_bh, s);
s->bh = aio_bh_new(bdrv_get_aio_context(s->blk.conf.bs),
virtio_blk_dma_restart_bh, s);
qemu_bh_schedule(s->bh);
}
}
@@ -494,12 +522,12 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
bdrv_get_geometry(s->bs, &capacity);
memset(&blkcfg, 0, sizeof(blkcfg));
stq_p(&blkcfg.capacity, capacity);
stl_p(&blkcfg.seg_max, 128 - 2);
stw_p(&blkcfg.cylinders, s->conf->cyls);
stl_p(&blkcfg.blk_size, blk_size);
stw_p(&blkcfg.min_io_size, s->conf->min_io_size / blk_size);
stw_p(&blkcfg.opt_io_size, s->conf->opt_io_size / blk_size);
virtio_stq_p(vdev, &blkcfg.capacity, capacity);
virtio_stl_p(vdev, &blkcfg.seg_max, 128 - 2);
virtio_stw_p(vdev, &blkcfg.cylinders, s->conf->cyls);
virtio_stl_p(vdev, &blkcfg.blk_size, blk_size);
virtio_stw_p(vdev, &blkcfg.min_io_size, s->conf->min_io_size / blk_size);
virtio_stw_p(vdev, &blkcfg.opt_io_size, s->conf->opt_io_size / blk_size);
blkcfg.heads = s->conf->heads;
/*
* We must ensure that the block device capacity is a multiple of
@@ -601,15 +629,20 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
static void virtio_blk_save(QEMUFile *f, void *opaque)
{
VirtIOBlock *s = opaque;
VirtIODevice *vdev = VIRTIO_DEVICE(s);
VirtIOBlockReq *req = s->rq;
VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
virtio_save(vdev, f);
}
static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
{
VirtIOBlock *s = VIRTIO_BLK(vdev);
VirtIOBlockReq *req = s->rq;
while (req) {
qemu_put_sbyte(f, 1);
qemu_put_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));
qemu_put_buffer(f, (unsigned char *)&req->elem,
sizeof(VirtQueueElement));
req = req->next;
}
qemu_put_sbyte(f, 0);
@@ -619,19 +652,22 @@ static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
{
VirtIOBlock *s = opaque;
VirtIODevice *vdev = VIRTIO_DEVICE(s);
int ret;
if (version_id != 2)
return -EINVAL;
ret = virtio_load(vdev, f);
if (ret) {
return ret;
}
return virtio_load(vdev, f, version_id);
}
static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
int version_id)
{
VirtIOBlock *s = VIRTIO_BLK(vdev);
while (qemu_get_sbyte(f)) {
VirtIOBlockReq *req = virtio_blk_alloc_request(s);
qemu_get_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));
qemu_get_buffer(f, (unsigned char *)&req->elem,
sizeof(VirtQueueElement));
req->next = s->rq;
s->rq = req;
@@ -655,12 +691,6 @@ static const BlockDevOps virtio_block_ops = {
.resize_cb = virtio_blk_resize,
};
void virtio_blk_set_conf(DeviceState *dev, VirtIOBlkConf *blk)
{
VirtIOBlock *s = VIRTIO_BLK(dev);
memcpy(&(s->blk), blk, sizeof(struct VirtIOBlkConf));
}
#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
/* Disable dataplane thread during live migration since it does not
* update the dirty memory bitmap yet.
@@ -729,6 +759,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
s->sector_mask = (s->conf->logical_block_size / BDRV_SECTOR_SIZE) - 1;
s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
s->complete_request = virtio_blk_complete_request;
#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
virtio_blk_data_plane_create(vdev, blk, &s->dataplane, &err);
if (err != NULL) {
@@ -767,8 +798,27 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
virtio_cleanup(vdev);
}
static void virtio_blk_instance_init(Object *obj)
{
VirtIOBlock *s = VIRTIO_BLK(obj);
object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
(Object **)&s->blk.iothread,
qdev_prop_allow_set_link_before_realize,
OBJ_PROP_LINK_UNREF_ON_RELEASE, NULL);
}
static Property virtio_blk_properties[] = {
DEFINE_VIRTIO_BLK_PROPERTIES(VirtIOBlock, blk),
DEFINE_BLOCK_PROPERTIES(VirtIOBlock, blk.conf),
DEFINE_BLOCK_CHS_PROPERTIES(VirtIOBlock, blk.conf),
DEFINE_PROP_STRING("serial", VirtIOBlock, blk.serial),
DEFINE_PROP_BIT("config-wce", VirtIOBlock, blk.config_wce, 0, true),
#ifdef __linux__
DEFINE_PROP_BIT("scsi", VirtIOBlock, blk.scsi, 0, true),
#endif
#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
DEFINE_PROP_BIT("x-data-plane", VirtIOBlock, blk.data_plane, 0, false),
#endif
DEFINE_PROP_END_OF_LIST(),
};
@@ -786,12 +836,15 @@ static void virtio_blk_class_init(ObjectClass *klass, void *data)
vdc->get_features = virtio_blk_get_features;
vdc->set_status = virtio_blk_set_status;
vdc->reset = virtio_blk_reset;
vdc->save = virtio_blk_save_device;
vdc->load = virtio_blk_load_device;
}
static const TypeInfo virtio_device_info = {
.name = TYPE_VIRTIO_BLK,
.parent = TYPE_VIRTIO_DEVICE,
.instance_size = sizeof(VirtIOBlock),
.instance_init = virtio_blk_instance_init,
.class_init = virtio_blk_class_init,
};

View File

@@ -175,8 +175,10 @@ static void uart_send_breaks(UartState *s)
{
int break_enabled = 1;
qemu_chr_fe_ioctl(s->chr, CHR_IOCTL_SERIAL_SET_BREAK,
&break_enabled);
if (s->chr) {
qemu_chr_fe_ioctl(s->chr, CHR_IOCTL_SERIAL_SET_BREAK,
&break_enabled);
}
}
static void uart_parameters_setup(UartState *s)
@@ -227,7 +229,9 @@ static void uart_parameters_setup(UartState *s)
packet_size += ssp.data_bits + ssp.stop_bits;
s->char_tx_time = (get_ticks_per_sec() / ssp.speed) * packet_size;
qemu_chr_fe_ioctl(s->chr, CHR_IOCTL_SERIAL_SET_PARAMS, &ssp);
if (s->chr) {
qemu_chr_fe_ioctl(s->chr, CHR_IOCTL_SERIAL_SET_PARAMS, &ssp);
}
}
static int uart_can_receive(void *opaque)
@@ -295,6 +299,7 @@ static gboolean cadence_uart_xmit(GIOChannel *chan, GIOCondition cond,
/* instant drain the fifo when there's no back-end */
if (!s->chr) {
s->tx_count = 0;
return FALSE;
}
if (!s->tx_count) {
@@ -306,7 +311,8 @@ static gboolean cadence_uart_xmit(GIOChannel *chan, GIOCondition cond,
memmove(s->tx_fifo, s->tx_fifo + ret, s->tx_count);
if (s->tx_count) {
int r = qemu_chr_fe_add_watch(s->chr, G_IO_OUT, cadence_uart_xmit, s);
int r = qemu_chr_fe_add_watch(s->chr, G_IO_OUT|G_IO_HUP,
cadence_uart_xmit, s);
assert(r);
}
@@ -374,7 +380,9 @@ static void uart_read_rx_fifo(UartState *s, uint32_t *c)
*c = s->rx_fifo[rx_rpos];
s->rx_count--;
qemu_chr_accept_input(s->chr);
if (s->chr) {
qemu_chr_accept_input(s->chr);
}
} else {
*c = 0;
}

View File

@@ -148,11 +148,12 @@ static void multi_serial_pci_exit(PCIDevice *dev)
for (i = 0; i < pci->ports; i++) {
s = pci->state + i;
serial_exit_core(s);
memory_region_del_subregion(&pci->iobar, &s->io);
memory_region_destroy(&s->io);
g_free(pci->name[i]);
}
memory_region_destroy(&pci->iobar);
qemu_free_irqs(pci->irqs);
qemu_free_irqs(pci->irqs, pci->ports);
}
static const VMStateDescription vmstate_pci_serial = {

View File

@@ -223,37 +223,42 @@ static gboolean serial_xmit(GIOChannel *chan, GIOCondition cond, void *opaque)
{
SerialState *s = opaque;
if (s->tsr_retry <= 0) {
if (s->fcr & UART_FCR_FE) {
if (fifo8_is_empty(&s->xmit_fifo)) {
do {
if (s->tsr_retry <= 0) {
if (s->fcr & UART_FCR_FE) {
if (fifo8_is_empty(&s->xmit_fifo)) {
return FALSE;
}
s->tsr = fifo8_pop(&s->xmit_fifo);
if (!s->xmit_fifo.num) {
s->lsr |= UART_LSR_THRE;
}
} else if ((s->lsr & UART_LSR_THRE)) {
return FALSE;
} else {
s->tsr = s->thr;
s->lsr |= UART_LSR_THRE;
s->lsr &= ~UART_LSR_TEMT;
}
}
if (s->mcr & UART_MCR_LOOP) {
/* in loopback mode, say that we just received a char */
serial_receive1(s, &s->tsr, 1);
} else if (qemu_chr_fe_write(s->chr, &s->tsr, 1) != 1) {
if (s->tsr_retry >= 0 && s->tsr_retry < MAX_XMIT_RETRY &&
qemu_chr_fe_add_watch(s->chr, G_IO_OUT|G_IO_HUP,
serial_xmit, s) > 0) {
s->tsr_retry++;
return FALSE;
}
s->tsr = fifo8_pop(&s->xmit_fifo);
if (!s->xmit_fifo.num) {
s->lsr |= UART_LSR_THRE;
}
} else if ((s->lsr & UART_LSR_THRE)) {
return FALSE;
s->tsr_retry = 0;
} else {
s->tsr = s->thr;
s->lsr |= UART_LSR_THRE;
s->lsr &= ~UART_LSR_TEMT;
s->tsr_retry = 0;
}
}
if (s->mcr & UART_MCR_LOOP) {
/* in loopback mode, say that we just received a char */
serial_receive1(s, &s->tsr, 1);
} else if (qemu_chr_fe_write(s->chr, &s->tsr, 1) != 1) {
if (s->tsr_retry >= 0 && s->tsr_retry < MAX_XMIT_RETRY &&
qemu_chr_fe_add_watch(s->chr, G_IO_OUT, serial_xmit, s) > 0) {
s->tsr_retry++;
return FALSE;
}
s->tsr_retry = 0;
} else {
s->tsr_retry = 0;
}
/* Transmit another byte if it is already available. It is only
possible when FIFO is enabled and not empty. */
} while ((s->fcr & UART_FCR_FE) && !fifo8_is_empty(&s->xmit_fifo));
s->last_xmit_ts = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
@@ -293,7 +298,9 @@ static void serial_ioport_write(void *opaque, hwaddr addr, uint64_t val,
s->thr_ipending = 0;
s->lsr &= ~UART_LSR_THRE;
serial_update_irq(s);
serial_xmit(NULL, G_IO_OUT, s);
if (s->tsr_retry <= 0) {
serial_xmit(NULL, G_IO_OUT, s);
}
}
break;
case 1:

View File

@@ -14,6 +14,7 @@
#include "qemu/error-report.h"
#include "trace.h"
#include "hw/virtio/virtio-serial.h"
#include "qapi-event.h"
#define TYPE_VIRTIO_CONSOLE_SERIAL_PORT "virtserialport"
#define VIRTIO_CONSOLE(obj) \
@@ -69,7 +70,8 @@ static ssize_t flush_buf(VirtIOSerialPort *port,
if (!k->is_console) {
virtio_serial_throttle_port(port, true);
if (!vcon->watch) {
vcon->watch = qemu_chr_fe_add_watch(vcon->chr, G_IO_OUT,
vcon->watch = qemu_chr_fe_add_watch(vcon->chr,
G_IO_OUT|G_IO_HUP,
chr_write_unblocked, vcon);
}
}
@@ -81,11 +83,16 @@ static ssize_t flush_buf(VirtIOSerialPort *port,
static void set_guest_connected(VirtIOSerialPort *port, int guest_connected)
{
VirtConsole *vcon = VIRTIO_CONSOLE(port);
DeviceState *dev = DEVICE(port);
if (!vcon->chr) {
return;
if (vcon->chr) {
qemu_chr_fe_set_open(vcon->chr, guest_connected);
}
if (dev->id) {
qapi_event_send_vserport_change(dev->id, guest_connected,
&error_abort);
}
qemu_chr_fe_set_open(vcon->chr, guest_connected);
}
/* Readiness of the guest to accept data on a port */

View File

@@ -24,6 +24,7 @@
#include "hw/sysbus.h"
#include "trace.h"
#include "hw/virtio/virtio-serial.h"
#include "hw/virtio/virtio-access.h"
static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
{
@@ -183,11 +184,12 @@ static size_t send_control_msg(VirtIOSerial *vser, void *buf, size_t len)
static size_t send_control_event(VirtIOSerial *vser, uint32_t port_id,
uint16_t event, uint16_t value)
{
VirtIODevice *vdev = VIRTIO_DEVICE(vser);
struct virtio_console_control cpkt;
stl_p(&cpkt.id, port_id);
stw_p(&cpkt.event, event);
stw_p(&cpkt.value, value);
virtio_stl_p(vdev, &cpkt.id, port_id);
virtio_stw_p(vdev, &cpkt.event, event);
virtio_stw_p(vdev, &cpkt.value, value);
trace_virtio_serial_send_control_event(port_id, event, value);
return send_control_msg(vser, &cpkt, sizeof(cpkt));
@@ -278,6 +280,7 @@ void virtio_serial_throttle_port(VirtIOSerialPort *port, bool throttle)
/* Guest wants to notify us of some event */
static void handle_control_message(VirtIOSerial *vser, void *buf, size_t len)
{
VirtIODevice *vdev = VIRTIO_DEVICE(vser);
struct VirtIOSerialPort *port;
VirtIOSerialPortClass *vsc;
struct virtio_console_control cpkt, *gcpkt;
@@ -291,8 +294,8 @@ static void handle_control_message(VirtIOSerial *vser, void *buf, size_t len)
return;
}
cpkt.event = lduw_p(&gcpkt->event);
cpkt.value = lduw_p(&gcpkt->value);
cpkt.event = virtio_lduw_p(vdev, &gcpkt->event);
cpkt.value = virtio_lduw_p(vdev, &gcpkt->value);
trace_virtio_serial_handle_control_message(cpkt.event, cpkt.value);
@@ -312,10 +315,10 @@ static void handle_control_message(VirtIOSerial *vser, void *buf, size_t len)
return;
}
port = find_port_by_id(vser, ldl_p(&gcpkt->id));
port = find_port_by_id(vser, virtio_ldl_p(vdev, &gcpkt->id));
if (!port) {
error_report("virtio-serial-bus: Unexpected port id %u for device %s",
ldl_p(&gcpkt->id), vser->bus.qbus.name);
virtio_ldl_p(vdev, &gcpkt->id), vser->bus.qbus.name);
return;
}
@@ -342,9 +345,9 @@ static void handle_control_message(VirtIOSerial *vser, void *buf, size_t len)
}
if (port->name) {
stl_p(&cpkt.id, port->id);
stw_p(&cpkt.event, VIRTIO_CONSOLE_PORT_NAME);
stw_p(&cpkt.value, 1);
virtio_stl_p(vdev, &cpkt.id, port->id);
virtio_stw_p(vdev, &cpkt.event, VIRTIO_CONSOLE_PORT_NAME);
virtio_stw_p(vdev, &cpkt.value, 1);
buffer_len = sizeof(cpkt) + strlen(port->name) + 1;
buffer = g_malloc(buffer_len);
@@ -510,18 +513,25 @@ static void vser_reset(VirtIODevice *vdev)
vser = VIRTIO_SERIAL(vdev);
guest_reset(vser);
/* In case we have switched endianness */
vser->config.max_nr_ports =
virtio_tswap32(vdev, vser->serial.max_virtserial_ports);
}
static void virtio_serial_save(QEMUFile *f, void *opaque)
{
VirtIOSerial *s = VIRTIO_SERIAL(opaque);
/* The virtio device */
virtio_save(VIRTIO_DEVICE(opaque), f);
}
static void virtio_serial_save_device(VirtIODevice *vdev, QEMUFile *f)
{
VirtIOSerial *s = VIRTIO_SERIAL(vdev);
VirtIOSerialPort *port;
uint32_t nr_active_ports;
unsigned int i, max_nr_ports;
/* The virtio device */
virtio_save(VIRTIO_DEVICE(s), f);
/* The config space */
qemu_put_be16s(f, &s->config.cols);
qemu_put_be16s(f, &s->config.rows);
@@ -529,7 +539,7 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
qemu_put_be32s(f, &s->config.max_nr_ports);
/* The ports map */
max_nr_ports = tswap32(s->config.max_nr_ports);
max_nr_ports = virtio_tswap32(vdev, s->config.max_nr_ports);
for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
qemu_put_be32s(f, &s->ports_map[i]);
}
@@ -659,36 +669,39 @@ static int fetch_active_ports_list(QEMUFile *f, int version_id,
static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
{
VirtIOSerial *s = VIRTIO_SERIAL(opaque);
uint32_t max_nr_ports, nr_active_ports, ports_map;
unsigned int i;
int ret;
if (version_id > 3) {
return -EINVAL;
}
/* The virtio device */
ret = virtio_load(VIRTIO_DEVICE(s), f);
if (ret) {
return ret;
}
return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
}
static int virtio_serial_load_device(VirtIODevice *vdev, QEMUFile *f,
int version_id)
{
VirtIOSerial *s = VIRTIO_SERIAL(vdev);
uint32_t max_nr_ports, nr_active_ports, ports_map;
unsigned int i;
int ret;
uint32_t tmp;
if (version_id < 2) {
return 0;
}
/* The config space */
qemu_get_be16s(f, &s->config.cols);
qemu_get_be16s(f, &s->config.rows);
qemu_get_be32s(f, &max_nr_ports);
tswap32s(&max_nr_ports);
if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
/* Source could have had more ports than us. Fail migration. */
return -EINVAL;
}
/* Unused */
qemu_get_be16s(f, (uint16_t *) &tmp);
qemu_get_be16s(f, (uint16_t *) &tmp);
qemu_get_be32s(f, &tmp);
/* Note: this is the only location where we use tswap32() instead of
* virtio_tswap32() because:
* - virtio_tswap32() only makes sense when the device is fully restored
* - the target endianness that was used to populate s->config is
* necessarly the default one
*/
max_nr_ports = tswap32(s->config.max_nr_ports);
for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
qemu_get_be32s(f, &ports_map);
@@ -751,9 +764,10 @@ static void virtser_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent)
/* This function is only used if a port id is not provided by the user */
static uint32_t find_free_port_id(VirtIOSerial *vser)
{
VirtIODevice *vdev = VIRTIO_DEVICE(vser);
unsigned int i, max_nr_ports;
max_nr_ports = tswap32(vser->config.max_nr_ports);
max_nr_ports = virtio_tswap32(vdev, vser->config.max_nr_ports);
for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
uint32_t map, bit;
@@ -783,10 +797,18 @@ static void add_port(VirtIOSerial *vser, uint32_t port_id)
static void remove_port(VirtIOSerial *vser, uint32_t port_id)
{
VirtIOSerialPort *port;
unsigned int i;
i = port_id / 32;
vser->ports_map[i] &= ~(1U << (port_id % 32));
/*
* Don't mark port 0 removed -- we explicitly reserve it for
* backward compat with older guests, ensure a virtconsole device
* unplug retains the reservation.
*/
if (port_id) {
unsigned int i;
i = port_id / 32;
vser->ports_map[i] &= ~(1U << (port_id % 32));
}
port = find_port_by_id(vser, port_id);
/*
@@ -806,6 +828,7 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
VirtIOSerialPort *port = VIRTIO_SERIAL_PORT(dev);
VirtIOSerialPortClass *vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
VirtIOSerialBus *bus = VIRTIO_SERIAL_BUS(qdev_get_parent_bus(dev));
VirtIODevice *vdev = VIRTIO_DEVICE(bus->vser);
int max_nr_ports;
bool plugging_port0;
Error *err = NULL;
@@ -841,7 +864,7 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
}
}
max_nr_ports = tswap32(port->vser->config.max_nr_ports);
max_nr_ports = virtio_tswap32(vdev, port->vser->config.max_nr_ports);
if (port->id >= max_nr_ports) {
error_setg(errp, "virtio-serial-bus: Out-of-range port id specified, "
"max. allowed: %u", max_nr_ports - 1);
@@ -863,7 +886,7 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
add_port(port->vser, port->id);
/* Send an update to the guest about this new port added */
virtio_notify_config(VIRTIO_DEVICE(port->vser));
virtio_notify_config(vdev);
}
static void virtser_port_device_unrealize(DeviceState *dev, Error **errp)
@@ -942,7 +965,8 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
vser->ovqs[i] = virtio_add_queue(vdev, 128, handle_output);
}
vser->config.max_nr_ports = tswap32(vser->serial.max_virtserial_ports);
vser->config.max_nr_ports =
virtio_tswap32(vdev, vser->serial.max_virtserial_ports);
vser->ports_map = g_malloc0(((vser->serial.max_virtserial_ports + 31) / 32)
* sizeof(vser->ports_map[0]));
/*
@@ -1019,6 +1043,8 @@ static void virtio_serial_class_init(ObjectClass *klass, void *data)
vdc->get_config = get_config;
vdc->set_status = set_status;
vdc->reset = vser_reset;
vdc->save = virtio_serial_save_device;
vdc->load = virtio_serial_load_device;
}
static const TypeInfo virtio_device_info = {

View File

@@ -23,8 +23,13 @@
*/
#include "qemu-common.h"
#include "hw/irq.h"
#include "qom/object.h"
#define IRQ(obj) OBJECT_CHECK(struct IRQState, (obj), TYPE_IRQ)
struct IRQState {
Object parent_obj;
qemu_irq_handler handler;
void *opaque;
int n;
@@ -42,23 +47,14 @@ qemu_irq *qemu_extend_irqs(qemu_irq *old, int n_old, qemu_irq_handler handler,
void *opaque, int n)
{
qemu_irq *s;
struct IRQState *p;
int i;
if (!old) {
n_old = 0;
}
s = old ? g_renew(qemu_irq, old, n + n_old) : g_new(qemu_irq, n);
p = old ? g_renew(struct IRQState, s[0], n + n_old) :
g_new(struct IRQState, n);
for (i = 0; i < n + n_old; i++) {
if (i >= n_old) {
p->handler = handler;
p->opaque = opaque;
p->n = i;
}
s[i] = p;
p++;
for (i = n_old; i < n + n_old; i++) {
s[i] = qemu_allocate_irq(handler, opaque, i);
}
return s;
}
@@ -72,7 +68,7 @@ qemu_irq qemu_allocate_irq(qemu_irq_handler handler, void *opaque, int n)
{
struct IRQState *irq;
irq = g_new(struct IRQState, 1);
irq = IRQ(object_new(TYPE_IRQ));
irq->handler = handler;
irq->opaque = opaque;
irq->n = n;
@@ -80,15 +76,18 @@ qemu_irq qemu_allocate_irq(qemu_irq_handler handler, void *opaque, int n)
return irq;
}
void qemu_free_irqs(qemu_irq *s)
void qemu_free_irqs(qemu_irq *s, int n)
{
g_free(s[0]);
int i;
for (i = 0; i < n; i++) {
qemu_free_irq(s[i]);
}
g_free(s);
}
void qemu_free_irq(qemu_irq irq)
{
g_free(irq);
object_unref(OBJECT(irq));
}
static void qemu_notirq(void *opaque, int line, int level)
@@ -102,7 +101,7 @@ qemu_irq qemu_irq_invert(qemu_irq irq)
{
/* The default state for IRQs is low, so raise the output now. */
qemu_irq_raise(irq);
return qemu_allocate_irqs(qemu_notirq, irq, 1)[0];
return qemu_allocate_irq(qemu_notirq, irq, 0);
}
static void qemu_splitirq(void *opaque, int line, int level)
@@ -117,7 +116,7 @@ qemu_irq qemu_irq_split(qemu_irq irq1, qemu_irq irq2)
qemu_irq *s = g_malloc0(2 * sizeof(qemu_irq));
s[0] = irq1;
s[1] = irq2;
return qemu_allocate_irqs(qemu_splitirq, s, 1)[0];
return qemu_allocate_irq(qemu_splitirq, s, 0);
}
static void proxy_irq_handler(void *opaque, int n, int level)
@@ -150,3 +149,16 @@ void qemu_irq_intercept_out(qemu_irq **gpio_out, qemu_irq_handler handler, int n
qemu_irq *old_irqs = *gpio_out;
*gpio_out = qemu_allocate_irqs(handler, old_irqs, n);
}
static const TypeInfo irq_type_info = {
.name = TYPE_IRQ,
.parent = TYPE_OBJECT,
.instance_size = sizeof(struct IRQState),
};
static void irq_register_types(void)
{
type_register_static(&irq_type_info);
}
type_init(irq_register_types)

View File

@@ -239,11 +239,11 @@ static void machine_initfn(Object *obj)
{
object_property_add_str(obj, "accel",
machine_get_accel, machine_set_accel, NULL);
object_property_add_bool(obj, "kernel_irqchip",
object_property_add_bool(obj, "kernel-irqchip",
machine_get_kernel_irqchip,
machine_set_kernel_irqchip,
NULL);
object_property_add(obj, "kvm_shadow_mem", "int",
object_property_add(obj, "kvm-shadow-mem", "int",
machine_get_kvm_shadow_mem,
machine_set_kvm_shadow_mem,
NULL, NULL, NULL);
@@ -257,11 +257,11 @@ static void machine_initfn(Object *obj)
machine_get_dtb, machine_set_dtb, NULL);
object_property_add_str(obj, "dumpdtb",
machine_get_dumpdtb, machine_set_dumpdtb, NULL);
object_property_add(obj, "phandle_start", "int",
object_property_add(obj, "phandle-start", "int",
machine_get_phandle_start,
machine_set_phandle_start,
NULL, NULL, NULL);
object_property_add_str(obj, "dt_compatible",
object_property_add_str(obj, "dt-compatible",
machine_get_dt_compatible,
machine_set_dt_compatible,
NULL);

View File

@@ -180,7 +180,6 @@ PropertyInfo qdev_prop_chr = {
static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
{
NICPeers *peers_ptr = (NICPeers *)ptr;
NICConf *conf = container_of(peers_ptr, NICConf, peers);
NetClientState **ncs = peers_ptr->ncs;
NetClientState *peers[MAX_QUEUE_NUM];
int queues, i = 0;
@@ -219,7 +218,7 @@ static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
ncs[i]->queue_index = i;
}
conf->queues = queues;
peers_ptr->queues = queues;
return 0;
@@ -386,56 +385,6 @@ void qdev_set_nic_properties(DeviceState *dev, NICInfo *nd)
nd->instantiated = 1;
}
/* --- iothread --- */
static char *print_iothread(void *ptr)
{
return iothread_get_id(ptr);
}
static int parse_iothread(DeviceState *dev, const char *str, void **ptr)
{
IOThread *iothread;
iothread = iothread_find(str);
if (!iothread) {
return -ENOENT;
}
object_ref(OBJECT(iothread));
*ptr = iothread;
return 0;
}
static void get_iothread(Object *obj, struct Visitor *v, void *opaque,
const char *name, Error **errp)
{
get_pointer(obj, v, opaque, print_iothread, name, errp);
}
static void set_iothread(Object *obj, struct Visitor *v, void *opaque,
const char *name, Error **errp)
{
set_pointer(obj, v, opaque, parse_iothread, name, errp);
}
static void release_iothread(Object *obj, const char *name, void *opaque)
{
DeviceState *dev = DEVICE(obj);
Property *prop = opaque;
IOThread **ptr = qdev_get_prop_ptr(dev, prop);
if (*ptr) {
object_unref(OBJECT(*ptr));
}
}
PropertyInfo qdev_prop_iothread = {
.name = "iothread",
.get = get_iothread,
.set = set_iothread,
.release = release_iothread,
};
static int qdev_add_one_global(QemuOpts *opts, void *opaque)
{
GlobalProperty *g;
@@ -445,7 +394,8 @@ static int qdev_add_one_global(QemuOpts *opts, void *opaque)
g->driver = qemu_opt_get(opts, "driver");
g->property = qemu_opt_get(opts, "property");
g->value = qemu_opt_get(opts, "value");
oc = object_class_by_name(g->driver);
oc = object_class_dynamic_cast(object_class_by_name(g->driver),
TYPE_DEVICE);
if (oc) {
DeviceClass *dc = DEVICE_CLASS(oc);

View File

@@ -780,6 +780,27 @@ void qdev_property_add_static(DeviceState *dev, Property *prop,
}
}
/* @qdev_alias_all_properties - Add alias properties to the source object for
* all qdev properties on the target DeviceState.
*/
void qdev_alias_all_properties(DeviceState *target, Object *source)
{
ObjectClass *class;
Property *prop;
class = object_get_class(OBJECT(target));
do {
DeviceClass *dc = DEVICE_CLASS(class);
for (prop = dc->props; prop && prop->name; prop++) {
object_property_add_alias(source, prop->name,
OBJECT(target), prop->name,
&error_abort);
}
class = object_class_get_parent(class);
} while (class != object_class_by_name(TYPE_DEVICE));
}
static bool device_get_realized(Object *obj, Error **errp)
{
DeviceState *dev = DEVICE(obj);
@@ -848,6 +869,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
if (dev->hotplugged && local_err == NULL) {
device_reset(dev);
}
dev->pending_deleted_event = false;
} else if (!value && dev->realized) {
QLIST_FOREACH(bus, &dev->child_bus, sibling) {
object_property_set_bool(OBJECT(bus), false, "realized",
@@ -862,6 +884,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
if (dc->unrealize && local_err == NULL) {
dc->unrealize(dev, &local_err);
}
dev->pending_deleted_event = true;
}
if (local_err != NULL) {
@@ -934,7 +957,13 @@ static void device_initfn(Object *obj)
static void device_post_init(Object *obj)
{
qdev_prop_set_globals(DEVICE(obj), &error_abort);
Error *err = NULL;
qdev_prop_set_globals(DEVICE(obj), &err);
if (err) {
qerror_report_err(err);
error_free(err);
exit(EXIT_FAILURE);
}
}
/* Unlink device from bus and free the structure. */
@@ -949,7 +978,7 @@ static void device_finalize(Object *obj)
QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
QLIST_REMOVE(ngl, node);
qemu_free_irqs(ngl->in);
qemu_free_irqs(ngl->in, ngl->num_in);
g_free(ngl->name);
g_free(ngl);
/* ngl->out irqs are owned by the other end and should not be freed
@@ -972,7 +1001,6 @@ static void device_unparent(Object *obj)
{
DeviceState *dev = DEVICE(obj);
BusState *bus;
bool have_realized = dev->realized;
if (dev->realized) {
object_property_set_bool(obj, false, "realized", NULL);
@@ -988,7 +1016,7 @@ static void device_unparent(Object *obj)
}
/* Only send event if the device had been completely realized */
if (have_realized) {
if (dev->pending_deleted_event) {
gchar *path = object_get_canonical_path(OBJECT(dev));
qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);

View File

@@ -2059,7 +2059,7 @@ static void cirrus_vga_mem_write(void *opaque,
}
} else {
#ifdef DEBUG_CIRRUS
printf("cirrus: mem_writeb " TARGET_FMT_plx " value %02x\n", addr,
printf("cirrus: mem_writeb " TARGET_FMT_plx " value 0x%02" PRIu64 "\n", addr,
mem_value);
#endif
}
@@ -2594,7 +2594,7 @@ static void cirrus_vga_ioport_write(void *opaque, hwaddr addr, uint64_t val,
break;
case 0x3c5:
#ifdef DEBUG_VGA_REG
printf("vga: write SR%x = 0x%02x\n", s->sr_index, val);
printf("vga: write SR%x = 0x%02" PRIu64 "\n", s->sr_index, val);
#endif
cirrus_vga_write_sr(c, val);
break;
@@ -2619,7 +2619,7 @@ static void cirrus_vga_ioport_write(void *opaque, hwaddr addr, uint64_t val,
break;
case 0x3cf:
#ifdef DEBUG_VGA_REG
printf("vga: write GR%x = 0x%02x\n", s->gr_index, val);
printf("vga: write GR%x = 0x%02" PRIu64 "\n", s->gr_index, val);
#endif
cirrus_vga_write_gr(c, s->gr_index, val);
break;
@@ -2630,7 +2630,7 @@ static void cirrus_vga_ioport_write(void *opaque, hwaddr addr, uint64_t val,
case 0x3b5:
case 0x3d5:
#ifdef DEBUG_VGA_REG
printf("vga: write CR%x = 0x%02x\n", s->cr_index, val);
printf("vga: write CR%x = 0x%02"PRIu64"\n", s->cr_index, val);
#endif
cirrus_vga_write_cr(c, val);
break;
@@ -2911,6 +2911,14 @@ static void isa_cirrus_vga_realizefn(DeviceState *dev, Error **errp)
ISACirrusVGAState *d = ISA_CIRRUS_VGA(dev);
VGACommonState *s = &d->cirrus_vga.vga;
/* follow real hardware, cirrus card emulated has 4 MB video memory.
Also accept 8 MB/16 MB for backward compatibility. */
if (s->vram_size_mb != 4 && s->vram_size_mb != 8 &&
s->vram_size_mb != 16) {
error_setg(errp, "Invalid cirrus_vga ram size '%u'",
s->vram_size_mb);
return;
}
vga_common_init(s, OBJECT(dev), true);
cirrus_init_common(&d->cirrus_vga, OBJECT(dev), CIRRUS_ID_CLGD5430, 0,
isa_address_space(isadev),
@@ -2957,6 +2965,14 @@ static int pci_cirrus_vga_initfn(PCIDevice *dev)
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
int16_t device_id = pc->device_id;
/* follow real hardware, cirrus card emulated has 4 MB video memory.
Also accept 8 MB/16 MB for backward compatibility. */
if (s->vga.vram_size_mb != 4 && s->vga.vram_size_mb != 8 &&
s->vga.vram_size_mb != 16) {
error_report("Invalid cirrus_vga ram size '%u'",
s->vga.vram_size_mb);
return -1;
}
/* setup VGA */
vga_common_init(&s->vga, OBJECT(dev), true);
cirrus_init_common(s, OBJECT(dev), device_id, 1, pci_address_space(dev),

View File

@@ -52,8 +52,7 @@ glue(cirrus_bitblt_rop_fwd_, ROP_NAME)(CirrusVGAState *s,
dstpitch -= bltwidth;
srcpitch -= bltwidth;
if (dstpitch < 0 || srcpitch < 0) {
/* is 0 valid? srcpitch == 0 could be useful */
if (bltheight > 1 && (dstpitch < 0 || srcpitch < 0)) {
return;
}

View File

@@ -138,7 +138,9 @@ static void qxl_render_update_area_unlocked(PCIQXLDevice *qxl)
if (qemu_spice_rect_is_empty(qxl->dirty+i)) {
break;
}
if (qxl->dirty[i].left > qxl->dirty[i].right ||
if (qxl->dirty[i].left < 0 ||
qxl->dirty[i].top < 0 ||
qxl->dirty[i].left > qxl->dirty[i].right ||
qxl->dirty[i].top > qxl->dirty[i].bottom ||
qxl->dirty[i].right > qxl->guest_primary.surface.width ||
qxl->dirty[i].bottom > qxl->guest_primary.surface.height) {

View File

@@ -2063,6 +2063,7 @@ static int qxl_init_primary(PCIDevice *dev)
qxl->id = 0;
qxl_init_ramsize(qxl);
vga->vbe_size = qxl->vgamem_size;
vga->vram_size_mb = qxl->vga.vram_size >> 20;
vga_common_init(vga, OBJECT(dev), true);
vga_init(vga, OBJECT(dev),

View File

@@ -580,6 +580,93 @@ void vga_ioport_write(void *opaque, uint32_t addr, uint32_t val)
}
}
/*
* Sanity check vbe register writes.
*
* As we don't have a way to signal errors to the guest in the bochs
* dispi interface we'll go adjust the registers to the closest valid
* value.
*/
static void vbe_fixup_regs(VGACommonState *s)
{
uint16_t *r = s->vbe_regs;
uint32_t bits, linelength, maxy, offset;
if (!(r[VBE_DISPI_INDEX_ENABLE] & VBE_DISPI_ENABLED)) {
/* vbe is turned off -- nothing to do */
return;
}
/* check depth */
switch (r[VBE_DISPI_INDEX_BPP]) {
case 4:
case 8:
case 16:
case 24:
case 32:
bits = r[VBE_DISPI_INDEX_BPP];
break;
case 15:
bits = 16;
break;
default:
bits = r[VBE_DISPI_INDEX_BPP] = 8;
break;
}
/* check width */
r[VBE_DISPI_INDEX_XRES] &= ~7u;
if (r[VBE_DISPI_INDEX_XRES] == 0) {
r[VBE_DISPI_INDEX_XRES] = 8;
}
if (r[VBE_DISPI_INDEX_XRES] > VBE_DISPI_MAX_XRES) {
r[VBE_DISPI_INDEX_XRES] = VBE_DISPI_MAX_XRES;
}
r[VBE_DISPI_INDEX_VIRT_WIDTH] &= ~7u;
if (r[VBE_DISPI_INDEX_VIRT_WIDTH] > VBE_DISPI_MAX_XRES) {
r[VBE_DISPI_INDEX_VIRT_WIDTH] = VBE_DISPI_MAX_XRES;
}
if (r[VBE_DISPI_INDEX_VIRT_WIDTH] < r[VBE_DISPI_INDEX_XRES]) {
r[VBE_DISPI_INDEX_VIRT_WIDTH] = r[VBE_DISPI_INDEX_XRES];
}
/* check height */
linelength = r[VBE_DISPI_INDEX_VIRT_WIDTH] * bits / 8;
maxy = s->vbe_size / linelength;
if (r[VBE_DISPI_INDEX_YRES] == 0) {
r[VBE_DISPI_INDEX_YRES] = 1;
}
if (r[VBE_DISPI_INDEX_YRES] > VBE_DISPI_MAX_YRES) {
r[VBE_DISPI_INDEX_YRES] = VBE_DISPI_MAX_YRES;
}
if (r[VBE_DISPI_INDEX_YRES] > maxy) {
r[VBE_DISPI_INDEX_YRES] = maxy;
}
/* check offset */
if (r[VBE_DISPI_INDEX_X_OFFSET] > VBE_DISPI_MAX_XRES) {
r[VBE_DISPI_INDEX_X_OFFSET] = VBE_DISPI_MAX_XRES;
}
if (r[VBE_DISPI_INDEX_Y_OFFSET] > VBE_DISPI_MAX_YRES) {
r[VBE_DISPI_INDEX_Y_OFFSET] = VBE_DISPI_MAX_YRES;
}
offset = r[VBE_DISPI_INDEX_X_OFFSET] * bits / 8;
offset += r[VBE_DISPI_INDEX_Y_OFFSET] * linelength;
if (offset + r[VBE_DISPI_INDEX_YRES] * linelength > s->vbe_size) {
r[VBE_DISPI_INDEX_Y_OFFSET] = 0;
offset = r[VBE_DISPI_INDEX_X_OFFSET] * bits / 8;
if (offset + r[VBE_DISPI_INDEX_YRES] * linelength > s->vbe_size) {
r[VBE_DISPI_INDEX_X_OFFSET] = 0;
offset = 0;
}
}
/* update vga state */
r[VBE_DISPI_INDEX_VIRT_HEIGHT] = maxy;
s->vbe_line_offset = linelength;
s->vbe_start_addr = offset / 4;
}
static uint32_t vbe_ioport_read_index(void *opaque, uint32_t addr)
{
VGACommonState *s = opaque;
@@ -614,7 +701,7 @@ uint32_t vbe_ioport_read_data(void *opaque, uint32_t addr)
val = s->vbe_regs[s->vbe_index];
}
} else if (s->vbe_index == VBE_DISPI_INDEX_VIDEO_MEMORY_64K) {
val = s->vram_size / (64 * 1024);
val = s->vbe_size / (64 * 1024);
} else {
val = 0;
}
@@ -649,22 +736,13 @@ void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val)
}
break;
case VBE_DISPI_INDEX_XRES:
if ((val <= VBE_DISPI_MAX_XRES) && ((val & 7) == 0)) {
s->vbe_regs[s->vbe_index] = val;
}
break;
case VBE_DISPI_INDEX_YRES:
if (val <= VBE_DISPI_MAX_YRES) {
s->vbe_regs[s->vbe_index] = val;
}
break;
case VBE_DISPI_INDEX_BPP:
if (val == 0)
val = 8;
if (val == 4 || val == 8 || val == 15 ||
val == 16 || val == 24 || val == 32) {
s->vbe_regs[s->vbe_index] = val;
}
case VBE_DISPI_INDEX_VIRT_WIDTH:
case VBE_DISPI_INDEX_X_OFFSET:
case VBE_DISPI_INDEX_Y_OFFSET:
s->vbe_regs[s->vbe_index] = val;
vbe_fixup_regs(s);
break;
case VBE_DISPI_INDEX_BANK:
if (s->vbe_regs[VBE_DISPI_INDEX_BPP] == 4) {
@@ -681,19 +759,11 @@ void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val)
!(s->vbe_regs[VBE_DISPI_INDEX_ENABLE] & VBE_DISPI_ENABLED)) {
int h, shift_control;
s->vbe_regs[VBE_DISPI_INDEX_VIRT_WIDTH] =
s->vbe_regs[VBE_DISPI_INDEX_XRES];
s->vbe_regs[VBE_DISPI_INDEX_VIRT_HEIGHT] =
s->vbe_regs[VBE_DISPI_INDEX_YRES];
s->vbe_regs[VBE_DISPI_INDEX_VIRT_WIDTH] = 0;
s->vbe_regs[VBE_DISPI_INDEX_X_OFFSET] = 0;
s->vbe_regs[VBE_DISPI_INDEX_Y_OFFSET] = 0;
if (s->vbe_regs[VBE_DISPI_INDEX_BPP] == 4)
s->vbe_line_offset = s->vbe_regs[VBE_DISPI_INDEX_XRES] >> 1;
else
s->vbe_line_offset = s->vbe_regs[VBE_DISPI_INDEX_XRES] *
((s->vbe_regs[VBE_DISPI_INDEX_BPP] + 7) >> 3);
s->vbe_start_addr = 0;
s->vbe_regs[VBE_DISPI_INDEX_ENABLE] |= VBE_DISPI_ENABLED;
vbe_fixup_regs(s);
/* clear the screen (should be done in BIOS) */
if (!(val & VBE_DISPI_NOCLEARMEM)) {
@@ -742,40 +812,6 @@ void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val)
s->vbe_regs[s->vbe_index] = val;
vga_update_memory_access(s);
break;
case VBE_DISPI_INDEX_VIRT_WIDTH:
{
int w, h, line_offset;
if (val < s->vbe_regs[VBE_DISPI_INDEX_XRES])
return;
w = val;
if (s->vbe_regs[VBE_DISPI_INDEX_BPP] == 4)
line_offset = w >> 1;
else
line_offset = w * ((s->vbe_regs[VBE_DISPI_INDEX_BPP] + 7) >> 3);
h = s->vram_size / line_offset;
/* XXX: support weird bochs semantics ? */
if (h < s->vbe_regs[VBE_DISPI_INDEX_YRES])
return;
s->vbe_regs[VBE_DISPI_INDEX_VIRT_WIDTH] = w;
s->vbe_regs[VBE_DISPI_INDEX_VIRT_HEIGHT] = h;
s->vbe_line_offset = line_offset;
}
break;
case VBE_DISPI_INDEX_X_OFFSET:
case VBE_DISPI_INDEX_Y_OFFSET:
{
int x;
s->vbe_regs[s->vbe_index] = val;
s->vbe_start_addr = s->vbe_line_offset * s->vbe_regs[VBE_DISPI_INDEX_Y_OFFSET];
x = s->vbe_regs[VBE_DISPI_INDEX_X_OFFSET];
if (s->vbe_regs[VBE_DISPI_INDEX_BPP] == 4)
s->vbe_start_addr += x >> 1;
else
s->vbe_start_addr += x * ((s->vbe_regs[VBE_DISPI_INDEX_BPP] + 7) >> 3);
s->vbe_start_addr >>= 2;
}
break;
default:
break;
}
@@ -2289,6 +2325,9 @@ void vga_common_init(VGACommonState *s, Object *obj, bool global_vmstate)
s->vram_size <<= 1;
}
s->vram_size_mb = s->vram_size >> 20;
if (!s->vbe_size) {
s->vbe_size = s->vram_size;
}
s->is_vbe_vmstate = 1;
memory_region_init_ram(&s->vram, obj, "vga.vram", s->vram_size);

View File

@@ -93,6 +93,7 @@ typedef struct VGACommonState {
MemoryRegion vram_vbe;
uint32_t vram_size;
uint32_t vram_size_mb; /* property */
uint32_t vbe_size;
uint32_t latch;
MemoryRegion *chain4_alias;
uint8_t sr_index;

View File

@@ -93,10 +93,12 @@ struct XenFB {
static int common_bind(struct common *c)
{
int mfn;
uint64_t mfn;
if (xenstore_read_fe_int(&c->xendev, "page-ref", &mfn) == -1)
if (xenstore_read_fe_uint64(&c->xendev, "page-ref", &mfn) == -1)
return -1;
assert(mfn == (xen_pfn_t)mfn);
if (xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port) == -1)
return -1;
@@ -107,7 +109,7 @@ static int common_bind(struct common *c)
return -1;
xen_be_bind_evtchn(&c->xendev);
xen_be_printf(&c->xendev, 1, "ring mfn %d, remote-port %d, local-port %d\n",
xen_be_printf(&c->xendev, 1, "ring mfn %"PRIx64", remote-port %d, local-port %d\n",
mfn, c->xendev.remote_port, c->xendev.local_port);
return 0;
@@ -409,7 +411,7 @@ static void input_event(struct XenDevice *xendev)
/* -------------------------------------------------------------------- */
static void xenfb_copy_mfns(int mode, int count, unsigned long *dst, void *src)
static void xenfb_copy_mfns(int mode, int count, xen_pfn_t *dst, void *src)
{
uint32_t *src32 = src;
uint64_t *src64 = src;
@@ -424,8 +426,8 @@ static int xenfb_map_fb(struct XenFB *xenfb)
struct xenfb_page *page = xenfb->c.page;
char *protocol = xenfb->c.xendev.protocol;
int n_fbdirs;
unsigned long *pgmfns = NULL;
unsigned long *fbmfns = NULL;
xen_pfn_t *pgmfns = NULL;
xen_pfn_t *fbmfns = NULL;
void *map, *pd;
int mode, ret = -1;
@@ -483,8 +485,8 @@ static int xenfb_map_fb(struct XenFB *xenfb)
n_fbdirs = xenfb->fbpages * mode / 8;
n_fbdirs = (n_fbdirs + (XC_PAGE_SIZE - 1)) / XC_PAGE_SIZE;
pgmfns = g_malloc0(sizeof(unsigned long) * n_fbdirs);
fbmfns = g_malloc0(sizeof(unsigned long) * xenfb->fbpages);
pgmfns = g_malloc0(sizeof(xen_pfn_t) * n_fbdirs);
fbmfns = g_malloc0(sizeof(xen_pfn_t) * xenfb->fbpages);
xenfb_copy_mfns(mode, n_fbdirs, pgmfns, pd);
map = xc_map_foreign_pages(xen_xc, xenfb->c.xendev.dom,

View File

@@ -1660,7 +1660,7 @@ struct soc_dma_s *omap_dma_init(hwaddr base, qemu_irq *irqs,
}
omap_dma_setcaps(s);
omap_clk_adduser(s->clk, qemu_allocate_irqs(omap_dma_clk_update, s, 1)[0]);
omap_clk_adduser(s->clk, qemu_allocate_irq(omap_dma_clk_update, s, 0));
omap_dma_reset(s->dma);
omap_dma_clk_update(s, 0, 1);
@@ -2082,7 +2082,7 @@ struct soc_dma_s *omap_dma4_init(hwaddr base, qemu_irq *irqs,
s->intr_update = omap_dma_interrupts_4_update;
omap_dma_setcaps(s);
omap_clk_adduser(s->clk, qemu_allocate_irqs(omap_dma_clk_update, s, 1)[0]);
omap_clk_adduser(s->clk, qemu_allocate_irq(omap_dma_clk_update, s, 0));
omap_dma_reset(s->dma);
omap_dma_clk_update(s, 0, !!s->dma->freq);

View File

@@ -25,7 +25,9 @@
#include <glib.h>
#include "qemu-common.h"
#include "qemu/bitmap.h"
#include "qemu/osdep.h"
#include "qemu/range.h"
#include "qemu/error-report.h"
#include "hw/pci/pci.h"
#include "qom/cpu.h"
#include "hw/i386/pc.h"
@@ -52,6 +54,16 @@
#include "qapi/qmp/qint.h"
#include "qom/qom-qobject.h"
/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
* -M pc-i440fx-2.0. Even if the actual amount of AML generated grows
* a little bit, there should be plenty of free space since the DSDT
* shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
*/
#define ACPI_BUILD_LEGACY_CPU_AML_SIZE 97
#define ACPI_BUILD_ALIGN_SIZE 0x1000
#define ACPI_BUILD_TABLE_SIZE 0x20000
typedef struct AcpiCpuInfo {
DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
} AcpiCpuInfo;
@@ -64,6 +76,7 @@ typedef struct AcpiMcfgInfo {
typedef struct AcpiPmInfo {
bool s3_disabled;
bool s4_disabled;
bool pcihp_bridge_en;
uint8_t s4_val;
uint16_t sci_int;
uint8_t acpi_enable_cmd;
@@ -85,6 +98,7 @@ typedef struct AcpiBuildPciBusHotplugState {
GArray *device_table;
GArray *notify_table;
struct AcpiBuildPciBusHotplugState *parent;
bool pcihp_bridge_en;
} AcpiBuildPciBusHotplugState;
static void acpi_get_dsdt(AcpiMiscInfo *info)
@@ -188,6 +202,9 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
NULL);
pm->gpe0_blk_len = object_property_get_int(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
NULL);
pm->pcihp_bridge_en =
object_property_get_bool(obj, "acpi-pci-hotplug-with-bridge-support",
NULL);
}
static void acpi_get_misc_info(AcpiMiscInfo *info)
@@ -529,6 +546,12 @@ static void fadt_setup(AcpiFadtDescriptorRev1 *fadt, AcpiPmInfo *pm)
(1 << ACPI_FADT_F_SLP_BUTTON) |
(1 << ACPI_FADT_F_RTC_S4));
fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_USE_PLATFORM_CLOCK);
/* APIC destination mode ("Flat Logical") has an upper limit of 8 CPUs
* For more than 8 CPUs, "Clustered Logical" mode has to be used
*/
if (max_cpus > 8) {
fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL);
}
}
@@ -768,11 +791,13 @@ static void acpi_set_pci_info(void)
}
static void build_pci_bus_state_init(AcpiBuildPciBusHotplugState *state,
AcpiBuildPciBusHotplugState *parent)
AcpiBuildPciBusHotplugState *parent,
bool pcihp_bridge_en)
{
state->parent = parent;
state->device_table = build_alloc_array();
state->notify_table = build_alloc_array();
state->pcihp_bridge_en = pcihp_bridge_en;
}
static void build_pci_bus_state_cleanup(AcpiBuildPciBusHotplugState *state)
@@ -786,7 +811,7 @@ static void *build_pci_bus_begin(PCIBus *bus, void *parent_state)
AcpiBuildPciBusHotplugState *parent = parent_state;
AcpiBuildPciBusHotplugState *child = g_malloc(sizeof *child);
build_pci_bus_state_init(child, parent);
build_pci_bus_state_init(child, parent, parent->pcihp_bridge_en);
return child;
}
@@ -807,6 +832,14 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
GArray *method;
bool bus_hotplug_support = false;
/*
* Skip bridge subtree creation if bridge hotplug is disabled
* to make acpi tables compatible with legacy machine types.
*/
if (!child->pcihp_bridge_en && bus->parent_dev) {
return;
}
if (bus->parent_dev) {
op = 0x82; /* DeviceOp */
build_append_nameseg(bus_table, "S%.02X_",
@@ -844,6 +877,7 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
PCIDeviceClass *pc;
PCIDevice *pdev = bus->devices[i];
int slot = PCI_SLOT(i);
bool bridge_in_acpi;
if (!pdev) {
continue;
@@ -853,7 +887,13 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
pc = PCI_DEVICE_GET_CLASS(pdev);
dc = DEVICE_GET_CLASS(pdev);
if (pc->class_id == PCI_CLASS_BRIDGE_ISA || pc->is_bridge) {
/* When hotplug for bridges is enabled, bridges are
* described in ACPI separately (see build_pci_bus_end).
* In this case they aren't themselves hot-pluggable.
*/
bridge_in_acpi = pc->is_bridge && child->pcihp_bridge_en;
if (pc->class_id == PCI_CLASS_BRIDGE_ISA || bridge_in_acpi) {
set_bit(slot, slot_device_system);
}
@@ -865,7 +905,7 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
}
}
if (!dc->hotpluggable || pc->is_bridge) {
if (!dc->hotpluggable || bridge_in_acpi) {
clear_bit(slot, slot_hotplug_enable);
}
}
@@ -1130,7 +1170,7 @@ build_ssdt(GArray *table_data, GArray *linker,
bus = PCI_HOST_BRIDGE(pci_host)->bus;
}
build_pci_bus_state_init(&hotplug_state, NULL);
build_pci_bus_state_init(&hotplug_state, NULL, pm->pcihp_bridge_en);
if (bus) {
/* Scan all PCI buses. Generate tables to support hotplug. */
@@ -1359,7 +1399,7 @@ build_rsdp(GArray *rsdp_table, GArray *linker, unsigned rsdt)
{
AcpiRsdpDescriptor *rsdp = acpi_data_push(rsdp_table, sizeof *rsdp);
bios_linker_loader_alloc(linker, ACPI_BUILD_RSDP_FILE, 1,
bios_linker_loader_alloc(linker, ACPI_BUILD_RSDP_FILE, 16,
true /* fseg memory */);
memcpy(&rsdp->signature, "RSD PTR ", 8);
@@ -1440,13 +1480,14 @@ static
void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
{
GArray *table_offsets;
unsigned facs, dsdt, rsdt;
unsigned facs, ssdt, dsdt, rsdt;
AcpiCpuInfo cpu;
AcpiPmInfo pm;
AcpiMiscInfo misc;
AcpiMcfgInfo mcfg;
PcPciInfo pci;
uint8_t *u;
size_t aml_len = 0;
acpi_get_cpu_info(&cpu);
acpi_get_pm_info(&pm);
@@ -1474,13 +1515,20 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
dsdt = tables->table_data->len;
build_dsdt(tables->table_data, tables->linker, &misc);
/* Count the size of the DSDT and SSDT, we will need it for legacy
* sizing of ACPI tables.
*/
aml_len += tables->table_data->len - dsdt;
/* ACPI tables pointed to by RSDT */
acpi_add_table(table_offsets, tables->table_data);
build_fadt(tables->table_data, tables->linker, &pm, facs, dsdt);
ssdt = tables->table_data->len;
acpi_add_table(table_offsets, tables->table_data);
build_ssdt(tables->table_data, tables->linker, &cpu, &pm, &misc, &pci,
guest_info);
aml_len += tables->table_data->len - ssdt;
acpi_add_table(table_offsets, tables->table_data);
build_madt(tables->table_data, tables->linker, &cpu, guest_info);
@@ -1513,14 +1561,53 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
/* RSDP is in FSEG memory, so allocate it separately */
build_rsdp(tables->rsdp, tables->linker, rsdt);
/* We'll expose it all to Guest so align size to reduce
/* We'll expose it all to Guest so we want to reduce
* chance of size changes.
* RSDP is small so it's easy to keep it immutable, no need to
* bother with alignment.
*
* We used to align the tables to 4k, but of course this would
* too simple to be enough. 4k turned out to be too small an
* alignment very soon, and in fact it is almost impossible to
* keep the table size stable for all (max_cpus, max_memory_slots)
* combinations. So the table size is always 64k for pc-i440fx-2.1
* and we give an error if the table grows beyond that limit.
*
* We still have the problem of migrating from "-M pc-i440fx-2.0". For
* that, we exploit the fact that QEMU 2.1 generates _smaller_ tables
* than 2.0 and we can always pad the smaller tables with zeros. We can
* then use the exact size of the 2.0 tables.
*
* All this is for PIIX4, since QEMU 2.0 didn't support Q35 migration.
*/
acpi_align_size(tables->table_data, 0x1000);
if (guest_info->legacy_acpi_table_size) {
/* Subtracting aml_len gives the size of fixed tables. Then add the
* size of the PIIX4 DSDT/SSDT in QEMU 2.0.
*/
int legacy_aml_len =
guest_info->legacy_acpi_table_size +
ACPI_BUILD_LEGACY_CPU_AML_SIZE * max_cpus;
int legacy_table_size =
ROUND_UP(tables->table_data->len - aml_len + legacy_aml_len,
ACPI_BUILD_ALIGN_SIZE);
if (tables->table_data->len > legacy_table_size) {
/* Should happen only with PCI bridges and -M pc-i440fx-2.0. */
error_report("Warning: migration may not work.");
}
g_array_set_size(tables->table_data, legacy_table_size);
} else {
/* Make sure we have a buffer in case we need to resize the tables. */
if (tables->table_data->len > ACPI_BUILD_TABLE_SIZE / 2) {
/* As of QEMU 2.1, this fires with 160 VCPUs and 255 memory slots. */
error_report("Warning: ACPI tables are larger than 64k.");
error_report("Warning: migration may not work.");
error_report("Warning: please remove CPUs, NUMA nodes, "
"memory slots or PCI bridges.");
}
acpi_align_size(tables->table_data, ACPI_BUILD_TABLE_SIZE);
}
acpi_align_size(tables->linker, 0x1000);
acpi_align_size(tables->linker, ACPI_BUILD_ALIGN_SIZE);
/* Cleanup memory that's no longer used. */
g_array_free(table_offsets, true);

View File

@@ -181,57 +181,45 @@ DefinitionBlock (
Scope(\_SB) {
Scope(PCI0) {
Name(_PRT, Package() {
/* PCI IRQ routing table, example from ACPI 2.0a specification,
section 6.2.8.1 */
/* Note: we provide the same info as the PCI routing
table of the Bochs BIOS */
Method (_PRT, 0) {
Store(Package(128) {}, Local0)
Store(Zero, Local1)
While(LLess(Local1, 128)) {
// slot = pin >> 2
Store(ShiftRight(Local1, 2), Local2)
#define prt_slot(nr, lnk0, lnk1, lnk2, lnk3) \
Package() { nr##ffff, 0, lnk0, 0 }, \
Package() { nr##ffff, 1, lnk1, 0 }, \
Package() { nr##ffff, 2, lnk2, 0 }, \
Package() { nr##ffff, 3, lnk3, 0 }
// lnk = (slot + pin) & 3
Store(And(Add(Local1, Local2), 3), Local3)
If (LEqual(Local3, 0)) {
Store(Package(4) { Zero, Zero, LNKD, Zero }, Local4)
}
If (LEqual(Local3, 1)) {
// device 1 is the power-management device, needs SCI
If (LEqual(Local1, 4)) {
Store(Package(4) { Zero, Zero, LNKS, Zero }, Local4)
} Else {
Store(Package(4) { Zero, Zero, LNKA, Zero }, Local4)
}
}
If (LEqual(Local3, 2)) {
Store(Package(4) { Zero, Zero, LNKB, Zero }, Local4)
}
If (LEqual(Local3, 3)) {
Store(Package(4) { Zero, Zero, LNKC, Zero }, Local4)
}
#define prt_slot0(nr) prt_slot(nr, LNKD, LNKA, LNKB, LNKC)
#define prt_slot1(nr) prt_slot(nr, LNKA, LNKB, LNKC, LNKD)
#define prt_slot2(nr) prt_slot(nr, LNKB, LNKC, LNKD, LNKA)
#define prt_slot3(nr) prt_slot(nr, LNKC, LNKD, LNKA, LNKB)
// Complete the interrupt routing entry:
// Package(4) { 0x[slot]FFFF, [pin], [link], 0) }
prt_slot0(0x0000),
/* Device 1 is power mgmt device, and can only use irq 9 */
prt_slot(0x0001, LNKS, LNKB, LNKC, LNKD),
prt_slot2(0x0002),
prt_slot3(0x0003),
prt_slot0(0x0004),
prt_slot1(0x0005),
prt_slot2(0x0006),
prt_slot3(0x0007),
prt_slot0(0x0008),
prt_slot1(0x0009),
prt_slot2(0x000a),
prt_slot3(0x000b),
prt_slot0(0x000c),
prt_slot1(0x000d),
prt_slot2(0x000e),
prt_slot3(0x000f),
prt_slot0(0x0010),
prt_slot1(0x0011),
prt_slot2(0x0012),
prt_slot3(0x0013),
prt_slot0(0x0014),
prt_slot1(0x0015),
prt_slot2(0x0016),
prt_slot3(0x0017),
prt_slot0(0x0018),
prt_slot1(0x0019),
prt_slot2(0x001a),
prt_slot3(0x001b),
prt_slot0(0x001c),
prt_slot1(0x001d),
prt_slot2(0x001e),
prt_slot3(0x001f),
})
Store(Or(ShiftLeft(Local2, 16), 0xFFFF), Index(Local4, 0))
Store(And(Local1, 3), Index(Local4, 1))
Store(Local4, Index(Local0, Local1))
Increment(Local1)
}
Return(Local0)
}
}
Field(PCI0.ISA.P40C, ByteAcc, NoLock, Preserve) {
@@ -314,7 +302,7 @@ DefinitionBlock (
/****************************************************************
* General purpose events
****************************************************************/
External(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD, MethodObj)
Scope(\_GPE) {
Name(_HID, "ACPI0006")
@@ -333,7 +321,7 @@ DefinitionBlock (
}
Method(_E03) {
// Memory hotplug event
\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD()
\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD()
}
Method(_L04) {
}

File diff suppressed because it is too large Load Diff

View File

@@ -14,10 +14,8 @@
*/
#include "qemu-common.h"
#include "qemu/host-utils.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/cpus.h"
#include "hw/sysbus.h"
#include "hw/kvm/clock.h"
@@ -36,48 +34,6 @@ typedef struct KVMClockState {
bool clock_valid;
} KVMClockState;
struct pvclock_vcpu_time_info {
uint32_t version;
uint32_t pad0;
uint64_t tsc_timestamp;
uint64_t system_time;
uint32_t tsc_to_system_mul;
int8_t tsc_shift;
uint8_t flags;
uint8_t pad[2];
} __attribute__((__packed__)); /* 32 bytes */
static uint64_t kvmclock_current_nsec(KVMClockState *s)
{
CPUState *cpu = first_cpu;
CPUX86State *env = cpu->env_ptr;
hwaddr kvmclock_struct_pa = env->system_time_msr & ~1ULL;
uint64_t migration_tsc = env->tsc;
struct pvclock_vcpu_time_info time;
uint64_t delta;
uint64_t nsec_lo;
uint64_t nsec_hi;
uint64_t nsec;
if (!(env->system_time_msr & 1ULL)) {
/* KVM clock not active */
return 0;
}
cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
assert(time.tsc_timestamp <= migration_tsc);
delta = migration_tsc - time.tsc_timestamp;
if (time.tsc_shift < 0) {
delta >>= -time.tsc_shift;
} else {
delta <<= time.tsc_shift;
}
mulu64(&nsec_lo, &nsec_hi, delta, time.tsc_to_system_mul);
nsec = (nsec_lo >> 32) | (nsec_hi << 32);
return nsec + time.system_time;
}
static void kvmclock_vm_state_change(void *opaque, int running,
RunState state)
@@ -89,15 +45,9 @@ static void kvmclock_vm_state_change(void *opaque, int running,
if (running) {
struct kvm_clock_data data;
uint64_t time_at_migration = kvmclock_current_nsec(s);
s->clock_valid = false;
/* We can't rely on the migrated clock value, just discard it */
if (time_at_migration) {
s->clock = time_at_migration;
}
data.clock = s->clock;
data.flags = 0;
ret = kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &data);
@@ -125,8 +75,6 @@ static void kvmclock_vm_state_change(void *opaque, int running,
if (s->clock_valid) {
return;
}
cpu_synchronize_all_states();
ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
if (ret < 0) {
fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));

View File

@@ -73,7 +73,12 @@
#endif
/* Leave a chunk of memory at the top of RAM for the BIOS ACPI tables. */
#define ACPI_DATA_SIZE 0x10000
unsigned acpi_data_size = 0x20000;
void pc_set_legacy_acpi_data_size(void)
{
acpi_data_size = 0x10000;
}
#define BIOS_CFG_IOPORT 0x510
#define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
#define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
@@ -811,8 +816,9 @@ static void load_linux(FWCfgState *fw_cfg,
initrd_max = 0x37ffffff;
}
if (initrd_max >= max_ram_size-ACPI_DATA_SIZE)
initrd_max = max_ram_size-ACPI_DATA_SIZE-1;
if (initrd_max >= max_ram_size - acpi_data_size) {
initrd_max = max_ram_size - acpi_data_size - 1;
}
fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);

View File

@@ -61,6 +61,7 @@ static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
static bool has_pci_info;
static bool has_acpi_build = true;
static int legacy_acpi_table_size;
static bool smbios_defaults = true;
static bool smbios_legacy_mode;
/* Make sure that guest addresses aligned at 1Gbyte boundaries get mapped to
@@ -114,7 +115,7 @@ static void pc_init1(MachineState *machine,
lowmem = 0xe0000000;
}
/* Handle the machine opt max-ram-below-4g. It is basicly doing
/* Handle the machine opt max-ram-below-4g. It is basically doing
* min(qemu limit, user limit).
*/
if (lowmem > pc_machine->max_ram_below_4g) {
@@ -163,6 +164,7 @@ static void pc_init1(MachineState *machine,
guest_info = pc_guest_info_init(below_4g_mem_size, above_4g_mem_size);
guest_info->has_acpi_build = has_acpi_build;
guest_info->legacy_acpi_table_size = legacy_acpi_table_size;
guest_info->has_pci_info = has_pci_info;
guest_info->isapc_ram_fw = !pci_enabled;
@@ -297,8 +299,26 @@ static void pc_init_pci(MachineState *machine)
static void pc_compat_2_0(MachineState *machine)
{
/* This value depends on the actual DSDT and SSDT compiled into
* the source QEMU; unfortunately it depends on the binary and
* not on the machine type, so we cannot make pc-i440fx-1.7 work on
* both QEMU 1.7 and QEMU 2.0.
*
* Large variations cause migration to fail for more than one
* consecutive value of the "-smp" maxcpus option.
*
* For small variations of the kind caused by different iasl versions,
* the 4k rounding usually leaves slack. However, there could be still
* one or two values that break. For QEMU 1.7 and QEMU 2.0 the
* slack is only ~10 bytes before one "-smp maxcpus" value breaks!
*
* 6652 is valid for QEMU 2.0, the right value for pc-i440fx-1.7 on
* QEMU 1.7 it is 6414. For RHEL/CentOS 7.0 it is 6418.
*/
legacy_acpi_table_size = 6652;
smbios_legacy_mode = true;
has_reserved_memory = false;
pc_set_legacy_acpi_data_size();
}
static void pc_compat_1_7(MachineState *machine)
@@ -307,6 +327,7 @@ static void pc_compat_1_7(MachineState *machine)
smbios_defaults = false;
gigabyte_align = false;
option_rom_has_mr = true;
legacy_acpi_table_size = 6414;
x86_cpu_compat_disable_kvm_features(FEAT_1_ECX, CPUID_EXT_X2APIC);
}
@@ -386,14 +407,10 @@ static void pc_init_pci_1_2(MachineState *machine)
pc_init_pci(machine);
}
/* PC init function for pc-0.10 to pc-0.13, and reused by xenfv */
/* PC init function for pc-0.10 to pc-0.13 */
static void pc_init_pci_no_kvmclock(MachineState *machine)
{
has_pci_info = false;
has_acpi_build = false;
smbios_defaults = false;
x86_cpu_compat_disable_kvm_features(FEAT_KVM, KVM_FEATURE_PV_EOI);
enable_compat_apic_id_mode();
pc_compat_1_2(machine);
pc_init1(machine, 1, 0);
}
@@ -402,6 +419,11 @@ static void pc_init_isa(MachineState *machine)
has_pci_info = false;
has_acpi_build = false;
smbios_defaults = false;
gigabyte_align = false;
smbios_legacy_mode = true;
has_reserved_memory = false;
option_rom_has_mr = true;
rom_file_has_mr = false;
if (!machine->cpu_model) {
machine->cpu_model = "486";
}

View File

@@ -103,7 +103,7 @@ static void pc_q35_init(MachineState *machine)
lowmem = 0xb0000000;
}
/* Handle the machine opt max-ram-below-4g. It is basicly doing
/* Handle the machine opt max-ram-below-4g. It is basically doing
* min(qemu limit, user limit).
*/
if (lowmem > pc_machine->max_ram_below_4g) {
@@ -155,6 +155,11 @@ static void pc_q35_init(MachineState *machine)
guest_info->has_acpi_build = has_acpi_build;
guest_info->has_reserved_memory = has_reserved_memory;
/* Migration was not supported in 2.0 for Q35, so do not bother
* with this hack (see hw/i386/acpi-build.c).
*/
guest_info->legacy_acpi_table_size = 0;
if (smbios_defaults) {
MachineClass *mc = MACHINE_GET_CLASS(machine);
/* These values are guest ABI, do not change */
@@ -277,6 +282,7 @@ static void pc_compat_2_0(MachineState *machine)
{
smbios_legacy_mode = true;
has_reserved_memory = false;
pc_set_legacy_acpi_data_size();
}
static void pc_compat_1_7(MachineState *machine)
@@ -361,7 +367,7 @@ static QEMUMachine pc_q35_machine_v2_0 = {
.name = "pc-q35-2.0",
.init = pc_q35_init_2_0,
.compat_props = (GlobalProperty[]) {
PC_Q35_COMPAT_2_0,
PC_COMPAT_2_0,
{ /* end of list */ }
},
};
@@ -373,7 +379,7 @@ static QEMUMachine pc_q35_machine_v1_7 = {
.name = "pc-q35-1.7",
.init = pc_q35_init_1_7,
.compat_props = (GlobalProperty[]) {
PC_Q35_COMPAT_1_7,
PC_COMPAT_1_7,
{ /* end of list */ }
},
};
@@ -385,7 +391,7 @@ static QEMUMachine pc_q35_machine_v1_6 = {
.name = "pc-q35-1.6",
.init = pc_q35_init_1_6,
.compat_props = (GlobalProperty[]) {
PC_Q35_COMPAT_1_6,
PC_COMPAT_1_6,
{ /* end of list */ }
},
};
@@ -395,7 +401,7 @@ static QEMUMachine pc_q35_machine_v1_5 = {
.name = "pc-q35-1.5",
.init = pc_q35_init_1_5,
.compat_props = (GlobalProperty[]) {
PC_Q35_COMPAT_1_5,
PC_COMPAT_1_5,
{ /* end of list */ }
},
};
@@ -409,7 +415,7 @@ static QEMUMachine pc_q35_machine_v1_4 = {
.name = "pc-q35-1.4",
.init = pc_q35_init_1_4,
.compat_props = (GlobalProperty[]) {
PC_Q35_COMPAT_1_4,
PC_COMPAT_1_4,
{ /* end of list */ }
},
};

View File

@@ -410,7 +410,7 @@ DefinitionBlock (
/****************************************************************
* General purpose events
****************************************************************/
External(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD, MethodObj)
Scope(\_GPE) {
Name(_HID, "ACPI0006")
@@ -425,7 +425,7 @@ DefinitionBlock (
}
Method(_E03) {
// Memory hotplug event
\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD()
\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_SCAN_METHOD()
}
Method(_L04) {
}

View File

@@ -39,10 +39,10 @@ ACPI_EXTRACT_ALL_CODE ssdm_mem_aml
DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
{
External(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_CRS_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_STATUS_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_OST_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_PROXIMITY_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_CRS_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_STATUS_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_OST_METHOD, MethodObj)
External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_PROXIMITY_METHOD, MethodObj)
Scope(\_SB) {
/* v------------------ DO NOT EDIT ------------------v */
@@ -58,19 +58,19 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
Name(_HID, EISAID("PNP0C80"))
Method(_CRS, 0) {
Return(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_CRS_METHOD(_UID))
Return(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_CRS_METHOD(_UID))
}
Method(_STA, 0) {
Return(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_STATUS_METHOD(_UID))
Return(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_STATUS_METHOD(_UID))
}
Method(_PXM, 0) {
Return(\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_PROXIMITY_METHOD(_UID))
Return(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_PROXIMITY_METHOD(_UID))
}
Method(_OST, 3) {
\_SB.PCI0.MEMORY_HOPTLUG_DEVICE.MEMORY_SLOT_OST_METHOD(_UID, Arg0, Arg1, Arg2)
\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_OST_METHOD(_UID, Arg0, Arg1, Arg2)
}
}
}

View File

@@ -120,7 +120,7 @@ DefinitionBlock ("ssdt-misc.aml", "SSDT", 0x01, "BXPC", "BXSSDTSUSP", 0x1)
External(MEMORY_SLOT_NOTIFY_METHOD, MethodObj)
Scope(\_SB.PCI0) {
Device(MEMORY_HOPTLUG_DEVICE) {
Device(MEMORY_HOTPLUG_DEVICE) {
Name(_HID, "PNP0A06")
Name(_UID, "Memory hotplug resources")

View File

@@ -175,17 +175,18 @@ static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,
ahci_check_irq(s);
}
static void map_page(uint8_t **ptr, uint64_t addr, uint32_t wanted)
static void map_page(AddressSpace *as, uint8_t **ptr, uint64_t addr,
uint32_t wanted)
{
hwaddr len = wanted;
if (*ptr) {
cpu_physical_memory_unmap(*ptr, len, 1, len);
dma_memory_unmap(as, *ptr, len, DMA_DIRECTION_FROM_DEVICE, len);
}
*ptr = cpu_physical_memory_map(addr, &len, 1);
*ptr = dma_memory_map(as, addr, &len, DMA_DIRECTION_FROM_DEVICE);
if (len < wanted) {
cpu_physical_memory_unmap(*ptr, len, 1, len);
dma_memory_unmap(as, *ptr, len, DMA_DIRECTION_FROM_DEVICE, len);
*ptr = NULL;
}
}
@@ -198,24 +199,24 @@ static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
switch (offset) {
case PORT_LST_ADDR:
pr->lst_addr = val;
map_page(&s->dev[port].lst,
map_page(s->as, &s->dev[port].lst,
((uint64_t)pr->lst_addr_hi << 32) | pr->lst_addr, 1024);
s->dev[port].cur_cmd = NULL;
break;
case PORT_LST_ADDR_HI:
pr->lst_addr_hi = val;
map_page(&s->dev[port].lst,
map_page(s->as, &s->dev[port].lst,
((uint64_t)pr->lst_addr_hi << 32) | pr->lst_addr, 1024);
s->dev[port].cur_cmd = NULL;
break;
case PORT_FIS_ADDR:
pr->fis_addr = val;
map_page(&s->dev[port].res_fis,
map_page(s->as, &s->dev[port].res_fis,
((uint64_t)pr->fis_addr_hi << 32) | pr->fis_addr, 256);
break;
case PORT_FIS_ADDR_HI:
pr->fis_addr_hi = val;
map_page(&s->dev[port].res_fis,
map_page(s->as, &s->dev[port].res_fis,
((uint64_t)pr->fis_addr_hi << 32) | pr->fis_addr, 256);
break;
case PORT_IRQ_STAT:
@@ -639,6 +640,11 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t *cmd_fis)
}
}
static int prdt_tbl_entry_size(const AHCI_SG *tbl)
{
return (le32_to_cpu(tbl->flags_size) & AHCI_PRDT_SIZE_MASK) + 1;
}
static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
{
AHCICmdHdr *cmd = ad->cur_cmd;
@@ -681,7 +687,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
sum = 0;
for (i = 0; i < sglist_alloc_hint; i++) {
/* flags_size is zero-based */
tbl_entry_size = (le32_to_cpu(tbl[i].flags_size) + 1);
tbl_entry_size = prdt_tbl_entry_size(&tbl[i]);
if (offset <= (sum + tbl_entry_size)) {
off_idx = i;
off_pos = offset - sum;
@@ -700,12 +706,12 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
qemu_sglist_init(sglist, qbus->parent, (sglist_alloc_hint - off_idx),
ad->hba->as);
qemu_sglist_add(sglist, le64_to_cpu(tbl[off_idx].addr + off_pos),
le32_to_cpu(tbl[off_idx].flags_size) + 1 - off_pos);
prdt_tbl_entry_size(&tbl[off_idx]) - off_pos);
for (i = off_idx + 1; i < sglist_alloc_hint; i++) {
/* flags_size is zero-based */
qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr),
le32_to_cpu(tbl[i].flags_size) + 1);
prdt_tbl_entry_size(&tbl[i]));
}
}
@@ -1260,9 +1266,9 @@ static int ahci_state_post_load(void *opaque, int version_id)
ad = &s->dev[i];
AHCIPortRegs *pr = &ad->port_regs;
map_page(&ad->lst,
map_page(s->as, &ad->lst,
((uint64_t)pr->lst_addr_hi << 32) | pr->lst_addr, 1024);
map_page(&ad->res_fis,
map_page(s->as, &ad->res_fis,
((uint64_t)pr->fis_addr_hi << 32) | pr->fis_addr, 256);
/*
* All pending i/o should be flushed out on a migrate. However,

View File

@@ -201,6 +201,8 @@
#define AHCI_COMMAND_TABLE_ACMD 0x40
#define AHCI_PRDT_SIZE_MASK 0x3fffff
#define IDE_FEATURE_DMA 1
#define READ_FPDMA_QUEUED 0x60

View File

@@ -499,6 +499,18 @@ static void ide_rw_error(IDEState *s) {
ide_set_irq(s->bus);
}
static bool ide_sect_range_ok(IDEState *s,
uint64_t sector, uint64_t nb_sectors)
{
uint64_t total_sectors;
bdrv_get_geometry(s->bs, &total_sectors);
if (sector > total_sectors || nb_sectors > total_sectors - sector) {
return false;
}
return true;
}
static void ide_sector_read_cb(void *opaque, int ret)
{
IDEState *s = opaque;
@@ -554,6 +566,11 @@ void ide_sector_read(IDEState *s)
printf("sector=%" PRId64 "\n", sector_num);
#endif
if (!ide_sect_range_ok(s, sector_num, n)) {
ide_rw_error(s);
return;
}
s->iov.iov_base = s->io_buffer;
s->iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&s->qiov, &s->iov, 1);
@@ -671,6 +688,13 @@ void ide_dma_cb(void *opaque, int ret)
sector_num, n, s->dma_cmd);
#endif
if ((s->dma_cmd == IDE_DMA_READ || s->dma_cmd == IDE_DMA_WRITE) &&
!ide_sect_range_ok(s, sector_num, n)) {
dma_buf_commit(s);
ide_dma_error(s);
return;
}
switch (s->dma_cmd) {
case IDE_DMA_READ:
s->bus->dma->aiocb = dma_bdrv_read(s->bs, &s->sg, sector_num,
@@ -790,6 +814,11 @@ void ide_sector_write(IDEState *s)
n = s->req_nb_sectors;
}
if (!ide_sect_range_ok(s, sector_num, n)) {
ide_rw_error(s);
return;
}
s->iov.iov_base = s->io_buffer;
s->iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&s->qiov, &s->iov, 1);

View File

@@ -593,7 +593,7 @@ static void microdrive_realize(DeviceState *dev, Error **errp)
{
MicroDriveState *md = MICRODRIVE(dev);
ide_init2(&md->bus, qemu_allocate_irqs(md_set_irq, md, 1)[0]);
ide_init2(&md->bus, qemu_allocate_irq(md_set_irq, md, 0));
}
static void microdrive_init(Object *obj)

View File

@@ -124,7 +124,7 @@ static void hid_pointer_event(DeviceState *dev, QemuConsole *src,
if (evt->rel->axis == INPUT_AXIS_X) {
e->xdx += evt->rel->value;
} else if (evt->rel->axis == INPUT_AXIS_Y) {
e->ydy -= evt->rel->value;
e->ydy += evt->rel->value;
}
break;
@@ -191,7 +191,7 @@ static void hid_pointer_sync(DeviceState *dev)
if (hs->kind == HID_MOUSE) {
prev->xdx += curr->xdx;
curr->xdx = 0;
prev->ydy -= curr->ydy;
prev->ydy += curr->ydy;
curr->ydy = 0;
} else {
prev->xdx = curr->xdx;

View File

@@ -437,7 +437,7 @@ static void ics_set_irq(void *opaque, int srcno, int val)
{
ICSState *ics = (ICSState *)opaque;
if (ics->islsi[srcno]) {
if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
set_irq_lsi(ics, srcno, val);
} else {
set_irq_msi(ics, srcno, val);
@@ -474,7 +474,7 @@ static void ics_write_xive(ICSState *ics, int nr, int server,
trace_xics_ics_write_xive(nr, srcno, server, priority);
if (ics->islsi[srcno]) {
if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
write_xive_lsi(ics, srcno);
} else {
write_xive_msi(ics, srcno);
@@ -496,7 +496,7 @@ static void ics_resend(ICSState *ics)
for (i = 0; i < ics->nr_irqs; i++) {
/* FIXME: filter by server#? */
if (ics->islsi[i]) {
if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
resend_lsi(ics, i);
} else {
resend_msi(ics, i);
@@ -511,7 +511,7 @@ static void ics_eoi(ICSState *ics, int nr)
trace_xics_ics_eoi(nr);
if (ics->islsi[srcno]) {
if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
irq->status &= ~XICS_STATUS_SENT;
}
}
@@ -520,11 +520,18 @@ static void ics_reset(DeviceState *dev)
{
ICSState *ics = ICS(dev);
int i;
uint8_t flags[ics->nr_irqs];
for (i = 0; i < ics->nr_irqs; i++) {
flags[i] = ics->irqs[i].flags;
}
memset(ics->irqs, 0, sizeof(ICSIRQState) * ics->nr_irqs);
for (i = 0; i < ics->nr_irqs; i++) {
ics->irqs[i].priority = 0xff;
ics->irqs[i].saved_priority = 0xff;
ics->irqs[i].flags = flags[i];
}
}
@@ -563,13 +570,14 @@ static int ics_dispatch_post_load(void *opaque, int version_id)
static const VMStateDescription vmstate_ics_irq = {
.name = "ics/irq",
.version_id = 1,
.version_id = 2,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_UINT32(server, ICSIRQState),
VMSTATE_UINT8(priority, ICSIRQState),
VMSTATE_UINT8(saved_priority, ICSIRQState),
VMSTATE_UINT8(status, ICSIRQState),
VMSTATE_UINT8(flags, ICSIRQState),
VMSTATE_END_OF_LIST()
},
};
@@ -606,7 +614,6 @@ static void ics_realize(DeviceState *dev, Error **errp)
return;
}
ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics->nr_irqs);
}
@@ -633,21 +640,166 @@ static const TypeInfo ics_info = {
/*
* Exported functions
*/
static int xics_find_source(XICSState *icp, int irq)
{
int sources = 1;
int src;
/* FIXME: implement multiple sources */
for (src = 0; src < sources; ++src) {
ICSState *ics = &icp->ics[src];
if (ics_valid_irq(ics, irq)) {
return src;
}
}
return -1;
}
qemu_irq xics_get_qirq(XICSState *icp, int irq)
{
if (!ics_valid_irq(icp->ics, irq)) {
return NULL;
int src = xics_find_source(icp, irq);
if (src >= 0) {
ICSState *ics = &icp->ics[src];
return ics->qirqs[irq - ics->offset];
}
return icp->ics->qirqs[irq - icp->ics->offset];
return NULL;
}
static void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
{
assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
ics->irqs[srcno].flags |=
lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
}
void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
{
assert(ics_valid_irq(icp->ics, irq));
int src = xics_find_source(icp, irq);
ICSState *ics;
icp->ics->islsi[irq - icp->ics->offset] = lsi;
assert(src >= 0);
ics = &icp->ics[src];
ics_set_irq_type(ics, irq - ics->offset, lsi);
}
#define ICS_IRQ_FREE(ics, srcno) \
(!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
static int ics_find_free_block(ICSState *ics, int num, int alignnum)
{
int first, i;
for (first = 0; first < ics->nr_irqs; first += alignnum) {
if (num > (ics->nr_irqs - first)) {
return -1;
}
for (i = first; i < first + num; ++i) {
if (!ICS_IRQ_FREE(ics, i)) {
break;
}
}
if (i == (first + num)) {
return first;
}
}
return -1;
}
int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi)
{
ICSState *ics = &icp->ics[src];
int irq;
if (irq_hint) {
assert(src == xics_find_source(icp, irq_hint));
if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
trace_xics_alloc_failed_hint(src, irq_hint);
return -1;
}
irq = irq_hint;
} else {
irq = ics_find_free_block(ics, 1, 1);
if (irq < 0) {
trace_xics_alloc_failed_no_left(src);
return -1;
}
irq += ics->offset;
}
ics_set_irq_type(ics, irq - ics->offset, lsi);
trace_xics_alloc(src, irq);
return irq;
}
/*
* Allocate block of consequtive IRQs, returns a number of the first.
* If align==true, aligns the first IRQ number to num.
*/
int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align)
{
int i, first = -1;
ICSState *ics = &icp->ics[src];
assert(src == 0);
/*
* MSIMesage::data is used for storing VIRQ so
* it has to be aligned to num to support multiple
* MSI vectors. MSI-X is not affected by this.
* The hint is used for the first IRQ, the rest should
* be allocated continuously.
*/
if (align) {
assert((num == 1) || (num == 2) || (num == 4) ||
(num == 8) || (num == 16) || (num == 32));
first = ics_find_free_block(ics, num, num);
} else {
first = ics_find_free_block(ics, num, 1);
}
if (first >= 0) {
for (i = first; i < first + num; ++i) {
ics_set_irq_type(ics, i, lsi);
}
}
first += ics->offset;
trace_xics_alloc_block(src, first, num, lsi, align);
return first;
}
static void ics_free(ICSState *ics, int srcno, int num)
{
int i;
for (i = srcno; i < srcno + num; ++i) {
if (ICS_IRQ_FREE(ics, i)) {
trace_xics_ics_free_warn(ics - ics->icp->ics, i + ics->offset);
}
memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
}
}
void xics_free(XICSState *icp, int irq, int num)
{
int src = xics_find_source(icp, irq);
if (src >= 0) {
ICSState *ics = &icp->ics[src];
/* FIXME: implement multiple sources */
assert(src == 0);
trace_xics_ics_free(ics - icp->ics, irq, num);
ics_free(ics, irq - ics->offset, num);
}
}
/*
@@ -866,10 +1018,10 @@ static void xics_realize(DeviceState *dev, Error **errp)
}
/* Registration of global state belongs into realize */
spapr_rtas_register("ibm,set-xive", rtas_set_xive);
spapr_rtas_register("ibm,get-xive", rtas_get_xive);
spapr_rtas_register("ibm,int-off", rtas_int_off);
spapr_rtas_register("ibm,int-on", rtas_int_on);
spapr_rtas_register(RTAS_IBM_SET_XIVE, "ibm,set-xive", rtas_set_xive);
spapr_rtas_register(RTAS_IBM_GET_XIVE, "ibm,get-xive", rtas_get_xive);
spapr_rtas_register(RTAS_IBM_INT_OFF, "ibm,int-off", rtas_int_off);
spapr_rtas_register(RTAS_IBM_INT_ON, "ibm,int-on", rtas_int_on);
spapr_register_hypercall(H_CPPR, h_cppr);
spapr_register_hypercall(H_IPI, h_ipi);

View File

@@ -38,10 +38,6 @@
typedef struct KVMXICSState {
XICSState parent_obj;
uint32_t set_xive_token;
uint32_t get_xive_token;
uint32_t int_off_token;
uint32_t int_on_token;
int kernel_xics_fd;
} KVMXICSState;
@@ -224,7 +220,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
state |= KVM_XICS_MASKED;
}
if (ics->islsi[i]) {
if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
state |= KVM_XICS_LEVEL_SENSITIVE;
if (irq->status & XICS_STATUS_ASSERTED) {
state |= KVM_XICS_PENDING;
@@ -253,7 +249,7 @@ static void ics_kvm_set_irq(void *opaque, int srcno, int val)
int rc;
args.irq = srcno + ics->offset;
if (!ics->islsi[srcno]) {
if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MSI) {
if (!val) {
return;
}
@@ -271,11 +267,18 @@ static void ics_kvm_reset(DeviceState *dev)
{
ICSState *ics = ICS(dev);
int i;
uint8_t flags[ics->nr_irqs];
for (i = 0; i < ics->nr_irqs; i++) {
flags[i] = ics->irqs[i].flags;
}
memset(ics->irqs, 0, sizeof(ICSIRQState) * ics->nr_irqs);
for (i = 0; i < ics->nr_irqs; i++) {
ics->irqs[i].priority = 0xff;
ics->irqs[i].saved_priority = 0xff;
ics->irqs[i].flags = flags[i];
}
ics_set_kvm_state(ics, 1);
@@ -290,7 +293,6 @@ static void ics_kvm_realize(DeviceState *dev, Error **errp)
return;
}
ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
ics->qirqs = qemu_allocate_irqs(ics_kvm_set_irq, ics, ics->nr_irqs);
}
@@ -392,32 +394,30 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
goto fail;
}
icpkvm->set_xive_token = spapr_rtas_register("ibm,set-xive", rtas_dummy);
icpkvm->get_xive_token = spapr_rtas_register("ibm,get-xive", rtas_dummy);
icpkvm->int_off_token = spapr_rtas_register("ibm,int-off", rtas_dummy);
icpkvm->int_on_token = spapr_rtas_register("ibm,int-on", rtas_dummy);
spapr_rtas_register(RTAS_IBM_SET_XIVE, "ibm,set-xive", rtas_dummy);
spapr_rtas_register(RTAS_IBM_GET_XIVE, "ibm,get-xive", rtas_dummy);
spapr_rtas_register(RTAS_IBM_INT_OFF, "ibm,int-off", rtas_dummy);
spapr_rtas_register(RTAS_IBM_INT_ON, "ibm,int-on", rtas_dummy);
rc = kvmppc_define_rtas_kernel_token(icpkvm->set_xive_token,
"ibm,set-xive");
rc = kvmppc_define_rtas_kernel_token(RTAS_IBM_SET_XIVE, "ibm,set-xive");
if (rc < 0) {
error_setg(errp, "kvmppc_define_rtas_kernel_token: ibm,set-xive");
goto fail;
}
rc = kvmppc_define_rtas_kernel_token(icpkvm->get_xive_token,
"ibm,get-xive");
rc = kvmppc_define_rtas_kernel_token(RTAS_IBM_GET_XIVE, "ibm,get-xive");
if (rc < 0) {
error_setg(errp, "kvmppc_define_rtas_kernel_token: ibm,get-xive");
goto fail;
}
rc = kvmppc_define_rtas_kernel_token(icpkvm->int_on_token, "ibm,int-on");
rc = kvmppc_define_rtas_kernel_token(RTAS_IBM_INT_ON, "ibm,int-on");
if (rc < 0) {
error_setg(errp, "kvmppc_define_rtas_kernel_token: ibm,int-on");
goto fail;
}
rc = kvmppc_define_rtas_kernel_token(icpkvm->int_off_token, "ibm,int-off");
rc = kvmppc_define_rtas_kernel_token(RTAS_IBM_INT_OFF, "ibm,int-off");
if (rc < 0) {
error_setg(errp, "kvmppc_define_rtas_kernel_token: ibm,int-off");
goto fail;

View File

@@ -66,7 +66,7 @@ static void ipack_device_unrealize(DeviceState *dev, Error **errp)
return;
}
qemu_free_irqs(idev->irq);
qemu_free_irqs(idev->irq, 2);
}
static Property ipack_device_props[] = {

View File

@@ -146,7 +146,13 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
uint64_t new_addr, ret = 0;
uint64_t address_space_end = address_space_start + address_space_size;
assert(address_space_end > address_space_size);
if (!address_space_size) {
error_setg(errp, "memory hotplug is not enabled, "
"please add maxmem option");
goto out;
}
assert(address_space_end > address_space_start);
object_child_foreach(qdev_get_machine(), pc_dimm_built_list, &list);
if (hint) {
@@ -246,6 +252,12 @@ static void pc_dimm_realize(DeviceState *dev, Error **errp)
error_setg(errp, "'" PC_DIMM_MEMDEV_PROP "' property is not set");
return;
}
if (dimm->node >= nb_numa_nodes) {
error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
PRIu32 "' which exceeds the number of numa nodes: %d",
dimm->node, nb_numa_nodes);
return;
}
}
static MemoryRegion *pc_dimm_get_memory_region(PCDIMMDevice *dimm)

View File

@@ -792,9 +792,23 @@ static int64_t load_kernel (void)
loaderparams.kernel_filename);
exit(1);
}
/* Sanity check where the kernel has been linked */
if (kvm_enabled()) {
if (kernel_entry & 0x80000000ll) {
error_report("KVM guest kernels must be linked in useg. "
"Did you forget to enable CONFIG_KVM_GUEST?");
exit(1);
}
xlate_to_kseg0 = cpu_mips_kvm_um_phys_to_kseg0;
} else {
if (!(kernel_entry & 0x80000000ll)) {
error_report("KVM guest kernels aren't supported with TCG. "
"Did you unintentionally enable CONFIG_KVM_GUEST?");
exit(1);
}
xlate_to_kseg0 = cpu_mips_phys_to_kseg0;
}
@@ -1028,7 +1042,7 @@ void mips_malta_init(MachineState *machine)
fl_idx++;
if (kernel_filename) {
ram_low_size = MIN(ram_size, 256 << 20);
/* For KVM T&E we reserve 1MB of RAM for running bootloader */
/* For KVM we reserve 1MB of RAM for running bootloader */
if (kvm_enabled()) {
ram_low_size -= 0x100000;
bootloader_run_addr = 0x40000000 + ram_low_size;
@@ -1052,10 +1066,10 @@ void mips_malta_init(MachineState *machine)
bootloader_run_addr, kernel_entry);
}
} else {
/* The flash region isn't executable from a KVM T&E guest */
/* The flash region isn't executable from a KVM guest */
if (kvm_enabled()) {
error_report("KVM enabled but no -kernel argument was specified. "
"Booting from flash is not supported with KVM T&E.");
"Booting from flash is not supported with KVM.");
exit(1);
}
/* Load firmware from flash. */

View File

@@ -135,9 +135,9 @@ CBus *cbus_init(qemu_irq dat)
CBusPriv *s = (CBusPriv *) g_malloc0(sizeof(*s));
s->dat_out = dat;
s->cbus.clk = qemu_allocate_irqs(cbus_clk, s, 1)[0];
s->cbus.dat = qemu_allocate_irqs(cbus_dat, s, 1)[0];
s->cbus.sel = qemu_allocate_irqs(cbus_sel, s, 1)[0];
s->cbus.clk = qemu_allocate_irq(cbus_clk, s, 0);
s->cbus.dat = qemu_allocate_irq(cbus_dat, s, 0);
s->cbus.sel = qemu_allocate_irq(cbus_sel, s, 0);
s->sel = 1;
s->clk = 0;

Some files were not shown because too many files have changed in this diff Show More