Compare commits

..

1861 Commits

Author SHA1 Message Date
Peter Maydell
45e1611de8 Update version for v2.2.0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-12-09 12:13:37 +00:00
Peter Maydell
d00e6cddc2 Update version for v2.2.0-rc5 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-12-04 15:51:22 +00:00
Peter Maydell
54f3a180a3 Merge remote-tracking branch 'remotes/kraxel/tags/pull-cve-2014-8106-20141204-1' into staging
cirrus: fix blit region check

# gpg: Signature made Thu 04 Dec 2014 11:54:57 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-cve-2014-8106-20141204-1:
  cirrus: don't overflow CirrusVGAState->cirrus_bltbuf
  cirrus: fix blit region check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-12-04 12:22:46 +00:00
Peter Maydell
0d7954c288 Update version for v2.2.0-rc4 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-12-01 13:35:26 +00:00
Gonglei
b19ca18802 vhost: Fix vhostfd leak in error branch
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1417166789-1960-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-12-01 12:29:35 +00:00
Gerd Hoffmann
bf25983345 cirrus: don't overflow CirrusVGAState->cirrus_bltbuf
This is CVE-2014-8106.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-12-01 10:25:46 +01:00
Gerd Hoffmann
d3532a0db0 cirrus: fix blit region check
Issues:
 * Doesn't check pitches correctly in case it is negative.
 * Doesn't check width at all.

Turn macro into functions while being at it, also factor out the check
for one region which we then can simply call twice for src + dst.

This is CVE-2014-8106.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-01 10:25:12 +01:00
David Gibson
db12451dec Fix for crash after migration in virtio-rng on bi-endian targets
VirtIO devices now remember which endianness they're operating in in order
to support targets which may have guests of either endianness, such as
powerpc.  This endianness state is transferred in a subsection of the
virtio device's information.

With virtio-rng this can lead to an abort after a loadvm hitting the
assert() in virtio_is_big_endian().  This can be reproduced by doing a
migrate and load from file on a bi-endian target with a virtio-rng device.
The actual guest state isn't particularly important to triggering this.

The cause is that virtio_rng_load_device() calls virtio_rng_process() which
accesses the ring and thus needs the endianness.  However,
virtio_rng_process() is called via virtio_load() before it loads the
subsections.  Essentially the ->load callback in VirtioDeviceClass should
only be used for actually reading the device state from the stream, not for
post-load re-initialization.

This patch fixes the bug by moving the virtio_rng_process() after the call
to virtio_load().  Better yet would be to convert virtio to use vmsd and
have the virtio_rng_process() as a post_load callback, but that's a bigger
project for another day.

This is bugfix, and should be considered for the 2.2 branch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-id: 1417067290-20715-1-git-send-email-david@gibson.dropbear.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-28 13:06:00 +00:00
Jason Wang
771b6ed37e virtio-net: fix unmap leak
virtio_net_handle_ctrl() and other functions that process control vq
request call iov_discard_front() which will shorten the iov. This will
lead unmapping in virtqueue_push() leaks mapping.

Fixes this by keeping the original iov untouched and using a temp variable
in those functions.

Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1417082643-23907-1-git-send-email-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-28 10:29:20 +00:00
Marcel Apfelbaum
4cae4d5aca hmp: fix regression of HMP device_del auto-completion
The commits:
 - 6a1fa9f5 (monitor: add del completion for peripheral device)
 - 66e56b13 (qdev: add qdev_build_hotpluggable_device_list helper)

cause a QEMU crash when trying to use HMP device_del auto-completion.
It can be easily reproduced by:
    <qemu-bin> -enable-kvm  ~/images/fedora.qcow2 -monitor stdio -device virtio-net-pci,id=vnet

    (qemu) device_del
    /home/mapfelba/git/upstream/qemu/hw/core/qdev.c:941:qdev_build_hotpluggable_device_list: Object 0x7f6ce04e4fe0 is not an instance of type device
    Aborted (core dumped)

The root cause is qdev_build_hotpluggable_device_list going recursively over
all peripherals and their children assuming all are devices. It doesn't work
since PCI devices have at least on child which is a memory region (bus master).

Solved by observing that all devices appear as direct children of
/machine/peripheral container. No need of going recursively
over all the children.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reported-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1417002601-20799-1-git-send-email-marcel.a@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-27 14:36:20 +00:00
Peter Maydell
490309fcfb qemu-timer: Avoid overflows when converting timeout to struct timespec
In qemu_poll_ns(), when we convert an int64_t nanosecond timeout into
a struct timespec, we may accidentally run into overflow problems if
the timeout is very long. This happens because the tv_sec field is a
time_t, which is signed, so we might end up setting it to a negative
value by mistake. This will result in what was intended to be a
near-infinite timeout turning into an instantaneous timeout, and we'll
busy loop. Cap the maximum timeout at INT32_MAX seconds (about 68 years)
to avoid this problem.

This specifically manifested on ARM hosts as an extreme slowdown on
guest shutdown (when the guest reprogrammed the PL031 RTC to not
generate alarms using a very long timeout) but could happen on other
hosts and guests too.

Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1416939705-1272-1-git-send-email-peter.maydell@linaro.org
2014-11-27 11:31:58 +00:00
Peter Maydell
3ef4ebcc5c Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
The final 2.2 patches from me.

# gpg: Signature made Wed 26 Nov 2014 11:12:25 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  s390x/kvm: Fix compile error
  fw_cfg: fix boot order bug when dynamically modified via QOM
  -machine vmport=auto: Fix handling of VMWare ioport emulation for xen

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-26 12:18:00 +00:00
Christian Borntraeger
dc622deb2d s390x/kvm: Fix compile error
commit a2b257d621 "memory: expose alignment used for allocating RAM
as MemoryRegion API" triggered a compile error on KVM/s390x.

Fix the prototype and the implementation of legacy_s390_alloc.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-26 12:11:27 +01:00
Gonglei
f3b3766899 fw_cfg: fix boot order bug when dynamically modified via QOM
When we dynamically modify boot order, the length of
boot order will be changed, but we don't update
s->files->f[i].size with new length. This casuse
seabios read a wrong vale of qemu cfg file about
bootorder.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-26 12:11:27 +01:00
Don Slutz
d1048bef9d -machine vmport=auto: Fix handling of VMWare ioport emulation for xen
c/s 9b23cfb76b

or

c/s b154537ad0

moved the testing of xen_enabled() from pc_init1() to
pc_machine_initfn().

xen_enabled() does not return the correct value in
pc_machine_initfn().

Changed vmport from a bool to an enum.  Added the value "auto" to do
the old way.  Move check of xen_enabled() back to pc_init1().

Acked-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-26 12:11:27 +01:00
Peter Maydell
2528043f1f Update version for v2.2.0-rc3 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-25 18:23:54 +00:00
Gerd Hoffmann
df5b2adb73 input: move input-send-event into experimental namespace
Ongoing discussions on how we are going to specify the console,
so tag the command as experiental so we can refine things in
the 2.3 development cycle.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1416923657-10614-1-git-send-email-armbru@redhat.com
[Spell out "not a stable API", and x- the QAPI schema, too]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-25 17:03:31 +00:00
Peter Maydell
ca6028185d Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, misc bugfixes

A bunch of bugfixes for 2.2.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 24 Nov 2014 18:59:47 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pc: acpi: mark all possible CPUs as enabled in SRAT
  pcie: fix improper use of negative value
  pcie: fix typo in pcie_cap_deverr_init()
  target-i386: move generic memory hotplug methods to DSDTs
  acpi-build: mark RAM dirty on table update
  hw/pci: fix crash on shpc error flow
  pc: count in 1Gb hugepage alignment when sizing hotplug-memory container
  pc: explicitly check maxmem limit when adding DIMM
  pc: pc-dimm: use backend alignment during address auto allocation
  pc: align DIMM's address/size by backend's alignment value
  memory: expose alignment used for allocating RAM as MemoryRegion API
  pc: limit DIMM address and size to page aligned values
  pc: make pc_dimm_plug() more readble
  pc: kvm: check if KVM has free memory slots to avoid abort()
  qemu-char: fix tcp_get_fds

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-24 19:31:50 +00:00
Igor Mammedov
dd0247e09a pc: acpi: mark all possible CPUs as enabled in SRAT
If QEMU is started with  -numa ... Windows only notices that
CPU has been hot-added but it will not online such CPUs.

It's caused by the fact that possible CPUs are flagged as
not enabled in SRAT and Windows honoring that information
doesn't use corresponding CPU.

ACPI 5.0 Spec regarding to flag says:
"
Table 5-47 Local APIC Flags
...
Enabled: if zero, this processor is unusable, and the operating system
support will not attempt to use it.
"

Fix QEMU to adhere to spec and mark possible CPUs as enabled
in SRAT.

With that Windows onlines hot-added CPUs as expected.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:11 +02:00
Gonglei
6c150fbd34 pcie: fix improper use of negative value
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:11 +02:00
Gonglei
8e815eeefe pcie: fix typo in pcie_cap_deverr_init()
Reported-by:
 https://bugs.launchpad.net/qemu/+bug/1393440

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:10 +02:00
Paolo Bonzini
4f99ab7a78 target-i386: move generic memory hotplug methods to DSDTs
This makes it simpler to keep the SSDT byte-for-byte identical for a
given machine type, which is a goal we want to have for 2.2 and newer
types.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:10 +02:00
Michael S. Tsirkin
ad5b88b1f1 acpi-build: mark RAM dirty on table update
acpi build modifies internal FW CFG RAM on first access
but we forgot to mark it dirty.
If this RAM has been migrated already, it won't be
migrated again, returning corrupted tables to guest.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:10 +02:00
Marcel Apfelbaum
109e90e470 hw/pci: fix crash on shpc error flow
If the pci bridge enters in error flow as part
of init process it will only delete the shpc mmio
subregion but not remove it from the properties list,
resulting in segmentation fault when the bridge runs
the exit function.

Example: add a pci bridge without specifing the chassis number:
    <qemu-bin> ... -device pci-bridge,id=p1
Result:
    (qemu) qemu-system-x86_64: -device pci-bridge,id=p1: Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0.
    qemu-system-x86_64: -device pci-bridge,id=p1: Device
    initialization failed.
    Segmentation fault (core dumped)

    if (child->class->unparent) {
    #0  0x00005555558d629b in object_finalize_child_property (obj=0x555556d2e830, name=0x555556d30630 "shpc-mmio[0]", opaque=0x555556a42fc8) at qom/object.c:1078
    #1  0x00005555558d4b1f in object_property_del_all (obj=0x555556d2e830) at qom/object.c:367
    #2  0x00005555558d4ca1 in object_finalize (data=0x555556d2e830) at qom/object.c:412
    #3  0x00005555558d55a1 in object_unref (obj=0x555556d2e830) at qom/object.c:720
    #4  0x000055555572c907 in qdev_device_add (opts=0x5555563544f0) at qdev-monitor.c:566
    #5  0x0000555555744f16 in device_init_func (opts=0x5555563544f0, opaque=0x0) at vl.c:2213
    #6  0x00005555559cf5f0 in qemu_opts_foreach (list=0x555555e0f8e0 <qemu_device_opts>, func=0x555555744efa <device_init_func>, opaque=0x0, abort_on_failure=1) at util/qemu-option.c:1057
    #7  0x000055555574a11b in main (argc=16, argv=0x7fffffffdde8, envp=0x7fffffffde70) at vl.c:423

Unparent the shpc mmio region as part of shpc cleanup.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
2014-11-24 20:57:10 +02:00
Igor Mammedov
085f8e88ba pc: count in 1Gb hugepage alignment when sizing hotplug-memory container
if DIMMs with different size/alignment are interleaved
in creation order, it could lead to hotplug-memory
container fragmentation and following inability to use
all RAM upto maxmem.
For example:
    -m 4G,slots=3,maxmem=7G
    -object memory-backend-file,id=mem-1,size=256M,mem-path=/pagesize-2MB
    -device pc-dimm,id=mem1,memdev=mem-1
    -object memory-backend-file,id=mem-2,size=1G,mem-path=/pagesize-1GB
    -device pc-dimm,id=mem2,memdev=mem-2
    -object memory-backend-file,id=mem-3,size=256M,mem-path=/pagesize-2MB
    -device pc-dimm,id=mem3,memdev=mem-3

fragments hotplug-memory container and doesn't allow
to use 1GB hugepage backend to consume remainig 1Gb.

To ease managment factor count in max 1Gb alignment for
each memory slot when sizing hotplug-memory region so
that regadless of fragmentaion it would be possible to
add max aligned DIMM.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:10 +02:00
Igor Mammedov
b03541fa77 pc: explicitly check maxmem limit when adding DIMM
Currently maxmem limit is not checked and depends on
hotplug region container not being able to fit more RAM
than maxmem. Do check explicitly so that it would
be possible to change hotplug container size later
to deal with fragmentation.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-24 20:57:10 +02:00
Peter Maydell
3d4a70f80f Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.2.0-rc3

# gpg: Signature made Mon 24 Nov 2014 12:52:23 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream:
  Revert "qemu-img info: show nocow info"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-24 15:01:54 +00:00
Peter Maydell
a31a7475e9 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Three patches to fix ExtINT for the QEMU implementation of the local APIC.

# gpg: Signature made Mon 24 Nov 2014 13:38:36 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  apic: fix incorrect handling of ExtINT interrupts wrt processor priority
  apic: fix loss of IPI due to masked ExtINT
  apic: avoid getting out of halted state on masked PIC interrupts

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-24 13:50:22 +00:00
Paolo Bonzini
5224c88dd3 apic: fix incorrect handling of ExtINT interrupts wrt processor priority
This fixes another failure with ExtINT, demonstrated by QNX.  The failure
mode is as follows:
- IPI sent to cpu 0 (bit set in APIC irr)
- IPI accepted by cpu 0 (bit cleared in irr, set in isr)
- IPI sent to cpu 0 (bit set in both irr and isr)
- PIC interrupt sent to cpu 0

The PIC interrupt causes CPU_INTERRUPT_HARD to be set, but
apic_irq_pending observes that the highest pending APIC interrupt priority
(the IPI) is the same as the processor priority (since the IPI is still
being handled), so apic_get_interrupt returns a spurious interrupt rather
than the pending PIC interrupt. The result is an endless sequence of
spurious interrupts, since nothing will clear CPU_INTERRUPT_HARD.

Instead, ExtINT interrupts should have ignored the processor priority.
Calling apic_check_pic early in apic_get_interrupt ensures that
apic_deliver_pic_intr is called instead of delivering the spurious
interrupt.  apic_deliver_pic_intr then clears CPU_INTERRUPT_HARD if needed.

Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24 14:37:45 +01:00
Paolo Bonzini
8092cb7132 apic: fix loss of IPI due to masked ExtINT
This patch fixes an obscure failure of the QNX kernel on QEMU x86 SMP.
In QNX, all hardware interrupts come via the PIC, and are delivered by
the cpu 0 LAPIC in ExtINT mode, while IPIs are delivered by the LAPIC
in fixed mode.

This bug happens as follows:
- cpu 0 masks a particular PIC interrupt
- IPI sent to cpu 0 (CPU_INTERRUPT_HARD is set)
- before the IPI is accepted, the masked interrupt line is asserted by the
device

Since the interrupt is masked, apic_deliver_pic_intr will clear
CPU_INTERRUPT_HARD. The IPI will still be set in the APIC irr, but since
CPU_INTERRUPT_HARD is not set the cpu will not notice. Depending on the
scenario this can cause a system hang, i.e. if cpu 0 is expected to unmask
the interrupt.

In order to fix this, do a full check of the APIC before an EXTINT
is acknowledged.  This can result in clearing CPU_INTERRUPT_HARD, but
can also result in delivering the lost IPI.

Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24 14:37:40 +01:00
Paolo Bonzini
60e68042cf apic: avoid getting out of halted state on masked PIC interrupts
After the next patch, if a masked PIC interrupts causes CPU_INTERRUPT_POLL
to be set, the CPU will spuriously get out of halted state.  While this
is technically valid, we should avoid that.

Make CPU_INTERRUPT_POLL run apic_update_irq in the right thread and then
look at CPU_INTERRUPT_HARD.  If CPU_INTERRUPT_HARD does not get set,
do not report the CPU as having work.

Also move the handling of software-disabled APIC from apic_update_irq
to apic_irq_pending, and always trigger CPU_INTERRUPT_POLL.  This will
be important once we will add a case that resets CPU_INTERRUPT_HARD
from apic_update_irq.  We want to run it even if we go through
CPU_INTERRUPT_POLL, and even if the local APIC is software disabled.

Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24 14:37:30 +01:00
Kevin Wolf
24bf10dac3 Revert "qemu-img info: show nocow info"
This reverts commit 000c4dfff4.

The main reason for reverting this commit before the 2.2 release is that
it adds a QAPI interface that we don't want to keep: The 'nocow' flag
doesn't generally make sense for block nodes, but only for the raw-posix
driver. It should therefore be part of ImageInfoSpecific rather than
ImageInfo.

The commit contains more problems, but unlike the API stability issue
they wouldn't justify reverting it.

Conflicts:
	block/qapi.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-24 13:52:10 +01:00
Igor Mammedov
0c0de1b681 pc: pc-dimm: use backend alignment during address auto allocation
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:12:46 +02:00
Igor Mammedov
91aa70ab2a pc: align DIMM's address/size by backend's alignment value
Performance wise it's better to align GVA by the backend's
page size.

Also do not allow to create DIMM device with suboptimal
size (i.e. not aligned to backends page size) to aviod
memory loss.

Do above only for 2.2 and newer machine types to avoid
breaking working configs with 2.1 machine type.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:12:39 +02:00
Igor Mammedov
a2b257d621 memory: expose alignment used for allocating RAM as MemoryRegion API
introduce memory_region_get_alignment() that returns
underlying memory block alignment or 0 if it's not
relevant/implemented for backend.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:30 +02:00
Igor Mammedov
92a37a04d6 pc: limit DIMM address and size to page aligned values
When running in KVM mode, kvm_set_phys_mem() will silently
fail if registered MemoryRegion address/size is not page
aligned. Causing memory hotplug failure in guest.

Mapping non aligned MemoryRegion in TCG mode 'works', but
sane guest OS still expects page aligned memory module
and fails to initialize it if it's not aligned.

So do not allow non aligned (i.e. valid) address/size
values for DIMM to avoid either KVM failure or guest
issues caused by it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:30 +02:00
Igor Mammedov
34dde13685 pc: make pc_dimm_plug() more readble
split addr initialization from declaration so that
later when new local vars are added property getter
wouldn't drift off of error check.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:30 +02:00
Igor Mammedov
b8865591d4 pc: kvm: check if KVM has free memory slots to avoid abort()
When more memory devices are used than available
KVM memory slots, QEMU crashes with:

kvm_alloc_slot: no free slot available
Aborted (core dumped)

Fix this by checking that KVM has a free slot before
attempting to map memory in guest address space.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:29 +02:00
Michael S. Tsirkin
c409572678 qemu-char: fix tcp_get_fds
tcp_get_fds API discards fds if there's more than 1 of these.

It's tricky to fix this without API changes in the generic case.

However, this API is only used by tests ATM, and tests know how
many fds they expect.

So let's not waste cycles trying to fix this properly:
simply assume at most 16 fds (tests use at most 8 now).
assert if some test tries to get more.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:29 +02:00
Peter Maydell
0e88f47850 Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Fri 21 Nov 2014 11:12:37 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  rtl8139: fix Pointer to local outside scope
  pcnet: fix Negative array index read
  net/socket: fix Uninitialized scalar variable
  net/slirp: fix memory leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-21 14:15:37 +00:00
Peter Maydell
a00c117338 Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20141121-1' into staging
gtk: two bugfixes for 2.2.

# gpg: Signature made Fri 21 Nov 2014 07:38:45 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20141121-1:
  gtk: Don't crash if -nodefaults
  gtk: fix possible memory leak about local_err

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-21 13:22:18 +00:00
Gonglei
b0af844007 rtl8139: fix Pointer to local outside scope
Coverity spot:
 Assigning: iov = struct iovec [3]({{buf, 12UL},
                       {(void *)dot1q_buf, 4UL},
                       {buf + 12, size - 12}})
 (address of temporary variable of type struct iovec [3]).
 out_of_scope: Temporary variable of type struct iovec [3] goes out of scope.

Pointer to local outside scope (RETURN_LOCAL)
use_invalid:
 Using iov, which points to an out-of-scope temporary variable of type struct iovec [3].

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-21 10:50:54 +00:00
Gonglei
7b50d00911 pcnet: fix Negative array index read
s->xmit_pos maybe assigned to a negative value (-1),
but in this branch variable s->xmit_pos as an index to
array s->buffer. Let's add a check for s->xmit_pos.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-21 10:50:54 +00:00
Gonglei
8db804ac41 net/socket: fix Uninitialized scalar variable
If is_connected parameter is false, the saddr
variable will no initialize. Coverity report:
uninit_use: Using uninitialized value saddr.sin_port.

We don't need add saddr information to nc->info_str
when is_connected is false.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-21 10:50:54 +00:00
Gonglei
7a8919dc29 net/slirp: fix memory leak
commit b412eb61 introduce 'cmd:' target for guestfwd,
and fwd don't be used in this scenario, and will leak
memory in true branch with 'cmd:'. Let's allocate memory
for fwd variable just in else statement.

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-21 10:50:54 +00:00
Fam Zheng
b310a2a609 gtk: Don't crash if -nodefaults
This fixes a crash by just skipping the vte resize hack if cur is NULL.

Reproducer:

qemu-system-x86_64 -nodefaults

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-21 08:37:59 +01:00
zhanghailiang
8a0f9b5263 gtk: fix possible memory leak about local_err
local_err in gd_vc_gfx_init() is not freed, and we don't use it,
so remove it.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-21 08:37:59 +01:00
Leif Lindholm
9c7074da5e hw/arm/virt: set stdout-path instead of linux,stdout-path
ePAPR 1.1 defines the stdout-path property, making the os-specific
linux,stdout-path property redundant. Change the DT setup for ARM virt
to use the generic property - supported by Linux since 3.15.

The old QEMU behaviour was not present in any released version of
QEMU, and was only added to QEMU after the kernel changed, so
this should not break any existing setups.

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
[PMM: add note to commit about the old behaviour never hving been
in a released version of QEMU]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-20 14:58:37 +00:00
Peter Maydell
ff323a6b54 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-11-20

Hopefully the last few fixups for 2.2:

  - KVM memory slot fix (should usually only occur on PPC)
  - e300 fix
  - Altivec mtvscr instruction fix

# gpg: Signature made Thu 20 Nov 2014 13:53:34 GMT using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"

* remotes/agraf/tags/signed-ppc-for-upstream:
  target-ppc: Altivec's mtvscr Decodes Wrong Register
  kvm: Fix memory slot page alignment logic
  target-ppc: Fix breakpoint registers for e300

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-20 14:02:24 +00:00
Tom Musta
76cb658419 target-ppc: Altivec's mtvscr Decodes Wrong Register
The Move to Vector Status and Control Register (mtvscr) instruction
uses VRB as the source register.  Fix the code generator to correctly
decode the VRB field.  That is, use "rB(ctx->opcode)" instead of
"rD(ctx->opcode)".

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-20 14:52:01 +01:00
Alexander Graf
f2a64032a1 kvm: Fix memory slot page alignment logic
Memory slots have to be page aligned to get entered into KVM. There
is existing logic that tries to ensure that we pad memory slots that
are not page aligned to the biggest region that would still fit in the
alignment requirements.

Unfortunately, that logic is broken. It tries to calculate the start
offset based on the region size.

Fix up the logic to do the thing it was intended to do and document it
properly in the comment above it.

With this patch applied, I can successfully run an e500 guest with more
than 3GB RAM (at which point RAM starts overlapping subpage memory regions).

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-20 14:52:01 +01:00
Fabien Chouteau
3ade1a055c target-ppc: Fix breakpoint registers for e300
In the previous patch, the registers were added to init_proc_G2LE
instead of init_proc_e300.

Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-20 14:52:01 +01:00
Peter Maydell
f75ad80f6c Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2-2' into staging
Fix from a while back that unfortunately got ignored.  Dave Gilbert says
it may actually fix a case where autoconverge would break on a repeat
migration (and not just fix stats).

# gpg: Signature made Thu 20 Nov 2014 12:52:41 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/for-2.2-2:
  migration: static variables will not be reset at second migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-20 13:00:28 +00:00
ChenLiang
6c1b663c4c migration: static variables will not be reset at second migration
The static variables in migration_bitmap_sync will not be reset in
the case of a second attempted migration.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-11-20 18:17:22 +05:30
Peter Maydell
af3ff19b48 Update version for v2.2.0-rc2 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 18:00:58 +00:00
Don Slutz
6b896ab261 hw/ide/core.c: Prevent SIGSEGV during migration
The other callers to blk_set_enable_write_cache() in this file
already check for s->blk == NULL.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416259239-13281-1-git-send-email-dslutz@verizon.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 17:36:14 +00:00
Peter Maydell
8336e465ac Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Tue 18 Nov 2014 15:04:53 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  net: The third parameter of getsockname should be initialized

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 16:17:32 +00:00
Peter Maydell
b1b1e81fb5 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 18 Nov 2014 15:04:14 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  Tracing: Fix simpletrace.py error on tcg enabled binary traces
  Tracing docs fix configure option and description

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 15:05:36 +00:00
zhanghailiang
ed6273e26f net: The third parameter of getsockname should be initialized
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-18 15:04:35 +00:00
Christoph Seifert
776ec96f79 Tracing: Fix simpletrace.py error on tcg enabled binary traces
simpletrace.py does not recognize the tcg option while reading trace-events  file. In result simpletrace does not work on binary traces and tcg enabled events. Moved transformation of tcg enabled events to _read_events() which is used by simpletrace.

Signed-off-by: Christoph Seifert <christoph.seifert@posteo.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-18 14:05:58 +00:00
Dr. David Alan Gilbert
b73e8bd414 Tracing docs fix configure option and description
Fix the example trace configure option.
Update the text to say that multiple backends are allowed and what
happens when multiple backends are enabled.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1412691161-31785-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-18 14:05:54 +00:00
Peter Maydell
1ab8f867ef Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.2.0-rc2

# gpg: Signature made Tue 18 Nov 2014 11:32:55 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream:
  block/raw-posix: Catch fsync() errors
  block/raw-posix: Only sync after successful preallocation
  block/raw-posix: Fix preallocating write() loop
  raw-posix: The SEEK_HOLE code is flawed, rewrite it
  raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
  raw-posix: Fix comment for raw_co_get_block_status()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 13:43:37 +00:00
Peter Maydell
ea5b201a0a Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2' into staging
Fix for CVE-2014-7840, avoiding arbitrary qemu memory overwrite for
migration by Michael S. Tsirkin.

# gpg: Signature made Tue 18 Nov 2014 11:23:00 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/for-2.2:
  migration: fix parameter validation on ram load

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 12:29:05 +00:00
Ard Biesheuvel
444b1996cb linux-headers: update to 3.18-rc5
This updates the Linux header to version 3.18-rc5, adding support for
(among other things) read-only memslots on ARM and arm64.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1416248898-6302-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18 11:24:31 +00:00
Michael S. Tsirkin
0be839a270 migration: fix parameter validation on ram load
During migration, the values read from migration stream during ram load
are not validated. Especially offset in host_from_stream_offset() and
also the length of the writes in the callers of said function.

To fix this, we need to make sure that the [offset, offset + length]
range fits into one of the allocated memory regions.

Validating addr < len should be sufficient since data seems to always be
managed in TARGET_PAGE_SIZE chunks.

Fixes: CVE-2014-7840

Note: follow-up patches add extra checks on each block->host access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-11-18 16:49:44 +05:30
Max Reitz
098ffa6674 block/raw-posix: Catch fsync() errors
fsync() may fail, and that case should be handled.

Reported-by: László Érsek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18 12:09:00 +01:00
Max Reitz
731de38052 block/raw-posix: Only sync after successful preallocation
The loop which filled the file with zeroes may have been left early due
to an error. In that case, the fsync() should be skipped.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18 12:09:00 +01:00
Max Reitz
39411cf3c3 block/raw-posix: Fix preallocating write() loop
write() may write less bytes than requested; in this case, the number of
bytes written is returned. This is the byte count we should be
subtracting from the number of bytes still to be written, and not the
byte count we requested to write.

Reported-by: László Érsek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18 12:08:59 +01:00
Peter Maydell
f874bf905f exec: Handle multipage ranges in invalidate_and_set_dirty()
The code in invalidate_and_set_dirty() needs to handle addr/length
combinations which cross guest physical page boundaries. This can happen,
for example, when disk I/O reads large blocks into guest RAM which previously
held code that we have cached translations for. Unfortunately we were only
checking the clean/dirty status of the first page in the range, and then
were calling a tb_invalidate function which only handles ranges that don't
cross page boundaries. Fix the function to deal with multipage ranges.

The symptoms of this bug were that guest code would misbehave (eg segfault),
in particular after a guest reboot but potentially any time the guest
reused a page of its physical RAM for new code.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
2014-11-18 10:19:12 +00:00
Kevin Wolf
8676785302 Merge remote-tracking branch 'mreitz/block' into queue-block
* mreitz/block:
  raw-posix: The SEEK_HOLE code is flawed, rewrite it
  raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
  raw-posix: Fix comment for raw_co_get_block_status()
2014-11-18 11:01:05 +01:00
Markus Armbruster
d1f06fe665 raw-posix: The SEEK_HOLE code is flawed, rewrite it
On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris,
but not Linux), try_seek_hole() reports trailing data instead.

Additionally, unlikely lseek() failures are treated badly:

* When SEEK_HOLE fails, try_seek_hole() reports trailing data.  For
  -ENXIO, there's in fact a trailing hole.  Can happen only when
  something truncated the file since we opened it.

* When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds,
  then try_seek_hole() reports a trailing hole.  This is okay only
  when SEEK_DATA failed with -ENXIO (which means the non-trailing hole
  found by SEEK_HOLE has since become trailing somehow).  For other
  failures (unlikely), it's wrong.

* When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely),
  then try_seek_hole() reports bogus data [-1,start), which its caller
  raw_co_get_block_status() turns into zero sectors of data.  Could
  theoretically lead to infinite loops in code that attempts to scan
  data vs. hole forward.

Rewrite from scratch, with very careful comments.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18 09:45:48 +01:00
Markus Armbruster
c4875e5b22 raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
Commit 5500316 (May 2012) implemented raw_co_is_allocated() as
follows:

1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl

2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek()

3. Else pretend there are no holes

Later on, raw_co_is_allocated() was generalized to
raw_co_get_block_status().

Commit 4f11aa8 (May 2014) changed it to try the three methods in order
until success, because "there may be implementations which support
[SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice
versa."

Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC.
Commit 38c4d0a (Sep 2014) added it.  Because that's a significant
speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first.

As you see, the obvious use of FIEMAP is wrong, and the correct use is
slow.  I guess this puts it somewhere between -7 "The obvious use is
wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard
to Misuse scale[*].

"Fortunately", the FIEMAP code is used only when

* SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is

  Uncommon.  SEEK_HOLE had no XFS implementation between 2011 (when it
  was introduced for ext4 and btrfs) and 2012.

* SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails

  Unlikely.

Thus, the FIEMAP code executes rarely.  Makes it a nice hidey-hole for
bugs.  Worse, bugs hiding there can theoretically bite even on a host
that has SEEK_HOLE/SEEK_DATA.

I don't want to worry about this crap, not even theoretically.  Get
rid of it.

[*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18 09:45:35 +01:00
Markus Armbruster
be2ebc6dad raw-posix: Fix comment for raw_co_get_block_status()
Missed in commit 705be72.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18 09:44:02 +01:00
Peter Maydell
d6be29e3fb target-arm: handle address translations that start at level 3
The ARMv8 address translation system defines that a page table walk
starts at a level which depends on the translation granule size
and the number of bits of virtual address that need to be resolved.
Where the translation granule is 64KB and the guest sets the
TCR.TxSZ field to between 35 and 39, it's actually possible to
start at level 3 (the final level). QEMU's implementation failed
to handle this case, and so we would set level to 2 and behave
incorrectly (including invoking the C undefined behaviour of
shifting left by a negative number). Correct the code that
determines the starting level to deal with the start-at-3 case,
by replacing the if-else ladder with an expression derived from
the ARM ARM pseudocode version.

This error was detected by the Coverity scan, which spotted
the potential shift by a negative number.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1415890569-7454-1-git-send-email-peter.maydell@linaro.org
2014-11-17 19:30:28 +00:00
Peter Maydell
1aba4be97e Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
A smattering of fixes for problems that Coverity reported.

# gpg: Signature made Mon 17 Nov 2014 17:03:25 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  hcd-musb: fix dereference null return value
  target-cris/translate.c: fix out of bounds read
  shpc: fix error propaagation
  qemu-char: fix MISSING_COMMA
  acl: fix memory leak
  nvme: remove superfluous check
  loader: fix NEGATIVE_RETURNS
  qga: fix false negative argument passing
  mips_mipssim: fix use-after-free for filename
  l2tpv3: fix fd leak
  l2tpv3: fix possible double free
  libcacard: fix resource leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-17 17:22:03 +00:00
Paolo Bonzini
a9be76576e hcd-musb: fix dereference null return value
usb_ep_get and usb_handle_packet can deal with a NULL device, but we have
to avoid dereferencing NULL pointers when building the id.

Thanks to Gonglei for an initial stab at fixing this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 18:02:31 +01:00
Peter Maydell
d8edf52a51 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Sat 15 Nov 2014 13:12:02 GMT using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-17 15:37:10 +00:00
zhanghailiang
fae38221e7 target-cris/translate.c: fix out of bounds read
In function t_gen_mov_TN_preg and t_gen_mov_preg_TN, The begin check about the
validity of in-parameter 'r' is useless. We still access cpu_PR[r] in the
follow code if it is invalid. Which will be an out-of-bounds read error.

Fix it by using assert() to ensure it is valid before using it.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 13:59:23 +01:00
Gonglei
0e8b439ae5 shpc: fix error propaagation
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:49:19 +01:00
Gonglei
86d10328a0 qemu-char: fix MISSING_COMMA
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:49:05 +01:00
Gonglei
6cfcd864a4 acl: fix memory leak
If 'i != index' for all acl->entries, variable
entry leaks the storage it points to.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:48:56 +01:00
Gonglei
720fdd6fa9 nvme: remove superfluous check
Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
((n->bar.aqa >> AQA_ASQS_SHIFT) & AQA_ASQS_MASK) > 4095
is always false regardless of the values of its operands.
This occurs as the logical second operand of '||'.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:43:09 +01:00
Gonglei
ddd2eab72f loader: fix NEGATIVE_RETURNS
lseek will return -1 on error, g_malloc0(size) and read(,,size)
paramenters cannot be negative. We should add a check for return
value of lseek().

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:41:56 +01:00
Gonglei
1def74548d qga: fix false negative argument passing
Function send_response(s, &qdict->base) returns a negative number
when any failures occured. But strerror()'s parameter cannot be
negative. Let's change the testing condition and pass '-ret' to
strerr().

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:41:25 +01:00
Gonglei
77e205a528 mips_mipssim: fix use-after-free for filename
May pass freed pointer filename as an argument to error_report.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:41:03 +01:00
Gonglei
d4754a9531 l2tpv3: fix fd leak
In this false branch, fd will leak when it is zero.
Change the testing condition.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
[Fix net_l2tpv3_cleanup as well. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17 11:40:36 +01:00
Mark Cave-Ayland
35fb5b73a2 Update OpenBIOS images
Update OpenBIOS images to SVN r1327 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-11-15 13:01:44 +00:00
Peter Maydell
4e70f9271d Merge remote-tracking branch 'remotes/sstabellini/xen-2014-11-14' into staging
* remotes/sstabellini/xen-2014-11-14:
  xen_disk: fix unmapping of persistent grants
  pc: piix4_pm: init legacy PCI hotplug when running on Xen

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-14 12:05:33 +00:00
zhanghailiang
77374582ab l2tpv3: fix possible double free
freeaddrinfo(result) does not assign result = NULL, after frees it.
There will be a double free when it goes error case.
It is reported by covertiy.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-14 12:16:24 +01:00
zhanghailiang
5bbebf6228 libcacard: fix resource leak
In function connect_to_qemu(), getaddrinfo() will allocate memory
that is stored into server, it should be freed by using freeaddrinfo()
before connect_to_qemu() return.

Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-14 12:15:40 +01:00
Peter Maydell
b87dcdd074 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Fri 14 Nov 2014 11:05:54 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info
  block: Fix max nb_sectors in bdrv_make_zero
  ahci: factor out FIS decomposition from handle_cmd
  ahci: Check cmd_fis[1] more explicitly
  ahci: Reorder error cases in handle_cmd
  ahci: Fix FIS decomposition
  ahci: add is_ncq predicate helper
  ide: Correct handling of malformed/short PRDTs
  ahci: unify sglist preparation
  ide: repair PIO transfers for cases where nsector > 1
  ahci: Fix byte count regression for ATAPI/PIO

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-14 11:12:40 +00:00
Roger Pau Monne
2f01dfacb5 xen_disk: fix unmapping of persistent grants
This patch fixes two issues with persistent grants and the disk PV backend
(Qdisk):

 - Keep track of memory regions where persistent grants have been mapped
   since we need to unmap them as a whole. It is not possible to unmap a
   single grant if it has been batch-mapped. A new check has also been added
   to make sure persistent grants are only used if the whole mapped region
   can be persistently mapped in the batch_maps case.
 - Unmap persistent grants before switching to the closed state, so the
   frontend can also free them.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-11-14 11:12:38 +00:00
Igor Mammedov
91ab2ed722 pc: piix4_pm: init legacy PCI hotplug when running on Xen
If user starts QEMU with "-machine pc,accel=xen", then
compat property in xenfv won't work and it would cause error:
"Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set"
when PCI device is added with -device on QEMU CLI.

From: Igor Mammedov <imammedo@redhat.com>

In case of Xen instead of using compat property, just use the fact
that xen doesn't use QEMU's fw_cfg/acpi tables to switch piix4_pm
into legacy PCI hotplug mode when Xen is enabled.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Li Liang <liang.z.li@intel.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-14 11:11:44 +00:00
Fam Zheng
5f58330790 vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info
When extent types don't match, we return -ENOTSUP. In this case, be
polite to the caller and don't modify bdi.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1415938161-16217-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:45 +00:00
Fam Zheng
f3a9cfddae block: Fix max nb_sectors in bdrv_make_zero
In bdrv_rw_co we report -EINVAL for nb_sectors > INT_MAX /
BDRV_SECTOR_SIZE, so a caller shouldn't exceed it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1415603264-21497-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
107f0d4677 ahci: factor out FIS decomposition from handle_cmd
In order to make handle_cmd more readable at the macro level,
the details of how to decompose particular types of FIS packets
are left to helper functions.

In our case, the only type of FIS packet we currently expect to
see is a Register H2D FIS packet, but the gory details of its
decomposition are of no particular interest in handle_cmd.

This patch keeps the receipt of FIS packets and the decomposition
thereof separated to two different functions.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
102e56254d ahci: Check cmd_fis[1] more explicitly
Instead of checking for a known byte, inspect the
fields of this byte explicitly to produce more meaningful
error messages and improve the readability of this section.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
36ab3c3400 ahci: Reorder error cases in handle_cmd
Error checking in ahci's handle_cmd is re-ordered so that we
initialize as few things as possible before we've done our
sanity checking. This simplifies returning from this call
in case of an error.

A check to make sure the DMA memory map succeeds with the
correct size is also added, and the debug print of the
command fis is cleaned up with its size corrected.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
1cbdd96813 ahci: Fix FIS decomposition
This patch introduces a few changes to how FIS packets are
deciphered in the AHCI virtual device. The summary of
changes can be grouped into two pieces:

[A] Changes to how we apply a preliminary sieve to FISes,
[B] Changes in how we internalize a decomposed FIS.

== Changes to how we apply a preliminary sieve to FISes ==

(1) Packets may now either update the Control register or
    the Command register, but not both. This is according
    to the SATA 3.2 specification which states:
    "...the device either initiates processing of the command
    indicated in the Command register or initiates processing
    of the control request indicated [...] depending on the
    state of the C bit in the FIS."

    See SATA 3.2 section 10.5.5.4, "Reception" in the 10.5.5
    "Register Host to Device FIS" section.

    This change accounts for the first two regions of change
    within the diff. All other changes belong to the following
    changes.

== Changes in how we internalize a decomposed FIS ==

(2) Instead of trying to extract the sector number out of the
    FIS from bytes 4-10 and setting it with ide_set_sector,
    we set the appropriate IDEState registers and trust that
    ide_get_sector can retrieve the correct sector later.

    By "constructing" the sector for use with ide_set_sector,
    we are duplicating the mechanisms of ide_get_sector.
    This change makes the FIS decomposition more obvious.

    SATA 3.2 as a specification does not make the legacy
    register mapping with respect to the D2H FIS obvious.
    However, SATA 3.2 section 10.5.5.1 "Register Host to
    Device FIS layout" describes all of the "cmd_fis"
    bytes:

    0 - FIS Type (0x27)
    1 - Port Multiplier Port and Command Update flag
    2 - ATA Command
    3 - Features_Low
    4 - LBA 7:0
    5 - LBA 15:8
    6 - LBA 23:16
    7 - Device, AKA "Drive Select."
    8 - LBA 31:24
    9 - LBA 39:32
    10 - LBA 47:40
    11 - Features_High
    12 - Count Low
    13 - Count High
    14 - ICC
    15 - Control
    16-19 - Auxiliary (for NCQ, defined per-command)

    Most of these registers map to existing IDEState registers
    in obvious ways, especially features, select, hob_features,
    and nsector (count). ICC is reserved in older specifications
    but is not supported in our implementation, and remains
    unused here. The Control register is not valid for a command
    that is trying to update the command register and is to be
    considered reserved at this point.

    What is not obvious is the LBA register mappings, but SATA 1.0
    can help inform of us legacy device support, see SATA 1.0 section
    8.5.2 "Register - Host to Device."

    LBA 7:0   - Sector Number    (sector)
    LBA 15:8  - Cyl Low          (lcyl)
    LBA 23:16 - Cyl High         (hcyl)
    LBA 31:24 - Sector Num Exp.  (hob_sector)
    LBA 39:32 - Cyl Low Exp.     (hob_lcyl)
    LBA 47:40 - Cyl High Exp.    (hob_hcyl)

    These mappings help guide which registers the FIS should be decomposed
    into/towards for CHS, LBA28 and LBA48 commands.

    As a note: The prior confusion that can be seen in the documentation
    arises from the fact that CHS and LBA28 commands use the low nybble
    of the drive select register to store LBA 27:24, whereas LNA48 commands
    use the hob_sector, hob_lcyl and hob_hcyl registers as explained above.

    The decomposition as it stands now will correctly decompose CHS, LBA28
    and LBA48 commands into their appropriate registers where the core
    IDE/ATAPI layers can deal with them correctly.

    See the below point for more information.

(3) We save cmd_fis[7] as ide_state->select, which informs
    decisions about if we are using LBA or CHS.
    This corrects a bug in AHCI wherein we attempt to set and/or
    retrieve the sector number by using ide_set_sector and
    ide_get_sector, which depend on the select register to
    determine if we are using LBA or CHS.

    Without this adjustment, LBA48 read/writes are currently
    broken. Thanks to Eniac Zheng @ HP for pointing this out.

(4) Save cmd_fis[11] as ide_state->hob_feature, as defined in SATA 3.2.

(5) For several ATA commands, the sector count register set to 0
    is a magic number that means 256 sectors. For LBA48 commands,
    this means 65,536 sectors. We drop the magic sector correction
    here, and trust the ide core layer to handle the conversion
    appropriately, in ide_cmd_lba48_transform(). As it stands,
    the current AHCI code is only compliant with LBA28 commands.
    By simply removing the magic, it will work with LBA28 and LBA48.

(6) We expand FIS decomposition to include both ATAPI and IDE devices.
    We leave the logic of determining if the fields are valid or not
    to the respective layers.

    This change intends to make it clearer that AHCI is only a
    composition mechanism for the FIS packets: the meanings of
    the registers is best left to the implementation layers for
    those devices.

(7) Forcefully setting the feature, hcyl and lcyl registers for ATAPI
    commands is removed.
    - The hcyl and lcyl magic present here is valid at boot only,
      and should not be overridden for every PACKET command.
    - The feature register is defined as valid for the PACKET command,
      so we should not suppress it. The ATAPI layer does not even
      currently depend on or require 0x01 as mandatory.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
72a065dbb1 ahci: add is_ncq predicate helper
A small helper to determine which S/ATA commands
are destined to be routed to the NCQ pathways.

This references SATA 3.2 section 13.6,
Native Command Queueing. See sections 13.6.4,
13.6.5, 13.6.6, 13.6.7 and 13.6.8 for all
SATA commands considered to be part of the
NCQ feature set. This is summarized in a small
list in section 13.6.3.1 and again in 13.6.3.2.

Not all of these NCQ commands are currently supported,
so the error pathways are adjusted slightly to be more
informative in the case they are encountered.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
3251bdcf1c ide: Correct handling of malformed/short PRDTs
This impacts both BMDMA and AHCI HBA interfaces for IDE.
Currently, we confuse the difference between a PRDT having
"0 bytes" and a PRDT having "0 complete sectors."

When we receive an incomplete sector, inconsistent error checking
leads to an infinite loop wherein the call succeeds, but it
didn't give us enough bytes -- leading us to re-call the
DMA chain over and over again. This leads to, in the BMDMA case,
leaked memory for short PRDTs, and infinite loops and resource
usage in the AHCI case.

The .prepare_buf() callback is reworked to return the number of
bytes that it successfully prepared. 0 is a valid, non-error
answer that means the table was empty and described no bytes.
-1 indicates an error.

Our current implementation uses the io_buffer in IDEState to
ultimately describe the size of a prepared scatter-gather list.
Even though the AHCI PRDT/SGList can be as large as 256GiB, the
AHCI command header limits transactions to just 4GiB. ATA8-ACS3,
however, defines the largest transaction to be an LBA48 command
that transfers 65,536 sectors. With a 512 byte sector size, this
is just 32MiB.

Since our current state structures use the int type to describe
the size of the buffer, and this state is migrated as int32, we
are limited to describing 2GiB buffer sizes unless we change the
migration protocol.

For this reason, this patch begins to unify the assertions in the
IDE pathways that the scatter-gather list provided by either the
AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum,
2GiB. This should be resilient enough unless we need a sector
size that exceeds 32KiB.

Further, the likelihood of any guest operating system actually
attempting to transfer this much data in a single operation is
very slim.

To this end, the IDEState variables have been updated to more
explicitly clarify our maximum supported size. Callers to the
prepare_buf callback have been reworked to understand the new
return code, and all versions of the prepare_buf callback have
been adjusted accordingly.

Lastly, the ahci_populate_sglist helper, relied upon by the
AHCI implementation of .prepare_buf() as well as the PCI
implementation of the callback have had overflow assertions
added to help make clear the reasonings behind the various
type changes.

[Added %d -> %"PRId64" fix John sent because off_pos changed from int to
int64_t.
--Stefan]

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
bef1301acb ahci: unify sglist preparation
The intent of this patch is to further unify the creation and
deletion of the sglist used for all AHCI transfers, including
emulated PIO, ATAPI R/W, and native DMA R/W.

By replacing ahci_start_transfer's call to ahci_populate_sglist
with ahci_dma_prepare_buf, we reduce the number of direct calls
where we manipulate the scatter-gather list in the AHCI code.

To make this switch, the constant "0" passed as an offset
in ahci_dma_prepare_buf is adjusted to use io_buffer_offset.

For DMA pathways, this has no effect: io_buffer_offset is always
updated to 0 at the beginning of a DMA transfer loop regardless.
DMA pathways through ide_dma_cb() update the io_buffer_offset
accordingly, and for circumstances where we might make several
trips through this loop, this may actually correct a design flaw.

For PIO pathways, the newly updated ahci_dma_prepare_buf will
now prepare the sglist at the correct offset. It will also set
io_buffer_size, but this is not used in the cmd_read_pio or
cmd_write_pio pathways.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:34 +00:00
John Snow
36334faf35 ide: repair PIO transfers for cases where nsector > 1
Currently, for emulated PIO transfers through the AHCI device,
any attempt made to request more than a single sector's worth
of data will result in the same sector being transferred over
and over.

For example, if we request 8 sectors via PIO READ SECTORS, the
AHCI device will give us the same sector eight times.

This patch adds offset tracking into the PIO pathways so that
we can fulfill these requests appropriately.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:34 +00:00
John Snow
a395f3fa2f ahci: Fix byte count regression for ATAPI/PIO
This patch fixes a regression caused by commit
659142ecf7.
The problem occurs when we wish to return early
from the ahci_start_transfer function, but are now
updating the transferred byte count in the AHCI
command header via ahci_commit_buf.

This will cause problems in the Windows 8 installer.

Don't update the byte count in the command header
for the transmission of ATAPI packets: These commands
will distort the final byte count of the actual data
payload.

The call to ahci_commit_buf remains in the "out"
portion of the call in order to clean up the sglist.
The byte count is maintained by forcing size to be 0.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:34 +00:00
Peter Maydell
c52e67924f Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
x86 and SCSI fixes.  I left out the APIC device model
patches, pending confirmation from the submitter that they really
fix QNX.

# gpg: Signature made Thu 13 Nov 2014 15:13:38 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  acpi: accurate overflow check
  smbios: change 'ram_addr_t' variables to 'uint64_t'
  kvmclock: Add comment explaining why we need cpu_clean_all_dirty()
  target-i386: fix Coverity complaints about overflows
  apic_common: migrate missing fields
  target-i386: eliminate dead code and hoist common code out of "if"
  virtio-scsi: Fix comment for VirtIOSCSIReq
  virtio-scsi: dataplane: suppress guest notification
  esp: Do not overwrite ESP_TCHI after reset
  virtio-scsi: dataplane: fix allocation for 'cmd_vrings'
  esp: fix coding standards
  virtio-scsi: work around bug in old BIOSes
  esp-pci: fixup deadlock with linux

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-13 15:44:16 +00:00
Pavel Dovgalyuk
3ef0eab178 acpi: accurate overflow check
Compare clock in ns, because acpi_pm_tmr_update uses rounded
to ns value instead of ticks.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
[This lets Windows boot in icount mode. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13 16:13:28 +01:00
SeokYeon Hwang
f4ec5cd29d smbios: change 'ram_addr_t' variables to 'uint64_t'
ram_addr_t should not be used except if referring to a RAMBlobk.
Using 'uint64_t' avoids a -Wconstant-conversion warning, which
clang >= 3.4 produces in "smbios_get_tables()".

Signed-off-by: SeokYeon Hwang <syeon.hwang@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13 16:13:28 +01:00
Eduardo Habkost
1154d84dcc kvmclock: Add comment explaining why we need cpu_clean_all_dirty()
Try to explain why commit 317b0a6d8b
needed a cpu_clean_all_dirty() call just after calling
cpu_synchronize_all_states().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Andrey Korolyov <andrey@xdel.ru>
Cc: Marcin Gibuła <m.gibula@beyond.pl>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13 16:13:28 +01:00
Paolo Bonzini
e6a33e45c2 target-i386: fix Coverity complaints about overflows
sipi_vector is an int; it is shifted by 12 and passed as a 64-bit value,
which makes Coverity think that we wanted (uint64_t)sipi_vector << 12.

But actually it must be between 0 and 255.  Make this explicit.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13 16:13:27 +01:00
Pavel Dovgalyuk
c2c00148ec apic_common: migrate missing fields
This patch adds missed sipi_vector and wait_for_sipi fields to a new
subsection of the vmstate of the apic_common module. Saving and loading
of these fields makes migration of the apic state deterministic.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
[Initialize the field in pre_load and kvm_apic_realize. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13 16:13:27 +01:00
Peter Maydell
b56cb28895 Merge remote-tracking branch 'remotes/kraxel/tags/pull-seabios-1.7.5.1-20141113-1' into staging
update seabios to 1.7.5.1 stable release

# gpg: Signature made Thu 13 Nov 2014 11:03:05 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-seabios-1.7.5.1-20141113-1:
  update seabios to 1.7.5.1 stable release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-13 13:02:31 +00:00
Peter Maydell
e08d300450 Merge remote-tracking branch 'remotes/kraxel/tags/pull-input-20141113-1' into staging
QMP/input-send-event: make console parameter optional

# gpg: Signature made Thu 13 Nov 2014 10:07:26 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-input-20141113-1:
  QMP/input-send-event: make console parameter optional
  QMP/input-send-event: update document of union InputEvent

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-13 11:52:11 +00:00
Gerd Hoffmann
953ea14d66 update seabios to 1.7.5.1 stable release
git shortlog since 1.7.5:

Hannes Reinecke (1):
      megasas: read addional PCI I/O bar

Kevin O'Connor (5):
      boot: Change ":rom%d" boot order rom instance to ":rom%x"
      vgabios: Return from handle_1011() if handler found.
      Don't enable thread preemption during S3 resume vga option rom execution.
      build: Avoid absolute paths during "whole-program" compiling.
      ehci: Fix bug in hub port assignment

Marcel Apfelbaum (1):
      hw/pci: reserve IO and mem for pci express downstream ports with no devices attached

Markus Armbruster (1):
      boot: Fix boot order for SCSI target, lun > 9

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-13 11:59:46 +01:00
Peter Maydell
410bd787bf Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20141112-1' into staging
usb bugfixes for 2.2

# gpg: Signature made Wed 12 Nov 2014 14:35:09 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20141112-1:
  usb-host: fix usb_host_speed_compat tyops
  xhci: add sanity checks to xhci_lookup_uport
  Provide the missing LIBUSB_LOG_LEVEL_* for older libusb or FreeBSD. Providing just the needed value as a defined.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-13 10:54:05 +00:00
Amos Kong
51fc44768a QMP/input-send-event: make console parameter optional
The 'QemuConsole' is the input source for handler, we share some
input handlers to process the input events from different QemuConsole.

Normally we only have one set of keyboard, mouse, usbtablet, etc.
The devices have different mask, it's fine to just checking mask to
insure that the handler has the ability to process the event.

I saw we try to bind console to handler in usb/dev-hid.c, but display
always isn't available at that time.

If we have multiseat setup (as Gerd said), we only have 'problem' in
this case. Actually event from different devices have the same effect
for system, it's fine to always use the first available handler
without caring about the console.

For send-key command, we just pass a NULL for console parameter in
calling qemu_input_event_send_key(NULL, ..), but 'input-send-event'
needs to care more devices.

Conclusion:
Generally assigning the special console is meanless, and we can't
directly remove the QMP parameter for compatibility.

So we can make the parameter optional. The parameter might be useful
for some special condition: we have multiple devices without binding
console and they all have the ability(mask) to process events, and
we don't want to use the first one.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-13 11:06:40 +01:00
Amos Kong
935fb91522 QMP/input-send-event: update document of union InputEvent
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-13 11:06:40 +01:00
Gerd Hoffmann
79ae25af15 usb-host: fix usb_host_speed_compat tyops
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2014-11-12 15:27:23 +01:00
Paolo Bonzini
ae67dc72e4 target-i386: eliminate dead code and hoist common code out of "if"
ist != 0 is checked in the first "if", so it cannot be true in
the "else if" part.  While at it, simplify the code and move
the ESP alignment out of the conditionals.

Reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12 12:43:45 +01:00
Fam Zheng
f69c111585 virtio-scsi: Fix comment for VirtIOSCSIReq
The cdb is not zeroed by virtio_scsi_init_req, so fix the misleading
comment.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12 12:43:45 +01:00
Ming Lei
6012ca8159 virtio-scsi: dataplane: suppress guest notification
This patch uses vring_should_notify() to suppress
guest notification, and looks notification frequency
can be decreased from ~33K/sec to ~2K/sec in my test
environment.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12 11:19:19 +01:00
Hannes Reinecke
c9cf45c1a4 esp: Do not overwrite ESP_TCHI after reset
After a reset ESP_TCHI should contain the unique ID
of the chip. This value will be overwritten with the
current tranfer count if the transfer count has
previously been set.
So we should always return the chip id if ESP_TCHI
has never been written to.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12 10:27:03 +01:00
Peter Maydell
e0d0041ec6 Update version for v2.2.0-rc1 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-11 17:25:11 +00:00
Peter Maydell
7f06a3b14d Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20141111' into staging
seccomp branch queue

# gpg: Signature made Tue 11 Nov 2014 16:12:48 GMT using RSA key ID 12F8BD2F
# gpg: Can't check signature: public key not found

* remotes/otubo/tags/pull-seccomp-20141111:
  seccomp: change configure to avoid arm 32 to break
  seccomp: whitelist syscalls fallocate(), fadvise64(), inotify_init1() and inotify_add_watch()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-11 16:23:02 +00:00
Eduardo Otubo
4cc47f8b3c seccomp: change configure to avoid arm 32 to break
Current stable version of libseccomp (2.1.1) only supports i386 and
x86_64 archs correctly. This patch limits the usage of the syscall
filter for those archs and updates to the correct last version of
libseccomp.

This patch also fixes the bug:
https://bugs.launchpad.net/qemu/+bug/1363641

Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paul Moore <pmoore@redhat.com>
2014-11-11 17:05:21 +01:00
Philipp Gesang
f73adec709 seccomp: whitelist syscalls fallocate(), fadvise64(), inotify_init1() and inotify_add_watch()
fallocate() is needed for snapshotting. If it isn’t whitelisted

    $ qemu-img create -f qcow2 x.qcow 1G
    Formatting 'x.qcow', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-kvm -display none -monitor stdio -sandbox on x.qcow
    QEMU 2.1.50 monitor - type 'help' for more information
    (qemu) savevm foo
    (qemu) loadvm foo

will fail, as will subsequent savevm commands on the same image.

fadvise64(), inotify_init1(), inotify_add_watch() are needed by
the SDL display. Without the whitelist entries,

    qemu-kvm -sandbox on

fails immediately.

In my tests fadvise64() is called 50--51 times per VM run. That
number seems independent of the duration of the run. fallocate(),
inotify_init1(), inotify_add_watch() are called once each.
Accordingly, they are added to the whitelist at a very low
priority.

Signed-off-by: Philipp Gesang <philipp.gesang@intra2net.com>
Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
2014-11-11 17:01:35 +01:00
Peter Maydell
776346cd63 Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2014-11-11' into staging
trivial patches for 2014-11-11

# gpg: Signature made Tue 11 Nov 2014 14:38:39 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2014-11-11:
  block: Fix comment for bdrv_co_get_block_status
  sysbus: Correct SYSTEM_BUS(obj) defines
  target-i386: cpu: keeping function parameters alignment on new line
  xen-hvm: Remove redundant variable 'xstate'
  coroutine-sigaltstack: Change jmp_buf to sigjmp_buf
  pc-bios: petalogix-s3adsp1800.dtb: Use 'xlnx, xps-ethernetlite-2.00.a' instead of 'xlnx, xps-ethernetlite-2.00.b'
  gdbstub: Add a missing case of signal number translation in gdbstub
  numa: make 'info numa' take into account hotplugged memory
  slirp/smbd: modify/set several parameters in generated smbd.conf
  qemu-doc.texi: fix typos in x509 examples
  icc_bus: fix typo ICC_BRIGDE -> ICC_BRIDGE

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-11 14:50:10 +00:00
Fam Zheng
705be728c0 block: Fix comment for bdrv_co_get_block_status
It returns more information than binary, fix the comment.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 17:36:19 +03:00
Gonglei
00c2275c95 sysbus: Correct SYSTEM_BUS(obj) defines
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 17:36:19 +03:00
Chen Fan
8f9d989cac target-i386: cpu: keeping function parameters alignment on new line
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 17:36:19 +03:00
Chen Gang
d208a85f15 xen-hvm: Remove redundant variable 'xstate'
In xen_hvm_change_state_handler(), we can pass 'opaque' with type cast
to xen_main_loop_prepare() directly, there's no need to use additional
variable for it.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 17:34:53 +03:00
Peter Maydell
8447414510 Merge remote-tracking branch 'remotes/armbru/tags/for-upstream' into staging
Patches to MAINTAINERS that haven't been picked up

# gpg: Signature made Tue 11 Nov 2014 08:46:55 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/for-upstream:
  Add Migration maintainer
  MAINTAINERS: add section for QEMU Guest Agent
  MAINTAINERS: add myself as bootdevice.c maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-11 11:05:54 +00:00
Ming Lei
ed4b43265d virtio-scsi: dataplane: fix allocation for 'cmd_vrings'
The size of each element should be sizeof(VirtIOSCSIVring *).

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-11 12:03:47 +01:00
Peter Maydell
59c4f2ecef Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20141111' into staging
linux-user pull for 2.2

Two last minute fixes uncovered and fixed by Tom Musta
and Alexander Graf, thanks

# gpg: Signature made Tue 11 Nov 2014 06:36:02 GMT using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20141111:
  linux-user: Fix up timer id handling
  linux-user: Do not subtract offset from end address

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-11 10:09:31 +00:00
Juan Quintela
c0787c8dd1 Add Migration maintainer
Signed-off-by: Juan Quintela <quintela@trasno.org>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2014-11-11 09:46:46 +01:00
Michael Roth
f05d9999f4 MAINTAINERS: add section for QEMU Guest Agent
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2014-11-11 09:46:34 +01:00
Gonglei
b5e9476c0f MAINTAINERS: add myself as bootdevice.c maintainer
bootdevice.c was created by me, and I wrote most of
the code in this file. And now I can maintain it,
I'd hope nobody object this.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2014-11-11 09:42:47 +01:00
Willem Pinckaers
7f151e6f71 coroutine-sigaltstack: Change jmp_buf to sigjmp_buf
This is a simple patch to change the type of old_env from jmp_buf
to sigjmp_buf.  old_env is used by sigsetjmp and as such should be
a sigjmp_buf.

This fixes a stack_chk fail in a OSX 32bit build. Since at least on
OSX sigjmp_buf is four bytes larger then a jmpbuf, resulting in an
overflow in sigsetjmp. Due to variable reordering this overwrites
the stack cookie.

Signed-off-by: Willem Pinckaers <willem_qemu@lekkertech.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Peter: I think I must have missed this one when I converted
       all the jmp_buf to sigjmp_buf in commit 6ab7e546.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 11:07:55 +03:00
Gerd Hoffmann
f2ad97ff81 xhci: add sanity checks to xhci_lookup_uport
Also catch xhci_lookup_uport failures in post_load.

https://bugzilla.redhat.com/show_bug.cgi?id=1074219

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-11 08:48:16 +01:00
Chris Johns
1e03e40784 Provide the missing LIBUSB_LOG_LEVEL_* for older libusb or FreeBSD. Providing just the needed value as a defined.
Signed-off-by: Chris Johns <chrisj@rtems.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-11 08:48:16 +01:00
Alexander Graf
aecc88616a linux-user: Fix up timer id handling
When creating a timer handle, we give the timer id a special magic offset
of 0xcafe0000. However, we never mask that offset out of the timer id before
we start using it to dereference our timer array. So we always end up aborting
timer operations because the timer id is out of bounds.

This was not an issue before my patch e52a99f756 ("linux-user: Simplify
timerid checks on g_posix_timers range") because before we would blindly mask
anything above the first 16 bits.

This patch simplifies the code around timer id creation by introducing a proper
target_timer_id typedef that is s32, just like Linux has it. It also changes the
magic offset to a value that makes all timer ids be positive.

Reported-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-11-11 08:13:09 +02:00
Tom Musta
ccf661f827 linux-user: Do not subtract offset from end address
When computing the upper address of a program segment, do not subtract the
offset from the virtual address; instead compute the sum of the virtual address
and the memory size.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-11-11 08:12:45 +02:00
Chen Gang
c21fd2c79e pc-bios: petalogix-s3adsp1800.dtb: Use 'xlnx, xps-ethernetlite-2.00.a' instead of 'xlnx, xps-ethernetlite-2.00.b'
For Linux upstream kernel (e.g. 3.17-rc7), the related compatible string
'xlnx,xps-ethernetlite-2.00.a' is supported, but 'b' is not supported,
so change qemu dtb file to match kernel driver.

The related operation for qemu (after this patch):

   yum install libvirt
   yum install tunctl
   tunctl -b
   ip link set tap0 up
   brctl addif virbr0 tap0

   ./configure
   make
   ./microblaze-softmmu/qemu-system-microblaze -M petalogix-s3adsp1800 \
     -kernel ../linux-stable.microblaze/arch/microblaze/boot/linux.bin \
     -no-reboot -append "console=ttyUL0,115200 doreboot" -nographic \
     -net nic,vlan=0,model=xlnx.xps-ethernetlite,macaddr=00:16:35:AF:94:00 \
     -net tap,vlan=0,ifname=tap0,script=no,downscript=no

   in microblaze qemu bash (guest machine):

     ifconfig eth0 add 192.168.122.2 netmask 255.255.255.0
     ifconfig eth0 up

   Then can telnet 192.168.122.2 directly without password from the host
   machine.

The related operation for generating new dtb:

   building Linux kernel firstly, then get dts tool "./scripts/dts/dts".
   "./scripts/dtc/dtc -I dtb -O dts  -o ../work.dts ../qemu/petalogix-s3adsp1800.dtb"
   edit work.dts (replace 'xlnx,xps-ethernetlite-2.00.b')
   "./scripts/dtc/dtc -I dts -O dtb  -o ..qemu/petalogix-s3adsp1800.dtb ../work.dts"

(Since I am not quite sure whether can read this patch or not, I put the
related dtb file in attachment, please check, thanks).

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 09:04:13 +03:00
Martin Simmons
f17b069010 gdbstub: Add a missing case of signal number translation in gdbstub
While using qemu with gdb "target remote" to debug an application that uses
fork and exec, the qemu process receives SIGSTOP every time the forked process
terminates (sending SIGCHLD).

This is caused by a missing call to gdb_signal_to_target in gdbstub.c, which
is fixed by this patch:

Signed-off-by: Martin Simmons <martin@lispworks.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 08:58:30 +03:00
zhanghailiang
5b009e4008 numa: make 'info numa' take into account hotplugged memory
When do memory hotplug, if there is numa node, we should add
the memory size to the corresponding node memory size.

It affects the result of hmp command "info numa".

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 08:50:58 +03:00
Peter Wu
7912d04be6 slirp/smbd: modify/set several parameters in generated smbd.conf
The file sharing module should not handle printers, so disable it.
The options 'load printers' and 'printing' have been available since the
beginning (May 1996, commit 0e8fd3398771da2f016d72830179507f3edda51b).
Option 'disable spoolss' is available since Samba 2.0.4, commit
de5f42c9d9172592779fa2504d44544e3b6b1c0d).

Next, "socket address" was reported as deprecated, use a combination of
"interfaces" and "bind interfaces only" instead (available since October
1997, commit 79f4fb52c1ed56fd843f81b4eb0cdd2991d4d0f4).

Override cache directory to avoid writing to a global directory. Option
available since Samba 3.4.0, Jan 2009, commit
19a05bf2f485023b11b41dfae3f6459847d55ef7.

Set "usershare max shared=0" to prevent a global directory from being
used. Option available since Samba 3.0.23, February 2006, commit
5831715049f2d460ce42299963a5defdc160891b.

The last option was introduced with Samba 3.4.0, but previously
"state directory" was already added which exists in Samba 3.4.0. As
unknown parameters are ignored (while printing a warning), it should be
safe to add another option.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-11 08:49:16 +03:00
Peter Maydell
9df98352b7 Merge remote-tracking branch 'remotes/xtensa/tags/20141110-xtensa' into staging
Xtensa fixes for 2.2:
- fix entry opcode register window checking and add unit test.

# gpg: Signature made Mon 10 Nov 2014 15:01:47 GMT using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"

* remotes/xtensa/tags/20141110-xtensa:
  target-xtensa: add entry overflow test
  target-xtensa: add missing window check for entry

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-10 20:50:37 +00:00
Peter Maydell
558c2c8ddf Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Mon 10 Nov 2014 09:42:07 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream:
  block/vdi: Limit maximum size even futher
  qapi: Complete BlkdebugEvent
  iotests: Add test for non-existing backing file
  block: Propagate error in bdrv_img_create()
  qemu-img: Omit error_report() after img_open()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-10 16:28:51 +00:00
Max Filippov
09c7fbef76 target-xtensa: add entry overflow test
Check that entry instruction raises window overflow exception when
PS.CALLINC points to live registers.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-10 17:59:13 +03:00
Max Filippov
1b3e71f8ee target-xtensa: add missing window check for entry
Entry opcode needs to check if moving to new register frame would cause
register window overflow. Entry used in function prologue never
overflows because preceding windowed call* opcode writes return address
to the target register window frame, causing overflow exceptions at the
point of call. But when a sequence of entry opcodes is used for register
window spilling there may not be a call or other opcode that would cause
window check between entries and they would not raise overflow exception
themselves resulting in data corruption.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-10 17:59:13 +03:00
Peter Maydell
7a8dda7e5d Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20141105' into staging
Several bugfixes for s390x:
- instruction decoding and sparse warning in kvm
- overlong input and hangs in the sclp consoles

# gpg: Signature made Wed 05 Nov 2014 15:42:14 GMT using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20141105:
  s390x/sclpconsole: Avoid hanging SCLP ASCII console
  s390x/sclpconsole-lm: Fix hanging SCLP line mode console
  s390x/sclpconsole-lm: truncate input if line is too long
  s390x/kvm: Fix warning from sparse
  s390x/kvm: Fix opcode decoding for eb instruction handler

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-10 14:58:59 +00:00
Peter Maydell
2d9177588b Merge remote-tracking branch 'remotes/lalrae/tags/mips-20141107' into staging
* remotes/lalrae/tags/mips-20141107:
  target-mips: fix multiple TCG registers covering same data
  mips: Ensure PC update with MTC0 single-stepping
  target-mips: fix for missing delay slot in BC1EQZ and BC1NEZ
  mips: Set the CP0.Config3.DSP and CP0.Config3.DSP2P bits
  mips: Add macros for CP0.Config3 and CP0.Config4 bits
  mips: Respect CP0.Status.CU1 for microMIPS FP branches
  mips: Remove CONFIG_VT82C686 from non-Fulong configs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-10 13:56:47 +00:00
Paolo Bonzini
25aaa2c568 esp: fix coding standards
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-10 13:58:14 +01:00
Peter Maydell
7b4b7c5fc7 Merge remote-tracking branch 'remotes/amit/tags/vser-2.2.0-queue-2' into staging
Fixes a crash when a virtio-serial port is added without a name to it.

# gpg: Signature made Fri 07 Nov 2014 04:58:05 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit/tags/vser-2.2.0-queue-2:
  virtio-serial: avoid crash when port has no name

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-10 11:58:39 +00:00
Kevin Wolf
ea3beed41d Merge remote-tracking branch 'mreitz/block' into queue-block
* mreitz/block:
  block/vdi: Limit maximum size even futher
2014-11-10 10:41:34 +01:00
Max Reitz
d20418ee51 block/vdi: Limit maximum size even futher
The block layer read and write functions do not like requests which are
bigger than INT_MAX bytes. Since the VDI bmap is read and written in a
single operation, its size is therefore limited accordingly. This
reduces the maximum VDI image size supported by QEMU to half of what it
currently is (down to approximately 512 TB).

The VDI test 084 has to be adapted accordingly. Actually, one could
clearly see that it was broken from the "Could not open
'TEST_DIR/t.IMGFMT': Invalid argument" line for an image which was
supposed to work just fine.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-11-09 23:39:50 +01:00
Max Reitz
d21de4d97f qapi: Complete BlkdebugEvent
Several events were missing from the QAPI enum, add them.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-07 17:38:18 +01:00
Paolo Bonzini
55783a5521 virtio-scsi: work around bug in old BIOSes
Old BIOSes left some padding by mistake after the req_size/resp_size.
New QEMU does not like it, thinking it is a bidirectional command.

As a workaround, we can check if the ANY_LAYOUT bit is set; if not, we
always consider the first buffer as the virtio-scsi request/response,
because, back when QEMU did not support ANY_LAYOUT, it expected the
payload to start at the second element of the iovec.

This can show up during migration.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 16:09:57 +01:00
Yongbok Kim
cb269f273f target-mips: fix multiple TCG registers covering same data
Avoid to allocate different TCG registers for the FPU registers
that are mapped on the MSA vectore registers.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-07 14:15:28 +00:00
Maciej W. Rozycki
342368aff7 mips: Ensure PC update with MTC0 single-stepping
Correct the way PC is updated when single-stepping instructions, by
keeping the old PC only for the BS_EXCP (exception condition) state.

Some MTC0 (and possibly other) instructions switch to the BS_STOP state
to terminate the current translation block, so that the state transition
of the simulated CPU resulting from the CP0 operation takes effect with
the following instruction.  This happens with `mtc0 <reg>,c0_config' for
example, typically used to set KSEG0 cacheability.

While single-stepping this has a side-effect of not advancing the PC
past the instruction just executed; subsequent single-step traps will
stop at the same instruction repeatedly.  Example:

(gdb) stepi
0x80004d24 in _start ()
5: x/i $pc
=> 0x80004d24 <_start+364>:     mfc0    t1,c0_config
(gdb)
0x80004d28 in _start ()
5: x/i $pc
=> 0x80004d28 <_start+368>:     li      at,-8
(gdb)
0x80004d2c in _start ()
5: x/i $pc
=> 0x80004d2c <_start+372>:     and     t1,t1,at
(gdb)
0x80004d30 in _start ()
5: x/i $pc
=> 0x80004d30 <_start+376>:     ori     t1,t1,0x3
(gdb)
0x80004d34 in _start ()
5: x/i $pc
=> 0x80004d34 <_start+380>:     mtc0    t1,c0_config
(gdb)
0x80004d34 in _start ()
5: x/i $pc
=> 0x80004d34 <_start+380>:     mtc0    t1,c0_config
(gdb)
0x80004d34 in _start ()
5: x/i $pc
=> 0x80004d34 <_start+380>:     mtc0    t1,c0_config
(gdb)
0x80004d34 in _start ()
5: x/i $pc
=> 0x80004d34 <_start+380>:     mtc0    t1,c0_config
(gdb)

-- oops!

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-07 14:15:28 +00:00
Leon Alrae
854795753c target-mips: fix for missing delay slot in BC1EQZ and BC1NEZ
New R6 COP1 conditional branches currently don't have delay slot. Fixing this
by setting MIPS_HFLAG_BDS32 flag which is required for branches having 4-byte
delay slot.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-07 14:15:28 +00:00
Maciej W. Rozycki
e30614d517 mips: Set the CP0.Config3.DSP and CP0.Config3.DSP2P bits
Set the CP0.Config3.DSP2P bit for the 74kf processor and both that bit
and the CP0.Config3.DSP bit for the artificial mips32r5-generic and
mips64dspr2 processors.  They have the DSPr2 ASE enabled in `insn_flags'
and CPUs that implement that ASE need to have both CP0.Config3.DSP and
CP0.Config3.DSP2P set or software won't detect its presence.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
[leon.alrae@imgtec.com: remove DSP flags from mips32r5-generic]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-07 14:15:28 +00:00
Maciej W. Rozycki
70409e6726 mips: Add macros for CP0.Config3 and CP0.Config4 bits
Define macros for CP0.Config3 and CP0.Config4 bits.  These used to be
exhaustive as at MIPS32r3, but more bits may have been added since.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-07 14:15:28 +00:00
Hannes Reinecke
c3543fb5fe esp-pci: fixup deadlock with linux
A linux guest will be issuing messages:

[   32.124042] DC390: Deadlock in DataIn_0: DMA aborted unfinished: 000000 bytes remain!!
[   32.126348] DC390: DataIn_0: DMA State: 0

and the HBA will fail to work properly.
Reason is the emulation is not setting the 'DMA transfer done'
status correctly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 13:31:19 +01:00
Maciej W. Rozycki
272f458dc8 mips: Respect CP0.Status.CU1 for microMIPS FP branches
Make microMIPS FP branches respect CP0.Status.CU1 and trap with a
Coprocessor Unusable exception if COP1 has been disabled; also trap if
no FPU is present at all.

Standard MIPS FP instruction encodings have a more regular structure and
branches are covered with a single umbrella along other instructions.
This is not the case with the microMIPS encoding, this case has to be
taken care of explicitly here.  Code to do so has been copied from the
standard MIPS code handler for OPC_CP1, in `decode_opc'.

Problems arising from this bug will generally only show up on user
context switches in operating systems making use of lazy FP context
switches, such as Linux.  It will also more readily trigger if software
FPU emulation is used, either implicitly on a non-float CPU, or forced
on a hard-float CPU such as with the "nofpu" Linux kernel command line
argument.

The problem may have been easily missed because we have no hard-float
microMIPS CPU configuration present; in fact we have no microMIPS CPU
configuration of any kind present.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-07 11:16:16 +00:00
Maciej W. Rozycki
dff4021730 mips: Remove CONFIG_VT82C686 from non-Fulong configs
Fix the regression introduced with commit
47934d0aad [hw: move ISA bridges and
devices to hw/isa/, configure with default-configs/], by removing
CONFIG_VT82C686 from configurations that previously did not enable it.
That southbridge is only available on Fulong platforms (CONFIG_FULONG)
that are exclusively little-endian, 64-bit MIPS.  Previously vt82c686.o
was pulled explicitly with obj-$(CONFIG_FULONG).

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-07 11:15:49 +00:00
Marc-André Lureau
7eb7311427 virtio-serial: avoid crash when port has no name
It seems "name" is not mandatory, and the following command line (based
on one generated by current libvirt) will crash qemu at start:

qemu-system-x86_64 \
    -device virtio-serial-pci \
    -device virtserialport,name=foo \
    -device virtconsole

Program received signal SIGSEGV, Segmentation fault.
__strcmp_ssse3 () at ../sysdeps/x86_64/strcmp.S:210
210        movlpd    (%rsi), %xmm2
Missing separate debuginfos, use: debuginfo-install
python-libs-2.7.5-13.fc20.x86_64
(gdb) bt
 #0  __strcmp_ssse3 () at ../sysdeps/x86_64/strcmp.S:210
 #1  0x000055555566bdc6 in find_port_by_name (name=0x0) at /home/elmarco/src/qemu/hw/char/virtio-serial-bus.c:67

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-11-07 10:27:11 +05:30
Max Reitz
c4d01535dc iotests: Add test for non-existing backing file
Test the error message when a COW file is about to be created which is
supposed to inherit the size of its backing file, while the backing file
given does not actually exist.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-06 12:45:47 +01:00
Max Reitz
e56934bece block: Propagate error in bdrv_img_create()
If the specified backing file could not be opened, do not generate a new
error message which contains the message which has been generated by
bdrv_open(), but just propagate the latter.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-06 12:45:47 +01:00
Max Reitz
cc4d3ee435 qemu-img: Omit error_report() after img_open()
img_open() already prints an error if the operation failed, so there
should not be another error_report() afterwards.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-06 12:45:47 +01:00
Heinz Graalfs
bb3e9e1fd7 s390x/sclpconsole: Avoid hanging SCLP ASCII console
Force recalculation of file descriptor sets for main loop's poll(),
in order to be able to readd a possibly removed input file descriptor
after can_read() returned 0 (zero).

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-11-05 16:35:56 +01:00
Heinz Graalfs
87f2eff016 s390x/sclpconsole-lm: Fix hanging SCLP line mode console
Trigger recalculating sets of file descriptors for the main loop's poll()
in order to make sure a possibly removed FD 0 from the poll() file
descriptor array is re-added. FD 0 is removed from the decriptor array
when the console's can_read() callback returns 0.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-11-05 16:35:56 +01:00
Heinz Graalfs
b3191432cf s390x/sclpconsole-lm: truncate input if line is too long
As the SCLP line mode console input length is limited by the available
SCCB buffer space, it might lock up if the input does not fit into the
buffer.

With this patch, characters that don't fit are 'eaten' up to the next
CR/LF and the input line is sent truncated to the guest.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-11-05 16:35:55 +01:00
Thomas Huth
f0d4dc18ce s390x/kvm: Fix warning from sparse
When running "sparse" with the s390x kvm.c code, it complains that
"constant 0x00400f1d40330000 is so big it is long" - let's fix this
by appending a proper suffix.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-11-05 16:35:55 +01:00
Frank Blaschka
80765f0734 s390x/kvm: Fix opcode decoding for eb instruction handler
The second byte of the opcode is encoded in the lowest byte of the ipb
field, not the lowest byte of the ipa field.

Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
2014-11-05 16:35:55 +01:00
Peter Maydell
6e76d125f2 Update version for v2.2.0-rc0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-05 15:21:04 +00:00
Peter Maydell
3752ac8932 Merge remote-tracking branch 'remotes/agraf/tags/signed-s390-for-upstream' into staging
Patch queue for s390 - 2014-11-05

Two simple bug fixes to enable slightly newer guest kernels
and preliminary -M s390-ccw support for TCG (virtio doesn't work yet!)

# gpg: Signature made Wed 05 Nov 2014 11:01:55 GMT using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"

* remotes/agraf/tags/signed-s390-for-upstream:
  s390x: Implement SAM{24,31,64}
  s390x: Fix sclp console input

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-05 14:14:47 +00:00
Gonglei
30de46db50 vhost-user-test: Fix 'make check' broken on glib < 2.26
After commit 89b516d8, some logics is turbid and
breaks 'make check' as below errors:
tests/vhost-user-test.c: In function '_cond_wait_until':
tests/vhost-user-test.c:154: error: 'G_TIME_SPAN_SECOND' undeclared (first use in this function)
tests/vhost-user-test.c:154: error: (Each undeclared identifier is reported only once
tests/vhost-user-test.c:154: error: for each function it appears in.)
tests/vhost-user-test.c: In function 'read_guest_mem':
tests/vhost-user-test.c:192: warning: implicit declaration of function 'g_get_monotonic_time'
tests/vhost-user-test.c:192: warning: nested extern declaration of 'g_get_monotonic_time'
tests/vhost-user-test.c:192: error: 'G_TIME_SPAN_SECOND' undeclared (first use in this function)
make: *** [tests/vhost-user-test.o] Error 1

First, vhost-usr-test.c rely on glib-compat.h because
of using G_TIME_SPAN_SECOND [glib < 2.26] and g_get_monotonic_time(),
but vhost-usr-test.c defined QEMU_GLIB_COMPAT_H, which make
glib-compat.h will not be included.
Second, if we remove QEMU_GLIB_COMPAT_H definability in
vhost-usr-test.c, then we will get below warnings:

tests/vhost-user-test.c: In function 'read_guest_mem':
tests/vhost-user-test.c:190: warning: passing argument 1 of 'g_mutex_lock' from incompatible pointer type
tests/vhost-user-test.c:234: warning: passing argument 1 of 'g_mutex_unlock' from incompatible pointer type

That's because glib-compat.h redefine the g_mutex_lock/unlock
function. Those functions' arguments is CompatGMutex/CompatGCond,
but vhost-user-test.c is using GMutex/GCond, which cause the type
is not consistent.

We can rerealize those functions of vhost-user-test.c,
which need a lots of patches. Let's simply address it, and
leave this file alone.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1415149259-6188-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-05 12:53:08 +00:00
Alexander Graf
44dd33ba8f s390x: Implement SAM{24,31,64}
The SAM instructions simply change 2 bits in PSW.MASK to advertise
the current memory mode. While we can't fully guarantee that 31 bit
mode (or even remotely 24 bit mode) actually work correctly, we don't
check whether lpswe modifies these bits, so we shouldn't keep the
guest from executing SAM instructions either.

This patch implements all SAM instrutions with their actual PSW changing
semantics, making more recent Linux kernels boot properly which do issue
a SAM31 call during early boot.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-11-05 12:01:28 +01:00
Alexander Graf
d4827355f6 s390x: Fix sclp console input
When injecting an sclp console interrupt into the guest, we increase
the PC by 4 for some reason. I have no idea why I put that code there,
but it's clearly wrong. Remove the increment.

This patch fixes sclp serial input for the ccw machine.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2014-11-05 12:01:28 +01:00
Gonglei
63c693f8d0 qemu-doc.texi: fix typos in x509 examples
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-05 09:53:18 +03:00
Peter Maydell
c8d943303d Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-11-04

Fun things for 2.2:

  - e500 virt machine: power off support (needs 3.19 guests)
  - e500 virt machine: -device eTSEC support
  - new framework to allow dynamic spawning of sysbus devices
  - spapr: enable migration of nvram
  - new 440x5wDFPU cpu type
  - Altivec and other random fixes

# gpg: Signature made Tue 04 Nov 2014 22:26:39 GMT using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"

* remotes/agraf/tags/signed-ppc-for-upstream: (34 commits)
  spapr: Allow dynamic creation of PHB
  target-ppc: Fix Altivec Round Opcodes
  target-ppc: Fix vcmpbfp. Unordered Case
  target-ppc: Fix Altivec Shifts
  target-ppc: simplify AES emulation
  e500: Add support for eTSEC in device tree
  PPC: e500: Support dynamically spawned sysbus devices
  sysbus: Add new platform bus helper device
  sysbus: Expose MMIO enumeration helper
  sysbus: Expose IRQ enumeration helpers
  sysbus: Make devices spawnable via -device
  sysbus: Add dynamic sysbus device search
  hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_*)
  ppc: do not look at the MMU index to detect PR/HV mode
  target-ppc: kvm: Fix memory overflow issue about strncat()
  spapr_nvram: Enable migration
  PPC: E500: Hook up power off GPIO to GPIO controller
  PPC: E500: Instantiate MPC8XXX gpio controller on virt machine
  PPC: Add MPC8XXX gpio controller
  target-ppc: Fix an invalid free in opcode table handling code.
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 22:27:23 +00:00
Alexander Graf
9e3f973335 spapr: Allow dynamic creation of PHB
Now that we finally check for presence of dangling sysbus devices, make check
started complaining that the sPAPR PHB is one such device.

However, it really isn't. The spapr PHB is not really a traditional sysbus
device, but much more a special spapr pv device which is already able to get
created dynamically.

Move spapr to its own dynamic sysbus check handling and allow PHB devices to
get allocated dynamically.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
Tom Musta
abe60a439b target-ppc: Fix Altivec Round Opcodes
Correct the opcodes for the vrfim, vrfin and vrfiz instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
Tom Musta
4007b8de6e target-ppc: Fix vcmpbfp. Unordered Case
Fix the implementation of Vector Compare Bounds Single Precision.
Specifically, fix the case where the operands are unordered -- since
the result is non-zero, the CR[6] field should be set to zero.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
Tom Musta
24e669ba53 target-ppc: Fix Altivec Shifts
Fix the implementation of the Altivec shift left and shift right
instructions (vsl, vsr) which erroneously inverts shift direction
on big endian hosts.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
Aurelien Jarno
36cbde7c30 target-ppc: simplify AES emulation
This patch simplifies the AES code, by directly accessing the newly added
S-Box, InvS-Box tables instead of recreating them by using the AES_Te and
AES_Td tables.

Cc: Alexander Graf <agraf@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
Alexander Graf
fdfb7f2cdb e500: Add support for eTSEC in device tree
This patch adds support to expose eTSEC devices in the dynamically created
guest facing device tree. This allows us to expose eTSEC devices into guests
without changes in the machine file.

Because we can now tell the guest about eTSEC devices this patch allows the
user to specify eTSEC devices via -device at all.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
Alexander Graf
f70873438d PPC: e500: Support dynamically spawned sysbus devices
For e500 our approach to supporting dynamically spawned sysbus devices is to
create a simple bus from the guest's point of view within which we map those
devices dynamically.

We allocate memory regions always within the "platform" hole in address
space and map IRQs to predetermined IRQ lines that are reserved for platform
device usage.

This maps really nicely into device tree logic, so we can just tell the
guest about our virtual simple bus in device tree as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:14 +01:00
Alexander Graf
7634fe3c27 sysbus: Add new platform bus helper device
We need to support spawning of sysbus devices dynamically via the command line.
The easiest way to represent these dynamically spawned devices in the guest's
memory and IRQ layout is by preallocating some space for dynamic sysbus devices.

This is what the "platform bus" device does. It is a sysbus device that exports
a configurably sized MMIO region and a configurable number of IRQ lines. When
this device encounters sysbus devices that have been dynamically created and not
manually wired up, it dynamically connects them to its own pool of resources.

The machine model can then loop through all of these devices and create a guest
configuration (device tree) to make them visible to the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:14 +01:00
Alexander Graf
471a9bc144 sysbus: Expose MMIO enumeration helper
Sysbus devices have a range of MMIO regions they expose. The exact number
of regions is device specific and internal information to the device model.

Expose whether a region exists via a public interface. That way our platform
bus enumeration code can dynamically determine how many regions exist.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:14 +01:00
Alexander Graf
b797318666 sysbus: Expose IRQ enumeration helpers
Sysbus devices can get their IRQ lines connected to other devices. It is
possible to figure out which IRQ line a connection is on and whether a sysbus
device even provides an IRQ connector at a specific offset.

This patch exposes helpers to make this information publicly accessible. We
will need it for the platform bus dynamic sysbus enumeration.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:14 +01:00
Alexander Graf
33cd52b5d7 sysbus: Make devices spawnable via -device
Now that we can properly map sysbus devices that haven't been connected to
something forcefully by C code, we can allow the -device command line option
to spawn them.

For machines that don't implement dynamic sysbus assignment in their board
files we add a new bool "has_dynamic_sysbus" to the machine class.
When that property is false (default), we bail out when we see dynamically
spawned sysbus devices, like we did before.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:14 +01:00
Alexander Graf
eb5722801c sysbus: Add dynamic sysbus device search
Sysbus devices can be spawned by C code or dynamically via the command line.
In the latter case, we need to be able to find the dynamically created devices
to do things with them.

This patch adds a search helper that makes it easy to look for dynamically
spawned sysbus devices.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:14 +01:00
Peter Maydell
f8833a37c0 hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_*)
The g_hash_table_iter_* functions for iterating through a hash table
are not present in glib 2.12, which is our current minimum requirement.
Rewrite the code to use g_hash_table_foreach() instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:13 +01:00
Paolo Bonzini
c47493f24f ppc: do not look at the MMU index to detect PR/HV mode
The MMU index is an internal detail that should not be needed by the
translator (except to generate loads and stores).  Look at the MSR
directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:13 +01:00
Chen Gang
cc64b1a194 target-ppc: kvm: Fix memory overflow issue about strncat()
strncat() will append additional '\0' to destination buffer, so need
additional 1 byte for it, or may cause memory overflow, just like other
area within QEMU have done.

And can use g_strdup_printf() instead of strncat(), which may be more
easier understanding.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:13 +01:00
Alexey Kardashevskiy
f58aa48314 spapr_nvram: Enable migration
The only case when sPAPR NVRAM migrates now is if is backed by a file and
copy-storage migration is performed. In other cases NVRAM does not
migrate regardless whether it is backed by a file or not.

This enables shadow copy of NVRAM in RAM which is read from a file
(if used) and used for reads. Writes to NVRAM are mirrored to the file.

This defines a VMSTATE descriptor for NVRAM device so the memory copy
of NVRAM can migrate and be flushed to a backing file on the destination
if one is specified.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:13 +01:00
Alexander Graf
016f775898 PPC: E500: Hook up power off GPIO to GPIO controller
Now that we have a working GPIO controller on the virt machine, we can use
one pin to notify QEMU that the guests wants to power off the system.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:13 +01:00
Alexander Graf
b88e77f493 PPC: E500: Instantiate MPC8XXX gpio controller on virt machine
With the e500 virt machine, we don't have to adhere to the exact hardware
layout of an mpc8544ds board. So there we can just add a qoriq compatible
GPIO controller into the system that we can add a power off hook to.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:12 +01:00
Alexander Graf
228aa992fc PPC: Add MPC8XXX gpio controller
On e500 systems most SoCs implement a common GPIO controller that Linux
calls the "mpc8xxx" gpio controller. This patch adds an emulation model
for this device.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:12 +01:00
Bharata B Rao
81f194dd69 target-ppc: Fix an invalid free in opcode table handling code.
Opcode table has direct, indirect and double indirect handlers, but
ppc_cpu_unrealizefn() frees direct handlers which are never allocated
and never frees double indirect handlers.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:12 +01:00
Bharata B Rao
54ff58bb10 target-ppc: Use macros in opcodes table handling code
Define and use macros instead of direct numbers wherever
possible in ppc opcodes table handling code.

This doesn't change any code functionality.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:12 +01:00
Peter Maydell
bf362e9610 hw/pci/ppc4xx_pci.c: Remove unused pci4xx_cfgaddr_read/write/ops
The MemoryRegionOps struct pci4xx_cfgaddr_ops and the read and
write functions it references are all unused; remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:12 +01:00
Pierre Mallard
b8c867ed09 target-ppc : Add new processor type 440x5wDFPU
This patch add a new processor type 440x5wDFPU for Virtex 5 PPC440
with an external APU FPU in double precision mode

Signed-off-by: Pierre Mallard <mallard.pierre@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:12 +01:00
Pierre Mallard
4171853cf4 target-ppc : Allow fc[tf]id[*] mnemonics for non TARGET_PPC64
This patch remove limitation for fc[tf]id[*] on 32 bits targets and
add a new insn flag for signed integer 64 conversion PPC2_FP_CVT_S64

Signed-off-by: Pierre Mallard <mallard.pierre@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:11 +01:00
Alexander Graf
9ac58dc59a PPC: openpic_kvm: Only map first occurence in address space
The in-kernel OpenPIC emulation only supports a single map. However, we
map the OpenPIC at 2 locations: The CPU visible one and the PCI visible
one. For KVM acceleration, we only care about the first one.

To make sure that we only map that first mapping and not the PCI map that
happens dynamically later during bootup, ignore maps that happen when
we are already considering ourselves mapped.

Credits due are to Bogdan and Mihai for debugging this.

Reported-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Reported-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:11 +01:00
David Gibson
4aee73623d spapr: Cleanup machine naming conventions, and prepare for 2.2 release
As of qemu-2.1, spapr/pseries, has a set of versioned machine classes to
represent the machine type as it appeared to the guest in different qemu
versions.  This allows for safe migration of guests between current and
future qemu versions.

However, these are organized a bit differently from those for PC: on PC,
the default plain "pc" machine type is just an alias for the most recent
versioned machine type.  In sPAPR, it names the base machine class from
which the versioned types are derived.

The PC approach is preferable; it makes it clearer which explicit version
is the current one.  Additionally updating the "current" machine as the
base class makes it even more likely than otherwise to incorrectly alter
the versioned machines' behaviour when updating the current machine.

Therefore this patch changes sPAPR to the PC approach - the base class
becomes abstract, and plain "pseries" becomes an alias for the most
recent versioned machine class.  Since qemu-2.1 is now released, we also
create a new pseries-2.2 machine type, to incorporate changes during this
development cycle (for now it is identical to pseries-2.1).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:11 +01:00
David Gibson
0691e8ebce target-ppc: virtex-ml507 machine type should depend on CONFIG_XILINX
The virtex-ml507 is a Xilinx CPU based system, and requires several sub
devices which are only included with CONFIG_XILINX.  Therefore, it should
only be compiled if CONFIG_XILINX is set.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:11 +01:00
Tom Musta
8412d11276 target-ppc: Implement IVOR[59] By Default for Book E
Adjust the IVOR mask for generic Book E implementation to support bit 59.
This is consistent with the Power ISA.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reported-by: Pierre Mallard <mallard.pierre@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:11 +01:00
Alexey Kardashevskiy
0b6ff57640 target-ppc: Fix kvmppc_set_compat to use negotiated cpu-version
By mistake, QEMU uses the maximum compatibility level from the command
line instead of the value negotiated in client-architecture-support call.

This replaces @max_compat with @cpu_version. This only affects guests
which do not support the host CPU.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:10 +01:00
Paolo Bonzini
8f9fb7ac49 ppc: compute mask from BI using right shift
This will match the code we use in fpu_helper.c when we flip
CRF_* bit-endianness.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:10 +01:00
Paolo Bonzini
e57d02022c ppc: rename gen_set_cr6_from_fpscr
It sets CR1, not CR6 (and the spec agrees).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:10 +01:00
Paolo Bonzini
ebbd8b40a9 ppc: fix result of DLMZB when no zero bytes are found
It must return 8 and place 8 in XER, but the current code uses
i directly which is 9 at this point of the code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:10 +01:00
Paolo Bonzini
72189ea41d ppc: use CRF_* in int_helper.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:10 +01:00
Paolo Bonzini
d298118060 ppc: fix monitor access to CR
This was off-by-one.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:10 +01:00
Peter Maydell
d5b4dc3b50 Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-peter' into staging
QOM infrastructure fixes and device conversions

* Fixes for -device foo,help

# gpg: Signature made Tue 04 Nov 2014 17:27:41 GMT using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-peter:
  qdev: Use qdev_get_device_class() for -device <type>,help
  qdev: Move error printing to the end of qdev_device_help()
  qdev: Create qdev_get_device_class() function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 17:33:34 +00:00
Eduardo Habkost
31bed5509d qdev: Use qdev_get_device_class() for -device <type>,help
Make sure we try to list properties from classes that can be safely used
with "-device".

Fixes the following crashes:

  $ qemu-system-x86_64 -device x86_64-cpu,help
  **
  ERROR:qom/object.c:336:object_initialize_with_type: assertion failed: (type->abstract == false)
  Aborted (core dumped)
  $ qemu-system-x86_64 -device host-x86_64-cpu,help
  qemu-system-x86_64: [...]/target-i386/cpu.c:1329: host_x86_cpu_initfn: Assertion `(kvm_allowed)' failed.
  Aborted (core dumped)

After applying this patch:

  $ qemu-system-x86_64 -device x86_64-cpu,help
  Parameter 'driver' expects non-abstract device type
  $ qemu-system-x86_64 -device host-x86_64-cpu,help
  Parameter 'driver' expects pluggable device type

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 17:50:00 +01:00
Eduardo Habkost
5185f0e0a6 qdev: Move error printing to the end of qdev_device_help()
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 17:50:00 +01:00
Eduardo Habkost
43c95d782d qdev: Create qdev_get_device_class() function
Extract the DeviceClass lookup from qdev_device_add() to a separate
function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 17:50:00 +01:00
Peter Maydell
2bb41e5d30 Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-peter' into staging
QOM CPUState and X86CPU

* Cleanups for -cpu ...,enforce

* remotes/afaerber/tags/qom-cpu-for-peter:
  target-i386: Disable SVM by default in KVM mode
  target-i386: Don't enable nested VMX by default
  target-i386: Remove unsupported bits from all CPU models
  target-i386: Disable CPUID_ACPI by default in KVM mode
  target-i386: Rename KVM auto-feature-enable compat function
  pc: Create pc_compat_2_1() functions

Conflicts:
	hw/i386/pc_piix.c
	hw/i386/pc_q35.c
[PMM: Fixed minor textual conflicts]

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 15:56:26 +00:00
Peter Maydell
1bc8dae31b Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20141104-2' into staging
gtk: fix fullscreen with gtk3, fix build with older gtk2 versions.

# gpg: Signature made Tue 04 Nov 2014 13:42:09 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20141104-2:
  gtk: add GDK_KEY_pause #define
  gtk: Hide the menubar when in fullscreen mode (lp 1294898)
  gtk: Install vc accelerators on parent window
  gtk: Install fullscreen accelerator on toplevel window
  gtk: Grab accel_group from GtkDisplayState

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 15:00:17 +00:00
Eduardo Habkost
75d373ef97 target-i386: Disable SVM by default in KVM mode
Make SVM be disabled by default on all CPU models when in KVM mode.
Nested SVM is enabled by default in the KVM kernel module, but it is
probably less stable than nested VMX (which is already disabled by
default).

Add a new compat function, x86_cpu_compat_kvm_no_autodisable(), to keep
compatibility on previous machine-types.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 15:49:05 +01:00
Eduardo Habkost
e93abc147f target-i386: Don't enable nested VMX by default
TCG doesn't support VMX, and nested VMX is not enabled by default in the
KVM kernel module.

So, there's no reason to have VMX enabled by default on the core2duo and
coreduo CPU models, today. Even the newer Intel CPU model definitions
don't have it enabled.

In this case, we need machine-type compat code, as people may be running
the older machine-types on hosts that had VMX nesting enabled.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 15:48:47 +01:00
Eduardo Habkost
b9fc20bccf target-i386: Remove unsupported bits from all CPU models
The following CPU features were never supported by neither TCG or KVM,
so they are useless on the CPU model definitions, today:

 * CPUID_DTS (DS)
 * CPUID_HT
 * CPUID_TM
 * CPUID_PBE
 * CPUID_EXT_DTES64
 * CPUID_EXT_DSCPL
 * CPUID_EXT_EST
 * CPUID_EXT_TM2
 * CPUID_EXT_XTPR
 * CPUID_EXT_PDCM
 * CPUID_SVM_LBRV

As using "enforce" mode is the only way to ensure guest ABI doesn't
change when moving to a different host, we should make "enforce" mode
the default or at least encourage management software to always use it.

In turn, to make "enforce" usable, we need CPU models that work without
always requiring some features to be explicitly disabled. This patch
removes the above features from all CPU model definitions.

We won't need any machine-type compat code for those changes, because it
is impossible to have existing VMs with those features enabled.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 15:42:39 +01:00
Eduardo Habkost
864867b91b target-i386: Disable CPUID_ACPI by default in KVM mode
KVM never supported the CPUID_ACPI flag, so it doesn't make sense to
have it enabled by default when KVM is enabled.

The motivation here is exactly the same we had for the MONITOR flag
(disabled by commit 136a7e9a85).

And like in the MONITOR flag case, we don't need machine-type compat code
because it is currently impossible to run a KVM VM with the ACPI flag set.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-04 15:35:47 +01:00
Gerd Hoffmann
dc52017146 gtk: add GDK_KEY_pause #define
Add pause key to the list of compatibility defines.
Fixes the build with older gtk versions.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-04 14:40:20 +01:00
Peter Maydell
45da08fa8a Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20141104' into staging
target-arm queue:
 * avoid passing CPU env pointer around in A32/T32 decoders
 * split M profile exception masking out from A/R profile

# gpg: Signature made Tue 04 Nov 2014 12:28:15 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20141104:
  target-arm: Correct condition for taking VIRQ and VFIQ
  target-arm: Separate out M profile cpu_exec_interrupt handling
  target-arm/translate.c: Don't pass CPUARMState * to disas_arm_insn()
  target-arm/translate.c: Don't pass CPUARMState around in the decoder
  target-arm/translate.c: Don't use IS_M()
  target-arm/translate.c: Use arm_dc_feature() rather than arm_feature()
  target-arm/translate.c: Use arm_dc_feature() in ENABLE_ARCH_ macros

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 13:35:04 +00:00
Peter Maydell
6943109011 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Tue 04 Nov 2014 00:24:41 GMT using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 12:35:08 +00:00
Peter Maydell
9fae24f554 target-arm: Correct condition for taking VIRQ and VFIQ
The VIRQ and VFIQ exceptions are (as the comments say) only
taken if the CPU is in Non-secure state and the IMO/FMO bits
are set to enable virtualized interrupts. Correct the code
to actually implement this.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1414684132-23971-3-git-send-email-peter.maydell@linaro.org
2014-11-04 12:05:23 +00:00
Peter Maydell
b5c633c5bd target-arm: Separate out M profile cpu_exec_interrupt handling
The M profile cpu_exec_interrupt handling is fairly simple
but does include an M profile specific oddity (disabling
interrupts for certain PC values). A/R profile handling
on the other hand is getting rapidly more complicated
with the support for EL2 and EL3. Split the M profile
code out into its own implementation of cpu_exec_interrupt
to keep these two things out of each others' way.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1414684132-23971-2-git-send-email-peter.maydell@linaro.org
2014-11-04 12:05:23 +00:00
Peter Maydell
f4df22102a target-arm/translate.c: Don't pass CPUARMState * to disas_arm_insn()
Refactor to avoid passing a CPUARMState * to disas_arm_insn(). To do this
we move the "read insn from memory" code to the callsite and pass the
insn to the function instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1414524244-20316-6-git-send-email-peter.maydell@linaro.org
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-11-04 12:05:11 +00:00
Peter Maydell
7dcc1f894d target-arm/translate.c: Don't pass CPUARMState around in the decoder
Passing the CPUARMState around in the decoder is a recipe for
bugs where we accidentally generate code that depends on CPU
state which isn't reflected in the TB flags. Stop doing this
and instead use DisasContext as a way to pass around those
bits of CPU state which are known to be safe to use.

This commit simply removes initial "CPUARMState *env" parameters
from various function definitions, and removes the initial "env"
argument from the places where those functions are called.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1414524244-20316-5-git-send-email-peter.maydell@linaro.org
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-11-04 12:05:06 +00:00
Peter Maydell
b53d8923a5 target-arm/translate.c: Don't use IS_M()
Instead of using IS_M(), use arm_dc_feature(s, ARM_FEATURE_M), so we
don't need to pass CPUARMState pointers around the decoder.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1414524244-20316-4-git-send-email-peter.maydell@linaro.org
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-11-04 12:05:03 +00:00
Peter Maydell
d614a51378 target-arm/translate.c: Use arm_dc_feature() rather than arm_feature()
Use arm_dc_feature() rather than arm_feature() to avoid using
CPUARMState unnecessarily.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1414524244-20316-3-git-send-email-peter.maydell@linaro.org
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-11-04 12:04:25 +00:00
Cole Robinson
b0f3182064 gtk: Hide the menubar when in fullscreen mode (lp 1294898)
In fullscreen mode, we attempt to shrink the menubar to 1 pixel in height,
so it takes up as little room as possible while still allowing us to use
the keyboard shortcuts for its various operations.

However this shrinking is disregarded on gtk3, so the entire menu bar is
visible, which isn't very pleasant. This patch hides the menu bar instead.

The side effect is that the only keyboard shortcuts that will work in this
mode are the ones that we explicitly register on the top level window and
not the menu bar. The previous patches changed the fullscreen and vc
shortcuts to work like that, which I think are the only ones that really
matter in for the fullscreen case.

https://bugs.launchpad.net/qemu/+bug/1294898
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-04 08:15:21 +01:00
Cole Robinson
277836c82b gtk: Install vc accelerators on parent window
So they are usable when we hide the menubar in upcoming patches. This
has the accelerator text caveat as the fullscreen bit in the previous
patch.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-04 08:15:21 +01:00
Cole Robinson
9541491461 gtk: Install fullscreen accelerator on toplevel window
Instead of installing it on the menu. This will be needed to keep the
fullscreen keyboard shortcut working when we hide the menu (in future
patches).

On gtk < 3.8, this has the unfortunate side effect of no longer listing
the key combo in the UI. We could manually change the label in that case,
but it will look visually out of place, and I'm not sure if anyone really
cares.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-04 08:15:21 +01:00
Cole Robinson
400519d24a gtk: Grab accel_group from GtkDisplayState
Rather than needlessly pass it around

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-11-04 08:15:21 +01:00
Peter Maydell
2b51668fca target-arm/translate.c: Use arm_dc_feature() in ENABLE_ARCH_ macros
All the places where we use the ENABLE_ARCH_* and ARCH() macros have a
DisasContext* s, so switch them over to use arm_dc_feature() rather than
arm_feature() so we don't need to pass the CPUARMState* env around too.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1414524244-20316-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-11-04 00:39:16 +00:00
Peter Maydell
d780615520 Merge remote-tracking branch 'remotes/lalrae/tags/mips-20141103' into staging
* remotes/lalrae/tags/mips-20141103: (34 commits)
  target-mips: add MSA support to mips32r5-generic
  disas/mips.c: disassemble MSA instructions
  target-mips: add MSA MI10 format instructions
  target-mips: add MSA 2RF format instructions
  target-mips: add MSA VEC/2R format instructions
  target-mips: add MSA 3RF format instructions
  target-mips: add MSA ELM format instructions
  target-mips: add MSA 3R format instructions
  target-mips: add MSA BIT format instructions
  target-mips: add MSA I5 format instruction
  target-mips: add MSA I8 format instructions
  target-mips: add MSA branch instructions
  target-mips: add msa_helper.c
  target-mips: add msa_reset(), global msa register
  target-mips: add MSA opcode enum
  target-mips: stop translation after ctc1
  target-mips: remove duplicated mips/ieee mapping function
  target-mips: add MSA exceptions
  target-mips: add MSA defines and data structure
  target-mips: enable features in MIPS64R6-generic CPU
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04 00:17:45 +00:00
Mark Cave-Ayland
e3b561be48 Update OpenBIOS images
Update OpenBIOS images to SVN r1321 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-11-04 00:02:33 +00:00
Peter Maydell
949ca9e479 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, virtio, misc bugfixes

A bunch of minor bugfixes all over the place.

changes from v2:
    added cpu hotplug rework
    added default vga type switch
    more fixes
changes from v1:
    fix for test re-generation script
    add missing acks to two patches

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 03 Nov 2014 16:33:13 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (28 commits)
  vga: flip qemu 2.2 pc machine types from cirrus to stdvga
  vga: add default display to machine class
  vhost-user: fix mmap offset calculation
  hw/i386/acpi-build.c: Fix memory leak in acpi_build_tables_cleanup()
  smbios: Encode UUID according to SMBIOS specification
  pc: Add pc_compat_2_1() function
  hw/virtio/vring/event_idx: fix the vring_avail_event error
  hw/pci: fixed hotplug crash when using rombar=0 with devices having romfile
  hw/pci: fixed error flow in pci_qdev_init
  -machine vmport=off: Allow disabling of VMWare ioport emulation
  acpi/cpu-hotplug: introduce helper function to keep bit setting in one place
  cpu-hotplug: rename function for better readability
  qom/cpu: remove the unused CPU hot-plug notifier
  pc: Update rtc_cmos in pc_cpu_plug
  pc: add cpu hotplug handler to PC_MACHINE
  acpi:piix4: convert cpu hotplug to hotplug_handler API
  acpi:ich9: convert cpu hotplug to hotplug_handler API
  acpi/cpu: add cpu hotplug callback function to match hotplug_handler API
  acpi: create separate file for TCPA log
  tests: fix rebuild-expected-aml.sh for acpi-test rename
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-03 22:51:08 +00:00
Peter Maydell
47e8acb45f Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20141101' into staging
linux-user pull for 2.2

Two minor fixes and new a feature, addition of QEMU_RAND_SEED for
testing needs.

# gpg: Signature made Mon 03 Nov 2014 11:49:39 GMT using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20141101:
  elf: take phdr offset into account when calculating the program load address
  linux-user: Fix fault address truncation AArch64
  linux-user: Let user specify random seed

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-03 20:23:15 +00:00
Eduardo Habkost
1cadaa9482 target-i386: Rename KVM auto-feature-enable compat function
The x86_cpu_compat_disable_kvm_features() name was a bit confusing, as
it won't forcibly disable the feature for all CPU models (i.e. add it to
kvm_default_unset_features), but it will instead turn off the KVM
auto-enabling of the feature (i.e. remove it from kvm_default_features),
meaning the feature may still be enabled by default in some CPU models).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-03 19:39:10 +01:00
Eduardo Habkost
179b9f40f2 pc: Create pc_compat_2_1() functions
We will need new compat code for the 2.1 machine-types.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-11-03 19:36:19 +01:00
Peter Maydell
9a33c0c851 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Mon 03 Nov 2014 11:50:53 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (53 commits)
  block: declare blockjobs and dataplane friends!
  block: let commit blockjob run in BDS AioContext
  block: let mirror blockjob run in BDS AioContext
  block: let stream blockjob run in BDS AioContext
  block: let backup blockjob run in BDS AioContext
  block: add bdrv_drain()
  blockjob: add block_job_defer_to_main_loop()
  blockdev: add note that block_job_cb() must be thread-safe
  blockdev: acquire AioContext in blockdev_mark_auto_del()
  blockdev: acquire AioContext in do_qmp_query_block_jobs_one()
  block: acquire AioContext in generic blockjob QMP commands
  iotests: Expand test 061
  block/qcow2: Simplify shared L2 handling in amend
  block/qcow2: Make get_refcount() global
  block/qcow2: Implement status CB for amend
  qemu-img: Fix insignificant memleak
  qemu-img: Add progress output for amend
  block: Add status callback to bdrv_amend_options()
  block: qemu-iotest 107 supports NFS
  iotests: Add test for qcow2's bdrv_make_empty
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-03 18:34:09 +00:00
Zhu Guihua
3a0614c6c7 icc_bus: fix typo ICC_BRIGDE -> ICC_BRIDGE
Rename ICC_BRIGDE for better readability.

Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-03 19:51:56 +03:00
Peter Maydell
eb5f222b5c Merge remote-tracking branch 'remotes/xtensa/tags/20141103-xtensa' into staging
Xtensa fixes and improvements 2014-11-03:
- build fixes for cores w/o windowed registers and with profiling
  interrupts;
- fix uImage load address for MMUv2 cores;
- add script for automatic core import from xtensa configuration overlay.

# gpg: Signature made Sun 02 Nov 2014 22:04:44 GMT using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"

* remotes/xtensa/tags/20141103-xtensa:
  MAINTAINERS: update xtensa boards
  target-xtensa: fix build for cores w/o windowed registers
  target-xtensa: add core importing script
  hw/xtensa/xtfpga: treat uImage load address as virtual
  hw/core/loader: implement address translation in uimage loader
  target-xtensa: avoid duplicate timer interrupt delivery
  target-xtensa: tests: pre-process tests linker script
  target-xtensa: add definition for XTHAL_INTTYPE_PROFILING

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-03 16:43:32 +00:00
Gerd Hoffmann
d43f0d641e vga: flip qemu 2.2 pc machine types from cirrus to stdvga
This patch switches the default display from cirrus to vga
for the new (qemu 2.2+) machine types.  Old machines types
stay as-is for compatibility reasons.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-03 18:32:48 +02:00
Gerd Hoffmann
6f00494abe vga: add default display to machine class
This allows machine classes to specify which display device they want
as default.  If unspecified the current behavior (try cirrus, failing
that try stdvga, failing that use no display) will be used.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-03 18:32:48 +02:00
Michael S. Tsirkin
d3f16ec887 vhost-user: fix mmap offset calculation
qemu_get_ram_block_host_ptr should get ram_addr_t,
vhost-user passes in GPA.
That's very wrong.

Reported-by: Linhaifeng <haifeng.lin@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-03 18:32:48 +02:00
Peter Maydell
7135781f65 Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2014-11-02' into staging
trivial patches for 2014-11-02

# gpg: Signature made Sun 02 Nov 2014 11:54:43 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2014-11-02: (23 commits)
  vdi: wrapped uuid_unparse() in #ifdef
  tap: fix possible fd leak in net_init_tap
  tap: do not close(fd) in net_init_tap_one
  target-i386: Remove unused model_features_t struct
  tap_int.h: remove repeating NETWORK_SCRIPT defines
  os-posix: reorder parent notification for -daemonize
  pidfile: stop making pidfile error a special case
  os-posix: replace goto again with a proper loop
  os-posix: use global daemon_pipe instead of cryptic fds[1]
  dump: Fix dump-guest-memory termination and use-after-close
  virtio-9p-proxy: improve error messages in connect_namedsocket()
  virtio-9p-proxy: fix error return in proxy_init()
  virtio-9p-proxy: Fix sockfd leak
  target-tricore: check return value before using it
  net/slirp: specify logbase for smbd
  Revert "os-posix: report error message when lock file failed"
  util: Improve os_mem_prealloc error message
  sparse: fix build
  target-arm: A64: remove redundant store
  target-xtensa: mark XtensaConfig structs as unused
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-03 14:55:17 +00:00
Peter Maydell
f67d23b1ae Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
The last round of patches for soft freeze.  Includes ivshmem bugfixes,
megasas 2108 emulation, and other small patches here and there.

# gpg: Signature made Fri 31 Oct 2014 17:17:54 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (35 commits)
  virtio-scsi: fix dataplane
  ivshmem: use error_report
  ivshmem: Fix fd leak on error
  ivshmem: Fix potential OOB r/w access
  ivshmem: validate incoming_posn value from server
  ivshmem: Check ivshmem_read() size argument
  i386: fix breakpoints handling in icount mode
  kvm_stat: Add powerpc support
  kvm_stat: Abstract ioctl numbers
  kvm_stat: Rework platform detection
  kvm_stat: Fix the non-x86 exit reasons
  kvm_stat: Only consider online cpus
  virtio-scsi: Fix num_queue input validation
  scsi: devirtualize unrealize of SCSI devices
  virtio-scsi: Fix memory leak when realize failed
  iscsi: Refuse to open as writable if the LUN is write protected
  kvmvapic: patch_instruction fix
  vl.c: Fix Coverity complaining for vmstate_dump_file
  Add skip_dump flag to ignore memory region during dump
  -machine vmport=off: Allow disabling of VMWare ioport emulation
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-03 12:31:07 +00:00
Yongbok Kim
55a2201e79 target-mips: add MSA support to mips32r5-generic
add MSA support to mips32r5-generic core definition

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
ed8a933f97 disas/mips.c: disassemble MSA instructions
disassemble MIPS SIMD Architecture instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
f7685877f5 target-mips: add MSA MI10 format instructions
add MSA MI10 format instructions
update LSA and DLSA for MSA

add 16, 64 bit load and store

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
3bdeb68866 target-mips: add MSA 2RF format instructions
add MSA 2RF format instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
cbe50b9a8e target-mips: add MSA VEC/2R format instructions
add MSA VEC/2R format instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
7d05b9c86f target-mips: add MSA 3RF format instructions
add MSA 3RF format instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
1e608ec14e target-mips: add MSA ELM format instructions
add MSA ELM format instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
28f99f08cf target-mips: add MSA 3R format instructions
add MSA 3R format instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
d4cf28dec2 target-mips: add MSA BIT format instructions
add MSA BIT format instructions

Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
80e7159184 target-mips: add MSA I5 format instruction
add MSA I5 format instructions

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
4c7895465e target-mips: add MSA I8 format instructions
add MSA I8 format instructions

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
5692c6e1f8 target-mips: add MSA branch instructions
add MSA branch instructions

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
42daa9bed4 target-mips: add msa_helper.c
add msa_helper.c

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
863f264d10 target-mips: add msa_reset(), global msa register
add msa_reset() and global msa register (d type only)

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
239dfebe12 target-mips: add MSA opcode enum
add MSA opcode enum

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
4cf8a45f56 target-mips: stop translation after ctc1
stop translation as ctc1 instruction can change hflags

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
b7651e9521 target-mips: remove duplicated mips/ieee mapping function
Remove the duplicated ieee_rm in gdbstub.c.
Make the other ieee_rm and ieee_ex_to_mips available to other files.

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
b10ac20446 target-mips: add MSA exceptions
add MSA exceptions

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Yongbok Kim
e97a391d20 target-mips: add MSA defines and data structure
add defines and data structure for MIPS SIMD Architecture

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-11-03 11:48:35 +00:00
Leon Alrae
2d9e48bc04 target-mips: enable features in MIPS64R6-generic CPU
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:35 +00:00
Leon Alrae
f31b035a9f target-mips: correctly handle access to unimplemented CP0 register
Release 6 limits the number of cases where software can cause UNDEFINED or
UNPREDICTABLE behaviour. In this case, when accessing reserved / unimplemented
CP0 register, writes are ignored and reads return 0.

In pre-R6 the behaviour is not specified, but generating RI exception is not
what the real HW does.

Additionally, remove CP0 Random register as it became reserved in Release 6.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
ba801af429 target-mips: add restrictions for possible values in registers
In Release 6 not all the values are allowed to be written to a register.
If the value is not valid or unsupported then it should stay unchanged.

For pre-R6 the existing behaviour has been changed only for CP0_Index register
as the current implementation does not seem to be correct - it looks like it
tries to limit the input value but the limit is higher than the actual
number of tlb entries.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
a63eb0ce0f target-mips: CP0_Status.CU0 no longer allows the user to access CP0
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
339cd2a82a target-mips: implement forbidden slot
When conditional compact branch is encountered decode one more instruction in
current translation block - that will be forbidden slot. Instruction in
forbidden slot will be executed only if conditional compact branch is not taken.

Any control transfer instruction (CTI) which are branches, jumps, ERET,
DERET, WAIT and PAUSE will generate RI exception if executed in forbidden or
delay slot.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
faf1f68ba1 target-mips: add Config5.SBRI
SDBBP instruction Reserved Instruction control. The purpose of this field is
to restrict availability of SDBBP to kernel mode operation.

If the bit is set then SDBBP instruction can only be executed in kernel mode.
User execution of SDBBP will cause a Reserved Instruction exception.

Additionally add missing Config4 and Config5 cases for dm{f,t}c0.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
460c81f14a target-mips: update cpu_save/cpu_load to support new registers
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
aea14095ea target-mips: add BadInstr and BadInstrP support
BadInstr Register (CP0 Register 8, Select 1)
The BadInstr register is a read-only register that capture the most recent
instruction which caused an exception.

BadInstrP Register (CP0 Register 8, Select 2)
The BadInstrP register contains the prior branch instruction, when the
faulting instruction is in a branch delay slot.

Using error_code to indicate whether AdEL or TLBL was triggered during
instruction fetch, in this case BadInstr is not updated as valid instruction
word is not available.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
9456c2fbcd target-mips: add TLBINV support
For Standard TLB configuration (Config.MT=1):

TLBINV invalidates a set of TLB entries based on ASID. The virtual address is
ignored in the entry match. TLB entries which have their G bit set to 1 are not
modified.

TLBINVF causes all entries to be invalidated.

Single TLB entry can be marked as invalid on TLB entry write by having
EntryHi.EHINV set to 1.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
92ceb440d4 target-mips: add new Read-Inhibit and Execute-Inhibit exceptions
An Execute-Inhibit exception occurs when the virtual address of an instruction
fetch matches a TLB entry whose XI bit is set. This exception type can only
occur if the XI bit is implemented within the TLB and is enabled, this is
denoted by the PageGrain XIE bit.

An Read-Inhibit exception occurs when the virtual address of a memory load
reference matches a TLB entry whose RI bit is set. This exception type can
only occur if the RI bit is implemented within the TLB and is enabled, this is
denoted by the PageGrain RIE bit.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
7207c7f9d7 target-mips: update PageGrain and m{t,f}c0 EntryLo{0,1}
PageGrain needs rw bitmask which differs between MIPS architectures.
In pre-R6 if RIXI is supported, PageGrain.XIE and PageGrain.RIE are writeable,
whereas in R6 they are read-only 1.

On MIPS64 mtc0 instruction left shifts bits 31:30 for MIPS32 backward
compatiblity, therefore there are separate mtc0 and dmtc0 helpers.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
2fb58b7374 target-mips: add RI and XI fields to TLB entry
In Revision 3 of the architecture, the RI and XI bits were added to the TLB
to enable more secure access of memory pages. These bits (along with the Dirty
bit) allow the implementation of read-only, write-only, no-execute access
policies for mapped pages.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
9f6bcedba6 target-mips: distinguish between data load and instruction fetch
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
55e9409366 softmmu: provide softmmu access type enum
New MIPS features depend on the access type and enum is more convenient than
using the numbers directly.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
2014-11-03 11:48:34 +00:00
Leon Alrae
e98c0d179f target-mips: add KScratch registers
KScratch<n> Registers (CP0 Register 31, Selects 2 to 7)

The KScratch registers are read/write registers available for scratch pad
storage by kernel mode software. They are 32-bits in width for 32-bit
processors and 64-bits for 64-bit processors.

CP0Config4.KScrExist[2:7] bits indicate presence of CP0_KScratch1-6 registers.
For Release 6, all KScratch registers are required.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-11-03 11:48:34 +00:00
Stefan Hajnoczi
b112a65c52 block: declare blockjobs and dataplane friends!
Now that blockjobs use AioContext they are safe for use with dataplane.
Unblock them!

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-12-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
9e85cd5ce0 block: let commit blockjob run in BDS AioContext
The commit block job must run in the BlockDriverState AioContext so that
it works with dataplane.

Acquire the AioContext in blockdev.c so starting the block job is safe.
One detail here is that the bdrv_drain_all() must be moved inside the
aio_context_acquire() region so requests cannot sneak in between the
drain and acquire.

The completion code in block/commit.c must perform backing chain
manipulation and bdrv_reopen() from the main loop.  Use
block_job_defer_to_main_loop() to achieve that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-11-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
5a7e7a0bad block: let mirror blockjob run in BDS AioContext
The mirror block job must run in the BlockDriverState AioContext so that
it works with dataplane.

Acquire the AioContext in blockdev.c so starting the block job is safe.

Note that to_replace is treated separately from other BlockDriverStates
in that it does not need to be in the same AioContext.  Explicitly
acquire/release to_replace's AioContext when accessing it.

The completion code in block/mirror.c must perform BDS graph
manipulation and bdrv_reopen() from the main loop.  Use
block_job_defer_to_main_loop() to achieve that.

The bdrv_drain_all() call is not allowed outside the main loop since it
could lead to lock ordering problems.  Use bdrv_drain(bs) instead
because we have acquired the AioContext so nothing else can sneak in
I/O.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-10-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
f3e69beb94 block: let stream blockjob run in BDS AioContext
The stream block job must run in the BlockDriverState AioContext so that
it works with dataplane.

The basics of acquiring the AioContext are easy in blockdev.c.

The tricky part is the completion code which drops part of the backing
file chain.  This must be done in the main loop where bdrv_unref() and
bdrv_close() are safe to call.  Use block_job_defer_to_main_loop() to
achieve that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-9-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
761731b180 block: let backup blockjob run in BDS AioContext
The backup block job must run in the BlockDriverState AioContext so that
it works with dataplane.

The basics of acquiring the AioContext are easy in blockdev.c.

The completion code in block/backup.c must call bdrv_unref() from the
main loop.  Use block_job_defer_to_main_loop() to achieve that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-8-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
5b98db0ad3 block: add bdrv_drain()
Now that op blockers are in use, we can ensure that no other sources are
generating I/O on a BlockDriverState.  Therefore it is possible to drain
requests for a single BDS.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-7-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
dec7d421f8 blockjob: add block_job_defer_to_main_loop()
Block jobs will run in the BlockDriverState's AioContext, which may not
always be the QEMU main loop.

There are some block layer APIs that are either not thread-safe or risk
lock ordering problems.  This includes bdrv_unref(), bdrv_close(), and
anything that calls bdrv_drain_all().

The block_job_defer_to_main_loop() API allows a block job to schedule a
function to run in the main loop with the BlockDriverState AioContext
held.

This function will be used to perform cleanup and backing chain
manipulations in block jobs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-6-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
723c5d93c5 blockdev: add note that block_job_cb() must be thread-safe
This function is correct but we should document the constraint that
everything must be thread-safe.

Emitting QMP events and scheduling BHs are both thread-safe so nothing
needs to be done here.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-5-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
91fddb0db6 blockdev: acquire AioContext in blockdev_mark_auto_del()
When an emulated storage controller is unrealized it will call
blockdev_mark_auto_del().  This will cancel any running block job (and
that eventually releases its reference to the BDS so it can be freed).

Since the block job may be executing in another AioContext we must
acquire/release to ensure thread safety.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-4-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
69691e7270 blockdev: acquire AioContext in do_qmp_query_block_jobs_one()
Make sure that query-block-jobs acquires the BlockDriverState
AioContext so that the blockjob isn't running in another thread while we
access its state.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-3-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Stefan Hajnoczi
3d948cdf37 block: acquire AioContext in generic blockjob QMP commands
block-job-set-speed, block-job-cancel, block-job-pause,
block-job-resume, and block-job-complete must acquire the
BlockDriverState AioContext so that it is safe to access bs.

At the moment bs->job is always NULL when dataplane is active because op
blockers prevent blockjobs from starting.  Once the rest of the blockjob
API has been made aware of AioContext we can drop the op blocker.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-2-git-send-email-stefanha@redhat.com
2014-11-03 11:41:49 +00:00
Max Reitz
78fa65821d iotests: Expand test 061
Add some tests for progress output to 061.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1414404776-4919-8-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:49 +00:00
Max Reitz
ecf58777c5 block/qcow2: Simplify shared L2 handling in amend
Currently, we have a bitmap for keeping track of which clusters have
been created during the zero cluster expansion process. This was
necessary because we need to properly increase the refcount for shared
L2 tables.

However, now we can simply take the L2 refcount and use it for the
cluster allocated for expansion. This will be the correct refcount and
therefore we don't have to remember that cluster having been allocated
any more.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1414404776-4919-7-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:49 +00:00
Max Reitz
44751917db block/qcow2: Make get_refcount() global
Reading the refcount of a cluster is an operation which can be useful in
all of the qcow2 code, so make that function globally available.

While touching this function, amend the comment describing the "addend"
parameter: It is (no longer, if it ever was) necessary to have it set to
-1 or 1; any value is fine.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1414404776-4919-6-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:49 +00:00
Max Reitz
4057a2b24a block/qcow2: Implement status CB for amend
The only really time-consuming operation potentially performed by
qcow2_amend_options() is zero cluster expansion when downgrading qcow2
images from compat=1.1 to compat=0.10, so report status of that
operation and that operation only through the status CB.

For this, approximate the progress as the number of L1 entries visited
during the operation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1414404776-4919-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:49 +00:00
Max Reitz
b2f27e4438 qemu-img: Fix insignificant memleak
As soon as options is set in img_amend(), it needs to be freed before
the function returns. This leak is rather insignificant, as qemu-img
will exit subsequently anyway, but there's no point in not fixing it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1414404776-4919-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:49 +00:00
Max Reitz
76a3a34dce qemu-img: Add progress output for amend
Now that bdrv_amend_options() supports a status callback, use it to
display a progress report.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414404776-4919-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
7748543420 block: Add status callback to bdrv_amend_options()
Depending on the changed options and the image format,
bdrv_amend_options() may take a significant amount of time. In these
cases, a way to be informed about the operation's status is desirable.

Since the operation is rather complex and may fundamentally change the
image, implementing it as AIO or a coroutine does not seem feasible. On
the other hand, implementing it as a block job would be significantly
more difficult than a simple callback and would not add benefits other
than progress report to the amending operation, because it should not
actually be run as a block job at all.

A callback may not be very pretty, but it's very easy to implement and
perfectly fits its purpose here.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414404776-4919-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Peter Lieven
9ea92c2106 block: qemu-iotest 107 supports NFS
As discussed during review a follow up for Max's fix.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1414249537-29257-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
7d90030196 iotests: Add test for qcow2's bdrv_make_empty
Add a test for qcow2's fast bdrv_make_empty implementation on images
without internal snapshots.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-15-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
e6ea23126c iotests: Add test for backing-chain commits
Add a test for qemu-img commit on backing chains with more than two
images. This test also checks whether the top image is emptied (unless
this is prevented by specifying either -d or -b) and does therefore not
work for qed and vmdk which requires it to be separate from 020.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-14-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
f67ac71edb iotests: Add _filter_qemu_img_map
As different image formats most probably map guest addresses to
different host addresses, add a filter to filter the host addresses out;
also, the image filename should be filtered.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-13-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
1b22bffd82 qemu-img: Specify backing file for commit
Introduce a new parameter for qemu-img commit which may be used to
explicitly specify the backing file into which an image should be
committed if the backing chain has more than a single layer.

[Applied Eric Blake's qemu-img.texi documentation rewording
--Stefan]

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-12-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
687fa1d830 qemu-img: Enable progress output for commit
Implement progress output for the commit command by querying the
progress of the block job.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-11-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
9a86fe4895 qemu-img: Empty image after commit
After the top image has been committed, it should be emptied unless
specified otherwise.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-10-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
d4a3238af5 qemu-img: Implement commit like QMP
qemu-img should use QMP commands whenever possible in order to ensure
feature completeness of both online and offline image operations. As
qemu-img itself has no access to QMP (since this would basically require
just everything being linked into qemu-img), imitate QMP's
implementation of block-commit by using commit_active_start() and then
waiting for the block job to finish.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-9-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
b21c76529d block/mirror: Improve progress report
Instead of taking the total length of the block device as the block
job's length, use the number of dirty sectors. The progress is now the
number of sectors mirrored to the target block device. Note that this
may result in the job's length increasing during operation, which is
however in fact desirable.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-8-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
1d3ba15acc iotests: Omit length/offset test in 040 and 041
As of a follow-up patch to this one, the length of a mirror block job
will no longer directly depend on the size of the block device;
therefore, drop these checks from this test. Instead, just check whether
the final offset equals the block job length.

As 041 uses the wait_until_completed function from iotests.py, the same
applies there as well which in turn affects tests 030, 055 and 056. On
the other hand, a block job's length does not have to be related to the
length of the image file in the first place, so that check was
questionable anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-7-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
ef6dbf1e46 blockjob: Add "ready" field
When a block job signals readiness, this is currently reported only
through QMP. If qemu wants to use block jobs for internal tasks, there
needs to be another way to correctly detect when a block job may be
completed.

For this reason, introduce a bool "ready" which is set when the block
job may be completed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-6-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
345f9e1b04 blockjob: Introduce block_job_complete_sync()
Implement block_job_complete_sync() by doing the exact same thing as
block_job_cancel_sync() does, only with calling block_job_complete()
instead of block_job_cancel().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
94054183da qcow2: Optimize bdrv_make_empty()
bdrv_make_empty() is currently only called if the current image
represents an external snapshot that has been committed to its base
image; it is therefore unlikely to have internal snapshots. In this
case, bdrv_make_empty() can be greatly sped up by emptying the L1 and
refcount table (while having the dirty flag set, which only works for
compat=1.1) and creating a trivial refcount structure.

If there are snapshots or for compat=0.10, fall back to the simple
implementation (discard all clusters).

[Applied s/clusters/cluster/ typo fix suggested by Eric Blake
--Stefan]

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
491d27e2af qcow2: Implement bdrv_make_empty()
Implement this function by making all clusters in the image file fall
through to the backing file (by using the recently extended discard).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz
808c4b6f30 qcow2: Allow "full" discard
Normally, discarded sectors should read back as zero. However, there are
cases in which a sector (or rather cluster) should be discarded as if
they were never written in the first place, that is, reading them should
fall through to the backing file again.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:47 +00:00
Max Reitz
70a5ff6bdd iotests: Add test for external image truncation
It should not be happening, but it is possible to truncate an image
outside of qemu while qemu is running (or any of the qemu tools using
the block layer. raw_co_get_block_status() should not break then.

While touching this test, replace the existing "truncate" invocation by
"$QEMU_IMG convert -f raw".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414148280-17949-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:47 +00:00
Max Reitz
d7f62751a1 raw-posix: raw_co_get_block_status() return value
Instead of generating the full return value thrice in try_fiemap(),
try_seek_hole() and as a fall-back in raw_co_get_block_status() itself,
generate the value only in raw_co_get_block_status().

While at it, also remove the pnum parameter from try_fiemap() and
try_seek_hole().

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414148280-17949-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:47 +00:00
Max Reitz
e6d7ec32dd raw-posix: Fix raw_co_get_block_status() after EOF
As its comment states, raw_co_get_block_status() should unconditionally
return 0 and set *pnum to 0 for after EOF.

An assertion after lseek(..., SEEK_HOLE) tried to catch this case by
asserting that errno != -ENXIO (which would indicate a position after
the EOF); but it should be errno != ENXIO instead. Regardless of that,
there should be no such assertion at all. If bdrv_getlength() returned
an outdated value and the image has been resized outside of qemu,
lseek() will return with errno == ENXIO. Just return that value as an
error then.

Setting *pnum to 0 and returning 0 should not be done here, as in that
case we should update the device length as well. So, from qemu's
perspective, the file has not been resized; it's just that there was an
error querying sectors beyond a certain point (the actual file size).

Additionally, nb_sectors should be clamped against the image end. This
was probably not an issue if FIEMAP or SEEK_HOLE/SEEK_DATA worked, but
the fallback did not take this case into account.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414148280-17949-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:47 +00:00
Richard W.M. Jones
f76faeda4b block/curl: Improve type safety of s->timeout.
qemu_opt_get_number returns a uint64_t, and curl_easy_setopt expects a
long (not an int).  There is no warning about the latter type error
because curl_easy_setopt uses a varargs argument.

Store the timeout (which is a positive number of seconds) as a
uint64_t.  Check that the number given by the user is reasonable.
Zero is permissible (meaning no timeout is enforced by cURL).

Cast it to long before calling curl_easy_setopt to fix the type error.

Example error message after this change has been applied:

$ ./qemu-img create -f qcow2 /tmp/test.qcow2 \
    -b 'json: { "file.driver":"https",
                "file.url":"https://foo/bar",
                "file.timeout":-1 }'
qemu-img: /tmp/test.qcow2: Could not open 'json: { "file.driver":"https", "file.url":"https://foo/bar", "file.timeout":-1 }': timeout parameter is too large or negative: Invalid argument

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:47 +00:00
Zhang Haoyu
3432a1929e snapshot: add bdrv_drain_all() to bdrv_snapshot_delete() to avoid concurrency problem
If there are still pending i/o while deleting snapshot,
because deleting snapshot is done in non-coroutine context, and
the pending i/o read/write (bdrv_co_do_rw) is done in coroutine context,
so it's possible to cause concurrency problem between above two operations.
Add bdrv_drain_all() to bdrv_snapshot_delete() to avoid this problem.

Signed-off-by: Zhang Haoyu <zhanghy@sangfor.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 201410211637596311287@sangfor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:42 +00:00
Peter Maydell
573742a543 block.c: Fix type of IoOperationType variable in send_qmp_error_event()
The local variable 'ac' in send_qmp_error_event() is declared with the
wrong type, which causes clang to complain when it is initialized
and again when it is used:

block.c:3655:20: warning: implicit conversion from enumeration type 'enum IoOperationType' to different enumeration type 'BlockErrorAction' (aka 'enum BlockErrorAction') [-Wenum-conversion]
    ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
       ~           ^~~~~~~~~~~~~~~~~~~~~~
block.c:3655:45: warning: implicit conversion from enumeration type 'enum IoOperationType' to different enumeration type 'BlockErrorAction' (aka 'enum BlockErrorAction') [-Wenum-conversion]
    ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
       ~                                    ^~~~~~~~~~~~~~~~~~~~~~~
block.c:3656:62: warning: implicit conversion from enumeration type 'BlockErrorAction' (aka 'enum BlockErrorAction') to different enumeration type 'IoOperationType' (aka 'enum IoOperationType') [-Wenum-conversion]
    qapi_event_send_block_io_error(bdrv_get_device_name(bs), ac, action,
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                           ^~

Correct the type to IoOperationType, and rename the variable
to 'optype' to match its correct type.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Message-id: 1412969583-21045-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Adam Crume
be21788495 rbd: Add support for bdrv_invalidate_cache
This fixes Ceph issue 2467: ttp://tracker.ceph.com/issues/2467

[Dropped return r in void function as suggested by Josh Durgin
<josh.durgin@inktank.com>.
--Stefan]

Signed-off-by: Adam Crume <adamcrume@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1412880272-3154-1-git-send-email-adamcrume@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Denis V. Lunev
da725d0b0e block/parallels: fix access to not initialized memory in catalog_bitmap
found by valgrind.

Command: ./qemu-img convert -f parallels -O qcow2 1.hds 1.img
Invalid read of size 4
   at 0x17D0EF: parallels_co_read (parallels.c:357)
   by 0x11FEE4: bdrv_aio_rw_vector (block.c:4640)
   by 0x11FFBF: bdrv_aio_readv_em (block.c:4652)
   by 0x11F55F: bdrv_co_readv_em (block.c:4862)
   by 0x123428: bdrv_aligned_preadv (block.c:3056)
   by 0x1239FA: bdrv_co_do_preadv (block.c:3162)
   by 0x125424: bdrv_rw_co_entry (block.c:2706)
   by 0x155DD9: coroutine_trampoline (coroutine-ucontext.c:118)
   by 0x6975B6F: ??? (in /lib/x86_64-linux-gnu/libc-2.19.so)

The problem is that s->catalog_bitmap is allocated/filled as
gmalloc(s->catalog_size) thus index validity check must be
inclusive, i.e. index >= s->catalog_size is invalid.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1412759610-2257-4-git-send-email-den@openvz.org
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Denis V. Lunev
76823c6e79 iotests: add v2 parallels sample image and simple test for it
This is simple test image for the following commit made by me.

    commit d25d598020
    Author: Denis V. Lunev <den@openvz.org>
    Date:   Mon Jul 28 20:23:55 2014 +0400
    parallels: 2TB+ parallels images support

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1412759610-2257-3-git-send-email-den@openvz.org
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Denis V. Lunev
285030b0de iotests: replace fake parallels image with authentic one
The image was generated using http://openvz.org/Ploop utility and properly
filled with the same content as original one.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1412759610-2257-2-git-send-email-den@openvz.org
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Chris Spiegel
ba2b22888c snapshot: Reset err to NULL to avoid double free
If an error occurs in bdrv_snapshot_delete_by_id_or_name(), "err" is
freed.  If "err" is not set to NULL before calling
bdrv_snapshot_delete_by_id_or_name() again, it will not be updated on
error, and will be freed again.

This can be triggered by starting a VM with at least two drives and then
attempting to delete a non-existent snapshot.

Broken in commit a89d89d.

Signed-off-by: Chris Spiegel <chris.spiegel@cypherpath.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1412613225-32676-1-git-send-email-chris.spiegel@cypherpath.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
John Snow
54a7f8f38d ahci: Fix SDB FIS Construction
The SDB FIS creation was mangled;
We were writing the error byte to byte 0,
and omitting the SDB FIS magic byte.

Though the SDB packet layout states that:
byte 0: Must be 0xA1 to indicate SDB FIS.
byte 1: Port multiplier select & other flags
byte 2: status byte.
byte 3: error byte.

This patch adds an SDB FIS structure with
human-readable names, and ensures that we
are filling the structure appropriately.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
John Snow
659142ecf7 ahci: Update byte count after DMA completion
Currently, DMA read/write operations neglect to update
the byte count after a successful transfer like ATAPI
DMA read or PIO read/write operations do.

We correct this oversight by adding another callback into
the IDEDMAOps structure. The commit callback is called
whenever we are cleaning up a scatter-gather list.
AHCI can register this callback in order to update post-
transfer information such as byte count updates.

We use this callback in AHCI to consolidate where we delete
the SGlist as generated from the PRDT, as well as update the
byte count after the transfer is complete.

The QEMUSGList structure has an init flag added to it in order
to make qemu_sglist_destroy a nop if it is called when
there is no sglist, which simplifies cleanup and error paths.

This patch fixes several AHCI problems, notably Non-NCQ modes
of operation for Windows 7 as well as Hibernate support for Windows 7.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
John Snow
7b8bad1b6a ahci: Correct PIO/D2H FIS responses
Currently, the D2H FIS packets AHCI generates simply parrot back
the LBA that the guest sent to us in the cmd_fis. However, some
commands (like READ NATIVE MAX) modify the LBA registers as a
return value, through which the AHCI D2H FIS is the only response
mechanism. Thus, the D2H response should use the current register
values, not the initial ones.

This patch adjusts the LBA and drive select register responses for
PIO Setup and D2H FIS response packets.

Additionally, the PIO and D2H FIS responses copy too many bytes
from the command FIS that it is being generated from. Specifically,
byte 11 which is the Features(15:8) field for Register Host to
Device FIS packets, is instead reserved for the PIO Setup FIS and
should always be 0.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Peter Lieven
dc9e716369 block/iscsi: check for oversized requests
Cancel oversized requests early. They would generate
an iSCSI protocol error anyway; after having transferred
possibly a lot of data over the wire.

Suggested-By: Max Reitz <mreitz@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Peter Lieven
3dab155154 block/iscsi: use sector_limits_lun2qemu throughout iscsi_refresh_limits
As Max pointed out there is a hidden cast from int64_t to int for all
limits. So use the newly introduced sector_limits_lun2qemu for all
limits received from the target.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Peter Lieven
6c5a42ac34 block: avoid creating oversized writes in multiwrite_merge
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Peter Lieven
52f6fa1430 block/iscsi: set max_transfer_length
Copy the max_xfer_len from the BlockLimits VPD or use the
maximum value fitting in the CDB.

The helper function sector_limits_lun2qemu is introduced to convert
and cap the limits from the VPD to the maximum power of two fitting
in an integer; integer is the range for nb_sectors throughout
the block layer.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Peter Lieven
2647fab57d BlockLimits: introduce max_transfer_length
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Peter Lieven
ac3a872664 util: introduce MIN_NON_ZERO
at least in block layer we have the case of limits being defined for a
BlockDriverState. However, in this context often zero (0) has the special
meanining of undefined which means no limit. If two of those limits are
combined and the minimum is needed the minimum function should only return
zero if both parameters are zero.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
Jonas Maebe
a93934fecd elf: take phdr offset into account when calculating the program load address
The first program header does not necessarily start at offset 0. This change
corresponds to what the Linux kernel does in load_elf_binary().

Signed-off-by: Jonas Maebe <jonas.maebe@elis.ugent.be>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-11-03 11:03:34 +02:00
Riku Voipio
686581adcf linux-user: Fix fault address truncation AArch64
On AArch64 the si_addr field of siginfo_t is truncated to 32 bits
because the fault address passes through an uint32_t variable.

Follow Peters suggestion and drop the uint32_t variable
since its only used once in the Aarch64 loop.

Reported-by: Amanieu d'Antras <amanieu@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-11-03 11:03:34 +02:00
Magnus Reftel
c5e4a5a95e linux-user: Let user specify random seed
This patch introduces the -seed command line option and the
QEMU_RAND_SEED environment variable for setting the random seed, which
is used for the AT_RANDOM ELF aux entry.

Signed-off-by: Magnus Reftel <reftel@spotify.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-11-03 11:03:34 +02:00
Max Filippov
437a8c11c0 MAINTAINERS: update xtensa boards
- fix file names that were changed by the commit
  b707ab7 hw/xtensa: remove extraneous xtensa_ prefix from file names
- mark OpenCores 10/100 Mbit MAC model as maintained.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 01:00:37 +03:00
Max Filippov
ab5824134f target-xtensa: fix build for cores w/o windowed registers
Cores without windowed registers don't have window overflow/underflow
vectors. Move these vectors to a separate group defined conditionally.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 01:00:37 +03:00
Max Filippov
9bea2e91a6 target-xtensa: add core importing script
This script copies configuration and gdb information from the xtensa
configuration overlay archive and registers new xtensa core.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 01:00:37 +03:00
Max Filippov
6d2e453053 hw/xtensa/xtfpga: treat uImage load address as virtual
U-boot for xtensa always treats uImage load address as virtual address.
This is important when booting uImage on xtensa core with MMUv2, because
MMUv2 has fixed non-identity virtual-to-physical mapping after reset.

Always do virtual-to-physical translation of uImage load address and
load uImage at the translated address. This fixes booting uImage kernels
on dc232b and other MMUv2 cores.

Cc: qemu-stable@nongnu.org
Reported-by: Waldemar Brodkorb <mail@waldemar-brodkorb.de>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 01:00:37 +03:00
Max Filippov
25bda50a0c hw/core/loader: implement address translation in uimage loader
Such address translation is needed when load address recorded in uImage
is a virtual address. When the actual load address is requested, return
untranslated address: user that needs the translated address can always
apply translation function to it and those that need it untranslated
don't need to do the inverse translation.

Add translation function pointer and its parameter to uimage_load
prototype. Update all existing users.

No user-visible functional changes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2014-11-03 00:59:10 +03:00
Max Filippov
c9e9521fcb target-xtensa: avoid duplicate timer interrupt delivery
Timer interrupt should be raised at the same cycle when CCOUNT equals
CCOMPARE. As cycles are counted in batches, timer interrupt is sent
every time CCOMPARE lies in the interval [old CCOUNT, new CCOUNT]. This
is wrong, because when new CCOUNT equals CCOMPARE interrupt is sent
twice, once for the upper interval boundary and once for the lower. Fix
that by excluding lower interval boundary from the condition.

This doesn't have user-visible effect, because CCOMPARE reload always
causes CCOUNT increment followed by current timer interrupt reset.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 00:51:44 +03:00
Max Filippov
20303e42d4 target-xtensa: tests: pre-process tests linker script
Xtensa cores have configurable interrupt vectors and endiannes. This
information is needed to link executable images correctly for a specific
core configuration. Instead of hard-coding dc232 defaults pull endianness,
number of high-priority interrupts and location of vectors from the core
configuration and pass it through the C preprocessor.

While at it clean up tabs and align the initial stack on 16 bytes.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 00:51:43 +03:00
Max Filippov
dec71d2d63 target-xtensa: add definition for XTHAL_INTTYPE_PROFILING
There's new interrupt type in the recent Xtensa releases that may appear
in configuration overlay. Add definition so that new cores that use it
could be automatically imported.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-11-03 00:51:43 +03:00
Aurelien Jarno
0a2923f848 tcg/mips: fix store softmmu slow path
Commit 9d8bf2d1 moved the softmmu slow path out of line and introduce a
regression at the same time by always calling tcg_out_tlb_load with
is_load=1. This makes impossible to run any significant code under
qemu-system-mips*.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-11-02 13:30:00 +01:00
Nikita Belov
ac369a7796 hw/i386/acpi-build.c: Fix memory leak in acpi_build_tables_cleanup()
There are three ACPI tables: 'linker_data', 'rsdp' and 'table_data'. They are
used differently. Two of them are being copied before using and only the copy
is used later. But the third is used directly. Because of that we need to free
two tables completely and delete only wrapper for the third one.

Valgrind output:
==23931== 131,072 bytes in 1 blocks are definitely lost in loss record 7,729 of 7,734
==23931==    at 0x4C2CE8E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23931==    by 0x2EA920: realloc_and_trace (vl.c:2811)
==23931==    by 0x509E6AE: g_realloc (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4000.0)
==23931==    by 0x506DB32: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4000.0)
==23931==    by 0x506E463: g_array_set_size (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4000.0)
==23931==    by 0x256A4F: acpi_align_size (acpi-build.c:487)
==23931==    by 0x259F92: acpi_build (acpi-build.c:1601)
==23931==    by 0x25A212: acpi_setup (acpi-build.c:1682)
==23931==    by 0x24F346: pc_guest_info_machine_done (pc.c:1110)
==23931==    by 0x55FAAB: notifier_list_notify (notify.c:39)
==23931==    by 0x2EA704: qemu_run_machine_init_done_notifiers (vl.c:2759)
==23931==    by 0x2EEC3C: main (vl.c:4504)

Signed-off-by: Nikita Belov <zodiac@ispras.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-02 13:44:52 +02:00
Eduardo Habkost
caad057bb6 smbios: Encode UUID according to SMBIOS specification
Differently from older versions, SMBIOS version 2.6 is explicit about
the encoding of UUID fields:

> Although RFC 4122 recommends network byte order for all fields, the PC
> industry (including the ACPI, UEFI, and Microsoft specifications) has
> consistently used little-endian byte encoding for the first three fields:
> time_low, time_mid, time_hi_and_version. The same encoding, also known as
> wire format, should also be used for the SMBIOS representation of the UUID.
>
> The UUID {00112233-4455-6677-8899-AABBCCDDEEFF} would thus be represented
> as 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF.

The dmidecode tool implements this and decodes the above "wire format"
when SMBIOS version >= 2.6. We moved from SMBIOS version 2.4 to 2.8 when
we started building the SMBIOS entry point inside QEMU, on commit
c97294ec1b.

Change smbios_build_type_1_table() to encode the UUID as specified.

To make sure we won't change the guest-visible UUID when upgrading to a
newer QEMU version, keep the old behavior on pc-*-2.1 and older.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 13:44:52 +02:00
Eduardo Habkost
2cad57c717 pc: Add pc_compat_2_1() function
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 13:44:50 +02:00
Bin Wu
a3614c65cf hw/virtio/vring/event_idx: fix the vring_avail_event error
The event idx in virtio is an effective way to reduce the number of
interrupts and exits of the guest. When the guest puts an request
into the virtio ring, it doesn't exit immediately to inform the
backend. Instead, the guest checks the "avail" event idx to determine
the notification.

In virtqueue_pop, when a request is poped, the current avail event
idx should be set to the number of vq->last_avail_idx.

Signed-off-by: Bin Wu <wu.wubin@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-02 13:44:12 +02:00
Marcel Apfelbaum
db80c7b974 hw/pci: fixed hotplug crash when using rombar=0 with devices having romfile
Hot-plugging a device that has a romfile (either supplied by user
or built-in) using rombar=0 option is a user error,
do not allow the device to be hot-plugged.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 13:44:12 +02:00
Marcel Apfelbaum
178e785fb4 hw/pci: fixed error flow in pci_qdev_init
Verify return code for pci_add_option_rom.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-11-02 13:44:12 +02:00
Dr. David Alan Gilbert
9b23cfb76b -machine vmport=off: Allow disabling of VMWare ioport emulation
This is a pc & q35 only machine opt.

VMWare apparently doesn't like running under QEMU due to our
incomplete emulation of it's special IO Port.  This adds a
pc & q35 property to allow it to be turned off.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2014-11-02 13:44:12 +02:00
Gu Zheng
cc43364de7 acpi/cpu-hotplug: introduce helper function to keep bit setting in one place
Introduce helper function acpi_set_cpu_present_bit() to simplify acpi_cpu_plug_cb
and acpi_cpu_hotplug_init, so that we can keep bit setting in one place.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:12 +02:00
Gu Zheng
411b5db8e5 cpu-hotplug: rename function for better readability
Rename:
AcpiCpuHotplug_init --> acpi_cpu_hotplug_init
AcpiCpuHotplug_ops --> acpi_cpu_hotplug_ops
for better readability, just cleanup.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:12 +02:00
Gu Zheng
fcd702e17a qom/cpu: remove the unused CPU hot-plug notifier
Remove the unused CPU hot-plug notifier.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:11 +02:00
Gu Zheng
2d996150ed pc: Update rtc_cmos in pc_cpu_plug
Update rtc_cmos in pc_cpu_plug() directly, instead of the notifier.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:11 +02:00
Gu Zheng
5279569eea pc: add cpu hotplug handler to PC_MACHINE
Add cpu hotplug handler to PC_MACHINE, which will perform the acpi
cpu hotplug callback via hotplug_handler API.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:11 +02:00
Gu Zheng
08bba95bd3 acpi:piix4: convert cpu hotplug to hotplug_handler API
Convert notifier based hotplug to hotplug_handler API,
and remove the unused AcpiCpuHotplug_add().

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:11 +02:00
Gu Zheng
c5171ed0cf acpi:ich9: convert cpu hotplug to hotplug_handler API
Convert notifier based hotplug to hotplug_handler API.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:11 +02:00
Gu Zheng
1be6b511a6 acpi/cpu: add cpu hotplug callback function to match hotplug_handler API
Add cpu hotplug callback function (acpi_cpu_plug_cb) to match hotplug_handler API.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-11-02 13:44:11 +02:00
Stefan Berger
42a5b30844 acpi: create separate file for TCPA log
Create the TCPA log in a separate file rather than allocating
ACPI table memory for it.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 13:43:37 +02:00
Paolo Bonzini
9047579161 tests: fix rebuild-expected-aml.sh for acpi-test rename
This is now called bios-tables-test.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 12:03:04 +02:00
Michael S. Tsirkin
1e06f131fd intel_iommu: fix VTD_SID_TO_BUS
(((sid) >> 8) && 0xff)  makes no sense
(((sid) >> 8) & 0xff) seems to be what was meant.

https://bugs.launchpad.net/qemu/+bug/1382477

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 12:03:04 +02:00
Michael S. Tsirkin
68a27b208a virtio-pci: fix migration for pci bus master
Current support for bus master (clearing OK bit) together with the need to
support guests which do not enable PCI bus mastering, leads to extra state in
VIRTIO_PCI_FLAG_BUS_MASTER_BUG bit, which isn't robust in case of cross-version
migration for the case when guests use the device before setting DRIVER_OK.

Rip out this code, and replace it:
-   Modern QEMU doesn't need VIRTIO_PCI_FLAG_BUS_MASTER_BUG
    so just drop it for latest machine type.
-   For compat machine types, set PCI_COMMAND if DRIVER_OK
    is set.

As this is needed for 2.1 for both pc and ppc, move PC_COMPAT macros from pc.h
to a new common header.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2014-11-02 12:03:03 +02:00
Gonglei
4d5e17a53b pcie: change confused comment clearer
This comment applies to all functions below it.
It is not appropriate that called capability allocation
functions, change it into capability list management functions.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 11:52:24 +02:00
Gal Hammer
1c87d68c91 i386: Add an ACPI_EXTRACT_NAME_BUFFER16 directive.
Add a 16-bytes buffer to allow storing a 128-bit UUID value in an
ACPI table.

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 11:52:24 +02:00
Jan Kiszka
df1fd4b541 pc: Fix disabling of vapic for compat PC models
We used to be able to address both the QEMU and the KVM APIC via "apic".
This doesn't work anymore. So we need to use their parent class to turn
off the vapic on machines that should not expose them.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 11:52:24 +02:00
Laszlo Ersek
562542b6ae i386/pc: add piix and q35 machtypes to sorting families for -M \?
With this patch applied, the output of -M \? is

> Supported machines are:
> pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-2.2)
> pc-i440fx-2.2        Standard PC (i440FX + PIIX, 1996) (default)
> pc-i440fx-2.1        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-2.0        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.7        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.6        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.5        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.4        Standard PC (i440FX + PIIX, 1996)
> pc-1.3               Standard PC (i440FX + PIIX, 1996)
> pc-1.2               Standard PC (i440FX + PIIX, 1996)
> pc-1.1               Standard PC (i440FX + PIIX, 1996)
> pc-1.0               Standard PC (i440FX + PIIX, 1996)
> pc-0.15              Standard PC (i440FX + PIIX, 1996)
> pc-0.14              Standard PC (i440FX + PIIX, 1996)
> pc-0.13              Standard PC (i440FX + PIIX, 1996)
> pc-0.12              Standard PC (i440FX + PIIX, 1996)
> pc-0.11              Standard PC (i440FX + PIIX, 1996)
> pc-0.10              Standard PC (i440FX + PIIX, 1996)
> q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-2.2)
> pc-q35-2.2           Standard PC (Q35 + ICH9, 2009)
> pc-q35-2.1           Standard PC (Q35 + ICH9, 2009)
> pc-q35-2.0           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.7           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.6           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.5           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.4           Standard PC (Q35 + ICH9, 2009)
> isapc                ISA-only PC
> none                 empty machine

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1145042
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
2014-11-02 11:52:23 +02:00
Laszlo Ersek
2709f26395 well-defined listing order for machine types
Commit 261747f1 ("vl: Use MachineClass instead of global QEMUMachine
list") broke the ordering of the machine types in the user-visible output
of

  qemu-system-XXXX -M \?

This occurred because registration was rebased from a manually maintained
linked list to GLib hash tables:

  qemu_register_machine()
    type_register()
      type_register_internal()
        type_table_add()
          g_hash_table_insert()

and because the listing was rebased accordingly, from the traversal of the
list to the traversal of the hash table (rendered as an ad-hoc list):

  machine_parse()
    object_class_get_list(TYPE_MACHINE)
      object_class_foreach()
        g_hash_table_foreach()

The current order is a "random" one, for practical purposes, which is
annoying for users.

Introduce new members QEMUMachine.family and MachineClass.family, allowing
machine types to be "clustered". Introduce a comparator function that
establishes a total ordering between machine types, ordering machine types
in the same family next to each other. In machine_parse(), list the
supported machine types sorted with the comparator function.

The comparator function:
- sorts whole families before standalone machine types,
- sorts whole families between each other in alphabetically increasing
  order,
- sorts machine types inside the same family in alphabetically decreasing
  order,
- sorts standalone machine types between each other in alphabetically
  increasing order.

After this patch, all machine types are considered standalone, and
accordingly, the output is alphabetically ascending. This will be refined
in the following patches.

Effects on the x86_64 output:

Before:

> Supported machines are:
> pc-0.13              Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-2.0        Standard PC (i440FX + PIIX, 1996)
> pc-1.0               Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-2.1        Standard PC (i440FX + PIIX, 1996)
> pc-q35-1.7           Standard PC (Q35 + ICH9, 2009)
> pc-1.1               Standard PC (i440FX + PIIX, 1996)
> pc-0.14              Standard PC (i440FX + PIIX, 1996)
> pc-q35-2.0           Standard PC (Q35 + ICH9, 2009)
> pc-i440fx-1.4        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.5        Standard PC (i440FX + PIIX, 1996)
> pc-0.15              Standard PC (i440FX + PIIX, 1996)
> pc-q35-1.4           Standard PC (Q35 + ICH9, 2009)
> isapc                ISA-only PC
> pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-2.2)
> pc-i440fx-2.2        Standard PC (i440FX + PIIX, 1996) (default)
> pc-1.2               Standard PC (i440FX + PIIX, 1996)
> pc-0.10              Standard PC (i440FX + PIIX, 1996)
> pc-0.11              Standard PC (i440FX + PIIX, 1996)
> pc-q35-2.1           Standard PC (Q35 + ICH9, 2009)
> q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-2.2)
> pc-q35-2.2           Standard PC (Q35 + ICH9, 2009)
> pc-i440fx-1.6        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.7        Standard PC (i440FX + PIIX, 1996)
> none                 empty machine
> pc-q35-1.5           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.6           Standard PC (Q35 + ICH9, 2009)
> pc-0.12              Standard PC (i440FX + PIIX, 1996)
> pc-1.3               Standard PC (i440FX + PIIX, 1996)

After:

> Supported machines are:
> isapc                ISA-only PC
> none                 empty machine
> pc-0.10              Standard PC (i440FX + PIIX, 1996)
> pc-0.11              Standard PC (i440FX + PIIX, 1996)
> pc-0.12              Standard PC (i440FX + PIIX, 1996)
> pc-0.13              Standard PC (i440FX + PIIX, 1996)
> pc-0.14              Standard PC (i440FX + PIIX, 1996)
> pc-0.15              Standard PC (i440FX + PIIX, 1996)
> pc-1.0               Standard PC (i440FX + PIIX, 1996)
> pc-1.1               Standard PC (i440FX + PIIX, 1996)
> pc-1.2               Standard PC (i440FX + PIIX, 1996)
> pc-1.3               Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.4        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.5        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.6        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-1.7        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-2.0        Standard PC (i440FX + PIIX, 1996)
> pc-i440fx-2.1        Standard PC (i440FX + PIIX, 1996)
> pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-2.2)
> pc-i440fx-2.2        Standard PC (i440FX + PIIX, 1996) (default)
> pc-q35-1.4           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.5           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.6           Standard PC (Q35 + ICH9, 2009)
> pc-q35-1.7           Standard PC (Q35 + ICH9, 2009)
> pc-q35-2.0           Standard PC (Q35 + ICH9, 2009)
> pc-q35-2.1           Standard PC (Q35 + ICH9, 2009)
> q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-2.2)
> pc-q35-2.2           Standard PC (Q35 + ICH9, 2009)

Effects on the aarch64 output:

Before:

> Supported machines are:
> lm3s811evb           Stellaris LM3S811EVB
> canon-a1100          Canon PowerShot A1100 IS
> vexpress-a15         ARM Versatile Express for Cortex-A15
> vexpress-a9          ARM Versatile Express for Cortex-A9
> xilinx-zynq-a9       Xilinx Zynq Platform Baseboard for Cortex-A9
> connex               Gumstix Connex (PXA255)
> n800                 Nokia N800 tablet aka. RX-34 (OMAP2420)
> lm3s6965evb          Stellaris LM3S6965EVB
> versatileab          ARM Versatile/AB (ARM926EJ-S)
> borzoi               Borzoi PDA (PXA270)
> tosa                 Tosa PDA (PXA255)
> cheetah              Palm Tungsten|E aka. Cheetah PDA (OMAP310)
> midway               Calxeda Midway (ECX-2000)
> mainstone            Mainstone II (PXA27x)
> n810                 Nokia N810 tablet aka. RX-44 (OMAP2420)
> terrier              Terrier PDA (PXA270)
> highbank             Calxeda Highbank (ECX-1000)
> cubieboard           cubietech cubieboard
> sx1-v1               Siemens SX1 (OMAP310) V1
> sx1                  Siemens SX1 (OMAP310) V2
> realview-eb-mpcore   ARM RealView Emulation Baseboard (ARM11MPCore)
> kzm                  ARM KZM Emulation Baseboard (ARM1136)
> akita                Akita PDA (PXA270)
> z2                   Zipit Z2 (PXA27x)
> musicpal             Marvell 88w8618 / MusicPal (ARM926EJ-S)
> realview-pb-a8       ARM RealView Platform Baseboard for Cortex-A8
> versatilepb          ARM Versatile/PB (ARM926EJ-S)
> realview-eb          ARM RealView Emulation Baseboard (ARM926EJ-S)
> realview-pbx-a9      ARM RealView Platform Baseboard Explore for Cortex-A9
> spitz                Spitz PDA (PXA270)
> none                 empty machine
> virt                 ARM Virtual Machine
> collie               Collie PDA (SA-1110)
> smdkc210             Samsung SMDKC210 board (Exynos4210)
> verdex               Gumstix Verdex (PXA270)
> nuri                 Samsung NURI board (Exynos4210)
> integratorcp         ARM Integrator/CP (ARM926EJ-S)

After:

> Supported machines are:
> akita                Akita PDA (PXA270)
> borzoi               Borzoi PDA (PXA270)
> canon-a1100          Canon PowerShot A1100 IS
> cheetah              Palm Tungsten|E aka. Cheetah PDA (OMAP310)
> collie               Collie PDA (SA-1110)
> connex               Gumstix Connex (PXA255)
> cubieboard           cubietech cubieboard
> highbank             Calxeda Highbank (ECX-1000)
> integratorcp         ARM Integrator/CP (ARM926EJ-S)
> kzm                  ARM KZM Emulation Baseboard (ARM1136)
> lm3s6965evb          Stellaris LM3S6965EVB
> lm3s811evb           Stellaris LM3S811EVB
> mainstone            Mainstone II (PXA27x)
> midway               Calxeda Midway (ECX-2000)
> musicpal             Marvell 88w8618 / MusicPal (ARM926EJ-S)
> n800                 Nokia N800 tablet aka. RX-34 (OMAP2420)
> n810                 Nokia N810 tablet aka. RX-44 (OMAP2420)
> none                 empty machine
> nuri                 Samsung NURI board (Exynos4210)
> realview-eb          ARM RealView Emulation Baseboard (ARM926EJ-S)
> realview-eb-mpcore   ARM RealView Emulation Baseboard (ARM11MPCore)
> realview-pb-a8       ARM RealView Platform Baseboard for Cortex-A8
> realview-pbx-a9      ARM RealView Platform Baseboard Explore for Cortex-A9
> smdkc210             Samsung SMDKC210 board (Exynos4210)
> spitz                Spitz PDA (PXA270)
> sx1                  Siemens SX1 (OMAP310) V2
> sx1-v1               Siemens SX1 (OMAP310) V1
> terrier              Terrier PDA (PXA270)
> tosa                 Tosa PDA (PXA255)
> verdex               Gumstix Verdex (PXA270)
> versatileab          ARM Versatile/AB (ARM926EJ-S)
> versatilepb          ARM Versatile/PB (ARM926EJ-S)
> vexpress-a15         ARM Versatile Express for Cortex-A15
> vexpress-a9          ARM Versatile Express for Cortex-A9
> virt                 ARM Virtual Machine
> xilinx-zynq-a9       Xilinx Zynq Platform Baseboard for Cortex-A9
> z2                   Zipit Z2 (PXA27x)

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1145042
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
2014-11-02 11:52:23 +02:00
Eduardo Habkost
7dfddd7f88 smbios: Fix assertion on socket count calculation
QEMU currently allows the number of VCPUs to not be a multiple of the
number of threads per socket, but the smbios socket count calculation
introduced by commit c97294ec1b doesn't
take that into account, triggering an assertion. e.g.:

  $ ./x86_64-softmmu/qemu-system-x86_64 -smp 4,sockets=2,cores=6,threads=1
  qemu-system-x86_64: /home/ehabkost/rh/proj/virt/qemu/hw/i386/smbios.c:825: smbios_get_tables: Assertion `smbios_smp_sockets >= 1' failed.
  Aborted (core dumped)

Socket count calculation doesn't belong to smbios.c and should
eventually be moved to the main SMP topology configuration code. But
while we don't move the code, at least make it correct by rounding up
the division.

Cc: Gabriel Somlo <somlo@cmu.edu>
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 11:52:23 +02:00
SeokYeon Hwang
f18a768efb vdi: wrapped uuid_unparse() in #ifdef
Wrapped uuid_unparse() in #ifdef to avoid "-Wunused-function"
on clang 3.4 or later.

Signed-off-by: SeokYeon Hwang <syeon.hwang@samsung.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:05:35 +03:00
Gonglei
84f8f3dace tap: fix possible fd leak in net_init_tap
In hotplugging scenario, taking those true branch, the file
handler do not be closed. Let's close them before return.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:05:35 +03:00
Gonglei
d0caa3eb53 tap: do not close(fd) in net_init_tap_one
commit 5193e5fb (tap: factor out common tap initialization)
introduce net_init_tap_one(). But it's inappropriate that
we close fd in net_init_tap_one(), we should lay it in the
caller, becuase some callers needn't to close it if we get
the fd by monitor_handle_fd_param().

On the other hand, in other exceptional branches fd isn't
closed, so that's incomplete anyway.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:05:32 +03:00
Eduardo Habkost
bb019cf911 target-i386: Remove unused model_features_t struct
The struct is not used anymore and can be removed.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Gonglei
b391567b64 tap_int.h: remove repeating NETWORK_SCRIPT defines
DEFAULT_NETWORK_SCRIPT and DEFAULT_NETWORK_DOWN_SCRIPT
have been defined in net/net.h included in
tap.c, which is the only C file that using those two macro.
Let's remove the repeating macroinstruction.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Michael Tokarev
25cec2b896 os-posix: reorder parent notification for -daemonize
Put "success" parent reporting in os_setup_post() to after
all other initializers which may also fail, to the very end,
so more possible failure cases are reported properly to the
calling process.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2014-11-02 10:04:34 +03:00
Michael Tokarev
fee78fd6d2 pidfile: stop making pidfile error a special case
In case of -daemonize, we write non-zero to the daemon
pipe only if pidfile creation failed, so the parent will
report error about pidfile problem.  There's no need to
make special case for this, since all other errors are
reported by the child just fine.  Let the parent report
error and simplify logic in os_daemonize().

This way, we don't need os_pidfile_error() function, since
it only prints error now, so put the error reporting printf
into the only place where qemu_create_pidfile() is called,
in vl.c.

While at it, fix wrong indentation in os_daemonize().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Michael Tokarev
ccea25f1c7 os-posix: replace goto again with a proper loop
Eliminiate two fullwrite implementations with goto replacing them with
a proper do..while loop.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2014-11-02 10:04:34 +03:00
Michael Tokarev
0be5e436ff os-posix: use global daemon_pipe instead of cryptic fds[1]
When asked to -daemonize, we fork a child and setup a pipe between
it and parent to pass exit status.  os-posix.c used global fds[2]
array for that, but actually only the writing side of the pipe is
needed to be global, and this name is really too generic.  Use
just one interger for the writing side of the pipe, and name it
daemon_pipe to be more understandable than cryptic fds[1].

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2014-11-02 10:04:34 +03:00
Gonglei
08a655be71 dump: Fix dump-guest-memory termination and use-after-close
dump_iterate() dumps blocks in a loop.  Eventually, get_next_block()
returns "no more".  We then call dump_completed().  But we neglect to
break the loop!  Broken in commit 4c7e251a.

Because of that, we dump the last block again.  This attempts to write
to s->fd, which fails if we're lucky.  The error makes dump_iterate()
return failure.  It's the only way it can ever return.

Theoretical: if we're not so lucky, something else has opened something
for writing and got the same fd.  dump_iterate() then keeps looping,
messing up the something else's output, until a write fails, or the
process mercifully terminates.

The obvious fix is to restore the return lost in commit 4c7e251a.  But
the root cause of the bug is needlessly opaque loop control.  Replace it
by a clean do ... while loop.

This makes the badly chosen return values of get_next_block() more
visible.  Cleaning that up is outside the scope of this bug fix.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Michael Tokarev
7d5a8435ba virtio-9p-proxy: improve error messages in connect_namedsocket()
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2014-11-02 10:04:34 +03:00
Michael Tokarev
6af76c6f7d virtio-9p-proxy: fix error return in proxy_init()
proxy_init() does not check the return value of connect_namedsocket(),
fix this by rearranging code a little bit.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Michael Tokarev
660edd4eda virtio-9p-proxy: Fix sockfd leak
If connect() in connect_namedsocket() return false, the sockfd will leak.
Plug it.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
2014-11-02 10:04:34 +03:00
zhanghailiang
8ef2b256b9 target-tricore: check return value before using it
We reference the return value of cpu before checking whether it is NULL,
The checking code is after that which violates code style.

It makes no difference if the cpu is NULL, qemu process will terminate.
But one will be 'Segmentation fault' and the other will report a error
which is what we want.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Michael Tokarev
44d8d2b2dd net/slirp: specify logbase for smbd
It looks like smbd always logs to /var/log/samba/log.$progname
even if config file specifies different logfile -- when it needs
to log something before completing reading the config file.  But
if it can't open it for writing, it fails and exits.  Tell smbd
to use our temp dir as logbase (-l option) to avoid that.

The same option is used by samba3 and samba4, so there should
be no incompatible changes.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
2014-11-02 10:04:34 +03:00
Michael Tokarev
2fd7ae36c5 Revert "os-posix: report error message when lock file failed"
This reverts commit e5048d15ce.

qemu_create_pidfile() is only created from main(), and there,
if that function returns failure, os_pidfile_error() function
is called, to, guess that, report error (which is done differently
whenever we're daemonizing or not).

qemu_create_pidfile() function has several error returns, this
lockf() failure is one of them, there are others (another shown
in the patch context too).

So this patch makes whole thing inconsistent at least.

If we need to show error message when we're daemonizing, it
looks like we should modify os_pidfile_error() routine to always
report error and only after that check for daemon mode.  This way
all errors will be reported the same way.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Michal Privoznik
404ac83efd util: Improve os_mem_prealloc error message
Currently, when the preallocating guest memory process fails, a not
so helpful error message is printed out:

    # virsh start migt10
    error: Failed to start domain migt10
    error: internal error: process exited while connecting to monitor:
    os_mem_prealloc: failed to preallocate pages

From the error message it's not clear at the first glance where the
problem lies. However, changing the error message might give users a
clue.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Gerd Hoffmann
2944d742f7 sparse: fix build
c++ compiler isn't wrapped with cgcc, resulting in gcc complaining about
the sparse compiler flags which it doesn't know in case qemu is built
with --enable-sparse.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Alex Bennée
0d61f18b57 target-arm: A64: remove redundant store
There is not much point storing the same value twice in a row.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Peter Maydell
11693a6cf0 target-xtensa: mark XtensaConfig structs as unused
The XtensaConfig structs will be defined but not used if they are
for the opposite endianness from that of the binary being built;
keep the compiler from complaining about this by marking them
with the 'unused' attribute.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Eduardo Habkost
d6aaddfe97 bitmap.h: Don't include qemu-common.h
This will avoid unexpected circular header dependencies in the future.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Eduardo Habkost
afeeead1bc bitops.h: Don't include qemu-common.h
This removes the following circular dependency:

bitops.h -> qemu-common.h -> target-i386/cpu.h -> target-i386/cpu-qom.h ->
qom/cpu.h -> qdev-core.h -> bitmap.h -> bitops.h.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Eduardo Habkost
a5ebc0ebae tests: Add missing include to test-bitops.c
The test code needs osdep.h for the ARRAY_SIZE macro.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-02 10:04:34 +03:00
Paolo Bonzini
ace386b4e0 virtio-scsi: fix dataplane
Commit 361dcc7 (virtio-scsi: dataplane: fail setup gracefully, 2014-10-15)
actually broke successful dataplane setup in a not-so-graceful manner:

    qemu-system-x86_64: .../util/rfifolock.c:71: rfifolock_unlock: Assertion `r->nesting > 0' failed.

due to a missing return statement.

Fixes: 361dcc790d
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 17:39:41 +01:00
Andrew Jones
dbc464d401 ivshmem: use error_report
Replace all the fprintf(stderr, ...) calls with error_report.
Also make sure exit() consistently uses the error code 1. A few calls
used -1. While at it cleanup some indentation in the printf argument
lists.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 17:02:22 +01:00
Andreas Färber
3a31cff112 ivshmem: Fix fd leak on error
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 17:02:14 +01:00
Sebastian Krahmer
34bc07c528 ivshmem: Fix potential OOB r/w access
Fix OOB access via malformed incoming_posn parameters
and check that requested memory is actually alloc'ed.

Signed-off-by: Sebastian Krahmer <krahmer@suse.de>
[AF: Rebased, cleanups, avoid fd leak]
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 17:02:07 +01:00
Stefan Hajnoczi
363ba1c72f ivshmem: validate incoming_posn value from server
Check incoming_posn to avoid out-of-bounds array accesses if the ivshmem
server on the host sends invalid values.

Cc: Cam Macdonell <cam@cs.ualberta.ca>
Reported-by: Sebastian Krahmer <krahmer@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[AF: Tighten upper bound check for posn in close_guest_eventfds()]
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 17:01:59 +01:00
Stefan Hajnoczi
a2e9011b41 ivshmem: Check ivshmem_read() size argument
The third argument to the fd_read() callback implemented by
ivshmem_read() is the number of bytes, not a flags field.  Fix this and
check we received enough bytes before accessing the buffer pointer.

Cc: Cam Macdonell <cam@cs.ualberta.ca>
Reported-by: Sebastian Krahmer <krahmer@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[AF: Handle partial reads via FIFO]
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 17:01:44 +01:00
Pavel Dovgalyuk
e64e353590 i386: fix breakpoints handling in icount mode
This patch fixes instructions counting when execution is stopped on
breakpoint (e.g. set from gdb). Without a patch extra instruction is translated
and icount is incremented by invalid value (which equals to number of
executed instructions + 1).

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2014-10-31 16:41:05 +01:00
Michael Ellerman
4725398f93 kvm_stat: Add powerpc support
Add support for powerpc platforms. We use uname -m, which allows us to
detect ppc, ppc64 and ppc64le/el.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 16:36:23 +01:00
Michael Ellerman
a15d5642a0 kvm_stat: Abstract ioctl numbers
Unfortunately ioctl numbers are platform specific, so abstract them out
of the code so they can be overridden. As it happens x86 and s390 share
the same values, so nothing needs to change yet.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 16:35:15 +01:00
Michael Ellerman
4d4103ff32 kvm_stat: Rework platform detection
The current platform detection is a little bit messy. We look for lines
in /proc/cpuinfo starting with 'flags' OR 'vendor-id', and scan both
for values we know will only occur in one or the other. We also keep
scanning once we've found a value, which could be a feature, but isn't
in this case.

We'd also like to add another platform, powerpc, which will just make it
worse. So clean it up in preparation.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 16:34:21 +01:00
Michael Ellerman
27d318a885 kvm_stat: Fix the non-x86 exit reasons
In kvm_stat we have a dictionary of exit reasons for s390. Firstly these
are not s390 specific, they are the generic exit reasons. So rename the
dictionary to reflect that, and add it separately to filters[].

Secondly, the values are defined using hex, but in the kernel header
they are decimal. That means values above 9 in kvm_stat are incorrect.

While we're there, fix the whitespace to match the rest of the file.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 16:32:07 +01:00
Michael Ellerman
763952d08b kvm_stat: Only consider online cpus
In kvm_stat we grovel through /sys to find out how many cpus are in the
system. However if a cpu is offline it will still be present in /sys,
and the perf_event_open() will fail.

Modify the logic to only return online cpus. We need to be careful on
systems which don't support cpu hotplug, the online file will not be
present at all.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 16:13:21 +01:00
Fam Zheng
0ba1f53191 virtio-scsi: Fix num_queue input validation
We need to count the ctrlq and eventq, and also cleanup before
returning. Besides, the format string should be unsigned.

The number could never be less than zero.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:02 +01:00
Paolo Bonzini
fb7b5c0df6 scsi: devirtualize unrealize of SCSI devices
All implementations are the same.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:02 +01:00
Fam Zheng
93bd49aff9 virtio-scsi: Fix memory leak when realize failed
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:02 +01:00
Fam Zheng
c1d4096b0f iscsi: Refuse to open as writable if the LUN is write protected
Before, when a write protected iSCSI target is attached as scsi-disk
with BDRV_O_RDWR, we report it as writable, while in fact all writes
will fail.

One way to improve this is to report write protect flag as true to
guest, but a even better way is to refuse using a write protected LUN to
guest.

Target write protect flag is checked with a mode sense query.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:02 +01:00
Pavel Dovgalyuk
076893d3d0 kvmvapic: patch_instruction fix
When QEMU works in icount mode cpu_restore_state function performs two actions:
restoring the program counter and updating icount to the correct value.
kvmvapic's patch_instruction function is called by cpu_report_tpr_access
function which also invokes cpu_restore_state. It results to calling
cpu_restore_state twice - in cpu_report_tpr_access and in patch_instruction.
When icount is disabled second call is safe. But when icount is enabled,
cpu_restore_state modifies instructions counter twice, which leads to incorrect
behavior. This patch removes useless cpu_restore_state call from kvmvapic.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2014-10-31 11:29:02 +01:00
Gonglei
522abf6999 vl.c: Fix Coverity complaining for vmstate_dump_file
commit abfd9ce3(migration: dump vmstate info as a json
file for static analysis) introduce a new command,
'-dump-vmstate', that takes a filename
as an argument.  When executed, QEMU will dump the vmstate information
for the machine type it's invoked with to the file, and quit.

However, only one instance of the -dump-vmstate option is supported.
If more were given, the vmstate_dump_file variable would be overwritten.

This fix also helps silence a Coverity error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Nikunj A Dadhania
e4dc3f5909 Add skip_dump flag to ignore memory region during dump
The PCI MMIO might be disabled or the device in the reset state.
Make sure we do not dump these memory regions.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Dr. David Alan Gilbert
b154537ad0 -machine vmport=off: Allow disabling of VMWare ioport emulation
This is a pc & q35 only machine opt.

VMWare apparently doesn't like running under QEMU due to our
incomplete emulation of it's special IO Port.  This adds a
pc & q35 property to allow it to be turned off.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Hannes Reinecke
7957ee71c7 megasas: Fixup MSI-X handling
MSI-X works slightly different than INTx; the doorbell
registers are not necessarily used as MSI-X interrupts
are directed anyway. So the head pointer on the
reply queue needs to be updated as soon as a frame
is completed, and we can set the doorbell only
when in INTx mode.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Hannes Reinecke
6df5718bd3 megasas: Rework frame queueing algorithm
Windows requires the frames to be unmapped, otherwise we run
into a race condition where the updated frame data is not
visible to the guest.
With that we can simplify the queue algorithm and use a bitmap
for tracking free frames.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Hannes Reinecke
aaf2a859b6 megasas: Update queue logging
Improve queue logging by displaying head and tail pointer
of the completion queue.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Hannes Reinecke
200b6966cd megasas: Implement DCMD_CLUSTER_RESET_LD
Some implementations use DCMD_CLUSTER_RESET_LD to simulate
a device reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[Compare against id, not lun. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Hannes Reinecke
96f8f23a1e megasas: Ignore duplicate init_firmware commands
The windows driver is sending several init_firmware commands
when in MSI-X mode. It is, however, using only the first
queue. So disregard any additional init_firmware commands
until the HBA is reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:00 +01:00
Hannes Reinecke
8d72db68fe megasas: Clear unit attention on initial reset
The EFI firmware doesn't handle unit attentions properly,
so we need to clear the Power On/Reset unit attention upon
initial reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:00 +01:00
Hannes Reinecke
77bb6b1710 megasas: Decode register names
To ease debugging we should be decoding
the register names.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:00 +01:00
Hannes Reinecke
e74a43154d megasas: Fix typo in megasas_dcmd_ld_get_list()
The check for a valid command buffer size was inverted.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:00 +01:00
Hannes Reinecke
e23d04984a megasas: add MegaRAID SAS 2108 emulation
The 2108 chip supports MSI and MSI-X, so update the emulation
to support both chips.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[Make VMStateDescription const. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:00 +01:00
Hannes Reinecke
3f2cd4dd47 megasas: fixup device mapping
Logical drives can only be addressed with the 'target_id' number;
LUN numbers cannot be selected.
Physical drives can be selected with both, target and LUN id.

So we should disallow LUN numbers not equal to 0 when in
RAID mode.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:00 +01:00
Hannes Reinecke
7bd908491c megasas: simplify trace event messages
The trace events already contain the function name, so the actual
message doesn't need to contain any of these informations.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:28:59 +01:00
Hannes Reinecke
d97ae36848 megasas: fixup MFI_DCMD_LD_LIST_QUERY
The MFI_DCMD_LD_LIST_QUERY function is using a different format than
MFI_DCMD_LD_LIST, so we need to implement it differently.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:28:59 +01:00
Hannes Reinecke
1894df0281 scsi: Rename scsi_*_length() to scsi_*_xfer(), add scsi_cdb_length()
scsi_cdb_length() does not return the length of the cdb, but
the transfersize encoded in the cdb. So rename it to scsi_cdb_xfer()
and also rename all other related functions to end with _xfer.

We can then add a new scsi_cdb_length() which actually does return the
length of the cdb.  With that DEBUG_SCSI can now display the correct
CDB buffer.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:28:59 +01:00
Fam Zheng
98001e7b08 ui: Use the new ".mo-cflags" rule syntax for SDL_CFLAGS
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:26:25 +01:00
Fam Zheng
2d38853239 rules.mak: Allow .mo-objs and .mo-cflags in -y variables
Expand %.mo-objs in -y nested objects, so that we can write combined
object -cflags rules like what will be done in the coming patch.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:26:25 +01:00
Peter Maydell
ee29498e4f Merge remote-tracking branch 'remotes/sstabellini/xen-2014-10-30' into staging
* remotes/sstabellini/xen-2014-10-30:
  fix off-by-one error in pci_piix3_xen_ide_unplug
  xen-hvm.c: Add support for Xen access to vmport

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 20:28:09 +00:00
Peter Maydell
4239e2dc01 Merge remote-tracking branch 'remotes/kraxel/tags/pull-cve-2014-3689-20141029-1' into staging
vmware-vga: add rectangle verification (CVE-2014-3689)

# gpg: Signature made Wed 29 Oct 2014 11:45:29 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-cve-2014-3689-20141029-1:
  vmware-vga: use vmsvga_verify_rect in vmsvga_fill_rect
  vmware-vga: use vmsvga_verify_rect in vmsvga_copy_rect
  vmware-vga: use vmsvga_verify_rect in vmsvga_update_rect
  vmware-vga: add vmsvga_verify_rect
  vmware-vga: CVE-2014-3689: turn off hw accel

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 19:11:25 +00:00
Peter Maydell
fecd54ccd7 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20141028-1' into staging
vnc: return directly if no vnc client connected
vnc: sanitize bits_per_pixel from the client (CVE-2014-7815)

# gpg: Signature made Tue 28 Oct 2014 10:52:31 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20141028-1:
  vnc: return directly if no vnc client connected
  vnc: sanitize bits_per_pixel from the client

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 18:21:25 +00:00
Peter Maydell
f33f43bd86 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20141028-1' into staging
Fixes for libcacard (usb smartcard emulation), xhci and uhci.

# gpg: Signature made Tue 28 Oct 2014 10:39:52 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20141028-1:
  uhci: remove useless DEBUG
  xhci: add property to turn on/off streams support
  libcacard: don't free sign buffer while sign op is pending
  libcacard: Lock NSS cert db when selecting an applet on an emulated card
  libcacard: introduce new vcard_emul_logout

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 17:04:29 +00:00
Peter Maydell
3c1d9a15be Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20141028-1' into staging
gtk: fix two warnings with gtk 3.14+

# gpg: Signature made Tue 28 Oct 2014 10:25:52 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20141028-1:
  gtk: avoid gd_widget_reparent with gtk 3.14+
  gtk: drop gtk_widget_set_double_buffered call

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 14:45:53 +00:00
James Harper
d4f9e806c2 fix off-by-one error in pci_piix3_xen_ide_unplug
Fix off-by-one error when unplugging disks, which would otherwise leave the last ATA disk plugged, with obvious consequences. Also rewrite loop to be more readable.

Signed-off-by: James Harper <james.harper@ejbdigital.com.au>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-10-30 14:16:39 +00:00
Don Slutz
37f9e258b6 xen-hvm.c: Add support for Xen access to vmport
This adds synchronisation of the 6 vcpu registers (only 32bits of
them) that vmport.c needs between Xen and QEMU.

This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
fetch and put these 6 vcpu registers used by the code in vmport.c
and vmmouse.c

The registers are passed in the new shared page provided by
HVM_PARAM_VMPORT_REGS_PFN.

Add new array to XenIOState that allows selection of current_cpu by
vcpu id.

Now pass XenIOState to handle_ioreq().

Add new routines regs_to_cpu(), regs_from_cpu(), and
handle_vmport_ioreq().

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-10-30 14:16:38 +00:00
Peter Maydell
08118672d0 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
virtio-scsi fixes, the first part of dynamic sysbus devices,
MAINTAINERS updates, and AVX512 support.

# gpg: Signature made Mon 27 Oct 2014 15:12:13 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (28 commits)
  aio / timers: De-document -clock
  hw/scsi/virtio-scsi.c: fix the "type" use error in virtio_scsi_handle_ctrl
  virtio-scsi: sense in virtio_scsi_command_complete
  target-i386: add Intel AVX-512 support
  get_maintainer.pl: restrict cases where it falls back to --git
  get_maintainer.pl: move git loop under "if ($email) {"
  qtest: fix qtest log fd should be initialized before qtest chardev
  MAINTAINERS: avoid M entries that point to mailing lists
  MAINTAINERS: add some tests directories
  MAINTAINERS: Add more TCG files
  MAINTAINERS: add myself for X86
  MAINTAINERS: add Samuel Thibault as usb-serial.c and baum.c maintainer
  MAINTAINERS: grab more files from Anthony's pile
  target-i386: warns users when CPU threads>1 for non-Intel CPUs
  sysbus: Use TYPE_DEVICE GPIO functionality
  qdev: gpio: Define qdev_pass_gpios()
  qdev: gpio: Remove qdev_init_gpio_out x1 restriction
  qdev: gpio: delete NamedGPIOList::out
  irq: Remove qemu_irq_intercept_out
  qtest/irq: Rework IRQ interception
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 13:35:12 +00:00
Paolo Bonzini
cbd5ac6991 virtio: link the rng backend through an alias property
The virtio-rng backend is currently linked twice, once in the proxy
device (e.g. virtio-rng-pci) and once in virtio-rng-device.  This causes
a double unref of the backend when the parent device is unplugged.

To fix this, make the proxy device use an alias, similar to what is
already being done for the iothread link.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1414577839-18695-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-30 12:59:27 +00:00
Gerd Hoffmann
bd9ccd8517 vmware-vga: use vmsvga_verify_rect in vmsvga_fill_rect
Add verification to vmsvga_fill_rect, re-enable HW_FILL_ACCEL.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
2014-10-29 12:01:30 +01:00
Gerd Hoffmann
61b41b4c20 vmware-vga: use vmsvga_verify_rect in vmsvga_copy_rect
Add verification to vmsvga_copy_rect, re-enable HW_RECT_ACCEL.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
2014-10-29 12:01:26 +01:00
ChenLiang
9d6b207047 vnc: return directly if no vnc client connected
graphic_hw_update and vnc_refresh_server_surface aren't
need to do when no vnc client connected. It can reduce
lock contention, because vnc_refresh will hold global big
lock two millisecond every three seconds.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:51:04 +01:00
Petr Matousek
e6908bfe8e vnc: sanitize bits_per_pixel from the client
bits_per_pixel that are less than 8 could result in accessing
non-initialized buffers later in the code due to the expectation
that bytes_per_pixel value that is used to initialize these buffers is
never zero.

To fix this check that bits_per_pixel from the client is one of the
values that the rfb protocol specification allows.

This is CVE-2014-7815.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>

[ kraxel: apply codestyle fix ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:51:04 +01:00
Gonglei
a65e4ef90f uhci: remove useless DEBUG
commit 50dcc0f8 (uhci: tracing support) had removed
DPRINTF, the DEBUG marco is useless now, remove it.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:38:18 +01:00
Gerd Hoffmann
2aa6bfcb66 xhci: add property to turn on/off streams support
streams support in usb-redir and usb-host works only with recent enough
versions of the support libraries (libusbredir and libusbx).  Failure
mode is rather unelegant:  Any stream usb transfers will throw stall
errors.  Turning off support for streams in the xhci host controller
will work better as the guest can figure beforehand that streams are
not going to work.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2014-10-28 11:38:18 +01:00
Ray Strode
81b49e8f89 libcacard: don't free sign buffer while sign op is pending
commit 57f97834ef cleaned up
the cac_applet_pki_process_apdu function to have a single
exit point. Unfortunately, that commit introduced a bug
where the sign buffer can get free'd and nullified while
it's still being used.

This commit corrects the bug by introducing a boolean to
track whether or not the sign buffer should be freed in
the function exit path.

Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-by: Alon Levy <alon@pobox.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:38:18 +01:00
Ray Strode
1223bc4cee libcacard: Lock NSS cert db when selecting an applet on an emulated card
When a process in a guest uses an emulated smartcard, libcacard running
on the host passes the PIN from the guest to the PK11_Authenticate NSS
function. The first time PK11_Authenticate is called the passed in PIN
is used to unlock the certificate database. Subsequent calls to
PK11_Authenticate will transparently succeed, regardless of the passed in
PIN. This is a convenience for applications provided by NSS.

Of course, the guest may have many applications using the one emulated
smart card all driven from the same host QEMU process.  That means if a
user enters the right PIN in one program in the guest, and then enters the
wrong PIN in another program in the guest, the wrong PIN will still
successfully unlock the virtual smartcard.

This commit forces the NSS certificate database to be locked anytime an
applet is selected on an emulated smartcard by calling vcard_emul_logout.

Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-By: Robert Relyea <rrelyea@redhat.com>
Reviewed-By: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:38:18 +01:00
Ray Strode
f032cfab61 libcacard: introduce new vcard_emul_logout
vcard_emul_reset currently only logs NSS out, but there is a TODO
for potentially sending insertion/removal events when powering down
or powering up.

For clarity, this commit moves the current guts of vcard_emul_reset to
a new vcard_emul_logout function which will never send insertion/removal
events. The vcard_emul_reset function now just calls vcard_emul_logout,
but also retains its TODO for watching power state transitions and sending
insertion/removal events.

Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-By: Robert Relyea <rrelyea@redhat.com>
Reviewed-By: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:38:18 +01:00
Gerd Hoffmann
316cb068bd gtk: avoid gd_widget_reparent with gtk 3.14+
gtk_widget_reparent is depricated in gtk 3.14, stop using it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:25:14 +01:00
Gerd Hoffmann
987fec54e1 gtk: drop gtk_widget_set_double_buffered call
Dunno why it is here.  Removing it seems to have no ill side effects.
It is depricated in 3.14+.  In some cases it has no effect since 3.10
according to the docs:

https://developer.gnome.org/gtk3/stable/GtkWidget.html#gtk-widget-set-double-buffered

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-28 11:25:14 +01:00
Gerd Hoffmann
1735fe1edb vmware-vga: use vmsvga_verify_rect in vmsvga_update_rect
Switch vmsvga_update_rect over to use vmsvga_verify_rect.  Slight change
in behavior:  We don't try to automatically fixup rectangles any more.
In case we find invalid update requests we'll do a full-screen update
instead.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
2014-10-28 10:40:08 +01:00
Gerd Hoffmann
07258900fd vmware-vga: add vmsvga_verify_rect
Add verification function for rectangles, returning
true if verification passes and false otherwise.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
2014-10-28 10:40:04 +01:00
Gerd Hoffmann
83afa38eb2 vmware-vga: CVE-2014-3689: turn off hw accel
Quick & easy stopgap for CVE-2014-3689:  We just compile out the
hardware acceleration functions which lack sanity checks.  Thankfully
we have capability bits for them (SVGA_CAP_RECT_COPY and
SVGA_CAP_RECT_FILL), so guests should deal just fine, in theory.

Subsequent patches will add the missing checks and re-enable the
hardware acceleration emulation.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
2014-10-28 10:39:58 +01:00
Markus Armbruster
e218052f92 aio / timers: De-document -clock
Commit 6d32717 "aio / timers: Remove alarm timers" has issues:

1. It silently ignores -clock for backward compatibility.
Incompatible change: -clock help no longer terminates the program.
Tolerable.

2. Failed to update option documentation.  In particular, -help still
advises users to try -clock help for available timers.  Drop all
documentation on -clock.

3. The 'query-alarm-clock' example in docs/writing-commands.txt no
longer works, and needs to be redone.  Can't do that right now, so I
just stick in a FIXME.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-27 16:11:45 +01:00
Bin Wu
024d9adc79 hw/scsi/virtio-scsi.c: fix the "type" use error in virtio_scsi_handle_ctrl
The local variable "type" in virtio_scsi_handle_ctl represents the tmf
command type from the guest and it has the same meaning as the
req->req.tmf.type. However, before the invoking of virtio_scsi_parse_req
the req->req.tmf.type doesn't has the correct value(just initialized to
zero). Therefore, we need to use the "type" variable to judge the case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Bin Wu <wu.wubin@huawei.com>
[Actually make it compile, "type" must be uint32_t in order to pass
 it to virtio_tswap32s. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-27 16:11:45 +01:00
Ting Wang
b7890c40e5 virtio-scsi: sense in virtio_scsi_command_complete
If req->resp.cmd.status is not GOOD, the address of sense for
qemu_iovec_from_buf should be modified from &req->resp to sense.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ting Wang <kathy.wangting@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-27 16:10:53 +01:00
Jan Kiszka
3e9418e160 Revert "main-loop.c: Handle SIGINT, SIGHUP and SIGTERM synchronously"
This reverts commit 15124e1420. It breaks
debuggability of qemu and is no longer needed as the problem has
now been addressed in a different way.

Instead we provide a comment about why these signals must be
handled asynchronously.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
[PMM: added comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-27 15:05:09 +00:00
Jan Kiszka
817ef04db2 Make qemu_shutdown_requested signal-safe
qemu_shutdown_requested may be interrupted by qemu_system_killed. If the
latter sets shutdown_requested after qemu_shutdown_requested has read it
but before it was cleared, the shutdown event is lost. Fix this by using
atomic_xchg.

This provides a different fix for the problem which commit 15124e142
attempts to deal with. That commit breaks use of ^C to drop into gdb,
and so this approach is better (and 15124e142 can be reverted).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[PMM: commit message tweak]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-27 14:09:27 +00:00
Chao Peng
9aecd6f8ae target-i386: add Intel AVX-512 support
Add AVX512 feature bits, register definition and corresponding
xsave/vmstate support.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24 18:03:14 +02:00
Peter Maydell
ff0d48768b MAINTAINERS: add myself under 'general project admin' section
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1413405052-4527-1-git-send-email-peter.maydell@linaro.org
2014-10-24 14:30:18 +01:00
Leon Alrae
6f64091786 MAINTAINERS: add myself as MIPS guest cores co-maintainer
Add myself to the maintainer list for MIPS guest cores and update the status
from "Odd Fixes" to "Maintained".

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1413459487-13658-1-git-send-email-leon.alrae@imgtec.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 14:28:18 +01:00
Leon Alrae
74dda9876b target-mips: add ULL suffix in bitswap to avoid compiler warning
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Message-id: 1413982829-27225-1-git-send-email-leon.alrae@imgtec.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 14:07:51 +01:00
Peter Maydell
71b7f54fdf Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20141024' into staging
target-arm queue:
 * remove pointless 'info pcmcia' and a lot of now-dead code
 * register ARM cpu reset handlers even if not using -kernel
 * update to libvixl 1.6
 * various minor code cleanups
 * support PSCI under TCG ('virt' machine can now be shut down,
   SMP configurations work)
 * correct the sense of the AArch64 DCZID DZP bit
 * report a valid L1Ip field in CTR_EL0 for CPU type "any"
 * correctly UNDEF writes to FPINST/FPINST2 from EL0
 * more preparatory code refactoring for EL2/EL3 support

# gpg: Signature made Fri 24 Oct 2014 12:35:52 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20141024: (23 commits)
  target-arm: A32: Emulate the SMC instruction
  target-arm: make arm_current_el() return EL3
  target-arm: rename arm_current_pl to arm_current_el
  target-arm: reject switching to monitor mode
  target-arm: add arm_is_secure() function
  target-arm: increase arrays of registers R13 & R14
  target-arm: correctly UNDEF writes to FPINST/FPINST2 from EL0
  target-arm: Report a valid L1Ip field in CTR_EL0 for CPU type "any"
  target-arm: Correct sense of the DCZID DZP bit
  arm/virt: enable PSCI emulation support for system emulation
  target-arm: add emulation of PSCI calls for system emulation
  target-arm: Add support for A32 and T32 HVC and SMC insns
  target-arm: Handle SMC/HVC undef-if-no-ELx in pre_* helpers
  target-arm: add missing PSCI constants needed for PSCI emulation
  target-arm: do not set do_interrupt handlers for ARM and AArch64 user modes
  target-arm: add powered off cpu state
  omap_gpmc.c: Remove duplicate assignment
  disas/libvixl/a64/instructions-a64.h: Remove unused constants
  arm_gic: remove unused parameter.
  disas/libvixl: Update to libvixl 1.6
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:40:29 +01:00
Fabian Aggeler
dbe9d16367 target-arm: A32: Emulate the SMC instruction
Implements SMC instruction in AArch32 using the A32 syndrome. When executing
SMC instruction from monitor CPU mode SCR.NS bit is reset.

Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1413910544-20150-7-git-send-email-greg.bellows@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:15 +01:00
Fabian Aggeler
592125f83a target-arm: make arm_current_el() return EL3
Make arm_current_el() return EL3 for secure PL1 and monitor mode.
Increase MMU modes since mmu_index is directly inferred from arm_
current_el(). Change assertion in arm_el_is_aa64() to allow EL3.

Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1413910544-20150-6-git-send-email-greg.bellows@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:14 +01:00
Greg Bellows
dcbff19bd0 target-arm: rename arm_current_pl to arm_current_el
Renamed the arm_current_pl CPU function to more accurately represent that it
returns the ARMv8 EL rather than ARMv7 PL.

Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1413910544-20150-5-git-send-email-greg.bellows@linaro.org
[PMM: fixed a minor merge resolution error in a couple of hunks]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:14 +01:00
Sergey Fedorov
027fc52704 target-arm: reject switching to monitor mode
Reject switching to monitor mode from non-secure state.

Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1413910544-20150-4-git-send-email-greg.bellows@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:14 +01:00
Fabian Aggeler
19e0fefa6f target-arm: add arm_is_secure() function
arm_is_secure() function allows to determine CPU security state
if the CPU implements Security Extensions/EL3.
arm_is_secure_below_el3() returns true if CPU is in secure state
below EL3.

Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1413910544-20150-3-git-send-email-greg.bellows@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:14 +01:00
Fabian Aggeler
0b7d409d42 target-arm: increase arrays of registers R13 & R14
Increasing banked_r13 and banked_r14 to store LR_mon and SP_mon (bank
index 7).

Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1413910544-20150-2-git-send-email-greg.bellows@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:14 +01:00
Peter Maydell
23adb8618c target-arm: correctly UNDEF writes to FPINST/FPINST2 from EL0
The ARM ARM requires that the FPINST and FPINST2 VFP control
registers are not accessible to code at EL0. We were already
correctly implementing this for reads of these registers; add
the missing check for the write code path.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 1412967447-20931-1-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:14 +01:00
Peter Maydell
0e7b176ae0 target-arm: Report a valid L1Ip field in CTR_EL0 for CPU type "any"
For the CPU type "any" (only used with linux-user) we were reporting
the L1Ip field as 0b00, which is reserved. Change this field to 0b10
instead, indicating a VIPT icache as the comment describes.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 1412966807-20844-1-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:13 +01:00
Peter Maydell
14e5f10607 target-arm: Correct sense of the DCZID DZP bit
The DZP bit in the DCZID system register should be set if
the control bits which prohibit use of the DC ZVA instruction
have been set (it stands for Data Zero Prohibited). However
we had the sense of the test inverted; fix this so that the
bit reads correctly.

To avoid this regressing the behaviour of the user-mode
emulator, we must set the DZE bit in the SCTLR for that
config so that userspace continues to see DZP as zero (it
was getting the correct result by accident previously).

Reported-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Message-id: 1412959792-20708-1-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:13 +01:00
Rob Herring
211b016915 arm/virt: enable PSCI emulation support for system emulation
Now that we have PSCI emulation, enable it for the virt platform.
This simplifies the virt machine a bit now that PSCI no longer
needs to be a KVM only feature.

Signed-off-by: Rob Herring <rob.herring@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-8-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:13 +01:00
Rob Herring
98128601ac target-arm: add emulation of PSCI calls for system emulation
Add support for handling PSCI calls in system emulation. Both version
0.1 and 0.2 of the PSCI spec are supported. Platforms can enable support
by setting the "psci-conduit" QOM property on the cpus to SMC or HVC
emulation and having a PSCI binding in their dtb.

Signed-off-by: Rob Herring <rob.herring@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-7-git-send-email-peter.maydell@linaro.org
[PMM: made system reset/off PSCI functions power down the CPU so
 we obey the PSCI API requirement never to return from them;
 rearranged how the code is plumbed into the exception system,
 so that we split "is this a valid call?" from "do the call"]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:13 +01:00
Peter Maydell
37e6456ef5 target-arm: Add support for A32 and T32 HVC and SMC insns
Add support for HVC and SMC instructions to the A32 and
T32 decoder. Using these for real exceptions to EL2 or EL3
is currently not supported (the do_interrupt routine does
not handle them) but we require the instruction support to
implement PSCI.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-6-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:13 +01:00
Peter Maydell
3940433843 target-arm: Handle SMC/HVC undef-if-no-ELx in pre_* helpers
SMC must UNDEF if EL3 is not implemented; similarly HVC UNDEFs
if EL2 is not implemented. Move the handling of this from
translate-a64.c into the pre_smc and pre_hvc helper functions.
This is necessary because use of these instructions for PSCI
takes precedence over this UNDEF case, and we can't tell if
this is a PSCI call until runtime.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-5-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:12 +01:00
Ard Biesheuvel
3df53cdf56 target-arm: add missing PSCI constants needed for PSCI emulation
This adds some PSCI function IDs and symbolic return codes that are needed
to implement PSCI emulation in TCG mode.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-4-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:12 +01:00
Rob Herring
0adf7d3cc3 target-arm: do not set do_interrupt handlers for ARM and AArch64 user modes
User mode emulation should never get interrupts and thus should not
use the system emulation exception handler function. Remove the reference,
and '#ifndef USER_MODE_ONLY' the function itself as well, so that we can add
system mode only functionality to it.

Signed-off-by: Rob Herring <rob.herring@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-3-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:12 +01:00
Rob Herring
543486db35 target-arm: add powered off cpu state
Add tracking of cpu power state in order to support powering off of
cores in system emuluation. The initial state is determined by the
start-powered-off QOM property.

Signed-off-by: Rob Herring <rob.herring@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412865028-17725-2-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:12 +01:00
Dr. David Alan Gilbert
635117e71f omap_gpmc.c: Remove duplicate assignment
This looks like an old merge error and should have no effect.
(Build tested only)

Found by Coccinelle using Julia Lawall's script:
https://lkml.org/lkml/2014/8/23/128

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1414055855-6688-1-git-send-email-dgilbert@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:12 +01:00
Chen Gang
94cc44a9e5 disas/libvixl/a64/instructions-a64.h: Remove unused constants
The instructions-a64.h header defines a number of floating point
constants whose initializers are function calls. gcc 5 will warn
if these constants are not used by the C or C++ file which includes
the header, because they imply a runtime cost. Since for the files
QEMU uses from libvixl we don't use these constants at all, just
remove them.

Upstream intend to fix these by shifting to an 'extern const' in
the header plus definition in a suitable source file, so we can
drop this patch when we sync with the upcoming libvixl 1.7.

The related compiling error:

    CXX   disas/arm-a64.o
  In file included from /upstream/qemu/disas/libvixl/a64/disasm-a64.h:32:0,
                   from disas/arm-a64.cc:20:
  disas/libvixl/a64/instructions-a64.h:98:13: error: 'vixl::kFP32PositiveInfinity' defined but not used [-Werror=unused-variable]
   const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);
               ^
  disas/libvixl/a64/instructions-a64.h:99:13: error: 'vixl::kFP32NegativeInfinity' defined but not used [-Werror=unused-variable]
   const float kFP32NegativeInfinity = rawbits_to_float(0xff800000);
               ^
  disas/libvixl/a64/instructions-a64.h💯14: error: 'vixl::kFP64PositiveInfinity' defined but not used [-Werror=unused-variable]
   const double kFP64PositiveInfinity =
                ^
  disas/libvixl/a64/instructions-a64.h:102:14: error: 'vixl::kFP64NegativeInfinity' defined but not used [-Werror=unused-variable]
   const double kFP64NegativeInfinity =
                ^
  disas/libvixl/a64/instructions-a64.h:107:21: error: 'vixl::kFP64SignallingNaN' defined but not used [-Werror=unused-variable]
   static const double kFP64SignallingNaN =
                       ^
  disas/libvixl/a64/instructions-a64.h:109:20: error: 'vixl::kFP32SignallingNaN' defined but not used [-Werror=unused-variable]
   static const float kFP32SignallingNaN = rawbits_to_float(0x7f800001);
                      ^
  disas/libvixl/a64/instructions-a64.h:112:21: error: 'vixl::kFP64QuietNaN' defined but not used [-Werror=unused-variable]
   static const double kFP64QuietNaN =
                       ^
  disas/libvixl/a64/instructions-a64.h:114:20: error: 'vixl::kFP32QuietNaN' defined but not used [-Werror=unused-variable]
   static const float kFP32QuietNaN = rawbits_to_float(0x7fc00001);
                      ^
  disas/libvixl/a64/instructions-a64.h:117:21: error: 'vixl::kFP64DefaultNaN' defined but not used [-Werror=unused-variable]
   static const double kFP64DefaultNaN =
                       ^
  disas/libvixl/a64/instructions-a64.h:119:20: error: 'vixl::kFP32DefaultNaN' defined but not used [-Werror=unused-variable]
   static const float kFP32DefaultNaN = rawbits_to_float(0x7fc00000);
                      ^
  cc1plus: all warnings being treated as errors
  make: *** [disas/arm-a64.o] Error 1

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
[PMM: Rewrote the commit message a little]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:12 +01:00
KONRAD Frederic
7b95a50858 arm_gic: remove unused parameter.
This removes num_irq parameter from gic_init_irqs_and_distributor as it is not
used.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1412859651-15060-1-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:11 +01:00
Peter Maydell
6aea44fc2b disas/libvixl: Update to libvixl 1.6
Update our copy of libvixl to upstream 1.6. There are no
changes of any particular interest to QEMU, so this is simply
keeping up with current upstream.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1412091418-25744-1-git-send-email-peter.maydell@linaro.org
2014-10-24 12:19:11 +01:00
Ard Biesheuvel
c6faa758e3 hw/arm/boot: register cpu reset handlers if using -bios
Move the registering of CPU reset handlers to before the point where
we leave the function in the -bios (not -kernel) case, so CPU reset
works correctly with -bios as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:11 +01:00
Claudio Fontana
b32a950910 hw/arm/virt: mark timer in fdt as v8-compatible
check if the first cpu is an armv8 cpu, and if so, put
arm,armv8-timer in the compatible string list.

Note that due to this check, this patch moves the creation
of the timer fdt node to after the cpu creation loop.

Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Message-id: 1411736960-24206-1-git-send-email-hw.claudio@gmail.com
[PMM: updated to list arm,armv8-timer first]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:11 +01:00
Markus Armbruster
7797a73947 hmp: Remove "info pcmcia"
This command lists PCMCIA sockets and cards.  Only a few ARM boards
have sockets (akita, borzoi, connex, mainstone, spitz, terrier, tosa,
verdex, z2), the only card is the DSCM-1xxxx Hitachi Microdrive (qdev
"microdrive"), and it is only inserted during machine init, if ever.
So this command doesn't really tell anybody anything new so far.

Moreover, pcmcia_socket_unregister() has a use-after-free bug, flagged
by Coverity.  Has never been used, because there has never been code
to eject a PCMCIA card.

Not worth fixing & converting to QMP.  Remove it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
Message-id: 1411144812-22958-1-git-send-email-armbru@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:11 +01:00
Peter Maydell
8b135a288a Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Thu 23 Oct 2014 18:56:05 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (32 commits)
  qemu-img: Print error if check failed
  block: char devices on FreeBSD are not behind a pager
  iotests: Add test for qcow2 L1 table update
  qcow2: Do not overflow when writing an L1 sector
  iotests: Add test for map commands
  qemu-io: Respect early image end for map
  block: Respect underlying file's EOF
  docs/qcow2: Limit refcount_order to [0, 6]
  docs/qcow2: Correct refcount_block_entries
  qcow2: Drop REFCOUNT_SHIFT
  iotests: Add test for potentially damaging repairs
  iotests: Fix test outputs
  qcow2: Clean up after refcount rebuild
  qcow2: Rebuild refcount structure during check
  qcow2: Do not perform potentially damaging repairs
  qcow2: Fix refcount blocks beyond image end
  qcow2: Reuse refcount table in calculate_refcounts()
  qcow2: Let inc_refcounts() resize the reftable
  qcow2: Let inc_refcounts() return -errno
  qcow2: Split fail code in L1 and L2 checks
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 11:33:46 +01:00
Kevin Wolf
d4db399b8b Merge remote-tracking branch 'mreitz/block' into queue-block
* mreitz/block:
  qemu-img: Print error if check failed
  block: char devices on FreeBSD are not behind a pager
2014-10-23 19:55:55 +02:00
Max Reitz
832390a5ed qemu-img: Print error if check failed
Currently, if bdrv_check() fails either by returning -errno or having
check_errors set, qemu-img check just exits with 1 after having told the
user that there were no errors on the image. This is bad.

Instead of printing the check result if there were internal errors which
were so bad that bdrv_check() could not even complete with 0 as a return
value, qemu-img check should inform the user about the error.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-10-23 19:42:07 +02:00
Peter Maydell
1430500bb8 Merge remote-tracking branch 'remotes/qmp-unstable/tags/for-upstream' into staging
QMP patches

# gpg: Signature made Thu 23 Oct 2014 16:05:52 BST using RSA key ID E24ED5A7
# gpg: Good signature from "Luiz Capitulino <lcapitulino@gmail.com>"

* remotes/qmp-unstable/tags/for-upstream:
  monitor: delete device_del_bus_completion
  monitor: add del completion for peripheral device
  qdev: add qdev_build_hotpluggable_device_list helper
  MAINTAINERS: add entry for qobject files
  dump: Turn some functions to void to make code cleaner
  dump: Propagate errors into qmp_dump_guest_memory()
  virtio-balloon: Tweak recent fix for integer overflow

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-23 17:05:15 +01:00
Roger Pau Monne
3cad83075c block: char devices on FreeBSD are not behind a pager
Introduce a new flag to mark devices that require requests to be aligned and
replace the usage of BDRV_O_NOCACHE and O_DIRECT with this flag when
appropriate.

If a character device is used as a backend on a FreeBSD host set this flag
unconditionally.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-10-23 16:56:53 +02:00
Paolo Bonzini
c6561586f0 get_maintainer.pl: restrict cases where it falls back to --git
The list emitted by --git-fallback often leads inexperienced contributors
to add pointless CCs.  While not discouraging usage of --git-fallback,
we want to:

1) disable the fallback if only some files lack a maintainer

    $ scripts/get_maintainer.pl -f util/cutils.c hw/ide/core.c
    Kevin Wolf <kwolf@redhat.com> (odd fixer:IDE)
    Stefan Hajnoczi <stefanha@redhat.com> (odd fixer:IDE)

This behavior is taken even if --git-fallback is specified.

2) warn the contributors about what we're doing, asking them to use their
common sense:

    $ scripts/get_maintainer.pl -f util/cutils.c
    get_maintainer.pl: No maintainers found, printing recent contributors.
    get_maintainer.pl: Do not blindly cc: them on patches!  Use common sense.

    Luiz Capitulino <lcapitulino@redhat.com> (commit_signer:1/2=50%)
    ...
    $

Explicitly disabling the fallback will not result in the warning message:

    $ scripts/get_maintainer.pl -f util/cutils.c   --no-git-fallback
    $ echo $?
    0

(Returning 1 would break usage of scripts/get_maintainer.pl as a cccmd
for git-send-email).

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Paolo Bonzini
8ad2c0f0f8 get_maintainer.pl: move git loop under "if ($email) {"
All checks in the loop are guarded by that condition, and there is a
handy "if" just below.  Simplify the code.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Li Liu
107684c05d qtest: fix qtest log fd should be initialized before qtest chardev
qtest_log_fp should be inited before qemu_chr_add_handlers.
If not the log dumped from callback functions may be lost.

easy to reproduce it by command:
"QTEST_LOG=1 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
gtester -k --verbose -m=quick tests/qdev-monitor-test"

The log "[I xxxxxx] OPENED" should be printed out by
qtest_event, but does not.

Signed-off-by: Li Liu <john.liuli@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Paolo Bonzini
5dd4a88c37 MAINTAINERS: avoid M entries that point to mailing lists
"L" entries that point to qemu-devel are not much better either, but at least
the get_maintainer.pl output is clearer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Paolo Bonzini
c0bd0b509b MAINTAINERS: add some tests directories
Low-hanging fruit...

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Paolo Bonzini
486bbe5f14 MAINTAINERS: Add more TCG files
Unfortunately, TCG files do not really have a maintainer yet.
But at least there will be fewer unmaintained files.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Paolo Bonzini
d46d72fd10 MAINTAINERS: add myself for X86
Still not moving it beyond "Odd fixes".  Richard Henderson also has
reviewed a bunch of X86 TCG patches, so add him as well.  All we want
is to avoid that patches fall on the floor.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:27 +02:00
Paolo Bonzini
e26082fd0b MAINTAINERS: add Samuel Thibault as usb-serial.c and baum.c maintainer
He wrote "I've written mostly all of usb-serial.c and baum.c, and keep
maintaining them, since I use them regularly."

Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Paolo Bonzini
da26f37a3e MAINTAINERS: grab more files from Anthony's pile
I am picking up character devices and the main loop, as agreed during
QEMU Summit.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Wei Huang
e48638fdb3 target-i386: warns users when CPU threads>1 for non-Intel CPUs
Only Intel CPUs support hyperthreading. When users select threads>1 in
-smp option, QEMU fixes it by adjusting CPUID_0000_0001_EBX and
CPUID_8000_0008_ECX based on inputs (sockets, cores, threads);
so guest VM can boot correctly. However it is still better to gives
users a warning when such case happens.

Signed-off-by: Wei Huang <wei@redhat.com>
[As suggested by Eduardo, check for !IS_INTEL instead of AMD. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Peter Crosthwaite
b591721909 sysbus: Use TYPE_DEVICE GPIO functionality
Re-implement the Sysbus GPIOs to use the existing TYPE_DEVICE
GPIO named framework. A constant string name is chosen to avoid
conflicts with existing unnamed GPIOs.

This unifies GPIOs are IRQs for sysbus devices and allows removal
of all Sysbus state for GPIOs.

Any existing and future-added functionality for GPIOs is now
also available for sysbus IRQs.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Peter Crosthwaite
17a96a146c qdev: gpio: Define qdev_pass_gpios()
Allows a container to take ownership of GPIOs in a contained
device and automatically connect them as GPIOs to the container.

This prepares for deprecation of the SYSBUS IRQ functionality, which
has this feature. We push it up to the device level instead of sysbus
level. There's nothing sysbus specific about passing GPIOs to
containers so its a legitimate device-level generic feature.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Peter Crosthwaite
aef0869e8e qdev: gpio: Remove qdev_init_gpio_out x1 restriction
Previously this was restricted to a single call per-dev/per-name. With
the conversion of the GPIO output state to QOM the implementation can
now handle repeated calls. Remove the restriction.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Peter Crosthwaite
15942b6569 qdev: gpio: delete NamedGPIOList::out
All users of GPIO outputs are fully QOMified, using QOM properties to
access the GPIO data. Delete.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:26 +02:00
Peter Crosthwaite
b58c303590 irq: Remove qemu_irq_intercept_out
No more users left and obsoleted by qdev_intercept_gpio_out.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Peter Crosthwaite
60a79016ae qtest/irq: Rework IRQ interception
Change the qtest intercept handler to accept just the individual IRQ
being intercepted as opaque. n is still expected to be correctly set
as for the original intercepted irq. qemu_intercept_irq_in is updated
accordingly.

Then covert the qemu_irq_intercept_out call to use qdev intercept
version. This stops qtest from having to mess with the raw IRQ pointers
(still has to mess with names and counts but a step in the right
direction).

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Peter Crosthwaite
0c24db2b8c qdev: gpio: Add API for intercepting a GPIO
To replace the old qemu_irq intercept API (which had users reaching
into qdev private state for GPIOs).

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Peter Crosthwaite
02757df2ad qdev: gpio: Re-implement qdev_connect_gpio QOM style
Re-implement as a link setter. This should allow the QOM framework to
keep track of ref counts properly etc.

We need to add a default parent for the connecting input incase it's
coming from a non-qdev source. We simply parent the IRQ to the machine
in this case.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Peter Crosthwaite
8faa2f8571 qom: Demote already-has-a-parent to a regular error
Rather than an abort(). This allows callers to decide whether parenting
an already-parented object is a fatal error condition.

Useful for providing a default value for an object's parent in the case
where you want to set one iff it doesn't already have one.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Peter Crosthwaite
d3c4931647 qom: Allow clearing of a Link property
By passing in "" to object_property_set_link.

The lead user of this is the QDEV GPIO framework which will implement
GPIO disconnects via an "unlink".  GPIO disconnection is used by
qtest's irq_intercept_out command.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Cornelia Huck
4adea8042f virtio-scsi: dataplane: stop trying on notifier error
There's no use to constantly trying to enable dataplane if we failed
to set up guest or host notifiers, so fence it off in that case.
We'll try again if the device is reinitialized.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:25 +02:00
Cornelia Huck
361dcc790d virtio-scsi: dataplane: fail setup gracefully
The dataplane code is currently doing a hard exit on various setup
failures. In practice, this may mean that a guest suddenly dies after
a dataplane device failed to come up (e.g., when a file descriptor
limit is hit for the nth device).

Let's just try to unwind the setup instead and return.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:24 +02:00
Cornelia Huck
6d2c83165b virtio-scsi: dataplane: print why starting failed
Setting up guest or host notifiers may fail, but the user will have
no idea why: Let's print the error returned by the callback.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:24 +02:00
Fam Zheng
3e2b2a9c49 virtio-scsi-dataplane: Add op blocker
We need this to protect dataplane thread from race conditions with block
jobs until the latter is made dataplane-safe.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-23 16:41:22 +02:00
Max Reitz
1b2dd0bee6 iotests: Add test for qcow2 L1 table update
Updating the L1 table should not result in random data being written.
This adds a test for that.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
a1391444fe qcow2: Do not overflow when writing an L1 sector
While writing an L1 table sector, qcow2_write_l1_entry() copies the
respective range from s->l1_table to the local "buf" array. The size of
s->l1_table does not have to be a multiple of L1_ENTRIES_PER_SECTOR;
thus, limit the index which is used for copying all entries to the L1
size.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
925bb3238d iotests: Add test for map commands
Add a test for qemu-img map and qemu-io -c map on truncated files.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
4b25bbc4c2 qemu-io: Respect early image end for map
bdrv_is_allocated() may report zero clusters which most probably means
the image (file) is shorter than expected. Respect this case in order to
avoid an infinite loop.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
59c9a95fd2 block: Respect underlying file's EOF
When falling through to the underlying file in
bdrv_co_get_block_status(), if it returns that the query offset is
beyond the file end (by setting *pnum to 0), return the range to be
zero and do not let the number of sectors for which information could be
obtained be overwritten.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
7f75a07d50 docs/qcow2: Limit refcount_order to [0, 6]
Specify the upper limit of refcount_order to be 6 (that is,
refcount_bits = 64). Any larger value does not make much sense when all
offsets, sizes, cluster counts etc. "only" have a width of 64 bit as
well, and very large values would be very difficult to support.
Therefore, just cap it at the largest reasonable value.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
4b318d6ca6 docs/qcow2: Correct refcount_block_entries
A refblock entry may have a different size than 16 bits, it may even be
smaller than a byte. Correct the refcount_block_entries calculation
accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
17bd5f4727 qcow2: Drop REFCOUNT_SHIFT
With BDRVQcowState.refcount_block_bits, we don't need REFCOUNT_SHIFT
anymore.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz
234764eed1 iotests: Add test for potentially damaging repairs
There are certain cases where repairing a qcow2 image might actually
damage it further (or rather, where repairing it has in fact damaged it
further with the old qcow2 check implementation). This should not
happen, so add a test for these cases.

Furthermore, the repair function now repairs refblocks beyond the image
end by resizing the image accordingly. Add several tests for this as
well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
d26e6ec052 iotests: Fix test outputs
039, 060 and 061 all create images with referenced clusters having a
refcount of 0. Because previous commits changed handling of such errors,
these tests now have a different output. Fix it.

Furthermore, 060 created a refblock with a refcount greater than one
which now results in having to rebuild the refcount structure as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
791230d8bb qcow2: Clean up after refcount rebuild
Because the old refcount structure will be leaked after having rebuilt
it, we need to recalculate the refcounts and run a leak-fixing operation
afterwards (if leaks should be fixed at all).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
c7c0681bc8 qcow2: Rebuild refcount structure during check
The previous commit introduced the "rebuild" variable to qcow2's
implementation of the image consistency check. Now make use of this by
adding a function which creates a completely new refcount structure
based solely on the in-memory information gathered before.

The old refcount structure will be leaked, however. This leak will be
dealt with in a follow-up commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
f307b2558f qcow2: Do not perform potentially damaging repairs
If a referenced cluster has a refcount of 0, increasing its refcount may
result in clusters being allocated for the refcount structures. This may
overwrite the referenced cluster, therefore we cannot simply increase
the refcount then.

In such cases, we can either try to replicate all the refcount
operations solely for the check operation, basing the allocations on the
in-memory refcount table; or we can simply rebuild the whole refcount
structure based on the in-memory refcount table. Since the latter will
be much easier, do that.

To prepare for this, introduce a "rebuild" boolean which should be set
to true whenever a fix is rather dangerous or too complicated using the
current refcount structures. Another example for this is refcount blocks
being referenced more than once.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
001c158def qcow2: Fix refcount blocks beyond image end
If the qcow2 check function detects a refcount block located beyond the
image end, grow the image appropriately. This cannot break anything and
is the logical fix for such a case.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
9696df219a qcow2: Reuse refcount table in calculate_refcounts()
We will later call calculate_refcounts multiple times, so reuse the
refcount table if possible.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
641bb63cd6 qcow2: Let inc_refcounts() resize the reftable
Now that the refcount table can be passed around by reference, do that
for inc_refcounts() (and subsequently check_refcounts_l1() and
check_refcounts_l2()) and use it for resizing it when a cluster after
the image end is encountered.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
fef4d3d564 qcow2: Let inc_refcounts() return -errno
As of a future patch, inc_refcounts() will have to throw errors which
are generally signaled by returning -errno. Therefore, let it return an
integer which is either 0 for success or -errno and handle the -errno
case in all callers.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
ad27390c85 qcow2: Split fail code in L1 and L2 checks
Instead of printing out an error message, incrementing check_errors and
returning a fixed -errno, just do cleanups and return -ret, with ret set
by the code which threw the exception (jumped to the fail label).

Also, increment check_errors on error in check_refcounts_l2().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
713d9675e0 qcow2: Use int64_t for in-memory reftable size
Use int64_t for the entry count of the in-memory refcount table
throughout the check functions.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
057a3fe57e qcow2: Pull check_refblocks() up
Pull check_refblocks() before calculate_refcounts() so we can drop its
static declaration.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
78fb328e85 qcow2: Use sizeof(**refcount_table)
When implementing variable refcounts, we want to be able to easily find
all the places in qemu which are tied to a certain refcount order.
Replace sizeof(uint16_t) in the check code by sizeof(**refcount_table)
so we can later find it more easily.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
6ca56bf5e9 qcow2: Split qcow2_check_refcounts()
Put the code for calculating the reference counts and comparing them
during qemu-img check into own functions.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
5b84106bd9 qcow2: Fix leaks in dirty images
When opening dirty images, qcow2's repair function should not only
repair errors but leaks as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
1d13d65466 qcow2: Calculate refcount block entry count
The size of a refblock entry is (in theory) variable; calculate
therefore the number of entries per refblock and the according bit shift
(1 << x == entry count) when opening an image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
9ebd844805 block: Add qemu_{,try_}blockalign0()
These functions call their non-0-counterparts and then fill the
allocated buffer with 0 (if the allocation has been successful).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Peter Lieven
c5f7c0af47 block: qemu-iotests change _supported_proto to file once more.
In preparation to possible automatic regression and performance
testing for the block layer I found that the iotests don't work
for all protocols anymore.

In commit 1f7bf7d0 I started to change supported protocols from
generic to file for various tests. Unfortunately, some tests
added in the meantime again carry generic protocol altough they
can only work with file because they require local file access.

The other way around for some tests that only support file I added
NFS protocol after confirming they work.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz
e9082e4736 block/vdi: Use {DIV_,}ROUND_UP
There are macros for these operations, so make use of them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Paolo Bonzini
8113fb5292 MAINTAINERS: add the image fuzzer to the block layer
More work for the block device maintainers!

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Kevin Wolf
292420918d MAINTAINERS: qemu-iotests belongs to the block layer
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Paolo Bonzini
558939c6c7 MAINTAINERS: add aio to block layer
AioContext falls under the block layer, mark it as such.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Zhu Guihua
e799157c76 monitor: delete device_del_bus_completion
device_del_bus_completion() that gathers devices from buses is unused; delete
it.

Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:04:05 -04:00
Zhu Guihua
6a1fa9f5a6 monitor: add del completion for peripheral device
Add peripheral_device_del_completion() to let peripheral device del completion
be possible.

Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:03:44 -04:00
Zhu Guihua
66e56b13ad qdev: add qdev_build_hotpluggable_device_list helper
For peripheral device del completion, add a function to build a list for
hotpluggable devices.

Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:02:51 -04:00
Luiz Capitulino
f3582ba41c MAINTAINERS: add entry for qobject files
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:01:29 -04:00
zhanghailiang
4c7e251ae6 dump: Turn some functions to void to make code cleaner
Functions shouldn't return an error code and an Error object at the same time.
Turn all these functions that returning Error object to void.
We also judge if a function success or fail by reference to the local_err.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:01:29 -04:00
zhanghailiang
37917da2d0 dump: Propagate errors into qmp_dump_guest_memory()
The code calls dump_error() on error, and even passes it a suitable
message.  However, the message is thrown away, and its callers pass
up only success/failure.  All qmp_dump_guest_memory() can do is set
a generic error.

Propagate the errors properly, so qmp_dump_guest_memory() can return
a more useful error.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:01:29 -04:00
Markus Armbruster
22644cd2c6 virtio-balloon: Tweak recent fix for integer overflow
Commit 1f9296b avoids "other kinds of overflow" by limiting the
polling interval to UINT_MAX.  The computations to protect are done in
64 bits.  This is indeed safe when unsigned is 32 bits, as it commonly
is.  It isn't when unsigned is 64 bits.  Purely theoretical; I'm not
aware of such a system.  Limit it to UINT32_MAX instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-10-23 09:01:29 -04:00
Peter Maydell
e40830afa1 Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2014-10-22-tag' into staging
qga: remove readdir_r usage and fix use-after-free

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

# gpg: Signature made Wed 22 Oct 2014 13:56:19 BST using RSA key ID F108B584
# gpg: Can't check signature: public key not found

* remotes/mdroth/tags/qga-pull-2014-10-22-tag:
  qga: Rewrite code where using readdir_r

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 21:42:33 +01:00
Peter Maydell
6de4e7fdf6 Merge remote-tracking branch 'remotes/bkoppelmann/tags/pull-tricore-20141021' into staging
TriCore ABS, ABSB, B, BIT, BO instructions added

# gpg: Signature made Tue 21 Oct 2014 17:47:32 BST using RSA key ID 6B69CA14
# gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>"

* remotes/bkoppelmann/tags/pull-tricore-20141021:
  target-tricore: Add instructions of BO opcode format
  target-tricore: Add instructions of BIT opcode format
  target-tricore: Add instructions of B opcode format
  target-tricore: Add instructions of ABS, ABSB opcode format
  target-tricore: Cleanup and Bugfixes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 18:43:35 +01:00
Peter Maydell
60b6381ffb Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20141021' into staging
add missing s390x files to MAINTAINERS

# gpg: Signature made Tue 21 Oct 2014 11:57:12 BST using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20141021:
  s390x: sweep up unmaintained files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 18:41:38 +01:00
Peter Maydell
8f4699d873 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Mon 20 Oct 2014 13:04:09 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (28 commits)
  block: Make device model's references to BlockBackend strong
  block: Lift device model API into BlockBackend
  blockdev: Convert qmp_eject(), qmp_change_blockdev() to BlockBackend
  block/qapi: Convert qmp_query_block() to BlockBackend
  blockdev: Fix blockdev-add not to create DriveInfo
  blockdev: Drop superfluous DriveInfo member id
  pc87312: Drop unused members of PC87312State
  ide: Complete conversion from BlockDriverState to BlockBackend
  hw: Convert from BlockDriverState to BlockBackend, mostly
  virtio-blk: Rename VirtIOBlkConf variables to conf
  virtio-blk: Drop redundant VirtIOBlock member conf
  block: Rename BlockDriverCompletionFunc to BlockCompletionFunc
  block: Rename BlockDriverAIOCB* to BlockAIOCB*
  block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()
  block: Merge BlockBackend and BlockDriverState name spaces
  block: Eliminate BlockDriverState member device_name[]
  block: Eliminate bdrv_iterate(), use bdrv_next()
  blockdev: Eliminate drive_del()
  block: Make BlockBackend own its BlockDriverState
  block: Code motion to get rid of stubs/blockdev.c
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 16:39:49 +01:00
Peter Maydell
895b810c12 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20141015-2' into staging
usb: add high speed mouse & keyboard configuration

* remotes/kraxel/tags/pull-usb-20141015-2:
  xhci: remove dead code
  usb-hid: Add high speed keyboard configuration
  usb-hid: Add high speed mouse configuration
  usb-hid: Move descriptor decision to usb-hid initfn

Conflicts:
	include/hw/i386/pc.h
[Fixed trivial merge conflict in the pc-2.1 property list]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 15:48:32 +01:00
Peter Maydell
19f3772995 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20141015-1' into staging
qxl: keep going if reaching guest bug on empty area

# gpg: Signature made Wed 15 Oct 2014 11:45:37 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20141015-1:
  qxl: keep going if reaching guest bug on empty area

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 14:49:37 +01:00
Peter Maydell
4b4ce6c2b8 Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20141015-1' into staging
gtk: fix memory leak, add pause key support.

# gpg: Signature made Wed 15 Oct 2014 11:30:39 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20141015-1:
  gtk: add support for the Pause key
  gtk.c: Fix memory leak in gd_set_keycode_type()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 14:01:42 +01:00
zhanghailiang
e668d1b854 qga: Rewrite code where using readdir_r
If readdir_r fails, error_setg_errno will reference the freed
pointer *dirpath*.

Moreover, readdir_r may cause a buffer overflow, using readdir instead.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-10-22 07:49:52 -05:00
Peter Maydell
ad089894a8 Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20141015-1' into staging
configure: Prepend pixman and ftd flags to overrule system-provided ones

# gpg: Signature made Wed 15 Oct 2014 11:21:13 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20141015-1:
  configure: Prepend pixman and ftd flags to overrule system-provided ones

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 13:08:43 +01:00
Peter Maydell
31cc9514a5 Merge remote-tracking branch 'remotes/lalrae/tags/mips-20141015' into staging
* remotes/lalrae/tags/mips-20141015: (28 commits)
  target-mips: Remove unused gen_load_ACX, gen_store_ACX and cpu_ACX
  target-mips/dsp_helper.c: Add ifdef guards around various functions
  target-mips/translate.c: Add ifdef guard around check_mips64()
  target-mips/op_helper.c: Remove unused do_lbu() function
  target-mips/dsp_helper.c: Remove unused function get_DSPControl_24()
  target-mips: fix broken MIPS16 and microMIPS
  target-mips/translate.c: Update OPC_SYNCI
  target-mips: define a new generic CPU supporting MIPS64 Release 6 ISA
  mips_malta: update malta's pseudo-bootloader - replace JR with JALR
  target-mips: remove JR, BLTZAL, BGEZAL and add NAL, BAL instructions
  target-mips: do not allow Status.FR=0 mode in 64-bit FPU
  target-mips: add new Floating Point Comparison instructions
  target-mips: add new Floating Point instructions
  softfloat: add functions corresponding to IEEE-2008 min/maxNumMag
  target-mips: add AUI, LSA and PCREL instruction families
  target-mips: add compact and CP1 branches
  target-mips: add ALIGN, DALIGN, BITSWAP and DBITSWAP instructions
  target-mips: Status.UX/SX/KX enable 32-bit address wrapping
  target-mips: move CLO, DCLO, CLZ, DCLZ, SDBBP and free special2 in R6
  target-mips: redefine Integer Multiply and Divide instructions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-22 12:06:47 +01:00
Peter Maydell
01a2050fa5 hw/i386/pc_q35.c: Avoid g_assert_cmpint() as it is not in glib 2.12
The function g_assert_cmpint() is not in glib 2.12, which is our current
minimum requirement. Rephrase the recently added assertion to avoid it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2014-10-22 11:32:44 +01:00
Cornelia Huck
4277af19d9 s390x: sweep up unmaintained files
Several s390x/kvm/ccw related files don't have an entry in MAINTAINERS:
Sort them into the appropriate sections.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-21 12:40:34 +02:00
Markus Armbruster
84ebe3755f block: Make device model's references to BlockBackend strong
Doesn't make a difference just yet, but it's the right thing to do.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:51 +02:00
Markus Armbruster
a7f53e26a6 block: Lift device model API into BlockBackend
Move device model attachment / detachment and the BlockDevOps device
model callbacks and their wrappers from BlockDriverState to
BlockBackend.

Wrapper calls in block.c change from

    bdrv_dev_FOO_cb(bs, ...)

to

    if (bs->blk) {
        bdrv_dev_FOO_cb(bs->blk, ...);
    }

No change, because both bdrv_dev_change_media_cb() and
bdrv_dev_resize_cb() do nothing when no device model is attached, and
a device model can be attached only when bs->blk.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
6007cdd448 blockdev: Convert qmp_eject(), qmp_change_blockdev() to BlockBackend
Much more command code needs conversion.  I'm converting these now
because they're using bdrv_dev_* functions, which I'm about to lift
into BlockBackend.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
d829a2115f block/qapi: Convert qmp_query_block() to BlockBackend
Much more command code needs conversion.  I start with this one
because it's using bdrv_dev_* functions, which I'm about to lift into
BlockBackend.

While there, give bdrv_query_info() internal linkage.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
26f8b3a847 blockdev: Fix blockdev-add not to create DriveInfo
blockdev_init() always creates a DriveInfo, but only drive_new() fills
it in.  qmp_blockdev_add() leaves it blank.  This results in a drive
with type = IF_IDE, bus = 0, unit = 0.  Screwed up in commit ee13ed1c.

Board initialization code looking for IDE drive (0,0) can pick up one
of these bogus drives.  The QMP command has to execute really early to
be visible.  Not sure how likely that is in practice.

Fix by creating DriveInfo in drive_new().  Block backends created by
blockdev-add don't get one.

Breaks the test for "has been created by qmp_blockdev_add()" in
blockdev_mark_auto_del() and do_drive_del(), because it changes the
value of dinfo && !dinfo->enable_auto_del from true to false.  Simply
test !dinfo instead.

Leaves DriveInfo member enable_auto_del unused.  Drop it.

A few places assume a block backend always has a DriveInfo.  Fix them
up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
d3aeb1b7da blockdev: Drop superfluous DriveInfo member id
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
b8864be5f3 pc87312: Drop unused members of PC87312State
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
a987ee1f1b ide: Complete conversion from BlockDriverState to BlockBackend
Add a BlockBackend member to TrimAIOCB, so ide_issue_trim_cb() can use
blk_aio_discard() instead of bdrv_aio_discard().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
4be746345f hw: Convert from BlockDriverState to BlockBackend, mostly
Device models should access their block backends only through the
block-backend.h API.  Convert them, and drop direct includes of
inappropriate headers.

Just four uses of BlockDriverState are left:

* The Xen paravirtual block device backend (xen_disk.c) opens images
  itself when set up via xenbus, bypassing blockdev.c.  I figure it
  should go through qmp_blockdev_add() instead.

* Device model "usb-storage" prompts for keys.  No other device model
  does, and this one probably shouldn't do it, either.

* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
  blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
  which has only the BlockDriverState.

* PC87312State has an unused BlockDriverState[] member.

The next two commits take care of the latter two.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:02:25 +02:00
Markus Armbruster
2a30307f70 virtio-blk: Rename VirtIOBlkConf variables to conf
This is consistent with how VirtIOFOOConf variables are named
elsewhere, and makes blk available for BlockBackend variables.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:02:21 +02:00
Markus Armbruster
f75167313c virtio-blk: Drop redundant VirtIOBlock member conf
Commit 12c5674 turned it into a pointer to member blk.conf.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:48:28 +02:00
Markus Armbruster
097310b53e block: Rename BlockDriverCompletionFunc to BlockCompletionFunc
I'll use it with block backends shortly, and the name is going to fit
badly there.  It's a block layer thing anyway, not just a block driver
thing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
Markus Armbruster
7c84b1b831 block: Rename BlockDriverAIOCB* to BlockAIOCB*
I'll use BlockDriverAIOCB with block backends shortly, and the name is
going to fit badly there.  It's a block layer thing anyway, not just a
block driver thing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
Markus Armbruster
fa1d36df74 block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()
The patch is big, but all it really does is replacing

    dinfo->bdrv

by

    blk_bs(blk_by_legacy_dinfo(dinfo))

The replacement is repetitive, but the conversion of device models to
BlockBackend is imminent, and will shorten it to just
blk_legacy_dinfo(dinfo).

Line wrapping muddies the waters a bit.  I also omit tests whether
dinfo->bdrv is null, because it never is.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
Markus Armbruster
7f06d47eff block: Merge BlockBackend and BlockDriverState name spaces
BlockBackend's name space is separate only to keep the initial patches
simple.  Time to merge the two.

Retain bdrv_find() and bdrv_get_device_name() for now, to keep this
series manageable.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
bfb197e0d9 block: Eliminate BlockDriverState member device_name[]
device_name[] can become non-empty only in bdrv_new_root() and
bdrv_move_feature_fields().  The latter is used only to undo damage
done by bdrv_swap().  The former is called only by blk_new_with_bs().
Therefore, when a BlockDriverState's device_name[] is non-empty, then
it's been created with a BlockBackend, and vice versa.  Furthermore,
blk_new_with_bs() keeps the two names equal.

Therefore, device_name[] is redundant.  Eliminate it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
fea68bb6e9 block: Eliminate bdrv_iterate(), use bdrv_next()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
b9fe8a7a12 blockdev: Eliminate drive_del()
drive_del() has become a trivial wrapper around blk_unref().  Get rid
of it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
9ba10c95a4 block: Make BlockBackend own its BlockDriverState
On BlockBackend destruction, unref its BlockDriverState.  Replaces the
callers' unrefs.

This turns the pointer from BlockBackend to BlockDriverState into a
strong reference, managed with bdrv_ref() / bdrv_unref().  The
back-pointer remains weak.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
8fb3c76c94 block: Code motion to get rid of stubs/blockdev.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
18e46a033d block: Connect BlockBackend and DriveInfo
Make the BlockBackend own the DriveInfo.  Change blockdev_init() to
return the BlockBackend instead of the DriveInfo.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
7e7d56d9e0 block: Connect BlockBackend to BlockDriverState
Convenience function blk_new_with_bs() creates a BlockBackend with its
BlockDriverState.  Callers have to unref both.  The commit after next
will relieve them of the need to unref the BlockDriverState.

Complication: due to the silly way drive_del works, we need a way to
hide a BlockBackend, just like bdrv_make_anon().  To emphasize its
"special" status, give the function a suitably off-putting name:
blk_hide_on_behalf_of_do_drive_del().  Unfortunately, hiding turns the
BlockBackend's name into the empty string.  Can't avoid that without
breaking the blk->bs->device_name equals blk->name invariant.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
26f54e9a3c block: New BlockBackend
A block device consists of a frontend device model and a backend.

A block backend has a tree of block drivers doing the actual work.
The tree is managed by the block layer.

We currently use a single abstraction BlockDriverState both for tree
nodes and the backend as a whole.  Drawbacks:

* Its API includes both stuff that makes sense only at the block
  backend level (root of the tree) and stuff that's only for use
  within the block layer.  This makes the API bigger and more complex
  than necessary.  Moreover, it's not obvious which interfaces are
  meant for device models, and which really aren't.

* Since device models keep a reference to their backend, the backend
  object can't just be destroyed.  But for media change, we need to
  replace the tree.  Our solution is to make the BlockDriverState
  generic, with actual driver state in a separate object, pointed to
  by member opaque.  That lets us replace the tree by deinitializing
  and reinitializing its root.  This special need of the root makes
  the data structure awkward everywhere in the tree.

The general plan is to separate the APIs into "block backend", for use
by device models, monitor and whatever other code dealing with block
backends, and "block driver", for use by the block layer and whatever
other code (if any) dealing with trees and tree nodes.

Code dealing with block backends, device models in particular, should
become completely oblivious of BlockDriverState.  This should let us
clean up both APIs, and the tree data structures.

This commit is a first step.  It creates a minimal "block backend"
API: type BlockBackend and functions to create, destroy and find them.

BlockBackend objects are created and destroyed exactly when root
BlockDriverState objects are created and destroyed.  "Root" in the
sense of "in bdrv_states".  They're not yet used for anything; that'll
come shortly.

A root BlockDriverState is created with bdrv_new_root(), so where to
create a BlockBackend is obvious.  Where these roots get destroyed
isn't always as obvious.

It is obvious in qemu-img.c, qemu-io.c and qemu-nbd.c, and in error
paths of blockdev_init(), blk_connect().  That leaves destruction of
objects successfully created by blockdev_init() and blk_connect().

blockdev_init() is used only by drive_new() and qmp_blockdev_add().
Objects created by the latter are currently indestructible (see commit
48f364d "blockdev: Refuse to drive_del something added with
blockdev-add" and commit 2d246f0 "blockdev: Introduce
DriveInfo.enable_auto_del").  Objects created by the former get
destroyed by drive_del().

Objects created by blk_connect() get destroyed by blk_disconnect().

BlockBackend is reference-counted.  Its reference count never exceeds
one so far, but that's going to change.

In drive_del(), the BB's reference count is surely one now.  The BDS's
reference count is greater than one when something else is holding a
reference, such as a block job.  In this case, the BB is destroyed
right away, but the BDS lives on until all extra references get
dropped.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Markus Armbruster
e4e9986b1c block: Split bdrv_new_root() off bdrv_new()
Creating an anonymous BDS can't fail.  Make that obvious.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Max Reitz
ec0de76874 nbd: Fix filename generation
Export names may be used with nbd+unix, too, fix nbd_refresh_filename()
accordingly. Also, for nbd+tcp, the documented path schema is
"nbd://host[:port]/export", so use it. Furthermore, as can be seen from
that schema, the port is optional.

That makes six single cases for how the filename can be formatted; it is
not easy to generalize these cases without the resulting statement being
completely unreadable, thus there is simply one snprintf() per case.

Finally, taking the options from BDRVNBDState::socket_opts is wrong,
because those will not contain the export name. Just use
BlockDriverState::options instead.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Tony Breeds
7c15903789 block/raw-posix: use seek_hole ahead of fiemap
try_fiemap() uses FIEMAP_FLAG_SYNC which has a significant performance
impact.

Prefer seek_hole() over fiemap() to avoid this impact where possible.
seek_hole is more widely used and, arguably, has potential to be
optimised in the kernel.

Reported-By: Michael Steffens <michael_steffens@posteo.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Pádraig Brady <pbrady@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Tony Breeds
38c4d0aea3 block/raw-posix: Fix disk corruption in try_fiemap
Using fiemap without FIEMAP_FLAG_SYNC is a known corrupter.

Add the FIEMAP_FLAG_SYNC flag to the FS_IOC_FIEMAP ioctl.  This has
the downside of significantly reducing performance.

Reported-By: Michael Steffens <michael_steffens@posteo.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Pádraig Brady <pbrady@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Zhang Haoyu
d8bb71b622 qcow2: fix leak of Qcow2DiscardRegion in update_refcount_discard
When the Qcow2DiscardRegion is adjacent to another one referenced by "d",
free this Qcow2DiscardRegion metadata referenced by "p" after
it was removed from s->discards queue.

Signed-off-by: Zhang Haoyu <zhanghy@sangfor.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Bastian Koppelmann
3a16ecb063 target-tricore: Add instructions of BO opcode format
Add instructions of BO opcode format.
Add microcode generator functions gen_swap, gen_ldmst.
Add microcode generator functions gen_st/ld_preincr, which write back the address after the memory access.
Add helper for circular and bit reverse addr mode calculation.
Add sign extended bitmask for BO_OFF10 field.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-10-20 12:25:07 +01:00
Bastian Koppelmann
b74f2b5bb3 target-tricore: Add instructions of BIT opcode format
Add instructions of BIT opcode format.
Add microcode generator functions gen_bit_1/2op to do 1/2 bit operations on the last bit.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-10-20 12:25:07 +01:00
Bastian Koppelmann
f718b0bbfa target-tricore: Add instructions of B opcode format
Add instructions of B opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-10-20 12:25:07 +01:00
Bastian Koppelmann
59543d4ee1 target-tricore: Add instructions of ABS, ABSB opcode format
Add instructions of ABS, ABSB opcode format.
Add microcode generator functions for ld/st of two 32bit reg as one 64bit value.
Add microcode generator functions for ldmst and swap.
Add helper ldlcx, lducx, stlcx and stucx.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-10-20 12:25:07 +01:00
Bastian Koppelmann
030c58dfb7 target-tricore: Cleanup and Bugfixes
Move FCX loading of save_context_ to caller functions, for STLCX, STUCX insn to use those functions.
Move FCX storing of restore_context_ to caller functions, for LDLCX, LDUCX insn to use those functions.
Remove do_raise_exception function, which caused clang to emit a warning.
Fix: save_context_lower now saves a[11] instead of PSW.
Fix: MASK_OP_ABSB_BPOS starting at wrong offset.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-10-20 12:25:06 +01:00
Gonglei
5f77ef69a1 glib: add compatibility interface for g_strcmp0()
This patch fixes compilation errors when building against glib < 2.16.0
due to the missing g_strcmp0() function.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1413457177-10132-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-16 23:02:31 +01:00
Peter Maydell
8a2c263624 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20141015-1' into staging
vga-pci: add qext region to mmio
vga: Remove unused arrays dmask4 and dmask16

# gpg: Signature made Wed 15 Oct 2014 10:12:06 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20141015-1:
  hw/display/vga: Remove unused arrays dmask4 and dmask16
  vga-pci: add qext region to mmio

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-16 09:26:14 +01:00
Peter Maydell
605c690b1b Merge remote-tracking branch 'remotes/kraxel/tags/pull-bootindex-20141015-1' into staging
allow changing bootorder via monitor at runtime,
by making bootindex a writable qom property.

* remotes/kraxel/tags/pull-bootindex-20141015-1: (34 commits)
  bootindex: change fprintf to error_report
  bootindex: delete bootindex when device is removed
  bootindex: move calling add_boot_device_patch to bootindex setter function
  ide: add calling add_boot_device_patch in bootindex setter function
  nvma: ide: add bootindex to qom property
  usb-storage: add bootindex to qom property
  virtio-blk: alias bootindex property explicitly for virt-blk-pci/ccw/s390
  block: remove bootindex property from qdev to qom
  virtio-blk: add bootindex to qom property
  ide: add bootindex to qom property
  scsi: add bootindex to qom property
  isa-fdc: remove bootindexA/B property from qdev to qom
  redirect: remove bootindex property from qdev to qom
  vfio: remove bootindex property from qdev to qom
  pci-assign: remove bootindex property from qdev to qom
  host-libusb: remove bootindex property from qdev to qom
  virtio-net: alias bootindex property explicitly for virt-net-pci/ccw/s390
  net: remove bootindex property from qdev to qom
  usb-net: add bootindex to qom property
  vmxnet3: add bootindex to qom property
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-16 09:24:45 +01:00
Stefan Hajnoczi
89b516d8b9 glib: add compatibility interface for g_get_monotonic_time()
This patch fixes compilation errors when building against glib <2.28.0
due to the missing g_get_monotonic_time() function.

The compilation error in tests/libqos/virtio.c was introduced in commit
70556264a8 ("libqos: use microseconds
instead of iterations for virtio timeout").

Add a simple g_get_monotonic_time() implementation to glib-compat.h
based on code from vhost-user-test.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[Igor: add G_TIME_SPAN_SECOND, include glib-compat.h in libqtest.h]
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-15 13:43:35 +01:00
Gerd Hoffmann
ac20e1bb0e xhci: remove dead code
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 13:39:48 +02:00
Jan Vesely
b13ce07688 usb-hid: Add high speed keyboard configuration
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>

[ kraxel: fixup compat property to apply to 2.1 & older ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 13:39:22 +02:00
Jan Vesely
58e4fee24a usb-hid: Add high speed mouse configuration
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>

[ kraxel: fixup compat property to apply to 2.1 & older ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 13:38:09 +02:00
Peter Maydell
32d9c5613e Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20141015' into staging
migration/next for 20141015

# gpg: Signature made Wed 15 Oct 2014 09:21:54 BST using RSA key ID 5872D723
# gpg: Can't check signature: public key not found

* remotes/juanquintela/tags/migration/20141015:
  migration: catch unknown flag combinations in ram_load
  qemu-file: Move stdio implementation to qemu-file-stdio.c
  qemu-file: Move unix and socket implementations to qemu-file-unix.c
  qemu-file: Use qemu_file_is_writable() on stdio_fclose()
  qemu-file: Make qemu_file_is_writable() non-static
  qemu-file: Add copyright header to qemu-file.c
  vmstate: Allow dynamic allocation for VBUFFER during migration
  block/migration: Disable cache invalidate for incoming migration
  Tests: QEMUSizedBuffer/QEMUBuffer
  QEMUSizedBuffer based QEMUFile

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-15 11:55:54 +01:00
Jan Kiszka
50e1206069 configure: Prepend pixman and ftd flags to overrule system-provided ones
Other packages may provide includes for pixman as well if the host has a
devel package installed. So add ours to the front to unsure that the
right version is used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 12:20:27 +02:00
Peter Maydell
9879232543 hw/display/vga: Remove unused arrays dmask4 and dmask16
Following cleanup of the vga device code in commit d2e043a804,
the arrays dmask4 and dmask16 are now unused. gcc doesn't warn
about this, but clang does; remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 11:10:50 +02:00
Gerd Hoffmann
b5682aa4ca vga-pci: add qext region to mmio
Add a qemu extented register range to the standard vga mmio bar.
Right nowe there are two registers:  One readonly register returning the
size of the region (so we can easily add more registers there if needed)
and one endian control register, so guests (especially ppc) can flip
the framebuffer endianness as they need it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-10-15 11:08:35 +02:00
Marc-André Lureau
9e5a25f1c2 qxl: keep going if reaching guest bug on empty area
Xorg server hangs when using xfig and typing a text with space:
 #0  qxl_wait_for_io_command (qxl=<value optimized out>) at qxl_io.c:47
 #1  0x00007f826a49a299 in qxl_download_box (surface=0x221d030, x1=231, y1=259,
     x2=<value optimized out>, y2=<value optimized out>) at qxl_surface.c:143

       while (!(ram_header->int_pending & QXL_INTERRUPT_IO_CMD))
         usleep (1);

The QXL driver is calling QXL_IO_UPDATE_AREA with an empty area. This
is a guest bug. The call is async and no ack is sent back on guest
bug, so the X server will hang. The driver should be improved to avoid
this situation and also to abort on QXL_INTERRUPT_ERROR. This will be
a different patch series for the driver. However, it is simple enough
to keep qemu running on empty areas update, which is what this patch
provides.

https://bugzilla.redhat.com/show_bug.cgi?id=1151363

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 11:08:34 +02:00
Jan Vesely
ff326ffaeb usb-hid: Move descriptor decision to usb-hid initfn
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 11:08:34 +02:00
Martin Decky
5c960521b8 gtk: add support for the Pause key
Special handing of the Pause key. Implemented in a similar way as in
ui/sdl.c.

Signed-off-by: Martin Decky <martin@decky.cz>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 11:08:32 +02:00
Chen Fan
84961407a5 gtk.c: Fix memory leak in gd_set_keycode_type()
this memory leak is introduced by the original
 commit 3158a3482b

 valgrind out showing:
 ==14553== 21,459 (72 direct, 21,387 indirect) bytes in 1 blocks are definitely
 lost in loss record 8,055 of 8,082
 ==14553==    at 0x4A06BC3: calloc (vg_replace_malloc.c:618)
 ==14553==    by 0x80DBFBC: XkbGetKeyboardByName (in /usr/lib64/libX11.so.6.3.0)
 ==14553==    by 0x40C704: gtk_display_init (gtk.c:1798)
 ==14553==    by 0x1AEDC1: main (vl.c:4480)

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 11:08:32 +02:00
Gonglei
54086fe5d2 bootindex: change fprintf to error_report
The function may be called by qmp command, we should
report error message to the caller.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:46:01 +02:00
Gonglei
4aca8a8178 bootindex: delete bootindex when device is removed
Device should be removed from global boot list when
it is hot-unplugged.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:46:01 +02:00
Gonglei
d749e10c4f bootindex: move calling add_boot_device_patch to bootindex setter function
On this way, we can assure the new bootindex take effect
during vm rebooting.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:46:01 +02:00
Gonglei
d2b186f96d ide: add calling add_boot_device_patch in bootindex setter function
On this way, we can assure the new bootindex take effect
during vm rebooting. Meanwhile set the initial value of
bootindex to -1.

Because ide devcies's unit property maybe
do not initialize when set_bootindex function is called,
so that we don't know its suffix. So we have to save the
call add_boot_device_path() on ide realize/init function.
When we want to change bootindex during vm rebooting, we
can call it in setter function.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:45:51 +02:00
Gonglei
33739c7129 nvma: ide: add bootindex to qom property
At present, nvma cannot boot. However, it provides already
a bootindex property, so change bootindex to qom for nvma
device, but not call add_boot_device_path.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:27:52 +02:00
Gonglei
89f0762dde usb-storage: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Because usb-storage rely on scsi-disk which is created
in usb_msg_realize_storage(), so we should store the SCSIDevice
pointer in MSDState struct. Only in this way, we can change
the global boot_order_list when we want to change the bootindex
during vm rebooting by calling object_property_set_int(Object(SCSIDevice),).

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:27:07 +02:00
Gonglei
aeb98ddc50 virtio-blk: alias bootindex property explicitly for virt-blk-pci/ccw/s390
Since the "bootindex" property is a QOM property and not a qdev property
now, we must alias it explicitly for virtio-blk-pci, as well as CCW and
s390-virtio.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
8dece34f26 block: remove bootindex property from qdev to qom
Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
3342ec324a virtio-blk: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
4556363087 ide: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
44fb6337b9 scsi: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
81782b6a78 isa-fdc: remove bootindexA/B property from qdev to qom
Remove bootindexA/B form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
295857994c redirect: remove bootindex property from qdev to qom
Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
abc5b3bfe1 vfio: remove bootindex property from qdev to qom
Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
Gonglei
7f6014af27 pci-assign: remove bootindex property from qdev to qom
Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
e6adae52b1 host-libusb: remove bootindex property from qdev to qom
Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
0cf63c3e35 virtio-net: alias bootindex property explicitly for virt-net-pci/ccw/s390
Since the "bootindex" property is a QOM property and not a qdev property
now, we must alias it explicitly for virtio-net-pci, as well as CCW and
s390-virtio.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
45e8a9e123 net: remove bootindex property from qdev to qom
Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
c11f4bc9ff usb-net: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
e25524efb0 vmxnet3: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
dfe79cf268 spapr_lian: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
afd7c850f5 rtl8139: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
ea3b3511cd pcnet: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
6cb0851d62 ne2000: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

At present, isa_ne2000 device does not support to boot
os, so we register two seprate qom getter/setter functions.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
7317bb1782 eepro100: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
5df3bf623d e1000: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:54 +02:00
Gonglei
aa4197c323 virtio-net: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:53 +02:00
Gonglei
12da309778 bootindex: add a setter/getter functions wrapper for bootindex property
when we remove bootindex form qdev.property to qom.property,
we can use those functions set/get bootindex property for all
correlative devices. Meanwhile set the initial value of
bootindex to -1.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:47 +02:00
Gonglei
a598f2ffc2 bootindex: support to set a existent device's bootindex to -1
When set a device's bootindex to -1, we remove it from global
fw_boot_order list.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:47 +02:00
Gonglei
e614b54b93 bootindex: rework add_boot_device_path function
Add the function of updating bootindex about fw_boot_order list
in add_boot_device_path(). We should delete the old one if a
device has existed in global fw_boot_order list.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:47 +02:00
Gonglei
bdbb5b1706 fw_cfg: add fw_cfg_machine_reset function
We must assure that the changed bootindex can take effect
when guest is rebooted. So we introduce fw_cfg_machine_reset(),
which change the fw_cfg file's bootindex data using the new
global fw_boot_order list.

Signed-off-by: Chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:15 +02:00
Gonglei
9d27572d62 bootindex: add del_boot_device_path function
Introduce del_boot_device_path() to clean up fw_cfg content when
hot-unplugging a device that refers to a bootindex or update a
existent devcie's bootindex.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Chenliang <chenliang88@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:49:48 +02:00
Gonglei
694fb857ab bootindex: add check bootindex function
Determine whether a given bootindex exists or not.
If exists, we report an error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:49:48 +02:00
Gonglei
bc74112f7e bootdevice: move bootdevice related code to new file bootdevice.c
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:49:48 +02:00
Peter Maydell
88e6599669 Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-peter' into staging
QOM infrastructure fixes and device conversions

* GPIO conversion to QOM, continued
* Device property description support
* QTest cases for hotplug
* Hotplug handler conversion

# gpg: Signature made Wed 15 Oct 2014 04:05:17 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-peter: (47 commits)
  qdev: Drop legacy_name from qdev properties
  qmp: Print descriptions of object properties
  qdev: Set the object property's description to the qdev property's.
  qom: Add description field in ObjectProperty struct
  qdev: Add description field in PropertyInfo struct
  qdev: device_del: Search for to be unplugged device in 'peripheral' container
  qdev: HotplugHandler: Add support for unplugging BUS-less devices
  qdev: Drop legacy hotplug fields/methods
  usb: Convert usb devices to hotplug handler API
  usb: Convert usb-ccid to hotplug handler API
  usb-storage: Drop not needed "allow_hotplug = 0"
  usb-bot: Drop not needed "allow_hotplug = 0"
  usb-bot: Mark device as non hotpluggable
  scsi: Cleanup not used anymore SCSIBusInfo{hotplug, hot_unplug} fields
  scsi: Convert virtio-scsi HBA to hotplug handler API
  scsi: Convert pvscsi HBA to hotplug handler API
  scsi: Set SCSI BUS itself as default HotplugHandler
  s390x: Convert virtio-ccw to hotplug handler API
  s390x: Convert s390-virtio to hotplug handler API
  s390x: Drop not used allow_hotplug in event-facility
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-15 08:12:53 +01:00
Gonglei
18b91a3e08 qdev: Drop legacy_name from qdev properties
The legacy_name is useless now, better help
information is provided by description field of property.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:15 +02:00
Gonglei
07d09c58db qmp: Print descriptions of object properties
Add a new "description" field to DevicePropertyInfo.
The descriptions can serve as documentation in the code,
and they can be used to provide better help. For example:

$./qemu-system-x86_64 -device virtio-blk-pci,?

Before this patch:

virtio-blk-pci.iothread=link<iothread>
virtio-blk-pci.x-data-plane=bool
virtio-blk-pci.scsi=bool
virtio-blk-pci.config-wce=bool
virtio-blk-pci.serial=str
virtio-blk-pci.secs=uint32
virtio-blk-pci.heads=uint32
virtio-blk-pci.cyls=uint32
virtio-blk-pci.discard_granularity=uint32
virtio-blk-pci.bootindex=int32
virtio-blk-pci.opt_io_size=uint32
virtio-blk-pci.min_io_size=uint16
virtio-blk-pci.physical_block_size=uint16
virtio-blk-pci.logical_block_size=uint16
virtio-blk-pci.drive=str
virtio-blk-pci.virtio-backend=child<virtio-blk-device>
virtio-blk-pci.command_serr_enable=on/off
virtio-blk-pci.multifunction=on/off
virtio-blk-pci.rombar=uint32
virtio-blk-pci.romfile=str
virtio-blk-pci.addr=pci-devfn
virtio-blk-pci.event_idx=on/off
virtio-blk-pci.indirect_desc=on/off
virtio-blk-pci.vectors=uint32
virtio-blk-pci.ioeventfd=on/off
virtio-blk-pci.class=uint32

After:

virtio-blk-pci.iothread=link<iothread>
virtio-blk-pci.x-data-plane=bool (on/off)
virtio-blk-pci.scsi=bool (on/off)
virtio-blk-pci.config-wce=bool (on/off)
virtio-blk-pci.serial=str
virtio-blk-pci.secs=uint32
virtio-blk-pci.heads=uint32
virtio-blk-pci.cyls=uint32
virtio-blk-pci.discard_granularity=uint32
virtio-blk-pci.bootindex=int32
virtio-blk-pci.opt_io_size=uint32
virtio-blk-pci.min_io_size=uint16
virtio-blk-pci.physical_block_size=uint16 (A power of two between 512 and 32768)
virtio-blk-pci.logical_block_size=uint16 (A power of two between 512 and 32768)
virtio-blk-pci.drive=str (ID of a drive to use as a backend)
virtio-blk-pci.virtio-backend=child<virtio-blk-device>
virtio-blk-pci.command_serr_enable=bool (on/off)
virtio-blk-pci.multifunction=bool (on/off)
virtio-blk-pci.rombar=uint32
virtio-blk-pci.romfile=str
virtio-blk-pci.addr=int32 (Slot and optional function number, example: 06.0 or 06)
virtio-blk-pci.event_idx=bool (on/off)
virtio-blk-pci.indirect_desc=bool (on/off)
virtio-blk-pci.vectors=uint32
virtio-blk-pci.ioeventfd=bool (on/off)
virtio-blk-pci.class=uint32

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:15 +02:00
Gonglei
b8c9cd5c8c qdev: Set the object property's description to the qdev property's.
Set all static qdev properties' descriptions to object property's
description.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:15 +02:00
Gonglei
8074264203 qom: Add description field in ObjectProperty struct
The descriptions can serve as documentation in the code,
and they can be used to provide better help.

Copy property descriptions when copying alias properties.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:15 +02:00
Gonglei
51b2e8c331 qdev: Add description field in PropertyInfo struct
The descriptions can serve as documentation in the code,
and they can be used to provide better help.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:15 +02:00
Igor Mammedov
b6cc36abb2 qdev: device_del: Search for to be unplugged device in 'peripheral' container
device_add puts every device with 'id' inside of 'peripheral'
container using id's value as the last component name.
Use it by replacing recursive search on sysbus with path
lookup in 'peripheral' container, which could handle both
BUS and BUS-less device cases.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:15 +02:00
Igor Mammedov
7716b8ca74 qdev: HotplugHandler: Add support for unplugging BUS-less devices
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
2d9a982f37 qdev: Drop legacy hotplug fields/methods
It removes not needed anymore BusState::allow_hotplug field and
DeviceClass::unplug callback.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
5f4d917376 usb: Convert usb devices to hotplug handler API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
138d587afb usb: Convert usb-ccid to hotplug handler API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
77de4a09c6 usb-storage: Drop not needed "allow_hotplug = 0"
Drop useless hack that disables hotplug on bus, after backend
storage was added to it, by setting "allow_hotplug = 0". Even
if bus is hotpluggable, it won't be possible to add another
SCSI device to bus since its realize will fail early with
error "no free target" in scsi_qdev_realize() method.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
af01492755 usb-bot: Drop not needed "allow_hotplug = 0"
Drop useless hack that disables hotplug on bus by setting
"allow_hotplug = 0". Even if bus is hotpluggable, It won't
be possible to add another SCSI device to bus since its
realization will fail early with error "no free target"
in scsi_qdev_realize() method.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
e9fd12aa0d usb-bot: Mark device as non hotpluggable
usb-bot creates SCSI bus and immediately makes it
non hotpluggable which was making not possible to
hotplug usb-bot since QEMU would abort at
bus_add_child(scsi-hd) time when usb-bot is
realized.

Mark usb-bot as not hotpluggable so that attempt
to hotplug it would error out even before it gets
to device initialization point.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
10bdcd5659 scsi: Cleanup not used anymore SCSIBusInfo{hotplug, hot_unplug} fields
SCSI subsytem was converted to hotplug handler API and
doesn't use SCSIBusInfo{hotplug, hot_unplug} fields and
related callbacks anymore.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
02206e5275 scsi: Convert virtio-scsi HBA to hotplug handler API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
91c8daad4b scsi: Convert pvscsi HBA to hotplug handler API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
bddd763a4e scsi: Set SCSI BUS itself as default HotplugHandler
That would allow to handle SCSI device unplug
on HBAs without dedicated hot(un)plug handlers
and avoid making such HBAs explicitly hotpluggable.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
277bc95ed3 s390x: Convert virtio-ccw to hotplug handler API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
e98f8c3622 s390x: Convert s390-virtio to hotplug handler API
Beside of conversion, patch drops present unplug
handling, effectively disabling hot-unplug of
s390-virtio devices.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:14 +02:00
Igor Mammedov
492bcf8f71 s390x: Drop not used allow_hotplug in event-facility
s390-sclp-event-facility creates s390-sclp-events-bus
and immediately sets its allow_hotplug field to 0,
which is NOP since it's already 0 by default.

Also since BUS is not hotpluggable, it's not possible
to call SCLP_EVENT{ DeviceClass::unplug } callback
from qdev_unplug() making this unreachable code,
so drop it as well.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
2f4f603517 virtio-mmio: Drop useless bus->allow_hotplug = 0
Bus by default is not hotpluggable.
virtio-mmio-bus and its parent types do not set allow_hotplug
anywhere explicitly, so remove not needed field access
and wrapper along with it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
0ddef15b04 virtio-serial: Convert to hotplug-handler API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
7f17a91715 virtio-pci: Drop BusState::allow_hotplug
virtio-pci-bus is an internal object of composite
virtio-pci device and it doesn't participate in
-device/device_add hotplug flow, and since it's
not required by bus_add_child() that BUS must
be hotpluggable to be able to add child at runtime,
it's possible to drop not needed 'allow_hotplug'
field.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
c32e36f6ab target-i386: ICC bus: Drop BusState::allow_hotplug
Since bus_add_child() no longer cares if BUS is hotpluggable
or not, there is no need in setting allow_hotplug field.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
e378acb404 qdev: Drop hotplug check from bus_add_child()
Check is too restrictive and does not allow
to add children to just created bus during hotplug
when the bus is part of composite device.

Removing check from bus_add_child() doesn't affect
devices creatable with device_add/del commands since
they have a similar builtin check and patch will
allow to create complex composite devices during
hotplug.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
431bbb26cb qdev: Add wrapper to set BUS as HotplugHandler
To be used for conversion of SCSI and USB devices,
and would allow to make every HBA/USB host switch
to HotplugHandler API without touching each controller
explicitly.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
014176f914 qdev: Add simple/generic unplug callback for HotplugHandler
It will be used in shallow conversion from legacy hotplug
mechanism and eventually replace all the uses of old mechanism
DeviceClass::unplug = qdev_simple_unplug_cb()

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
181a2c6323 qdev: HotplugHandler: Provide unplug callback
It is to be called for actual device removal and
will allow to separate request and removal handling
phases of x86-CPU devices and also it's a handler
to be called for synchronously removable devices.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
14d5a28fb6 qdev: HotplugHandler: Rename unplug callback to unplug_request
'HotplugHandler.unplug' callback is currently used as async
call to issue unplug request for device that implements it.
Renaming 'unplug' callback to 'unplug_request' should help to
avoid confusion about what callback does and would allow to
introduce 'unplug' callback that would perform actual device
removal when guest is ready for it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
ce9835e00d qdev: do not allow to instantiate non hotpluggable device with device_add
It will allow explicitly mark device as not hotpluggable and
avoid its creation with following error at realize time
and destroying it afterwards anyway. Instead of it will
error out even before instance of device is created.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
39b888bd88 Access BusState::allow_hotplug using wraper qbus_is_hotpluggable()
It would allow to transparently switch detection whether Bus
is hotpluggable from allow_hotplug field to hotplug_handler
link and to drop allow_hotplug field once all users are
converted to hotplug handler API.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
49cec38591 tests: usb: usb-uas hotplug test
checks that it's possible to hotplug usb-uas HBA and
then if it's possible to hot(un)plug scsi-disk to it.
Thest basically covers hot(un)plug on dummy HBAs
without means of hot(un)plug notification of the guest.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
b6ca82feed tests: usb: usb-storage hotplug test
usb-storage is different from usual usb devices
in that it uses a child SCSI bus for underlying storage.
This commit verifies that the SCSI bus is hotpluggable, as
hotplug operation wouldn't succeed without it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
b393768314 tests: usb: Generic usb device hotplug
use usb-tablet as a hotplugged usb device.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:13 +02:00
Igor Mammedov
fbd942c993 tests: usb: add port test to uhci unit test
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
b0354ec596 tests: usb: Move uhci port test code to libqos/usb.c
Move code necessary for testing uhci port into library
so it could be used by other USB tests.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
aaf3607051 tests: virtio-blk: Check if hot-plug/unplug works
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
9224709b7b tests: virtio-net: Check if hot-plug/unplug works
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
d1f3fc24f8 tests: virtio-rng: Check if hot-plug/unplug works
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
2f8b276720 libqos: Add qpci_plug_device_test() and qpci_unplug_acpi_device_test()
Functions will be used for testing hot(un)plug of PCI devices.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
823a9987c9 tests: virtio-serial: Check if hot-plug/unplug works
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Igor Mammedov
ac2c4946cc tests: virtio-scsi: Check if hot-plug/unplug works
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:12 +02:00
Gonglei
8ae9a9ef4e qom: Add error handler for object alias property
object_property_add_alias() is called at some
places at present. And its parameter errp may not NULL,
such as
 object_property_add_alias(obj, "iothread", OBJECT(&dev->vdev),"iothread",
                              &error_abort);
This patch add error handler for security.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:03:04 +02:00
Gonglei
3a53009fa0 qom: Add error handler for object_property_print()
Avoid the caller of object_property_print() leaking string
argument's memory, such as qdev_print_props() when
encounter errors.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-15 05:02:55 +02:00
Peter Maydell
340fff722d target-mips: Remove unused gen_load_ACX, gen_store_ACX and cpu_ACX
Remove the functions gen_load_ACX and gen_store_ACX, which appear to have
been unused since they were first introduced many years ago. These functions
were the only places using the cpu_ACX[] array of TCG globals, so remove
that and its accompanying regnames_ACX[] as well.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:15 +01:00
Peter Maydell
31efecccce target-mips/dsp_helper.c: Add ifdef guards around various functions
Add ifdef TARGET_MIPS64 guards around various functions that are only
called from helpers for TARGET_MIPS64 CPUs; this avoids compiler
warnings when building other configs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:14 +01:00
Peter Maydell
c7986fd6cd target-mips/translate.c: Add ifdef guard around check_mips64()
The function check_mips64() is only used if TARGET_MIPS64 is defined;
add an ifdef guard to its definition to avoid warnings about it being
unused in other configurations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:14 +01:00
Peter Maydell
b808a1a812 target-mips/op_helper.c: Remove unused do_lbu() function
The do_lbu() function defined by the expansion of HELPER_LD() is
never used, so don't define it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:14 +01:00
Peter Maydell
3414e93eb7 target-mips/dsp_helper.c: Remove unused function get_DSPControl_24()
The function get_DSPControl_24() is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:14 +01:00
Yongbok Kim
b231c103af target-mips: fix broken MIPS16 and microMIPS
Commit 240ce26a broke MIPS16 and microMIPS support as it didn't
care those branches and jumps don't have delay slot in
MIPS16 and microMIPS.

This patch introduces a new argument delayslot_size to the
gen_compute_branch() indicating size of delay slot {0, 2, 4}.
And the information is used to call handle_delay_slot() forcingly
when no delay slot is required.

There are some microMIPS branch and jump instructions that requires
exact size of instruction in the delay slot. For indicating
these instructions, MIPS_HFLAG_BDS_STRICT flag is introduced.

Those fictional branch opcodes defined to support MIPS16 and
microMIPS are no longer needed.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Tested-by: Jonas Gorski <jogo@openwrt.org>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
[leon.alrae@imgtec.com: cosmetic changes]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:14 +01:00
Dongxue Zhang
a83bddd60d target-mips/translate.c: Update OPC_SYNCI
Update OPC_SYNCI with BS_STOP, in order to handle the instructions which saved
in the same TB of the store instruction.

Signed-off-by: Dongxue Zhang <elta.era@gmail.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
[leon.alrae@imgtec.com: update microMIPS SYNCI as well]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-14 13:29:14 +01:00
Leon Alrae
a773cc7970 target-mips: define a new generic CPU supporting MIPS64 Release 6 ISA
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-10-14 13:28:52 +01:00
Leon Alrae
9fba150042 mips_malta: update malta's pseudo-bootloader - replace JR with JALR
JR has been removed in R6 and now this instruction will cause Reserved
Instruction Exception. Therefore use JALR with rd=0 which is equivalent to JR.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-14 13:28:52 +01:00
Yongbok Kim
0aefa33318 target-mips: remove JR, BLTZAL, BGEZAL and add NAL, BAL instructions
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-14 13:28:52 +01:00
Leon Alrae
ddc584bdb5 target-mips: do not allow Status.FR=0 mode in 64-bit FPU
Status.FR bit must be ignored on write and read as 1 when an implementation of
Release 6 of the Architecture in which a 64-bit floating point unit is
implemented.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-10-14 13:28:52 +01:00
Yongbok Kim
3f4938833c target-mips: add new Floating Point Comparison instructions
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-14 13:28:52 +01:00
Leon Alrae
e7f16abbc5 target-mips: add new Floating Point instructions
In terms of encoding MIPS32R6 MIN.fmt, MAX.fmt, MINA.fmt, MAXA.fmt replaced
MIPS-3D RECIP1, RECIP2, RSQRT1, RSQRT2 instructions.

In R6 all Floating Point instructions are supposed to be IEEE-2008 compliant
i.e. FIR.HAS2008 always 1. However, QEMU softfloat for MIPS has not been
updated yet.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-10-14 13:28:51 +01:00
Leon Alrae
2d31e0607d softfloat: add functions corresponding to IEEE-2008 min/maxNumMag
Add abs argument to the existing softfloat minmax() function and define
new float{32,64}_{min,max}nummag functions.

minnummag(x,y) returns x if |x| < |y|,
               returns y if |y| < |x|,
               otherwise minnum(x,y)

maxnummag(x,y) returns x if |x| > |y|,
               returns y if |y| > |x|,
               otherwise maxnum(x,y)

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-14 13:28:51 +01:00
Leon Alrae
d4ea6acdf6 target-mips: add AUI, LSA and PCREL instruction families
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
2014-10-14 13:26:54 +01:00
Peter Lieven
5b0e9dd46f migration: catch unknown flag combinations in ram_load
this patch extends commit db80fac by not only checking
for unknown flags, but also filtering out unknown flag
combinations.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 11:24:20 +02:00
Eduardo Habkost
c54d1c0670 qemu-file: Move stdio implementation to qemu-file-stdio.c
Separate the QEMUFile interface from the stdio-specific implementation,
to reduce dependencies from code using QEMUFile.

The code that is being moved is similar to the one that was on savevm.c before
it was moved in commit 093c455a8c, except for
some changes done by Markus, Juan, and myself. So, I am using the copyright and
license header from savevm.c, but CCing Juan and Markus so they can review the
copyright/license header.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 10:29:28 +02:00
Eduardo Habkost
bee223ba27 qemu-file: Move unix and socket implementations to qemu-file-unix.c
Separate the QEMUFile interface from the implementation, to reduce
dependencies from code using QEMUFile.

All the code that is being moved to the new file is exactly the same
code that was on savevm.c (moved by commit
093c455a8c), so I am using the copyright
and license header from savevm.c for the new file.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 10:29:28 +02:00
Eduardo Habkost
532bc727c3 qemu-file: Use qemu_file_is_writable() on stdio_fclose()
Use the existing function which checks if writev_buffer() or
put_buffer() are set, instead of duplicating it.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 10:28:12 +02:00
Eduardo Habkost
e68dd36596 qemu-file: Make qemu_file_is_writable() non-static
The QEMUFileStdio code will use qemu_file_is_writable() and will be
moved to a separate file.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 10:28:12 +02:00
Eduardo Habkost
9a4ac51f72 qemu-file: Add copyright header to qemu-file.c
The person who created qemu-file.c (me, on commit
093c455a8c) didn't add a copyright/license
header to the file, even though the whole code was copied from savevm.c
(which had a copyright/license header).

To correct this, copy the copyright information and license from
savevm.c, that's where the original code came from.

Luckily, very few changes were made on qemu-file.c after it was created.
All the authors who touched the code are being CCed, so they can confirm
if they are OK with the copyright/license information being added.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 10:16:38 +02:00
Alexey Kardashevskiy
94ed706d53 vmstate: Allow dynamic allocation for VBUFFER during migration
This extends use of VMS_ALLOC flag from arrays to VBUFFER as well.

This defines VMSTATE_VBUFFER_ALLOC_UINT32 which makes use of VMS_ALLOC
and uses uint32_t type for a size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 09:35:48 +02:00
Alexey Kardashevskiy
7ea2d269cb block/migration: Disable cache invalidate for incoming migration
When migrated using libvirt with "--copy-storage-all", at the end of
migration there is race between NBD mirroring task trying to do flush
and migration completion, both end up invalidating cache. Since qcow2
driver does not handle this situation very well, random crashes happen.

This disables the BDRV_O_INCOMING flag for the block device being migrated
once the cache has been invalidated.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

--

fixed parens by hand
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 09:35:21 +02:00
Dr. David Alan Gilbert
9935baca9b Tests: QEMUSizedBuffer/QEMUBuffer
Modify some of tests/test-vmstate.c to use the in memory file based
on QEMUSizedBuffer to provide basic testing of QEMUSizedBuffer and
the associated memory backed QEMUFile type.

Only some of the tests are changed so that the fd backed QEMUFile is
still tested.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 09:17:06 +02:00
Dr. David Alan Gilbert
deb22f9a44 QEMUSizedBuffer based QEMUFile
This is based on Stefan and Joel's patch that creates a QEMUFile that goes
to a memory buffer; from:

http://lists.gnu.org/archive/html/qemu-devel/2013-03/msg05036.html

Using the QEMUFile interface, this patch adds support functions for
operating on in-memory sized buffers that can be written to or read from.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Joel Schopp <jschopp@linux.vnet.ibm.com>

For fixes/tweeks I've done:
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 09:17:06 +02:00
Peter Crosthwaite
688b057aec qdev: gpio: Register GPIO outputs as QOM links
Within the object that contains the GPIO output. This allows for
connecting GPIO outputs via setting of a Link property.

Also clear the link value to zero. This catch-alls the case
where a device improperly inits a gpio_out (malloc instead of
malloc0).

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-13 16:39:27 +02:00
Peter Crosthwaite
a69bef1cf1 qdev: gpio: Register GPIO inputs as child objects
To the device that contains them. This will allow for referencing
a GPIO input from it's canonical path (exciting for dynamic machine
generation!)

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-13 16:39:26 +02:00
Peter Crosthwaite
b235a71f52 qdev: gpio: Don't allow name share between I and O
Only allow a GPIO name to be one or the other. Inputs and outputs are
functionally different and should be in different namespaces. Prepares
support for the QOMification of IRQs as Links or Child objects.

The alternative is to munge names .e.g. with "-in" or "-out" suffixes
when giving QOM names. But that reduces clarity and if there are cases
out there where users want I and O with same name they can manually add
their own suffixes.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-10-13 16:39:26 +02:00
Yongbok Kim
31837be3ee target-mips: add compact and CP1 branches
Introduce MIPS32R6 Compact Branch instructions which do not have delay slot -
they have forbidden slot instead. However, current implementation does not
support forbidden slot yet.

Add also BC1EQZ and BC1NEZ instructions.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2014-10-13 12:38:25 +01:00
Yongbok Kim
15eacb9b52 target-mips: add ALIGN, DALIGN, BITSWAP and DBITSWAP instructions
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:25 +01:00
Leon Alrae
01f7288579 target-mips: Status.UX/SX/KX enable 32-bit address wrapping
In R6 the special behaviour for data references is also specified for Kernel
and Supervisor mode. Therefore MIPS_HFLAG_UX is replaced by generic
MIPS_HFLAG_AWRAP indicating enabled 32-bit address wrapping.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:25 +01:00
Leon Alrae
4267d3e6e0 target-mips: move CLO, DCLO, CLZ, DCLZ, SDBBP and free special2 in R6
Also consider OPC_SPIM instruction as deleted in R6 because it is overlaping
with MIPS32R6 SDBBP.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:25 +01:00
Leon Alrae
b42ee5e1d9 target-mips: redefine Integer Multiply and Divide instructions
Use "R6_" prefix in front of all new Multiply / Divide instructions for
easier differentiation between R6 and preR6.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Leon Alrae
bf7910c6b1 target-mips: move PREF, CACHE, LLD and SCD instructions
The encoding of PREF, CACHE, LLD and SCD instruction changed in MIPS32R6.
Additionally, the hint codes in PREF instruction greater than or
equal to 24 generate Reserved Instruction Exception.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Leon Alrae
fac5a07330 target-mips: signal RI Exception on DSP and Loongson instructions
Move DSP and Loongson instruction to *_legacy functions as they have been
removed in R6.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Leon Alrae
10dc65dbb8 target-mips: split decode_opc_special* into *_r6 and *_legacy
For better code readability and to avoid 'if' statements for all R6 and preR6
instructions whose opcodes are the same - decode_opc_special* functions are
split into functions with _r6 and _legacy suffixes.

*_r6 functions will contain instructions which were introduced in R6.
*_legacy functions will contain instructions which were removed in R6.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Leon Alrae
099e5b4d9f target-mips: extract decode_opc_special* from decode_opc
Creating separate decode functions for special, special2 and special3
instructions to ease adding new R6 instructions and removing legacy
instructions.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Leon Alrae
4368b29a26 target-mips: move LL and SC instructions
The encoding of LL and SC instruction has changed in MIPS32 Release 6.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
2014-10-13 12:38:24 +01:00
Leon Alrae
b691d9d2a0 target-mips: add SELEQZ and SELNEZ instructions
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
2014-10-13 12:38:24 +01:00
Leon Alrae
fecd264695 target-mips: signal RI Exception on instructions removed in R6
Signal Reserved Instruction Exception on instructions that do not exist in R6.
In this commit the following groups of preR6 instructions are marked as deleted:
- Floating Point Paired Single
- Floating Point Compare
- conditional moves / branches on FPU conditions
- branch likelies
- unaligned loads / stores
- traps
- legacy accumulator instructions
- COP1X
- MIPS-3D

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Leon Alrae
fa0d2f69e7 target-mips: define ISA_MIPS64R6
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-10-13 12:38:24 +01:00
Peter Maydell
b1d28ec6a7 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20141010' into staging
various s390x updates:
- cpu state handling in qemu and migration
- vhost-scsi-ccw bugfix

# gpg: Signature made Fri 10 Oct 2014 14:01:34 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20141010:
  s390x/virtio-ccw: fix vhost-scsi intialization
  s390x/migration: migrate CPU state
  s390x/kvm: synchronize the cpu state after SIGP (INITIAL) CPU RESET
  s390x/kvm: reuse kvm_s390_reset_vcpu() to get rid of ifdefs
  s390x/kvm: propagate s390 cpu state to kvm
  s390x/kvm: proper use of the cpu states OPERATING and STOPPED
  s390x/kvm: introduce proper states for s390 cpus
  linux-headers: update to 3.17-rc7

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-10 14:55:29 +01:00
Paolo Bonzini
9d1c35dfc9 kvm fix compilation with GCC 4.3.4
As usual, SLES11's GCC complained about double typedefs:

/home/cohuck/git/qemu/kvm-all.c:110: error: redefinition of typedef ‘KVMState’
/home/cohuck/git/qemu/include/sysemu/kvm.h:161: error: previous declaration of ‘KVMState’ was here

Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-10 14:07:08 +01:00
Cornelia Huck
4b7757bae7 s390x/virtio-ccw: fix vhost-scsi intialization
The vhost-scsi-ccw backend is of type VHostSCSICcw, not VirtIOSCSICcw.

This fixes a segfault when invoking

    qemu-system-s390x -device vhost-scsi-ccw,?

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 13:32:39 +02:00
Thomas Huth
ef1df13087 s390x/migration: migrate CPU state
This patch provides the cpu save information for dumps and later life
migration and enables migration of the CPU state. The code is based on
earlier work from Christian Borntraeger and Jason Herne.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
[provide cpu_post_load()]
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Jason J. Herne <jjherne@us.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
[Cornelia Huck: tweaked cpu_post_load() comment]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 13:31:51 +02:00
David Hildenbrand
71dd7e69b3 s390x/kvm: synchronize the cpu state after SIGP (INITIAL) CPU RESET
We need to synchronize registers after a reset has been performed. The
current code does that in qemu_system_reset(), load_normal_reset() and
modified_clear_reset() for all vcpus. After SIGP (INITIAL) CPU RESET,
this needs to be done for the targeted vcpu as well, so let's call
cpu_synchronize_post_reset() in the respective handlers.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 10:37:47 +02:00
David Hildenbrand
99607144a4 s390x/kvm: reuse kvm_s390_reset_vcpu() to get rid of ifdefs
This patch reuses kvm_s390_reset_vcpu() to get rid of some CONFIG_KVM and
CONFIG_USER_ONLY ifdefs in cpu.c.

In order to get rid of CONFIG_USER_ONLY, kvm_s390_reset_vcpu() has to provide a
dummy implementation - the two definitions are moved to the proper section in
cpu.h.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 10:37:47 +02:00
David Hildenbrand
c9e659c9ee s390x/kvm: propagate s390 cpu state to kvm
Let QEMU propagate the cpu state to kvm. If kvm doesn't yet support it, it is
silently ignored as kvm will still handle the cpu state itself in that case.

The state is not synced back, thus kvm won't have a chance to actively modify
the cpu state. To do so, control has to be given back to QEMU (which is already
done so in all relevant cases).

Setting of the cpu state can fail either because kvm doesn't support the
interface yet, or because the state is invalid/not supported. Failed attempts
will be traced

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 10:37:47 +02:00
David Hildenbrand
eb24f7c689 s390x/kvm: proper use of the cpu states OPERATING and STOPPED
This patch makes sure that halting a cpu and stopping a cpu are two different
things. Stopping a cpu will also set the cpu halted - this is needed for common
infrastructure to work (note that the stop and stopped flag cannot be used for
our purpose because they are already used by other mechanisms).

A cpu can be halted ("waiting") when it is operating. If interrupts are
disabled, this is called a "disabled wait", as it can't be woken up anymore. A
stopped cpu is treated like a "disabled wait" cpu, but in order to prepare for a
proper cpu state synchronization with the kvm part, we need to track the real
logical state of a cpu.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 10:37:47 +02:00
David Hildenbrand
75973bfe41 s390x/kvm: introduce proper states for s390 cpus
Until now, when a s390 cpu was stopped or halted, the number of running
CPUs was tracked in a global variable. This was problematic for migration,
so Jason came up with a per-cpu running state.
As it turns out, we want to track the full logical state of a target vcpu,
so we need real s390 cpu states.

This patch is based on an initial patch by Jason Herne, but was heavily
rewritten when adding the cpu states STOPPED and OPERATING. On the way we
move add_del_running to cpu.c (the declaration is already in cpu.h) and
modify the users where appropriate.

Please note that the cpu is still set to be stopped when it is
halted, which is wrong. This will be fixed in the next patch. The LOAD and
CHECK-STOP state will not be used in the first step.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
[folded Jason's patch into David's patch to avoid add/remove same lines]
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 10:37:47 +02:00
Jens Freimann
a9fd16544d linux-headers: update to 3.17-rc7
Sync headers with 3.17-rc7

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-10-10 10:37:47 +02:00
Peter Maydell
fcb2cd928f Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Four changes here.  Polling for reconnection of character devices,
the QOMification of accelerators, a fix for -kernel support on x86, and one
for a recently-introduced virtio-scsi optimization.

# gpg: Signature made Thu 09 Oct 2014 14:36:50 BST using RSA key ID 4E6B09D7
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream: (28 commits)
  qemu-char: Fix reconnect socket error reporting
  qemu-sockets: Add error to non-blocking connect handler
  qemu-error: Add error_vreport()
  virtio-scsi: fix use-after-free of VirtIOSCSIReq
  linuxboot: compute initrd loading address
  kvm: Make KVMState be the TYPE_KVM_ACCEL instance struct
  accel: Create accel object when initializing machine
  accel: Pass MachineState object to accel init functions
  accel: Rename 'init' method to 'init_machine'
  accel: Move accel init/allowed code to separate function
  accel: Remove tcg_available() function
  accel: Move qtest accel registration to qtest.c
  accel: Move Xen registration code to xen-common.c
  accel: Move KVM accel registration to kvm-all.c
  accel: Report unknown accelerator as "not found" instead of "does not exist"
  accel: Make AccelClass.available() optional
  accel: Use QOM classes for accel types
  accel: Move accel name lookup to separate function
  accel: Simplify configure_accelerator() using AccelType *acc variable
  accel: Create AccelType typedef
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-09 15:09:05 +01:00
Corey Minyard
5008e5b7b8 qemu-char: Fix reconnect socket error reporting
If reconnect was set, errors wouldn't always be reported.
Fix that and also only report a connect error once until a
connection has been made.

The primary purpose of this is to tell the user that a
connection failed so they can know they need to figure out
what went wrong.  So we don't want to spew too much
out here, just enough so they know.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Corey Minyard
5179502918 qemu-sockets: Add error to non-blocking connect handler
An error value here would be quite handy and more consistent
with the rest of the code.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
[Make sure SO_ERROR value is passed to error_setg_errno. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Corey Minyard
5748e4c2be qemu-error: Add error_vreport()
Needed to nicely print socket error reports.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Paolo Bonzini
35e4e96c4d virtio-scsi: fix use-after-free of VirtIOSCSIReq
scsi_req_continue can complete the request and cause the VirtIOSCSIReq
to be freed.  Fetch req->sreq just once to avoid the bug.

Reported-by: Richard Jones <rjones@redhat.com>
Tested-by: Richard Jones <rjones@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Paolo Bonzini
cdebec5e40 linuxboot: compute initrd loading address
Even though hw/i386/pc.c tries to compute a valid loading address for the
initrd, close to the top of RAM, this does not take into account other
data that is malloced into that memory by SeaBIOS.

Luckily we can easily look at the memory map to find out how much memory is
used up there.  This patch places the initrd in the first four gigabytes,
below the first hole (as returned by INT 15h, AX=e801h).

Without this patch:
[    0.000000] init_memory_mapping: [mem 0x07000000-0x07fdffff]
[    0.000000] RAMDISK: [mem 0x0710a000-0x07fd7fff]

With this patch:
[    0.000000] init_memory_mapping: [mem 0x07000000-0x07fdffff]
[    0.000000] RAMDISK: [mem 0x07112000-0x07fdffff]

So linuxboot is able to use the 64k that were added as padding for
QEMU <= 2.1.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Eduardo Habkost
fc02086b5a kvm: Make KVMState be the TYPE_KVM_ACCEL instance struct
Now that we create an accel object before calling machine_init, we can
simply use the accel object to save all KVMState data, instead of
allocationg KVMState manually.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Eduardo Habkost
ac2da55e01 accel: Create accel object when initializing machine
Create an actual TYPE_ACCEL object when initializing a machine. This
will allow accelerator classes to implement some initialization on
instance_init, and to save state on the TYPE_ACCEL object.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:14 +02:00
Eduardo Habkost
f6a1ef6440 accel: Pass MachineState object to accel init functions
Most of the machine options and machine state information is in the
MachineState object, not on the MachineClass. This will allow init
functions to use the MachineState object directly instead of
qemu_get_machine_opts() or the current_machine global.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 12:57:10 +02:00
Peter Maydell
b6011bd8a5 Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20141006-2' into staging
linux-user pull for 2.2

Clearest linux-user patches sent to the list since august,
Apart from Mikhails patch, the rest are quite trivial.

v2: check for CONFIG_TIMERFD only after it has been defined

# gpg: Signature made Mon 06 Oct 2014 20:08:10 BST using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20141006-2:
  translate-all.c: memory walker initial address miscalculation
  linux-user: don't include timerfd if not needed
  linux-user: Simplify timerid checks on g_posix_timers range
  linux-user: Convert blkpg to use a special subop handler
  linux-user: Enable epoll_pwait syscall for ARM

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-07 10:41:48 +01:00
Mikhail Ilyin
1a1c4db9b2 translate-all.c: memory walker initial address miscalculation
The initial base address is miscalculated in walk_memory_regions().
It has to be shifted TARGET_PAGE_BITS more. Holder variables are
extended to target_ulong size otherwise they don't fit for MIPS N32
(a 32-bit ABI with a 64-bit address space) and qemu won't compile.
The issue led to incorrect debug output of memory maps and a
mis-formed coredumped file.

Signed-off-by: Mikhail Ilyin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-10-06 21:53:35 +03:00
Riku Voipio
d80a190594 linux-user: don't include timerfd if not needed
Without this, builds on older systems fail with:

qemu/linux-user/syscall.c:61:25: warning: sys/timerfd.h: No such file or directory

v2: fix the usual case where CONFIG_TIMERFD is enabled..

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-10-06 21:52:46 +03:00
Alexander Graf
e52a99f756 linux-user: Simplify timerid checks on g_posix_timers range
We check whether the passed in timer id is negative on all calls
that involve g_posix_timers.

However, these checks are bogus. First off we limit the timer_id to
16 bits which is not what Linux does. Then we check whether it's negative
which it can't be because we masked it.

We can safely remove the masking. For the negativity check we can just
treat the timerid as unsigned and only check for upper boundaries.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-10-06 21:52:45 +03:00
Alexander Graf
a59b5e35d1 linux-user: Convert blkpg to use a special subop handler
The blkpg ioctl can take different payloads depending on the opcode in
its payload structure. Create a new special ioctl handler that can only
deal with partition style ones for now.

This patch fixes running parted for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-10-06 21:52:45 +03:00
Peter Maydell
40645c7bfd linux-user: Enable epoll_pwait syscall for ARM
We have support for the epoll_pwait syscall, but it wasn't enabled for
ARM guests because we hadn't defined the syscall number; correct this
deficiency.

Reported-by: Dave Flogeras <dflogeras2@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-10-06 21:52:45 +03:00
Peter Maydell
2472b6c07b gdbstub: Allow target CPUs to specify watchpoint STOP_BEFORE_ACCESS flag
GDB assumes that watchpoint set via the gdbstub remote protocol will
behave in the same way as hardware watchpoints for the target. In
particular, whether the CPU stops with the PC before or after the insn
which triggers the watchpoint is target dependent. Allow guest CPU
code to specify which behaviour to use. This fixes a bug where with
guest CPUs which stop before the accessing insn GDB would manually
step forward over what it thought was the insn and end up one insn
further forward than it should be.

We set this flag for the CPU architectures which set
gdbarch_have_nonsteppable_watchpoint in gdb 7.7:
ARM, CRIS, LM32, MIPS and Xtensa.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Michael Walle <michael@walle.cc> (for lm32)
Message-id: 1410545057-14014-1-git-send-email-peter.maydell@linaro.org
2014-10-06 14:25:43 +01:00
Peter Maydell
507ef2f9fa Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Sat 04 Oct 2014 21:24:46 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (23 commits)
  blockdev-test: Test device_del after drive_del
  blockdev-test: Factor out some common code into helpers
  blockdev-test: Simplify by using g_assert_cmpstr()
  blockdev-test: Clean up bogus drive_add argument
  blockdev-test: Use single rather than double quotes in QMP
  drive_del-test: Merge of qdev-monitor-test, blockdev-test
  iotests: qemu-img info output for corrupt image
  qapi: Add corrupt field to ImageInfoSpecificQCow2
  iotests: Use _img_info
  util: Emancipate id_wellformed() from QemuOpts
  q35/ahci: Pick up -cdrom and -hda options
  qtest/bios-tables: Correct Q35 command line
  ide: Update ide_drive_get to be HBA agnostic
  pc/vl: Add units-per-default-bus property
  blockdev: Allow overriding if_max_dev property
  blockdev: Orphaned drive search
  qemu-iotests: Fix supported cache modes for 052
  make check-block: Use default cache modes
  Modify qemu_opt_rename to realize renaming all items in opts
  vmdk: Fix integer overflow in offset calculation
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-06 10:59:56 +01:00
Markus Armbruster
767c86d3e7 blockdev-test: Test device_del after drive_del
Executed in this order, drive_del and device_del's automatic drive
deletion take notoriously tricky special paths.

[Fixed "an device" -> "a device" typo as requested by Eric Blake.
--Stefan]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412261496-24455-7-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:28:39 +01:00
Markus Armbruster
2eea5cd452 blockdev-test: Factor out some common code into helpers
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412261496-24455-6-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:28:14 +01:00
Markus Armbruster
37e153fe45 blockdev-test: Simplify by using g_assert_cmpstr()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412261496-24455-5-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:28:14 +01:00
Markus Armbruster
e319df669d blockdev-test: Clean up bogus drive_add argument
The first argument should be a PCI address, which pci-addr=auto isn't.
Doesn't really matter, as drive_add ignores its first argument when
its second argument has if=none.  Clean it up anyway.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412261496-24455-4-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:28:13 +01:00
Markus Armbruster
d0e3866837 blockdev-test: Use single rather than double quotes in QMP
QMP accepts both single and double quotes.  This is the only test
using double quotes.  They need to be quoted in C strings.  Replace
them by single quotes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412261496-24455-3-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:28:13 +01:00
Markus Armbruster
e2f3f22188 drive_del-test: Merge of qdev-monitor-test, blockdev-test
Each of qdev-monitor-test and blockdev-test has just one test case,
and both are about drive_del.

[Extended copyright from 2013 to 2013-2014 as requested by Eric Blake.
--Stefan]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412261496-24455-2-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:27:29 +01:00
Max Reitz
f383611a0a iotests: qemu-img info output for corrupt image
The "corrupt" entry in the format-specific information section should be
"true".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412105489-7681-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:18:17 +01:00
Max Reitz
9009b1963c qapi: Add corrupt field to ImageInfoSpecificQCow2
Just like lazy-refcounts, this field will be present iff the qcow2
compat level is 1.1 (or probably any future revision).

As expected, this breaks some tests due to the new field present in
qemu-img info output; so fix their output accordingly.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412105489-7681-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:18:17 +01:00
Max Reitz
1b53eab270 iotests: Use _img_info
qemu-img info should only be used directly if the format-specific
information or the name of the format is relevant (some tests explicitly
test format-specific information; test 082 uses qcow2-specific settings
to test the qemu-img interface); otherwise, tests should always use
_img_info instead.

Test 082 was touched only partially. It does test the qemu-img
interface; however, its invocations of qemu-img info are not real tests
but rather verifications, so if format-specific information is not
important for the test, there is no reason not to use _img_info. In
contrast to directly invoking qemu-img info, "qcow2" is replaced by
"IMGFMT"; but as "qcow2" is only mentioned once in test 082 (in
_supported_fmt), I consider this an improvement.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412105489-7681-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:18:17 +01:00
Eduardo Habkost
0d15da8e6f accel: Rename 'init' method to 'init_machine'
Today, all accelerator init functions affect some global state:
* tcg_init() calls tcg_exec_init() and affects globals such as tcg_tcx,
  page size globals, and possibly others;
* kvm_init() changes the kvm_state global, cpu_interrupt_handler, and possibly
  others;
* xen_init() changes the xen_xc global, and registers a change state handler.

With the new accelerator QOM classes, initialization may now be split in two
steps:
* instance_init() will do basic initialization that doesn't affect any global
  state and don't need MachineState or MachineClass data. This will allow
  probing code to safely create multiple accelerator objects on the fly just
  for reporting host/accelerator capabilities, for example.
* accel_init_machine()/init_machine() will save the accelerator object in
  MachineState, and do initialization steps which still affect global state,
  machine state, or that need data from MachineClass or MachineState.

To clarify the difference between those two steps, rename init() to
init_machine().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:16 +02:00
Eduardo Habkost
d95c8527e9 accel: Move accel init/allowed code to separate function
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:16 +02:00
Eduardo Habkost
32592e112f accel: Remove tcg_available() function
As the function always return 1, it is not needed anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
3a6ce5147f accel: Move qtest accel registration to qtest.c
As qtest_availble() returns 1 only when CONFIG_POSIX is set, keep
setting AccelClass.available to keep current behavior (this is different
from what we did for KVM and Xen).

This also allows us to make qtest_init_accel() static.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
b152b05a35 accel: Move Xen registration code to xen-common.c
Note that this has an user-visible side-effect: instead of reporting
"Xen is not supported for this target", QEMU binaries not supporting Xen
will report "xen accelerator does not exist".

As xen_available() always return 1 when CONFIG_XEN is enabled, we don't
need to set AccelClass.available anymore. xen_enabled() is not being
removed yet, but only because vl.c is still using it.

This also allows us to make xen_init() static.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
782c3f2939 accel: Move KVM accel registration to kvm-all.c
Note that this has an user-visible side-effect: instead of reporting
"KVM is not supported for this target", QEMU binaries not supporting KVM
will report "kvm accelerator does not exist".

As kvm_availble() always return 1 when CONFIG_KVM is enabled, we don't
need to set AccelClass.available anymore. kvm_enabled() is not being
completely removed yet only because qmp_query_kvm() still uses it.

This also allows us to make kvm_init() static.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
b31f9acaaa accel: Report unknown accelerator as "not found" instead of "does not exist"
As the accelerator classes won't be registered anymore if they are not
enabled at compile time, saying "does not exist" may be misleading, as
the accelerator may be simply disabled. Change the wording to just say
"not found".

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
f6dfb83547 accel: Make AccelClass.available() optional
When we move accel classes outside accel.c, the available() function
won't be necessary anymore, because the classes will be registered only
if the accelerator code is really enabled at build time.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
b14a0b7469 accel: Use QOM classes for accel types
Instead of having a static AccelType array, register a class for each
accelerator type, and use class name lookup to find accelerator
information.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
a224655200 accel: Move accel name lookup to separate function
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
e8b466ef95 accel: Simplify configure_accelerator() using AccelType *acc variable
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
e54adde615 accel: Create AccelType typedef
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
a1a9cb0ccd accel: Move accel code to accel.c
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Eduardo Habkost
2067444945 vl.c: Small coding style fix
Just to make checkpatch.pl happy when moving the code.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Corey Minyard
01ca519f24 qemu-char: Print the remote and local addresses for a socket
It seems that it might be a good idea to know what is at the remote
end of a socket for tracking down issues.  So add that to the
socket filename.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Corey Minyard
5dd1f02b4b qemu-char: Add reconnecting to client sockets
Adds a "reconnect" option to socket backends that gives a reconnect
timeout.  This only applies to client sockets.  If the other end
of a socket closes the connection, qemu will attempt to reconnect
after the given number of seconds.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Corey Minyard
16cc4ffe34 qemu-char: set socket filename to disconnected when not connected
This way we can tell if the socket is connected or not.  It also splits
the string conversions out into separate functions to make this more
convenient.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Corey Minyard
cfb429cb1a qemu-char: Move some items into TCPCharDriver
This keeps them from having to be passed around and makes them
available for later functions, like printing and reconnecting.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Corey Minyard
43ded1a0d2 qemu-char: Rework qemu_chr_open_socket() for reconnect
Move all socket configuration to qmp_chardev_open_socket().
qemu_chr_open_socket_fd() just opens the socket.  This is getting ready
for the reconnect code, which will call open_sock_fd() on a reconnect
attempt.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Corey Minyard
9f781168c5 qemu-char: Make the filename size for a chardev a #define
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:14 +02:00
Markus Armbruster
f5bebbbb28 util: Emancipate id_wellformed() from QemuOpts
IDs have long spread beyond QemuOpts: not everything with an ID
necessarily goes through QemuOpts.  Commit 9aebf3b is about such a
case: block layer names are meant to be well-formed IDs, but some of
them don't go through QemuOpts, and thus weren't checked.  The commit
fixed that the straightforward way: rename the internal QemuOpts
helper id_wellformed() to qemu_opts_id_wellformed() and give it
external linkage.

Instead of using it directly in block.c, the commit adds wrapper
bdrv_is_valid_name(), probably to hide the connection to QemuOpts.

Go one logical step further: emancipate IDs from QemuOpts.  Rename the
function back to id_wellformed(), and put it in another file.  While
there, clean up its value to bool.  Peel off the bdrv_is_valid_name()
wrapper.

[Replaced stray return 0 with return false to match bool returns used
elsewhere in id_wellformed().
--Stefan]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
d93162e13c q35/ahci: Pick up -cdrom and -hda options
This patch implements the backend for the Q35 board
for us to be able to pick up and use drives defined
by the -cdrom, -hda, or -drive if=ide shorthand options.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
6b9e03a4e7 qtest/bios-tables: Correct Q35 command line
If the Q35 board types are to begin recognizing
and decoding syntactic sugar for drive/device
declarations, then workarounds found within
the qtests suite need to be adjusted to prevent
any test failures after the fix.

bios-tables-test improperly uses this cli:
-drive file=etc,id=hd -device ide-hd,drive=hd

Which will create a drive and device due to
the lack of specifying if=none. Then, it will
attempt to create a second device and fail.

This patch corrects this test to always use
the full, non-sugared -device/-drive syntax
for both PC and Q35.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
d8f94e1bb2 ide: Update ide_drive_get to be HBA agnostic
Instead of duplicating the logic for the if_ide
(bus,unit) mappings, rely on the blockdev layer
for managing those mappings for us, and use the
drive_get_by_index call instead.

This allows ide_drive_get to work for AHCI HBAs
as well, and can be used in the Q35 initialization.

Lastly, change the nature of the argument to
ide_drive_get so that represents the number of
total drives we can support, and not the total
number of buses. This will prevent array overflows
if the units-per-default-bus property ever needs
to be adjusted for compatibility reasons.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
1602651833 pc/vl: Add units-per-default-bus property
This patch adds the 'units_per_default_bus' property which
allows individual boards to declare their desired
index => (bus,unit) mapping for their default HBA, so that
boards such as Q35 can specify that its default if_ide HBA,
AHCI, only accepts one unit per bus.

This property only overrides the mapping for drives matching
the block_default_type interface.

This patch also adds this property to *all* past and present
Q35 machine types. This retroactive addition is justified
because the previous erroneous index=>(bus,unit) mappings
caused by lack of such a property were not utilized due to
lack of initialization code in the Q35 init routine.

Further, semantically, the Q35 board type has always had the
property that its default HBA, AHCI, only accepts one unit per
bus. The new code added to add devices to drives relies upon
the accuracy of this mapping. Thus, the property is applied
retroactively to reduce complexity of allowing IDE HBAs with
different units per bus.

Examples:

Prior to this patch, all IDE HBAs were assumed to use 2 units
per bus (Master, Slave). When using Q35 and AHCI, however, we
only allow one unit per bus.

-hdb foo.qcow2 would become index=1, or bus=0,unit=1.
-hdd foo.qcow2 would become index=3, or bus=1,unit=1.
-drive file=foo.qcow2,index=5 becomes bus=2,unit=1.

These are invalid for AHCI. They now become, under Q35 only:

-hdb foo.qcow2 --> index=1, bus=1, unit=0.
-hdd foo.qcow2 --> index=3, bus=3, unit=0.
-drive file=foo.qcow2,index=5 --> bus=5,unit=0.

The mapping is adjusted based on the fact that the default IF
for the Q35 machine type is IF_IDE, and units-per-default-bus
overrides the IDE mapping from its default of 2 units per bus
to just 1 unit per bus.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
21dff8cf38 blockdev: Allow overriding if_max_dev property
The if_max_devs table as in the past been an immutable
default that controls the mapping of index => (bus,unit)
for all boards and all HBAs for each interface type.

Since adding this mapping information to the HBA device
itself is currently unwieldly from the perspective of
retrieving this information at option parsing time
(e.g, within drive_new), we consider the alternative
of marking the if_max_devs table mutable so that
later configuration and initialization can adjust the
mapping at will, but only up until a drive is added,
at which point the mapping is finalized.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1412187569-23452-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
a66c9dc734 blockdev: Orphaned drive search
When users use command line options like -hda, -cdrom,
or even -drive if=ide, it is up to the board initialization
routines to pick up these drives and create backing
devices for them.

Some boards, like Q35, have not been doing this.
However, there is no warning explaining why certain
drive specifications are just silently ignored,
so this function adds a check to print some warnings
to assist users in debugging these sorts of issues
in the future.

This patch will not warn about drives added with if_none,
for which it is not possible to tell in advance if
the omission of a backing device is an issue.

A warning in these cases is considered appropriate.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1412187569-23452-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Kevin Wolf
cf77b2d25e qemu-iotests: Fix supported cache modes for 052
The requirement for this test case is really "no O_DIRECT", because the
temporary snapshot for BDRV_O_SNAPSHOT is created in /tmp, which often
is a tmpfs.

Commit f210a83c ('qemu-iotests: Add _default_cache_mode and
_supported_cache_modes') turned the restriction into writethrough-only,
but that's not really necessary.

Allow to run the test for any non-O_DIRECT cache modes, and use the
global default of writeback if no cache mode is specified.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1412076430-11623-3-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Kevin Wolf
d9323e9b20 make check-block: Use default cache modes
When qemu-iotests only gave a choice between cache=none and
cache=writethrough, we picked cache=none because it was the option that
would complete the test in finite time. Some tests could only work for
one of the two options and would be skipped with cache=none, but that
was an acceptable trade-off at the time.

Today, however, qemu-iotests is a bit more flexible than that and you
can specify any of the cache modes supported by qemu. The default is
writeback, like in qemu, which is fast and (unlike cache=none) compatible
with any host filesystem. Test cases that have specific requirements for
the cache mode can also specify a different default.

In order to get a fast test run that works everywhere and doesn't skip
tests that need a different cache mode, not specifying any cache mode
and instead relying on the default is the best we can do today.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1412076430-11623-2-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Jun Li
20d6cd47d0 Modify qemu_opt_rename to realize renaming all items in opts
Add realization of rename all items in opts for qemu_opt_rename.
e.g:
When add bps twice in command line, need to rename all bps to
throttling.bps-total.

This patch solved following bug:
Bug 1145586 - qemu-kvm will give strange hint when add bps twice for a drive
ref:https://bugzilla.redhat.com/show_bug.cgi?id=1145586

[Resolved conflict with commit 5abbf0ee4d
("block: Catch simultaneous usage of options and their aliases").  Check
for simultaneous use first, and then loop over all options.
--Stefan]

Signed-off-by: Jun Li <junmuzi@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1411537527-16715-1-git-send-email-junmuzi@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Fam Zheng
d1319b077a vmdk: Fix integer overflow in offset calculation
This fixes the bug introduced by commit c6ac36e (vmdk: Optimize cluster
allocation).

$ ~/build/master/qemu-io /stor/vm/arch.vmdk -c 'write 2G 1k'
write failed: Invalid argument

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1411437381-11234-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Markus Armbruster
fbf28a4328 block: Drop superfluous conditionals around qemu_opts_del()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1411999675-14533-1-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Richard W.M. Jones
18fe46d79a ssh: Don't crash if either host or path is not specified.
$ ./qemu-img create -f qcow2 overlay \
    -b 'json: { "file.driver":"ssh",
                "file.host":"localhost",
                "file.host_key_check":"no" }'
qemu-img: qobject/qdict.c:193: qdict_get_obj: Assertion `obj != ((void *)0)' failed.
Aborted

A similar crash also happens if the file.host field is omitted.

https://bugzilla.redhat.com/show_bug.cgi?id=1147343

Bug found and reported by Jun Li.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Zhang Haoyu
af95738754 snapshot: fix referencing wrong variable in while loop in do_delvm
The while loop variabal is "bs1",
but "bs" is always passed to bdrv_snapshot_delete_by_id_or_name.
Broken in commit a89d89d, v1.7.0.

Signed-off-by: Zhang Haoyu <zhanghy@sangfor.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
Peter Maydell
b00a0ddb31 Merge remote-tracking branch 'remotes/kraxel/tags/pull-input-20141002-1' into staging
input monitor patches: fix send-key release ordering
and new input-send-event command

# gpg: Signature made Thu 02 Oct 2014 09:10:44 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-input-20141002-1:
  add input-send-event command
  input: fix send-key monitor command release event ordering

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-02 15:01:48 +01:00
Peter Maydell
53b98718f1 Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20141002-1' into staging
pixman: fix qemu_default_pixman_format (32bpp non-native endian)

# gpg: Signature made Thu 02 Oct 2014 08:47:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20141002-1:
  pixman: fix qemu_default_pixman_format (32bpp non-native endian)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-02 13:23:56 +01:00
Peter Maydell
fbcaca994d Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20141002-1' into staging
vga: cleanups, prepare for endianness switching

# gpg: Signature made Thu 02 Oct 2014 08:10:49 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20141002-1:
  vga: Add endian to vmstate
  vga: Make fb endian a common state variable
  vga: Rename vga_template.h to vga-helpers.h
  vga: Remove some "should be done in BIOS" comments
  cirrus: Remove non-32bpp cursor drawing
  vga: Simplify vga_draw_blank() a bit
  vga: Remove rgb_to_pixel indirection
  vga: Separate LE and BE conversion functions
  vga: Remove remainder of old conversion cruft
  vga: Start cutting out non-32bpp conversion support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-02 12:28:50 +01:00
Marcelo Tosatti
50c6617fcb add input-send-event command
Which allows specification of absolute/relative,
up/down and console parameters.

Suggested by Gerd Hoffman.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-02 09:58:14 +02:00
Gerd Hoffmann
e37f202450 input: fix send-key monitor command release event ordering
commit 2e377f1730 changed the ordering
of the release events as side effect.  Some guests are not happy with
that and don't recognise ctrl-alt-del any more.  This patch restores
the old last-pressed first-released behavior.

Cc: Amos Kong <akong@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-02 09:58:14 +02:00
Peter Maydell
1831e15060 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
This update brings dataplane to virtio-scsi (NOT
yet 100% thread-safe, though, which makes it really, really
experimental.  It also brings asynchronous cancellation to
the SCSI subsystem and implements it in virtio-scsi.  This
is a pretty important feature.  Almost all the work here
was done by Fam Zheng.

I also included the virtio refcount fixes from Gonglei,
because they had a small conflict with virtio-scsi dataplane.

This pull request is using the new subkey 4E6B09D7.

# gpg: Signature made Tue 30 Sep 2014 12:31:02 BST using RSA key ID 4E6B09D7
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream: (39 commits)
  block/iscsi: handle failure on malloc of the allocationmap
  util: introduce bitmap_try_new
  virtio-scsi: Handle TMF request cancellation asynchronously
  scsi: Introduce scsi_req_cancel_async
  scsi: Introduce scsi_req_cancel_complete
  scsi: Drop SCSIReqOps.cancel_io
  scsi: Unify request unref in scsi_req_cancel
  scsi-generic: Handle canceled request in scsi_command_complete
  scsi: Drop scsi_req_abort
  virtio-scsi: Process ".iothread" property
  virtio-scsi: Call bdrv_io_plug/bdrv_io_unplug in cmd request handling
  virtio-scsi: Batched prepare for cmd reqs
  virtio-scsi: Two stages processing of cmd request
  virtio-scsi: Add migration state notifier for dataplane code
  virtio-scsi: Hook up with dataplane
  virtio-scsi-dataplane: Code to run virtio-scsi on iothread
  virtio-scsi: Add VirtIOSCSIVring in VirtIOSCSIReq
  virtio-scsi: Add 'iothread' property to virtio-scsi
  virtio: add a wrapper for virtio-backend initialization
  virtio-9p: fix virtio-9p child refcount in transports
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-30 16:45:35 +01:00
Peter Maydell
7be3c1408a Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20140930-1' into staging
ac97: register reset via qom

# gpg: Signature made Tue 30 Sep 2014 12:32:29 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-audio-20140930-1:
  ac97: register reset via qom

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-30 15:03:27 +01:00
Peter Maydell
2e456b2b60 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio, misc bugfixes

A bunch of bugfixes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 29 Sep 2014 17:59:57 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  vl: Adjust the place of calling mlockall to speedup VM's startup
  pc-dimm: Don't check dimm->node when there is non-NUMA config
  pci-hotplug-old: avoid losing error message
  Revert "virtio-pci: fix migration for pci bus master"
  loader: g_realloc(p, 0) frees and returns NULL, simplify

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-30 13:09:40 +01:00
Benjamin Herrenschmidt
c3b1060514 vga: Add endian to vmstate
Include the endian state in the migration stream as an optional
subsection which we only include when the endian isn't the default,
thus enabling backward compatibility of the common case.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Changes by kraxel:
 * Remove bochs dispi interface changes.  We'll do that in
   a different way to make sure we don't conflict with
   possible future bochs dispi interface changes.
 * keep live migration bits.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
2c7d8736af vga: Make fb endian a common state variable
And initialize it based on target endian

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
e657d8ef3c vga: Rename vga_template.h to vga-helpers.h
It's no longer a template, we only instanciate the file once.

Keep it a #included file so the functions remain static.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
ace89b8ff2 vga: Remove some "should be done in BIOS" comments
Not all platforms have a VGA BIOS, powerpc typically relies on
using the DISPI interface to initialize the card.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
70a041fe2c cirrus: Remove non-32bpp cursor drawing
We only draw cursor on non-shared surfaces (so it seems...) and
these are always 32bpp

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
2c79f2a2ec vga: Simplify vga_draw_blank() a bit
The test for surface_bits_per_pixel() isn't necessary anymore,
the 8bpp case never happens.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
d3c2343af0 vga: Remove rgb_to_pixel indirection
We always use rgb_to_pixel32 nowadays.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
46c3a8c8eb vga: Separate LE and BE conversion functions
Provide different functions for converting from an LE vs a BE
framebuffer. We cannot rely on the simple cases always being
shared surfaces since cirrus will need to always shadow for
cursor emulation, so we need the full set of functions to
be able to later handle runtime switching.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>\
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
d2e043a804 vga: Remove remainder of old conversion cruft
All the macros used to generate different versions of vga_template.h
are now unnecessary, take them all out and remove the _32 suffix from
most functions.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Benjamin Herrenschmidt
9e057c0b09 vga: Start cutting out non-32bpp conversion support
Nowadays, we either share a surface with the host, or we create
a 32bpp ARGB console surface.

So we only need to draw/convert to 32bpp, enabling us to remove
all but one instance of vga_template.h inclusion (to be further
cleaned up), rgb_to_pixel_* etc...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2014-09-30 13:34:09 +02:00
Gerd Hoffmann
89ec031b09 pixman: fix qemu_default_pixman_format (32bpp non-native endian)
Bug breaks SDL display of bigendian guests on little endian hosts.

Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Reported-by: Valentin Manea <valentin.manea@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-30 13:34:04 +02:00
Peter Lieven
a9fe4c957b block/iscsi: handle failure on malloc of the allocationmap
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Peter Lieven
be4d57c1ea util: introduce bitmap_try_new
regular bitmap_new simply aborts if the memory allocation fails.
bitmap_try_new returns NULL on failure and allows for proper
error handling.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
49e7e31aa0 virtio-scsi: Handle TMF request cancellation asynchronously
For VIRTIO_SCSI_T_TMF_ABORT_TASK and VIRTIO_SCSI_T_TMF_ABORT_TASK_SET,
use scsi_req_cancel_async to start the cancellation.

Because each tmf command may cancel multiple requests, we need to use a
counter to track the number of remaining requests we still need to wait
for.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
8e0a9320e9 scsi: Introduce scsi_req_cancel_async
Devices will call this function to start an asynchronous cancellation. The
bus->info->cancel will be called after the request is canceled.

Devices will probably need to track a separate TMF request that triggers this
cancellation, and wait until the cancellation is done before completing it. So
we store a notifier list in SCSIRequest and in scsi_req_cancel_complete we
notify them.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
d5776465ee scsi: Introduce scsi_req_cancel_complete
Let the aio cb do the clean up and notification job after scsi_req_cancel, in
preparation for asynchronous cancellation.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
a83cfd12d9 scsi: Drop SCSIReqOps.cancel_io
The only two implementations are identical to each other, with nothing specific
to device: they only call bdrv_aio_cancel with the SCSIRequest.aiocb.

Let's move it to scsi-bus.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
3df9caf88f scsi: Unify request unref in scsi_req_cancel
Before, scsi_req_cancel will take ownership of the canceled request and unref
it. We did this because we didn't know whether AIO CB will be called or not
during the cancelling, so we set the io_canceled flag before calling it, and
skip unref in the potentially called callbacks, which is not very nice.

Now, bdrv_aio_cancel has a stricter contract that the completion callbacks are
always called, so we can remove the checks of req->io_canceled and just unref
it in callbacks.

It will also make implementing asynchronous cancellation easier.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
6c25fa6cf8 scsi-generic: Handle canceled request in scsi_command_complete
Now that we always called the cb in bdrv_aio_cancel, let's make scsi-generic
callbacks check io_canceled flag similarly to scsi-disk.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:51 +02:00
Fam Zheng
eda470e41a scsi: Drop scsi_req_abort
The only user of this function is spapr_vscsi.c. We can convert to
scsi_req_cancel plus adding a check in vscsi_request_cancelled.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
[Drop prototype. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 13:30:50 +02:00
Peter Maydell
45c270b1ea Merge remote-tracking branch 'remotes/rth/tags/tcg-next-201400729' into staging
tcg updates

# gpg: Signature made Mon 29 Sep 2014 19:58:04 BST using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/tcg-next-201400729:
  tcg: Always enable TCGv type checking
  qemu/compiler: Define QEMU_ARTIFICIAL
  tcg-aarch64: Use 32-bit loads for qemu_ld_i32
  tcg-sparc: Use UMULXHI instruction
  tcg-sparc: Rename ADDX/SUBX insns
  tcg-sparc: Use ADDXC in setcond_i64
  tcg-sparc: Fix setcond_i32 uninitialized value
  tcg-sparc: Use ADDXC in addsub2_i64
  tcg-sparc: Support addsub2_i64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-30 11:52:41 +01:00
Peter Maydell
29429c7244 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140929' into staging
target-arm:
 * more EL2/EL3 preparation work
 * don't handle c15_cpar changes via tb_flush()
 * fix some unused function warnings in ARM devices
 * build the GDB XML for 32 bit CPUs into qemu-*-aarch64
 * implement guest breakpoint support

# gpg: Signature made Mon 29 Sep 2014 19:25:37 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140929:
  target-arm: Add support for VIRQ and VFIQ
  target-arm: Add IRQ and FIQ routing to EL2 and 3
  target-arm: A64: Emulate the SMC insn
  target-arm: Add a Hypervisor Trap exception type
  target-arm: A64: Emulate the HVC insn
  target-arm: A64: Correct updates to FAR and ESR on exceptions
  target-arm: Don't take interrupts targeting lower ELs
  target-arm: Break out exception masking to a separate func
  target-arm: A64: Refactor aarch64_cpu_do_interrupt
  target-arm: Add SCR_EL3
  target-arm: Add HCR_EL2
  target-arm: Don't handle c15_cpar changes via tb_flush()
  hw/input/tsc210x.c: Delete unused array tsc2101_rates
  hw/display/pxa2xx_lcd.c: Remove unused function pxa2xx_dma_rdst_set
  hw/intc/imx_avic.c: Remove unused function imx_avic_set_prio()
  hw/display/blizzard.c: Delete unused function blizzard_rgb2yuv
  configure: Build GDB XML for 32 bit ARM CPUs into qemu aarch64 binaries
  target-arm: Implement handling of breakpoint firing
  target-arm: Implement setting guest breakpoints

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-30 11:02:06 +01:00
Fam Zheng
9786b592a9 virtio-scsi: Process ".iothread" property
We are ready, now let's effectively enable dataplane.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
5170f40b10 virtio-scsi: Call bdrv_io_plug/bdrv_io_unplug in cmd request handling
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
1880ad4f4e virtio-scsi: Batched prepare for cmd reqs
Queue the popped requests while calling
virtio_scsi_handle_cmd_req_prepare(), then submit them after all
prepared.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
359eea71d9 virtio-scsi: Two stages processing of cmd request
Mechanical change, in preparation for bdrv_io_plug/bdrv_io_unplug.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
dfb37cf7fa virtio-scsi: Add migration state notifier for dataplane code
Similar to virtio-blk-dataplane, we stop the iothread while migration
starts and restart it when migration finishes.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
63c7e54268 virtio-scsi: Hook up with dataplane
This enables the virtio-scsi-dataplane code by setting the iothread
in virtio-scsi device, and makes any function that is called by
back from dataplane to cooperate with the caller: they need to be
vring/iothread aware when handling the requests and using scsi devices
on the bus.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
91cb1c9b56 virtio-scsi-dataplane: Code to run virtio-scsi on iothread
This implements the core part of dataplane feature of virtio-scsi.

A few fields are added in VirtIOSCSICommon to maintain the dataplane
status. These fields are managed by a new source file:
virtio-scsi-dataplane.c.

Most code in this file will run on an iothread, unless otherwise
commented as in a global mutex context, such as those functions to
start, stop and setting the iothread property.

Upon start, we set up guest/host event notifiers, in a same way as
virtio-blk does. The handlers then pop request from vring and call into
virtio-scsi.c functions to process it. So we need to make sure make all
those called functions work with iothread, too.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
244e2898b7 virtio-scsi: Add VirtIOSCSIVring in VirtIOSCSIReq
Move VirtIOSCSIReq to header and add one field "vring" as a wrapper
structure of Vring, VirtIOSCSIVring.

This is necessary for coming dataplane code that runs uses vring on
iothread.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Fam Zheng
19d339f11d virtio-scsi: Add 'iothread' property to virtio-scsi
Similar to this property in virtio-blk for dataplane, add it as a QOM
link in virtio-scsi and an alias in virtio-scsi-pci and virtio-scsi-ccw,
in order to assign an iothread to the device.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:11:20 +02:00
Gonglei
c8075caf19 virtio: add a wrapper for virtio-backend initialization
For better code sharing, add a helper function that handles
reference counting of the virtio backend for virtio proxy devices.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:59 +02:00
Gonglei
8f3d60e568 virtio-9p: fix virtio-9p child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is
dropped again when the property is deleted.

The upshot of this is that we always have a refcount >= 1. Upon
unplug the virtio-9p child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:59 +02:00
Gonglei
48833071d9 virtio-9p: use aliases instead of duplicate qdev properties
virtio-9p-pci all duplicate the qdev properties of their
V9fsState child. This approach does not work well with
string or pointer properties since we must be careful
about leaking or double-freeing them.

Use the QOM alias property to forward property accesses to the
V9fsState child.  This way no duplication is necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:56 +02:00
Gonglei
91ba212088 virtio-balloon: fix virtio-balloon child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-balloon child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:31 +02:00
Gonglei
352fa88dfb virtio-rng: fix virtio-rng child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-rng child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:24 +02:00
Gonglei
8ee486ae33 virtio-rng: use aliases instead of duplicate qdev properties
virtio-rng-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIORNG child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.

Use the QOM alias property to forward property accesses to the
VirtIORNG child.  This way no duplication is necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:21 +02:00
Gonglei
e77ca8b92a virtio-serial: fix virtio-serial child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-serial child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:12 +02:00
Gonglei
4f456d8025 virtio-serial: use aliases instead of duplicate qdev properties
virtio-serial-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIOSerial child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.

Use the QOM alias property to forward property accesses to the
VirtIOSerial child.  This way no duplication is necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:09:03 +02:00
Gonglei
1312f12bcc virtio/vhost-scsi: fix virtio-scsi/vhost-scsi child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-scsi/vhost-scsi child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:08:56 +02:00
Gonglei
c39343fd81 virtio/vhost-scsi: use aliases instead of duplicate qdev properties
{virtio, vhost}-scsi-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIOSCSI/VHostSCSI child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.

Use the QOM alias property to forward property accesses to the
VirtIOSCSI/VHostSCSI child. This way no duplication is necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:08:41 +02:00
Gonglei
6a0c6b5978 virtio-net: fix virtio-net child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-net child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:08:38 +02:00
Gonglei
7779edfeb1 virtio-net: use aliases instead of duplicate qdev properties
virtio-net-pci, virtio-net-s390, and virtio-net-ccw all duplicate the
qdev properties of their VirtIONet child. This approach does not work
well with string or pointer properties since we must be careful about
leaking or double-freeing them.

Use the QOM alias property to forward property accesses to the
VirtIONet child.  This way no duplication is necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-30 11:08:16 +02:00
Richard Henderson
b6c73a6d45 tcg: Always enable TCGv type checking
Instead of using structures, which imply some amount of overhead
on certain ABIs, use pointer types.

This actually reduces the size of the binaries vs a NON-debug
build on ppc64 and x86_64, due to a reduction in the number of
sign-extension insns.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:28 -04:00
Richard Henderson
58099c8066 qemu/compiler: Define QEMU_ARTIFICIAL
The combination of always_inline + artificial allows tiny inline
functions to be written that do not interfere with debugging.
In particular, gdb will not step into an artificial function.

The always_inline attribute was introduced in gcc 4.2,
and the artificial attribute was introduced in gcc 4.3.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:28 -04:00
Richard Henderson
9c53889ba3 tcg-aarch64: Use 32-bit loads for qemu_ld_i32
The "old" qemu_ld opcode did not specify the size of the result,
and so we had to assume full register width.  With the new opcodes,
we can narrow the result.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:28 -04:00
Richard Henderson
de8301e542 tcg-sparc: Use UMULXHI instruction
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:27 -04:00
Richard Henderson
c470b663f7 tcg-sparc: Rename ADDX/SUBX insns
The pre-v9 ADDX/SUBX insns were renamed ADDC/SUBC for v9.
Standardizing on the v9 name makes things less confusing.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:27 -04:00
Richard Henderson
9d6a7a8542 tcg-sparc: Use ADDXC in setcond_i64
Similar to the ADDC tricks we use in setcond_i32.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:27 -04:00
Richard Henderson
321b6c0585 tcg-sparc: Fix setcond_i32 uninitialized value
We failed to swap c1 and c2 correctly for NE c2 == 0.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:27 -04:00
Richard Henderson
90379ca84e tcg-sparc: Use ADDXC in addsub2_i64
On T4 and newer Sparc chips we have an add-with-carry insn
that takes its input from %xcc instead of %icc.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:27 -04:00
Richard Henderson
609ac1e164 tcg-sparc: Support addsub2_i64
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-09-29 14:55:26 -04:00
Peter Maydell
70d3a7a7b8 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140929-1' into staging
add and use graphic_console_set_hwops

# gpg: Signature made Mon 29 Sep 2014 11:18:37 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140929-1:
  qxl: use graphic_console_set_hwops
  console: add graphic_console_set_hwops

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 19:28:15 +01:00
Edgar E. Iglesias
136e67e9b5 target-arm: Add support for VIRQ and VFIQ
This only implements the external delivery method via the GIC.

Acked-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-12-git-send-email-edgar.iglesias@gmail.com
[PMM: adjusted following cpu-exec refactoring]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:51 +01:00
Edgar E. Iglesias
041c96666d target-arm: Add IRQ and FIQ routing to EL2 and 3
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-11-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:51 +01:00
Edgar E. Iglesias
e0d6e6a5e7 target-arm: A64: Emulate the SMC insn
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-10-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:50 +01:00
Edgar E. Iglesias
607d98b81e target-arm: Add a Hypervisor Trap exception type
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-9-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:50 +01:00
Edgar E. Iglesias
35979d71c4 target-arm: A64: Emulate the HVC insn
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-8-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:50 +01:00
Edgar E. Iglesias
2dd081ae76 target-arm: A64: Correct updates to FAR and ESR on exceptions
Not all exception types update both FAR and ESR.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-7-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:50 +01:00
Edgar E. Iglesias
dfafd09088 target-arm: Don't take interrupts targeting lower ELs
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-6-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:49 +01:00
Edgar E. Iglesias
043b7f8d12 target-arm: Break out exception masking to a separate func
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-5-git-send-email-edgar.iglesias@gmail.com
[PMM: updated to account for recent cpu-exec refactoring]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:49 +01:00
Edgar E. Iglesias
9e729b57ac target-arm: A64: Refactor aarch64_cpu_do_interrupt
Introduce new_el and new_mode in preparation for future patches
that add support for taking exceptions to and from EL2 and 3.
No functional change.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-4-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:49 +01:00
Edgar E. Iglesias
64e0e2de0c target-arm: Add SCR_EL3
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-3-git-send-email-edgar.iglesias@gmail.com
[PMM: apply offsetoflow32() to correct regdef]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:49 +01:00
Edgar E. Iglesias
f149e3e847 target-arm: Add HCR_EL2
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1411718914-6608-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:48:48 +01:00
Peter Maydell
c0f4af1719 target-arm: Don't handle c15_cpar changes via tb_flush()
At the moment we try to handle c15_cpar with the strategy of:
 * emit generated code which makes assumptions about its value
 * when the register value changes call tb_flush() to throw
   away the now-invalid generated code
This works because XScale CPUs are always uniprocessor, but
it's confusing because it suggests that the same approach can
be taken for other registers. It also means we do a tb_flush()
on CPU reset, which makes multithreaded linux-user binaries
even more likely to fail than would otherwise be the case.

Replace it with a combination of TB flags for the access
checks done on cp0/cp1 for the XScale and iwMMXt instructions,
plus a runtime check for cp2..cp13 coprocessor accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1411056959-23070-1-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:48 +01:00
Peter Maydell
f59492b984 hw/input/tsc210x.c: Delete unused array tsc2101_rates
The array tsc2101_rates[] is unused (and we don't implement
the TSC2101 anyway, only the 2102); delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410723223-17711-5-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:48 +01:00
Peter Maydell
6ac285b824 hw/display/pxa2xx_lcd.c: Remove unused function pxa2xx_dma_rdst_set
The function pxa2xx_dma_rdst_set() is unused; delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410723223-17711-4-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:47 +01:00
Peter Maydell
93c9aea9b4 hw/intc/imx_avic.c: Remove unused function imx_avic_set_prio()
The function imx_avic_set_prio() is unused; delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410723223-17711-3-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:47 +01:00
Peter Maydell
65731d1c3e hw/display/blizzard.c: Delete unused function blizzard_rgb2yuv
The function blizzard_rgb2yuv() is unused; delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410723223-17711-2-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:47 +01:00
Peter Maydell
8f95ce2e4c configure: Build GDB XML for 32 bit ARM CPUs into qemu aarch64 binaries
The qemu-aarch64 and qemu-system-aarch64 binaries include support
for all the 32 bit ARM CPUs as well as the 64 bit ones. This means
we need to build in the GDB XML files for the 32 bit CPUs too.
Otherwise gdb will complain:
 warning: while parsing target description (at line 1): Could not load XML document "arm-core.xml"
when you try to connect to our gdbserver to debug a 32 bit CPU
running in a qemu-aarch64 or qemu-system-aarch64 binary.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410533739-13836-1-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:47 +01:00
Peter Maydell
0eacea7060 target-arm: Implement handling of breakpoint firing
Implement handling of breakpoint event firing to correctly
inject the debug exception into the guest.

Since the breakpoint and watchpoint control register format is
very similar we adjust wp_matches() to also handle breakpoints
as well rather than using a separate function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410523465-13400-3-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:46 +01:00
Peter Maydell
46747d1508 target-arm: Implement setting guest breakpoints
This patch adds support for setting guest breakpoints
based on values the guest writes to the DBGBVR and DBGBCR
registers. (It doesn't include the code to handle when
these breakpoints fire, so has no guest-visible effect.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410523465-13400-2-git-send-email-peter.maydell@linaro.org
2014-09-29 18:48:46 +01:00
Peter Maydell
b60a7726cc Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  Add HMP command "info memory-devices"
  qemu-socket: Eliminate silly QERR_ macros
  qemu-socket: Polish errors for connect() and listen() failure
  qemu-iotests: Test missing "driver" key for blockdev-add
  tests: add QMP input visitor test for unions with no discriminator
  qapi: dealloc visitor, implement visit_start_union
  qapi: add visit_start_union and visit_end_union
  virtio-balloon: fix integer overflow in memory stats feature
  monitor: Reset HMP mon->rs in CHR_EVENT_OPEN

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 18:18:29 +01:00
zhanghailiang
28d16f38d0 vl: Adjust the place of calling mlockall to speedup VM's startup
If we configure mlock=on and memory policy=bind at the same time,
It will consume lots of time for system to treat with memory,
especially when call mbind behind mlockall.

Adjust the place of calling mlockall, calling mbind before mlockall
can remarkably reduce the time of VM's startup.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-29 19:44:04 +03:00
zhanghailiang
fc50ff0666 pc-dimm: Don't check dimm->node when there is non-NUMA config
It should not break memory hotplug feature if there is non-NUMA option.

This patch would also allow to use pc-dimm as replacement for initial memory
for non-NUMA configs.

Note: After this patch, the memory hotplug can work normally for Linux guest OS
when there is non-NUMA option and NUMA option. But not support Windows guest OS
to hotplug memory with no-NUMA config, actully, it's Windows limitation.

Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-29 19:44:04 +03:00
Gonglei
22b80e85ff pci-hotplug-old: avoid losing error message
When scsi_bus_legacy_add_drive() produces an error,
we will lose the error message. Using error_report
to report it.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-29 19:44:04 +03:00
Michael S. Tsirkin
45363e46ae Revert "virtio-pci: fix migration for pci bus master"
This reverts commit 4d43d3f3c8.

Reported to break PPC guests.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-29 19:44:04 +03:00
Markus Armbruster
8ce3c44c92 loader: g_realloc(p, 0) frees and returns NULL, simplify
Once upon a time, it was decided that qemu_realloc(ptr, 0) should
abort.  Switching to glib retired that bright idea.  A bit of code
that was added to cope with it (commit 3e372cf) is still around.  Bury
it.

See also commit 6528499.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-29 19:44:04 +03:00
Stefan Hajnoczi
70556264a8 libqos: use microseconds instead of iterations for virtio timeout
Some hosts are slow or overloaded so test execution takes a long time.
Test cases use timeouts to protect against an infinite loop stalling the
test forever (especially important in automated test setups).

Commit 6cd14054b6 ("libqos virtio:
Increase ISR timeout") increased the clock_step() value in an attempt to
lengthen the virtio interrupt wait timeout, but timeout failures are
still occuring on the Travis automated testing platform.

This is because clock_step() only affects the guest's virtual time.
Virtio requests can be bottlenecked on host disk I/O latency - which
cannot be improved by stepping the clock, so the fix was ineffective.

This patch changes the qvirtio_wait_queue_isr() and
qvirtio_wait_config_isr() timeout mechanism from loop iterations to
microseconds.  This way the test case can specify an absolute 30 second
timeout.  Number of loop iterations is not a reliable timeout mechanism
since the speed depends on many factors including host performance.

Tests should no longer timeout on overloaded Travis instances.

Cc: Marc Marí <marc.mari.barcelo@gmail.com>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 17:31:11 +01:00
Stefan Hajnoczi
e8c81b4d8a libqos: improve event_index test with timeout
The virtio event_index feature lets the device driver tell the device
how many requests to process before raising the next interrupt.
virtio-blk-test.c tries to verify that the device does not raise an
interrupt unnecessarily.

Unfortunately the test has a race condition.  It spins checking for an
interrupt up to 100 times and then assumes the request has finished.  On
a slow host the I/O request could still be in flight and the test would
fail.

This patch waits for the request to complete, or until a 30-second
timeout is reached.  If an interrupt is raised while waiting the test
fails since the device was not supposed to raise interrupts.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 17:31:08 +01:00
Kevin Wolf
ed9114356b raw-posix: Fix build without posix_fallocate()
Check for the presence of posix_fallocate() in configure and only
compile in support for PREALLOC_MODE_FALLOC when it's there.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 16:28:24 +01:00
Peter Maydell
0ebcc56453 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 26 Sep 2014 19:57:52 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream:
  qemu-iotests: Fail test if explicit test case number is unknown
  block: Validate node-name
  vpc: fix beX_to_cpu() and cpu_to_beX() confusion
  docs: add blkdebug block driver documentation
  block: Catch simultaneous usage of options and their aliases
  block: Specify -drive legacy option aliases in array
  block: Improve message for device name clashing with node name
  qemu-nbd: Destroy the BlockDriverState properly
  block: Keep DriveInfo alive until BlockDriverState dies
  blockdev: Disentangle BlockDriverState and DriveInfo creation
  blkdebug: show an error for invalid event names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-29 12:26:15 +01:00
Gerd Hoffmann
151623353f qxl: use graphic_console_set_hwops
Simply switch function pointers when entering/leaving vga mode.
Allows to remove wrapper functions which do nothing but dispatch
calls depending on the current qxl mode.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-29 10:20:09 +02:00
Gerd Hoffmann
1c1f949844 console: add graphic_console_set_hwops
Add a function to allow display emulations to switch the hwops
function pointers.  This is useful for devices which have two
completely different operation modes.  Typical case is the vga
compatibility mode vs. native mode in qxl and the upcoming
virtio-vga device.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-29 10:20:09 +02:00
Gerd Hoffmann
133771477c ac97: register reset via qom
So it gets properly unregistered on hot-unplug.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-29 10:20:05 +02:00
Peter Maydell
e80084d352 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-09-26' into staging
trivial patches for 2014-09-26

# gpg: Signature made Fri 26 Sep 2014 18:33:53 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-09-26:
  os-posix: report error message when lock file failed
  os-posix: remove confused errno
  os-posix: change tab to space avoid violating coding style
  qapi: Update docs given recent event, spacing fixes
  qapi: Ignore files created during make check
  qapi: Consistent whitespace in tests/Makefile
  vmxcap: Update according to SDM of September 2014
  .travis.yml: remove "make check" from main matrix
  .travis.yml: pre-seed sub-modules for speed
  .travis.yml: make the make slightly more parallel
  .travis.yml: add more linux-user to the build matrix
  tests: avoid running duplicate qom-tests

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-26 18:44:25 +01:00
Zhu Guihua
a631892f9d Add HMP command "info memory-devices"
Provides HMP equivalent of QMP query-memory-devices command.

Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:37:06 -04:00
Markus Armbruster
235256a2bd qemu-socket: Eliminate silly QERR_ macros
The QERR_ macros are leftovers from the days of "rich" error objects.
They're used with error_set() and qerror_report(), and expand into the
first *two* arguments.  This trickiness has become pointless.  Clean
up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:37:06 -04:00
Gonglei
e5048d15ce os-posix: report error message when lock file failed
It will cause that create vm failed When manager
tool is killed forcibly (kill -9 libvirtd_pid),
the file not was unlink, and unlock. It's better
that report the error message for users.

Signed-off-by: Huangweidong <weidong.huang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:21:09 +04:00
Gonglei
97699eff3a os-posix: remove confused errno
If we get inside the 'else if (status == 1)' conditional,
then we know that read() succeeded, and therefore errno is
unspecified. Printing strerror(errno) on a random value
is not helpful.

Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:20:50 +04:00
Gonglei
63ce8e150c os-posix: change tab to space avoid violating coding style
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:20:28 +04:00
Markus Armbruster
dbe2a7a62a qemu-socket: Polish errors for connect() and listen() failure
connect() doesn't "connect to socket", it connects a socket to an
address and, if it's of type SOCK_STREAM, initiates a connection.
Scratch "to".

listen() does "set socket to listening mode", but it sounds awkward.
Change to "listen on socket".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:18:38 -04:00
Eric Blake
59a2c4ce2b qapi: Update docs given recent event, spacing fixes
Commit 21cd70d added event support but didn't document what the
generated code looks like.  Commit 05dfb26 removed some unwanted
spaces in the generated code, but didn't reflect those changes
into the documentation.  Finally, the docs start with a big
disclaimer about QMP not using QAPI yet, which feels rather stale.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:18:20 +04:00
Eric Blake
597db727cc qapi: Ignore files created during make check
After an in-tree build and run of 'make check-{qapi-schema,unit}',
I noticed some leftover files.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:18:15 +04:00
Eric Blake
04404c21cc qapi: Consistent whitespace in tests/Makefile
tests/Makefile had a mix of TAB vs. 8-space indentation; given
that it is a Makefile, TAB is more idiomatic even though in these
particular cases the choice of whitespace didn't matter.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:18:02 +04:00
Fam Zheng
fe509ee237 qemu-iotests: Test missing "driver" key for blockdev-add
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:14:11 -04:00
Michael Roth
cb55111b4e tests: add QMP input visitor test for unions with no discriminator
This is more of an exercise of the dealloc visitor, where it may
erroneously use an uninitialized discriminator field as indication
that union fields corresponding to that discriminator field/type are
present, which can lead to attempts to free random chunks of heap
memory.

Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:14:11 -04:00
Michael Roth
146db9f919 qapi: dealloc visitor, implement visit_start_union
If the .data field of a QAPI Union is NULL, we don't need to free
any of the union fields.

Make use of the new visit_start_union interface to access this
information and instruct the generated code to not visit these
fields when this occurs.

Cc: qemu-stable@nongnu.org
Reported-by: Fam Zheng <famz@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:14:10 -04:00
Michael Roth
cee2dedb85 qapi: add visit_start_union and visit_end_union
In some cases an input visitor might bail out on filling out a
struct for various reasons, such as missing fields when running
in strict mode. In the case of a QAPI Union type, this may lead
to cases where the .kind field which encodes the union type
is uninitialized. Subsequently, other visitors, such as the
dealloc visitor, may use this .kind value as if it were
initialized, leading to assumptions about the union type which
in this case may lead to segfaults. For example, freeing an
integer value.

However, we can generally rely on the fact that the always-present
.data void * field that we generate for these union types will
always be NULL in cases where .kind is uninitialized (at least,
there shouldn't be a reason where we'd do this purposefully).

So pass this information on to Visitor implementation via these
optional start_union/end_union interfaces so this information
can be used to guard against the situation above. We will make
use of this information in a subsequent patch for the dealloc
visitor.

Cc: qemu-stable@nongnu.org
Reported-by: Fam Zheng <famz@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:14:10 -04:00
Luiz Capitulino
1f9296b51a virtio-balloon: fix integer overflow in memory stats feature
When a QMP client changes the polling interval time by setting
the guest-stats-polling-interval property, the interval value
is stored and manipulated as an int64_t variable.

However, the balloon_stats_change_timer() function, which is
used to set the actual timer with the interval value, takes
an int instead, causing an overflow for big interval values.

This commit fix this bug by changing balloon_stats_change_timer()
to take an int64_t and also it limits the polling interval value
to UINT_MAX to avoid other kinds of overflow.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-09-26 13:14:10 -04:00
Stratos Psomadakis
e5554e2015 monitor: Reset HMP mon->rs in CHR_EVENT_OPEN
Commit cdaa86a54 ("Add G_IO_HUP handler for socket chardev") exposed a bug in
the way the HMP monitor handles its command buffer. When a client closes the
connection to the monitor, tcp_chr_read() will detect the G_IO_HUP condition
and call tcp_chr_disconnect() to close the server-side connection too. Due to
the fact that monitor reads 1 byte at a time (for each tcp_chr_read()), the
monitor readline state / buffers might contain junk (i.e. a half-finished
command). Thus, without calling readline_restart() on mon->rs in
CHR_EVENT_OPEN, future HMP commands will fail.

Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-26 13:14:10 -04:00
Adrian-Ken Rueegsegger
c5d1e2cce3 vmxcap: Update according to SDM of September 2014
This adds reporting of RDSEED exiting and XSAVES/XRSTORS #UD and fixes
the range of VMCS revision as well as some typos.

Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:08:56 +04:00
Alex Bennée
ed173cb704 .travis.yml: remove "make check" from main matrix
There are problems with unreliability in "make check" which still need
to be tracked down. As the tests are broadly the same for all targets if
added one explicit target to the matrix to run it. However this does
build all softmmu targets to ensure they at least "run"

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Brian Jackson <iggy@theiggy.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:05:06 +04:00
Alex Bennée
cb021cfee7 .travis.yml: pre-seed sub-modules for speed
A significant portion of the build time is spent initialising all the
sub-modules we use in the source tree. Often this is almost as long as
the build itself. By pre-seeding the .git/modules tree this will
hopefully improve things.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Brian Jackson <iggy@theiggy.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:04:45 +04:00
Alex Bennée
eebf29401a .travis.yml: make the make slightly more parallel
The Travis VMs have 1.5 cores so we might as well make some use of the
paralellism.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Brian Jackson <iggy@theiggy.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:04:22 +04:00
Alex Bennée
10905cb26a .travis.yml: add more linux-user to the build matrix
At the same time I've grouped the $ARCH-linux-user and $ARCH-softmmu
builds together (hoping FS cache helps) and grouped all $ARCH-softmmu
only builds into one target. This reduces the build matrix slightly
which will hopefully help with build times.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:03:59 +04:00
Michael Roth
2b8419cb49 tests: avoid running duplicate qom-tests
Since 3687d532 we've been unconditionally adding qom-test to our qtests
for every arch. However, some archs inherit their tests from Makefile
variables for other archs, such as i386/x86_64,
microblaze/microblazeel, and xtensa/xtensaeb. Since these are evaluated
in a lazy manner, we ultimately end up adding qom-test twice.

In the case x86_64, where we have a large number of machine types that
we rerun qom-test for, this has lead to a fairly noticeable increase
in the overall run-time of `make check` (78s vs. 42s on my machine).
Similar speed-ups are visible for other such archs, but not nearly as
significant.

Fix this by only adding qom-test to an arch's test list if it's not
already present.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-26 21:03:26 +04:00
Peter Maydell
81ab11a7a5 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Usual mix of patches, the most important being Alex and Marcelo's
kvmclock fix.  This was reverted last minute for 2.1, but it is now back
with the problematic case fixed.

Note: I will soon switch to a subkey for signing purposes.  To verify
future signed pull requests from me, please update my key with
"gpg --recv-keys 9B4D86F2".  You should see 3 new subkeys---the
one for signing will be a 2048-bit RSA key, 4E6B09D7.

# gpg: Signature made Fri 26 Sep 2014 15:34:44 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  kvm/valgrind: don't mark memory as initialized
  po: fix conflict with %.mo rule in rules.mak
  kvmvapic: fix migration when VM paused and when not running Windows
  serial: check if backed by a physical serial port at realize time
  serial: reset state at startup
  target-i386: update fp status fix
  hw/dma/i8257: Silence phony error message
  kvmclock: Ensure time in migration never goes backward
  kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
  Introduce cpu_clean_all_dirty
  pit: fix pit interrupt can't inject into vm after migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-26 15:41:50 +01:00
Christian Borntraeger
541be9274e kvm/valgrind: don't mark memory as initialized
since commit 7dda5dc82a ("migration: initialize RAM to zero") the
guest memory is defined zero. No need to call valgrind on guest memory.
This reverts commit 62fe83318d ("qemu: Use valgrind annotations to
mark kvm guest memory as defined") thus speeding up kvm start if
<includedir>/valgrind/valgrind.h is available.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-26 13:35:08 +02:00
Paolo Bonzini
a697d240ff po: fix conflict with %.mo rule in rules.mak
po/Makefile includes rules.mak to use the nice quiet-command macro.
However, this also brings in a %.mo rule that breaks "make build".
Put our own rule before the include, so that it has precedence.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-26 13:35:08 +02:00
Pavel Dovgalyuk
5a6e8ba64f kvmvapic: fix migration when VM paused and when not running Windows
This patch fixes migration by extending do_vapic_enable function. This function
called vapic_enable which read cpu number from the guest memory. When cpu
number could not be read, vapic was not enabled while loading the VM state.
This patch adds required code for cpu_number=0 to do_vapic_enable function,
because it is called only when cpu_number=0.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-26 13:32:04 +02:00
Peter Maydell
da1c4ec88a Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Fri 26 Sep 2014 11:59:34 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  ohci: drop computed flags from trace events
  ohci: Split long traces to smaller ones
  scripts/tracetool: don't barf on formats with precision
  trace: install trace-events file
  trace-events: Fix comments pointing to source files
  trace-events: Drop orphaned monitor trace event
  trace-events: Drop unused megasas trace event
  cleanup-trace-events.pl: Tighten search for trace event call
  trace: tighten up trace-events regex to fix bad parse
  trace-events: drop orphan iscsi trace events
  trace-events: drop orphan usb_mtp_data_out
  trace-events: drop orphan virtio_blk_data_plane_complete_request
  trace: [hmp] Reimplement "trace-event" and "info trace-events" using QMP
  trace: [qmp] Add commands to query and control event tracing state
  trace: docs: add trace file description
  trace: [ust] Fix format string computation in tcg-enabled events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-26 12:26:07 +01:00
Peter Maydell
15124e1420 main-loop.c: Handle SIGINT, SIGHUP and SIGTERM synchronously
Add the termination signals SIGINT, SIGHUP and SIGTERM to the
list of signals which we handle synchronously via a signalfd.
This avoids a race condition where if we took the SIGTERM
in the middle of qemu_shutdown_requested:
    int r = shutdown_requested;
[SIGTERM here...]
    shutdown_requested = 0;

then the setting of the shutdown_requested flag by
termsig_handler() would be lost and QEMU would fail to
shut down. This was causing 'make check' to hang occasionally.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1411660269-11081-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
2014-09-26 11:47:30 +01:00
Alex Bennée
bc0d104c6a ohci: drop computed flags from trace events
This exceeded the trace argument limit for LTTNG UST and wasn't really
needed as the flags value is stored anyway. Dropping this fixes the
compile failure for UST. It can probably be merged with the previous
trace shortening patch.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:43:06 +01:00
Alexey Kardashevskiy
3af8f177fa ohci: Split long traces to smaller ones
Recent traces rework introduced 2 tracepoints with 13 and 20
arguments. When dtrace backend is selected
(--enable-trace-backend=dtrace), compile fails as
sys/sdt.h defines DTRACE_PROBE up to DTRACE_PROBE12 only.

This splits long tracepoints.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:43:06 +01:00
Alex Bennée
931f53e184 scripts/tracetool: don't barf on formats with precision
This only affects lttng user space tracing at the moment.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:39 +01:00
Stefan Hajnoczi
89ae5831a5 trace: install trace-events file
Install the ./trace-events file into the data directory.  This file
contains the list of trace events that were built into QEMU at
compile-time.

The file is a handy reference for the set of trace events that the QEMU
binary was built with.  It is also needed by the simpletrace.py tool
that parses binary trace data either emitted from QEMU when built with
--enable-trace-backend=simple or by the SystemTap simpletrace script
that QEMU provides.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1411486175-3017-1-git-send-email-stefanha@redhat.com
2014-09-26 09:34:39 +01:00
Markus Armbruster
40893768df trace-events: Fix comments pointing to source files
A few files have been renamed without updating their comment here.  A
few events have been added in the wrong place.  Clean that up.

Comments with no space after the '#' look ugly and confuse
cleanup-trace-events.pl.  Insert a space.

scripts/cleanup-trace-events.pl is now happy again.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-5-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:39 +01:00
Markus Armbruster
4d249a2074 trace-events: Drop orphaned monitor trace event
Event monitor_protocol_event is unused since commit 7517517.  Drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-4-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Markus Armbruster
8a55acb1e0 trace-events: Drop unused megasas trace event
Event megasas_io_read was added in commit e8f943c, but never used.
Drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-3-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Markus Armbruster
88ed34ff5e cleanup-trace-events.pl: Tighten search for trace event call
The script can get fooled too easily.  For instance, it finds
trace_megasas_io_read_start when looking for trace_megasas_io_read,
and incorrectly concludes that event megasas_io_read is used.

Supply -w to git-grep to tighten the search.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-2-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Stefan Hajnoczi
f9bbba9569 trace: tighten up trace-events regex to fix bad parse
Use \w for properties and trace event names since they are both drawn
from [a-zA-Z0-9_] character sets.

The .* for matching properties was too aggressive and caused the
following failure with foo(int rc) "(this is a test)":

  Traceback (most recent call last):
    File "scripts/tracetool.py", line 139, in <module>
      main(sys.argv)
    File "scripts/tracetool.py", line 134, in main
      binary=binary, probe_prefix=probe_prefix)
    File "scripts/tracetool/__init__.py", line 334, in generate
      events = _read_events(fevents)
    File "scripts/tracetool/__init__.py", line 262, in _read_events
      res.append(Event.build(line))
    File "scripts/tracetool/__init__.py", line 225, in build
      return Event(name, props, fmt, args, arg_fmts)
    File "scripts/tracetool/__init__.py", line 185, in __init__
      % ", ".join(unknown_props))
  ValueError: Unknown properties: foo(int, rc)

Cc: Lluís Vilanova <vilanova@ac.upc.edu>
Reported-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1411468626-20450-1-git-send-email-stefanha@redhat.com
2014-09-26 09:34:38 +01:00
Stefan Hajnoczi
44e7ebb8bb trace-events: drop orphan iscsi trace events
iscsi_aio_write16_cb, iscsi_aio_writev, iscsi_aio_read16_cb, and
iscsi_aio_readv have not not been in use since commit
063c3378a9 ("block/iscsi: introduce
bdrv_co_{readv, writev, flush_to_disk}").

These were the only trace events in block/iscsi.c so drop the the
trace.h include.

Cc: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-4-git-send-email-stefanha@redhat.com
2014-09-26 09:34:38 +01:00
Stefan Hajnoczi
f4026f269a trace-events: drop orphan usb_mtp_data_out
This trace event was added in commit
840a178c94 ("usb: mtp filesharing") but
never used.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-3-git-send-email-stefanha@redhat.com
2014-09-26 09:34:38 +01:00
Stefan Hajnoczi
0bc375eb51 trace-events: drop orphan virtio_blk_data_plane_complete_request
This trace event has not been in use since commit
b002254dbd ("virtio-blk: Unify
{non-,}dataplane's request handlings").

Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-2-git-send-email-stefanha@redhat.com
2014-09-26 09:34:38 +01:00
Lluís Vilanova
14101d028d trace: [hmp] Reimplement "trace-event" and "info trace-events" using QMP
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 20140825112002.31112.60143.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Lluís Vilanova
1dde0f48d5 trace: [qmp] Add commands to query and control event tracing state
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 20140825111957.31112.31733.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Chen Fan
60e17d2822 trace: docs: add trace file description
When user used the trace print command from docs/tracing.txt:
  ./scripts/simpletrace.py trace-events trace-*

the user maybe be misled by the "trace-*", because if user
directly copy the comand line to run, there alway print the
bored message:
"usage: ./scripts/simpletrace.py <trace-events> <trace-file>"

then we should describe that the "trace-*" represented.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Lluís Vilanova
2321442920 trace: [ust] Fix format string computation in tcg-enabled events
TCG-enabled events start with two format strings. Delay per-argument format
computation until requested ('Event.formats').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-26 09:34:38 +01:00
Richard Henderson
6a0fcbdf2d cpu-exec: Do CPU_INTERRUPT_HALT unconditionally
The signal is currently checked by 10 targets, but only actually
raised by Sparc and ARM.  For the sake of one test-and-branch,
we can handle this generic bit generically.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-24-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
42f53fea9f target-i386: Use cpu_exec_interrupt qom hook
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-23-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
458dd76656 target-ppc: Use cpu_exec_interrupt qom hook
Cc: qemu-ppc@nongnu.org
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-22-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
e9854c3945 target-lm32: Use cpu_exec_interrupt qom hook
Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Acked-by: Michael Walle <michael@walle.cc>
Message-id: 1410626734-3804-21-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
29cd33d3c7 target-microblaze: Use cpu_exec_interrupt qom hook
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-20-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
fa4faba448 target-mips: Use cpu_exec_interrupt qom hook
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Message-id: 1410626734-3804-19-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
dfdb483454 target-tricore: Remove the dummy interrupt boilerplate
It can go back in when it actually does something.

Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1410626734-3804-18-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
fbb96c4b7f target-openrisc: Use cpu_exec_interrupt qom hook
Cc: Jia Liu <proljc@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Tested-by: Jia Liu <proljc@gmail.com>
Message-id: 1410626734-3804-17-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
87afe467e2 target-sparc: Use cpu_exec_interrupt qom hook
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-16-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
e8925712e6 target-arm: Use cpu_exec_interrupt qom hook
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-15-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
d8bb915972 target-unicore32: Use cpu_exec_interrupt qom hook
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-14-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
f47ede195b target-sh4: Use cpu_exec_interrupt qom hook
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-13-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:22 +01:00
Richard Henderson
dde7c241e3 target-alpha: Use cpu_exec_interrupt qom hook
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1410626734-3804-12-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
5a1f7f44cf target-cris: Use cpu_exec_interrupt qom hook
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1410626734-3804-11-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
ab409bb3fe target-m68k: Use cpu_exec_interrupt qom hook
Since do_interrupt_m68k_hardirq is no longer used outside
op_helper.c, make it static.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-10-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
02bb9bbf1d target-s390x: Use cpu_exec_interrupt qom hook
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-9-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
37f3616aa8 target-xtensa: Use cpu_exec_interrupt qom hook
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 1410626734-3804-8-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
9585db68c7 qom: Add cpu_exec_interrupt hook
Continuing the removal of ifdefs from cpu_exec.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-7-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
774f0abeae target-ppc: Use cpu_exec_enter qom hook
Cc: qemu-ppc@nongnu.org
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-6-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
00f3fd63e1 target-m68k: Use cpu_exec_enter/exit qom hooks
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-5-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
374e0cd4db target-i386: Use cpu_exec_enter/exit qom hooks
Note that the code that was within the "exit" ifdef block
was identical to the cpu_compute_eflags inline, so make that
simplification at the same time.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-4-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
1039e1290f cpu-exec: Remove do-nothing ifdef chains
Around the cpu_exec_enter/exit hooks contain many empty
ifdef blocks.  Delete all of these to highlight those
targets for which we actually need to do work.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-3-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Richard Henderson
cffe7b3249 qom: Add cpu_exec_enter and cpu_exec_exit hooks
In preparation for removing a bunch of ifdefs from cpu_exec.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1410626734-3804-2-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 18:54:21 +01:00
Peter Maydell
1ba50f4ea0 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Thu 25 Sep 2014 13:35:55 BST using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-25 16:58:04 +01:00
Fam Zheng
c9d17ad0dd qemu-iotests: Fail test if explicit test case number is unknown
When we expand a number range, we just print "$id - unknown test,
ignored", this is convenient if we want to run a range of tests.

When we designate a test case number explicitly, we shouldn't just
ignore it if the case script doesn't exist.

Print an error and fail the test.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:25:20 +02:00
Kevin Wolf
9aebf3b892 block: Validate node-name
The device_name of a BlockDriverState is currently checked because it is
always used as a QemuOpts ID and qemu_opts_create() checks whether such
IDs are wellformed.

node-name is supposed to share the same namespace, but it isn't checked
currently. This patch adds explicit checks both for device_name and
node-name so that the same rules will still apply even if QemuOpts won't
be used any more at some point.

qemu-img used to use names with spaces in them, which isn't allowed any
more. Replace them with underscores.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-25 15:24:32 +02:00
Stefan Hajnoczi
a4127c428e vpc: fix beX_to_cpu() and cpu_to_beX() confusion
The beX_to_cpu() and cpu_to_beX() functions perform the same operation -
they do a byteswap if the host CPU endianness is little-endian or a
nothing otherwise.

The point of two names for the same operation is that it documents which
direction the data is being converted.  This makes it clear whether the
data is suitable for CPU processing or in its external representation.

This patch fixes incorrect beX_to_cpu()/cpu_to_beX() usage.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Stefan Hajnoczi
f9d7b4b3b4 docs: add blkdebug block driver documentation
The blkdebug block driver is undocumented.  Documenting it is worthwhile
since it offers powerful error injection features that are used by
qemu-iotests test cases.

This document will make it easier for people to learn about and use
blkdebug.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Kevin Wolf
5abbf0ee4d block: Catch simultaneous usage of options and their aliases
While thinking about precedence of conflicting block device options from
different sources, I noticed that you can specify both an option and its
legacy alias at the same time (e.g. readonly=on,read-only=off). Rather
than specifying the order of precedence, we should simply forbid such
combinations.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-09-25 15:24:14 +02:00
Kevin Wolf
247147fbc1 block: Specify -drive legacy option aliases in array
Instead of a series of qemu_opt_rename() calls, use an array that
contains all of the renames and call qemu_opt_rename() in a loop. This
will keep the code readable even when we add an error return to
qemu_opt_rename().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-09-25 15:24:14 +02:00
Markus Armbruster
d224469d87 block: Improve message for device name clashing with node name
Suggested-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Markus Armbruster
e5d7bbeb10 qemu-nbd: Destroy the BlockDriverState properly
Match the bdrv_new() with a bdrv_unref(), just to be tidy.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Markus Armbruster
3ae59580a0 block: Keep DriveInfo alive until BlockDriverState dies
If the BDS's refcnt > 0, drive_del() destroys the DriveInfo, but not
the BDS.  This can happen in three places:

* Device model destruction during unplug: blockdev_auto_del()

* Xen IDE unplug: pci_piix3_xen_ide_unplug()

* drive_del command when no device model is attached: do_drive_del()

The other callers of drive_del are on error paths where refcnt == 1.

If the user somehow manages to plug in a device model using a BDS that
has gone through drive_del(), the legacy configuration passed in
DriveInfo doesn't reach the device model, and automatic deletion on
unplug doesn't work.  Worse, some device models such as scsi-disk
crash when DriveInfo doesn't exist.

This is theoretical; I didn't research an actual reproducer. The problem
was introduced when we replaced DriveInfo reference counting by BDS
reference counting in commit a94a3fa..fa510eb.

Fix by keeping DriveInfo alive until its BDS dies.

This affects qemu_drive_opts: now you can't reuse the same ID for new
drive options until the BDS dies.  Before, you could, but since the
code always attempts to create a BDS with the same ID next, the
enclosing operation "create a new drive" failed anyway.  Different
error path, same result.

Unfortunately, the fix involves use of blockdev.c stuff from block.c,
which is a layering violation.  Fortunately, my forthcoming
BlockBackend work will get rid of it again.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Markus Armbruster
a0f1eab157 blockdev: Disentangle BlockDriverState and DriveInfo creation
blockdev_init() mixes up BlockDriverState and DriveInfo initialization
Finish the BlockDriverState job before starting to mess with
DriveInfo.  Easier on the eyes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Stefan Hajnoczi
d4362d642e blkdebug: show an error for invalid event names
It is easy to typo a blkdebug configuration and waste a lot of time
figuring out why no rules are matching.

Push the Error** down into add_rule() so we can report an error when the
event name is invalid.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-25 15:24:14 +02:00
Mark Cave-Ayland
1862ed02df Update OpenBIOS images
Update OpenBIOS images to SVN r1320 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-09-25 13:34:03 +01:00
Peter Maydell
4f2280b219 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
tcx: Implement hardware acceleration

# gpg: Signature made Tue 23 Sep 2014 22:52:34 BST using RSA key ID AE0F321F
# gpg: Can't check signature: public key not found

* remotes/mcayland/tags/qemu-sparc-signed:
  tcx: Implement hardware acceleration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-24 13:45:13 +01:00
Stefan Weil
c0f896810b virtio: Fix wrong type cast from pointer to long
Compiler warning (w32, w64):

include/hw/virtio/virtio_ring.h:142:26: warning:
 cast from pointer to integer of different size [-Wpointer-to-int-cast]

When sizeof(long) < sizeof(void *), this is not only a warning but a
real program error.

Add also missing blanks in the same statement.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1411536002-14088-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-24 12:51:38 +01:00
Peter Maydell
ca976d1803 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140923.0' into staging
Endian updates to re-fix cross endian host and guest and
enable the same for ROM loading (Alexey)

# gpg: Signature made Tue 23 Sep 2014 18:03:03 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140923.0:
  vfio: make rom read endian sensitive
  Revert "vfio: Make BARs native endian"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-24 12:00:08 +01:00
Mark Cave-Ayland
55d7bfe229 tcx: Implement hardware acceleration
The S24/TCX framebuffer is a mildly accelerated video card with
blitter, stippler and hardware cursor.

* Solaris and NetBSD 6.x use all the hardware acceleration features
* The Xorg driver (used by Linux) can use the hardware cursor only

This patch implements hardware acceleration in both 8 bit and 24 bit
modes. It is based on the NetBSD driver sources and from tests with
Solaris.

Signed-off-by: Olivier Danet <odanet@caramail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-09-23 22:23:14 +01:00
Petr Matousek
01f7cecf00 slirp: udp: fix NULL pointer dereference because of uninitialized socket
When guest sends udp packet with source port and source addr 0,
uninitialized socket is picked up when looking for matching and already
created udp sockets, and later passed to sosendto() where NULL pointer
dereference is hit during so->slirp->vnetwork_mask.s_addr access.

Fix this by checking that the socket is not just a socket stub.

This is CVE-2014-3640.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Xavier Mehrenberger <xavier.mehrenberger@airbus.com>
Reported-by: Stephane Duverger <stephane.duverger@eads.net>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20140918063537.GX9321@dhcp-25-225.brq.redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-23 19:15:05 +01:00
Peter Maydell
769188d3bb Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140923-1' into staging
usb: enable hotplug, switch to realize, ohci tracing, misc fixes.

# gpg: Signature made Tue 23 Sep 2014 12:42:29 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140923-1: (26 commits)
  usb: tag standalone ehci as hotpluggable
  usb: tag standalone uhci as hotpluggable
  usb: tag xhci as hotpluggable
  usb-serial: only check speed once at realize time
  usb-bus: introduce a wrapper function to check speed
  usb-bus: remove "init" from USBDeviceClass struct
  usb-mtp: convert init to realize
  usb-redir: convert init to realize
  usb-audio: convert init to realize
  dev-wacom: convert init to realize
  dev-hid: convert init to realize
  usb-ccid: convert init to realize
  dev-serial: convert init to realize
  dev-bluetooth: convert init to realize
  dev-uas: using error_report instead of fprintf
  dev-uas: convert init to realize
  dev-storage: usring error_report instead of fprintf/printf
  dev-storage: convert init to realize
  usb-hub: convert init to realize
  libusb: using error_report instead of fprintf
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-23 14:43:47 +01:00
Fam Zheng
20e6dca1df virtio-scsi: Make virtio_scsi_push_event public
Later this will be called by dataplane code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:41:11 +02:00
Fam Zheng
aa8e8f83d0 virtio-scsi: Make virtio_scsi_free_req public
To share with dataplane code later.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:41:11 +02:00
Fam Zheng
c505333dab virtio-scsi: Make virtio_scsi_init_req public
To share with datplane code later.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:41:11 +02:00
Fam Zheng
dc56b7c4fb virtio-scsi: Split virtio_scsi_handle_ctrl_req from virtio_scsi_handle_ctrl
To share with dataplane code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:41:11 +02:00
Fam Zheng
bf359a445e virtio-scsi: Split virtio_scsi_handle_cmd_req from virtio_scsi_handle_cmd
This is the "common part" to handle one cmd request. Refactor out for
later usage of dataplane iothread code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:41:11 +02:00
Paolo Bonzini
9df7bfddcc virtio-scsi: clean up virtio_scsi_parse_cdb
The command direction according to the guest-passed buffers
is already stored in the VirtIOSCSIReq.  We can use it instead
of computing it again from req->elem.

Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:40:51 +02:00
Paolo Bonzini
7ce0425575 vhost-scsi: use virtio_ldl_p
This helps for cross-endian configurations.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:40:51 +02:00
Fam Zheng
faf1e1fb4c virtio-scsi: Optimize virtio_scsi_init_req
The VirtQueueElement is a very big structure (>48k!), since it will be
initialzed by virtqueue_pop, we can save the expensive zeroing here.

This saves a few microseconds per request in my test:

[fio-test]      rw         bs         iodepth    jobs       bw         iops       latency
--------------------------------------------------------------------------------------------
Before          read       4k         1          1          110        28269      34
After           read       4k         1          1          131        33745      28

Whereas,

virtio-blk      read       4k         1          1          217        55673      16

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:40:51 +02:00
Fam Zheng
61e68b3fbd scsi: Optimize scsi_req_alloc
Zeroing sense buffer for each scsi request is not efficient, we can just
leave it uninitialized because sense_len is set to 0.

Move the implicitly zeroed fields to the end of the structure and use a
partial memset.

The explicitly initialized fields (by scsi_req_alloc or scsi_req_new)
are moved to the beginning of the structure, before sense buffer, to
skip the memset.

Also change g_malloc0 to g_slice_alloc.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-23 15:40:51 +02:00
Peter Maydell
cad684c538 Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20140923' into staging
s390x/kvm: some fixes and cleanups

1. sclp: get of of duplicate defines
2. ccw: implement and fix handling of some special cases

# gpg: Signature made Tue 23 Sep 2014 13:10:47 BST using RSA key ID B5A61C7C
# gpg: Can't check signature: public key not found

* remotes/borntraeger/tags/s390x-20140923:
  s390x/css: catch ccw sequence errors
  s390x/css: support format-0 ccws
  s390x: remove duplicate defines in SCLP code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-23 13:28:06 +01:00
Cornelia Huck
e8601dd5d0 s390x/css: catch ccw sequence errors
We must not allow chains of more than 255 ccws without data transfer.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-23 14:10:17 +02:00
Cornelia Huck
a327c9215d s390x/css: support format-0 ccws
Add support for format-0 ccws in channel programs. As a format-1 ccw
contains the same information as format-0 ccws, only supporting larger
addresses, simply convert every ccw to format-1 as we walk the chain.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-23 14:10:17 +02:00
Jens Freimann
450749141e s390x: remove duplicate defines in SCLP code
Let's get rid of these duplicate defines.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-23 14:10:17 +02:00
Peter Maydell
380f649e02 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Mon 22 Sep 2014 12:41:59 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (59 commits)
  block: Always compile virtio-blk dataplane
  vring: Better error handling if num is too large
  virtio: Import virtio_vring.h
  async: aio_context_new(): Handle event_notifier_init failure
  block: vhdx - fix reading beyond pointer during image creation
  block: delete cow block driver
  block/archipelago: Fix typo in qemu_archipelago_truncate()
  ahci: Add test_identify case to ahci-test.
  ahci: Add test_hba_enable to ahci-test.
  ahci: Add test_hba_spec to ahci-test.
  ahci: properly shadow the TFD register
  ahci: add test_pci_enable to ahci-test.
  ahci: Add test_pci_spec to ahci-test.
  ahci: MSI capability should be at 0x80, not 0x50.
  ahci: Adding basic functionality qtest.
  layout: Add generators for refcount table and blocks
  fuzz: Add fuzzing functions for entries of refcount table and blocks
  docs: List all image elements currently supported by the fuzzer
  qapi/block-core: Add "new" qcow2 options
  qcow2: Add overlap-check.template option
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-23 12:08:55 +01:00
Gerd Hoffmann
ec56214f6f usb: tag standalone ehci as hotpluggable
Add a flag to EHCIPCIInfo saying whenever the controller supports
companions or not.  Make sure we only allow registering companions for
ehci versions supporting that.  Enable pci hotplug for the ehci
variants not supporting companions.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:08 +02:00
Gerd Hoffmann
638ca939d8 usb: tag standalone uhci as hotpluggable
uhci hostadapters in companion setups can't be hotplugged.  So leave
hotplug disabled for all ich9 variants (which are already tagged with
unplug = true in the info struct).  For the other variants we'll enable
hotplug and remove the companion setup properties.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:08 +02:00
Gerd Hoffmann
3533c3d2bf usb: tag xhci as hotpluggable
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:08 +02:00
Gonglei
7334d6507a usb-serial: only check speed once at realize time
Whatever the chardev is open or not, we should assure
the speed is matched each other. So, call usb_check_attach()
check speed. And then pass &error_abort at all calls to
usb_device_attach().

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:08 +02:00
Gonglei
594a53607e usb-bus: introduce a wrapper function to check speed
In this way, we can check speed directly, don't need
call usb_device_attach(), which has other conditions,
such as checking the chardev is open.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:08 +02:00
Gonglei
bd2ba2752d usb-bus: remove "init" from USBDeviceClass struct
All usb-bus devices are realized by realize(),
remove init callback function from USBDeviceClass struct.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:08 +02:00
Gonglei
df9bb6660d usb-mtp: convert init to realize
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
77e35b4bbe usb-redir: convert init to realize
In this way, all the implementations now use
error_setg instead of qerror_report for reporting error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
5450eeaaad usb-audio: convert init to realize
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
27107d416e dev-wacom: convert init to realize
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
276b7ac8e9 dev-hid: convert init to realize
In this way, all the implementations now use
error_setg instead of error_report for reporting error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
0b8b863fb2 usb-ccid: convert init to realize
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
38fff2c9f4 dev-serial: convert init to realize
In this way, all the implementations now use
error_setg instead of error_report for reporting error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
63cdca364c dev-bluetooth: convert init to realize
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
6b7afb7f0b dev-uas: using error_report instead of fprintf
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
b89dc7e33f dev-uas: convert init to realize
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
f5dc597878 dev-storage: usring error_report instead of fprintf/printf
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
5a882e40d5 dev-storage: convert init to realize
In this way, all the implementations now use
error_setg instead of error_report for reporting error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
f3f8c45972 usb-hub: convert init to realize
In this way, all the implementations now use
error_setg instead of error_report for reporting error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
2e6a0dd1ac libusb: using error_report instead of fprintf
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
2aa76dc18c libusb: convert init to realize
In this way, all the implementations now use
error_setg instead of error_report for reporting error.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:07 +02:00
Gonglei
d73ad35990 usb-net: convert init to realize
meanwhile, qerror_report_err() is a transitional interface to
help with converting existing HMP commands to QMP. It should
not be used elsewhere.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:06 +02:00
Gonglei
7d553f27fc usb-bus: convert USBDeviceClass init to realize
Add "realize/unrealize" in USBDeviceClass, which has errp
as a parameter. So all the implementations now use
error_setg instead of error_report for reporting error.

Note: this patch still keep "init" in USBDeviceClass, and
call kclass->init in usb_device_realize(), avoid breaking
git bisect. After realize all usb devices, will be removed.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:06 +02:00
Alexey Kardashevskiy
dc1f598845 ohci: Convert fprint/DPRINTF/print to traces
This converts many kinds of debug prints to traces.

This implements packets logging to avoid unnecessary calculations if
usb_ohci_td_pkt_short/usb_ohci_td_pkt_long is not enabled.

This makes OHCI errors (such as "DMA error") invisible by default.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 12:51:06 +02:00
Peter Maydell
17336812c7 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-09-22' into staging
trivial patches for 2014-09-22

# gpg: Signature made Mon 22 Sep 2014 09:10:03 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-09-22:
  arch_init: Setting QEMU_ARCH enum straight
  pc: Add missing 'static' attribute
  block: allow creation of fixed vhdx images
  vl: Print maxmem in hex format for error message
  configure: trivial fixes
  xen-hvm.c: Always return -1 when failure occurs in xen_hvm_init()
  rdma: Fix incorrect description in comments
  Fix typos and misspellings in comments
  qemu-char: Permit only a single "stdio" character device

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-23 11:00:08 +01:00
Gonglei
f0bc7fe3b7 usb-storage: fix possible memory leak and missing error message
When scsi_bus_legacy_add_drive() return NULL, meanwhile err will
be not NULL, which will casue memory leak and missing error message.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-23 11:41:45 +02:00
Nikunj A Dadhania
75bd0c7253 vfio: make rom read endian sensitive
All memory regions used by VFIO are LITTLE_ENDIAN and they
already take care of endiannes when accessing real device BARs
except ROM - it was broken on BE hosts.

This fixes endiannes for ROM BARs the same way as it is done
for other BARs.

This has been tested on PPC64 BE/LE host/guest in all possible
combinations including TCG.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[aik: added commit log]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-22 15:27:43 -06:00
Alexey Kardashevskiy
6758008e2c Revert "vfio: Make BARs native endian"
This reverts commit c40708176a.

The resulting code wrongly assumed target and host endianness are
the same which is not always the case for PPC64.

[aw: or potentially any host supporting VFIO and TCG]

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-22 15:26:36 -06:00
Fam Zheng
52b53c04fa block: Always compile virtio-blk dataplane
Dataplane doesn't depend on linux-aio any more, so we don't need the
compiling condition now.

Configure options are kept but just print a message.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:51 +01:00
Fam Zheng
032f8b8158 vring: Better error handling if num is too large
To be more consistent inside this function.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:49 +01:00
Fam Zheng
d9612b43dd virtio: Import virtio_vring.h
This header has no further dependencies. It only has some stable data
types and primitive functions, so we can copy it to include/hw/virtio in
order to allow vring code (and its user virtio-blk dataplane) to be
built unconditionally, even for cross compiling.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:49 +01:00
Chrysostomos Nanakos
2f78e491d7 async: aio_context_new(): Handle event_notifier_init failure
On a system with a low limit of open files the initialization
of the event notifier could fail and QEMU exits without printing any
error information to the user.

The problem can be easily reproduced by enforcing a low limit of open
files and start QEMU with enough I/O threads to hit this limit.

The same problem raises, without the creation of I/O threads, while
QEMU initializes the main event loop by enforcing an even lower limit of
open files.

This commit adds an error message on failure:

 # qemu [...] -object iothread,id=iothread0 -object iothread,id=iothread1
 qemu: Failed to initialize event notifier: Too many open files in system

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:48 +01:00
Jeff Cody
e91a8b2fef block: vhdx - fix reading beyond pointer during image creation
In vhdx_create_metadata(), we allocate 40 bytes to entry_buffer for
the various metadata table entries.  However, we write out 64kB from
that buffer into the new file.  Only write out the correct 40 bytes.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:46 +01:00
Stefan Hajnoczi
550830f935 block: delete cow block driver
This patch removes support for the cow file format.

Normally we do not break backwards compatibility but in this case there
is no impact and it is the most logical option.  Extraordinary claims
require extraordinary evidence so I will show why removing the cow block
driver is the right thing to do.

The cow file format is the disk image format for Usermode Linux, a way
of running a Linux system in userspace.  The performance of UML was
never great and it was hacky, but it enjoyed some popularity before
hardware virtualization support became mainstream.

QEMU's block/cow.c is supposed to read this image file format.
Unfortunately the file format was underspecified:

1. Earlier Linux versions used the MAXPATHLEN constant for the backing
   filename field.  The value of MAXPATHLEN can change, so Linux
   switched to a 4096 literal but QEMU has a 1024 literal.

2. Padding was not used on the header struct (both in the Linux kernel
   and in QEMU) so the struct layout varied across architectures.  In
   particular, i386 and x86_64 were different due to int64_t alignment
   differences.  Linux now uses __attribute__((packed)), QEMU does not.

Therefore:

1. QEMU cow images do not conform to the Linux cow image file format.

2. cow images cannot be shared between different host architectures.

This means QEMU cow images are useless and QEMU has not had bug reports
from users actually hitting these issues.

Let's get rid of this thing, it serves no purpose and no one will be
affected.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1410877464-20481-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:45 +01:00
Chrysostomos Nanakos
3e7e6f186f block/archipelago: Fix typo in qemu_archipelago_truncate()
Fix a typo introduced by 94c80a438c

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:44 +01:00
John Snow
0fa781e3d5 ahci: Add test_identify case to ahci-test.
Utilizing all of the bring-up code in pci_enable and hba_enable,
this test issues a simple IDENTIFY command via the HBA and retrieves
the response via the PIO receive mechanisms of the HBA.

Bugs: The DPS interrupt (Descriptor Processed Status) does not
currently get set. This will need to be adjusted in a future
patch series when the AHCI DMA pathways are reworked to allow
the feature, which may be utilized by OSX guests.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-9-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:43 +01:00
John Snow
dbc180e572 ahci: Add test_hba_enable to ahci-test.
This test engages the HBA functionality and initializes
values to sane defaults to allow for minimal HBA functionality.

Buffers are allocated and pointers are updated to allow minimal
I/O commands to complete as expected. Error registers and responses
are sanity checked for specification adherence.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-8-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:42 +01:00
John Snow
c2f3029fbc ahci: Add test_hba_spec to ahci-test.
Add a test routine that checks the boot-up values of the HBA
configuration memory space against the AHCI 1.3 specification
and Intel ICH9 data sheet (for Q35 machines) for adherence and
sane values.

The HBA is not yet engaged or put into the idle state.

[Replaced g_assert_false(...) with g_assert(!...) for glib <2.38
compatibility, reported by Peter Maydell <peter.maydell@linaro.org>.
--Stefan]

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1408643079-30675-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:42 +01:00
John Snow
fac7aa7fc2 ahci: properly shadow the TFD register
In a real AHCI device, several S/ATA registers are mirrored or shadowed
within the AHCI register set. These registers are not updated
synchronously for each read access, but are instead updated after a
Device-to-Host Register FIS packet is received. The D2H FIS contains
the values from these registers on the device.

In QEMU, by reaching directly into the device to grab these bits before
they are "sent," we may introduce race conditions where unexpected
values are present "before they are sent" which could cause issues for
some guests, particularly if an attempt is made to read the PxTFD
register prior to enabling the port, where incorrect values will be read.

This patch also addresses the boot-time values for the PxTFD and PxSIG
registers to bring them in line with the AHCI 1.3 specification.

Lastly, several fields (PxTFD, PxSIG and PxSACT) are read-only,
and any attempts to write to them should be ignored.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:41 +01:00
John Snow
96d6d3bad9 ahci: add test_pci_enable to ahci-test.
This adds a test wherein we engage the PCI AHCI
device and ensure that the memory region for the
HBA functionality is now accessible.

Under Q35 environments, additional PCI configuration
is performed to ensure that the HBA functionality
will become usable.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:40 +01:00
John Snow
8840a843dc ahci: Add test_pci_spec to ahci-test.
Adds a specification adherence test for AHCI
where the boot-up values for the PCI configuration space
are compared against the AHCI 1.3 specification.

This test does not itself attempt to engage the device.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:39 +01:00
John Snow
c8b5b20f81 ahci: MSI capability should be at 0x80, not 0x50.
In the Intel ICH9 data sheet, the MSI capability offset
in the PCI configuration space for ICH9 AHCI devices is
specified to be 0x80.

Further, the PCI capability pointer should always point
to 0x80 in ICH9 devices, despite the fact that AHCI 1.3
specifies that it should be pointing to PMCAP (Which in
this instance would be 0x70) to maintain adherence to
the Intel data sheet specifications and real observed behavior.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:39 +01:00
John Snow
1cd1031ddc ahci: Adding basic functionality qtest.
Currently, there is no qtest to test the functionality of
the AHCI functionality present within the Q35 machine type.

This patch adds a skeleton for an AHCI test suite,
and adds a simple sanity-check test case where we
identify that the AHCI device is present, then
disengage the virtual machine.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:38 +01:00
Maria Kustova
1e8fd8d44d layout: Add generators for refcount table and blocks
Refcount structures are placed in clusters randomly selected from all
unallocated host clusters.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 7e2f38608db6fba2da53997390b19400d445c45d.1408450493.git.maria.k@catit.be
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:37 +01:00
Maria Kustova
2e5be6b77e fuzz: Add fuzzing functions for entries of refcount table and blocks
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Message-id: c9f4027b6f401c67e9d18f94aed29be445e81d48.1408450493.git.maria.k@catit.be
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:36 +01:00
Maria Kustova
56271efdea docs: List all image elements currently supported by the fuzzer
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Message-id: cb71485d0f55d1d8401eebaead8324eb78673060.1408450493.git.maria.k@catit.be
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:35 +01:00
Max Reitz
f658581130 qapi/block-core: Add "new" qcow2 options
qcow2 supports more than four options by now, add the new options
(overlap check mode and metadata cache size)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1408557576-14574-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:35 +01:00
Max Reitz
ee42b5ce0b qcow2: Add overlap-check.template option
Being able to set the overlap-check option to a string and then refine
it via the overlap-check.* options is a nice idea for the command line
but does not work so well for non-flattened dicts. In that case, one can
only specify either but not both, so add a field to overlap-check.*
which does the same as directly specifying overlap-check but can be used
in conjunction with the other fields in non-flattened dicts.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1408557576-14574-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:34 +01:00
Max Reitz
e775ba7721 qapi: Allow enums in anonymous unions
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1408557576-14574-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:33 +01:00
Max Reitz
7b17ce60cc qcow2: Fix leak of QemuOpts in qcow2_open()
Currently, the QemuOpts object opts is leaked if anything fails from its
creation up to and including the image repair block. Fix this by freeing
that object in the fail path.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1408557576-14574-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:32 +01:00
Gonglei
93bb131525 hmp: fix memory leak at hmp_info_block_jobs()
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1410874615-14292-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:31 +01:00
Maria Kustova
407ba0844d image-fuzzer: Trivial readability and formatting improvements
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:30 +01:00
Max Reitz
5b0ed2be88 iotests: Add more tests for qcow2 corruption
Add tests for unaligned L1/L2/reftable entries and non-fatal corruption
reports.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1409926039-29044-6-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:29 +01:00
Max Reitz
a97c67ee6c qcow2: Check L1/L2/reftable entries for alignment
Offsets taken from the L1, L2 and refcount tables are generally assumed
to be correctly aligned. However, this cannot be guaranteed if the image
has been written to by something different than qemu, thus check all
offsets taken from these tables for correct cluster alignment.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1409926039-29044-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:28 +01:00
Max Reitz
adb435522b qcow2: Use qcow2_signal_corruption() for overlaps
Use the new function in case of a failed overlap check.

This changes output in case of corruption, so adapt iotest 060's
reference output accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1409926039-29044-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:28 +01:00
Max Reitz
85186ebdac qcow2: Add qcow2_signal_corruption()
Add a helper function for easily marking an image corrupt (on fatal
corruptions) while outputting an informative message to stderr and via
QAPI.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1409926039-29044-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:27 +01:00
Max Reitz
9bf040b962 qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
reading from an image, they should generally not be. Nonetheless, even
an image only read from may of course be corrupted and this can be
detected during normal operation. In this case, a non-fatal event should
be emitted, but the image should not be marked corrupt (in accordance to
"fatal" set to false).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1409926039-29044-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:26 +01:00
Fam Zheng
db866be982 qapi: Sort items in BlockdevOptions definition
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1410415798-20673-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:25 +01:00
Fam Zheng
e8712cea6a qapi: Sort BlockdevDriver enum data list
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1410415798-20673-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:24 +01:00
Fam Zheng
e819ab2277 block: Introduce "null" drivers
This is an analogue to Linux null_blk. It can be used for testing or
benchmarking block device emulation and general block layer
functionalities such as coroutines and throttling, where disk IO is not
necessary or wanted.

Use null-aio:// for AIO version, and null-co:// for coroutine version.

[Resolved conflict with Fam's async bdrv_aio_cancel() series:
1. Drop .bdrv_aio_cancel() since it is now done by block.c
2. Rename qemu_aio_release() to qemu_aio_unref()
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1410415798-20673-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:23 +01:00
Paolo Bonzini
a90d411e63 aio-win32: avoid out-of-bounds access to the events array
If ret is WAIT_TIMEOUT and there was an event returned by select(),
we can write to a location after the end of the array.  But in
that case we can retry the WaitForMultipleObjects call with the
same set of events, so just move the event[ret - WAIT_OBJECT_0]
assignment inside the existin conditional.

Reported-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:21 +01:00
Gonglei
0722eba945 qdev-monitor: fix segmentation fault on qdev_device_help()
Normally, qmp_device_list_properties() may return NULL when
a device haven't special properties excpet Object and DeviceState
properties, such as virtio-balloon-device.

We just need check local_err instead of prop_list.

Example:

Segmentation fault (core dumped)

The backtrace as below:

Program received signal SIGSEGV, Segmentation fault.
0x00005555559af1a8 in error_get_pretty (err=0x0) at util/error.c:152
152         return err->msg;
(gdb) bt
    func=0x55555574a6ca <device_help_func>, opaque=0x0, abort_on_failure=0) at util/qemu-option.c:1072

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:18 +01:00
Fam Zheng
8007429a99 block: Rename qemu_aio_release -> qemu_aio_unref
Suggested-by: Benoît Canet <benoit.canet@irqsave.net>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:17 +01:00
Fam Zheng
ca5fd113b8 block: Drop AIOCBInfo.cancel
Now that all the implementations are converted to asynchronous version
and we can emulate synchronous cancellation with it. Let's drop the
unused member.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:16 +01:00
Fam Zheng
e551c999bc ide: Convert trim_aiocb_info.cancel to .cancel_async
We know that either bh is scheduled or ide_issue_trim_cb will be called
again, so we just set i, j and ret to the right values. In both cases,
ide_trim_bh_cb will be called.

Also forward the cancellation to the iocb->aiocb which we get from
bdrv_aio_discard.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:16 +01:00
Fam Zheng
5da91e4ef4 win32-aio: Drop win32_aiocb_info.cancel
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:15 +01:00
Fam Zheng
6d24b4df38 sheepdog: Convert sd_aiocb_info.cancel to .cancel_async
Also drop the now unused SheepdogAIOCB.finished field. Note that this
aio is internal to sheepdog driver and has NULL cb and opaque, and
should be unused at all.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:14 +01:00
Fam Zheng
7691e24dbe rbd: Drop rbd_aiocb_info.cancel
And also drop the now unused "cancelled" field.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:13 +01:00
Fam Zheng
7940e5056b quorum: Convert quorum_aiocb_info.cancel to .cancel_async
Before, we cancel all the child requests with bdrv_aio_cancel, then free
the acb..

Now we just kick off asynchronous cancellation of child requests and
return, we know quorum_aio_cb will be called later, so in the end
quorum_aio_finalize will take care of calling the caller's cb.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:13 +01:00
Liu Yuan
997dd8df3e quorum: fix quorum_aio_cancel()
For a fifo read pattern, we only have one running aio (possible other cases that
has less number than num_children in the future), so we need to check if
.acb is NULL against bdrv_aio_cancel() to avoid segfault.

Cc: Eric Blake <eblake@redhat.com>
Cc: Benoit Canet <benoit@irqsave.net>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:12 +01:00
Fam Zheng
533ffb17a5 qed: Drop qed_aiocb_info.cancel
Also drop the now unused ->finished field.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:11 +01:00
Fam Zheng
facb5539d6 curl: Drop curl_aiocb_info.cancel
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:10 +01:00
Fam Zheng
9784e70f3d blkverify: Drop blkverify_aiocb_info.cancel
Also the finished pointer is not used any more.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:09 +01:00
Fam Zheng
4c781717c5 blkdebug: Drop blkdebug_aiocb_info.cancel
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:08 +01:00
Fam Zheng
4743b42a1b archipelago: Drop archipelago_aiocb_info.cancel
The cancelled flag is no longer useful. Later the request will complete
as before, and cb will be called.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:07 +01:00
Fam Zheng
722d93335f iscsi: Convert iscsi_aiocb_info.cancel to .cancel_async
Also drop the unused field "canceled".

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:06 +01:00
Fam Zheng
9bb9da46d6 dma: Convert dma_aiocb_info.cancel to .cancel_async
Just forward the request to bdrv_aio_cancel_async.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:05 +01:00
Fam Zheng
771b64daf9 linux-aio: Convert laio_aiocb_info.cancel to .cancel_async
Just call io_cancel (2), if it fails, it means the request is not
canceled, so the event loop will eventually call
qemu_laio_process_completion.

In qemu_laio_process_completion, change to call the cb unconditionally.
It is required by bdrv_aio_cancel_async.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:04 +01:00
Fam Zheng
3391f5e51c thread-pool: Convert thread_pool_aiocb_info.cancel to cancel_async
The .cancel_async shares the same the first half with .cancel: try to
steal the request if not submitted yet. In this case set the elem to
THREAD_DONE status and ret to -ECANCELED, which means
thread_pool_completion_bh will call the cb with -ECANCELED.

If the request is already submitted, do nothing, as we know the normal
completion will happen in the future.

Testing code update:

Before, done_cb is only called if the request is already submitted by
thread pool. Now done_cb is always called, even before it is submitted,
because we emulate bdrv_aio_cancel with bdrv_aio_cancel_async. So also
update the test criteria accordingly.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:02 +01:00
Fam Zheng
f600ac1902 block: Drop bdrv_em_aiocb_info.cancel
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:00 +01:00
Fam Zheng
3acabd685e block: Drop bdrv_em_co_aiocb_info.cancel
Also drop the now unused ->done pointer.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:59 +01:00
Fam Zheng
02c50efe08 block: Add bdrv_aio_cancel_async
This is the async version of bdrv_aio_cancel, which doesn't block the
caller. It guarantees that the cb is called either before returning or
some time later.

bdrv_aio_cancel can base on bdrv_aio_cancel_async, later we can convert
all .io_cancel implementations to .io_cancel_async, and the aio_poll is
the common logic. In the end, .io_cancel can be dropped.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:58 +01:00
Fam Zheng
f197fe2b2c block: Add refcnt in BlockDriverAIOCB
This will be useful in synchronous cancel emulation with
bdrv_aio_cancel_async.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:57 +01:00
Fam Zheng
0d910cfeaf ide/ahci: Check for -ECANCELED in aio callbacks
Before, bdrv_aio_cancel will either complete the request (like normal)
and call CB with an actual return code, or skip calling the request (for
example when the IO req is not submitted by thread pool yet).

We will change bdrv_aio_cancel to do it differently: always call CB
before return, with either [1] a normal req completion ret code, or [2]
ret == -ECANCELED. So the callers' callback must accept both cases. The
existing logic works with case [1], but not [2].

The simplest transition of callback code is do nothing in case [2], just
as if the CB is not called by the bdrv_aio_cancel() call.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:55 +01:00
Paolo Bonzini
6698c5bed2 aio-win32: fix uninitialized use of have_select_revents
Always initialize it with the return value of aio_prepare.

Reported-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:50 +01:00
John Snow
d735b620b5 ide/atapi: Mark non-data commands as complete
When the command completion code in IDE and AHCI
was unified to put all command completion inside
of a callback, "cmd_done," we neglected to
ensure that all AHCI/ATAPI command paths would
eventually register as finished. for the PCI
interface to IDE this is not a problem because
cmd_done is a nop, but the AHCI implementation
needs to send a D2H_REG_FIS and interrupt back
to the guest to inform of completion.

This patch adds calls to ide_stop_transfer,
which calls ide_cmd_done, inside of
ide_atapi_cmd_ok and ide_atapi_cmd_error.

This fixes regressions observed by trying to boot QEMU
with a Fedora 20 live CD under Q35/AHCI, which uses
ATAPI command 0x00, which is a status check that may
cause a hang because we never complete, and ATAPI
command 0x56, which is unsupported by our current
implementation and results in an error that we never
report back to the guest.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:44 +01:00
Peter Maydell
c2ebb05e25 block/vhdx.c: Mark parent_vhdx_guid variable as unused
The parent_vhdx_guid variable is defined but never used, which provokes
complaints from newer versions of clang. Since the variable definition
is here acting as documentation of the image format, mark it with the
'unused' attribute to keep the compiler happy rather than simply
deleting it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:35:02 +01:00
Bastian Koppelmann
7e3d523883 arch_init: Setting QEMU_ARCH enum straight
Every QEMU_ARCH is now in (1 << n) notation, instead of a mixture of decimal and hexadecimal.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-22 12:09:43 +04:00
Stefan Weil
e0bcc42ee7 pc: Add missing 'static' attribute
This fixes a warning from smatch (static code analysis).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-22 12:09:43 +04:00
Adelina Tuvenie
a011898d25 block: allow creation of fixed vhdx images
When trying to create a fixed vhd image qemu-img will return the
following error:

 qemu-img: test.vhdx: Could not create image: Cannot allocate memory

This happens because of a incorrect check in vhdx.c. Specifficaly,
in vhdx_create_bat(), after allocating memory for the BAT entry,
there is a check to determine if the allocation was unsuccsessful.
The error comes from the fact that it checks if s->bat isn't NULL,
which is true in case of succsessful allocation,  and exits with
error ENOMEM.

Signed-off-by: Adelina Tuvenie <atuvenie@cloudbasesolutions.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-22 12:09:31 +04:00
Markus Armbruster
31376776d0 usb-storage: Fix how legacy init handles option ID clash
usb_msd_init() calls qemu_opts_create() with a made-up ID and false
fail_if_exists.  If the ID already exists, it happily messes up those
options, then fails drive_new(), because the BlockDriverState with
that ID already exists, too.

Reproducer: -drive if=none,id=usb0,format=raw -usbdevice disk:tmp.qcow2

Pass true fail_if_exists to qemu_opts_create(), and if it fails, try
the next made-up ID.

The reproducer now succeeds, and creates an usb-storage device with ID
usb1.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-22 09:55:55 +02:00
zhanghailiang
4d63322cd4 vl: Print maxmem in hex format for error message
In error message, maxmem is printed in Dec but ram_size in Hex.
It is better to print them in same format.
Also use error_report instead of fprintf.

Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
Gonglei
2d5361f2a9 configure: trivial fixes
Make them consistent with the others.

Cc: qemu-trivial@nongnu.org
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
Chen Gang
6c5b0c0ac0 xen-hvm.c: Always return -1 when failure occurs in xen_hvm_init()
When failure occurs, it need to use "return -1" instead of exit(1), so
an upper layer has a chance to print failure information, too.

For simplicity, in xen_hvm_init(), also use '-1' instead of all
'-errno', since all related upper callers always exit(1) on failure.

It is not a normal function, it does not release related resources when
return -1, so need give related comments for it.

It passes common check:

  "./configure --enable-xen && make && make check"
  "echo $? == 0"

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
zhanghailiang
971ae6ef47 rdma: Fix incorrect description in comments
Since we have supported memory hotplug, VM's ram include pc.ram
and hotplug-memory.

Fix the confused description for rdma migration: pc.ram -> VM's ram

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
zhanghailiang
9d632f5f68 Fix typos and misspellings in comments
formated -> formatted
gaurantee -> guarantee
shear -> sheer

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
Li Liu
c88930a686 qemu-char: Permit only a single "stdio" character device
When more than one is used, the terminal settings aren't restored
correctly on exit.  Fixable.  However, such usage makes no sense,
because the users race for input, so outlaw it instead.

If you want to connect multiple things to stdio, use the mux
chardev.

Signed-off-by: Li Liu <john.liuli@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
Max Filippov
07e2863d02 exec.c: fix setting 1-byte-long watchpoints
With commit 05068c0dfb 'exec.c: Relax restrictions on watchpoint length
and alignment' it's no longer possible to set 1-byte-long watchpoint
because of incorrect address range check.
Fix that by changing condition that checks for address wraparound.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1411016616-29879-1-git-send-email-jcmvbkbc@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-19 17:42:16 +01:00
Stefan Weil
4852ee95f3 Fix cross compilation (nm command)
Commit c261d774fb added one more binutils
tool: nm also needs a cross prefix.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1411070108-8954-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-19 17:20:11 +01:00
Paolo Bonzini
a30cf8760f serial: check if backed by a physical serial port at realize time
Right now, s->poll_msl may linger at "0" value for an arbitrarily long
time, until serial_update_msl is called for the first time.  This is
unnecessary, and will lead to the s->poll_msl field being unnecessarily
migrated.

We can call serial_update_msl immediately at realize time (via
serial_reset) and be done with it.  The memory-mapped UART was already
doing that, but not the ISA and PCI variants.

Regarding the delta bits, be consistent with what serial_reset does when
the serial port is not backed by a physical serial port, and always clear
them at reset time.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-19 10:50:07 +02:00
Paolo Bonzini
4df7961faa serial: reset state at startup
When a serial port is started, its initial state is all zero.  Make
it consistent with reset state instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-19 10:50:07 +02:00
Peter Maydell
10e11f4d2b Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio, misc bugfixes

A bunch of bugfixes - some of these will make sense for 2.1.2
I put Cc: qemu-stable included where appropriate.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 18 Sep 2014 19:52:18 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pc: leave more space for BIOS allocations
  virtio-pci: fix migration for pci bus master
  vhost-user: fix VIRTIO_NET_F_MRG_RXBUF negotiation
  virtio-pci: enable bus master for old guests
  Revert "virtio: don't call device on !vm_running"
  virtio-net: drop assert on vm stop
  Revert "rng-egd: remove redundant free"
  qdev: Move global validation to a single function
  qdev: Rename qdev_prop_check_global() to qdev_prop_check_globals()
  test-qdev-global-props: Test handling of hotpluggable and non-device types
  test-qdev-global-props: Initialize not_used=true for all props
  test-qdev-global-props: Run tests on subprocess
  tests: disable global props test for old glib
  test-qdev-global-props: Trivial comment fix
  hw/machine: Free old values of string properties

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-18 20:02:01 +01:00
Michael S. Tsirkin
438f92ee9f pc: leave more space for BIOS allocations
Since QEMU 2.1, we are allocating more space for ACPI tables, so no
space is left after initrd for the BIOS to allocate memory.

Besides ACPI tables, there are a few other uses of high memory in
SeaBIOS: SMBIOS tables and USB drivers use it in particular.  These uses
allocate a very small amount of memory.  Malloc metadata also lives
there.  So we need _some_ extra padding there to avoid initrd breakage,
but not much.

John Snow found a case where RHEL5 was broken by the recent change to
ACPI_TABLE_SIZE; in his case 4KB of extra padding are fine, but just to
be safe I am adding 32KB, which is roughly the same amount of padding
that was left by QEMU 2.0 and earlier.

Move initrd to leave some space for the BIOS.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
4d43d3f3c8 virtio-pci: fix migration for pci bus master
Current support for bus master (clearing OK bit)
together with the need to support guests which do not
enable PCI bus mastering, leads to extra state in
VIRTIO_PCI_FLAG_BUS_MASTER_BUG bit, which isn't robust
in case of cross-version migration for the case when
guests use the device before setting DRIVER_OK.

Rip out VIRTIO_PCI_FLAG_BUS_MASTER_BUG and implement a simpler
work-around: treat clearing of PCI_COMMAND as a virtio reset.  Old
guests never touch this bit so they will work.

As reset clears device status, DRIVER and MASTER bits are
now in sync, so we can fix up cross-version migration simply
by synchronising them, without need to detect a buggy guest
explicitly.

Drop tracking VIRTIO_PCI_FLAG_BUS_MASTER_BUG completely.

As reset makes the device quiescent, in the future we'll be able to drop
checking OK bit in a bunch of places.

Cc: Jason Wang <jasowang@redhat.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Damjan Marion
d8e80ae37a vhost-user: fix VIRTIO_NET_F_MRG_RXBUF negotiation
Header length check should happen only if backend is kernel. For user
backend there is no reason to reset this bit.

vhost-user code does not define .has_vnet_hdr_len so
VIRTIO_NET_F_MRG_RXBUF cannot be negotiated even if both sides
support it.

Signed-off-by: Damjan Marion <damarion@cisco.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
e43c0b2ea5 virtio-pci: enable bus master for old guests
commit cc943c36fa
    pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.

Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (guests use some devices before DRIVER_OK).

Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
9e8e8c4865 Revert "virtio: don't call device on !vm_running"
This reverts commit a1bc7b827e422e1ff065640d8ec5347c4aadfcd8.
    virtio: don't call device on !vm_running
It turns out that virtio net assumes that vm_running
is updated before device status callback in many places,
so this change leads to asserts.
Previous commit fixes the root issue that motivated
a1bc7b827e422e1ff065640d8ec5347c4aadfcd8 differently,
so there's no longer a need for this change.

In the future, we might be able to drop checking vm_running
completely, and check vm state directly.

Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
131c5221fe virtio-net: drop assert on vm stop
On vm stop, vm_running state set to stopped
before device is notified, so callbacks can get envoked with
vm_running = false; and this is not an error.

Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
abb4d5f2e2 Revert "rng-egd: remove redundant free"
This reverts commit 5e490b6a50.

Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
b3ce84fea4 qdev: Move global validation to a single function
Currently GlobalProperty.not_used=false has multiple meanings:

* It may be a property for a hotpluggable device, which may or may not
  have been used by a device;
* It may be a machine-type-provided property, which may or may not have
  been used by a device.
* It may be a user-provided property that was actually not used by
  any device.

Simplify the logic by having two separate fields: 'user_provided' and
'used'. This allows the entire global property validation logic to be
contained in a single function, and allows more specific error messages.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
d828c430eb qdev: Rename qdev_prop_check_global() to qdev_prop_check_globals()
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
08ac80cd61 test-qdev-global-props: Test handling of hotpluggable and non-device types
Ensure no warning will be printed for hotpluggable types, and warnings
will be printed for non-device types.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
45de81735b test-qdev-global-props: Initialize not_used=true for all props
This will ensure we are actually testing the code which sets
not_used=false when the property is used.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
2177801a48 test-qdev-global-props: Run tests on subprocess
There are multiple reasons for running the global property tests on a
subprocess:

* We need the global_props lists to be empty for each test case, so
  global properties from the previous test won't affect the next one;
* We don't want the qdev_prop_check_global() warnings to pollute test
  output;
* With a subprocess, we can ensure qdev_prop_check_global() is printing
  the warning messages it should.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
9d41401b90 tests: disable global props test for old glib
follow-up patch moves global property tests to subprocesses.
Unfortunately with old glib this causes:

tests/test-qdev-global-props.c: In function
‘test_static_prop’:
tests/test-qdev-global-props.c:80:5: error: implicit
declaration of function ‘g_test_trap_subprocess’
[-Werror=implicit-function-declaration]
tests/test-qdev-global-props.c:80:5: error: nested extern
declaration of ‘g_test_trap_subprocess’ [-Werror=nested-externs]

This function was only added in glib 2.38, and our
minimum version is 2.12.

To fix, disable the test for glib < 2.38.

Apply before that patch to avoid breaking bisect.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:50:30 +03:00
Peter Maydell
bb26a1e80b Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140918-1' into staging
vnc: set TCP_NODELAY, cleanup in tlc code

# gpg: Signature made Thu 18 Sep 2014 07:02:37 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140918-1:
  vnc-tls: Clean up dead store in vnc_set_x509_credential()
  ui/vnc: set TCP_NODELAY

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-18 17:00:38 +01:00
Pavel Dovgalyuk
5bde14078d target-i386: update fp status fix
This patch introduces cpu_set_fpuc() function, which changes fpuc field
of the CPU state and calls update_fp_status() function.
These calls update status of softfloat library and prevent bugs caused
by non-coherent rounding settings of the FPU and softfloat.

v2 changes:
 * Added missed calls and intoduced setter function (as suggested by TeLeMan)

Reviewed-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2014-09-18 17:06:12 +02:00
Markus Armbruster
9d64fab422 vnc-tls: Clean up dead store in vnc_set_x509_credential()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-18 08:01:53 +02:00
Peter Lieven
86152436eb ui/vnc: set TCP_NODELAY
we currently have the Nagle algorithm enabled for all outgoing VNC updates.
This may delay sensitive updates as mouse movements or typing in the console.
As we currently prepare all data in a buffer and then send as much as we can
disabling the Nagle algorithm should not cause big trouble. Well established
VNC servers like TightVNC set TCP_NODELAY as well.
A regular framebuffer update request generates exactly one framebuffer update
which should be pushed out as fast as possible.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-17 15:14:41 +02:00
Peter Maydell
e4d50d47a9 qemu-char: Rename register_char_driver_qapi() to register_char_driver()
Now we have removed the legacy register_char_driver() we can
rename register_char_driver_qapi() to the more obvious and
shorter name.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-6-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
a61ae7f88c qemu-char: Remove register_char_driver() machinery
Now that all the char backends have been converted to the QAPI
framework we can remove the machinery for handling old style
backends.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-5-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
90a14bfe52 qemu-char: Convert udp backend to QAPI
Convert the udp char backend to the new style QAPI framework.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-4-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
8287fea321 util/qemu-sockets.c: Support specifying IPv4 or IPv6 in socket_dgram()
Currently you can specify whether you want a UDP chardev backend
to be IPv4 or IPv6 using the ipv4 or ipv6 options if you use the
QemuOpts parsing code in inet_dgram_opts(). However the QMP struct
parsing code in socket_dgram() doesn't provide this flexibility
(which in turn prevents us from converting the UDP backend handling
to the new style QAPI framework).

Use the existing inet_addr_to_opts() function to convert the
remote->inet address to option strings; this handles ipv4 and
ipv6 flags as well as host and port. (It will also convert any
'to' specification, which is harmless as it is ignored in this
context.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-3-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
dafd325dbb qemu-char: Convert socket backend to QAPI
Convert the socket char backend to the new style QAPI framework;
this allows it to return an Error ** to callers who might not
want it to print directly about socket failures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-2-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
8af47027eb Merge remote-tracking branch 'remotes/kraxel/tags/pull-sdl-20140916-1' into staging
Two minor sdl2 fixes.

# gpg: Signature made Tue 16 Sep 2014 07:20:37 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-sdl-20140916-1:
  sdl2: keymap fixups
  sdl2: drop sdl_zoom.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-16 18:29:40 +01:00
Peter Maydell
5f3fb5a2e2 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140916-2' into staging
spice: call qemu_spice_set_passwd() during init

# gpg: Signature made Tue 16 Sep 2014 07:11:22 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140916-2:
  spice: call qemu_spice_set_passwd() during init

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-16 18:28:31 +01:00
Philipp Hahn
7dbb4c49bf hw/dma/i8257: Silence phony error message
Convert into trace event. Otherwise the message
	dma: unregistered DMA channel used nchan=0 dma_pos=0 dma_len=1
gets printed every time and fills up the log-file with 50 MiB / minute.

Signed-off-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 12:35:02 +02:00
Alexander Graf
9a48bcd1b8 kvmclock: Ensure time in migration never goes backward
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.

To make sure we never go backwards, calculate what the guest would have seen as time at the point of migration and use that value instead of the kernel returned one when it's more recent.
This bases the view of the kvmclock after migration on the
same foundation in host as well as guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 11:11:38 +02:00
Marcelo Tosatti
317b0a6d8b kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Ensure proper env->tsc value for kvmclock_current_nsec calculation.

Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Analyzed-by: Marcin Gibuła <m.gibula@beyond.pl>
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 11:11:24 +02:00
Marcelo Tosatti
de9d61e83d Introduce cpu_clean_all_dirty
Introduce cpu_clean_all_dirty, to force subsequent cpu_synchronize_all_states
to read in-kernel register state.

Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 11:04:09 +02:00
ChenLiang
be894f51b6 pit: fix pit interrupt can't inject into vm after migration
kvm_pit is running in kmod. kvm_pit is going to inject
interrupt to vm before cpu_synchronize_all_post_init at
dest side. vcpu will lose the pit interrupt, but
ack_irq(in kmod) has been 0. ack_irq become 1 after
vcpu responds pit interrupt. pit interruptcan inject
to vm when ack_irq is 1.

By the way, kvm_pit_vm_state_change has save and load
state of pit, so pre_save and post_load is unnecessary.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 10:40:38 +02:00
Marc-André Lureau
07d49a53b6 spice: call qemu_spice_set_passwd() during init
Don't call SPICE API directly to set password given in command line, but
use the internal API, saving password for later calls.

This solves losing password when changing expiration in qemu monitor.

https://bugzilla.redhat.com/show_bug.cgi?id=1138639

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-16 08:09:03 +02:00
Gerd Hoffmann
0d61f7dcc6 sdl2: keymap fixups
Make a few keys works correctly in SDL2.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-16 08:07:05 +02:00
Gerd Hoffmann
4f36e42ee9 sdl2: drop sdl_zoom.h
It isn't used.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-16 08:07:05 +02:00
Peter Maydell
cc35a44cf7 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  exec: file_ram_alloc(): print error when prealloc fails
  monitor: fix debug print compiling error

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-15 19:44:34 +01:00
Peter Maydell
f2bcdc8de0 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 12 Sep 2014 16:09:43 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (22 commits)
  qcow2: Add falloc and full preallocation option
  raw-posix: Add falloc and full preallocation option
  qapi: introduce PreallocMode and new PreallocModes full and falloc.
  block: don't convert file size to sector size
  block: round up file size to nearest sector
  iotests: Send the correct fd in socket_scm_helper
  blockdev: Refuse to drive_del something added with blockdev-add
  block: extend BLOCK_IO_ERROR with reason string
  dataplane: fix virtio_blk_data_plane_create() op blocker error path
  qemu-iotests: Run 025 for Archipelago block driver
  block/archipelago: Implement bdrv_truncate()
  block: Make the block accounting functions operate on BlockAcctStats
  block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
  block: Extract the block accounting code
  block: Extract the BlockAcctStats structure
  IDE: MMIO IDE device control should be little endian
  thread-pool: Drop unnecessary includes
  xen: Drop redundant bdrv_close() from pci_piix3_xen_ide_unplug()
  xen_disk: Plug memory leak on error path
  qemu-io: Clean up openfile() after commit 2e40134
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-15 17:35:22 +01:00
Peter Maydell
16ab5046c4 Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20140915-1' into staging
Fix pixman build failure.

# gpg: Signature made Mon 15 Sep 2014 07:15:11 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20140915-1:
  configure: check for pixman-1 version
  pixman: update internal copy to pixman-0.32.6

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-15 16:42:28 +01:00
Hu Tao
236f282c1c configure: check for pixman-1 version
commit a93a3af9 introduces use of PIXMAN_TYPE_RGBA, but it's only available
in pixman >= 0.21.8. If pixman doesn't meet the version requirement, qemu
will fail to build with following message:

qemu/ui/qemu-pixman.c: In function ‘qemu_pixelformat_from_pixman’:
qemu/ui/qemu-pixman.c:42: error: ‘PIXMAN_TYPE_RGBA’ undeclared (first use in this function)
qemu/ui/qemu-pixman.c:42: error: (Each undeclared identifier is reported only once
qemu/ui/qemu-pixman.c:42: error: for each function it appears in.)

This patch fixes the problem by checking the pixman version.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-15 08:14:19 +02:00
Hu Tao
122abbe5fc pixman: update internal copy to pixman-0.32.6
commit a93a3af9 introduces use of PIXMAN_TYPE_RGBA, but it's only available
in pixman >= 0.21.8. Although commit f27b2e1d bumped pixman to pixman-0.28.2,
but the change was reverted later by 7b1b5d19.

This patch updates internal copy of pixman to pixman-0.32.6 to fix the
problem.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-15 08:14:19 +02:00
Eduardo Habkost
70e98b230f test-qdev-global-props: Trivial comment fix
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-14 21:32:16 +03:00
Eduardo Habkost
556068eed0 hw/machine: Free old values of string properties
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
2014-09-14 21:32:16 +03:00
Peter Maydell
2b31cd4e08 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- Memory: improve error reporting and avoid crashes on hotplug
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.

# gpg: Signature made Thu 11 Sep 2014 11:21:10 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream: (21 commits)
  gdbstub: init mon_chr through qemu_chr_alloc
  pckbd: adding new fields to vmstate
  mc146818rtc: add missed field to vmstate
  piix: do not set irq while loading vmstate
  serial: fixing vmstate for save/restore
  parallel: adding vmstate for save/restore
  fdc: adding vmstate for save/restore
  cpu: init vmstate for ticks and clock offset
  apic_common: vapic_paddr synchronization fix
  vl: use QLIST_FOREACH_SAFE to visit change state handlers
  exec: add parameter errp to gethugepagesize
  exec: report error when memory < hpagesize
  hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
  memory: add parameter errp to memory_region_init_rom_device
  memory: add parameter errp to memory_region_init_ram
  exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
  rules.mak: Fix DSO build by pulling in archive symbols
  util: Don't link host-utils.o if it's empty
  util: Move general qemu_getauxval to util/getauxval.c
  trace: Only link generated-tracers.o with "simple" backend
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 16:55:49 +01:00
Luiz Capitulino
e4d9df4fb1 exec: file_ram_alloc(): print error when prealloc fails
If memory allocation fails when using the -mem-prealloc command-line
option, QEMU exits without printing any error information to
the user:

 # qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
 # echo $?
 1

This commit adds an error message, so that we print instead:

 # qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
 qemu: unable to map backing store for hugepages: Cannot allocate memory

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-09-12 11:22:21 -04:00
Gonglei
5fb9b5b9cb monitor: fix debug print compiling error
error: 'i' undeclared (first use in this function)

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-12 11:01:50 -04:00
Peter Maydell
4c24f40040 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140912' into staging
target-arm:
 * add "linux,stdout-path" to the virt DTB
 * fix a long standing bug with IRQ disabling on Cortex-M CPUs
 * implement input interrupt logic in the PL061
 * fix failure to load correct SP/PC on reset of Cortex-M CPUs
   if the vector table is not in a ROM-blob-in-RAM
 * provide flash devices for boot ROMs in the virt board
 * implement architectural watchpoints
 * fix misimplementation of Inner Shareable TLB operations that
   caused instability of guests in TCG SMP configurations
 * configure PL011 and PL031 in the virt board correctly with
   level-triggered interrupts rather than edge-triggered
 * support providing a device tree blob to ROM (firmware)
   images as well as to kernels

# gpg: Signature made Fri 12 Sep 2014 14:19:08 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140912: (23 commits)
  hw/arm/boot: enable DTB support when booting ELF images
  hw/arm/boot: load device tree to base of DRAM if no -kernel option was passed
  hw/arm/boot: pass an address limit to and return size from load_dtb()
  hw/arm/boot: load DTB as a ROM image
  hw/arm/virt: fix pl011 and pl031 irq flags
  target-arm: Make *IS TLB maintenance ops affect all CPUs
  target-arm: Push legacy wildcard TLB ops back into v6
  target-arm: Implement minimal DBGVCR, OSDLR_EL1, MDCCSR_EL0
  target-arm: Remove comment about MDSCR_EL1 being dummy implementation
  target-arm: Set DBGDSCR.MOE for debug exceptions taken to AArch32
  target-arm: Implement handling of fired watchpoints
  target-arm: Move extended_addresses_enabled() to internals.h
  target-arm: Implement setting of watchpoints
  cpu-exec: Make debug_excp_handler a QOM CPU method
  exec.c: Record watchpoint fault address and direction
  exec.c: Provide full set of dummy wp remove functions in user-mode
  exec.c: Relax restrictions on watchpoint length and alignment
  hw/arm/virt: Provide flash devices for boot ROMs
  target-arm: Fix broken indentation in arm_cpu_reest()
  target-arm: Fix resetting issues on ARMv7-M CPUs
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 15:12:26 +01:00
Hu Tao
0e4271b711 qcow2: Add falloc and full preallocation option
preallocation=falloc allocates disk space by posix_fallocate(),
preallocation=full allocates disk space by writing zeros to disk.
Both modes imply preallocation=metadata.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
06247428be raw-posix: Add falloc and full preallocation option
This patch adds a new option preallocation for raw format, and implements
falloc and full preallocation.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
ffeaac9b4e qapi: introduce PreallocMode and new PreallocModes full and falloc.
This patch prepares for the subsequent patches.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
180e95265e block: don't convert file size to sector size
and avoid converting it back later.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
c2eb918e32 block: round up file size to nearest sector
Currently the file size requested by user is rounded down to nearest
sector, causing the actual file size could be a bit less than the size
user requested. Since some formats (like qcow2) record virtual disk
size in bytes, this can make the last few bytes cannot be accessed.

This patch fixes it by rounding up file size to nearest sector so that
the actual file size is no less than the requested file size.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Ard Biesheuvel
92df845070 hw/arm/boot: enable DTB support when booting ELF images
Add support for loading DTB images when booting ELF images using
-kernel. If there are no conflicts with the placement of the ELF
segments, the DTB image is loaded at the base of RAM.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-5-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Ard Biesheuvel
69e7f76f6a hw/arm/boot: load device tree to base of DRAM if no -kernel option was passed
If we are running the 'virt' machine, we may have a device tree blob but no
kernel to supply it to if no -kernel option was passed. In that case, copy it
to the base of RAM where it can be picked up by a bootloader.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-4-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Ard Biesheuvel
fee8ea12eb hw/arm/boot: pass an address limit to and return size from load_dtb()
Add an address limit input parameter to load_dtb() so that we can
tell load_dtb() how much memory the dtb is allowed to consume. If
the dtb doesn't fit, return 0, otherwise return the actual size of
the loaded dtb.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-3-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Ard Biesheuvel
4c4bf65474 hw/arm/boot: load DTB as a ROM image
In order to make the device tree blob (DTB) available in memory not only at
first boot, but also after system reset, use rom_blob_add_fixed() to install
it into memory.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Peter Maydell
0be969a2d9 hw/arm/virt: fix pl011 and pl031 irq flags
The pl011 and pl031 devices both use level triggered interrupts,
but the device tree we construct was incorrectly telling the
kernel to configure the GIC to treat them as edge triggered.
This meant that output from the pl011 would hang after a while.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410274423-9461-1-git-send-email-peter.maydell@linaro.org
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: qemu-stable@nongnu.org
2014-09-12 14:06:50 +01:00
Peter Maydell
fa439fc5d7 target-arm: Make *IS TLB maintenance ops affect all CPUs
The ARM architecture defines that the "IS" variants of TLB
maintenance operations must affect all TLBs in the Inner Shareable
domain, which for us means all CPUs. We were incorrectly implementing
these to only affect the current CPU, which meant that SMP TCG
operation was unstable.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410274883-9578-3-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
2014-09-12 14:06:50 +01:00
Peter Maydell
995939a650 target-arm: Push legacy wildcard TLB ops back into v6
When we implemented ARMv8 in QEMU we retained our legacy loose
wildcarded decoding of the TLB maintenance operations for v7
and earlier CPUs and provided the correct stricter decode for
v8. However the loose decode is in fact wrong for v7MP, because
it doesn't correctly implement the operations which must apply
to every CPU in the Inner Shareable domain.

Move the legacy wildcarding from the not_v8 reginfo array
into the not_v7 array, and move the strictly decoded operations
from the v8 reginfo to v7 or v7mp arrays as appropriate.

Cache and TLB lockdown legacy wildcarding remains in the
not_v8 array for the moment.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410274883-9578-2-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
2014-09-12 14:06:50 +01:00
Peter Maydell
5e8b12ffbb target-arm: Implement minimal DBGVCR, OSDLR_EL1, MDCCSR_EL0
Implement debug registers DBGVCR, OSDLR_EL1 and MDCCSR_EL0
(as dummy or limited-functionality). 32 bit Linux kernels will
access these at startup so they are required for breakpoints
and watchpoints to be supported.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Peter Maydell
17a9eb53a9 target-arm: Remove comment about MDSCR_EL1 being dummy implementation
MDSCR_EL1 has actual functionality now; remove the out of date
comment that claims it is a dummy implementation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
16a906fd6e target-arm: Set DBGDSCR.MOE for debug exceptions taken to AArch32
For debug exceptions taken to AArch32 we have to set the
DBGDSCR.MOE (Method Of Entry) bits; we can identify the
kind of debug exception from the information in
exception.syndrome.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
3ff6fc9148 target-arm: Implement handling of fired watchpoints
Implement the ARM debug exception handler for dealing with
fired watchpoints.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
73c5211ba9 target-arm: Move extended_addresses_enabled() to internals.h
Move the utility function extended_addresses_enabled() into
internals.h; we're going to need to call it from op_helper.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
9ee98ce810 target-arm: Implement setting of watchpoints
Implement support for setting QEMU watchpoints based on the
values the guest writes to the ARM architected watchpoint
registers. (We do not yet report the firing of the watchpoints
to the guest, so they will just be ignored.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
86025ee443 cpu-exec: Make debug_excp_handler a QOM CPU method
Make the debug_excp_handler target specific hook into a QOM
CPU method.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
Peter Maydell
08225676b2 exec.c: Record watchpoint fault address and direction
When we check whether we've hit a watchpoint we know the address
that we were attempting to access and whether it was a read or a
write. Record this information in the CPUWatchpoint struct so that
target-specific code can report it to the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-09-12 14:06:48 +01:00
Peter Maydell
3ee887e8ff exec.c: Provide full set of dummy wp remove functions in user-mode
We already provide dummy versions of the cpu_watchpoint_insert
and cpu_watchpoint_remove_all functions when CONFIG_USER_ONLY
is defined. Complete the set by providing cpu_watchpoint_remove
and cpu_watchpoint_remove_by_ref as well.

This allows target-* code using these functions to avoid
some ifdeffery.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-09-12 14:06:48 +01:00
Peter Maydell
05068c0dfb exec.c: Relax restrictions on watchpoint length and alignment
The current implementation of watchpoints requires that they
have a power of 2 length which is not greater than TARGET_PAGE_SIZE
and that their address is a multiple of their length. Watchpoints
on ARM don't fit these restrictions, so change the implementation
so they can be relaxed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-09-12 14:06:48 +01:00
Peter Maydell
acf82361c6 hw/arm/virt: Provide flash devices for boot ROMs
Add two flash devices to the virt board, so that it can be used for
running guests which want a bootrom image such as UEFI. We provide
two flash devices to make it more convenient to provide both a
read-only UEFI image and a read-write place to store guest-set
UEFI config variables. The '-bios' command line option is set up
to provide an image for the first of the two flash devices.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1409930126-28449-2-git-send-email-ard.biesheuvel@linaro.org
2014-09-12 14:06:48 +01:00
Martin Galvan
34bf774485 target-arm: Fix broken indentation in arm_cpu_reest()
Fix a single misindented line in arm_cpu_reset().

Signed-off-by: Martin Galvan <martin.galvan@tallertechnologies.com>
[PMM: split this out from the previous commit]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
Martin Galvan
6e3cf5df01 target-arm: Fix resetting issues on ARMv7-M CPUs
When calling qemu_system_reset after startup on a Cortex-M
CPU, the initial values of PC, MSP and the Thumb bit weren't being set
correctly if the vector table was in ROM. In particular, since Thumb was 0, a
Usage Fault would arise immediately after trying to execute any instruction
on a Cortex-M.

Signed-off-by: Martin Galvan <martin.galvan@tallertechnologies.com>
Message-id: CAOKbPbaLt-LJsAKkQdOE0cs9Xx4OWrUfpDhATXPSdtuNw2xu_A@mail.gmail.com
[PMM: removed an incorrect comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
Colin Leitner
bfb27e6042 pl061: implement input interrupt logic
This patch adds the missing input interrupt logic to the pl061 GPIO device. To
keep the floating output pins to stay high, the old state variable had to be
split into two separate ones for input and output - which brings the vmstate
version to 3.

Edge level interrupts and I/O were tested under Linux 3.14. Level interrupt
handling hasn't been tested.

Signed-off-by: Colin Leitner <colin.leitner@googlemail.com>
Message-id: 54024FD2.9080204@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
David Hoover
c3c8d6b3dd cpu-exec.c: Allow disabling of IRQs on ARM Cortex-M CPUs
Correct an error in the logic for deciding whether we can
take an IRQ interrupt which meant that on M profile cores
it was never possible to disable them.

The design here is still bogus in that M profile doesn't
have separate "IRQ" and "FIQ", which are an A/R profile
concept; we should ideally implement the proper priority
based scheme.

Signed-off-by: David Hoover <spm@boiteauxlettres.sent.at>
[PMM: Wrote a proper commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:47 +01:00
Ard Biesheuvel
f022b8e953 hw/arm/virt: add linux, stdout-path to /chosen DT node
Add a property "linux,stdout-path" to the /chosen DT node and make
it point to the emulated UART. This allows users such as the Linux
kernel to produce console output without the need to pass console=
or earlycon=pl011,0x... command line arguments.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1409317439-29349-1-git-send-email-ard.biesheuvel@linaro.org
Reviewed-by: Rob Herring <rob.herring@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:47 +01:00
Marc Marí
6cd14054b6 libqos virtio: Increase ISR timeout
Increase the clock step to avoid Travis failure in some builds due to
overagressive timeout.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Message-id: 1410428416-5046-1-git-send-email-marc.mari.barcelo@gmail.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 13:58:07 +01:00
Stratos Psomadakis
be2bfb9dbd iotests: Send the correct fd in socket_scm_helper
Make sure to pass the correct fd via SCM_RIGHTS in socket_scm_helper.c
(i.e. fd_to_send, not socket-fd).

Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 10:27:54 +02:00
Markus Armbruster
48f364dd0b blockdev: Refuse to drive_del something added with blockdev-add
For some device models, the guest can prevent unplug.  Some users need a
way to forcibly revoke device model access to the block backend then, so
the underlying images can be safely used for something else.

drive_del lets you do that.  Unfortunately, it conflates revoking access
with destroying the backend.

Commit 9063f81 made drive_del immediately destroy the root BDS.  Nice:
the device name becomes available for reuse immediately.  Not so nice:
the device model's pointer to the root BDS dangles, and we're prone to
crash when the memory gets reused.

Commit d22b2f4 fixed that by hiding the root BDS instead of destroying
it.  Destruction only happens on unplug.  "Hiding" means removing it
from bdrv_states and graph_bdrv_states; see bdrv_make_anon().

This "destroy on revoke" is a misfeature we don't want to carry
forward to blockdev-add, just like "destroy on unplug" (commit
2d246f0).  So make drive_del fail on anything added with blockdev-add.

We'll add separate QMP commands to revoke device model access and to
destroy backends.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-11 17:14:24 +02:00
Luiz Capitulino
624ff5736e block: extend BLOCK_IO_ERROR with reason string
BLOCK_IO_ERROR events are logged by libvirt, which helps with
post mortem analysis of guests. However, one information that
we miss today is a human readable string describing the cause
of the I/O error.

This commit adds that string it to BLOCK_IO_ERROR. Note that
this string is a debugging aid for humans, meaning that it
should not parsed by applications.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-11 17:14:13 +02:00
Stefan Hajnoczi
745a9bb9cd dataplane: fix virtio_blk_data_plane_create() op blocker error path
Commit 3718d8ab65 ("block: Replace in_use
with operation blocker") broke the error path because it consumed
local_err instead of propagating it.

The caller has no way to know that the function failed.  This caused
virtio-blk to start "successfully" even though there was a fatal
dataplane error.

Steps to reproduce:

  $ qemu-system-x86_64 -enable-kvm -object iothread,id=iothread0 \
                       -drive if=none,id=drive0,file=a.img \
  (qemu) drive_mirror drive0 /tmp/foo.img
  (qemu) device_add virtio-blk-pci,iothread=iothread0,drive=drive0

Expected result:

  Since the mirror block job is using drive0 it is not possible to start
  virtio-blk data-plane.

  device_add fails and the PCI adapter is not added.

Actual result:

  device_add completes and the PCI adapter is added.

Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-11 16:21:46 +02:00
Peter Maydell
0dfa7e3012 Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20140905-2' into staging
console: pixman switchover continued, add some infrastructure to make it
         easier using pixman in display device emulation.

# gpg: Signature made Fri 05 Sep 2014 14:38:57 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20140905-2:
  console: Remove unused QEMU_BIG_ENDIAN_FLAG
  console: add qemu_pixman_linebuf_copy
  console: add dpy_gfx_update_dirty
  console: add qemu_create_displaysurface_guestmem
  console: stop using PixelFormat
  console: reimplement qemu_default_pixelformat
  console: add qemu_default_pixman_format
  console: add qemu_pixelformat_from_pixman

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-11 11:44:17 +01:00
Pavel Dovgalyuk
462efe9e53 gdbstub: init mon_chr through qemu_chr_alloc
This patch initializes monitor for gdbstub with the qemu_chr_alloc function
instead of just allocating the memory. Initialization function call
is required, because it also creates chr_write_lock mutex, which is used
when writing to this character device.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:33 +02:00
Pavel Dovgalyuk
a28fe7e3f6 pckbd: adding new fields to vmstate
This patch adds outport to VMState to allow correct saving and restoring
the state of PC keyboard controller.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
0b102153e0 mc146818rtc: add missed field to vmstate
This patch adds irq_reinject_on_ack_count field to VMState to allow correct
saving/loading the state of MC146818 RTC.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
2c9ecdeb9f piix: do not set irq while loading vmstate
This patch avoids setting an irq while loading the state of the ISA bridge.
Because the i8259 has not been deserialized yet, raising an interrupt
could bring the system out-of-sync with the migration source.  For example,
the migration source could have masked the interrupt in the i8259. On the
destination, the i8259 device model would not know that yet and would
trigger an interrupt in the CPU.

This patch eliminates setting the irq and just restores the calculated
state fields in post_load function.  Interrupt state will be deserialized
separately through the IRR field of the i8259.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
7385b275d9 serial: fixing vmstate for save/restore
Some fields were added to VMState by this patch to preserve correct
loading of the serial port controller state.
Updating FCR value while loading was also modified to disable generating
an interrupt by loadvm.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
461a2753a1 parallel: adding vmstate for save/restore
VMState added by this patch preserves correct
loading of the parallel port controller state.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
c0b92f3037 fdc: adding vmstate for save/restore
VMState added by this patch preserves correct
loading of the FDC device state.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
4603ea0105 cpu: init vmstate for ticks and clock offset
Ticks and clock offset used by CPU timers have to be saved in vmstate.
But vmstate for these fields registered only in icount mode.
Missing registration leads to breaking the continuity when vmstate is loaded.
This patch introduces new initialization function which fixes this.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
a6dead43e6 apic_common: vapic_paddr synchronization fix
This patch postpones vapic_paddr initialization, which is performed
during migration. When vapic_paddr is synchronized within the migration
process, apic_common functions could operate with incorrect apic state,
if it hadn't loaded yet. This patch postpones the synchronization until
the virtual machine is started, ensuring that the whole virtual machine
state has been loaded.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Tested-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:25 +02:00
Peter Maydell
fc3b9aa876 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140910-1' into staging
xhci PCIe endpoint migration compatibility fix

# gpg: Signature made Wed 10 Sep 2014 06:35:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140910-1:
  xhci PCIe endpoint migration compatibility fix

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-11 10:36:50 +01:00
Paolo Bonzini
9b10ac869d vl: use QLIST_FOREACH_SAFE to visit change state handlers
This lets a handler delete itself.

Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-10 12:08:33 +02:00
Chrysostomos Nanakos
466c80f21f qemu-iotests: Run 025 for Archipelago block driver
Run resize grow test to ensure that existing data
is not lost during grow and new space is zeroed.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Chrysostomos Nanakos
94c80a438c block/archipelago: Implement bdrv_truncate()
Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
5366d0c8bc block: Make the block accounting functions operate on BlockAcctStats
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.

Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Michael Tokarev <mjt@tls.msk.ru>
CC: John Snow <jsnow@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>
CC: Max Reitz <mreitz@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
28298fd3d9 block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
The middle term goal is to move the BlockAcctStats structure in the device models.
(Capturing I/O accounting statistics in the device models is good for billing)
This patch make a small step in this direction by removing a reference to BDRV.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>i

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
5e5a94b605 block: Extract the block accounting code
The plan is to add new accounting metrics (latency, invalid requests, failed
requests, queue depth) and block.c is overpopulated so it will be better to work
in a separate module.

Moreover the long term plan is to have statistics in each of the BDS of the graph
for metrology purpose; this means that the device model statistics must move from
the topmost BDS to the device model.

So we need to decouple the statistic code from BlockDriverState.

This is another argument for the extraction of the code in a separate module.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Benoit Canet <benoit@irqsave.net>
CC: Fam Zheng <famz@redhat.com>
CC: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
CC: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
0ddd0ad96a block: Extract the BlockAcctStats structure
Extract the block accounting statistics into a structure so the block device
models can hold them in the future.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Valentin Manea
1a7044bb62 IDE: MMIO IDE device control should be little endian
Set the IDE MMIO memory type to little endian. The ATA specs identify
words part of the control commands encoded as little endian.
While this has no impact on little endian systems, it's required for big
endian systems(eg OpenRisc).

Signed-off-by: Valentin Manea <valentin.manea@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
a31e69cf00 thread-pool: Drop unnecessary includes
Dragging block_int.h into a header is *not* nice.  Fortunately, this
is the only offender.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
7ca9b7c035 xen: Drop redundant bdrv_close() from pci_piix3_xen_ide_unplug()
drive_del() closes just fine.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
cedccf1381 xen_disk: Plug memory leak on error path
The Error object was leaked after failed bdrv_new(). While there,
streamline control flow a bit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
dbb651c46c qemu-io: Clean up openfile() after commit 2e40134
Commit 6db9560 split off the growable case so it can use
bdrv_file_open() instead of bdrv_open() then.  Growable BDSes become
anonymous.  Weird.

Commit 2e40134 folded bdrv_file_open() back into bdrv_open() with new
flag BDRV_O_PROTOCOL.  We still have two bdrv_open() calls, and
growable BDSes remain anonymous.

Circle back to before commit 6db9560: just one call, not anonymous.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Xiaodong Gong
0d4cc3e715 Fix improper usage of cpu_to_be32 in vpc
cpu_to_be32() is wrong since vhd_type is an enum constant
(just a regular CPU-endian integer).

Signed-off-by: Xiaodong Gong <gordongong0350@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Luiz Capitulino
c7c2ff0c7e block: extend BLOCK_IO_ERROR event with nospace indicator
Management software, such as RHEV's vdsm, want to be able to allocate
disk space on demand. The basic use case is to start a VM with a small
disk and then the disk is enlarged when QEMU hits a ENOSPC condition.

To this end, the management software has to be notified when QEMU
encounters ENOSPC. The solution implemented by this commit is simple:
it extends the BLOCK_IO_ERROR with a 'nospace' key, which is true
when QEMU is stopped due to ENOSPC.

Note that support for querying this event is already present in
query-block by means of the 'io-status' key. Also, the new 'nospace'
BLOCK_IO_ERROR field shares the same semantics with 'io-status',
which basically means that werror= has to be set to either
'stop' or 'enospc' to enable 'nospace'.

Finally, this commit also updates the 'io-status' key doc in the
schema with a list of supported device models.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Dr. David Alan Gilbert
e6043e92c2 xhci PCIe endpoint migration compatibility fix
Add back the PCIe config capabilities on XHCI cards in non-PCIe slots,
but only for machine types before 2.1.

This fixes a migration incompatibility in the XHCI PCI devices
caused by:
   058fdcf52c - xhci: add endpoint cap on express bus only

Note that in fixing it for compatibility with older QEMUs, it breaks
compatibility with existing QEMU 2.1's on older machine types.

The status before this patch was (if it used an XHCI adapter):
   machine type | source qemu
     any           pre-2.1     - FAIL
     any           2.1...      - PASS

With this patch:
   machine type | source qemu
     any           pre-2.1    - PASS
     pre-2.1       2.1...     - FAIL
     2.1           2.1...     - PASS

A test to trigger it is to add '-device nec-usb-xhci,id=xhci,addr=0x12'
to the command line.

Cc: qemu-stable@nongnu.org
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-10 07:20:53 +02:00
Peter Maydell
10601bef56 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
apb: implement PCI bus error interrupt map registers

# gpg: Signature made Tue 09 Sep 2014 06:09:27 BST using RSA key ID AE0F321F
# gpg: Can't check signature: public key not found

* remotes/mcayland/tags/qemu-sparc-signed:
  apb: implement PCI bus error interrupt map registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-09 15:08:05 +01:00
Hu Tao
fc7a5800ad exec: add parameter errp to gethugepagesize
Add parameter errp to gethugepagesize thus callers can handle errors.

If user adds a memory-backend-file object using object_add command,
specifying a non-existing directory for property mem-path, qemu will
core dump with message:

  /nonexistingdir: No such file or directory
  Bad ram offset fffffffffffff000
  Aborted (core dumped)

This patch fixes the problem. With this patch, qemu reports an error
message like:

  qemu-system-x86_64: -object memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
  failed to get page size of file /nonexistingdir: No such file or directory

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
557529dd60 exec: report error when memory < hpagesize
Report an error when memory < hpagesize in file_ram_alloc() so callers
can handle the error.

If user adds a memory-backend-file object using object_add command,
specifying a size that is less than huge page size, qemu will core dump
with message:

  Bad ram offset fffffffffffff000
  Aborted (core dumped)

This patch fixes the problem. With this patch, qemu reports error
message like:

  qemu-system-x86_64: -object memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M: memory
  size 0x100000 must be equal to or larger than huge page size 0x200000

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
d42e2de7bc hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
When using monitor command object_add to add a memory backend whose
size is way too big to allocate memory for it, qemu just exits. In
the case we'd better give an error message and keep guest running.

The problem can be reproduced as follows:

1. run qemu
2. (monitor)object_add memory-backend-ram,size=100000G,id=ram0

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
33e0eb5297 memory: add parameter errp to memory_region_init_rom_device
Add parameter errp to memory_region_init_rom_device and update all call
sites to propagate the error.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Propagate the error out of realize. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
49946538d2 memory: add parameter errp to memory_region_init_ram
Add parameter errp to memory_region_init_ram and update all call sites
to pass in &error_abort.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:43 +02:00
Hu Tao
ef701d7b6f exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
we can handle errors.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Assert ptr != NULL in memory_region_init_ram_ptr. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:25 +02:00
Fam Zheng
c261d774fb rules.mak: Fix DSO build by pulling in archive symbols
This fixes an issue with module build system. block/iscsi.so is
currently broken:

    $ ~/build/last/qemu-img
    Failed to open module: /home/fam/build/master/block-iscsi.so:
    undefined symbol: qmp_query_uuid
    qemu-img: Not enough arguments
    Try 'qemu-img --help' for more information

To fix this, we should (at least) let qemu-img link qmp_query_uuid from
libqemustub.a. (There are a few other symbols missing, as well.)

This patch changes the linking rules to:

1) Build ".mo" with "ld -r -o $@ $^" for each ".so", and later build .so
   with it.

2) Always build all the .mo before linking the executables. This is
   achieved by adding those .mo files to the executables' "-y"
   variables.

3) When linking an executable, those .mo files in its "-y" variables are
   filtered out, and replaced by one or more -Wl,-u,$symbol flags. This
   is done in the added macro "process-archive-undefs".

   These "-Wl,-u,$symbol" flags will force ld to pull in the function
   definition from the archives when linking.

   Note that the .mo objects, that are actually meant to be linked in
   the executables, are already expanded in unnest-vars, before the
   linking command. So we are safe to simply filter out .mo for the
   purpose of pulling undefined symbols.

   process-archive-undefs works as this: For each ".mo", find all the
   undefined symbols in it, filter ones that are defined in the
   archives. For each of these symbols, generate a "-Wl,-u,$symbol" in
   the link command, and put them before archive names in the command
   line.

Suggested-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Fam Zheng
2ceee4b052 util: Don't link host-utils.o if it's empty
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Fam Zheng
f6e0830298 util: Move general qemu_getauxval to util/getauxval.c
So that we won't have an empty getauxval.o which is disliked by ranlib.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Fam Zheng
ddbc41de38 trace: Only link generated-tracers.o with "simple" backend
In any other cases the object file is effectively empty, which is
disliked by ranlib and nm on Mac OS X.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Paolo Bonzini
a85e130e01 kvm: do not abort if KVM_RUN fails
Just go to the internal error runstate.  This lets you use the "x",
"dump-guest-memory" or "info register" commands.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Mark Cave-Ayland
de739df8e0 apb: implement PCI bus error interrupt map registers
Both OpenBSD and FreeBSD SPARC64 attempt to read the interrupt map from the
hardware and will fail if the correct ino isn't present.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-09-09 06:07:12 +01:00
Peter Maydell
1bc0e40581 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Mon 08 Sep 2014 11:49:31 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (24 commits)
  ide: Add resize callback to ide/core
  IDE: Fill the IDENTIFY request consistently
  vmdk: fix buf leak in vmdk_parse_extents()
  vmdk: fix vmdk_parse_extents() extent_file leaks
  ide: Add wwn support to IDE-ATAPI drive
  qtest/ide: Uninitialize PC allocator
  libqos: add a simple first-fit memory allocator
  MAINTAINERS: update sheepdog maintainer
  qemu-nbd: fix indentation and coding style
  qemu-nbd: add option to set detect-zeroes mode
  rename parse_enum_option to qapi_enum_parse and make it public
  block/archipelago: Use QEMU atomic builtins
  qemu-img: fix rebase src_cache option documentation
  qemu-img: clarify src_cache option documentation
  libqos: Added EVENT_IDX support
  libqos: Added MSI-X support
  libqos: Added test case for configuration changes in virtio-blk test
  libqos: Added indirect descriptor support to virtio implementation
  libqos: Added basic virtqueue support to virtio implementation
  tests: Add virtio device initialization
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-08 13:14:41 +01:00
Peter Maydell
2d6838e86c Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-09-08

Alexander Graf (11):
      PPC: KVM: Fix g3beige and mac99 when HV is loaded
      PPC: mac99: Move NVRAM to page boundary when necessary
      KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
      PPC: KVM: Use vm check_extension for pv hcall
      PPC: mac99: Fix core99 timer frequency
      PPC: mac_nvram: Remove unused functions
      PPC: mac_nvram: Allow 2 and 4 byte accesses
      PPC: mac_nvram: Split NVRAM into OF and OSX parts
      PPC: Mac: Move tbfreq into local variable
      PPC: Cuda: Use cuda timer to expose tbfreq to guest
      PPC: Fix default config ordering and add eTSEC for ppc64

Alexey Kardashevskiy (7):
      spapr: Move DT memory node rendering to a helper
      spapr: Use DT memory node rendering helper for other nodes
      spapr: Refactor spapr_populate_memory() to allow memoryless nodes
      spapr: Split memory nodes to power-of-two blocks
      spapr: Add a helper for node0_size calculation
      spapr: Fix ibm, associativity for memory nodes
      spapr_pci: Fix config space corruption

Anton Blanchard (2):
      spapr-vlan: Don't touch last entry in buffer list
      hypervisor property clashes with hypervisor node

Benjamin Herrenschmidt (2):
      loader: Add load_image_size() to replace load_image()
      spapr: Locate RTAS and device-tree based on real RMA

Bharat Bhushan (4):
      ppc: debug stub: Get trap instruction opcode from KVM
      ppc: synchronize excp_vectors for injecting exception
      ppc: Add software breakpoint support
      ppc: Add hw breakpoint watchpoint support

Gonglei (1):
      spapr: fix possible memory leak

Greg Kurz (1):
      spapr_pci: map the MSI window in each PHB

Nikunj A Dadhania (3):
      ppc: spapr-rtas - implement os-term rtas call
      spapr: add uuid/host details to device tree
      ppc/spapr: Fix MAX_CPUS to 255

Peter Maydell (1):
      hw/ppc/spapr_hcall.c: Fix typo in function names

Tom Musta (20):
      linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
      linux-user: Split PPC Trampoline Encoding from Register Save
      linux-user: Enable Signal Handlers on PPC64
      linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
      linux-user: Implement do_setcontext for PPC64
      linux-user: Handle PPC64 ELFv2 Function Pointers
      target-ppc: Bug Fix: rlwinm
      target-ppc: Bug Fix: rlwnm
      target-ppc: Bug Fix: rlwimi
      target-ppc: Bug Fix: mullwo
      target-ppc: Bug Fix: mullw
      target-ppc: Bug Fix: mulldo OV Detection
      target-ppc: Bug Fix: srawi
      target-ppc: Bug Fix: srad
      target-ppc: Special Case of rlwimi Should Use Deposit
      target-ppc: Optimize rlwinm MB=0 ME=31
      target-ppc: Optimize rlwnm MB=0 ME=31
      target-ppc: Clean Up mullw
      target-ppc: Clean up mullwo
      target-ppc: Implement mulldo with TCG

# gpg: Signature made Mon 08 Sep 2014 11:51:15 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream: (52 commits)
  hypervisor property clashes with hypervisor node
  PPC: Fix default config ordering and add eTSEC for ppc64
  spapr_pci: map the MSI window in each PHB
  target-ppc: Implement mulldo with TCG
  target-ppc: Clean up mullwo
  target-ppc: Clean Up mullw
  target-ppc: Optimize rlwnm MB=0 ME=31
  target-ppc: Optimize rlwinm MB=0 ME=31
  target-ppc: Special Case of rlwimi Should Use Deposit
  spapr-vlan: Don't touch last entry in buffer list
  spapr_pci: Fix config space corruption
  PPC: Cuda: Use cuda timer to expose tbfreq to guest
  PPC: Mac: Move tbfreq into local variable
  PPC: mac_nvram: Split NVRAM into OF and OSX parts
  PPC: mac_nvram: Allow 2 and 4 byte accesses
  PPC: mac_nvram: Remove unused functions
  PPC: mac99: Fix core99 timer frequency
  PPC: KVM: Use vm check_extension for pv hcall
  KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
  target-ppc: Bug Fix: srad
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-08 12:02:07 +01:00
Anton Blanchard
85423d90c7 hypervisor property clashes with hypervisor node
dtc fails on a recent QEMU snapshot:

ERROR (name_properties): "name" property in /hypervisor#1 is incorrect ("hypervisor" instead of base node name)

Looking at the device tree we have a hypervisor property:

# lsprop hypervisor
hypervisor       "kvm"

But we also have a hypervisor node, with a name that doesn't match:

# lsprop hypervisor#1/
name             "hypervisor"
compatible       "linux,kvm"
linux,phandle    7e5eb5d8 (2120136152)

Commit c08ce91d309c (spapr: add uuid/host details to device tree)
looks to have collided with an earlier patch. Remove the hypervisor
property.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:54 +02:00
Alexander Graf
4a761ffa37 PPC: Fix default config ordering and add eTSEC for ppc64
We messed up the ordering in our default configs for PPC. The top entries
are generic entries, then come sections that indicate that features are only
in because of a special feature (such as PReP).

Fix the ordering again and while at it add eTSEC support to the ppc64 target
so that we can spawn eTSEC adapters with qemu-system-ppc64.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:54 +02:00
Greg Kurz
8c46f7ec85 spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
22ffad31d4 target-ppc: Implement mulldo with TCG
Optimize mulldo by using the muls2_i64 operation rather than a helper.  Eliminate
the obsolete helper code.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
269778769d target-ppc: Clean up mullwo
Simplify the implementation of mullwo.  For 64 bit CPUs, the result is
the concatenation of the upper and lower parts of the muls2_i32 operation,
which may be slightly better than deposit.  For 32 bit CPUs, the lower part
of the muls_i32 operation is moved into the target GPR.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
03039e5ef0 target-ppc: Clean Up mullw
Eliminate the unecessary ext32s TCG operation and make the multiplication
operation explicitly 32 bit.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
57fca134bb target-ppc: Optimize rlwnm MB=0 ME=31
Optimize the special case of rlwnm where MB=0 and ME=31.  This can
be implemented using a ROTL.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
8979c2f602 target-ppc: Optimize rlwinm MB=0 ME=31
Optimize the special case of rlwinm where MB=0 and ME=31.  This can
be implemented as a 32-bit ROTL.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
ab92678d0a target-ppc: Special Case of rlwimi Should Use Deposit
The special case of rlwimi where MB <= ME and SH = 31-ME can be implemented
with a single TCG deposit operation.  This replaces the less general case
of SH = MB = 0 and ME = 31.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Anton Blanchard
439ce1401b spapr-vlan: Don't touch last entry in buffer list
The last 8 bytes of the buffer list is defined to contain the number
of dropped frames. At the moment we use it to store rx entries,
which trips up ethtool -S:

rx_no_buffer: 9223380832981355136

Fix this by skipping the last buffer list entry.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexey Kardashevskiy
3242052248 spapr_pci: Fix config space corruption
When disabling MSI/MSIX via "ibm,change-msi" RTAS call, no check was made
if MSI or MSIX is actually supported and the MSI message was reset
unconditionally. If this happened on a device which does not support MSI
(but does support MSIX, otherwise "ibm,change-msi" would not be called),
this device would have PCIDevice::msi_cap field (MSI capability offset)
set to zero and writing a vector would actually clear PCI status.

This clears MSI message only if MSI or MSIX is present on a device.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
b981289c49 PPC: Cuda: Use cuda timer to expose tbfreq to guest
Mac OS X calibrates a number of frequencies on bootup based on reading
tb values on bootup and comparing them to via cuda timer values.

The only variable we can really steer well (thanks to KVM) is the cuda
frequency. So let's use that one to fake Mac OS X into believing the
bus frequency is tbfreq * 4. That way Mac OS X will automatically
calculate the correct timebase frequency.

With this patch and the patch set I posted earlier I can successfully
run Mac OS X 10.2, 10.3 and 10.4 guests with -M mac99 on TCG and KVM.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
caae6c9611 PPC: Mac: Move tbfreq into local variable
We already expose the real CPU's tb frequency to the guest via fw_cfg. Soon
we will need to also expose it to the MacIO, so let's move it to a variable
that we can leverage every time we need the frequency.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
2d9907a333 PPC: mac_nvram: Split NVRAM into OF and OSX parts
Mac OS X (at least with -M mac99) searches for a valid NVRAM partition
of a special Apple type. If it can't find that partition in the first
half of NVRAM, it will look at the second half.

There are a few implications from this. The first is that we need to
split NVRAM into 2 halves - one for Open Firmware use, the other one for
Mac OS X. Without this split Mac OS X will just loop endlessly over the
second half trying to find a partition.

The other implication is that we should provide a specially crafted Mac
OS X compatible NVRAM partition on the second half that Mac OS X can
happily use as it sees fit.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
b19eae18c1 PPC: mac_nvram: Allow 2 and 4 byte accesses
The NVRAM in our Core99 machine really supports 2byte and 4byte accesses
just as well as 1byte accesses. In fact, Mac OS X uses those.

Add support for higher register size granularities.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
a8b0503701 PPC: mac_nvram: Remove unused functions
The macio_nvram_read and macio_nvram_write functions are never called,
just remove them.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
d696760b43 PPC: mac99: Fix core99 timer frequency
There is a special timer in the mac99 machine that we recently started
to emulate. Unfortunately we emulated it in the wrong frequency.

This patch adapts the frequency Mac OS X uses to evaluate results from
this timer, making calculations it bases off of it work.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
6fd33a7502 PPC: KVM: Use vm check_extension for pv hcall
To find out whether we support the KVM hypercall interface we need to ask KVM
on the VM level rather than the global KVM level, because Book3S HV KVM does
not support it and we play conservative when both HV and PR are loaded.

So instead, use the VM helper that falls back to global KVM enumeration. That
should cover all cases.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
7d0a07fa92 KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
We now can call KVM_CHECK_EXTENSION on the kvm fd or on the vm fd, whereas
the vm version is more accurate when it comes to PPC KVM.

Add a helper to make the vm version available that falls back to the non-vm
variant if the vm one is not available yet to stay compatible.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Tom Musta
4bc02e230d target-ppc: Bug Fix: srad
Fix the check for carry in the srad helper to properly construct
the mask -- a "1ULL" must be used (instead of "1") in order to
get the desired result.

Example:

R3 8000000000000000
R4 F3511AD4A2CD4C38
srad 3,3,4

Should *not* set XER[CA] but does without this patch.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Tom Musta
34a0fad102 target-ppc: Bug Fix: srawi
For 64 bit implementations, the special case of a shift by zero
should result in the sign extension of the least significant 32 bits
of the source GPR (not a direct copy of the 64 bit source GPR).

Example:

R3 A6212433228F41DC
srawi 3,3,0
R3 expected : 00000000228F41DC
R3 actual   : A6212433228F41DC (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
9824d01d5d target-ppc: Bug Fix: mulldo OV Detection
Fix the code to properly detect overflow; the 128 bit signed
product must have all zeroes or all ones in the first 65 bits
otherwise OV should be set.

Example:

R3 45F086A5D5887509
R4 0000000000000002
mulldo 3,3,4

Should set XER[OV].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
1fa74845f2 target-ppc: Bug Fix: mullw
For 64-bit implementations, the mullw result is the 64 bit product
of the sign-extended least significant 32 bits of the source
registers.

Fix the code to properly sign extend the source operands and produce
a 64 bit product.

Example:
R3 00000000002F37A0
R4 41C33D242F816715
mullw 3,3,4
R3 expected : 0008C3146AE0F020
R3 actual   : 000000006AE0F020 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
f11ebbf8d4 target-ppc: Bug Fix: mullwo
On 64-bit implementations, the mullwo result is the 64 bit product of
the signed 32 bit operands.  Fix the implementation to properly deposit
the upper 32 bits into the target register.

Example:

R3 0407DED115077586
R4 53778DF3CA992E09
mullwo 3,3,4
R3 expected : FB9D02730D7735B6
R3 actual   : 000000000D7735B6 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
6ea7b35c02 target-ppc: Bug Fix: rlwimi
The rlwimi specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Also fix the special case of MB=31 and ME=0 to copy the entire contents
of the source GPR.

Examples:

R3 FFFFFFFFFFFFFFF0
rlwimi 3,3,29,14,1
R3 expected : 1FFFFFFE3FFFFFFE
R3 actual   : 000000003FFFFFFE (without this patch)

R3 ED7EB4DD824F0853
rlwimi 3,3,10,31,0
R3 expected : 3C214E09024F0853
R3 actual   : 00000000024F0853 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
1c0a150f4b target-ppc: Bug Fix: rlwnm
The rlwnm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Example:

R3 = 0000000000000002
R4 = 7FFFFFFFFFFFFFFF
rlwnm 3,3,4,31,16
R3 expected : 0000000100000001
R3 actual   : 0000000000000001 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
a7f23d0f8b target-ppc: Bug Fix: rlwinm
The rlwinm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Example:
R3 = F7487D82EC6F75DF
rlwinm 3,3,5,12,4

R3 expected : 8DEEBBFD880EBBFD
R3 actual   : 00000000880EBBFD (without this fix)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Nikunj A Dadhania
9674a35626 ppc/spapr: Fix MAX_CPUS to 255
MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This
MAX_CPUS number was percolated back to "virsh capabilities" with wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
88365d17d5 ppc: Add hw breakpoint watchpoint support
This patch adds hardware breakpoint and hardware watchpoint support
for ppc.

On BOOKE architecture we cannot share debug resources between QEMU
and guest because:
    When QEMU is using debug resources then debug exception must
    be always enabled. To achieve this we set MSR_DE and also set
    MSRP_DEP so guest cannot change MSR_DE.

    When emulating debug resource for guest we want guest
    to control MSR_DE (enable/disable debug interrupt on need).

    So above mentioned two configuration cannot be supported
    at the same time. So the result is that we cannot share
    debug resources between QEMU and Guest on BOOKE architecture.

In the current design QEMU gets priority over guest,
this means that if QEMU is using debug resources then guest
cannot use them and if guest is using debug resource then
qemu can overwrite them.

When QEMU is not able to handle debug exception then we inject program
exception to guest. Yes program exception NOT debug exception and the
reason is:
 1) QEMU and guest not sharing debug resources
 2) For software breakpoint QEMU uses a ehpriv-1 instruction;

 So there cannot be any reason that we are in qemu with exit reason
 KVM_EXIT_DEBUG  for guest set debug exception, only possibility is
 guest executed ehpriv-1 privilege instruction and that's why we are
 injecting program exception.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
8a0548f94e ppc: Add software breakpoint support
This patch allow insert/remove software breakpoint.

When QEMU is not able to handle debug exception then we inject
program exception to guest because for software breakpoint QEMU
uses a ehpriv-1 instruction;
So there cannot be any reason that we are in qemu with exit reason
KVM_EXIT_DEBUG  for guest set debug exception, only possibility is
guest executed ehpriv-1 privilege instruction and that's why we are
injecting program exception.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
[agraf: make deflect comment booke/book3s agnostic]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
c371c2e3e0 ppc: synchronize excp_vectors for injecting exception
This patch synchronizes env->excp_vectors[] with env->iovr[].
This is required for using the existing interrupt injection mechanism
for kvm.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
3c902d4469 ppc: debug stub: Get trap instruction opcode from KVM
Get trap instruction opcode from KVM and this opcode will
be used for setting software breakpoint in following patch

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Benjamin Herrenschmidt
b7d1f77ada spapr: Locate RTAS and device-tree based on real RMA
We currently calculate the final RTAS and FDT location based on
the early estimate of the RMA size, cropped to 256M on KVM since
we only know the real RMA size at reset time which happens much
later in the boot process.

This means the FDT and RTAS end up right below 256M while they
could be much higher, using precious RMA space and limiting
what the OS bootloader can put there which has proved to be
a problem with some OSes (such as when using very large initrd's)

Fortunately, we do the actual copy of the device-tree into guest
memory much later, during reset, late enough to be able to do it
using the final RMA value, we just need to move the calculation
to the right place.

However, RTAS is still loaded too early, so we change the code to
load the tiny blob into qemu memory early on, and then copy it into
guest memory at reset time. It's small enough that the memory usage
doesn't matter.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Benjamin Herrenschmidt
ea87616d6c loader: Add load_image_size() to replace load_image()
A subsequent patch to ppc/spapr needs to load the RTAS blob into
qemu memory rather than target memory (so it can later be copied
into the right spot at machine reset time).

I would use load_image() but it is marked deprecated because it
doesn't take a buffer size as argument, so let's add load_image_size()
that does.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
c3b4f589d8 spapr: Fix ibm, associativity for memory nodes
We want the associtivity lists of memory and CPU nodes to match but
memory nodes have incorrect domain#3 which is zero for CPU so they won't
match.

This clears domain#3 in the list to match CPUs associtivity lists.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
b082d65a30 spapr: Add a helper for node0_size calculation
In multiple places there is a node0_size variable calculation
which assumes that NUMA node #0 and memory node #0 are the same
things which they are not. Since we are going to change it and
do not want to change it in multiple places, let's make a helper.

This adds a spapr_node0_size() helper and makes use of it.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
6010818c30 spapr: Split memory nodes to power-of-two blocks
Linux kernel expects nodes to have power-of-two size and
does WARN_ON if this is not the case:
[    0.041456] WARNING: at drivers/base/memory.c:115
which is:

===
	/* Validate blk_sz is a power of 2 and not less than section size */
	if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) {
        	WARN_ON(1);
	        block_sz = MIN_MEMORY_BLOCK_SIZE;
	}
===

This splits memory nodes into set of smaller blocks with
a size which is a power of two. This makes sure the start
address of every node is aligned to the node size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: squash windows compile fix in]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
7db8a127e3 spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however
actual hardware may have them so it makes sense to have a way
to emulate them in QEMU. This prepares SPAPR for that.

This moves 2 calls of spapr_populate_memory_node() into
the existing loop over numa nodes so first several nodes may
have no memory and this still will work.

If there is no numa configuration, the code assumes there is just
a single node at 0 and it has all the guest memory.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
81014ac2b8 spapr: Use DT memory node rendering helper for other nodes
This finishes refactoring by using the spapr_populate_memory_node helper
for all nodes and removing leftovers from spapr_populate_memory().

This is not a part of the previous patch because the patches look
nicer apart.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexey Kardashevskiy
26a8c353bf spapr: Move DT memory node rendering to a helper
This moves recurring bits of code related to memory@xxx nodes
creation to a helper.

This makes use of the new helper for node@0.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Gonglei
a21a7a7012 spapr: fix possible memory leak
get_boot_devices_list() will malloc memory, spapr_finalize_fdt
doesn't free it.

Signed-off-by: Chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexander Graf
261265cc91 PPC: mac99: Move NVRAM to page boundary when necessary
When running KVM we have to adhere to host page boundaries for memory slots.
Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash
area.

So if our host is configured with 64k page size, we can't use the mac99 target
with KVM. This is a real shame, as this limitation is not really an issue - we
can easily map NVRAM somewhere else and at least Linux and Mac OS X use it
at their new location.

So in that emergency case when it's about failing to run at all and moving NVRAM
to a place it shouldn't be at, choose the latter.

This patch enables -M mac99 with KVM on 64k page size hosts.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Nikunj A Dadhania
ef9514431d spapr: add uuid/host details to device tree
Useful for identifying the guest/host uniquely within the
guest. Adding following properties to the guest root node.

vm,uuid - uuid of the guest
host-model - Host model number
host-serial - Host machine serial number
hypervisor type - Tells its "kvm"

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Peter Maydell
7d0cd464a7 hw/ppc/spapr_hcall.c: Fix typo in function names
Fix a typo in the names of a couple of functions
(s/resouce/resource/).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Tom Musta
145855801a linux-user: Handle PPC64 ELFv2 Function Pointers
Function pointers in the 64-bit ELFv2 PowerPC ABI are actual (internal)
entry point addresses.  However, when invoking a function via a function
pointer, GPR 12 must also be set to this address so that the TOC may be
handled properly.

Add this support to the invocation of a signal handler.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
19774ec5c4 linux-user: Implement do_setcontext for PPC64
Eliminate the stub for the do_setcontext() function for TARGET_PPC64.  The
implementation re-uses the existing TARGET_PPC32 code with the only change
being the computation of the address of the register save area.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
8d6ab333eb linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
Properly dereference 64-bit PPC ELF V1 ABIT function pointers to signal handlers.
On this platform, function pointers are pointers to structures and the first 64
bits of such a structure contains the function's entry point.  The second 64 bits
contains the TOC pointer, which must be placed into GPR 2.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
61e75fecef linux-user: Enable Signal Handlers on PPC64
Enable the 64-bit PowerPC signal handling code that was previously
disabled via #ifdefs.  Specifically:

  - Move the target_mcontext (register save area) structure and
    append it to the 64-bit target_sigcontext structure.  This
    provides the space on the stack for saving and restoring
    context.
  - Define the target_rt_sigframe for 64-bit.
  - Adjust the setup_frame and setup_rt_frame routines to properly
    select the target_mcontext area and trampoline within the stack
    frame; tthis is different for 32-bit and 64-bit implementations.
  - Adjust the do_setcontext stub for 64-bit so that it compiles
    without warnings.

The 64-bit signal handling code is still not functional after this
change; but the 32-bit code is.  Subsequent changes will address
specific issues with the 64-bit code.

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: fix build on 32bit hosts, ppc64abi32]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
7678108b13 linux-user: Split PPC Trampoline Encoding from Register Save
Split the encoding of the PowerPC sigreturn trampoline from the saving of
register state onto the signal handler stack.  This will make it easier
in subsequent patches to deal with variations in the stack frame layouts between
32 and 64 bit PowerPC.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
fbdc200ac2 linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
The code that sets the stack frame back pointer is incorrect for
the setup_rt_frame() code; qemu will abort (SIGSEGV) in some
environments.  The setup_frame code  was fixed in commit
beb526b121 but the setup_rt_frame
code was not.

Make the setup_rt_frame code consistent with the setup_frame
code.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Nikunj A Dadhania
2e14072f9e ppc: spapr-rtas - implement os-term rtas call
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.

Linux kernel calls ibm,os-term when extended property of os-term is set.
This makes sure that a return to the linux kernel is gauranteed.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[agraf: reduce RTAS_TOKEN_MAX]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
Alexander Graf
277c7a4d71 PPC: KVM: Fix g3beige and mac99 when HV is loaded
On PPC we have 2 different styles of KVM: PR and HV. HV can only virtualize
sPAPR guests while PR can virtualize everything that's reasonably close to
the host hardware platform.

As long as only one kernel module (PR or HV) is loaded, the "default" kvm type
is the module that's loaded. So if your hardware only supports PR mode you can
easily spawn a Mac VM.

However, if both HV and PR are loaded we default to HV mode. And in that case
the Mac machines have to explicitly ask for PR mode to get a working VM.

Fix this up by explicitly having the Mac machines ask for PR style KVM. This
fixes bootup of Mac VMs on systems where bot HV and PR kvm modules are loaded
for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
John Snow
01ce352e62 ide: Add resize callback to ide/core
Currently, if the block device backing the IDE drive is resized,
the information about the device as cached inside of the IDEState
structure is not updated, thus when a guest OS re-queries the drive,
it is unable to see the expanded size.

This patch adds a resize callback that updates the IDENTIFY data
buffer in order to correct this.

Lastly, a Linux guest as-is cannot resize a libata drive while in-use,
but it can see the expanded size as part of a bus rescan event.
This patch also allows guests such as Linux to see the new drive size
after a soft reboot event, without having to exit the QEMU process.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
4bf6637d35 IDE: Fill the IDENTIFY request consistently
IDE-HD, IDE-ATAPI and IDE-CFATA all fill the
identify buffer in slightly different ways,
this is a relatively minor patch to make them
uniform, to emphasize that:

(1) We build the s->identify_data cache first, then
(2) We copy it to s->io_buffer to fulfill the request.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
Stefan Hajnoczi
b6b1d31f09 vmdk: fix buf leak in vmdk_parse_extents()
vmdk_open_sparse() does not take ownership of buf so the caller always
needs to free it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2014-09-08 11:12:44 +01:00
Stefan Hajnoczi
ff74f33c31 vmdk: fix vmdk_parse_extents() extent_file leaks
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
c5fe97e359 ide: Add wwn support to IDE-ATAPI drive
Although it is possible to specify the wwn
property for cdrom devices on the command line,
the underlying driver fails to relay this information
to the guest operating system via IDENTIFY.

This is a simple patch to correct that.

See ATA8-ACS, Table 22 parts 5, 6, and 9.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
0142f88bff qtest/ide: Uninitialize PC allocator
Use the new call to pc_alloc_uninit
as a test for the new pathways.

The leak checking / assert pathways are
not enabled in this patch, leaving this
as an option to future test writers.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
John Snow
ec2f160538 libqos: add a simple first-fit memory allocator
Implement a simple first-fit memory allocator that
attempts to keep track of leased blocks of memory
in order to be able to re-use blocks.

Additionally, allow the user to specify when
initializing the device that upon cleanup,
we would like to assert that there are no
blocks in use. This may be useful for identifying
problems in qtests that use more complicated
set-up and tear-down routines.

This functionality is used in my upcoming ahci-test v2
patch set, but I didn't see fit to enable it for any
existing tests, which will continue to operate the
same as they have prior.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
MORITA Kazutaka
53b33231f7 MAINTAINERS: update sheepdog maintainer
Hitoshi takes over sheepdog maintenance from me.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Peter Lieven
713cc671f1 qemu-nbd: fix indentation and coding style
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Peter Lieven
b3838a4088 qemu-nbd: add option to set detect-zeroes mode
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Peter Lieven
9e7dac7c6c rename parse_enum_option to qapi_enum_parse and make it public
relaxing the license to LGPLv2+ is intentional.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Chrysostomos Nanakos
072f9ac44a block/archipelago: Use QEMU atomic builtins
Replace __sync builtins with ones provided by QEMU
for atomic operations.

Special thanks goes to Paolo Bonzini for his refactoring
suggestion in order to use the already existing atomic builtins
interface.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Stefan Hajnoczi
3ba6796d08 qemu-img: fix rebase src_cache option documentation
The src_cache option (-T) specifies the cache mode for backing files.
It applies both the image's old backing file as well as the new backing
file:

  ret = bdrv_open(&bs_old_backing, backing_name, NULL, NULL, src_flags,
                  old_backing_drv, &local_err);
  if (ret) {
      ...
  }
  if (out_baseimg[0]) {
      bs_new_backing = bdrv_new("new_backing", &error_abort);
      ret = bdrv_open(&bs_new_backing, out_baseimg, NULL, NULL, src_flags,
                      new_backing_drv, &local_err);
      if (ret) {
          ...
      }
  }

The documentation only mentions the new backing file but it really
applies to both.

Suggested-by: Jeff Nelson <jenelson@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-09-08 11:12:43 +01:00
Stefan Hajnoczi
bb87fdf871 qemu-img: clarify src_cache option documentation
The source cache option takes the same values as the cache option.  The
documentation reads a little strange because it starts with "In contrast
the src_cache option ...".  The fact that this is comparing with the
previous documented option (the 'cache' option) is implicit.  Readers
may be confused, especially if they jump to src_cache without reading
cache documentation first.

Suggested-by: Jeff Nelson <jenelson@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
1053587c3f libqos: Added EVENT_IDX support
Added avail_event and NO_NOTIFY check before notifying.
Added used_event setting.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
5836811398 libqos: Added MSI-X support
Added MSI-X support for qtest PCI.
Added MSI-X support for virtio-pci.
Added MSI-X test case in virtio-blk-test.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
e11199554c libqos: Added test case for configuration changes in virtio-blk test
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
f294b029aa libqos: Added indirect descriptor support to virtio implementation
Add functions necessary for working with indirect descriptors.
Add test using new functions.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
bf3c63d201 libqos: Added basic virtqueue support to virtio implementation
Add status changing and feature negotiation.
Add basic virtqueue support for adding and sending virtqueue requests.
Add ISR checking.

[Squashed request endianness fix by Greg Kurz <gkurz@linux.vnet.ibm.com>
--Stefan]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
46e0cf7629 tests: Add virtio device initialization
Add functions to read and write virtio header fields.
Add status bit setting in virtio-blk-device.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
311e666aea tests: Functions bus_foreach and device_find from libqos virtio API
Virtio header has been changed to compile and work with a real device.
Functions bus_foreach and device_find have been implemented for PCI.
Virtio-blk test case now opens a fake device.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Laszlo Ersek
4c0cfc72b3 pflash_cfi01: write flash contents to bdrv on incoming migration
A drive that backs a pflash device is special:
- it is very small,
- its entire contents are kept in a RAMBlock at all times, covering the
  guest-phys address range that provides the guest's view of the emulated
  flash chip.

The pflash device model keeps the drive (the host-side file) and the
guest-visible flash contents in sync. When migrating the guest, the
guest-visible flash contents (the RAMBlock) is migrated by default, but on
the target host, the drive (the host-side file) remains in full sync with
the RAMBlock only if:
- the source and target hosts share the storage underlying the pflash
  drive,
- or the migration requests full or incremental block migration too, which
  then covers all drives.

Due to the special nature of pflash drives, the following scenario makes
sense as well:
- no full nor incremental block migration, covering all drives, alongside
  the base migration (justified eg. by shared storage for "normal" (big)
  drives),
- non-shared storage for pflash drives.

In this case, currently only those portions of the flash drive are updated
on the target disk that the guest reprograms while running on the target
host.

In order to restore accord, dump the entire flash contents to the bdrv in
a post_load() callback.

- The read-only check follows the other call-sites of pflash_update();
- both "pfl->ro" and pflash_update() reflect / consider the case when
  "pfl->bs" is NULL;
- the total size of the flash device is calculated as in
  pflash_cfi01_realize().

When using shared storage, or requesting full or incremental block
migration along with the normal migration, the patch should incur a
harmless rewrite from the target side.

It is assumed that, on the target host, RAM is loaded ahead of the call to
pflash_post_load().

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Laszlo Ersek
afeb25f926 pflash_cfi01: fixup stale DPRINTF() calls
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:42 +01:00
Liu Yuan
6bb4515849 block: kill tail whitespace in block.c
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:42 +01:00
Peter Maydell
f102f22455 Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-peter' into staging
QOM CPUState and X86CPU

* Include exception state in CPU VMState
* Fix -cpu *,migratable=foo
* Error out on unknown -cpu *,+foo,-bar

# gpg: Signature made Fri 05 Sep 2014 15:38:14 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-cpu-for-peter:
  target-i386: Reject invalid CPU feature names on the command-line
  target-i386: Support migratable=no properly
  exec: Save CPUState::exception_index field

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-05 16:03:56 +01:00
Eduardo Habkost
c00c94abbd target-i386: Reject invalid CPU feature names on the command-line
Instead of simply printing a warning, report an error when invalid CPU
options are provided on the CPU model string.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-05 16:37:07 +02:00
Eduardo Habkost
4d1b279b06 target-i386: Support migratable=no properly
When the "migratable" property was implemented, the behavior was tested
by changing the default on the code, but actually using the option on
the command-line (e.g. "-cpu host,migratable=false") doesn't work as
expected. This is a regression for a common use case of "-cpu host",
which is to enable features that are supported by the host CPU + kernel
before feature-specific code is added to QEMU.

Fix this by initializing the feature words for "-cpu host" on
x86_cpu_parse_featurestr(), right after parsing the CPU options.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-05 16:37:06 +02:00
Pavel Dovgaluk
6c3bff0ed8 exec: Save CPUState::exception_index field
This patch adds a subsection with exception_index field to the VMState for
correct saving the CPU state.
Without this patch, simulator could miss the pending exception in the saved
virtual machine state.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-05 16:32:48 +02:00
Benjamin Herrenschmidt
77bfcf28f1 console: Remove unused QEMU_BIG_ENDIAN_FLAG
If we need to, we should use the pixman formats instead but for
now this is unused except in commented out code so take it out
to avoid further confusion about surface endianness.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 15:38:04 +02:00
Peter Maydell
20dbe65049 Merge remote-tracking branch 'remotes/kraxel/tags/pull-chardev-20140905-1' into staging
pty: Fix byte loss bug when connecting to pty

# gpg: Signature made Fri 05 Sep 2014 12:57:32 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-chardev-20140905-1:
  pty: Fix byte loss bug when connecting to pty

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-05 14:29:42 +01:00
Gerd Hoffmann
43c7d8bd44 console: add qemu_pixman_linebuf_copy
Helper function for copying data from linebuf to framebuffer using
pixman, possibly converting in case src and dst formats differ.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
4c38762fb5 console: add dpy_gfx_update_dirty
Calls dpy_gfx_update for all dirty scanlines. Works for
DisplaySurfaces backed by guest memory (i.e. the ones created
using qemu_create_displaysurface_guestmem).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
a77549b3ff console: add qemu_create_displaysurface_guestmem
This patch adds a qemu_create_displaysurface_guestmem helper function.
Works simliar to qemu_create_displaysurface_from, but accepts a
guest address instead of a host pointer and it handles
cpu_physical_memory_{map,unmap} for you.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
30f1e661b6 console: stop using PixelFormat
With this patch the qemu console core stops using PixelFormat and pixman
format codes side-by-side, pixman format code is the primary way to
specify the DisplaySurface format:

 * DisplaySurface stops carrying a PixelFormat field.
 * qemu_create_displaysurface_from() expects a pixman format now.

Functions to convert PixelFormat to pixman_format_code_t (and back)
exist for those who still use PixelFormat.   As PixelFormat allows
easy access to masks and shifts it will probably continue to exist.

[ xenfb added by Benjamin Herrenschmidt ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
56bd9ea1a3 console: reimplement qemu_default_pixelformat
Use the new qemu_pixelformat_from_pixman and qemu_default_pixman_format
functions to reimplement qemu_default_pixelformat
(qemu_different_endianness_pixelformat too).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
1527a25ec9 console: add qemu_default_pixman_format
Function returning the default pixman format for a given depth.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
a93a3af9ec console: add qemu_pixelformat_from_pixman
Function to convert pixman format codes to qemu PixelFormat.

[ Benjamin Herrenschmidt: fix BGRA+RGBA shifts ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Sebastian Tanase
cf7330c759 pty: Fix byte loss bug when connecting to pty
When trying to print data to the pty, we first check if it is connected.
If not, we try to reconnect, but we drop the pending data even if we
have successfully reconnected; this makes us lose the first byte of the very
first transmission.
This small fix addresses the issue by checking once more if the pty is connected
after having tried to reconnect.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:10 +02:00
Peter Maydell
5fd7fc8db9 Merge remote-tracking branch 'remotes/kraxel/tags/pull-cve-2014-3615-20140905-1' into staging
CVE-2014-3615: fix sanity checks in vbe (bochs dispi) and spice.

# gpg: Signature made Fri 05 Sep 2014 12:18:04 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-cve-2014-3615-20140905-1:
  spice: make sure we don't overflow ssd->buf
  vbe: rework sanity checks
  vbe: make bochs dispi interface return the correct memory size with qxl

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-05 12:26:33 +01:00
Gerd Hoffmann
ab9509ccea spice: make sure we don't overflow ssd->buf
Related spice-only bug.  We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card.  It's also used with qxl in vga mode.

When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer.  In theory the guest can write,
indirectly via spice-server.  The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.

Fix that by switching to dynamic allocation for the buffer.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-09-05 12:19:50 +02:00
Peter Maydell
fd884c0765 Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-peter' into staging
QOM infrastructure fixes and device conversions

* Cleanups for recursive device unrealization

# gpg: Signature made Thu 04 Sep 2014 18:17:35 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-peter:
  qdev: Add cleanup logic in device_set_realized() to avoid resource leak
  qdev: Use NULL instead of local_err for qbus_child unrealize
  qdev: Use error_abort instead of using local_err
  memory: Remove object_property_add_child_array()
  qom: Add automatic arrayification to object_property_add()
  machine: Clean up -machine handling
  qom: Make object_child_foreach() safe for objects removal

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 19:41:15 +01:00
Peter Maydell
bbb6a1e872 Merge remote-tracking branch 'remotes/kvaneesh/for-upstream' into staging
* remotes/kvaneesh/for-upstream:
  hw/9pfs: Don't return type from host in readdir on local 9p filesystem
  hw/9pfs: Use little-endian format for xattr values

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 18:34:28 +01:00
Gonglei
1d45a705fc qdev: Add cleanup logic in device_set_realized() to avoid resource leak
At present, this function doesn't have partial cleanup implemented,
which will cause resource leaks in some scenarios.

Example:

1. Assume that "dc->realize(dev, &local_err)" executes successful
   and local_err == NULL;
2. device hotplug in hotplug_handler_plug() executes but fails
   (it is prone to occur). Then local_err != NULL;
3. error_propagate(errp, local_err) and return. But the resources
   which have been allocated in dc->realize() will be leaked.
Simple backtrace:
  dc->realize()
   |->device_realize
            |->pci_qdev_init()
                |->do_pci_register_device()
                |->etc.

Add fuller cleanup logic which assures that function can
goto appropriate error label as local_err population is
detected at each relevant point.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 19:15:54 +02:00
Gonglei
cd4520adca qdev: Use NULL instead of local_err for qbus_child unrealize
Forcefully unrealize all children regardless of errors in earlier
iterations (if any). We should keep going with cleanup operation
rather than report an error immediately. Therefore store the first
child unrealization failure and propagate it at the end. We also
forcefully unregister vmsd and unrealize actual object, too.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 19:15:06 +02:00
Peter Maydell
8cf8c92e77 Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Net patches

# gpg: Signature made Thu 04 Sep 2014 17:32:44 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  virtio-net: purge outstanding packets when starting vhost
  net: complete all queued packets on VM stop
  net: invoke callback when purging queue
  virtio: don't call device on !vm_running
  virtio-net: don't run bh on vm stopped
  net: Forbid dealing with packets when VM is not running

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 17:39:07 +01:00
Michael S. Tsirkin
086abc1ccd virtio-net: purge outstanding packets when starting vhost
whenever we start vhost, virtio could have outstanding packets
queued, when they complete later we'll modify the ring
while vhost is processing it.

To prevent this, purge outstanding packets on vhost start.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
ca77d85e1d net: complete all queued packets on VM stop
This completes all packets, ensuring that callbacks
will not run when VM is stopped.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
07d8084624 net: invoke callback when purging queue
devices rely on packet callbacks eventually running,
but we violate this rule whenever we purge the queue.
To fix, invoke callbacks on all packets on purge.
Set length to 0, this way callers can detect that
this happened and re-queue if necessary.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
269bd822e7 virtio: don't call device on !vm_running
On vm stop, virtio changes vm_running state
too soon, so callbacks can get envoked with
vm_running = false;

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
e8bcf84200 virtio-net: don't run bh on vm stopped
commit 783e770693
    virtio-net: stop/start bh when appropriate

is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Bastian Blank
840a1bf283 hw/9pfs: Don't return type from host in readdir on local 9p filesystem
When using mapped mode in 9pfs, readdir implementation
should not return file type in d_type from the host
readdir, instead, it should use the type stored in
the extended attributes.  Since d_type is optional
and reading ext attrs for every readdir is expensive,
it should be sufficient to just set d_type to DT_UNKNOWN,
so guest will know to look it up separately.

This is a -stable material.

Signed-off-by: Bastian Blank <waldi@debian.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-09-04 10:51:13 -05:00
Gonglei
d578029e71 qdev: Use error_abort instead of using local_err
This error can not happen normally. If it happens, it indicates
something very wrong, we should abort QEMU. Moreover, the
user can only refer to /machine/peripheral or /objects, not
/machine/unattached.

While at it, remove superfluous check about local_err.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Peter Crosthwaite
843ef73a69 memory: Remove object_property_add_child_array()
Obsoleted by automatic object_property_add() arrayification.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Peter Crosthwaite
339659041f qom: Add automatic arrayification to object_property_add()
If "[*]" is given as the last part of a QOM property name, treat that
as an array property. The added property is given the first available
name, replacing the * with a decimal number counting from 0.

First add with name "foo[*]" will be "foo[0]". Second "foo[1]" and so
on.

Callers may inspect the ObjectProperty * return value to see what
number the added property was given.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Andreas Färber
d2659e27e1 machine: Clean up -machine handling
Since commit c4090f8, -object options are no longer handled through
object_set_property(), so clean up -object leftovers by renaming the
function and dropping special-casing of qom-type and id properties.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Alexey Kardashevskiy
8af734ca31 qom: Make object_child_foreach() safe for objects removal
Current object_child_foreach() uses QTAILQ_FOREACH() to walk
through children and that makes children removal from the callback
impossible.

This makes object_child_foreach() use QTAILQ_FOREACH_SAFE().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
zhanghailiang
e1d64c084b net: Forbid dealing with packets when VM is not running
For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.

If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.

To avoid this, we forbid receiving packets in generic net code when
VM is not running.

Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 14:31:54 +01:00
Peter Maydell
01eb313907 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-09-03' into staging
trivial patches for 2014-09-03

# gpg: Signature made Wed 03 Sep 2014 06:53:42 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-09-03:
  slirp: Honour vlan/stack in hostfwd_remove commands
  hmp: fix MemdevList memory leak
  qom/object.c, hmp.c: fix string_output_get_string() memory leak
  query-memdev: fix potential memory leaks
  MAINTAINERS: Add VMWare devices maintainer
  device_tree.c: dump all err mesages with error_report
  device_tree.c: redirect load_device_tree err message to stderr
  scripts: Remove scripts/qtest
  Fix debug print warning
  curl: The macro that you have to uncomment to get debugging is DEBUG_CURL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 13:33:53 +01:00
Peter Maydell
b27e37d4ce Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features

A bunch of bugfixes - these will make sense for 2.1.1

Initial Intel IOMMU support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 03 Sep 2014 14:41:23 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
  vhost-scsi: init backend features earlier
  vhost_net: init acked_features to backend_features
  vhost_net: start/stop guest notifiers properly

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 12:20:41 +01:00
Peter Maydell
4771b02512 Revert "vhost_net: start/stop guest notifiers properly"
This reverts commit aad4dce934.

I accidentally merged the wrong version of a pull request
which had a buggy version of this patch. Reverting the
buggy version means we can then cleanly merge in the correct
pull with the corrected change.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 12:19:37 +01:00
Gerd Hoffmann
c1b886c45d vbe: rework sanity checks
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers.  Call that
unconditionally on every register write.  That way we should catch
everything, even changing one register affecting the valid range of
another register.

Some of the holes have been added by commit
e9c6149f6a.  Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.

Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.

Security impact:

(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source  ->  host memory leak.  Memory isn't leaked to
the guest but to the vnc client though.

(2) Qemu will segfault in case the memory range happens to include
unmapped areas  ->  Guest can DoS itself.

The guest can not modify host memory, so I don't think this can be used
by the guest to escape.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-09-04 08:23:14 +02:00
Gerd Hoffmann
54a85d4624 vbe: make bochs dispi interface return the correct memory size with qxl
VgaState->vram_size is the size of the pci bar.  In case of qxl not the
whole pci bar can be used as vga framebuffer.  Add a new variable
vbe_size to handle that case.  By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.

This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-09-04 08:22:48 +02:00
zhanghailiang
07b81ed937 acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.

The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.

If we use cluster destination model, the problem will be solved.

Note:
This flag was introduced later than ACPI v1.0 specification while QEMU
generates v1.0 tables only, but...

linux kernel ignores this flag, so patch has no influence on it.

Tested with Win[XPsp3|Srv2003EE|Srv2008DC|Srv2008R2|Srv2012R2], there
isn't BSODs and guests boot just fine. In cases guest doesn't support
cpu-hotplug, cpu becomes visible after reboot and in case the guest
supports cpu-hotplug, it works as expected with this patch.

Cc: qemu-stable@nongnu.org
Signed-off-by: huangzhichao <huangzhichao@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
2014-09-03 16:41:05 +03:00
Michael S. Tsirkin
3a1655fc53 vhost-scsi: init backend features earlier
As vhost core can use backend_features during init, clear it earlier to
avoid using uninitialized memory.
This use would be harmless since vhost scsi ignores the result
anyway, but initializing earlier will help prevent valgrind errors,
and make scsi and net behave similarly.

Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-03 16:41:05 +03:00
Jason Wang
b49ae9138d vhost_net: init acked_features to backend_features
commit 2e6d46d77e (vhost: add
vhost_get_features and vhost_ack_features) removes the step that
initializes the acked_features to backend_features.

As this field is now uninitialized, vhost initialization will sometimes
fail.

To fix, initialize acked_features on each ack.

Tested-by: Andrey Korolyov <andrey@xdel.ru>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-03 16:41:05 +03:00
Jason Wang
cd7d1d26b0 vhost_net: start/stop guest notifiers properly
commit a9f98bb5eb "vhost: multiqueue
support" changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.

To fix this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.

MST: fix up error handling.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <william@gandi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-03 16:40:44 +03:00
Aneesh Kumar K.V
f8ad4a89e9 hw/9pfs: Use little-endian format for xattr values
With security_model=mapped-xattr, we encode the uid,gid and other file
attributes as extended attributes of the file. We save them under
user.virtfs.* namespace.

Use little-endian encoding for on-disk values. This enables us to export
the same directory from both little-endian and big-endian hosts.

NOTE: This will break big-endian host that have virtFS exports
using security model mapped-xattr. They will have to use external tools
to convert the xattr to little-endian format.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-09-02 16:02:33 -05:00
Peter Maydell
70381662aa slirp: Honour vlan/stack in hostfwd_remove commands
The hostfwd_add and hostfwd_remove monitor commands allow the user
to optionally specify a vlan/stack tuple. hostfwd_add honours this,
but hostfwd_remove does not (it looks up the tuple but then ignores
the SlirpState it has looked up and always uses the first stack
in the list anyway). Correct this to honour what the user requested.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Chen Fan
ecaf54a052 hmp: fix MemdevList memory leak
the memdev_list in hmp_info_memdev() is never freed.
so we use existent method qapi_free_MemdevList() to free it.
and also we can use qapi_free_MemdevList() to replace list loops
to clean up the memdev list in error path.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Chen Fan
976620ac40 qom/object.c, hmp.c: fix string_output_get_string() memory leak
string_output_get_string() uses g_string_free(str, false) to
transfer the 'str' pointer to callers and never free it.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Chen Fan
b0e90181e4 query-memdev: fix potential memory leaks
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Dmitry Fleytman
622fb504c4 MAINTAINERS: Add VMWare devices maintainer
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Li Liu
508e221f2c device_tree.c: dump all err mesages with error_report
Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Li Liu
db013f81b2 device_tree.c: redirect load_device_tree err message to stderr
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Fam Zheng
7d2ff422ca scripts: Remove scripts/qtest
This is a dummy file with no user, drop it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Gonglei
c5539cb426 Fix debug print warning
Steps:

1.enable qemu debug print, using simply scprit as below:
 grep "//#define DEBUG" * -rl | xargs sed -i "s/\/\/#define DEBUG/#define DEBUG/g"
2. make -j
3. get some warning:
hw/i2c/pm_smbus.c: In function 'smb_ioport_writeb':
hw/i2c/pm_smbus.c:142: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/i2c/pm_smbus.c:142: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/i2c/pm_smbus.c: In function 'smb_ioport_readb':
hw/i2c/pm_smbus.c:209: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/intc/i8259.c: In function 'pic_ioport_read':
hw/intc/i8259.c:373: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/input/pckbd.c: In function 'kbd_write_command':
hw/input/pckbd.c:232: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
hw/input/pckbd.c: In function 'kbd_write_data':
hw/input/pckbd.c:333: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_writeb':
hw/isa/apm.c:44: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/isa/apm.c:44: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_readb':
hw/isa/apm.c:67: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/timer/mc146818rtc.c: In function 'cmos_ioport_write':
hw/timer/mc146818rtc.c:394: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/i386/pc.c: In function 'port92_write':
hw/i386/pc.c:479: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'

Fix them.

Cc: qemu-trivial@nongnu.org
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Richard W.M. Jones
41c2346716 curl: The macro that you have to uncomment to get debugging is DEBUG_CURL.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Peter Maydell
f2426947de Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features

A bunch of bugfixes - these will make sense for 2.1.1

Initial Intel IOMMU support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 02 Sep 2014 16:05:04 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  vhost_net: start/stop guest notifiers properly
  pci: avoid losing config updates to MSI/MSIX cap regs
  virtio-net: don't run bh on vm stopped
  ioh3420: remove unused ioh3420_init() declaration
  vhost_net: cleanup start/stop condition
  intel-iommu: add IOTLB using hash table
  intel-iommu: add context-cache to cache context-entry
  intel-iommu: add supports for queued invalidation interface
  intel-iommu: fix coding style issues around in q35.c and machine.c
  intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
  intel-iommu: add DMAR table to ACPI tables
  intel-iommu: introduce Intel IOMMU (VT-d) emulation
  iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-02 16:07:31 +01:00
Jason Wang
aad4dce934 vhost_net: start/stop guest notifiers properly
commit a9f98bb5eb vhost: multiqueue
support changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.

To adapt this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is used to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.

Cc: qemu-stable@nongnu.org
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:33:37 +03:00
Knut Omang
d7efb7e08e pci: avoid losing config updates to MSI/MSIX cap regs
Since
commit 95d6580024
    msi: Invoke msi/msix_write_config from PCI core
msix config writes are lost, the value written is always 0.

Fix pci_default_write_config to avoid this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:28:26 +03:00
Michael S. Tsirkin
0187c7989a virtio-net: don't run bh on vm stopped
commit 783e770693
    virtio-net: stop/start bh when appropriate

is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:28:26 +03:00
Gonglei
fc8342f758 ioh3420: remove unused ioh3420_init() declaration
commit 0f9b1771cc
    ioh3420: Remove obsoleted, unused ioh3420_init function
removed the implementation of ioh3420_init

Drop the declaration from the header file as well.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:28:26 +03:00
Michael S. Tsirkin
2d2507ef23 vhost_net: cleanup start/stop condition
Checking vhost device internal state in vhost_net looks like
a layering violation since vhost_net does not
set this flag: it is set and tested by vhost.c.
There seems to be no reason to check this:
caller in virtio net uses its own flag,
vhost_started, to ensure vhost is started/stopped
as appropriate.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
2014-09-02 17:28:25 +03:00
Peter Maydell
30eaca3acd Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140902-1' into staging
sanity check for qxl, minor spice display channel tweak.

# gpg: Signature made Tue 02 Sep 2014 09:53:39 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140902-1:
  spice: use console index as display id
  qxl-render: add more sanity checks

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-02 10:26:10 +01:00
Xin Tong
88e89a57f9 implementing victim TLB for QEMU system emulated TLB
QEMU system mode page table walks are expensive. Taken by running QEMU
qemu-system-x86_64 system mode on Intel PIN , a TLB miss and walking a
4-level page tables in guest Linux OS takes ~450 X86 instructions on
average.

QEMU system mode TLB is implemented using a directly-mapped hashtable.
This structure suffers from conflict misses. Increasing the
associativity of the TLB may not be the solution to conflict misses as
all the ways may have to be walked in serial.

A victim TLB is a TLB used to hold translations evicted from the
primary TLB upon replacement. The victim TLB lies between the main TLB
and its refill path. Victim TLB is of greater associativity (fully
associative in this patch). It takes longer to lookup the victim TLB,
but its likely better than a full page table walk. The memory
translation path is changed as follows :

Before Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. TLB refill.
5. Do the memory access.
6. Return to code cache.

After Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. Victim TLB lookup.
5. If victim TLB misses, TLB refill
6. Do the memory access.
7. Return to code cache

The advantage is that victim TLB can offer more associativity to a
directly mapped TLB and thus potentially fewer page table walks while
still keeping the time taken to flush within reasonable limits.
However, placing a victim TLB before the refill path increase TLB
refill path as the victim TLB is consulted before the TLB refill. The
performance results demonstrate that the pros outweigh the cons.

some performance results taken on SPECINT2006 train
datasets and kernel boot and qemu configure script on an
Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz Linux machine are shown in the
Google Doc link below.

https://docs.google.com/spreadsheets/d/1eiItzekZwNQOal_h-5iJmC4tMDi051m9qidi5_nwvH4/edit?usp=sharing

In summary, victim TLB improves the performance of qemu-system-x86_64 by
11% on average on SPECINT2006, kernelboot and qemu configscript and with
highest improvement of in 26% in 456.hmmer. And victim TLB does not result
in any performance degradation in any of the measured benchmarks. Furthermore,
the implemented victim TLB is architecture independent and is expected to
benefit other architectures in QEMU as well.

Although there are measurement fluctuations, the performance
improvement is very significant and by no means in the range of
noises.

Signed-off-by: Xin Tong <trent.tong@gmail.com>
Message-id: 1407202523-23553-1-git-send-email-trent.tong@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 17:43:06 +01:00
Bastian Koppelmann
44ea34309e target-tricore: Add instructions of SR opcode format
Add instructions of SR opcode format.
Add micro-op generator functions for saturate.
Add helper return from exception (rfe).

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-16-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
5a7634a28c target-tricore: Add instructions of SLR, SSRO and SRO opcode format
Add instructions of SLR, SSRO and SRO opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-15-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
5de93515f9 target-tricore: Add instructions of SC opcode format
Add instructions of SC opcode format.
Add helper for begin interrupt service routine.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-14-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
a47b50db60 target-tricore: Add instructions of SBR opcode format
Add instructions of SBR opcode format.
Add gen_loop micro-op generator function.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-13-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
70b0226250 target-tricore: Add instructions of SBC and SBRN opcode format
Add instructions of SBC and SBRN opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-12-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
9a31922b08 target-tricore: Add instructions of SB opcode format
Add instructions of SB opcode format.
Add helper call/ret.
Add micro-op generator functions for branches.
Add makro to generate helper functions.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-11-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
d279821074 target-tricore: Add instructions of SRRS and SLRO opcode format
Add instructions of SSRS and SLRO opcode format.
Add micro-op generator functions for offset loads.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-10-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
46aa848ffb target-tricore: Add instructions of SSR opcode format
Add instructions of SSR opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-9-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
2692802a37 target-tricore: Add instructions of SRR opcode format
Add instructions of SRR opcode format.
Add helper for add/sub_ssov.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-8-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
0707ec1bea target-tricore: Add instructions of SRC opcode format
Add instructions of SRC opcode format.
Add micro-op generator functions for add, conditional add/sub and shi/shai.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-7-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
7c87d074d2 target-tricore: Add masks and opcodes for decoding
Add masks and opcodes for decoding TriCore instructions.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-6-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
0aaeb118b3 target-tricore: Add initialization for translation and activate target
Add tcg and cpu model initialization.
Add gen_intermediate_code function.
Activate target in configure and add softmmu config.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-5-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
2d30267e8e target-tricore: Add softmmu support
Add basic softmmu support for TriCore

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-4-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
e2d0501103 target-tricore: Add board for systemmode
Add basic board to allow systemmode emulation

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-3-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
48e06fe0ed target-tricore: Add target stubs and qom-cpu
Add TriCore target stubs, and QOM cpu, and Maintainer

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-2-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Peter Maydell
5cd1475d28 Merge remote-tracking branch 'remotes/borntraeger/tags/kvm-s390-20140901' into staging
s390x/kvm: Several updates/fixes/features

1. s390x/kvm: avoid synchronize_rcu's in kernel
----------------------------------------------
The first patches change s390x/kvm code to issue VCPU specific ioctls
from the VCPU thread. This will avoid unnecessary synchronize_rcu in
the kernel, which caused a noticably slowdown with many guest CPUs.
It speeds up all start/restart/reset operations involving cpus
drastically.

2. s390-ccw.img: block size and DASD format support
---------------------------------------------------
The second part changes the s390-ccw bios to IPL (boot)  more disk
formats than before. Furthermore a small fix is made to the console
output of the bios.

3. s390: Support for Hotplug of Standby Memory
----------------------------------------------
The third part adds support in s390 for a pool of standby memory,
which can be set online/offline by the guest (ie, via chmem).
The standby pool of memory is allocated as the difference between
the initial memory setting and the maxmem setting.
As part of this work, additional results are provided for the
Read SCP Information SCLP, and new implentation is added for the
Read Storage Element Information, Attach Storage Element,
Assign Storage and Unassign Storage SCLPs, which enables the s390
guest to manipulate the standby memory pool.

This patchset is based on work originally done by Jeng-Fang (Nick)
Wang.

Sample qemu command snippet:

qemu -machine s390-ccw-virtio  -m 1024M,maxmem=2048M,slots=32 -enable-kvm

This will allocate 1024M of active memory, and another 1024M
of standby memory.  Example output from s390-tools lsmem:
=============================================================================
0x0000000000000000-0x000000000fffffff        256  online   no         0-127
0x0000000010000000-0x000000001fffffff        256  online   yes        128-255
0x0000000020000000-0x000000003fffffff        512  online   no         256-511
0x0000000040000000-0x000000007fffffff       1024  offline  -          512-1023

Memory device size  : 2 MB
Memory block size   : 256 MB
Total online memory : 1024 MB
Total offline memory: 1024 MB

The guest can dynamically enable part or all of the standby pool
via the s390-tools chmem, for example:

chmem -e 512M

And can attempt to dynamically disable:

chmem -d 512M

4. s390x/gdb: various fixes
---------------------------
* Patch 1 fixes a bug where the cc was changed accidentally.
* Patch 2 adds the gdb feature XML files for s390x
* Patch 3 Define acr and fpr registers as coprocessor registers. This allows us
   to reuse the feature XML files.
* Patch 4 whitespace fixes

# gpg: Signature made Mon 01 Sep 2014 12:53:39 BST using RSA key ID B5A61C7C
# gpg: Can't check signature: public key not found

* remotes/borntraeger/tags/kvm-s390-20140901:
  s390x/gdb: coding style fixes
  s390x/gdb: generate target.xml and handle fp/ac as coprocessors
  s390x/gdb: add the feature xml files for s390x
  s390x/gdb: don't touch the cc if tcg is not enabled
  sclp-s390: Add memory hotplug SCLPs
  s390-virtio: Apply same memory boundaries as virtio-ccw
  virtio-ccw: Include standby memory when calculating storage increment
  sclp-s390: Add device to manage s390 memory hotplug
  pc-bios/s390-ccw.img binary update
  pc-bios/s390-ccw: Do proper console setup
  pc-bios/s390-ccw: IPL from DASD with format variations
  pc-bios/s390-ccw Really big EAV ECKD DASD handling
  pc-bios/s390-ccw Improve ECKD informational message
  pc-bios/s390-ccw: handle more ECKD DASD block sizes
  pc-bios/s390-ccw: support all virtio block size
  s390x/kvm: execute the first cpu reset on the vcpu thread
  s390x/kvm: execute "system reset" cpu resets on the vcpu thread
  s390x/kvm: execute sigp orders on the target vcpu thread
  s390x/kvm: run guest triggered resets on the target vcpu thread

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 13:57:46 +01:00
Gerd Hoffmann
cd56cc6b07 spice: use console index as display id
... instead of maintaining our own numbering.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-01 10:19:03 +02:00
Gerd Hoffmann
503b3b33fe qxl-render: add more sanity checks
Damn, the dirty rectangle values are signed integers.  So the checks
added by commit 788fbf042f are not good
enough, we also have to make sure they are not negative.

[ Note: There must be something broken in spice-server so we get
  negative values in the first place.  Bug opened:
  https://bugzilla.redhat.com/show_bug.cgi?id=1135372 ]

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2014-09-01 10:19:03 +02:00
David Hildenbrand
218829db23 s390x/gdb: coding style fixes
This patch cleanes up two coding style issues (missing whitespaces).

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
David Hildenbrand
73d510c9d3 s390x/gdb: generate target.xml and handle fp/ac as coprocessors
This patch reduces the core registers to the psw and the general purpose
registers. The fpc and ac registers are handled as coprocessors registers by gdb.
This allows to reuse the feature xml files taken from gdb without further
modification and is what other architectures do.

The target.xml is now generated and provided to the gdb client. Therefore, the
client doesn't have to guess which registers are available at which logical
register number.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
David Hildenbrand
6117afac34 s390x/gdb: add the feature xml files for s390x
This patch adds the relevant s390x feature xml files taken from gdb.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
David Hildenbrand
97fa52f097 s390x/gdb: don't touch the cc if tcg is not enabled
When reading/writing the psw mask, the condition code may only be touched if
running on tcg.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
Matthew Rosato
1def6656b6 sclp-s390: Add memory hotplug SCLPs
Add memory information to read SCP info and add handlers for
Read Storage Element Information, Attach Storage Element,
Assign Storage and Unassign Storage.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Matthew Rosato
e7f1314f97 s390-virtio: Apply same memory boundaries as virtio-ccw
Although s390-virtio won't support memory hotplug, it should
enforce the same memory boundaries so that it can use shared codepaths
(like read_SCP_info).

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Matthew Rosato
b6fe01248e virtio-ccw: Include standby memory when calculating storage increment
When determining the memory increment size, use the maxmem size if
it was specified.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Matthew Rosato
0844df77fd sclp-s390: Add device to manage s390 memory hotplug
Add sclpMemoryHotplugDev to contain associated data structures, etc.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Eugene (jno) Dvurechenski
f360221988 pc-bios/s390-ccw.img binary update
Rebuild of s390-ccw.img containing these patches:

  pc-bios/s390-ccw: Do proper console setup
  pc-bios/s390-ccw: support all virtio block size
  pc-bios/s390-ccw: handle more ECKD DASD block sizes
  pc-bios/s390-ccw Improve ECKD informational message
  pc-bios/s390-ccw Really big EAV ECKD DASD handling
  pc-bios/s390-ccw: IPL from DASD with format variations

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Christian Borntraeger
1aa7f4c6aa pc-bios/s390-ccw: Do proper console setup
The final newline/return must happen before we reset the sclp via
diag 308.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
14f56a2e35 pc-bios/s390-ccw: IPL from DASD with format variations
There are two known cases of DASD format where signatures are
incomplete or absent:

1. result of <dasdfmt -d ldl -L ...> (ECKD_LDL_UNLABELED)
2. CDL with zero keys in IPL1 and IPL2 records

Now the code attempts to
1. find zIPL and use SCSI layout
2. find IPL1 and use CDL layout
3. find CMS1 and use LDL layout
3. find LNX1 and use LDL layout
4. find zIPL and use unlabeled LDL layout
5. find zIPL and use CDL layout
6. die
in this sequence.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
f04db28b86 pc-bios/s390-ccw Really big EAV ECKD DASD handling
For EAV ECKD DASD, the cylinder count will have the magic value
0xfffeU. Therefore, use the block number to test for valid eckd
addresses instead.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
b0885f7599 pc-bios/s390-ccw Improve ECKD informational message
Add block size display to ECKD scheme report.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
00a47e7e71 pc-bios/s390-ccw: handle more ECKD DASD block sizes
Using dasdfmt(8) to format a DASD allows to choose a block size.
There are four supported values: 512, 1024, 2048, and 4096 bytes
per block. Each block size leads to selection of new count of
sectors per track. The head count remains always the same: 15.

This empiric knowledge is used to detect ECKD DASD to IPL from.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
92cb05574b pc-bios/s390-ccw: support all virtio block size
The block size value may be given "as is" OR as a base value and
a shift count (exponent). So, we have to use calculation to get
the proper number in the code.

The main expression reads as
        (blk_cfg.blk_size << blk_cfg.physical_block_exp)

E.g., various combinations between blk_size=1/physical_block_exp=12
and blk_size=4096/physical_block_exp=0 are valid for 4K blocks.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
159855f098 s390x/kvm: execute the first cpu reset on the vcpu thread
As all full cpu resets currently call into the kernel to do initial cpu reset,
let's run this reset (triggered by cpu_s390x_init()) on the proper vcpu thread.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
1fad8b3be3 s390x/kvm: execute "system reset" cpu resets on the vcpu thread
Let's execute resets triggered by qemu system resets on the target vcpu thread.
This will avoid synchronize_rcu's in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
6e6ad8db11 s390x/kvm: execute sigp orders on the target vcpu thread
All sigp orders that can result in ioctls on the target vcpu should be executed
on the associated vcpu thread.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
85ca3371f6 s390x/kvm: run guest triggered resets on the target vcpu thread
Currently, load_normal_reset() and modified_clear_reset() as triggered
by a guest vcpu will initiate cpu resets on the current vcpu thread for
all cpus. The reset should happen on the individual vcpu thread
instead, so let's use run_on_cpu() for this.

This avoids calls to synchronize_rcu() in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Peter Maydell
988f463614 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 29 Aug 2014 17:25:58 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (35 commits)
  quorum: Fix leak of opts in quorum_open
  blkverify: Fix leak of opts in blkverify_open
  nfs: Fix leak of opts in nfs_file_open
  curl: Don't deref NULL pointer in call to aio_poll.
  curl: Allow a cookie or cookies to be sent with http/https requests.
  virtio-blk: allow drive_del with dataplane
  block: acquire AioContext in do_drive_del()
  linux-aio: avoid deadlock in nested aio_poll() calls
  qemu-iotests: add multiwrite test cases
  block: fix overlapping multiwrite requests
  nbd: Follow the BDS' AIO context
  block: Add AIO context notifiers
  nbd: Drop nbd_can_read()
  sheepdog: fix a core dump while do auto-reconnecting
  aio-win32: add support for sockets
  qemu-coroutine-io: fix for Win32
  AioContext: introduce aio_prepare
  aio-win32: add aio_set_dispatching optimization
  test-aio: test timers on Windows too
  AioContext: export and use aio_dispatch
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 18:40:04 +01:00
Fam Zheng
8df3abfcee quorum: Fix leak of opts in quorum_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 17:10:18 +01:00
Fam Zheng
3158593126 blkverify: Fix leak of opts in blkverify_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 17:10:18 +01:00
Fam Zheng
810f4f86b7 nfs: Fix leak of opts in nfs_file_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 17:10:18 +01:00
Richard W.M. Jones
a2f468e48f curl: Don't deref NULL pointer in call to aio_poll.
In commit 63f0f45f2e the following
mechanical change was made:

         if (!state) {
-            qemu_aio_wait();
+            aio_poll(state->s->aio_context, true);
         }

The new code now checks if state is NULL and then dereferences it
('state->s') which is obviously incorrect.

This commit replaces state->s->aio_context with
bdrv_get_aio_context(bs), fixing this problem.  The two other hunks
are concerned with getting the BlockDriverState pointer bs to where it
is needed.

The original bug causes a segfault when using libguestfs to access a
VMware vCenter Server and doing any kind of complex read-heavy
operations.  With this commit the segfault goes away.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:19:01 +01:00
Richard W.M. Jones
a94f83d94f curl: Allow a cookie or cookies to be sent with http/https requests.
In order to access VMware ESX efficiently, we need to send a session
cookie.  This patch is very simple and just allows you to send that
session cookie.  It punts on the question of how you get the session
cookie in the first place, but in practice you can just run a `curl'
command against the server and extract the cookie that way.

To use it, add file.cookie to the curl URL.  For example:

$ qemu-img info 'json: {
    "file.driver":"https",
    "file.url":"https://vcenter/folder/Windows%202003/Windows%202003-flat.vmdk?dcPath=Datacenter&dsName=datastore1",
    "file.sslverify":"off",
    "file.cookie":"vmware_soap_session=\"52a01262-bf93-ccce-d379-8dabb3e55560\""}'
image: [...]
file format: raw
virtual size: 8.0G (8589934592 bytes)
disk size: unavailable

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:11:14 +01:00
Stefan Hajnoczi
3255d1c21f virtio-blk: allow drive_del with dataplane
Now that drive_del acquires the AioContext we can safely allow deleting
the drive.  As with non-dataplane mode, all I/Os submitted by the guest
after drive_del will return EIO.

This patch makes hot unplug work with virtio-blk dataplane.  Previously
drive_del reported an error because the device was busy.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:01:48 +01:00
Stefan Hajnoczi
8ad4202bf6 block: acquire AioContext in do_drive_del()
Make drive_del safe for dataplane where another thread may be running
the BlockDriverState's AioContext.

Note the assumption that AioContext's lifetime exceeds DriveInfo and
BlockDriverState.  We release AioContext after DriveInfo and
BlockDriverState are potentially freed.

This is clearly safe with the global AioContext but also with -object
iothread and implicit iothreads created by -device
virtio-blk-pci,x-data-plane=on (their lifetime is tied to DeviceState,
not BlockDriverState).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:01:10 +01:00
Stefan Hajnoczi
2cdff7f620 linux-aio: avoid deadlock in nested aio_poll() calls
If two Linux AIO request completions are fetched in the same
io_getevents() call, QEMU will deadlock if request A's callback waits
for request B to complete using an aio_poll() loop.  This was reported
to happen with the mirror blockjob.

This patch moves completion processing into a BH and makes it resumable.
Nested event loops can resume completion processing so that request B
will complete and the deadlock will not occur.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Marcin Gibuła <m.gibula@beyond.pl>
Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Marcin Gibuła <m.gibula@beyond.pl>
2014-08-29 15:59:17 +01:00
Peter Maydell
8b3030114a Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140829' into staging
target-arm queue:
 * support PMCCNTR in ARMv8
 * various GIC fixes and cleanups
 * Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
 * Fix regression that disabled VFP for ARMv5 CPUs
 * Update to upstream VIXL 1.5

# gpg: Signature made Fri 29 Aug 2014 15:34:47 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140829:
  target-arm: Implement pmccfiltr_write function
  target-arm: Remove old code and replace with new functions
  target-arm: Implement pmccntr_sync function
  target-arm: Add arm_ccnt_enabled function
  target-arm: Implement PMCCNTR_EL0 and related registers
  arm: Implement PMCCNTR 32b read-modify-write
  target-arm: Make the ARM PMCCNTR register 64-bit
  hw/intc/arm_gic: honor target mask in gic_update()
  aarch64: raise max_cpus to 8
  arm_gic: Use GIC_NR_SGIS constant
  arm_gic: Do not force PPIs to edge-triggered mode
  arm_gic: GICD_ICFGR: Write model only for pre v1 GICs
  arm_gic: Fix read of GICD_ICFGR
  target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
  target-arm: Fix regression that disabled VFP for ARMv5 CPUs
  disas/libvixl: Update to upstream VIXL 1.5

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:48:15 +01:00
Alistair Francis
0614601cec target-arm: Implement pmccfiltr_write function
This is the function that is called when writing to the
PMCCFILTR_EL0 register

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 73da3da6404855b17d5ae82975a32ff3a4dcae3d.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:30 +01:00
Alistair Francis
942a155b20 target-arm: Remove old code and replace with new functions
Remove the old PMCCNTR code and replace it with calls to the new
pmccntr_sync() and arm_ccnt_enabled() functions.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 693a6e437d915c2195fd3dc7303f384ca538b7bf.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:30 +01:00
Alistair Francis
ec7b4ce4c7 target-arm: Implement pmccntr_sync function
This is used to synchronise the PMCCNTR counter and swap its
state between enabled and disabled if required. It must always
be called twice, both before and after any logic that could
change the state of the PMCCNTR counter.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 62811d4c0f7b1384f7aab62ea2fcfda3dcb0db50.1409025949.git.peter.crosthwaite@xilinx.com
[PMM: fixed minor typos in pmccntr_sync doc comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Alistair Francis
87124fdea4 target-arm: Add arm_ccnt_enabled function
Include a helper function to determine if the CCNT counter
is enabled.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: e1a64f17a756e06c8bda8238ad4826d705049f7a.1409025949.git.peter.crosthwaite@xilinx.com
[ PC changes
  * Remove EL based checks
]
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Alistair Francis
8521466b39 target-arm: Implement PMCCNTR_EL0 and related registers
This patch adds support for the ARMv8 version of the PMCCNTR and
related registers. It also starts to implement the PMCCFILTR_EL0
register.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: b5d1094764a5416363ee95216799b394ecd011e8.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Peter Crosthwaite
421c7ebd93 arm: Implement PMCCNTR 32b read-modify-write
The register is now 64bit, however a 32 bit write to the register
should leave the higher bits unchanged. The open coded write handler
does not implement this, so we need to read-modify-write accordingly.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alistair Francis <alistair23@gmail.com>
Message-id: ec350573424bb2adc1701c3b9278d26598e2f2d1.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Alistair Francis
c92c06872a target-arm: Make the ARM PMCCNTR register 64-bit
This makes the PMCCNTR register 64-bit to allow for the
64-bit ARMv8 version.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 6c5bac5fd0ea54963b1fc0e7f9464909f2e19a73.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Sergey Fedorov
b52b81e44f hw/intc/arm_gic: honor target mask in gic_update()
Take IRQ target mask into account when determining the highest priority
pending interrupt.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1407947471-26981-1-git-send-email-serge.fdrv@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Joel Schopp
d3579f362f aarch64: raise max_cpus to 8
I'm running on a system with 8 cpus and it would be nice to have qemu
support all of them.  The attached patch does that and has been tested.

That said, I'm not sure if 8 is enough or if we want to bump this even higher
now before systems with many more cpus come along. 255 anyone?

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Message-id: 20140819213304.19537.2834.stgit@joelaarch64.amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Adam Lackorzynski
93b5f6f1a6 arm_gic: Use GIC_NR_SGIS constant
Use constant rather than a plain number.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-5-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Adam Lackorzynski
de7a900f0c arm_gic: Do not force PPIs to edge-triggered mode
Only SGIs must be WI, done by forcing them to their default
(edge-triggered).

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-4-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Adam Lackorzynski
24b790df43 arm_gic: GICD_ICFGR: Write model only for pre v1 GICs
Setting the model is only available in pre-v1 GIC models.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-3-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Adam Lackorzynski
71a62046ae arm_gic: Fix read of GICD_ICFGR
The GICD_ICFGR register covers 4 interrupts per byte.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-2-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Peter Maydell
c379621451 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
We implement the crypto extensions but were incorrectly reporting
ID register values for the Cortex-A57 which did not advertise
crypto. Use the correct values as described in the TRM.
With this fix Linux correctly detects presence of the crypto
features and advertises them in /proc/cpuinfo.

Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408718660-7295-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Peter Maydell
ed1f13d607 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
Commit 2c7ffc414 added support for honouring the CPACR coprocessor
access control register bits which may disable access to VFP
and Neon instructions. However it failed to account for the
fact that the CPACR is only present starting from the ARMv6
architecture version, so it accidentally disabled VFP completely
for ARMv5 CPUs like the ARM926. Linux would detect this as
"no VFP present" and probably fall back to its own emulation,
but other guest OSes might crash or misbehave.

This fixes bug LP:1359930.

Reported-by: Jakub Jermar <jakub@jermar.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408714940-7192-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Peter Maydell
508280f566 disas/libvixl: Update to upstream VIXL 1.5
Update our copy of libvixl to upstream's 1.5 release.
This includes the upstream versions of the fixes we
were carrying locally (commit ffebe899).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1407162987-4659-1-git-send-email-peter.maydell@linaro.org
2014-08-29 15:00:27 +01:00
Stefan Hajnoczi
12ade76090 qemu-iotests: add multiwrite test cases
This test case covers the basic bdrv_aio_multiwrite() scenarios:
1. Single request
2. Sequential requests (AABB)
3. Superset overlapping requests (AABBAA)
4. Subset overlapping requests (BBAABB)
5. Head overlapping requests (AABB)
6. Tail overlapping requests (BBAA)
7. Disjoint requests (AA BB)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-29 14:10:15 +01:00
Stefan Hajnoczi
391827eb10 block: fix overlapping multiwrite requests
When request A is a strict superset of request B:

  AAAAAAAA
    BBBB

multiwrite_merge() merges them as follows:

  AABBBB

The tail of request A should have been included:

  AABBBBAA

This patch fixes data loss but this code path is probably rare.  Since
guests cannot assume ordering between in-flight requests, few
applications submit overlapping write requests.

Reported-by: Slava Pestov <sviatoslav.pestov@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-29 14:09:43 +01:00
Peter Maydell
d9aa688557 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140829-1' into staging
usb: bugfix collection.
usb: add cleanup functions for host adapters,
     in preparation for hotplug support.
usb: add simple qtests for uhci,ohci,xhci.

# gpg: Signature made Fri 29 Aug 2014 12:56:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140829-1:
  tests: add xHCI qtest
  tests: add UHCI qtest
  tests: add OHCI qtest
  usb: add usb host adapters exit trace
  usb-xhci: add exit function
  usb-ehci: add ehci-pci device exit function
  usb-ehci: add ehci unrealize funciton
  usb-ehci: add vmstate properity for EHCIState
  usb-uhci: clean up uhci resource when pci-uhci exit
  usb-ohci: add exit function
  usb-ohci: Fix memory leak for ohci timer
  usb: add usb_bus_release function
  Revert "xhci: Fix number of streams allocated when using streams"
  xhci: use (1u << i)
  Fix OHCI ISO TD state never being written back.
  xhci: fix debug print compiling error
  usb: Fix bootindex for portnr > 9

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 13:08:04 +01:00
Gonglei
25e89ec5d2 tests: add xHCI qtest
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:53:47 +02:00
Gonglei
44ced58e3a tests: add UHCI qtest
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:53:47 +02:00
Gonglei
28edfce0f3 tests: add OHCI qtest
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:53:47 +02:00
Gonglei
d733f74c33 usb: add usb host adapters exit trace
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
53c30545fb usb-xhci: add exit function
clean up xhci resource when xhci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
96e14926c6 usb-ehci: add ehci-pci device exit function
clean up ehci resource when ehci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
4e130cf6a8 usb-ehci: add ehci unrealize funciton
cleanup ehci controller resource, both pci and sysbus
if they're necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
05a36991c5 usb-ehci: add vmstate properity for EHCIState
since hotunplug the ehci host adapter, we should
delete vm_change_state_handler also, so the
VMChangeStateEntry should be saved in EHCIState.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
3a3464b000 usb-uhci: clean up uhci resource when pci-uhci exit
clean up uhci resource when uhci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:13 +02:00
Gonglei
07832c38d3 usb-ohci: add exit function
clean up ohci resource when ohci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:13 +02:00
Gonglei
80be63df5a usb-ohci: Fix memory leak for ohci timer
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:44 +02:00
Gonglei
e5a9bece9b usb: add usb_bus_release function
add global variables releasing logic when the usb buses
were removed or hot-unpluged.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:44 +02:00
Gerd Hoffmann
f90e160b50 Revert "xhci: Fix number of streams allocated when using streams"
This reverts commit d063c3112c.

"2 << x" is the same as "2 ^ (x + 1)", so the old code is correct.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:44 +02:00
Gerd Hoffmann
3d80365b55 xhci: use (1u << i)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 12:51:43 +02:00
Jack Un
cae7f29c47 Fix OHCI ISO TD state never being written back.
There appears to be typo in OHCI with isochronous transfers
resulting in isoch. transfer descriptor state never being written back.
The'put_words' function is in a OR statement hence it is never called.

Signed-off-by: Jack Un <jack.un@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:43 +02:00
Gonglei
8c244210d8 xhci: fix debug print compiling error
after commit 003e15a180
the DPRINTF will broke compiling, adjust its location.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:43 +02:00
Markus Armbruster
830cd54fca usb: Fix bootindex for portnr > 9
We identify devices by their Open Firmware device paths.  The encoding
of the host controller and hub port numbers is incorrect:
usb_get_fw_dev_path() formats them in decimal, while SeaBIOS uses
hexadecimal.  When some port number > 9, SeaBIOS will miss the
bootindex (lucky case), or apply it to another device (unlucky case).

The relevant spec[*] agrees with SeaBIOS (and OVMF, for that matter).
Change %d to %x.

Bug can bite only with host controllers or hubs sporting more than ten
ports.  I'm not aware of any.

[*] Open Firmware Recommended Practice: Universal Serial Bus,
Version 1, Section 3.2.1 Device Node Address Representation
http://www.openfirmware.org/1275/bindings/usb/usb-1_0.ps

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Note: xhci can be configured with up to 15 ports (default is 4 ports).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:43 +02:00
Max Reitz
f21492817b nbd: Follow the BDS' AIO context
Keep the NBD server always in the same AIO context as the exported BDS
by calling bdrv_add_aio_context_notifier() and implementing the required
callbacks.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:48:45 +01:00
Max Reitz
33384421b3 block: Add AIO context notifiers
If a long-running operation on a BDS wants to always remain in the same
AIO context, it somehow needs to keep track of the BDS changing its
context. This adds a function for registering callbacks on a BDS which
are called whenever the BDS is attached or detached from an AIO context.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:48:45 +01:00
Max Reitz
958c717df9 nbd: Drop nbd_can_read()
There is no variant of aio_set_fd_handler() like qemu_set_fd_handler2(),
so we cannot give a can_read() callback function. Instead, unregister
the nbd_read() function whenever we cannot read and re-register it as
soon as we can read again.

All this is hidden behind the functions nbd_set_handlers() (which
registers all handlers for the AIO context and file descriptor belonging
to the given client), nbd_unset_handlers() (which unregisters them) and
nbd_update_can_read() (which checks whether NBD can read for the given
client and acts accordingly).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:48:45 +01:00
Liu Yuan
a780dea045 sheepdog: fix a core dump while do auto-reconnecting
We should reinit local_err as NULL inside the while loop or g_free() will report
corrupption and abort the QEMU when sheepdog driver tries reconnecting.

This was broken in commit 356b4ca.

qemu-system-x86_64: failed to get the header, Resource temporarily unavailable
qemu-system-x86_64: Failed to connect to socket: Connection refused
qemu-system-x86_64: (null)
[xcb] Unknown sequence number while awaiting reply
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
qemu-system-x86_64: ../../src/xcb_io.c:298: poll_for_response: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Aborted (core dumped)

Cc: qemu-devel@nongnu.org
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
b493317d34 aio-win32: add support for sockets
Uses the same select/WSAEventSelect scheme as main-loop.c.
WSAEventSelect() is edge-triggered, so it cannot be used
directly, but it is still used as a way to exit from a
blocking g_poll().

Before g_poll() is called, we poll sockets with a non-blocking
select() to achieve the level-triggered semantics we require:
if a socket is ready, the g_poll() is made non-blocking too.

Based on a patch from Or Goshen.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
79d9b6566b qemu-coroutine-io: fix for Win32
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
a3462c6561 AioContext: introduce aio_prepare
This will be used to implement socket polling on Windows.
On Windows, select() and g_poll() are completely different;
sockets are polled with select() before calling g_poll,
and the g_poll must be nonblocking if select() says a
socket is ready.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
0a9dd1664a aio-win32: add aio_set_dispatching optimization
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
363285d4b3 test-aio: test timers on Windows too
Use EventNotifier instead of a pipe, which makes it trivial to test
timers on Windows.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
e4c7e2d12d AioContext: export and use aio_dispatch
So far, aio_poll's scheme was dispatch/poll/dispatch, where
the first dispatch phase was used only in the GSource case in
order to avoid a blocking poll.  Earlier patches changed it to
dispatch/prepare/poll/dispatch, where prepare is aio_compute_timeout.

By making aio_dispatch public, we can remove the first dispatch
phase altogether, so that both aio_poll and the GSource use the same
prepare/poll/dispatch scheme.

This patch breaks the invariant that aio_poll(..., true) will not block
the first time it returns false.  This used to be fundamental for
qemu_aio_flush's implementation as "while (qemu_aio_wait()) {}" but
no code in QEMU relies on this invariant anymore.  The return value
of aio_poll() is now comparable with that of g_main_context_iteration.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
3672fa5083 AioContext: run bottom halves after polling
Make the dispatching phase the same before blocking and afterwards.
The next patch will make aio_dispatch public and use it directly
for the GSource case, instead of aio_poll.  aio_poll can then be
simplified heavily.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
a398dea34c aio-win32: Factor out duplicate code into aio_dispatch_handlers
Later, the call to aio_dispatch will move int the GSource wrapper, while the
standalone case will still be call the component functions aio_bh_poll,
aio_dispatch_handlers and timerlistgroup_run_timers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
d397ec99bb aio-win32: Evaluate timers after handles
This is similar to what aio_poll does in the stand-alone case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
845ca10dd0 AioContext: take bottom halves into account when computing aio_poll timeout
Right now, QEMU invokes aio_bh_poll before the "poll" phase
of aio_poll.  It is simpler to do it afterwards and skip the
"poll" phase altogether when the OS-dependent parts of AioContext
are invoked from GSource.  This way, AioContext behaves more
similarly when used as a GSource vs. when used as stand-alone.

As a start, take bottom halves into account when computing the
poll timeout.  If a bottom half is ready, do a non-blocking
poll.  As a side effect, this makes idle bottom halves work
with aio_poll; an improvement, but not really an important
one since they are deprecated.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Stefan Hajnoczi
3cbbe9fd1f blockdev: fix drive-mirror 'granularity' error message
Name the 'granularity' parameter and give its expected value range.
Previously the device name was mistakenly reported as the parameter
name.

Note that the error class is unchanged from ERROR_CLASS_GENERIC_ERROR.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-29 10:46:58 +01:00
Fam Zheng
0b9caf9b31 coroutine: Drop co_sleep_ns
block_job_sleep_ns is the only user. Since we are moving towards
AioContext aware code, it's better to use the explicit version and drop
the old one.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Liu Yuan
a9db86b223 block/quorum: add simple read pattern support
This patch adds single read pattern to quorum driver and quorum vote is default
pattern.

For now we do a quorum vote on all the reads, it is designed for unreliable
underlying storage such as non-redundant NFS to make sure data integrity at the
cost of the read performance.

For some use cases as following:

        VM
  --------------
  |            |
  v            v
  A            B

Both A and B has hardware raid storage to justify the data integrity on its own.
So it would help performance if we do a single read instead of on all the nodes.
Further, if we run VM on either of the storage node, we can make a local read
request for better performance.

This patch generalize the above 2 nodes case in the N nodes. That is,

vm -> write to all the N nodes, read just one of them. If single read fails, we
try to read next node in FIFO order specified by the startup command.

The 2 nodes case is very similar to DRBD[1] though lack of auto-sync
functionality in the single device/node failure for now. But compared with DRBD
we still have some advantages over it:

- Suppose we have 20 VMs running on one(assume A) of 2 nodes' DRBD backed
storage. And if A crashes, we need to restart all the VMs on node B. But for
practice case, we can't because B might not have enough resources to setup 20 VMs
at once. So if we run our 20 VMs with quorum driver, and scatter the replicated
images over the data center, we can very likely restart 20 VMs without any
resource problem.

After all, I think we can build a more powerful replicated image functionality
on quorum and block jobs(block mirror) to meet various High Availibility needs.

E.g, Enable single read pattern on 2 children,

-drive driver=quorum,children.0.file.filename=0.qcow2,\
children.1.file.filename=1.qcow2,read-pattern=fifo,vote-threshold=1

[1] http://en.wikipedia.org/wiki/Distributed_Replicated_Block_Device

[Dropped \n from an error_setg() error message
--Stefan]

Cc: Benoit Canet <benoit@irqsave.net>
Cc: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Liu Yuan
62c6031f96 qapi: add read-pattern enum for quorum
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Hitoshi Mitake
38890b246d sheepdog: improve error handling for a case of failed lock
Recently, sheepdog revived its VDI locking functionality. This patch
updates sheepdog driver of QEMU for this feature. It changes an error
code for a case of failed locking. -EBUSY is a suitable one.

Reported-by: Valerio Pachera <sirio81@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Hitoshi Mitake
1dbfafed7f sheepdog: adopting protocol update for VDI locking
The update is required for supporting iSCSI multipath. It doesn't
affect behavior of QEMU driver but adding a new field to vdi request
struct is required.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Stefan Hajnoczi
40ed35a3c4 qemu-img: always goto out in img_snapshot() error paths
The out label has the qemu_progress_end() and other cleanup calls.
Always goto out in error paths so the cleanup happens.  These error
paths now return 1 instead of -1.

Note that bdrv_unref(NULL) is safe.  We just need to initialize bs to
NULL at the top of the function.

We can now remove the obsolete bs_old_backing = NULL and bs_new_backing
= NULL for safe mode.  Originally it was necessary in commit 3e85c6fd
("qemu-img rebase") but became useless in commit c2abcce ("qemu-img:
avoid calling exit(1) to release resources properly") because the
variables are already initialized during declaration.

Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-08-29 10:46:57 +01:00
Stefan Hajnoczi
cbda016d94 qemu-img: fix img_compare() flags error path
If img_compare() fails to parse the cache flags the goto out3 code path
will call qemu_progress_end().  Make sure we actually call
qemu_progress_init() first.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-08-29 10:46:57 +01:00
Stefan Hajnoczi
a3981eb978 qemu-img: fix img_commit() error return value
The img_commit() return value is a process exit code.  Use 1 for failure
instead of -1.  The other failure paths in this function already use 1.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-08-29 10:46:57 +01:00
Daniel Henrique Barboza
212aefaa53 block.curl: adding 'timeout' option
The curl hardcoded timeout (5 seconds) sometimes is not long
enough depending on the remote server configuration and network
traffic. The user should be able to set how much long he is
willing to wait for the connection.

Adding a new option to set this timeout gives the user this
flexibility. The previous default timeout of 5 seconds will be
used if this option is not present.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Markus Armbruster
28fa7133b8 ide: Fix bootindex for bus_id > 9
We identify devices by their Open Firmware device paths.  The encoding
of bus numbers is incorrect: idebus_get_fw_dev_path() formats them in
decimal, while SeaBIOS uses hexadecimal.  With bus number > 9, SeaBIOS
will miss the bootindex (lucky case), or apply it to another device
(unlucky case).

Bug can't bite right now: ich9-ahci has six ports, and the sysbus-ahci
created by Calxeda Highbank has just one.

Fix it anyway, by changing %d to %x.

I couldn't find an Open Firmware spec covering this.  For what it's
worth, OVMF agrees with SeaBIOS.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Le Tan
b5a280c008 intel-iommu: add IOTLB using hash table
Add IOTLB to cache information about the translation of input-addresses. IOTLB
use a GHashTable as cache. The key of the hash table is the logical-OR of gfn
and source id after left-shifting.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
d92fa2dc6e intel-iommu: add context-cache to cache context-entry
Add context-cache to cache context-entry encountered on a page-walk. Each
VTDAddressSpace has a member of VTDContextCacheEntry which represents an entry
in the context-cache. Since devices with different bus_num and devfn have their
respective VTDAddressSpace, this will be a good way to reference the cached
entries.
Each VTDContextCacheEntry will have a context_cache_gen and the cached entry
is valid only when context_cache_gen equals IntelIOMMUState.context_cache_gen.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
ed7b8fbcfb intel-iommu: add supports for queued invalidation interface
Add supports for queued invalidation interface, an expended invalidation
interface with extended capabilities.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
ac40aa1540 intel-iommu: fix coding style issues around in q35.c and machine.c
Fix coding style issues around in hw/pci-host/q35.c and hw/core/machine.c.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
a52a7fdfa7 intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
Add Intel IOMMU emulation to q35 chipset and expose it to the guest.
1. Add a machine option. Users can use "-machine iommu=on|off" in the command
line to enable/disable Intel IOMMU. The default is off.
2. Accroding to the machine option, q35 will initialize the Intel IOMMU and
use pci_setup_iommu() to setup q35_host_dma_iommu() as the IOMMU function for
the pci bus.
3. q35_host_dma_iommu() will return different address space according to the
bus_num and devfn of the device.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
d4eb911935 intel-iommu: add DMAR table to ACPI tables
Expose Intel IOMMU to the BIOS. If object of TYPE_INTEL_IOMMU_DEVICE exists,
add DMAR table to ACPI RSDT table. For now the DMAR table indicates that there
is only one hardware unit without INTR_REMAP capability on the platform.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
1da12ec4c8 intel-iommu: introduce Intel IOMMU (VT-d) emulation
Add support for emulating Intel IOMMU according to the VT-d specification for
the q35 chipset machine. Implement the logics for DMAR (DMA remapping) without
PASID support. The emulation supports register-based invalidation and primary
fault logging.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
8d7b8cb9c2 iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps
Add a bool variable is_write as a parameter to the translate function of
MemoryRegionIOMMUOps to indicate the operation of the access. It can be
used for correct fault reporting from within the callback.
Change the interface of related functions.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Peter Maydell
a6aebb38ba Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI patches include bug fixes from Fam and Peter, improved error
reporting from Fam and a fix for DPRINTF bitrot.  Memory patches try
again to initialize name from the QOM name.

# gpg: Signature made Thu 28 Aug 2014 15:10:31 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  memory: Lazy init name from QOM name as needed
  xen: hvm: Abstract away memory region name ref
  xen-hvm: Constify string
  virtio-scsi: Report error if num_queues is 0 or too large
  scsi-generic: remove superfluous DPRINTF avoid to break compiling
  block/iscsi: fix memory corruption on iscsi resize
  scsi-bus: Convert DeviceClass init to realize
  block: Pass errp in blkconf_geometry

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 17:08:13 +01:00
Peter Maydell
38a01e55d2 Merge remote-tracking branch 'remotes/kvm/tags/for-upstream' into staging
Mostly bugfixes + Alexey's interface-based implementation
of the NMI monitor command.

# gpg: Signature made Thu 28 Aug 2014 15:07:22 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/kvm/tags/for-upstream:
  mc146818rtc: reinitialize irq_reinject_on_ack_count on reset
  target-i386: Add "tsc_adjust" CPU feature name
  target-i386: Add "mpx" CPU feature name
  vl: process -object after other backend options
  checkpatch.pl: adjust typedef definition to QEMU coding style
  x86: Clear MTRRs on vCPU reset
  x86: kvm: Add MTRR support for kvm_get|put_msrs()
  x86: Use common variable range MTRR counts
  target-i386: Don't forbid NX bit on PAE PDEs and PTEs
  spapr: Add support for new NMI interface
  s390x: Migrate to new NMI interface
  s390x: Convert QEMUMachine to MachineClass
  cpus: Define callback for QEMU "nmi" command
  kvm: run cpu state synchronization on target vcpu thread

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 16:07:23 +01:00
Peter Crosthwaite
d1dd32af6f memory: Lazy init name from QOM name as needed
To support name retrieval of MemoryRegions that were created
dynamically (that is, not via memory_region_init and friends). We
cache the name in MemoryRegion's state as
object_get_canonical_path_component mallocs the returned value
so it's not suitable for direct return to callers. Memory already
frees the name field, so this will be garbage collected along with
the MR object.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-28 16:09:44 +02:00
Peter Crosthwaite
3e1f50867b xen: hvm: Abstract away memory region name ref
The mr->name field is removed. This slipped through compile testing.
Fix.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-28 16:09:44 +02:00
Peter Crosthwaite
dc6c4fe837 xen-hvm: Constify string
It's constant, and sourced from existing const strings. Avoid dodgy
casts by converting to const.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-28 16:09:44 +02:00
Peter Maydell
795c050e37 Merge remote-tracking branch 'remotes/stefanha/tags/fix-buildbot-12082014-pull-request' into staging
Pull request

# gpg: Signature made Thu 28 Aug 2014 13:43:00 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/fix-buildbot-12082014-pull-request:
  Revert "qemu-img: sort block formats in help message"
  block: sort formats alphabetically in bdrv_iterate_format()
  mirror: fix uninitialized variable delay_ns warnings
  trace: avoid Python 2.5 all() in tracetool
  libqtest: launch QEMU with QEMU_AUDIO_DRV=none
  qapi.py: avoid Python 2.5+ any() function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 14:51:12 +01:00
Stefan Hajnoczi
00c6d403a3 Revert "qemu-img: sort block formats in help message"
This reverts commit 1a443c1b8b and the
later commit 395071a763.

GSequence was introduced in glib 2.14.  RHEL 5 fails to compile since it
uses glib 2.12.3.

Now that bdrv_iterate_format() invokes the iteration callback in sorted
order these commits are unnecessary.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
ada4240103 block: sort formats alphabetically in bdrv_iterate_format()
Format names are best consumed in alphabetical order.  This makes
human-readable output easy to produce.

bdrv_iterate_format() already has an array of format strings.  Sort them
before invoking the iteration callback.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
6d0de8eb21 mirror: fix uninitialized variable delay_ns warnings
The gcc 4.1.2 compiler warns that delay_ns may be uninitialized in
mirror_iteration().

There are two break statements in the do ... while loop that skip over
the delay_ns assignment.  These are probably the cause of the warning.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
73735f7218 trace: avoid Python 2.5 all() in tracetool
Red Hat Enterprise Linux 5 ships Python 2.4.3.  The all() function was
added in Python 2.5 so we cannot use it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
6b02921605 libqtest: launch QEMU with QEMU_AUDIO_DRV=none
No test case actually uses the audio backend.  Disable audio to prevent
warnings on hosts with no sound hardware present:

  GTESTER check-qtest-aarch64
  sdl: SDL_OpenAudio failed
  sdl: Reason: No available audio device
  sdl: SDL_OpenAudio failed
  sdl: Reason: No available audio device
  audio: Failed to create voice `lm4549.out'

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
7ac9a9d6e1 qapi.py: avoid Python 2.5+ any() function
There is one instance of any() in qapi.py that breaks builds on older
distros that ship Python 2.4 (like RHEL5):

  GEN   qmp-commands.h
Traceback (most recent call last):
  File "build/scripts/qapi-commands.py", line 445, in ?
    exprs = parse_schema(input_file)
  File "build/scripts/qapi.py", line 329, in parse_schema
    schema = QAPISchema(open(input_file, "r"))
  File "build/scripts/qapi.py", line 110, in __init__
    if any(include_path == elem[1]
NameError: global name 'any' is not defined

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Paolo Bonzini
172dbc52b3 mc146818rtc: reinitialize irq_reinject_on_ack_count on reset
This field was forgotten, and it makes the state after reset
non-deterministic.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-27 17:54:52 +02:00
Peter Maydell
0265361a72 Merge remote-tracking branch 'remotes/mcayland/qemu-openbios' into staging
* remotes/mcayland/qemu-openbios:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-26 14:18:40 +01:00
Eduardo Habkost
7b458bfd12 target-i386: Add "tsc_adjust" CPU feature name
tsc_adjust migration support is already implemented (commit
f28558d3d3), so we can add it to the list
of known feature names.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 14:52:43 +02:00
Eduardo Habkost
5bd8ff07e6 target-i386: Add "mpx" CPU feature name
Migration support for MPX is already implemented (commit
79e9ebebbf), so we can add it to the list
of known feature names.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 14:52:38 +02:00
Mark Cave-Ayland
d2a68f3a30 Update OpenBIOS images
Update OpenBIOS images to SVN r1316 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-26 13:52:15 +01:00
Paolo Bonzini
7b71758d79 vl: process -object after other backend options
QOM backends can refer to chardevs, but not vice versa.  So
process -chardev and -fsdev options before -object

This fixes the rng-egd backend to virtio-rng.

Reported-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:44:39 +02:00
Paolo Bonzini
a6859deb69 checkpatch.pl: adjust typedef definition to QEMU coding style
Most QEMU typedefs are camelcase, starting with one uppercase letter
and containing at least one lowercase letter.  There are a few
all-uppercase types, add the most common too.

This fixes recognition of types in lines such as

    static __attribute__((unused)) inline void tcg_out8(TCGContext *s, uint8_t v)

(Example provided by Peter Maydell).

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:44:28 +02:00
Fam Zheng
c9f6552803 virtio-scsi: Report error if num_queues is 0 or too large
No cmd vq surprises guest (Linux panics in virtscsi_probe), too many
queues abort qemu (in the following virtio_add_queue).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Gonglei
f93d2c15d6 scsi-generic: remove superfluous DPRINTF avoid to break compiling
variables lun and tag had been eliminated, break compiling
when enable debug switch. Meanwhile traces provide the same
information with this DPRINTF, so remove it.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Peter Lieven
9db693f764 block/iscsi: fix memory corruption on iscsi resize
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Fam Zheng
a818a4b69d scsi-bus: Convert DeviceClass init to realize
Replace "init/destroy" with "realize/unrealize" in SCSIDeviceClass,
which has errp as a parameter. So all the implementations now use
error_setg instead of error_report for reporting error.

Also in scsi_bus_legacy_handle_cmdline, report the error when
initializing the if=scsi devices, before returning it, because in the
callee, error_report is changed to error_setg. And the callers don't
have the right locations (e.g. "-drive if=scsi").

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Fam Zheng
5ff5efb46c block: Pass errp in blkconf_geometry
This allows us to pass error information to caller.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Peter Maydell
c47c61be8d Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140825.0' into staging
VFIO: Enable primary NVIDIA quirk regardless of VGA support

# gpg: Signature made Mon 25 Aug 2014 20:29:37 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140825.0:
  vfio: Enable NVIDIA 88000 region quirk regardless of VGA

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-26 10:42:06 +01:00
Alex Williamson
fe08275db9 vfio: Enable NVIDIA 88000 region quirk regardless of VGA
If we make use of OVMF for the BIOS then we can use GPUs without VGA
space access, but we still need this quirk.  Disassociate it from the
x-vga option and enable it on all NVIDIA VGA display class devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-25 12:10:15 -06:00
Peter Maydell
a44a12b78a Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features

A bunch of bugfixes - these will make sense for 2.1.1

ACPI support for TPM and partial ARI support for PCIE.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 24 Aug 2014 23:16:35 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pcie: fix trailing whitespace
  ioh3420: Enable ARI forwarding
  ioh3420: Remove obsoleted, unused ioh3420_init function
  pcie: Rename the pcie_cap_ari_* functions to pcie_cap_arifwd_*
  pcie: Fix incorrect write to the ari capability next function field
  ssdt-tpm: add generated hex file to git
  Add ACPI tables for TPM
  pc: reserve more memory for ACPI for new machine types
  pcihp: fix possible array out of bounds
  pci_bridge: manually destroy memory regions within PCIBridgeWindows
  hostmem: set MPOL_MF_MOVE

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-25 18:49:25 +01:00
Alex Williamson
9db2efd95e x86: Clear MTRRs on vCPU reset
The SDM specifies (June 2014 Vol3 11.11.5):

    On a hardware reset, the P6 and more recent processors clear the
    valid flags in variable-range MTRRs and clear the E flag in the
    IA32_MTRR_DEF_TYPE MSR to disable all MTRRs. All other bits in the
    MTRRs are undefined.

We currently do none of that, so whatever MTRR settings you had prior
to reset is what you have after reset.  Usually this doesn't matter
because KVM often ignores the guest mappings and uses write-back
anyway.  However, if you have an assigned device and an IOMMU that
allows NoSnoop for that device, KVM defers to the guest memory
mappings which are now stale after reset.  The result is that OVMF
rebooting on such a configuration takes a full minute to LZMA
decompress the firmware volume, a process that is nearly instant on
the initial boot.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
Alex Williamson
d1ae67f626 x86: kvm: Add MTRR support for kvm_get|put_msrs()
The MTRR state in KVM currently runs completely independent of the
QEMU state in CPUX86State.mtrr_*.  This means that on migration, the
target loses MTRR state from the source.  Generally that's ok though
because KVM ignores it and maps everything as write-back anyway.  The
exception to this rule is when we have an assigned device and an IOMMU
that doesn't promote NoSnoop transactions from that device to be cache
coherent.  In that case KVM trusts the guest mapping of memory as
configured in the MTRR.

This patch updates kvm_get|put_msrs() so that we retrieve the actual
vCPU MTRR settings and therefore keep CPUX86State synchronized for
migration.  kvm_put_msrs() is also used on vCPU reset and therefore
allows future modificaitons of MTRR state at reset to be realized.

Note that the entries array used by both functions was already
slightly undersized for holding every possible MSR, so this patch
increases it beyond the 28 new entries necessary for MTRR state.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
Alex Williamson
d8b5c67b05 x86: Use common variable range MTRR counts
We currently define the number of variable range MTRR registers as 8
in the CPUX86State structure and vmstate, but use MSR_MTRRcap_VCNT
(also 8) to report to guests the number available.  Change this to
use MSR_MTRRcap_VCNT consistently.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
William Grant
1844e68eca target-i386: Don't forbid NX bit on PAE PDEs and PTEs
Commit e8f6d00c30 ("target-i386: raise
page fault for reserved physical address bits") added a check that the
NX bit is not set on PAE PDPEs, but it also added it to rsvd_mask for
the rest of the function. This caused any PDEs or PTEs with NX set to be
erroneously rejected, making PAE guests with NX support unusable.

Signed-off-by: William Grant <wgrant@ubuntu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
Peter Maydell
3dd359c2d3 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-24' into staging
trivial patches for 2014-08-24

# gpg: Signature made Sun 24 Aug 2014 14:28:49 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-08-24:
  vmxnet3: Pad short frames to minimum size (60 bytes)
  libdecnumber: Fix warnings from smatch (missing static, boolean operations)
  linux-user: fix file descriptor leaks
  po: Fix Makefile rules for in-tree builds without configuration
  slirp/misc: Use the GLib memory allocation APIs
  configure: no need to mkdir QMP
  dma: axidma: Variablise repeated s->streams[i] sub-expr
  microblaze: ml605: Get rid of ddr_base variable
  tests/bios-tables-test: check the value returned by fopen()
  tcg: dump op count into qemu log
  util/path: Use the GLib memory allocation routines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-25 17:34:30 +01:00
Alexey Kardashevskiy
3431648272 spapr: Add support for new NMI interface
This implements an NMI interface POWERPC SPAPR machine.
This enables an "nmi" HMP/QMP command supported on SPAPR.

This calls POWERPC_EXCP_RESET (vector 0x100) in the guest to deliver NMI
to every CPU. The expected result is XMON (in-kernel debugger) invocation.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy
3dd7852f19 s390x: Migrate to new NMI interface
This implements an NMI interface for s390 and s390-ccw machines.

This removes #ifdef s390 branch in qmp_inject_nmi so new s390's
nmi_monitor_handler() callback is going to be used for NMI.

Since nmi_monitor_handler()-calling code is platform independent,
CPUState::cpu_index is used instead of S390CPU::env.cpu_num.
There should not be any change in behaviour as both @cpu_index and
@cpu_num are global CPU numbers.

Note that s390_cpu_restart() already takes care of the specified cpu,
so we don't need to schedule via async_run_on_cpu().

Since the only error s390_cpu_restart() can return is ENOSYS, convert
it to QERR_UNSUPPORTED.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy
d07aa7c7bb s390x: Convert QEMUMachine to MachineClass
This converts s390-virtio and s390-ccw-virtio machines to QOM MachineClass.
This brings ability to add interfaces to the machine classes. The first
interface for addition will be NMI.

The patch is mechanical so no change in behavior is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy
9cb805fd26 cpus: Define callback for QEMU "nmi" command
This introduces an NMI (Non Maskable Interrupt) interface with
a single nmi_monitor_handler() method. A machine or a device can
implement it. This searches for an QOM object with this interface
and if it is implemented, calls it. The callback implements an action
required to cause debug crash dump on in-kernel debugger invocation.
The callback returns Error**.

This adds a nmi_monitor_handle() helper which walks through
all objects to find the interface. The interface method is called
for all found instances.

This adds support for it in qmp_inject_nmi(). Since no architecture
supports it at the moment, there is no change in behaviour.

This changes inject-nmi command description for HMP and QMP.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Michael S. Tsirkin
187de915e8 pcie: fix trailing whitespace
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:07 +02:00
Knut Omang
a74b870270 ioh3420: Enable ARI forwarding
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Knut Omang
0f9b1771cc ioh3420: Remove obsoleted, unused ioh3420_init function
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Knut Omang
821be9dbb2 pcie: Rename the pcie_cap_ari_* functions to pcie_cap_arifwd_*
Rename helper functions to make a clearer distinction between
the PCIe capability/control register feature ARI forwarding and a
device that supports the ARI feature via an ARI extended PCIe capability.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Knut Omang
ec70b46bab pcie: Fix incorrect write to the ari capability next function field
PCI_ARI_CAP_NFN, a macro for reading next function was used instead of
the intended write.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Michael S. Tsirkin
cec391d752 ssdt-tpm: add generated hex file to git
Needed for systems without IASL.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Stefan Berger
711b20b479 Add ACPI tables for TPM
Add an SSDT ACPI table for the TPM device.
Add a TCPA table for BIOS logging area when a TPM is being used.

The latter follows this spec here:

http://www.trustedcomputinggroup.org/files/static_page_files/DCD4188E-1A4B-B294-D050A155FB6F7385/TCG_ACPIGeneralSpecification_PublicReview.pdf

This patch has Michael Tsirkin's patches folded in.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Michael S. Tsirkin
927766c7d3 pc: reserve more memory for ACPI for new machine types
commit 868270f23d
    acpi-build: tweak acpi migration limits
broke kernel loading with -kernel/-initrd: it doubled
the size of ACPI tables but did not reserve
enough memory.

As a result, issues on boot and halt are observed.

Fix this up by doubling reserved memory for new machine types.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Gonglei
fa365d7cd1 pcihp: fix possible array out of bounds
Prevent out-of-bounds array access on
acpi_pcihp_pci_status.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2014-08-25 00:16:06 +02:00
Paolo Bonzini
9f6b2f1c64 pci_bridge: manually destroy memory regions within PCIBridgeWindows
The regions are destroyed and recreated on configuration space accesses.
We need to destroy them before the containing PCIBridgeWindows object
is freed.

Reported-by: Gonglei <arei.gonglei@huawei.com>
Reported-by: Knut Omang <knut.omang@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Ben Draper
40a87c6c9b vmxnet3: Pad short frames to minimum size (60 bytes)
When running VMware ESXi under qemu-kvm the guest discards frames
that are too short. Short ARP Requests will be dropped, this prevents
guests on the same bridge as VMware ESXi from communicating. This patch
simply adds the padding on the network device itself.

Signed-off-by: Ben Draper <ben@xrsa.net>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 17:11:08 +04:00
Stefan Weil
d072cdf3ba libdecnumber: Fix warnings from smatch (missing static, boolean operations)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:21:06 +04:00
zhanghailiang
680dfde919 linux-user: fix file descriptor leaks
Handle variable "fd_orig" going out of scope leaks the handle.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:18:28 +04:00
Stefan Weil
bcc55f327c po: Fix Makefile rules for in-tree builds without configuration
Adding 'update' to the phony targets fixes this error:

$ LANG=C make -C po update
make: Entering directory `/qemu/po'
  LINK  update
/qemu/po/de_DE.po: file not recognized: File format not recognized
collect2: error: ld returned 1 exit status
make: *** [update] Error 1
make: Leaving directory `/qemu/po'

Some other phony targets (build, install) were also added, and the
existing .PHONY statement was moved to a more prominent position at
the beginning of the Makefile.

The patch also fixes a 2nd bug. The default target should be 'all',
but instead 'modules' (from rules.mak) was the default. Fix this by
adding 'all' as a target before any include statement.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:42 +04:00
zhanghailiang
2fd5d86409 slirp/misc: Use the GLib memory allocation APIs
Here we don't check the return value of malloc() which may fail.
Use the g_new() instead, which will abort the program when
there is not enough memory.

Also, use g_strdup instead of strdup and remove the unnecessary
strdup function.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Liming Wang
58ab9e6807 configure: no need to mkdir QMP
commit 7537fe04 QMP: QMP/ -> docs/qmp/

Above commit has moved last QMP files to docs/qmp and it's not necessary
to create QMP directory. So remove it from configure.

Signed-off-by: Liming Wang <liming.wang@canonical.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Peter Crosthwaite
6a07a695b0 dma: axidma: Variablise repeated s->streams[i] sub-expr
This have 6 inline usages. Make it a bit more readable by using a local
variable.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Peter Crosthwaite
f55f885267 microblaze: ml605: Get rid of ddr_base variable
It's a constant based on a macro. Just use the macro in place.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
zhanghailiang
c39a28a43d tests/bios-tables-test: check the value returned by fopen()
The function fopen() may fail, so check its return value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Liu <john.liuli@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
zhanghailiang
d70724cec8 tcg: dump op count into qemu log
fopen() may fail and it does not check its return vaule here,
it is better to dump op count to the normal log file.

Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
zhanghailiang
bc5008a832 util/path: Use the GLib memory allocation routines
In this file, we don't check the return value of malloc/strdup/realloc which may fail.
Instead of using these routines, we use the GLib memory APIs g_malloc/g_strdup/g_realloc.
They will exit on allocation failure, so there is no need to test for failure,
which would be fine for setup.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Peter Maydell
33886ebeec Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 22 Aug 2014 14:47:53 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (29 commits)
  qemu-img: Allow cache mode specification for amend
  qemu-img: Allow source cache mode specification
  vmdk: Use bdrv_nb_sectors() where sectors, not bytes are wanted
  blkdebug: Delete BH in bdrv_aio_cancel
  qemu-iotests: add test case 101 for short file I/O
  raw-posix: fix O_DIRECT short reads
  block/iscsi: fix memory corruption on iscsi resize
  block/vvfat.c: remove debugging code to reinit stderr if NULL
  iotests: Add test for image filename construction
  quorum: Implement bdrv_refresh_filename()
  nbd: Implement bdrv_refresh_filename()
  blkverify: Implement bdrv_refresh_filename()
  blkdebug: Implement bdrv_refresh_filename()
  block: Add bdrv_refresh_filename()
  virtio-blk: fix reference a pointer which might be freed
  virtio-blk: allow block_resize with dataplane
  block: acquire AioContext in qmp_block_resize()
  qemu-iotests: Fix 028 reference output for qed
  test-coroutine: test cost introduced by coroutine
  iotests: Add test for qcow2's cache options
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-22 16:12:51 +01:00
Peter Maydell
43fe62757b Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream: (22 commits)
  linux-user: check return value of malloc()
  linux-user: writev Partial Writes
  linux-user: Support target-to-host translation of mlockall argument
  linux-user: clock_nanosleep errno Handling on PPC
  linux-user: Minimum Sig Handler Stack Size for PPC64 ELF V2
  linux-user: Move get_ppc64_abi
  linux-user: Detect fault in sched_rr_get_interval
  linux-user: Handle NULL sched_param argument to sched_*
  linux-user: Detect Negative Message Sizes in msgsnd System Call
  linux-user: Conditionally Pass Attribute Pointer to mq_open()
  linux-user: Make ipc syscall's third argument an abi_long
  linux-user: Properly Handle semun Structure In Cross-Endian Situations
  linux-user: Dereference Pointer Argument to ipc/semctl Sys Call
  linux-user: PPC64 semid_ds Doesnt Include _unused1 and _unused2
  linux-user: add setns and unshare
  linux-user: support ioprio_{get, set} syscalls
  linux-user: support timerfd_{create, gettime, settime} syscalls
  linux-user: fix readlink handling with magic exe symlink
  linux-user: Fix conversion of sigevent argument to timer_create
  linux-user: Fix syscall instruction usermode emulation on X86_64
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-22 14:39:53 +01:00
Max Reitz
bd39e6ed0b qemu-img: Allow cache mode specification for amend
qemu-img amend may extensively modify the target image, depending on the
options to be amended (e.g. conversion to qcow2 compat level 0.10 from
1.1 for an image with many unallocated zero clusters). Therefore it
makes sense to allow the user to specify the cache mode to be used.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 14:54:48 +02:00
Max Reitz
40055951a7 qemu-img: Allow source cache mode specification
Many qemu-img subcommands only read the source file(s) once. For these
use cases, a full write-back cache is unnecessary and mainly clutters
host cache memory. Though this is generally no concern as cache memory
is freely available and can be scaled by the host OS, it may become a
concern with thin provisioning.

For these cases, it makes sense to allow users to freely specify the
source cache mode (e.g. use no cache at all).

This commit adds a new switch (-T) for the qemu-img subcommands check,
compare, convert and rebase to specify the cache to be used for source
images (the backing file in case of rebase).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 14:54:44 +02:00
zhanghailiang
29e03fcb62 linux-user: check return value of malloc()
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
29560a6cb7 linux-user: writev Partial Writes
Although not technically not required by POSIX, the writev system call will
typically write out its buffers individually.  That is, if the first buffer
is written successfully, but the second buffer pointer is invalid, then
the first chuck will be written and its size is returned.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
6f6a40328b linux-user: Support target-to-host translation of mlockall argument
The argument to the mlockall system call is not necessarily the same on
all platforms and thus may require translation prior to passing to the
host.

For example, PowerPC 64 bit platforms define values for MCL_CURRENT
(0x2000) and MCL_FUTURE (0x4000) which are different from Intel platforms
(0x1 and 0x2, respectively)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
8fbe8fdfbc linux-user: clock_nanosleep errno Handling on PPC
The clock_nanosleep syscall is unusual in that it returns positive
numbers in error handling situations, versus returning -1 and setting
errno, or returning a negative errno value.  On POWER, the kernel will
set the SO bit of CR0 to indicate failure in a syscall.  QEMU has
generic handling to do this for syscalls with standard return values.

Add special case code for clock_nanosleep to handle CR0 properly.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
0903c8be9e linux-user: Minimum Sig Handler Stack Size for PPC64 ELF V2
The ELF V2 ABI for PPC64 defines MINSIGSTKSZ as 4096 bytes whereas it was
2048 previously.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
67d6d829cd linux-user: Move get_ppc64_abi
The get_ppc64_abi is used to determine the ELF ABI (i.e. V1 or V2). This
routine is currently implemented in the linux-user/elfload.c file but
is useful in other scenarios.  Move the routine to a more generally
available location (linux-user/ppc/target_cpu.h).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
d4290c40a4 linux-user: Detect fault in sched_rr_get_interval
Properly detect a fault when attempting to store into an invalid
struct timespec pointer.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
a1d5c5b25d linux-user: Handle NULL sched_param argument to sched_*
The sched_getparam, sched_setparam and sched_setscheduler system
calls take a pointer argument to a sched_param structure.  When
this pointer is null, errno should be set to EINVAL.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
edcc5f9dc3 linux-user: Detect Negative Message Sizes in msgsnd System Call
The msgsnd system call takes an argument that describes the message
size (msgsz) and is of type size_t.  The system call should set
errno to EINVAL in the event that a negative message size is passed.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
b6ce1f6b90 linux-user: Conditionally Pass Attribute Pointer to mq_open()
The mq_open system call takes an optional struct mq_attr pointer
argument in the fourth position.  This pointer is used when O_CREAT
is specified in the flags (second) argument.  It may be NULL, in
which case the queue is created with implementation defined attributes.

Change the code to properly handle the case when NULL is passed in the
arg4 position.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
37ed09560c linux-user: Make ipc syscall's third argument an abi_long
For those target ABIs that use the ipc system call (e.g. POWER),
the third argument is used in the shmat path as a pointer.  It
therefore must be declared as an abi_long (versus int) so that
the address bits are not lost in truncation.  In fact, all arguments
to do_ipc should be declared as abit_long.

In fact, it makes more sense for all of the arguments to be declaried
as abi_long (except call).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
5464baecf5 linux-user: Properly Handle semun Structure In Cross-Endian Situations
The semun union used in the semctl system call contains both an int (val) and
pointers.  In cross-endian situations on 64 bit targets, the value passed to
semctl is an 8 byte (abi_long) value and thus does not have the 4-byte val
field in the correct location.  In order to rectify this, the other half
of the union must be accessed.  This is achieved in code by performing
a byte swap on the entire 8 byte union, followed by a 4-byte swap of the
first half.

Also, eliminate an extraneous (dead) line of code that sets target_su.val in
the IPC_SET/IPC_GET case.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
5d2fa8ebb4 linux-user: Dereference Pointer Argument to ipc/semctl Sys Call
When the ipc system call is used to wrap a semctl system call,
the ptr argument to ipc needs to be dereferenced prior to passing
it to the semctl handler.  This is because the fourth argument to
semctl is a union and not a pointer to a union.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
035273440b linux-user: PPC64 semid_ds Doesnt Include _unused1 and _unused2
The 64 bit PowerPC platforms eliminate the _unused1 and _unused2
elements of the semid_ds structure from <sys/sem.h>.  So eliminate
these from the target_semid_ds structure.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Riku Voipio
9af5c906d1 linux-user: add setns and unshare
Add support for the setns and unshare syscalls, trivially passed through to
the host. Based on patches by Paul Burton, added configure check.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Paul Burton
ab31cda327 linux-user: support ioprio_{get, set} syscalls
Add support for the ioprio_get & ioprio_set syscalls, allowing their
use by target programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Riku Voipio
518343413f linux-user: support timerfd_{create, gettime, settime} syscalls
Adds support for the timerfd_create, timerfd_gettime & timerfd_settime
syscalls, allowing use of timerfds by target programs.

v2: By Riku - added configure check for timerfd and ifdefs
for benefit of old distributions like RHEL5.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Mike Frysinger
f17f4989fa linux-user: fix readlink handling with magic exe symlink
The current code always returns the length of the path when it should
be returning the number of bytes it wrote to the output string.

Further, readlink is not supposed to append a NUL byte, but the current
snprintf logic will always do just that.

Even further, if you pass in a length of 0, you're suppoesd to get back
an error (EINVAL), but the current logic just returns 0.

Further still, if there was an error reading the symlink, we should not
go ahead and try to read the target buffer as it is garbage.

Simple test for the first two issues:
$ cat test.c
int main() {
    char buf[50];
    size_t len;
    for (len = 0; len < 10; ++len) {
        memset(buf, '!', sizeof(buf));
        ssize_t ret = readlink("/proc/self/exe", buf, len);
        buf[20] = '\0';
        printf("readlink(/proc/self/exe, {%s}, %zu) = %zi\n", buf, len, ret);
    }
    return 0;
}

Now compare the output of the native:
$ gcc test.c -o /tmp/x
$ /tmp/x
$ strace /tmp/x

With what qemu does:
$ armv7a-cros-linux-gnueabi-gcc test.c -o /tmp/x -static
$ qemu-arm /tmp/x
$ qemu-arm -strace /tmp/x

Signed-off-by: Mike Frysinger <vapier@chromium.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Peter Maydell
c065976f2b linux-user: Fix conversion of sigevent argument to timer_create
There were a number of bugs in the conversion of the sigevent
argument to timer_create from target to host format:
 * signal number not converted from target to host
 * thread ID not copied across
 * sigev_value not copied across
 * we never unlocked the struct when we were done

Between them, these problems meant that SIGEV_THREAD_ID
timers (and the glibc-implemented SIGEV_THREAD timers which
depend on them) didn't work.

Fix these problems and clean up the code a little by pulling
the struct conversion out into its own function, in line with
how we convert various other structs. This allows the test
program in bug LP:1042388 to run.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Jincheng Miao
47575997be linux-user: Fix syscall instruction usermode emulation on X86_64
Currently syscall instruction is buggy on user mode X86_64,
the EIP is updated after do_syscall(), that is too late for
clone(). Because clone() will create a thread at the env->EIP
(the address of syscall insn), and then child thread enters
do_syscall() again, that is not expected. Sometimes it is tragic.

User mode syscall insn emulation is not used MSR, so the
action should be same to INT 0x80. INT 0x80 will update EIP in
do_interrupt(), ditto for syscall() for consistency.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Riku Voipio
0b2effd744 linux-user: redirect openat calls
While Mikhail fixed /proc/self/maps, it was noticed openat calls are
not redirected currently. Some archs don't have open at all, so
openat needs to be redirected.

Fix this by consolidating open/openat code to do_openat - open
is implemented using openat(AT_FDCWD, ... ), which according
to open(2) man page is identical.

Since all targets now have openat, remove the ifdef around sys_openat
and openat: case in do_syscall.

Cc: Mikhail Ilin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Mikhail Ilyin
d67f4aaae8 linux-user: /proc/self/maps content
Build /proc/self/maps doing a match against guest memory translation table.
Output only that map records which are valid for guest memory layout.

Signed-off-by: Mikhail Ilyin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Markus Armbruster
0a156f7c75 vmdk: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Instead of bdrv_getlength().

Commit 57322b7 did this all over block, but one more bdrv_getlength()
has crept in since.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:10:12 +02:00
Fam Zheng
cbf95a0b11 blkdebug: Delete BH in bdrv_aio_cancel
Otherwise error_callback_bh will access the already released acb.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:07:00 +02:00
Stefan Hajnoczi
8d9eb33ca0 qemu-iotests: add test case 101 for short file I/O
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:01:12 +02:00
Stefan Hajnoczi
61ed73cff4 raw-posix: fix O_DIRECT short reads
The following O_DIRECT read from a <512 byte file fails:

  $ truncate -s 320 test.img
  $ qemu-io -n -c 'read -P 0 0 512' test.img
  qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument

Note that qemu-io completes successfully without the -n (O_DIRECT)
option.

This patch fixes qemu-iotests ./check -nocache -vmdk 059.

Cc: qemu-stable@nongnu.org
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:00:56 +02:00
Peter Lieven
d832fb4d66 block/iscsi: fix memory corruption on iscsi resize
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 10:55:22 +02:00
Peter Maydell
fd3cced366 Merge remote-tracking branch 'remotes/otubo/seccomp' into staging
* remotes/otubo/seccomp:
  seccomp: add semctl() to the syscall whitelist

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-21 12:48:44 +01:00
Michael Tokarev
13b552c2f4 block/vvfat.c: remove debugging code to reinit stderr if NULL
Just log to stderr unconditionally, like other similar code does.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-21 10:36:29 +02:00
Paul Moore
b22876cc2f seccomp: add semctl() to the syscall whitelist
QEMU needs to call semctl() for correct operation.  This particular
problem was identified on shutdown with the following commandline:

 # qemu -sandbox on -monitor stdio \
   -device intel-hda -device hda-duplex -vnc :0

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
2014-08-21 10:29:16 +02:00
Michael S. Tsirkin
288d332202 hostmem: set MPOL_MF_MOVE
When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with mlock=on
realtime feature.

The code comment actually says: "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-20 21:15:56 +02:00
David Hildenbrand
c8e2085d8e kvm: run cpu state synchronization on target vcpu thread
As already done for kvm_cpu_synchronize_state(), let's trigger
kvm_arch_put_registers() via run_on_cpu() for kvm_cpu_synchronize_post_reset()
and kvm_cpu_synchronize_post_init().

This way, we make sure that the register synchronizing ioctls are
called from the proper vcpu thread; this avoids calls to
synchronize_rcu() in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-20 15:21:00 +02:00
Max Reitz
911864c6e5 iotests: Add test for image filename construction
Testing a real in-use protocol such as NBD is hard; testing blkdebug and
blkverify in its stead is easier and tests basically the same
functionality.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:33:42 +02:00
Max Reitz
fafcfe228d quorum: Implement bdrv_refresh_filename()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
2019d68b3b nbd: Implement bdrv_refresh_filename()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
74b36b2eb8 blkverify: Implement bdrv_refresh_filename()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
2c31b04c94 blkdebug: Implement bdrv_refresh_filename()
Because blkdebug cannot simply create a configuration file, simply
refuse to reconstruct a plain filename and only generate an options
QDict from the rules instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
91af701412 block: Add bdrv_refresh_filename()
Some block devices may not have a filename in their BDS; and for some,
there may not even be a normal filename at all. To work around this, add
a function which tries to construct a valid filename for the
BDS.filename field.

If a filename exists or a block driver is able to reconstruct a valid
filename (which is placed in BDS.exact_filename), this can directly be
used.

If no filename can be constructed, we can still construct an options
QDict which is then converted to a JSON object and prefixed with the
"json:" pseudo protocol prefix. The QDict is placed in
BDS.full_open_options.

For most block drivers, this process can be done automatically; those
that need special handling may define a .bdrv_refresh_filename() method
to fill BDS.exact_filename and BDS.full_open_options themselves.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
zhanghailiang
1bdb176ac5 virtio-blk: fix reference a pointer which might be freed
In function virtio_blk_handle_request, it may freed memory pointed by req,
So do not access member of req after calling this function.

Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:57:05 +02:00
Stefan Hajnoczi
466560b9fc virtio-blk: allow block_resize with dataplane
Now that block_resize acquires the AioContext we can safely allow
resizing the disk.

Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:53:52 +02:00
Stefan Hajnoczi
927e0e769f block: acquire AioContext in qmp_block_resize()
Make block_resize safe for dataplane where another thread may be running
the BlockDriverState's AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:53:42 +02:00
Kevin Wolf
6ffb4cb6fd qemu-iotests: Fix 028 reference output for qed
We need to filter out driver-specific options in the "Formatting..."
string printed by qemu when creating the backup image.

Reported-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Peter Wu <peter@lekensteyn.nl>
2014-08-20 11:51:28 +02:00
Ming Lei
61ff8cfbec test-coroutine: test cost introduced by coroutine
This test runs dummy function with coroutine by using
two enter and one yield since which is a common usage.

So we can see the cost introduced by corouting for running
one function, for example:

	Run operation 20000000 iterations 4.841071 s, 4131K operations/s
	242ns per coroutine

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
a1cb48a3bf iotests: Add test for qcow2's cache options
Add a test which tests various combinations of qcow2's cache options
(some of which are valid, some of which are not).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
6c1c8d5d3e qcow2: Add runtime options for cache sizes
Add options for specifying the size of the metadata caches. This can
either be done directly for each cache (if only one is given, the other
will be derived according to a default ratio) or combined for both.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
02004bd4ba qcow2: Use g_try_new0() for cache array
With a variable cache size, the number given to qcow2_cache_create() may
be huge. Therefore, use g_try_new0().

While at it, use g_new0() instead of g_malloc0() for allocating the
Qcow2Cache object.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
440ba08aea qcow2: Constant cache size in bytes
Specifying the metadata cache sizes in clusters results in less clusters
(and much less bytes) covered for small cluster sizes and vice versa.
Using a constant byte size reduces this difference, and makes it
possible to manually specify the cache size in an easily comprehensible
unit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Maria Kustova
18a7d0c56e runner: Kill a program under test by time-out
If a program under test get frozen, the test should finish and report about its
failure.
In such cases the runner waits for 10 minutes until the program ends its
execution. After this time-out the program will be terminated and the test will
be marked as failed.

For current limitation of test image size to 10 MB as a maximum an execution of
each command takes about several seconds in general, so 10 minutes is enough to
discriminate freeze, but not drastically increase an overall test duration.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Maria Kustova
9d256ca616 runner: Add an argument for test duration
After the specified duration the runner stops executing new tests, but it
doesn't interrupt running ones.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
d4df3dbc02 block: Drop some superfluous casts from void *
They clutter the code.  Unfortunately, I can't figure out how to make
Coccinelle drop all of them, so I have to settle for common special
cases:

    @@
    type T;
    T *pt;
    void *pv;
    @@
    - pt = (T *)pv;
    + pt = pv;
    @@
    type T;
    @@
    - (T *)
      (\(g_malloc\|g_malloc0\|g_realloc\|g_new\|g_new0\|g_renew\|
	 g_try_malloc\|g_try_malloc0\|g_try_realloc\|
	 g_try_new\|g_try_new0\|g_try_renew\)(...))

Topped off with minor manual style cleanups.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
08193dd52b qemu-io-cmds: g_renew() can't fail, bury dead error handling
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
02c4f26b15 block: Use g_new() & friends to avoid multiplying sizes
g_new(T, n) is safer than g_malloc(sizeof(*v) * n) for two reasons.
One, it catches multiplication overflowing size_t.  Two, it returns
T * rather than void *, which lets the compiler catch more type
errors.

Perhaps a conversion to g_malloc_n() would be neater in places, but
that's merely four years old, and we can't use such newfangled stuff.

This commit only touches allocations with size arguments of the form
sizeof(T), plus two that use 4 instead of sizeof(uint32_t).  We can
make the others safe by converting to g_malloc_n() when it becomes
available to us in a couple of years.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
5839e53bbc block: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

Patch created with Coccinelle, with two manual changes on top:

* Add const to bdrv_iterate_format() to keep the types straight

* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
  inexplicably misses

Coccinelle semantic patch:

    @@
    type T;
    @@
    -g_malloc(sizeof(T))
    +g_new(T, 1)
    @@
    type T;
    @@
    -g_try_malloc(sizeof(T))
    +g_try_new(T, 1)
    @@
    type T;
    @@
    -g_malloc0(sizeof(T))
    +g_new0(T, 1)
    @@
    type T;
    @@
    -g_try_malloc0(sizeof(T))
    +g_try_new0(T, 1)
    @@
    type T;
    expression n;
    @@
    -g_malloc(sizeof(T) * (n))
    +g_new(T, n)
    @@
    type T;
    expression n;
    @@
    -g_try_malloc(sizeof(T) * (n))
    +g_try_new(T, n)
    @@
    type T;
    expression n;
    @@
    -g_malloc0(sizeof(T) * (n))
    +g_new0(T, n)
    @@
    type T;
    expression n;
    @@
    -g_try_malloc0(sizeof(T) * (n))
    +g_try_new0(T, n)
    @@
    type T;
    expression p, n;
    @@
    -g_realloc(p, sizeof(T) * (n))
    +g_renew(T, p, n)
    @@
    type T;
    expression p, n;
    @@
    -g_try_realloc(p, sizeof(T) * (n))
    +g_try_renew(T, p, n)

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Peter Maydell
2656eb7c59 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140819' into staging
target-arm:
 * fix preferred return address for A64 BRK insn
 * implement AArch64 single-stepping
 * support loading gzip compressed AArch64 kernels
 * use correct PSCI function IDs in the DT when KVM uses PSCI 0.2
 * minor cleanups

# gpg: Signature made Tue 19 Aug 2014 19:04:09 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140819:
  arm: stellaris: Remove misleading address_space_mem var
  arm: armv7m: Rename address_space_mem -> system_memory
  aarch64: Allow -kernel option to take a gzip-compressed kernel.
  loader: Add load_image_gzipped function.
  arm: cortex-a9: Fix cache-line size and associativity
  arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
  target-arm: Rename QEMU PSCI v0.1 definitions
  target-arm: Implement MDSCR_EL1 as having state
  target-arm: Implement ARMv8 single-stepping for AArch32 code
  target-arm: Implement ARMv8 single-step handling for A64 code
  target-arm: A64: Avoid duplicate exit_tb(0) in non-linked goto_tb
  target-arm: Set PSTATE.SS correctly on exception return from AArch64
  target-arm: Correctly handle PSTATE.SS when taking exception to AArch32
  target-arm: Don't allow AArch32 to access RES0 CPSR bits
  target-arm: Adjust debug ID registers per-CPU
  target-arm: Provide both 32 and 64 bit versions of debug registers
  target-arm: Allow STATE_BOTH reginfo descriptions for more than cp14
  target-arm: Collect up the debug cp register definitions
  target-arm: Fix return address for A64 BRK instructions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-20 09:55:42 +01:00
Peter Maydell
302fa28378 Revert "memory: Use canonical path component as the name"
This reverts commit b0225c2c0d
(which breaks building with Xen enabled and also leaks memory).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 20:05:46 +01:00
Peter Crosthwaite
14a906f755 arm: stellaris: Remove misleading address_space_mem var
It's a MemoryRegion and not an AddressSpace. But since it's single use,
just inline the get_system_memory() call to the only usage to remove it.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: d6914047e10b956514cfaa5f391ef56c7d851b34.1408347860.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Peter Crosthwaite
6e9322dea3 arm: armv7m: Rename address_space_mem -> system_memory
This argument is a MemoryRegion and not an AddressSpace.

"Address space" means something quite different to "memory region"
in QEMU parlance so rename the variable to reduce confusion.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: f666cf7f2318d9b461b1e320a45bf0d82da9b7dd.1408347860.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Richard W.M. Jones
6f5d3cbe88 aarch64: Allow -kernel option to take a gzip-compressed kernel.
On aarch64 it is the bootloader's job to uncompress the kernel.  UEFI
and u-boot bootloaders do this automatically when the kernel is
gzip-compressed.

However the qemu -kernel option does not do this.  The following
command does not work:

  qemu-system-aarch64 [...] -kernel /boot/vmlinuz

because it tries to execute the gzip-compressed data.

This commit lets gzip-compressed kernels be uncompressed
transparently.

Currently this is only done when emulating aarch64.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1407831259-2115-3-git-send-email-rjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Richard W.M. Jones
235e74afcb loader: Add load_image_gzipped function.
As the name suggests this lets you load a ROM/disk image that is
gzipped.  It is uncompressed before storing it in guest memory.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1407831259-2115-2-git-send-email-rjones@redhat.com
[PMM: removed stray space before ')']
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Peter Crosthwaite
f7838b5290 arm: cortex-a9: Fix cache-line size and associativity
For A9, The cache associativity is 4 and the lines size is 32B.
Self identify in CCSIDR accordingly. Cache size remains at 16k.

QEMU doesn't emulate caches, but we should still report the correct
cache-line size to the guest. Some guests (like u-boot) complain if
the cache-line size mismatches a requested flush or invalidate
operation.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1de6bd40155a1d2f2e93e24b1b1d1d677a432641.1408346233.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Christoffer Dall
863714ba6c arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.

This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT.  Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.

Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts.  After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.

Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only).  Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels.  Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:25 +01:00
Christoffer Dall
a65c9c17ce target-arm: Rename QEMU PSCI v0.1 definitions
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>.  To build using these defines in non-KVM code,
QEMU defines these IDs locally and check their correctness against the
KVM headers when those are available.

However, the naming scheme used for QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux so to avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prependend and clearly identify the PSCI version as v0.1 in the
defines.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:03 +01:00
Peter Maydell
0e5e8935bb target-arm: Implement MDSCR_EL1 as having state
Now that all the new code to support single-stepping is in
place, wire up the guest-visible MDSCR_EL1, so the guest
can enable single-stepping.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
50225ad0c1 target-arm: Implement ARMv8 single-stepping for AArch32 code
ARMv8 single-stepping requires the exception level that controls
the single-stepping to be in AArch64 execution state, but the
code being stepped may be in AArch64 or AArch32. Implement the
necessary support code for single-stepping AArch32 code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
7ea47fe7be target-arm: Implement ARMv8 single-step handling for A64 code
Implement ARMv8 software single-step handling for A64 code:
correctly update the single-step state machine and generate
debug exceptions when stepping A64 code.

This patch has no behavioural change since MDSCR_EL1.SS can't
be set by the guest yet.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
cc9c1ed14e target-arm: A64: Avoid duplicate exit_tb(0) in non-linked goto_tb
If gen_goto_tb() decides not to link the two TBs, then the
fallback path generates unnecessary code:
 * if singlestep is enabled then we generate unreachable code
   after the gen_exception_internal(EXCP_DEBUG)
 * if singlestep is disabled then we will generate exit_tb(0)
   twice, once in gen_goto_tb() and once coming out of the
   main loop with is_jmp set to DISAS_JUMP

Correct these deficiencies by only emitting exit_tb() in the
non-singlestep case, in which case we can use DISAS_TB_JUMP
to suppress the main-loop exit_tb().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
3a2982038a target-arm: Set PSTATE.SS correctly on exception return from AArch64
Set the PSTATE.SS bit correctly on exception returns from AArch64,
as required by the debug single-step functionality.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
662cefb775 target-arm: Correctly handle PSTATE.SS when taking exception to AArch32
When an exception is taken to AArch32, we must clear the PSTATE.SS
bit for the exception handler, and must also ensure that the SS bit
is not set in the value saved to SPSR_<mode>. Achieve both of these
aims by clearing the bit in uncached_cpsr before saving it to the SPSR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
4051e12c5d target-arm: Don't allow AArch32 to access RES0 CPSR bits
The CPSR has a new-in-v8 execution state bit (IL), and
also some state which has effects in AArch32 but appears
only in the SPSR format (SS) but is RES0 in the CPSR.

Add the IL bit to CPSR_EXEC, and enforce that guest direct
reads and writes to CPSR can't read or write the RES0
bits, so the guest can't get at the SS bit which we store
in uncached_cpsr. This includes not permitting exception
returns to copy reserved bits from an SPSR into CPSR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
48eb3ae64b target-arm: Adjust debug ID registers per-CPU
Allow each CPU type to specify the value for the debug ID
registers, by putting them in the ARMCPU struct, and use
the resulting information to only expose the correct number
of watchpoint and breakpoint registers for the CPU.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
10aae1049f target-arm: Provide both 32 and 64 bit versions of debug registers
Bring the 32 bit and 64 bit views of the debug registers into
line by providing the same set of registers in both cases.
(This still isn't a complete set, but it is consistent.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
58a1d8ceab target-arm: Allow STATE_BOTH reginfo descriptions for more than cp14
Currently the STATE_BOTH shorthand for allowing a single reginfo struct
to define handling for both AArch32 and AArch64 views of a register
only permits this where the AArch32 view is in cp15. It turns out that
the debug registers in cp14 also have neatly lined up encodings;
allow these also to share reginfo structs by permitting a STATE_BOTH
reginfo to specify the .cp field (and continue to default to 15 if
it is not specified).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
503006983a target-arm: Collect up the debug cp register definitions
At the moment we have a mixed set of mostly dummy register
definitions for various debug related registers which have
been added piecemeal in order to get Linux kernels to boot.
In preparation for actually implementing debug support,
bring them all together into one place.

This commit doesn't change behaviour: we still expose
exactly the same registers and behaviour to the guest
in all configurations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
229a138d74 target-arm: Fix return address for A64 BRK instructions
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.

(We do get this correct for the A32/T32 BKPT insns.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
2014-08-19 18:56:24 +01:00
Peter Maydell
0e4a773705 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.

Memory changes for QOMification and automatic tracking of MR lifetime.

# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  mtree: remove write-only field
  memory: Use canonical path component as the name
  memory: Use memory_region_name for name access
  memory: constify memory_region_name
  exec: Abstract away ref to memory region names
  loader: Abstract away ref to memory region names
  tpm_tis: remove instance_finalize callback
  memory: remove memory_region_destroy
  memory: convert memory_region_destroy to object_unparent
  ioport: split deletion and destruction
  nic: do not destroy memory regions in cleanup functions
  vga: do not dynamically allocate chain4_alias
  sysbus: remove unused function sysbus_del_io
  qom: object: move unparenting to the child property's release callback
  qom: object: delete properties before calling instance_finalize
  virtio-scsi: implement parse_cdb
  scsi-block, scsi-generic: implement parse_cdb
  scsi-block: extract scsi_block_is_passthrough
  scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
  scsi-bus: prepare scsi_req_new for introduction of parse_cdb

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 13:00:57 +01:00
Peter Maydell
8e6e2c2ae7 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  monitor: fix use after free
  dump.c: Fix memory leak issue in cleanup processing for dump_init()
  monitor: Remove hardcoded watchdog event names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 10:30:36 +01:00
Michael S. Tsirkin
b3dd1b8c29 monitor: fix use after free
The function monitor_fdset_dup_fd_find_remove() references member of
'mon_fdset' which - when remove flag is set - may be freed in function
monitor_fdset_cleanup().
remove is set by monitor_fdset_dup_fd_remove which in practice
does not need the returned value, so make it void,
and return -1 from monitor_fdset_dup_fd_find_remove.

Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-08-18 14:39:10 -04:00
Chen Gang
2928207ac1 dump.c: Fix memory leak issue in cleanup processing for dump_init()
In dump_init(), when failure occurs, need notice about 'fd' and memory
mapping. So call dump_cleanup() for it (need let all initializations at
front).

Also simplify dump_cleanup(): remove redundant 'ret' and redundant 'fd'
checking.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-08-18 14:39:10 -04:00
Hani Benhabiles
4bb08af34e monitor: Remove hardcoded watchdog event names
Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-08-18 14:39:10 -04:00
Peter Maydell
073fd73e56 Merge remote-tracking branch 'remotes/amit/for-2.2' into staging
* remotes/amit/for-2.2:
  virtio-serial: search for duplicate port names before adding new ports
  virtio-serial: create a linked list of all active devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-18 18:24:38 +01:00
Amit Shah
d0a0bfe672 virtio-serial: search for duplicate port names before adding new ports
Before adding new ports to VirtIOSerial devices, check if there's a
conflict in the 'name' parameter.  This ensures two virtserialports with
identical names are not initialized.

Reported-by: <mazhang@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-08-18 22:42:49 +05:30
Amit Shah
a1857ad1ac virtio-serial: create a linked list of all active devices
To ensure two virtserialports don't get added to the system with the
same 'name' parameter, we need to access all the ports on all the
devices added, and compare the names.

We currently don't have a list of all VirtIOSerial devices added to the
system.  This commit adds a simple linked list in which devices are put
when they're initialized, and removed when they go away.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-08-18 22:42:37 +05:30
Peter Maydell
08ab59770d Merge remote-tracking branch 'remotes/mcayland/qemu-sparc' into staging
* remotes/mcayland/qemu-sparc:
  target-sparc64: implement Short Floating-Point Store Instructions
  apb: add IOMMU flush register implementation
  sun4u: switch second PCI-ebus bridge BAR over to PCI IO space

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-18 12:55:02 +01:00
Peter Maydell
da398fcc25 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 15 Aug 2014 18:04:23 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (55 commits)
  qcow2: fix new_blocks double-free in alloc_refcount_block()
  image-fuzzer: Reduce number of generator functions in __init__
  image-fuzzer: Add generators of L1/L2 tables
  image-fuzzer: Add fuzzing functions for L1/L2 table entries
  docs: Expand the list of supported image elements with L1/L2 tables
  image-fuzzer: Public API for image-fuzzer/runner/runner.py
  image-fuzzer: Generator of fuzzed qcow2 images
  image-fuzzer: Fuzzing functions for qcow2 images
  image-fuzzer: Tool for fuzz tests execution
  docs: Specification for the image fuzzer
  ide: only constrain read/write requests to drive size, not other types
  virtio-blk: Correct bug in support for flexible descriptor layout
  libqos: Change free function called in malloc
  libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
  libqtest: add QTEST_LOG for debugging qtest testcases
  ide: Fix segfault when flushing a device that doesn't exist
  qemu-options: add missing -drive discard option to cmdline help
  parallels: 2TB+ parallels images support
  parallels: split check for parallels format in parallels_open
  parallels: replace tabs with spaces in block/parallels.c
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-18 11:59:27 +01:00
Paolo Bonzini
f54bb15f9d mtree: remove write-only field
ml->printed is never set to true.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
b0225c2c0d memory: Use canonical path component as the name
Rather than having the name as separate state. This prepares support
for creating a MemoryRegion dynamically (i.e. without
memory_region_init() and friends) and the MemoryRegion still getting
a usable name.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
3fb18b4da7 memory: Use memory_region_name for name access
Despite being local to memory.c, use the helper function. This prepares
support for fully QOMifiying the name field of MR (which will remove
this state from MR completely).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
5d546d4b65 memory: constify memory_region_name
It doesn't change the MR and some prospective call sites will have
const MRs at hand.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
83234bf2fa exec: Abstract away ref to memory region names
Use the function provided rather than spying on the struct.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
401cf7fdc4 loader: Abstract away ref to memory region names
Use the function provided rather than spying on the struct.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
c54779f962 tpm_tis: remove instance_finalize callback
It is never used, since ISA device are not hot-unpluggable.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
469b046ead memory: remove memory_region_destroy
The function is empty after the previous patch, so remove it.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
d8d9581460 memory: convert memory_region_destroy to object_unparent
Explicitly call object_unparent in the few places where we
will re-create the memory region.  If the memory region is
simply being destroyed as part of device teardown, let QOM
handle it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:20 +02:00
Paolo Bonzini
e3fb0ade83 ioport: split deletion and destruction
Of the two functions portio_list_del and portio_list_destroy,
the latter is just freeing a memory area.  However, portio_list_del
is the logical equivalent of memory_region_del_subregion so
destruction of memory regions does not belong there.

Actually, neither of these APIs are in use; portio is mostly used by
ISA devices or VGAs, and neither of these is currently hot-unpluggable.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
eed7930950 nic: do not destroy memory regions in cleanup functions
The memory regions should be destroyed in the unrealize function;
since these NICs are not even qdev-ified, they cannot be unplugged
and they do not have to do anything to destroy their memory regions.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
ad37168cbd vga: do not dynamically allocate chain4_alias
Instead, add a boolean variable to indicate the presence of the region.
This avoids a repeated malloc/free (later we can also avoid the
add_child/unparent by changing the offset/size of the alias).

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
1dd79a237e sysbus: remove unused function sysbus_del_io
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
bffc687d66 qom: object: move unparenting to the child property's release callback
This ensures that the unparent callback is called automatically
when the parent object is finalized.

Note that there's no need to keep a reference neither in
object_unparent nor in object_finalize_child_property.  The
reference held by the child property itself will do.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
76a6e1cc7c qom: object: delete properties before calling instance_finalize
This ensures that the children's unparent callback will still
have a usable parent.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Artyom Tarasenko
2a5fade753 target-sparc64: implement Short Floating-Point Store Instructions
Implement Short Floating-Point Store Instructions as described
in the chapter 13.5.2 of UltraSPARC-IIi User's Manual.

Particularly this instructions are used by NetBSD 4.0.1+ /sparc64

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-17 13:24:27 +01:00
Mark Cave-Ayland
b87b0644bc apb: add IOMMU flush register implementation
The IOMMU flush register is a write-only register used to remove entries from the
hardware TLB. Allow guest writes to this register as a no-op, and return a value
of 0 for reads.

This fixes IOMMU DMA operations under NetBSD SPARC64.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-17 13:13:01 +01:00
Mark Cave-Ayland
a1cf8be550 sun4u: switch second PCI-ebus bridge BAR over to PCI IO space
The ebus is the sun4u equivalent of the old ISA bus which is already mapped at
the beginning of PCI IO space within QEMU. NetBSD attempts to find the physical
addresses of devices connected to the ebus by parsing the BARs of the PCI-ebus
bridge and using the base address found by matching both the address space
type and range for a particular ebus address.

Since the second PCI-ebus bridge BAR is already aliased onto IO space, switch
the BAR over to match and reduce the size to 0x1000 which is enough to cover
all the legacy ioport devices whilst leaving the remaining IO space for other
PCI devices. This allows NetBSD SPARC64 to correctly detect and access devices
on the ebus.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-17 13:12:52 +01:00
Peter Maydell
142f4ac5d5 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-15' into staging
trivial patches for 2014-08-15

# gpg: Signature made Fri 15 Aug 2014 16:13:03 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-08-15:
  ivshmem: check the value returned by fstat()
  l2cap: fix access to freed memory
  intc: i8259: Convert Array allocation to g_new0
  ppc: convert g_new(qemu_irq usages to g_new0
  ssi: xilinx_spi: Initialise CS GPIOs as NULL
  vl: free err
  qemu-options.hx: fix typo about l2tpv3
  vmxnet3: don't use 'Yoda conditions'
  vl: don't use 'Yoda conditions'
  spice: don't use 'Yoda conditions'
  don't use 'Yoda conditions'
  isa-bus: don't use 'Yoda conditions'
  audio: don't use 'Yoda conditions'
  usb: don't use 'Yoda conditions'
  CODING_STYLE: Section about conditional statement
  pci-host: update uncorresponding description
  pci-host: update obsolete reference about piix_pci.c
  qemu-options.hx: fix a typo of chardev
  memory: Update obsolete comment about AddrRange field type
  apic: Fix reported DFR content

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 18:44:48 +01:00
Stefan Hajnoczi
39ba3bf69c qcow2: fix new_blocks double-free in alloc_refcount_block()
Commit de82815db1 ("qcow2: Handle failure
for potentially large allocations") introduced a double-free of
new_blocks in the alloc_refcount_block() error path.

The qemu-iotests qcow2 026 test case was failing because qemu-io
segfaulted.

Make sure new_blocks is NULL after we free it the first time.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:26 +01:00
Maria Kustova
94c83a24c1 image-fuzzer: Reduce number of generator functions in __init__
Some issues can be found only when a fuzzed image has a partial structure,
e.g. has L1/L2 tables but no refcount ones. Generation of an entirely
defined image limits these cases. Now the Image constructor creates only
a header and a backing file name (if any), other image elements are generated
in the 'create_image' API.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
38eb193b8b image-fuzzer: Add generators of L1/L2 tables
Entries in L1/L2 entries are based on a portion of random guest clusters.
L2 entries contain offsets to host image clusters filled with random data.
Clusters for L1/L2 tables and guest data are selected randomly.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
eeadd92487 image-fuzzer: Add fuzzing functions for L1/L2 table entries
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
489cb4d7f9 docs: Expand the list of supported image elements with L1/L2 tables
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
071e649194 image-fuzzer: Public API for image-fuzzer/runner/runner.py
__init__.py provides the public API required by the test runner

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
e123232331 image-fuzzer: Generator of fuzzed qcow2 images
The layout submodule of the qcow2 package creates a random valid image,
randomly selects some amount of its fields, fuzzes them and write the fuzzed
image to the file. Fuzzing process can be controlled by an external
configuration.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
6d5e9372f6 image-fuzzer: Fuzzing functions for qcow2 images
The fuzz submodule of the qcow2 image generator contains fuzzing functions for
image fields.
Each fuzzing function contains a list of constraints and a call of a helper
function that randomly selects a fuzzed value satisfied to one of constraints.
For now constraints include only known as invalid or potentially dangerous
values. But after investigation of code coverage by fuzz tests they will be
expanded by heuristic values based on inner checks and flows of a program
under test.

Now fuzzing of a header, header extensions and a backing file name is
supported.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
ad724dd728 image-fuzzer: Tool for fuzz tests execution
The purpose of the test runner is to prepare the test environment (e.g. create
a work directory, a test image, etc), execute a program under test with
parameters, indicate a test failure if the program was killed during the test
execution and collect core dumps, logs and other test artifacts.

The test runner doesn't depend on an image format, so it can be used with any
external image generator.

[Fixed path to qcow2 format module "qcow2" instead of "../qcow2" since
runner.py is no longer in a sub-directory.
--Stefan]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
d6dc10aad8 docs: Specification for the image fuzzer
'Overall fuzzer requirements' chapter contains the current product vision and
features done and to be done. This chapter is still in progress.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Michael Tokarev
d66168ed68 ide: only constrain read/write requests to drive size, not other types
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size.  However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1.  So check the range only for reads and writes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
a83ceea8ff virtio-blk: Correct bug in support for flexible descriptor layout
Without this correction, only a three descriptor layout is accepted, and
requests with just two descriptors are not completed and no error message is
displayed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
220c1a2fad libqos: Change free function called in malloc
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
f75ffc5857 libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
ae74f18782 libqtest: add QTEST_LOG for debugging qtest testcases
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Kevin Wolf
f7f3ff1da0 ide: Fix segfault when flushing a device that doesn't exist
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Peter Lieven
2f7133b2e5 qemu-options: add missing -drive discard option to cmdline help
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
d25d598020 parallels: 2TB+ parallels images support
Parallels has released in the recent updates of Parallels Server 5/6
new addition to his image format. Images with signature WithouFreSpacExt
have offsets in the catalog coded not as offsets in sectors (multiple
of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512)

In this case all 64 bits of header->nb_sectors are used for image size.

This patch implements support of this for qemu-img and also adds specific
check for an incorrect image. Images with block size greater than
INT_MAX/513 are not supported. The biggest available Parallels image
cluster size in the field is 1 Mb. Thus this limit will not hurt
anyone.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
418a7adb77 parallels: split check for parallels format in parallels_open
and rework error path a bit. There is no difference at the moment, but
the code will be definitely shorter when additional processing will
be required for WithouFreSpacExt

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
f08e2f8465 parallels: replace tabs with spaces in block/parallels.c
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
8c27d54fa0 parallels: extend parallels format header with actual data values
Parallels image format has several additional fields inside:
- nb_sectors is actually 64 bit wide. Upper 32bits are not used for
  images with signature "WithoutFreeSpace" and must be explicitly
  zeroed according to Parallels. They will be used for images with
  signature "WithouFreSpacExt"
- inuse is magic which means that the image is currently opened for
  read/write or was not closed correctly, the magic is 0x746f6e59
- data_off is the location of the first data block. It can be zero
  and in this case data starts just beyond the header aligned to
  512 bytes. Though this field does not matter for read-only driver

This patch adds these values to struct parallels_header and adds
proper handling of nb_sectors for currently supported WithoutFreeSpace
images.

WithouFreSpacExt will be covered in next patches.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Jeff Cody <jcody@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Cornelia Huck
2f5f70fa5f dataplane: stop trying on notifier error
If we fail to set up guest or host notifiers, there's no use trying again
every time the guest kicks, so disable dataplane in that case.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Cornelia Huck
f9907ebc4c dataplane: fail notifier setting gracefully
The dataplane code is currently doing a hard exit if it fails to set
up either guest or host notifiers. In practice, this may mean that a
guest suddenly dies after a dataplane device failed to come up (e.g.,
when a file descriptor limit is hit for tne nth device).

Let's just try to unwind the setup instead and return.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Cornelia Huck
267e1a204c dataplane: print why starting failed
Setting up guest or host notifiers may fail, but the user will have
no idea why: Let's print the error returned by the callback.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Gonglei
16b38080e3 channel-posix: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

Using the qemu_set_nonblock() wrapper.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Wangxin <wangxinxin.wang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Gonglei
4ff12bdb1d qemu-char: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

Using the qemu_set_nonblock() wrapper.

Signed-off-by: Wangxin <wangxinxin.wang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
271dddd133 cmd646: synchronise UDMA interrupt status with DMA interrupt status
Make sure that both registers are synchronised when being accessed through
PCI configuration space.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
1d113ef874 cmd646: allow MRDMODE interrupt status bits clearing from PCI config space
Make sure that we also update the normal DMA interrupt status bits at the
same time, and alter the IRQ if being cleared accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
dab91a1e13 cmd646: switch cmd646_update_irq() to accept PCIDevice instead of PCIIDEState
This is in preparation for adding configuration space accessors which accept
PCIDevice as a parameter.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
5bbc0a703d cmd646: synchronise DMA interrupt status with UDMA interrupt status
Make sure that the standard DMA interrupt status bits reflect any changes made
to the UDMA interrupt status bits. The CMD646U2 datasheet claims that these
bits are equivalent, and they must be synchronised for guests that manipulate
both registers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
58f16a7b47 cmd646: add constants for CNTRL register access
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
e42de189e8 qtest/ide: Fix small memory leak
For libqos debugging purposes, it's nice to
be able to assert that tests and associated libraries
have no memory leaks. To that end, free up the
trivial cmdline leak.

The remaining leaks caused by pc_alloc_init are fixed
instead by my first-fit pc_alloc implementation already
on the qemu-devel mailing list.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
6ce7100e7f libqos: allow qpci_iomap to return BAR mapping size
This patch allows qpci_iomap to return the size of the
BAR mapping that it created, to allow driver applications
(e.g, ahci-test) to make determinations about the suitability
or the mapping size, or in the specific case of AHCI, how
many ports are supported by the HBA.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
7f2a5ae6c1 libqos: Fixes a small memory leak.
Allow users the chance to clean up the QPCIBusPC structure
by adding a small cleanup routine. Helps clear up small
memory leaks during setup/teardown, to allow for cleaner
debug output messages.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
a7afc6b8c1 libqtest: Correct small memory leak.
Fixes a small memory leak inside of libqtest.
After we produce a test path and glib copies the string
for itself, we should clean up our temporary copy.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
f3cdcbaee1 libqos: Correct memory leak
Fix a small memory leak inside of libqos, in the pc_alloc_init routine.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
John Snow
86298845e1 qtest: Adding qtest_memset and qmemset.
Currently, libqtest allows for memread and memwrite, but
does not offer a simple way to zero out regions of memory.
This patch adds a simple function to do so.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
John Snow
552b48f44d q35: Enable the ioapic device to be seen by qtest.
Currently, the ioapic device can not be found in a qtest environment
when requesting "irq_interrupt_in ioapic" via the qtest socket.

By mirroring how the ioapic is added in i44ofx (hw/i440/pc_piix.c),
as a child of "q35," the device is able to be seen by qtest.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
088415202b ahci: construct PIO Setup FIS for PIO commands
PIO commands should put a PIO Setup FIS in the receive area when data
transfer ends.  Currently QEMU does not do this and only places the
D2H FIS at the end of the operation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
c7e73adb48 ide: make all commands go through cmd_done
AHCI has code to fill in the D2H FIS trigger the IRQ all over the place.
Centralize this in a single cmd_done callback by generalizing the existing
async_cmd_done callback.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
08ee9e3368 ide: stop PIO transfer on errors
This will provide a hook for sending the result of the command via the
FIS receive area.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
1f88f77348 ahci: remove duplicate PORT_IRQ_* constants
These are defined twice, just use one set consistently.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
fd648f10af ide: move retry constants out of BM_STATUS_* namespace
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
7e2648df86 ide: move BM_STATUS bits to pci.[ch]
They are not used by AHCI, and should not be even available there.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
0e7ce54cf5 ide: fold add_status callback into set_inactive
It is now called only after the set_inactive callback.  Put the two together.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
0def37baf9 ide: remove wrong setting of BM_STATUS_INT
Similar to the case removed in commit 69c38b8 (ide/core: Remove explicit
setting of BM_STATUS_INT, 2011-05-19), the only remaining use of
add_status(..., BM_STATUS_INT) is for short PRDs.  The flag should
not be raised in this case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
4855b57639 ide: wrap start_dma callback
Make it optional and prepare for the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
446351236b ide: simplify start_transfer callbacks
Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
c039cb1e5a ide: simplify async_cmd_done callbacks
Drop the unused return value.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
829b933b70 ide: simplify set_inactive callbacks
Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
1374bec063 ide: simplify reset callbacks
Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
69f72a2221 ide: stash aiocb for flushes
This ensures that operations are completed after a reset

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
14a92e5fe1 ide-test: add test for werror=stop
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
7c282cb4c5 libqtest: add QTEST_LOG for debugging qtest testcases
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:11 +01:00
Paolo Bonzini
9e52c53b8c blkdebug: report errors on flush too
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:11 +01:00
Peter Maydell
f2c85a2f36 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
post-2.1 bugfixes

A bunch of fixes that missed 2.1 by a small margin.
If we do 2.1.1, some of these would be good candidates,
added Cc qemu-stable as appropriate.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 14 Aug 2014 17:07:25 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pc: Get rid of pci-info leftovers
  e1000: use symbolic constants to init phy ctrl & status registers
  e1000: correctly handle phy_ctrl reserved & self-clearing bits
  ivshmem: fix building when debug mode is enabled
  acpi: align RSDP
  numa: show hex number in error message for consistency and prefix them with 0x
  pc-dimm: fix up error message
  pc-dimm: validate node property
  hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
  hw/audio/intel-hda: Fix MSI capability address
  pc: Create 2.2 machine type
  pci: Use bus master address space for delivering MSI/MSI-X messages

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 17:43:51 +01:00
Peter Maydell
5c6b3c50cc Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Tracing pull request

* remotes/stefanha/tags/tracing-pull-request:
  virtio-rng: add some trace events
  trace: add some tcg tracing support
  trace: teach lttng backend to use format strings
  trace: [tcg] Include TCG-tracing header on all targets
  trace: [tcg] Include event definitions in "trace.h"
  trace: [tcg] Generate TCG tracing routines
  trace: [tcg] Include TCG-tracing helpers
  trace: [tcg] Define TCG tracing helper routine wrappers
  trace: [tcg] Define TCG tracing helper routines
  trace: [tcg] Declare TCG tracing helper routines
  trace: [tcg] Add 'tcg' event property
  trace: [tcg] Argument type transformation machinery
  trace: [tcg] Argument type transformation rules
  trace: [tcg] Add documentation
  trace: install simpletrace SystemTap tapset
  simpletrace: add simpletrace.py --no-header option
  trace: add tracetool simpletrace_stap format
  trace: extract stap_escape() function for reuse

Conflicts:
	Makefile.objs
2014-08-15 16:37:17 +01:00
zhanghailiang
5edbdbcdf8 ivshmem: check the value returned by fstat()
The function fstat() may fail, so check its return value.

Acked-by: Levente Kurusa <lkurusa@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 19:12:58 +04:00
zhanghailiang
2c145d7a73 l2cap: fix access to freed memory
Pointer 'ch' will be used in function 'l2cap_channel_open_req_msg' after
it was previously freed in 'l2cap_channel_open'.
Assigned it to NULL after it is freed.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 19:12:48 +04:00
Peter Crosthwaite
8945c7f754 intc: i8259: Convert Array allocation to g_new0
To be more array friendly and to indicate the IRQs are initially
disconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:55 +04:00
Peter Crosthwaite
aa2ac1dac3 ppc: convert g_new(qemu_irq usages to g_new0
To indicate the IRQs are initially disconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:50 +04:00
Peter Crosthwaite
c75f3c041a ssi: xilinx_spi: Initialise CS GPIOs as NULL
To properly indicate they are unconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:40 +04:00
Hu Tao
3a9cbfe009 vl: free err
err is not freed after use, thus causing memory leak. This patch fixes
it.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Cc: qemu-trivial@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
3952651a75 qemu-options.hx: fix typo about l2tpv3
two duplicate destport description.

s/destport/srcport/, s/destination/source/

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
f7472ca405 vmxnet3: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
28de2f883c vl: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
fe8e8327f1 spice: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
8108fd3e26 don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
337a3e5c7d isa-bus: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
2ab5bf67b7 audio: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
d0657b2aab usb: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
2bb0020cf9 CODING_STYLE: Section about conditional statement
Yoda conditions lack readability, and QEMU has a
strict compiler configuration for checking a common
mistake like "if (dev = NULL)". Make it a written rule.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
30dc600bbf pci-host: update uncorresponding description
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
ef9f7b587d pci-host: update obsolete reference about piix_pci.c
piix_pci.c has been renamed into piix.c at commit
c0907c9e64

update the obsolete reference.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Liming Wang
38a24c8b74 qemu-options.hx: fix a typo of chardev
Change host to port.

Signed-off-by: Liming Wang <liming.wang@canonical.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Fam Zheng
c9cdaa3ab9 memory: Update obsolete comment about AddrRange field type
We are not 64 bit any more since

08dafab4 memory: use 128-bit integers for sizes and intermediates

but the comment is forgotten to be updated.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Jan Kiszka
d6c140a771 apic: Fix reported DFR content
IA-32 SDM, Figure 10-14: Bits 27:0 are reserved as 1.

Fixes Jailhouse hypervisor start with in-kernel irqchips off.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Peter Maydell
f2fb1da941 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 15 Aug 2014 14:07:42 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (59 commits)
  block: Catch !bs->drv in bdrv_check()
  iotests: Add test for image header overlap
  qcow2: Catch !*host_offset for data allocation
  qcow2: Return useful error code in refcount_init()
  mirror: Handle failure for potentially large allocations
  vpc: Handle failure for potentially large allocations
  vmdk: Handle failure for potentially large allocations
  vhdx: Handle failure for potentially large allocations
  vdi: Handle failure for potentially large allocations
  rbd: Handle failure for potentially large allocations
  raw-win32: Handle failure for potentially large allocations
  raw-posix: Handle failure for potentially large allocations
  qed: Handle failure for potentially large allocations
  qcow2: Handle failure for potentially large allocations
  qcow1: Handle failure for potentially large allocations
  parallels: Handle failure for potentially large allocations
  nfs: Handle failure for potentially large allocations
  iscsi: Handle failure for potentially large allocations
  dmg: Handle failure for potentially large allocations
  curl: Handle failure for potentially large allocations
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 14:49:50 +01:00
Max Reitz
908bcd540f block: Catch !bs->drv in bdrv_check()
qemu-img check calls bdrv_check() twice if the first run repaired some
inconsistencies. If the first run however again triggered corruption
prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:16 +02:00
Max Reitz
a42f8a3d05 iotests: Add test for image header overlap
Add a test for an image with an unallocated image header; instead of an
assertion, this should result in the image being marked corrupt.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:16 +02:00
Max Reitz
ff52aab2df qcow2: Catch !*host_offset for data allocation
qcow2_alloc_cluster_offset() uses host_offset == 0 as "no preferred
offset" for the (data) cluster range to be allocated. However, this
offset is actually valid and may be allocated on images with a corrupted
refcount table or first refcount block.

In this case, the corruption prevention should normally catch that
write anyway (because it would overwrite the image header). But since 0
is a special value here, the function assumes that nothing has been
allocated at all which it asserts against.

Because this condition is not qemu's fault but rather that of a broken
image, it shouldn't throw an assertion but rather mark the image corrupt
and show an appropriate message, which this patch does by calling the
corruption check earlier than it would be called normally (before the
assertion).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:16 +02:00
Max Reitz
8fcffa9853 qcow2: Return useful error code in refcount_init()
If bdrv_pread() returns an error, it is very unlikely that it was
ENOMEM. In this case, the return value should be passed along; as
bdrv_pread() will always either return the number of bytes read or a
negative value (the error code), the condition for checking whether
bdrv_pread() failed can be simplified (and clarified) as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
7504edf477 mirror: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the mirror block job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
5fb09cd586 vpc: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vpc block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
d6e5993197 vmdk: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vmdk block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
a67e128a4f vhdx: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vhdx block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
17cce73578 vdi: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vdi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
0f7a02379b rbd: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the rbd block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:16 +02:00
Kevin Wolf
4b6af3d58a raw-win32: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the raw-win32 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:16 +02:00
Kevin Wolf
50d4a858e6 raw-posix: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the raw-posix block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
4f4896db5f qed: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qed block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
de82815db1 qcow2: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow2 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
0df93305f2 qcow1: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow1 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
f7b593d937 parallels: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the parallels block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
2347dd7b68 nfs: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the nfs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
4d5a3f888c iscsi: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the iscsi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
b546a94474 dmg: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the dmg block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
8dc7a7725b curl: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the curl block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
4ae7a52e43 cloop: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the cloop block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
7bf665ee35 bochs: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the bochs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
857d4f46c3 block: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses bounce buffer allocations in block.c. While at it,
convert bdrv_commit() from plain g_malloc() to qemu_try_blockalign().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
7d2a35cc92 block: Introduce qemu_try_blockalign()
This function returns NULL instead of aborting when an allocation fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Jeff Cody
23d20b5b4f block: iotest - update 084 to test static VDI image creation
This updates the VDI corruption test to also test static VDI image
creation, as well as the default dynamic image creation.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody
fef6070eff block: vpc - use block layer ops in vpc_create, instead of posix calls
Use the block layer to create, and write to, the image file in the VPC
.bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody
dddc7750d6 block: use the standard 'ret' instead of 'result'
Most QEMU code uses 'ret' for function return values. The VDI driver
uses a mix of 'result' and 'ret'.  This cleans that up, switching over
to the standard 'ret' usage.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody
70747862f1 block: vdi - use block layer ops in vdi_create, instead of posix calls
Use the block layer to create, and write to, the image file in the
VDI .bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Also some minor cleanup for error handling.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody
9a4d5ca607 block: allow bdrv_unref() to be passed NULL pointers
If bdrv_unref() is passed a NULL BDS pointer, it is safe to
exit with no operation.  This will allow cleanup code to blindly
call bdrv_unref() on a BDS that has been initialized to NULL.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Paolo Bonzini
58803ce74f test-coroutine: add baseline test that times the cost of function calls
This can be used to compute the cost of coroutine operations.  In the
end the cost of the function call is a few clock cycles, so it's pretty
cheap for now, but it may become more relevant as the coroutine code
is optimized.

For example, here are the results on my machine:

   Function call 100000000 iterations: 0.173884 s
   Yield 100000000 iterations: 8.445064 s
   Lifecycle 1000000 iterations: 0.098445 s
   Nesting 10000 iterations of 1000 depth each: 7.406431 s

One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
one coroutine allocation takes (roughly, since some of the allocations
in the nesting test do hit the pool) 739 nanoseconds:

   (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
   (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
   (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody
4f75b52a07 block: VHDX endian fixes
This patch contains several changes for endian conversion fixes for
VHDX, particularly for big-endian machines (multibyte values in VHDX are
all on disk in LE format).

Tests were done with existing qemu-iotests on an IBM POWER7 (8406-71Y).
This includes sample images created by Hyper-V, both with dirty logs and
without.

In addition, VHDX image files created (and written to) on a BE machine
were tested on a LE machine, and vice-versa.

Reported-by: Markus Armburster <armbru@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody
349592e0b9 block: vhdx - add error check
This add an error check for an invalid descriptor entry signature,
when flushing the log descriptor entries.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
3c80ca158c thread-pool: avoid deadlock in nested aio_poll() calls
The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:

  If element A's callback waits for element B using aio_poll() it will
  deadlock since pool->completion_bh is not marked scheduled when the
  nested aio_poll() runs.

Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing.  This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
c2e50e3d11 thread-pool: avoid per-thread-pool EventNotifier
EventNotifier is implemented using an eventfd or pipe.  It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.

Switch from EventNotifier to QEMUBH in thread-pool.c.  Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
2a87151fb2 block: bump coroutine pool size for drives
When a BlockDriverState is associated with a storage controller
DeviceState we expect guest I/O.  Use this opportunity to bump the
coroutine pool size by 64.

This patch ensures that the coroutine pool size scales with the number
of drives attached to the guest.  It should increase coroutine pool
usage (which makes qemu_coroutine_create() fast) without hogging too
much memory when fewer drives are attached.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
ac2662a913 coroutine: make pool size dynamic
Allow coroutine users to adjust the pool size.  For example, if the
guest has multiple emulated disk drives we should keep around more
coroutines.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
746ebfa77a qemu-iotests: add support for Archipelago protocol
Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
b1de5f439d QMP: Add support for Archipelago
Introduce new enum BlockdevOptionsArchipelago.

@volume:              #Name of the Archipelago volume image

@mport:               #'mport' is the port number on which mapperd is
                      listening. This is optional and if not specified,
                      QEMU will make Archipelago to use the default port.

@vport:               #'vport' is the port number on which vlmcd is
                      listening. This is optional and if not specified,
                      QEMU will make Archipelago to use the default port.

@segment:             #optional The name of the shared memory segment
                      Archipelago stack is using. This is optional
                      and if not specified, QEMU will make Archipelago
                      use the default value, 'archipelago'.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
76d3d83a37 block/archipelago: Add support for creating images
qemu-img archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>]
 [:segment=<segment_name>]] [size]

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
70537a8506 block/archipelago: Implement bdrv_parse_filename()
VM Image on Archipelago volume can also be specified like this:

file=archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>][:
segment=<segment_name>]]

Examples:

file=archipelago:my_vm_volume
file=archipelago:my_vm_volume/mport=123
file=archipelago:my_vm_volume/mport=123:vport=1234
file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
c9a12e751b block: Support Archipelago as a QEMU block backend
VM Image on Archipelago volume is specified like this:

file.driver=archipelago,file.volume=<volumename>[,file.mport=<mapperd_port>[,
file.vport=<vlmcd_port>][,file.segment=<segment_name>]]

'archipelago' is the protocol.

'mport' is the port number on which mapperd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'vport' is the port number on which vlmcd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'segment' is the name of the shared memory segment Archipelago stack is using.
This is optional and if not specified, QEMU will make Archipelago to use the
default value, 'archipelago'.

Examples:

file.driver=archipelago,file.volume=my_vm_volume
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234,file.segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Chunyan Liu
000c4dfff4 qemu-img info: show nocow info
Add nocow info in 'qemu-img info' output to show whether the file
currently has NOCOW flag set or not.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Fam Zheng
c6ac36e145 vmdk: Optimize cluster allocation
This drops the unnecessary bdrv_truncate() from, and also improves,
cluster allocation code path.

Before, when we need a new cluster, get_cluster_offset truncates the
image to bdrv_getlength() + cluster_size, and returns the offset of
added area, i.e. the image length before truncating.

This is not efficient, so it's now rewritten as:

  - Save the extent file length when opening.

  - When allocating cluster, use the saved length as cluster offset.

  - Don't truncate image, because we'll anyway write data there: just
    write any data at the EOF position, in descending priority:

    * New user data (cluster allocation happens in a write request).

    * Filling data in the beginning and/or ending of the new cluster, if
      not covered by user data: either backing file content (COW), or
      zero for standalone images.

One major benifit of this change is, on host mounted NFS images, even
over a fast network, ftruncate is slow (see the example below). This
change significantly speeds up cluster allocation. Comparing by
converting a cirros image (296M) to VMDK on an NFS mount point, over
1Gbe LAN:

    $ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk

    Before:
        real    0m21.796s
        user    0m0.130s
        sys     0m0.483s

    After:
        real    0m2.017s
        user    0m0.047s
        sys     0m0.190s

We also get rid of unchecked bdrv_getlength() and bdrv_truncate(), and
get a little more documentation in function comments.

Tested that this passes qemu-iotests for all VMDK subformats.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Fam Zheng
a8d8a1a06c qemu-iotests: Add data pattern in version3 VMDK sample image in 059
It's possible that we diverge from the specification with our
implementation.  Having a reference image in the test cases may detect
such problems when we introduce a bug that can read what it creates, but
can't handle a real VMDK.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Stefan Hajnoczi
ef523587da qdev-monitor: include QOM properties in -device FOO, help output
Update -device FOO,help to include QOM properties in addition to qdev
properties.  Devices are gradually adding more QOM properties that are
not reflected as qdev properties.

It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.

This patch reuses the device-list-properties QMP machinery to avoid code
duplication.

Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
2014-08-15 15:07:13 +02:00
Stefan Hajnoczi
4115dd6527 qmp: hide "hotplugged" device property from device-list-properties
The "hotplugged" device property was not reported before commit
f4eb32b590 ("qmp: show QOM properties in
device-list-properties").  Fix this difference.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:13 +02:00
Stefan Hajnoczi
ef558696b5 docs/multiple-iothreads.txt: add documentation on IOThread programming
This document explains how IOThreads and the main loop are related,
especially how to write code that can run in an IOThread.  Currently
only virtio-blk-data-plane uses these techniques.  The next obvious
target is virtio-scsi; there has also been work on virtio-net.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:13 +02:00
Gonglei (Arei)
8cced12143 xen_disk: fix possible null-ptr dereference
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Hu Tao
8efc936336 configure: explicitly state version requirements to devel packages
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Maria Kustova
8e436ec1f3 docs: Make the recommendation for the backing file name position a requirement
The current version of the qcow2 specification recommends to save the backing
file name in the end of the first cluster. It follows that the backing file
name can be saved somewhere in the image, but the first cluster, which
contradicts the current QEMU implementation.

The patch makes the backing file name required to be placed after the header
extensions in the first image cluster.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
52bf1e722d block: Avoid bdrv_get_geometry() where errors should be detected
bdrv_get_geometry() hides errors.  Use bdrv_nb_sectors() or
bdrv_getlength() instead where that's obviously inappropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
d739f1c410 qemu-img: Make img_convert() get image size just once per image
Chiefly so I don't have to do the error checking in quadruplicate in
the next commit.  Moreover, replacing the frequently updated
bs_sectors by an array assigned just once makes the code easier to
understand.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
75d3d21f9e block: Drop superfluous aligning of bdrv_getlength()'s value
It returns a multiple of the sector size.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
57322b7811 block: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Instead of bdrv_getlength().

Aside: a few of these callers don't handle errors.  I didn't
investigate whether they should.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
43716fa805 block: Use bdrv_nb_sectors() in img_convert()
Instead of bdrv_getlength().  Replace variable output_length by
output_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
30a7f2fc91 block: Use bdrv_nb_sectors() in bdrv_co_get_block_status()
Instead of bdrv_getlength().

Replace variables length, length2 by total_sectors, nb_sectors2.
Bonus: use total_sectors instead of the slightly unclean
bs->total_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
4049082c4b block: Use bdrv_nb_sectors() in bdrv_aligned_preadv()
Instead of bdrv_getlength().  Eliminate variable len.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
d32f7c101b block: Use bdrv_nb_sectors() in bdrv_make_zero()
Instead of bdrv_getlength().

Variable target_size is initially in bytes, then changes meaning to
sectors.  Ugh.  Replace by target_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
65a9bb25d6 block: New bdrv_nb_sectors()
A call to retrieve the image size converts between bytes and sectors
several times:

* BlockDriver method bdrv_getlength() returns bytes.

* refresh_total_sectors() converts to sectors, rounding up, and stores
  in total_sectors.

* bdrv_getlength() converts total_sectors back to bytes (now rounded
  up to a multiple of the sector size).

* Callers wanting sectors rather bytes convert it right back.
  Example: bdrv_get_geometry().

bdrv_nb_sectors() provides a way to omit the last two conversions.
It's exactly bdrv_getlength() with the conversion to bytes omitted.
It's functionally like bdrv_get_geometry() without its odd error
handling.

Reimplement bdrv_getlength() and bdrv_get_geometry() on top of
bdrv_nb_sectors().

The next patches will convert some users of bdrv_getlength() to
bdrv_nb_sectors().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:12 +02:00
Peter Maydell
f083201667 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-09' into staging
trivial patches for 2014-08-09

# gpg: Signature made Fri 08 Aug 2014 21:36:44 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-08-09:
  build-sys: Move qapi-{types, visit, event}.o into util-obj-y
  po: Add Chinese translation
  qemu-img: Check getchar() return value in read_password() for WIN32
  hw/timer: Move extern declaration from .c to .h file
  virtio: Move extern declaration to header file
  Show length mismatch error is hex
  target-i386/cpu.c: Fix two error output indentation
  l2tpv3 (configure): it is linux-specific
  hw/timer/imx_*: fix TIMER_MAX clash with system symbol

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 13:41:55 +01:00
Markus Armbruster
260cb1c409 pc: Get rid of pci-info leftovers
pc_fw_cfg_guest_info() never does anything, because has_pci_info is
always false.

Introduced in commit f8c457b "pc: pass PCI hole ranges to Guests",
disabled in commit 9604f70 "pc: disable pci-info for 1.6", and hasn't
been enabled since.  Obviously a dead end.  Get of it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Gabriel L. Somlo
9616c29045 e1000: use symbolic constants to init phy ctrl & status registers
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Gabriel L. Somlo
1195fed9e6 e1000: correctly handle phy_ctrl reserved & self-clearing bits
Make phyreg_writeops responsible for actually writing their
respective phy registers, rather than rely on set_mdic() to
do it on their behalf.

The only current instance of phyreg_writeops is set_phy_ctrl();
modify it to write the register on its own, while also correctly
handling reserved and self-clearing bits.

have_autoneg() does not need to check for MII_CR_RESTART_AUTO_NEG,
since the only time the flag comes into play is during set_phy_ctrl(),
and, following this patch, never actually gets written to the phy
control register.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Levente Kurusa
7f9efb6b80 ivshmem: fix building when debug mode is enabled
ivsmem_offset was removed, however this debug statement was not updated.
Modify the statement to fit the new mechanic.

Signed-off-by: Levente Kurusa <lkurusa@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Michael S. Tsirkin
d67aadccfa acpi: align RSDP
RSDP should be aligned at a 16-byte boundary.
This would by chance at the moment, fix up acpi build
to make it robust.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-08-14 13:22:16 +02:00
Hu Tao
c68233aee8 numa: show hex number in error message for consistency and prefix them with 0x
The error messages before and after patch are:

before:
qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)

after:
qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:07 +02:00
Michael S. Tsirkin
988eba0f68 pc-dimm: fix up error message
- int should be printed using %d
- print actual wrong value for property

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:00 +02:00
Hu Tao
cfe0ffd027 pc-dimm: validate node property
If user specifies a node number that exceeds the available numa nodes in
emulated system for pc-dimm device, the device will report an invalid _PXM
to OSPM. Fix this by checking the node property value.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:59 +02:00
Hu Tao
41d2f71376 hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:49 +02:00
Jan Kiszka
d209c7440a hw/audio/intel-hda: Fix MSI capability address
According to ICH9 spec, the MSI capability is located at 0x60. This is
important for guest drivers that do not parse the capability chain and
use absolute addresses instead.

CC: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:49 +02:00
Jan Kiszka
f9f218730c pc: Create 2.2 machine type
Yet identical to 2.1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:49 +02:00
Jan Kiszka
cc943c36fa pci: Use bus master address space for delivering MSI/MSI-X messages
The spec says (and real HW confirms this) that, if the bus master bit
is 0, the device will not generate any PCI accesses. MSI and MSI-X
messages fall among these, so we should use the corresponding address
space to deliver them. This will prevent delivery if bus master support
is disabled.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:33 +02:00
Amit Shah
4ac4458076 virtio-rng: add some trace events
Add some trace events to virtio-rng for easier debugging

Signed-off-by: Amit Shah <amit.shah@redhat.com>

Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:29:55 +01:00
Alex Bennée
6db8b53866 trace: add some tcg tracing support
This adds a couple of tcg specific trace-events which are useful for
tracing execution though tcg generated blocks. It's been tested with
lttng user space tracing but is generic enough for all systems. The tcg
events are:

  * translate_block - when a subject block is translated
  * exec_tb - when a translated block is entered
  * exec_tb_exit - when we exit the translated code
  * exec_tb_nocache - special case translations

Of course we can only trace the entrance to the first block of a chain
as each block will jump directly to the next when it can. See the -d
nochain patch to allow more complete tracing at the expense of
performance.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Alex Bennée
41ef7b00ab trace: teach lttng backend to use format strings
This makes the UST backend pay attention to the format string arguments
that are defined when defining payload data. With this you can now
ensure integers are reported in hex mode if you want.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
a7e30d84ce trace: [tcg] Include TCG-tracing header on all targets
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
85d8bf2f36 trace: [tcg] Include event definitions in "trace.h"
Otherwise the user has to explicitly include an auto-generated header.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
465830fbd9 trace: [tcg] Generate TCG tracing routines
Generate header "trace/generated-tcg-tracers.h" with the necessary routines for
tracing events in guest code:

* trace_${event}_tcg

  Convenience wrapper that calls the translation-time tracer
  'trace_${event}_trans', and calls 'gen_helper_trace_${event}_exec to
  generate the TCG code to later trace the event at execution time.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
76b53aa324 trace: [tcg] Include TCG-tracing helpers
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
f4654226d4 trace: [tcg] Define TCG tracing helper routine wrappers
Generates header "trace/generated-helpers-wrappers.h" with definitions for TCG
helper wrappers.

These wrappers ('gen_helper_trace_${event}_exec_wrapper') transform mixed native
and TCG argument types to TCG types and call the actual TCG helpers
('gen_helper_trace_${event}_exec_proxy').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
341ea69185 trace: [tcg] Define TCG tracing helper routines
Generates file "trace/generated-helpers.c" with TCG helper definitions to trace
events in guest code at execution time.

The helpers ('helper_trace_${event}_exec_proxy') cast the TCG-compatible native
argument types to their original types (as defined in "trace-events") and call
the tracing routine ('trace_${event}_exec').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
707c8a98e4 trace: [tcg] Declare TCG tracing helper routines
Generates file "trace/generated-helpers.h" with TCG helper declarations to trace
events in guest code at execution time ('trace_${event}_exec_proxy').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
b2b36c22bd trace: [tcg] Add 'tcg' event property
Transforms event:

  tcg name(...) "...", "..."

into two internal events:

  tcg-trans name_trans(...) "..."
  tcg-exec name_exec(...) "..."

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Lluís Vilanova
b55835ac10 trace: [tcg] Argument type transformation machinery
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Lluís Vilanova
e6d6c4bebf trace: [tcg] Argument type transformation rules
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Lluís Vilanova
0bb403b0ae trace: [tcg] Add documentation
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
e0b2fd0efb trace: install simpletrace SystemTap tapset
The simpletrace SystemTap tapset outputs simpletrace binary traces for
SystemTap probes.  This is useful because SystemTap has no default way
to format or store traces.  The simpletrace SystemTap tapset provides an
easy way to store traces.

The simpletrace.py tool or custom Python scripts using the
simpletrace.py API can analyze SystemTap these traces:

  $ ./configure --enable-trace-backends=dtrace ...
  $ make && make install
  $ stap -e 'probe qemu.system.x86_64.simpletrace.* {}' \
         -c qemu-system-x86_64 >/tmp/trace.out
  $ scripts/simpletrace.py --no-header trace-events /tmp/trace.out
  g_malloc 4.531 pid=15519 size=0xb ptr=0x7f8639c10470
  g_malloc 3.264 pid=15519 size=0x300 ptr=0x7f8639c10490
  g_free 5.155 pid=15519 ptr=0x7f8639c0f7b0

Note that, unlike qemu-system-x86_64.stp and
qemu-system-x86_64.stp-installed, only one file is needed since the
simpletrace SystemTap tapset does not reference the QEMU binary by path.
Therefore it doesn't matter whether the QEMU binary is installed or not.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
15327c3df0 simpletrace: add simpletrace.py --no-header option
It can be useful to read simpletrace files that have no header.  For
example, a ring buffer may not have a header record but can still be
processed if the user is sure the file format version is compatible.

  $ scripts/simpletrace.py --no-header trace-events trace-file

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
3f8b112d6b trace: add tracetool simpletrace_stap format
This new tracetool "format" generates a SystemTap .stp file that outputs
simpletrace binary trace data.

In contrast to simpletrace or ftrace, SystemTap does not define its own
trace format.  All output from SystemTap is generated by .stp files.
This patch lets us generate a .stp file that outputs in the simpletrace
binary format.

This makes it possible to reuse simpletrace.py to analyze traces
recorded using SystemTap.  The simpletrace binary format is especially
useful for long-running traces like flight-recorder mode where string
formatting can be expensive.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
a76ccf3c1c trace: extract stap_escape() function for reuse
SystemTap reserved words sometimes conflict with QEMU variable names.
We escape them to prevent conflicts.

Move escaping into its own function so the next patch can reuse it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Fam Zheng
169a24aea4 build-sys: Move qapi-{types, visit, event}.o into util-obj-y
These three objects are repeated in multiple times in Makefiles. Let's
just add them to libqemuutil.a, and don't list explicitly elsewhere.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:09:17 +04:00
Fam Zheng
90bda0823a po: Add Chinese translation
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Dongsheng Song <songdongsheng@live.cn>
Reviewed-by: Wei Huang <wehuang@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:41 +04:00
Chen Gang
fdcf6e65bc qemu-img: Check getchar() return value in read_password() for WIN32
getchar() is a standard c library function which may return with failure
(e.g. -1), so like another platforms, also need check it under WIN32.

And make the related code match current qemu code styles, too.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Stefan Weil
f13bef9592 hw/timer: Move extern declaration from .c to .h file
This fixes a warning from smatch (static code analyser).

Fix also the comment with the renamed source file name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>

 hw/timer/tusb6010.c |    3 ---
 include/hw/usb.h    |    7 ++++++-
 2 files changed, 6 insertions(+), 4 deletions(-)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Stefan Weil
0f03fb6094 virtio: Move extern declaration to header file
This fixes a warning from smatch (static code analyser).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Alex Bligh
a3f1f040d2 Show length mismatch error is hex
When live migrate fails due to a section length mismatch we currently
see an error message like:

Length mismatch: 0000:00:03.0/virtio-net-pci.rom: 10000 in != 20000

The section lengths are in fact in hex, so this should read

Length mismatch: 0000:00:03.0/virtio-net-pci.rom: 0x10000 in != 0x20000

Correct the error string to reflect this.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
chenfan
5bb4c35dca target-i386/cpu.c: Fix two error output indentation
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Michael Tokarev
bff6cb7296 l2tpv3 (configure): it is linux-specific
Some non-linux systems, for example a system with
FreeBSD kernel and glibc, may declare struct mmsghdr
(in glibc) but may not have linux-specific header
file linux/ip.h.  The actual implementation in qemu
includes this linux-specific header file unconditionally,
so compilation fails if it is not present.  Include
this header in the configure test too.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Michael Tokarev
203d65a470 hw/timer/imx_*: fix TIMER_MAX clash with system symbol
The symbol TIMER_MAX used in imx_epit.c and imx_gpt.c
clashes with system symbol with the same name.  Because
all qemu source files includes qemu-common.h which, in
turn, includes limits.h, which is not unusual to define
it.  Rename local symbol to have a reasonable prefix.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Peter Maydell
2d591ce2ae Merge remote-tracking branch 'remotes/mdroth/qga-pull-2014-08-08' into staging
* remotes/mdroth/qga-pull-2014-08-08:
  qga: Disable unsupported commands by default
  qga: Add guest-get-fsinfo command
  qga: Add guest-fsfreeze-freeze-list command

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-08 14:16:05 +01:00
Tomoki Sekiyama
1281c08a46 qga: Disable unsupported commands by default
Currently management softwares cannot know whether a qemu-ga command is
supported or not on the running platform until they actually execute it.
This patch disables unsupported commands at launch time of qemu-ga, so that
management softwares can check whether they are supported from 'enabled'
property of the result from 'guest-info' command.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-07 17:15:53 -05:00
Tomoki Sekiyama
46d4c5723e qga: Add guest-get-fsinfo command
Add command to get mounted filesystems information in the guest.
The returned value contains a list of mountpoint paths and
corresponding disks info such as disk bus type, drive address,
and the disk controllers' PCI addresses, so that management layer
such as libvirt can resolve the disk backends.

For example, when `lsblk' result is:

    NAME           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sdb              8:16   0    1G  0 disk
    `-sdb1           8:17   0 1024M  0 part
      `-vg0-lv0    253:1    0  1.4G  0 lvm  /mnt/test
    sdc              8:32   0    1G  0 disk
    `-sdc1           8:33   0  512M  0 part
      `-vg0-lv0    253:1    0  1.4G  0 lvm  /mnt/test
    vda            252:0    0   25G  0 disk
    `-vda1         252:1    0   25G  0 part /

where sdb is a SCSI disk with PCI controller 0000:00:0a.0 and ID=1,
      sdc is an IDE disk with PCI controller 0000:00:01.1, and
      vda is a virtio-blk disk with PCI device 0000:00:06.0,

guest-get-fsinfo command will return the following result:

    {"return":
     [{"name":"dm-1",
       "mountpoint":"/mnt/test",
       "disk":[
        {"bus-type":"scsi","bus":0,"unit":1,"target":0,
         "pci-controller":{"bus":0,"slot":10,"domain":0,"function":0}},
        {"bus-type":"ide","bus":0,"unit":0,"target":0,
         "pci-controller":{"bus":0,"slot":1,"domain":0,"function":1}}],
       "type":"xfs"},
      {"name":"vda1", "mountpoint":"/",
       "disk":[
        {"bus-type":"virtio","bus":0,"unit":0,"target":0,
         "pci-controller":{"bus":0,"slot":6,"domain":0,"function":0}}],
       "type":"ext4"}]}

In Linux guest, the disk information is resolved from sysfs. So far,
it only supports virtio-blk, virtio-scsi, IDE, SATA, SCSI disks on x86
hosts, and "disk" parameter may be empty for unsupported disk types.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>

*updated schema to report 2.2 as initial supported version

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-07 17:15:14 -05:00
Tomoki Sekiyama
e99bce2021 qga: Add guest-fsfreeze-freeze-list command
If an array of mount point paths is specified as 'mountpoints' argument
of guest-fsfreeze-freeze-list, qemu-ga will only freeze the file systems
mounted on specified paths in Linux guests. Otherwise, it works as the
same way as guest-fsfreeze-freeze.
This would be useful when the host wants to create partial disk snapshots.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

*updated schema to report 2.2 as initial supported version

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-07 17:13:10 -05:00
Peter Maydell
2ee55b8351 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
KVM changes include a MIPS patch and the testdev backend used by the
ARM kvm-unit-tests.  icount include the first part of reverse execution
and Sebastian Tanase's patches to slow down -icount execution to the
desired speed of the target.

v1->v2: fix dump_drift_info to print nothing outside icount mode,
        and to compile on 32-bit architectures

# gpg: Signature made Thu 07 Aug 2014 14:09:58 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  target-mips: Ignore unassigned accesses with KVM
  monitor: Add drift info to 'info jit'
  cpu-exec: Print to console if the guest is late
  cpu-exec: Add sleeping algorithm
  icount: Add align option to icount
  icount: Add QemuOpts for icount
  icount: Fix virtual clock start value on ARM
  timer: add cpu_icount_to_ns function.
  migration: migrate icount fields.
  icount: put icount variables into TimerState.
  backends: Introduce chr-testdev

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-07 14:54:47 +01:00
James Hogan
eddedd546a target-mips: Ignore unassigned accesses with KVM
MIPS registers an unassigned access handler which raises a guest bus
error exception. However this causes QEMU to crash when KVM is enabled
as it isn't called from the main execution loop so longjmp() gets called
without a corresponding setjmp().

Until the KVM API can be updated to trigger a guest exception in
response to an MMIO exit, prevent the bus error exception being raised
from mips_cpu_unassigned_access() if KVM is enabled.

The check is at run time since the do_unassigned_access callback is
initialised before it is known whether KVM will be enabled.

The problem can be triggered with Malta emulation by making the guest
write to the reset region at physical address 0x1bf00000, since it is
marked read-only which is treated as unassigned for writes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-07 15:09:48 +02:00
Sebastian Tanase
27498bef35 monitor: Add drift info to 'info jit'
Show in 'info jit' the current delay between the host clock
and the guest clock. In addition, print the maximum advance
and delay of the guest compared to the host.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-07 15:09:48 +02:00
Peter Maydell
9d8bb35574 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140805.0' into staging
VFIO patches: Fix MSI-X vector expansion, remove MSI/X message caching

# gpg: Signature made Tue 05 Aug 2014 20:25:57 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140805.0:
  vfio: Don't cache MSIMessage
  vfio: Fix MSI-X vector expansion

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-07 11:30:38 +01:00
Sebastian Tanase
7f7bc144ed cpu-exec: Print to console if the guest is late
If the align option is enabled, we print to the user whenever
the guest clock is behind the host clock in order for he/she
to have a hint about the actual performance. The maximum
print interval is 2s and we limit the number of messages to 100.
If desired, this can be changed in cpu-exec.c

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
c2aa5f8199 cpu-exec: Add sleeping algorithm
The goal is to sleep qemu whenever the guest clock
is in advance compared to the host clock (we use
the monotonic clocks). The amount of time to sleep
is calculated in the execution loop in cpu_exec.

At first, we tried to approximate at each for loop the real time elapsed
while searching for a TB (generating or retrieving from cache) and
executing it. We would then approximate the virtual time corresponding
to the number of virtual instructions executed. The difference between
these 2 values would allow us to know if the guest is in advance or delayed.
However, the function used for measuring the real time
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
We had an added overhead of 13% of the total run time.

Therefore, we modified the algorithm and only take into account the
difference between the 2 clocks at the begining of the cpu_exec function.
During the for loop we try to reduce the advance of the guest only by
computing the virtual time elapsed and sleeping if necessary. The overhead
is thus reduced to 3%. Even though this method still has a noticeable
overhead, it no longer is a bottleneck in trying to achieve a better
guest frequency for which the guest clock is faster than the host one.

As for the the alignement of the 2 clocks, with the first algorithm
the guest clock was oscillating between -1 and 1ms compared to the host clock.
Using the second algorithm we notice that the guest is 5ms behind the host, which
is still acceptable for our use case.

The tests where conducted using fio and stress. The host machine in an i5 CPU at
3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb
built with buildroot.

Currently, on our test machine, the lowest icount we can achieve that is suitable for
aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are
slower than the cpu tests (using stress).

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
a8bfac3708 icount: Add align option to icount
The align option is used for activating the align algorithm
in order to synchronise the host clock and the guest clock.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
1ad9580bd7 icount: Add QemuOpts for icount
Make icount parameter use QemuOpts style options in order
to easily add other suboptions.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
7146839505 icount: Fix virtual clock start value on ARM
When using the icount option on ARM, the virtual
clock starts counting at realtime clock but it
should start at 0.

The reason why the virtual clock starts at realtime clock
is because the first time we call qemu_clock_warp (which
calls icount_warp_rt) in tcg_exec_all, qemu_icount_bias
(which is part of the virtual time computation mechanism)
will increment by realtime - vm_clock_warp_start, with
vm_clock_warp_start being 0 (see icount_warp_rt in cpus.c).

By changing the value of vm_clock_warp_start from 0 to -1,
the first time we call qemu_clock_warp which calls
icount_warp_rt, we will return immediatly because
icount_warp_rt first checks if vm_clock_warp_start is -1
and if it's the case it returns. Therefore, qemu_icount_bias
will first be incremented by the value of a virtual timer
deadline when the virtual cpu goes from active to inactive.

The virtual time will start at 0 and increment based
on the instruction counter when the vcpu is active or
the qemu_icount_bias value when inactive.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
KONRAD Frederic
3f03131390 timer: add cpu_icount_to_ns function.
This adds cpu_icount_to_ns function which is needed for reverse execution.

It returns the time for a specific instruction.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
KONRAD Frederic
d09eae3726 migration: migrate icount fields.
This fixes a bug where qemu_icount and qemu_icount_bias are not migrated.
It adds a subsection "timer/icount" to vmstate_timers so icount is migrated only
when needed.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
KONRAD Frederic
c96778bb84 icount: put icount variables into TimerState.
This puts qemu_icount and qemu_icount_bias into TimerState structure to allow
them to be migrated.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Paolo Bonzini
5692399f0a backends: Introduce chr-testdev
From: Paolo Bonzini <pbonzini@redhat.com>

chr-testdev enables a virtio serial channel to be used for guest
initiated qemu exits. hw/misc/debugexit already enables guest
initiated qemu exits, but only for PC targets. chr-testdev supports
any virtio-capable target. kvm-unit-tests/arm is already making use
of this backend.

Currently there is a single command implemented, "q".  It takes a
(prefix) argument for the exit code, thus an exit is implemented by
writing, e.g. "1q", to the virtio-serial port.

It can be used as:
   $QEMU ... \
     -device virtio-serial-device \
     -device virtserialport,chardev=ctd -chardev testdev,id=ctd

or, use:
   $QEMU ... \
     -device virtio-serial-device \
     -device virtconsole,chardev=ctd -chardev testdev,id=ctd

to bind it to virtio-serial port0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:05 +02:00
Alex Williamson
9b3af4c0e4 vfio: Don't cache MSIMessage
Commit 40509f7f added a test to avoid updating KVM MSI routes when the
MSIMessage is unchanged and f4d45d47 switched to relying on this
rather than doing our own comparison.  Our cached msg is effectively
unused now.  Remove it.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-05 13:05:57 -06:00
Alex Williamson
c048be5cc9 vfio: Fix MSI-X vector expansion
When new MSI-X vectors are enabled we need to disable MSI-X and
re-enable it with the correct number of vectors.  That means we need
to reprogram the eventfd triggers for each vector.  Prior to f4d45d47
vector->use tracked whether a vector was masked or unmasked and we
could always pick the KVM path when available for unmasked vectors.
Now vfio doesn't track mask state itself and vector->use and virq
remains configured even for masked vectors.  Therefore we need to ask
the MSI-X code whether a vector is masked in order to select the
correct signaling path.  As noted in the comment, MSI relies on
hardware to handle masking.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # QEMU 2.1
2014-08-05 13:05:52 -06:00
Peter Maydell
69f87f7130 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140804' into staging
target-arm queue:
 * Set PC correctly when loading AArch64 ELF files
 * sdhci: Fix ADMA dma_memory_read access
 * some more foundational work for EL2/EL3 support
 * fix bugs which reveal themselves if the TARGET_PAGE_SIZE
   is not set to 1K

# gpg: Signature made Mon 04 Aug 2014 14:51:34 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140804:
  target-arm: A64: fix TLB flush instructions
  target-arm: don't hardcode mask values in arm_cpu_handle_mmu_fault
  target-arm: Fix bit test in sp_el0_access
  target-arm: Add FAR_EL2 and 3
  target-arm: Add ESR_EL2 and 3
  target-arm: Make far_el1 an array
  target-arm: A64: Respect SPSEL when taking exceptions
  target-arm: A64: Respect SPSEL in ERET SP restore
  target-arm: A64: Break out aarch64_save/restore_sp
  sd: sdhci: Fix ADMA dma_memory_read access
  hw/arm/virt: formatting: memory map
  hw/arm/boot: Set PC correctly when loading AArch64 ELF files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 15:01:38 +01:00
Alex Bennée
dbb1fb277c target-arm: A64: fix TLB flush instructions
According to the ARM ARM we weren't correctly flushing the TLB entries
where bits 63:56 didn't match bit 55 of the virtual address. This
exposed a problem when we switched QEMU's internal TARGET_PAGE_BITS to
12 for aarch64.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1406733627-24255-3-git-send-email-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:56 +01:00
Alex Bennée
dcd82c118c target-arm: don't hardcode mask values in arm_cpu_handle_mmu_fault
Otherwise we break quickly when we change TARGET_PAGE_SIZE.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1406733627-24255-2-git-send-email-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Stefan Weil
cdcf14057d target-arm: Fix bit test in sp_el0_access
Static code analyzers complain about a dubious & operation used for a
boolean value. The code does not test the PSTATE_SP bit as it should.

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1406359601-25583-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Edgar E. Iglesias
63b60551a7 target-arm: Add FAR_EL2 and 3
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1402994746-8328-7-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Edgar E. Iglesias
f2c30f42f5 target-arm: Add ESR_EL2 and 3
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1402994746-8328-6-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Edgar E. Iglesias
2f0180c51b target-arm: Make far_el1 an array
No functional change.
Prepares for future additions of the EL2 and 3 versions of this reg.

Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1402994746-8328-5-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Edgar E. Iglesias
f151b123a3 target-arm: A64: Respect SPSEL when taking exceptions
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1402994746-8328-4-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Edgar E. Iglesias
98ea5615ab target-arm: A64: Respect SPSEL in ERET SP restore
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1402994746-8328-3-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Edgar E. Iglesias
9208b9617f target-arm: A64: Break out aarch64_save/restore_sp
Break out code to save/restore AArch64 SP into functions.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1402994746-8328-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Peter Crosthwaite
9db11cef8c sd: sdhci: Fix ADMA dma_memory_read access
This dma_memory_read was giving too big a size when begin was non-zero.
This could cause segfaults in some circumstances. Fix.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Andrew Jones
fab4693239 hw/arm/virt: formatting: memory map
Add some spacing and zeros to make it easier to read and
modify the map. This patch has no functional changes. The
review looks ugly, but it's actually pretty easy to confirm
all the addresses are as they should be - thanks to the new
formatting ;-)

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:53 +01:00
Peter Maydell
a9047ec3f6 hw/arm/boot: Set PC correctly when loading AArch64 ELF files
The code in do_cpu_reset() correctly handled AArch64 CPUs
when running Linux kernels, but was missing code in the
branch of the if() that deals with loading ELF files.
Correctly jump to the ELF entry point on reset rather than
leaving the reset PC at zero.

Reported-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Christopher Covington <cov@codeaurora.org>
Cc: qemu-stable@nongnu.org
2014-08-04 14:41:53 +01:00
Peter Maydell
cc11a0623a Merge remote-tracking branch 'remotes/amit-migration/for-2.2' into staging
* remotes/amit-migration/for-2.2:
  checker: ignore fields marked unused
  vmstate static checker: whitelist additions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:19 +01:00
Peter Maydell
924c09db51 Merge remote-tracking branch 'remotes/amit-virtio-rng/for-2.2' into staging
* remotes/amit-virtio-rng/for-2.2:
  virtio-rng: replace error_set calls with error_setg
  virtio-rng: Move error-checking forward to prevent memory leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 13:07:02 +01:00
Peter Maydell
7b13ff3f15 Merge remote-tracking branch 'remotes/sstabellini/xen-20140801' into staging
* remotes/sstabellini/xen-20140801:
  qemu: support xen hvm direct kernel boot
  tap-bsd: implement a FreeBSD only version of tap_open
  xen: fix usage of ENODATA

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 11:17:24 +01:00
Amit Shah
32ce1b4817 checker: ignore fields marked unused
While comparing qemu-1.0 json output with qemu-2.1, a few fields got
marked unused.  These need to be skipped over, and not flagged as
mismatches.

For handling unused fields, the exact number of bytes need to be skipped
over as the size of the unused field.

Currently, only the term "unused" is matched.  When more field names
turn up, this will have to be updated based on the whitelist matching
method to match more such terms.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-08-04 15:02:37 +05:30
John Snow
c617dd3b7e virtio-rng: replace error_set calls with error_setg
Under recommendation from Luiz Capitulino, we are changing
the error_set calls to error_setg while we are fixing up
the error handling pathways of virtio-rng.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-08-04 14:50:11 +05:30
John Snow
1efd6e072c virtio-rng: Move error-checking forward to prevent memory leak
This patch pushes the error-checking forward and the virtio
initialization backward in the device realization function
in order to prevent memory leaks for hot plug scenarios.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-08-04 14:49:53 +05:30
Peter Maydell
c79805802b Open 2.2 development tree
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-01 18:30:08 +01:00
Chunyan Liu
b33a5bbfba qemu: support xen hvm direct kernel boot
qemu side patch to support xen HVM direct kernel boot:
if -kernel exists, calls xen_load_linux(), which will read kernel/initrd
and add a linuxboot.bin or multiboot.bin option rom. The
linuxboot.bin/multiboot.bin will load kernel/initrd and jump to execute
kernel directly. It's working when xen uses seabios.

During this work, found the 'kvmvapic' is in option_rom list, it should
not be there in xen case. Set s->vapic_control = 0 in xen_apic_realize()
to handle that.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-01 15:58:12 +00:00
Roger Pau Monne
8677de2b4d tap-bsd: implement a FreeBSD only version of tap_open
The current behaviour of tap_open for BSD systems differ greatly from
it's Linux counterpart. Since FreeBSD supports interface renaming and
tap device cloning by opening /dev/tap, implement a FreeBSD specific
version of tap_open that behaves like it's Linux counterpart.

This is specially important for toolstacks that use Qemu (like Xen
libxl), in order to have a unified behaviour across suported
platforms.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-01 15:57:48 +00:00
Roger Pau Monne
74bc41511a xen: fix usage of ENODATA
ENODATA doesn't exist on FreeBSD, so ENODATA errors returned by the
hypervisor are translated to ENOENT.

Also, the error code is returned in errno if the call returns -1, so
compare the error code with the value in errno instead of the value
returned by the function.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Anthony Perard <anthony.perard@citrix.com>
2014-08-01 15:57:28 +00:00
Paolo Bonzini
33cbb2c546 virtio-scsi: implement parse_cdb
Enable passthrough of vendor-specific commands.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:38 +02:00
Paolo Bonzini
3e7e180ab3 scsi-block, scsi-generic: implement parse_cdb
The callback lets the bus provide the direction and transfer count
for passthrough commands, enabling passthrough of vendor-specific
commands.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:33 +02:00
Paolo Bonzini
592c3b289f scsi-block: extract scsi_block_is_passthrough
This will be used for both scsi_block_new_request and the scsi-block
implementation of parse_cdb.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:29 +02:00
Paolo Bonzini
ff34c32ccc scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
These callbacks will let devices do their own request parsing, or
defer it to the bus.  If the bus does not provide an implementation,
in turn, fall back to the default parsing routine.

Swap the first two arguments to scsi_req_parse, and rename it to
scsi_req_parse_cdb, for consistency.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:25 +02:00
Paolo Bonzini
769998a1db scsi-bus: prepare scsi_req_new for introduction of parse_cdb
The per-SCSIDevice parse_cdb callback must not be called if the
request will go through special SCSIReqOps, so detect the special
cases early enough.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:09 +02:00
Amit Shah
bb9c3636d9 vmstate static checker: whitelist additions
Comparing json outputs from qemu-1.0 with qemu-2.1 turned up a few
description name changes; whitelist them here.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-07-22 17:06:54 +05:30
1017 changed files with 62257 additions and 16134 deletions

4
.gitignore vendored
View File

@@ -11,6 +11,10 @@
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/libcacard/trace/generated-tracers.c

View File

@@ -12,7 +12,7 @@ notifications:
on_failure: always
env:
global:
- TEST_CMD="make check"
- TEST_CMD=""
- EXTRA_CONFIG=""
# Development packages, EXTRA_PKGS saved for additional builds
- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
@@ -20,31 +20,51 @@ env:
- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
- EXTRA_PKGS=""
matrix:
# Group major targets together with their linux-user counterparts
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=arm-softmmu,arm-linux-user
- TARGETS=aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu
- TARGETS=i386-softmmu,x86_64-softmmu
- TARGETS=lm32-softmmu
- TARGETS=m68k-softmmu
- TARGETS=microblaze-softmmu,microblazeel-softmmu
- TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu,cris-linux-user
- TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
- TARGETS=m68k-softmmu,m68k-linux-user
- TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=moxie-softmmu
- TARGETS=or32-softmmu,
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu
- TARGETS=s390x-softmmu
- TARGETS=sh4-softmmu,sh4eb-softmmu
- TARGETS=sparc-softmmu,sparc64-softmmu
- TARGETS=unicore32-softmmu
- TARGETS=xtensa-softmmu,xtensaeb-softmmu
- TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user
- TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
- TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
- TARGETS=unicore32-softmmu,unicore32-linux-user
# Group remaining softmmu only targets into one build
- TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
git:
# we want to do this ourselves
submodules: false
before_install:
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
- sudo apt-get update -qq
- sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
script: "./configure --target-list=${TARGETS} ${EXTRA_CONFIG} && make && ${TEST_CMD}"
before_script:
- ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
script:
- make -j2 && ${TEST_CMD}
matrix:
# We manually include a number of additional build for non-standard bits
include:
# Make check target (we only do this once)
- env:
- TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,
i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,
microblazeel-softmmu,mips-softmmu,mips64-softmmu,
mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,
ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,
sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,
unicore32-softmmu,unicore32-linux-user,
lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,
xtensaeb-softmmu
TEST_CMD="make check"
compiler: gcc
# Debug related options
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug"
@@ -73,7 +93,6 @@ matrix:
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
TEST_CMD=""
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"

View File

@@ -91,3 +91,17 @@ Mixed declarations (interleaving statements and declarations within blocks)
are not allowed; declarations should be at the beginning of blocks. In other
words, the code should not generate warnings if using GCC's
-Wdeclaration-after-statement option.
6. Conditional statements
When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:
if (a == 1) {
/* Reads like: "If a equals 1" */
do_something();
}
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

View File

@@ -51,6 +51,7 @@ Descriptions of section entries:
General Project Administration
------------------------------
M: Anthony Liguori <aliguori@amazon.com>
M: Peter Maydell <peter.maydell@linaro.org>
Responsible Disclosure, Reporting Security Issues
------------------------------
@@ -61,11 +62,23 @@ L: secalert@redhat.com
Guest CPU cores (TCG):
----------------------
Overall
L: qemu-devel@nongnu.org
S: Odd fixes
F: cpu-exec.c
F: cputlb.c
F: softmmu_template.h
F: translate-all.c
F: include/exec/cpu_ldst.h
F: include/exec/cpu_ldst_template.h
F: include/exec/helper*.h
Alpha
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: target-alpha/
F: hw/alpha/
F: tests/tcg/alpha/
ARM
M: Peter Maydell <peter.maydell@linaro.org>
@@ -79,6 +92,7 @@ M: Edgar E. Iglesias <edgar.iglesias@gmail.com>
S: Maintained
F: target-cris/
F: hw/cris/
F: tests/tcg/cris/
LM32
M: Michael Walle <michael@walle.cc>
@@ -86,6 +100,7 @@ S: Maintained
F: target-lm32/
F: hw/lm32/
F: hw/char/lm32_*
F: tests/tcg/lm32/
M68K
S: Orphan
@@ -100,9 +115,11 @@ F: hw/microblaze/
MIPS
M: Aurelien Jarno <aurelien@aurel32.net>
S: Odd Fixes
M: Leon Alrae <leon.alrae@imgtec.com>
S: Maintained
F: target-mips/
F: hw/mips/
F: tests/tcg/mips/
Moxie
M: Anthony Green <green@moxielogic.com>
@@ -114,6 +131,7 @@ M: Jia Liu <proljc@gmail.com>
S: Maintained
F: target-openrisc/
F: hw/openrisc/
F: tests/tcg/openrisc/
PowerPC
M: Alexander Graf <agraf@suse.de>
@@ -149,7 +167,8 @@ F: target-unicore32/
F: hw/unicore32/
X86
M: qemu-devel@nongnu.org
M: Paolo Bonzini <pbonzini@redhat.com>
M: Richard Henderson <rth@twiddle.net>
S: Odd Fixes
F: target-i386/
F: hw/i386/
@@ -160,6 +179,13 @@ W: http://wiki.osll.spb.ru/doku.php?id=etc:users:jcmvbkbc:qemu-target-xtensa
S: Maintained
F: target-xtensa/
F: hw/xtensa/
F: tests/tcg/xtensa/
TriCore
M: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
S: Maintained
F: target-tricore/
F: hw/tricore/
Guest CPU Cores (KVM):
----------------------
@@ -192,9 +218,12 @@ M: Cornelia Huck <cornelia.huck@de.ibm.com>
M: Alexander Graf <agraf@suse.de>
S: Maintained
F: target-s390x/kvm.c
F: hw/intc/s390_flic.[hc]
F: hw/intc/s390_flic.c
F: hw/intc/s390_flic_kvm.c
F: include/hw/s390x/s390_flic.h
X86
M: Paolo Bonzini <pbonzini@redhat.com>
M: Marcelo Tosatti <mtosatti@redhat.com>
L: kvm@vger.kernel.org
S: Supported
@@ -260,7 +289,7 @@ F: include/hw/arm/digic.h
F: hw/*/digic*
Gumstix
M: qemu-devel@nongnu.org
L: qemu-devel@nongnu.org
S: Orphan
F: hw/arm/gumstix.c
@@ -276,7 +305,7 @@ S: Maintained
F: hw/arm/integratorcp.c
Mainstone
M: qemu-devel@nongnu.org
L: qemu-devel@nongnu.org
S: Orphan
F: hw/arm/mainstone.c
@@ -382,7 +411,7 @@ S: Maintained
F: hw/mips/mips_malta.c
Mipssim
M: qemu-devel@nongnu.org
L: qemu-devel@nongnu.org
S: Orphan
F: hw/mips/mips_mipssim.c
@@ -515,6 +544,8 @@ F: hw/s390x/s390-virtio-ccw.c
F: hw/s390x/css.[hc]
F: hw/s390x/sclp*.[hc]
F: hw/s390x/ipl*.[hc]
F: include/hw/s390x/
F: pc-bios/s390-ccw/
T: git git://github.com/cohuck/qemu virtio-ccw-upstr
UniCore32 Machines
@@ -552,12 +583,13 @@ Xtensa Machines
sim
M: Max Filippov <jcmvbkbc@gmail.com>
S: Maintained
F: hw/xtensa/xtensa_sim.c
F: hw/xtensa/sim.c
Avnet LX60
XTFPGA (LX60, LX200, ML605, KC705)
M: Max Filippov <jcmvbkbc@gmail.com>
S: Maintained
F: hw/xtensa/xtensa_lx60.c
F: hw/xtensa/xtfpga.c
F: hw/net/opencores_eth.c
Devices
-------
@@ -614,7 +646,13 @@ USB
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: hw/usb/*
F: tests/usb-hcd-ehci-test.c
F: tests/usb-*-test.c
USB (serial adapter)
M: Gerd Hoffmann <kraxel@redhat.com>
M: Samuel Thibault <samuel.thibault@ens-lyon.org>
S: Maintained
F: hw/usb/dev-serial.c
VFIO
M: Alex Williamson <alex.williamson@redhat.com>
@@ -678,6 +716,12 @@ S: Maintained
F: hw/*/xilinx_*
F: include/hw/xilinx.h
Vmware
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: hw/net/vmxnet*
F: hw/scsi/vmw_pvscsi*
Subsystems
----------
Audio
@@ -694,18 +738,30 @@ Block
M: Kevin Wolf <kwolf@redhat.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
S: Supported
F: async.c
F: aio-*.c
F: block*
F: block/
F: hw/block/
F: qemu-img*
F: qemu-io*
F: tests/image-fuzzer/
F: tests/qemu-iotests/
T: git git://repo.or.cz/qemu/kevin.git block
T: git git://github.com/stefanha/qemu.git block
Character Devices
M: Anthony Liguori <aliguori@amazon.com>
M: Paolo Bonzini <pbonzini@redhat.com>
S: Maintained
F: qemu-char.c
F: backends/msmouse.c
F: backends/testdev.c
Character Devices (Braille)
M: Samuel Thibault <samuel.thibault@ens-lyon.org>
S: Maintained
F: backends/baum.c
CPU
M: Andreas Färber <afaerber@suse.de>
@@ -727,7 +783,7 @@ S: Maintained
F: device_tree.[ch]
GDB stub
M: qemu-devel@nongnu.org
L: qemu-devel@nongnu.org
S: Odd Fixes
F: gdbstub*
F: gdb-xml/
@@ -764,7 +820,11 @@ F: ui/cocoa.m
Main loop
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
M: Paolo Bonzini <pbonzini@redhat.com>
S: Maintained
F: cpus.c
F: main-loop.c
F: qemu-timer.c
F: vl.c
Human Monitor (HMP)
@@ -803,6 +863,7 @@ M: Luiz Capitulino <lcapitulino@redhat.com>
M: Michael Roth <mdroth@linux.vnet.ibm.com>
S: Maintained
F: qapi/
F: tests/qapi-schema/
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
QAPI Schema
@@ -813,6 +874,18 @@ S: Supported
F: qapi-schema.json
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
QObject
M: Luiz Capitulino <lcapitulino@redhat.com>
S: Maintained
F: qobject/
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
QEMU Guest Agent
M: Michael Roth <mdroth@linux.vnet.ibm.com>
S: Maintained
F: qga/
T: git git://github.com/mdroth/qemu.git qga
QOM
M: Anthony Liguori <aliguori@amazon.com>
M: Andreas Färber <afaerber@suse.de>
@@ -853,6 +926,15 @@ M: Blue Swirl <blauwirbel@gmail.com>
S: Odd Fixes
F: scripts/checkpatch.pl
Migration
M: Juan Quintela <quintela@redhat.com>
S: Maintained
F: include/migration/
F: migration*
F: savevm.c
F: arch_init.c
F: vmstate.c
Seccomp
M: Eduardo Otubo <eduardo.otubo@profitbricks.com>
S: Supported
@@ -861,6 +943,12 @@ F: include/sysemu/seccomp.h
Usermode Emulation
------------------
Overall
M: Riku Voipio <riku.voipio@iki.fi>
S: Maintained
F: thunk.c
F: user-exec.c
BSD user
M: Blue Swirl <blauwirbel@gmail.com>
S: Maintained
@@ -874,7 +962,6 @@ F: linux-user/
Tiny Code Generator (TCG)
-------------------------
Common code
M: qemu-devel@nongnu.org
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: tcg/
@@ -891,7 +978,7 @@ S: Maintained
F: tcg/arm/
i386 target
M: qemu-devel@nongnu.org
L: qemu-devel@nongnu.org
S: Maintained
F: tcg/i386/
@@ -968,7 +1055,7 @@ S: Supported
F: block/rbd.c
Sheepdog
M: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
M: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
M: Liu Yuan <namei.unix@gmail.com>
L: sheepdog@lists.wpkg.org
S: Supported
@@ -1000,3 +1087,14 @@ SSH
M: Richard W.M. Jones <rjones@redhat.com>
S: Supported
F: block/ssh.c
ARCHIPELAGO
M: Chrysostomos Nanakos <cnanakos@grnet.gr>
M: Chrysostomos Nanakos <chris@include.gr>
S: Maintained
F: block/archipelago.c
Bootdevice
M: Gonglei <arei.gonglei@huawei.com>
S: Maintained
F: bootdevice.c

View File

@@ -57,6 +57,12 @@ GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
@@ -202,7 +208,7 @@ Makefile: $(version-obj-y) $(version-lobj-y)
# Build libraries
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y) qapi-types.o qapi-visit.o qapi-event.o
libqemuutil.a: $(util-obj-y)
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
@@ -412,6 +418,7 @@ endif
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
done

View File

@@ -1,7 +1,7 @@
#######################################################################
# Common libraries for tools and emulators
stub-obj-y = stubs/
util-obj-y = util/ qobject/ qapi/ trace/
util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
@@ -12,7 +12,6 @@ block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qapi-types.o qapi-visit.o qapi-event.o
block-obj-y += qemu-io-cmds.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
@@ -51,7 +50,7 @@ common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration.o migration-tcp.o
common-obj-y += vmstate.o
common-obj-y += qemu-file.o
common-obj-y += qemu-file.o qemu-file-unix.o qemu-file-stdio.o
common-obj-$(CONFIG_RDMA) += migration-rdma.o
common-obj-y += qemu-char.o #aio.o
common-obj-y += block-migration.o
@@ -63,6 +62,7 @@ common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
@@ -88,11 +88,6 @@ common-obj-y += qmp-marshal.o
common-obj-y += qmp.o hmp.o
endif
######################################################################
# some qapi visitors are used by both system and user emulation:
common-obj-y += qapi-visit.o qapi-types.o
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
@@ -106,10 +101,15 @@ common-obj-y += disas/
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
util-obj-y += trace/
target-obj-y += trace/
######################################################################
# guest agent
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/ qapi-types.o qapi-visit.o
qga-obj-y = qga/
qga-vss-dll-obj-y = qga/

View File

@@ -38,7 +38,7 @@ config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
ifdef CONFIG_TRACE_SYSTEMTAP
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp $(QEMU_PROG)-simpletrace.stp
ifdef CONFIG_USER_ONLY
TARGET_TYPE=user
@@ -64,6 +64,13 @@ $(QEMU_PROG).stp: $(SRC_PATH)/trace-events
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
else
stap:
endif
@@ -120,7 +127,7 @@ endif #CONFIG_BSD_USER
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o
obj-y += qtest.o bootdevice.o
obj-y += hw/
obj-$(CONFIG_FDT) += device_tree.o
obj-$(CONFIG_KVM) += kvm-all.o
@@ -152,15 +159,20 @@ endif # CONFIG_SOFTMMU
dummy := $(call unnest-vars,,obj-y)
all-obj-y := $(obj-y)
target-obj-y :=
block-obj-y :=
common-obj-y :=
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
# build either PROG or PROGW
@@ -191,6 +203,7 @@ endif
ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
$(INSTALL_DATA) $(QEMU_PROG).stp-installed "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG).stp"
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
endif
GENERATED_HEADERS += config-target.h

View File

@@ -1 +1 @@
2.1.2
2.2.0

157
accel.c Normal file
View File

@@ -0,0 +1,157 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "hw/boards.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
.parent = TYPE_OBJECT,
.class_size = sizeof(AccelClass),
.instance_size = sizeof(AccelState),
};
/* Lookup AccelClass from opt_name. Returns NULL if not found */
static AccelClass *accel_find(const char *opt_name)
{
char *class_name = g_strdup_printf(ACCEL_CLASS_NAME("%s"), opt_name);
AccelClass *ac = ACCEL_CLASS(object_class_by_name(class_name));
g_free(class_name);
return ac;
}
static int accel_init_machine(AccelClass *acc, MachineState *ms)
{
ObjectClass *oc = OBJECT_CLASS(acc);
const char *cname = object_class_get_name(oc);
AccelState *accel = ACCEL(object_new(cname));
int ret;
ms->accelerator = accel;
*(acc->allowed) = true;
ret = acc->init_machine(ms);
if (ret < 0) {
ms->accelerator = NULL;
*(acc->allowed) = false;
object_unref(OBJECT(accel));
}
return ret;
}
int configure_accelerator(MachineState *ms)
{
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
p = "tcg";
}
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
}
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
printf("%s not supported for this target\n",
acc->name);
continue;
}
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
}
if (!accel_initialised) {
if (!init_failed) {
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);

View File

@@ -100,6 +100,11 @@ void aio_set_event_notifier(AioContext *ctx,
(IOHandler *)io_read, NULL, notifier);
}
bool aio_prepare(AioContext *ctx)
{
return false;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -119,11 +124,20 @@ bool aio_pending(AioContext *ctx)
return false;
}
static bool aio_dispatch(AioContext *ctx)
bool aio_dispatch(AioContext *ctx)
{
AioHandler *node;
bool progress = false;
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
@@ -184,22 +198,9 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This happens
* in two cases:
*
* 1) when aio_poll is called with blocking == false
*
* 2) when we are called after poll(). If we are called before
* poll(), bottom halves will not be re-evaluated and we need
* aio_notify() if blocking == true.
*
* The first aio_dispatch() only does something when AioContext is
* running as a GSource, and in that case aio_poll is used only
* with blocking == false, so this optimization is already quite
* effective. However, the code is ugly and should be restructured
* to have a single aio_dispatch() call. To do this, we need to
* reorganize aio_poll into a prepare/poll/dispatch model like
* glib's.
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
@@ -207,26 +208,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
*/
aio_set_dispatching(ctx, !blocking);
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Re-evaluate condition (1) above. */
aio_set_dispatching(ctx, !blocking);
if (aio_dispatch(ctx)) {
progress = true;
}
if (progress && !blocking) {
goto out;
}
ctx->walking_handlers++;
g_array_set_size(ctx->pollfds, 0);
@@ -249,7 +230,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* wait until next event */
ret = qemu_poll_ns((GPollFD *)ctx->pollfds->data,
ctx->pollfds->len,
blocking ? timerlistgroup_deadline_ns(&ctx->tlg) : 0);
blocking ? aio_compute_timeout(ctx) : 0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
@@ -268,7 +249,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress = true;
}
out:
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -22,12 +22,80 @@
struct AioHandler {
EventNotifier *e;
IOHandler *io_read;
IOHandler *io_write;
EventNotifierHandler *io_notify;
GPollFD pfd;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
HANDLE event;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_malloc0(sizeof(AioHandler));
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
if (node->io_read) {
node->pfd.events |= G_IO_IN;
}
if (node->io_write) {
node->pfd.events |= G_IO_OUT;
}
node->e = &ctx->notifier;
/* Update handler with latest information */
node->opaque = opaque;
node->io_read = io_read;
node->io_write = io_write;
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify)
@@ -76,6 +144,43 @@ void aio_set_event_notifier(AioContext *ctx,
aio_notify(ctx);
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
AioHandler *node;
bool have_select_revents = false;
fd_set rfds, wfds;
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
if (node->io_write) {
FD_SET ((SOCKET)node->pfd.fd, &wfds);
}
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
have_select_revents = true;
}
if (FD_ISSET(node->pfd.fd, &wfds)) {
node->pfd.revents |= G_IO_OUT;
have_select_revents = true;
}
}
}
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -84,47 +189,37 @@ bool aio_pending(AioContext *ctx)
if (node->pfd.revents && node->io_notify) {
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
return true;
}
}
return false;
}
bool aio_poll(AioContext *ctx, bool blocking)
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool progress;
int count;
int timeout;
progress = false;
bool progress = false;
/*
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Run timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
/*
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (node->pfd.revents && node->io_notify) {
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
@@ -134,6 +229,28 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
}
if (!node->deleted &&
(node->io_read || node->io_write)) {
node->pfd.revents = 0;
if ((revents & G_IO_IN) && node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if ((revents & G_IO_OUT) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
/* if the next select() will return an event, we have progressed */
if (event == event_notifier_get_handle(&ctx->notifier)) {
WSANETWORKEVENTS ev;
WSAEnumNetworkEvents(node->pfd.fd, event, &ev);
if (ev.lNetworkEvents) {
progress = true;
}
}
}
tmp = node;
node = QLIST_NEXT(node, node);
@@ -145,10 +262,47 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
}
if (progress && !blocking) {
return true;
return progress;
}
bool aio_dispatch(AioContext *ctx)
{
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool was_dispatching, progress, have_select_revents, first;
int count;
int timeout;
have_select_revents = aio_prepare(ctx);
if (have_select_revents) {
blocking = false;
}
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
/* fill fd sets */
@@ -160,64 +314,40 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
ctx->walking_handlers--;
first = true;
/* wait until next event */
while (count > 0) {
HANDLE event;
int ret;
timeout = blocking ?
qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg)) : 0;
timeout = blocking
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
aio_set_dispatching(ctx, true);
if (first && aio_bh_poll(ctx)) {
progress = true;
}
first = false;
/* if we have any signaled events, dispatch event */
if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
event = NULL;
if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
event = events[ret - WAIT_OBJECT_0];
events[ret - WAIT_OBJECT_0] = events[--count];
} else if (!have_select_revents) {
break;
}
have_select_revents = false;
blocking = false;
/* we have to walk very carefully in case
* aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
ctx->walking_handlers++;
if (!node->deleted &&
event_notifier_get_handle(node->e) == events[ret - WAIT_OBJECT_0] &&
node->io_notify) {
node->io_notify(node->e);
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
/* Try again, but only call each handler once. */
events[ret - WAIT_OBJECT_0] = events[--count];
progress |= aio_dispatch_handlers(ctx, event);
}
if (blocking) {
/* Run the timers a second time. We do this because otherwise aio_wait
* will not note progress - and will stop a drain early - if we have
* a timer that was not ready to run entering g_poll but is ready
* after g_poll. This will only do anything if a timer has expired.
*/
progress |= timerlistgroup_run_timers(&ctx->tlg);
}
progress |= timerlistgroup_run_timers(&ctx->tlg);
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -104,6 +104,8 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_XTENSA
#elif defined(TARGET_UNICORE32)
#define QEMU_ARCH QEMU_ARCH_UNICORE32
#elif defined(TARGET_TRICORE)
#define QEMU_ARCH QEMU_ARCH_TRICORE
#endif
const uint32_t arch_type = QEMU_ARCH;
@@ -484,15 +486,23 @@ static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
/* Needs iothread lock! */
/* Fix me: there are too many global variables used in migration process. */
static int64_t start_time;
static int64_t bytes_xfer_prev;
static int64_t num_dirty_pages_period;
static void migration_bitmap_sync_init(void)
{
start_time = 0;
bytes_xfer_prev = 0;
num_dirty_pages_period = 0;
}
static void migration_bitmap_sync(void)
{
RAMBlock *block;
uint64_t num_dirty_pages_init = migration_dirty_pages;
MigrationState *s = migrate_get_current();
static int64_t start_time;
static int64_t bytes_xfer_prev;
static int64_t num_dirty_pages_period;
int64_t end_time;
int64_t bytes_xfer_now;
static uint64_t xbzrle_cache_miss_prev;
@@ -772,6 +782,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
mig_throttle_on = false;
dirty_rate_high_cnt = 0;
bitmap_sync_count = 0;
migration_bitmap_sync_init();
if (migrate_use_xbzrle()) {
XBZRLE_cache_lock();
@@ -1004,7 +1015,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
uint8_t len;
if (flags & RAM_SAVE_FLAG_CONTINUE) {
if (!block) {
if (!block || block->length <= offset) {
error_report("Ack, bad migration stream!");
return NULL;
}
@@ -1017,8 +1028,9 @@ static inline void *host_from_stream_offset(QEMUFile *f,
id[len] = 0;
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id)))
if (!strncmp(id, block->idstr, sizeof(id)) && block->length > offset) {
return memory_region_get_ram_ptr(block->mr) + offset;
}
}
error_report("Can't find block %s!", id);
@@ -1038,8 +1050,7 @@ void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
static int ram_load(QEMUFile *f, void *opaque, int version_id)
{
ram_addr_t addr;
int flags, ret = 0;
int flags = 0, ret = 0;
static uint64_t seq_iter;
seq_iter++;
@@ -1048,21 +1059,24 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
ret = -EINVAL;
}
while (!ret) {
addr = qemu_get_be64(f);
while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
ram_addr_t addr, total_ram_bytes;
void *host;
uint8_t ch;
addr = qemu_get_be64(f);
flags = addr & ~TARGET_PAGE_MASK;
addr &= TARGET_PAGE_MASK;
if (flags & RAM_SAVE_FLAG_MEM_SIZE) {
switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
case RAM_SAVE_FLAG_MEM_SIZE:
/* Synchronize RAM block list */
char id[256];
ram_addr_t length;
ram_addr_t total_ram_bytes = addr;
while (total_ram_bytes) {
total_ram_bytes = addr;
while (!ret && total_ram_bytes) {
RAMBlock *block;
uint8_t len;
char id[256];
ram_addr_t length;
len = qemu_get_byte(f);
qemu_get_buffer(f, (uint8_t *)id, len);
@@ -1072,8 +1086,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id))) {
if (block->length != length) {
error_report("Length mismatch: %s: " RAM_ADDR_FMT
" in != " RAM_ADDR_FMT, id, length,
error_report("Length mismatch: %s: 0x" RAM_ADDR_FMT
" in != 0x" RAM_ADDR_FMT, id, length,
block->length);
ret = -EINVAL;
}
@@ -1086,16 +1100,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
"accept migration", id);
ret = -EINVAL;
}
if (ret) {
break;
}
total_ram_bytes -= length;
}
} else if (flags & RAM_SAVE_FLAG_COMPRESS) {
void *host;
uint8_t ch;
break;
case RAM_SAVE_FLAG_COMPRESS:
host = host_from_stream_offset(f, addr, flags);
if (!host) {
error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
@@ -1105,9 +1114,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
ch = qemu_get_byte(f);
ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
} else if (flags & RAM_SAVE_FLAG_PAGE) {
void *host;
break;
case RAM_SAVE_FLAG_PAGE:
host = host_from_stream_offset(f, addr, flags);
if (!host) {
error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
@@ -1116,8 +1124,9 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
}
qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
} else if (flags & RAM_SAVE_FLAG_XBZRLE) {
void *host = host_from_stream_offset(f, addr, flags);
break;
case RAM_SAVE_FLAG_XBZRLE:
host = host_from_stream_offset(f, addr, flags);
if (!host) {
error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
ret = -EINVAL;
@@ -1130,17 +1139,22 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
ret = -EINVAL;
break;
}
} else if (flags & RAM_SAVE_FLAG_HOOK) {
ram_control_load_hook(f, flags);
} else if (flags & RAM_SAVE_FLAG_EOS) {
break;
case RAM_SAVE_FLAG_EOS:
/* normal exit */
break;
} else {
error_report("Unknown migration flags: %#x", flags);
ret = -EINVAL;
break;
default:
if (flags & RAM_SAVE_FLAG_HOOK) {
ram_control_load_hook(f, flags);
} else {
error_report("Unknown combination of migration flags: %#x",
flags);
ret = -EINVAL;
}
}
if (!ret) {
ret = qemu_file_get_error(f);
}
ret = qemu_file_get_error(f);
}
DPRINTF("Completed load of VM with exit code %d seq iteration "
@@ -1335,11 +1349,6 @@ void cpudef_init(void)
#endif
}
int tcg_available(void)
{
return 1;
}
int kvm_available(void)
{
#ifdef CONFIG_KVM

55
async.c
View File

@@ -152,39 +152,48 @@ void qemu_bh_delete(QEMUBH *bh)
bh->deleted = 1;
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
int64_t
aio_compute_timeout(AioContext *ctx)
{
AioContext *ctx = (AioContext *) source;
int64_t deadline;
int timeout = -1;
QEMUBH *bh;
int deadline;
/* We assume there is no timeout already supplied */
*timeout = -1;
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
*timeout = 10;
timeout = 10000000;
} else {
/* non-idle bottom halves will be executed
* immediately */
*timeout = 0;
return true;
return 0;
}
}
}
deadline = qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg));
deadline = timerlistgroup_deadline_ns(&ctx->tlg);
if (deadline == 0) {
*timeout = 0;
return true;
return 0;
} else {
*timeout = qemu_soonest_timeout(*timeout, deadline);
return qemu_soonest_timeout(timeout, deadline);
}
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
/* We assume there is no timeout already supplied */
*timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
if (aio_prepare(ctx)) {
*timeout = 0;
}
return false;
return *timeout == 0;
}
static gboolean
@@ -209,7 +218,7 @@ aio_ctx_dispatch(GSource *source,
AioContext *ctx = (AioContext *) source;
assert(callback == NULL);
aio_poll(ctx, false);
aio_dispatch(ctx);
return true;
}
@@ -280,18 +289,24 @@ static void aio_rfifolock_cb(void *opaque)
aio_notify(opaque);
}
AioContext *aio_context_new(void)
AioContext *aio_context_new(Error **errp)
{
int ret;
AioContext *ctx;
ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
ret = event_notifier_init(&ctx->notifier, false);
if (ret < 0) {
g_source_destroy(&ctx->source);
error_setg_errno(errp, -ret, "Failed to initialize event notifier");
return NULL;
}
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
ctx->thread_pool = NULL;
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
event_notifier_init(&ctx->notifier, false);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
return ctx;

View File

@@ -1,7 +1,7 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o
common-obj-y += msmouse.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)

View File

@@ -629,7 +629,7 @@ fail_handle:
static void register_types(void)
{
register_char_driver_qapi("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
}
type_init(register_types);

View File

@@ -27,7 +27,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size);
backend->size, errp);
g_free(path);
}

View File

@@ -257,15 +257,6 @@ static void host_memory_backend_init(Object *obj)
host_memory_backend_set_policy, NULL, NULL, NULL);
}
static void host_memory_backend_finalize(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (memory_region_size(&backend->mr)) {
memory_region_destroy(&backend->mr);
}
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
@@ -360,7 +351,6 @@ static const TypeInfo host_memory_backend_info = {
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.instance_finalize = host_memory_backend_finalize,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }

View File

@@ -79,7 +79,7 @@ CharDriverState *qemu_chr_open_msmouse(void)
static void register_types(void)
{
register_char_driver_qapi("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
}
type_init(register_types);

131
backends/testdev.c Normal file
View File

@@ -0,0 +1,131 @@
/*
* QEMU Char Device for testsuite control
*
* Copyright (c) 2014 Red Hat, Inc.
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
uint8_t c;
int arg;
#define EAT(c) do { \
if (!len--) { \
return 0; \
} \
c = *cur++; \
} while (0)
EAT(c);
while (isspace(c)) {
EAT(c);
}
arg = 0;
while (isdigit(c)) {
arg = arg * 10 + c - '0';
EAT(c);
}
while (isspace(c)) {
EAT(c);
}
switch (c) {
case 'q':
exit((arg << 1) | 1);
break;
default:
break;
}
return cur - testdev->in_buf;
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
/* Complete our buffer as much as possible */
tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
testdev->in_buf_used += tocopy;
buf += tocopy;
len -= tocopy;
/* Interpret it as much as possible */
while (testdev->in_buf_used > 0 &&
(eaten = testdev_eat_packet(testdev)) > 0) {
memmove(testdev->in_buf, testdev->in_buf + eaten,
testdev->in_buf_used - eaten);
testdev->in_buf_used -= eaten;
}
}
return orig_len;
}
static void testdev_close(struct CharDriverState *chr)
{
TestdevCharState *testdev = chr->opaque;
g_free(testdev);
}
CharDriverState *chr_testdev_init(void)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_malloc0(sizeof(TestdevCharState));
testdev->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
}
type_init(register_types);

View File

@@ -14,7 +14,9 @@
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "block/block.h"
#include "qemu/error-report.h"
#include "qemu/main-loop.h"
#include "hw/hw.h"
#include "qemu/queue.h"
#include "qemu/timer.h"
@@ -70,7 +72,7 @@ typedef struct BlkMigBlock {
int nr_sectors;
struct iovec iov;
QEMUIOVector qiov;
BlockDriverAIOCB *aiocb;
BlockAIOCB *aiocb;
/* Protected by block migration lock. */
int ret;
@@ -130,9 +132,9 @@ static void blk_send(QEMUFile *f, BlkMigBlock * blk)
| flags);
/* device name */
len = strlen(blk->bmds->bs->device_name);
len = strlen(bdrv_get_device_name(blk->bmds->bs));
qemu_put_byte(f, len);
qemu_put_buffer(f, (uint8_t *)blk->bmds->bs->device_name, len);
qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk->bmds->bs), len);
/* if a block is zero we need to flush here since the network
* bandwidth is now a lot higher than the storage device bandwidth.
@@ -186,7 +188,7 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
if ((sector << BDRV_SECTOR_BITS) < bdrv_getlength(bmds->bs)) {
if (sector < bdrv_nb_sectors(bmds->bs)) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
@@ -223,8 +225,7 @@ static void alloc_aio_bitmap(BlkMigDevState *bmds)
BlockDriverState *bs = bmds->bs;
int64_t bitmap_size;
bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size = bdrv_nb_sectors(bs) + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
@@ -284,7 +285,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
nr_sectors = total_sectors - cur_sector;
}
blk = g_malloc(sizeof(BlkMigBlock));
blk = g_new(BlkMigBlock, 1);
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = cur_sector;
@@ -344,18 +345,31 @@ static void unset_dirty_tracking(void)
}
}
static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
static void init_blk_migration(QEMUFile *f)
{
BlockDriverState *bs;
BlkMigDevState *bmds;
int64_t sectors;
if (!bdrv_is_read_only(bs)) {
sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
block_mig_state.submitted = 0;
block_mig_state.read_done = 0;
block_mig_state.transferred = 0;
block_mig_state.total_sector_sum = 0;
block_mig_state.prev_progress = -1;
block_mig_state.bulk_completed = 0;
block_mig_state.zero_blocks = migrate_zero_blocks();
for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
if (bdrv_is_read_only(bs)) {
continue;
}
sectors = bdrv_nb_sectors(bs);
if (sectors <= 0) {
return;
}
bmds = g_malloc0(sizeof(BlkMigDevState));
bmds = g_new0(BlkMigDevState, 1);
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
@@ -370,28 +384,15 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
if (bmds->shared_base) {
DPRINTF("Start migration for %s with shared base image\n",
bs->device_name);
bdrv_get_device_name(bs));
} else {
DPRINTF("Start full migration for %s\n", bs->device_name);
DPRINTF("Start full migration for %s\n", bdrv_get_device_name(bs));
}
QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
}
}
static void init_blk_migration(QEMUFile *f)
{
block_mig_state.submitted = 0;
block_mig_state.read_done = 0;
block_mig_state.transferred = 0;
block_mig_state.total_sector_sum = 0;
block_mig_state.prev_progress = -1;
block_mig_state.bulk_completed = 0;
block_mig_state.zero_blocks = migrate_zero_blocks();
bdrv_iterate(init_blk_migration_it, NULL);
}
/* Called with no lock taken. */
static int blk_mig_save_bulked_block(QEMUFile *f)
@@ -466,7 +467,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
blk = g_malloc(sizeof(BlkMigBlock));
blk = g_new(BlkMigBlock, 1);
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = sector;
@@ -799,7 +800,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
if (bs != bs_prev) {
bs_prev = bs;
total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
total_sectors = bdrv_nb_sectors(bs);
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);

998
block.c

File diff suppressed because it is too large Load Diff

View File

@@ -1,28 +1,28 @@
block-obj-y += raw_bsd.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-$(CONFIG_QUORUM) += quorum.o
block-obj-y += parallels.o blkdebug.o blkverify.o
block-obj-y += snapshot.o qapi.o
block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o mirror.o
ifeq ($(CONFIG_POSIX),y)
block-obj-y += nbd.o nbd-client.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
endif
block-obj-y += accounting.o
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += mirror.o
common-obj-y += backup.o
iscsi.o-cflags := $(LIBISCSI_CFLAGS)
@@ -35,5 +35,6 @@ gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio

54
block/accounting.c Normal file
View File

@@ -0,0 +1,54 @@
/*
* QEMU System Emulator block accounting
*
* Copyright (c) 2011 Christoph Hellwig
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "block/accounting.h"
#include "block/block_int.h"
void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = get_clock();
cookie->type = type;
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] += get_clock() - cookie->start_time_ns;
}
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors)
{
if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}

1084
block/archipelago.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -227,9 +227,25 @@ static BlockErrorAction backup_error_action(BackupBlockJob *job,
}
}
typedef struct {
int ret;
} BackupCompleteData;
static void backup_complete(BlockJob *job, void *opaque)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
BackupCompleteData *data = opaque;
bdrv_unref(s->target);
block_job_completed(job, data->ret);
g_free(data);
}
static void coroutine_fn backup_run(void *opaque)
{
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = job->common.bs;
BlockDriverState *target = job->target;
BlockdevOnError on_target_error = job->on_target_error;
@@ -344,16 +360,17 @@ static void coroutine_fn backup_run(void *opaque)
hbitmap_free(job->bitmap);
bdrv_iostatus_disable(target);
bdrv_unref(target);
block_job_completed(&job->common, ret);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&job->common, backup_complete, data);
}
void backup_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, MirrorSyncMode sync_mode,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb, void *opaque,
BlockCompletionFunc *cb, void *opaque,
Error **errp)
{
int64_t len;

View File

@@ -26,6 +26,10 @@
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
typedef struct BDRVBlkdebugState {
int state;
@@ -37,7 +41,7 @@ typedef struct BDRVBlkdebugState {
} BDRVBlkdebugState;
typedef struct BlkdebugAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUBH *bh;
int ret;
} BlkdebugAIOCB;
@@ -48,11 +52,8 @@ typedef struct BlkdebugSuspendedReq {
QLIST_ENTRY(BlkdebugSuspendedReq) next;
} BlkdebugSuspendedReq;
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb);
static const AIOCBInfo blkdebug_aiocb_info = {
.aiocb_size = sizeof(BlkdebugAIOCB),
.cancel = blkdebug_aio_cancel,
.aiocb_size = sizeof(BlkdebugAIOCB),
};
enum {
@@ -194,6 +195,8 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_PWRITEV] = "pwritev",
[BLKDBG_PWRITEV_ZERO] = "pwritev_zero",
[BLKDBG_PWRITEV_DONE] = "pwritev_done",
[BLKDBG_EMPTY_IMAGE_PREPARE] = "empty_image_prepare",
};
static int get_event_by_name(const char *name, BlkDebugEvent *event)
@@ -213,6 +216,7 @@ static int get_event_by_name(const char *name, BlkDebugEvent *event)
struct add_rule_data {
BDRVBlkdebugState *s;
int action;
Error **errp;
};
static int add_rule(QemuOpts *opts, void *opaque)
@@ -225,7 +229,11 @@ static int add_rule(QemuOpts *opts, void *opaque)
/* Find the right event for the rule */
event_name = qemu_opt_get(opts, "event");
if (!event_name || get_event_by_name(event_name, &event) < 0) {
if (!event_name) {
error_setg(d->errp, "Missing event name for rule");
return -1;
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(d->errp, "Invalid event name \"%s\"", event_name);
return -1;
}
@@ -311,10 +319,21 @@ static int read_config(BDRVBlkdebugState *s, const char *filename,
d.s = s;
d.action = ACTION_INJECT_ERROR;
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 0);
d.errp = &local_err;
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
d.action = ACTION_SET_STATE;
qemu_opts_foreach(&set_state_opts, add_rule, &d, 0);
qemu_opts_foreach(&set_state_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
ret = 0;
fail:
@@ -443,21 +462,11 @@ static void error_callback_bh(void *opaque)
struct BlkdebugAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_release(acb);
qemu_aio_unref(acb);
}
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
if (acb->bh) {
qemu_bh_delete(acb->bh);
acb->bh = NULL;
}
qemu_aio_release(acb);
}
static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
static BlockAIOCB *inject_error(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
int error = rule->options.inject.error;
@@ -482,9 +491,9 @@ static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
return &acb->common;
}
static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -504,9 +513,9 @@ static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -526,6 +535,25 @@ static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file, cb, opaque);
}
static void blkdebug_close(BlockDriverState *bs)
{
@@ -691,6 +719,98 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file);
}
static void blkdebug_refresh_filename(BlockDriverState *bs)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
QDict *opts;
QList *inject_error_list = NULL, *set_state_list = NULL;
QList *suspend_list = NULL;
int event;
if (!bs->file->full_open_options) {
/* The config file cannot be recreated, so creating a plain filename
* is impossible */
return;
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));
for (event = 0; event < BLKDBG_EVENT_MAX; event++) {
QLIST_FOREACH(rule, &s->rules[event], next) {
if (rule->action == ACTION_INJECT_ERROR) {
QDict *inject_error = qdict_new();
qdict_put_obj(inject_error, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(inject_error, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(inject_error, "errno", QOBJECT(qint_from_int(
rule->options.inject.error)));
qdict_put_obj(inject_error, "sector", QOBJECT(qint_from_int(
rule->options.inject.sector)));
qdict_put_obj(inject_error, "once", QOBJECT(qbool_from_int(
rule->options.inject.once)));
qdict_put_obj(inject_error, "immediately",
QOBJECT(qbool_from_int(
rule->options.inject.immediately)));
if (!inject_error_list) {
inject_error_list = qlist_new();
}
qlist_append_obj(inject_error_list, QOBJECT(inject_error));
} else if (rule->action == ACTION_SET_STATE) {
QDict *set_state = qdict_new();
qdict_put_obj(set_state, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(set_state, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(set_state, "new_state", QOBJECT(qint_from_int(
rule->options.set_state.new_state)));
if (!set_state_list) {
set_state_list = qlist_new();
}
qlist_append_obj(set_state_list, QOBJECT(set_state));
} else if (rule->action == ACTION_SUSPEND) {
QDict *suspend = qdict_new();
qdict_put_obj(suspend, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(suspend, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(suspend, "tag", QOBJECT(qstring_from_str(
rule->options.suspend.tag)));
if (!suspend_list) {
suspend_list = qlist_new();
}
qlist_append_obj(suspend_list, QOBJECT(suspend));
}
}
}
if (inject_error_list) {
qdict_put_obj(opts, "inject-error", QOBJECT(inject_error_list));
}
if (set_state_list) {
qdict_put_obj(opts, "set-state", QOBJECT(set_state_list));
}
if (suspend_list) {
qdict_put_obj(opts, "suspend", QOBJECT(suspend_list));
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
@@ -700,9 +820,11 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,

View File

@@ -10,6 +10,8 @@
#include <stdarg.h>
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
typedef struct {
BlockDriverState *test_file;
@@ -17,7 +19,7 @@ typedef struct {
typedef struct BlkverifyAIOCB BlkverifyAIOCB;
struct BlkverifyAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUBH *bh;
/* Request metadata */
@@ -27,7 +29,6 @@ struct BlkverifyAIOCB {
int ret; /* first completed request's result */
unsigned int done; /* completion counter */
bool *finished; /* completion signal for cancel */
QEMUIOVector *qiov; /* user I/O vector */
QEMUIOVector raw_qiov; /* cloned I/O vector for raw file */
@@ -36,22 +37,8 @@ struct BlkverifyAIOCB {
void (*verify)(BlkverifyAIOCB *acb);
};
static void blkverify_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkverifyAIOCB *acb = (BlkverifyAIOCB *)blockacb;
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
bool finished = false;
/* Wait until request completes, invokes its callback, and frees itself */
acb->finished = &finished;
while (!finished) {
aio_poll(aio_context, true);
}
}
static const AIOCBInfo blkverify_aiocb_info = {
.aiocb_size = sizeof(BlkverifyAIOCB),
.cancel = blkverify_aio_cancel,
};
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
@@ -156,6 +143,7 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
}
@@ -177,7 +165,7 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque)
{
BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aiocb_info, bs, cb, opaque);
@@ -191,7 +179,6 @@ static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
acb->qiov = qiov;
acb->buf = NULL;
acb->verify = NULL;
acb->finished = NULL;
return acb;
}
@@ -205,10 +192,7 @@ static void blkverify_aio_bh(void *opaque)
qemu_vfree(acb->buf);
}
acb->common.cb(acb->common.opaque, acb->ret);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
qemu_aio_unref(acb);
}
static void blkverify_aio_cb(void *opaque, int ret)
@@ -245,9 +229,9 @@ static void blkverify_verify_readv(BlkverifyAIOCB *acb)
}
}
static BlockDriverAIOCB *blkverify_aio_readv(BlockDriverState *bs,
static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, false, sector_num, qiov,
@@ -265,9 +249,9 @@ static BlockDriverAIOCB *blkverify_aio_readv(BlockDriverState *bs,
return &acb->common;
}
static BlockDriverAIOCB *blkverify_aio_writev(BlockDriverState *bs,
static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
@@ -280,9 +264,9 @@ static BlockDriverAIOCB *blkverify_aio_writev(BlockDriverState *bs,
return &acb->common;
}
static BlockDriverAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
@@ -320,6 +304,32 @@ static void blkverify_attach_aio_context(BlockDriverState *bs,
bdrv_attach_aio_context(s->test_file, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
/* bs->file has already been refreshed */
bdrv_refresh_filename(s->test_file);
if (bs->file->full_open_options && s->test_file->full_open_options) {
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
QINCREF(s->test_file->full_open_options);
qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
bs->full_open_options = opts;
}
if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->exact_filename, s->test_file->exact_filename);
}
}
static BlockDriver bdrv_blkverify = {
.format_name = "blkverify",
.protocol_name = "blkverify",
@@ -329,6 +339,7 @@ static BlockDriver bdrv_blkverify = {
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,

631
block/block-backend.c Normal file
View File

@@ -0,0 +1,631 @@
/*
* QEMU Block backends
*
* Copyright (C) 2014 Red Hat, Inc.
*
* Authors:
* Markus Armbruster <armbru@redhat.com>,
*
* This work is licensed under the terms of the GNU LGPL, version 2.1
* or later. See the COPYING.LIB file in the top-level directory.
*/
#include "sysemu/block-backend.h"
#include "block/block_int.h"
#include "sysemu/blockdev.h"
#include "qapi-event.h"
/* Number of coroutines to reserve per attached device model */
#define COROUTINE_POOL_RESERVATION 64
struct BlockBackend {
char *name;
int refcnt;
BlockDriverState *bs;
DriveInfo *legacy_dinfo; /* null unless created by drive_new() */
QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
void *dev; /* attached device model, if any */
/* TODO change to DeviceState when all users are qdevified */
const BlockDevOps *dev_ops;
void *dev_opaque;
};
static void drive_info_del(DriveInfo *dinfo);
/* All the BlockBackends (except for hidden ones) */
static QTAILQ_HEAD(, BlockBackend) blk_backends =
QTAILQ_HEAD_INITIALIZER(blk_backends);
/*
* Create a new BlockBackend with @name, with a reference count of one.
* @name must not be null or empty.
* Fail if a BlockBackend with this name already exists.
* Store an error through @errp on failure, unless it's null.
* Return the new BlockBackend on success, null on failure.
*/
BlockBackend *blk_new(const char *name, Error **errp)
{
BlockBackend *blk;
assert(name && name[0]);
if (!id_wellformed(name)) {
error_setg(errp, "Invalid device name");
return NULL;
}
if (blk_by_name(name)) {
error_setg(errp, "Device with id '%s' already exists", name);
return NULL;
}
if (bdrv_find_node(name)) {
error_setg(errp,
"Device name '%s' conflicts with an existing node name",
name);
return NULL;
}
blk = g_new0(BlockBackend, 1);
blk->name = g_strdup(name);
blk->refcnt = 1;
QTAILQ_INSERT_TAIL(&blk_backends, blk, link);
return blk;
}
/*
* Create a new BlockBackend with a new BlockDriverState attached.
* Otherwise just like blk_new(), which see.
*/
BlockBackend *blk_new_with_bs(const char *name, Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
blk = blk_new(name, errp);
if (!blk) {
return NULL;
}
bs = bdrv_new_root();
blk->bs = bs;
bs->blk = blk;
return blk;
}
static void blk_delete(BlockBackend *blk)
{
assert(!blk->refcnt);
assert(!blk->dev);
if (blk->bs) {
assert(blk->bs->blk == blk);
blk->bs->blk = NULL;
bdrv_unref(blk->bs);
blk->bs = NULL;
}
/* Avoid double-remove after blk_hide_on_behalf_of_do_drive_del() */
if (blk->name[0]) {
QTAILQ_REMOVE(&blk_backends, blk, link);
}
g_free(blk->name);
drive_info_del(blk->legacy_dinfo);
g_free(blk);
}
static void drive_info_del(DriveInfo *dinfo)
{
if (!dinfo) {
return;
}
qemu_opts_del(dinfo->opts);
g_free(dinfo->serial);
g_free(dinfo);
}
/*
* Increment @blk's reference count.
* @blk must not be null.
*/
void blk_ref(BlockBackend *blk)
{
blk->refcnt++;
}
/*
* Decrement @blk's reference count.
* If this drops it to zero, destroy @blk.
* For convenience, do nothing if @blk is null.
*/
void blk_unref(BlockBackend *blk)
{
if (blk) {
assert(blk->refcnt > 0);
if (!--blk->refcnt) {
blk_delete(blk);
}
}
}
/*
* Return the BlockBackend after @blk.
* If @blk is null, return the first one.
* Else, return @blk's next sibling, which may be null.
*
* To iterate over all BlockBackends, do
* for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
* ...
* }
*/
BlockBackend *blk_next(BlockBackend *blk)
{
return blk ? QTAILQ_NEXT(blk, link) : QTAILQ_FIRST(&blk_backends);
}
/*
* Return @blk's name, a non-null string.
* Wart: the name is empty iff @blk has been hidden with
* blk_hide_on_behalf_of_do_drive_del().
*/
const char *blk_name(BlockBackend *blk)
{
return blk->name;
}
/*
* Return the BlockBackend with name @name if it exists, else null.
* @name must not be null.
*/
BlockBackend *blk_by_name(const char *name)
{
BlockBackend *blk;
assert(name);
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (!strcmp(name, blk->name)) {
return blk;
}
}
return NULL;
}
/*
* Return the BlockDriverState attached to @blk if any, else null.
*/
BlockDriverState *blk_bs(BlockBackend *blk)
{
return blk->bs;
}
/*
* Return @blk's DriveInfo if any, else null.
*/
DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
{
return blk->legacy_dinfo;
}
/*
* Set @blk's DriveInfo to @dinfo, and return it.
* @blk must not have a DriveInfo set already.
* No other BlockBackend may have the same DriveInfo set.
*/
DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo)
{
assert(!blk->legacy_dinfo);
return blk->legacy_dinfo = dinfo;
}
/*
* Return the BlockBackend with DriveInfo @dinfo.
* It must exist.
*/
BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
{
BlockBackend *blk;
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (blk->legacy_dinfo == dinfo) {
return blk;
}
}
abort();
}
/*
* Hide @blk.
* @blk must not have been hidden already.
* Make attached BlockDriverState, if any, anonymous.
* Once hidden, @blk is invisible to all functions that don't receive
* it as argument. For example, blk_by_name() won't return it.
* Strictly for use by do_drive_del().
* TODO get rid of it!
*/
void blk_hide_on_behalf_of_do_drive_del(BlockBackend *blk)
{
QTAILQ_REMOVE(&blk_backends, blk, link);
blk->name[0] = 0;
if (blk->bs) {
bdrv_make_anon(blk->bs);
}
}
/*
* Attach device model @dev to @blk.
* Return 0 on success, -EBUSY when a device model is attached already.
*/
int blk_attach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
if (blk->dev) {
return -EBUSY;
}
blk_ref(blk);
blk->dev = dev;
bdrv_iostatus_reset(blk->bs);
/* We're expecting I/O from the device so bump up coroutine pool size */
qemu_coroutine_adjust_pool_size(COROUTINE_POOL_RESERVATION);
return 0;
}
/*
* Attach device model @dev to @blk.
* @blk must not have a device model attached already.
* TODO qdevified devices don't use this, remove when devices are qdevified
*/
void blk_attach_dev_nofail(BlockBackend *blk, void *dev)
{
if (blk_attach_dev(blk, dev) < 0) {
abort();
}
}
/*
* Detach device model @dev from @blk.
* @dev must be currently attached to @blk.
*/
void blk_detach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
assert(blk->dev == dev);
blk->dev = NULL;
blk->dev_ops = NULL;
blk->dev_opaque = NULL;
bdrv_set_guest_block_size(blk->bs, 512);
qemu_coroutine_adjust_pool_size(-COROUTINE_POOL_RESERVATION);
blk_unref(blk);
}
/*
* Return the device model attached to @blk if any, else null.
*/
void *blk_get_attached_dev(BlockBackend *blk)
/* TODO change to return DeviceState * when all users are qdevified */
{
return blk->dev;
}
/*
* Set @blk's device model callbacks to @ops.
* @opaque is the opaque argument to pass to the callbacks.
* This is for use by device models.
*/
void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
void *opaque)
{
blk->dev_ops = ops;
blk->dev_opaque = opaque;
}
/*
* Notify @blk's attached device model of media change.
* If @load is true, notify of media load.
* Else, notify of media eject.
* Also send DEVICE_TRAY_MOVED events as appropriate.
*/
void blk_dev_change_media_cb(BlockBackend *blk, bool load)
{
if (blk->dev_ops && blk->dev_ops->change_media_cb) {
bool tray_was_closed = !blk_dev_is_tray_open(blk);
blk->dev_ops->change_media_cb(blk->dev_opaque, load);
if (tray_was_closed) {
/* tray open */
qapi_event_send_device_tray_moved(blk_name(blk),
true, &error_abort);
}
if (load) {
/* tray close */
qapi_event_send_device_tray_moved(blk_name(blk),
false, &error_abort);
}
}
}
/*
* Does @blk's attached device model have removable media?
* %true if no device model is attached.
*/
bool blk_dev_has_removable_media(BlockBackend *blk)
{
return !blk->dev || (blk->dev_ops && blk->dev_ops->change_media_cb);
}
/*
* Notify @blk's attached device model of a media eject request.
* If @force is true, the medium is about to be yanked out forcefully.
*/
void blk_dev_eject_request(BlockBackend *blk, bool force)
{
if (blk->dev_ops && blk->dev_ops->eject_request_cb) {
blk->dev_ops->eject_request_cb(blk->dev_opaque, force);
}
}
/*
* Does @blk's attached device model have a tray, and is it open?
*/
bool blk_dev_is_tray_open(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_tray_open) {
return blk->dev_ops->is_tray_open(blk->dev_opaque);
}
return false;
}
/*
* Does @blk's attached device model have the medium locked?
* %false if the device model has no such lock.
*/
bool blk_dev_is_medium_locked(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_medium_locked) {
return blk->dev_ops->is_medium_locked(blk->dev_opaque);
}
return false;
}
/*
* Notify @blk's attached device model of a backend size change.
*/
void blk_dev_resize_cb(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->resize_cb) {
blk->dev_ops->resize_cb(blk->dev_opaque);
}
}
void blk_iostatus_enable(BlockBackend *blk)
{
bdrv_iostatus_enable(blk->bs);
}
int blk_read(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
return bdrv_read(blk->bs, sector_num, buf, nb_sectors);
}
int blk_read_unthrottled(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
return bdrv_read_unthrottled(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write(BlockBackend *blk, int64_t sector_num, const uint8_t *buf,
int nb_sectors)
{
return bdrv_write(blk->bs, sector_num, buf, nb_sectors);
}
BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
cb, opaque);
}
int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
{
return bdrv_pread(blk->bs, offset, buf, count);
}
int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int count)
{
return bdrv_pwrite(blk->bs, offset, buf, count);
}
int64_t blk_getlength(BlockBackend *blk)
{
return bdrv_getlength(blk->bs);
}
void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr)
{
bdrv_get_geometry(blk->bs, nb_sectors_ptr);
}
BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_flush(BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_flush(blk->bs, cb, opaque);
}
BlockAIOCB *blk_aio_discard(BlockBackend *blk,
int64_t sector_num, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
}
void blk_aio_cancel(BlockAIOCB *acb)
{
bdrv_aio_cancel(acb);
}
void blk_aio_cancel_async(BlockAIOCB *acb)
{
bdrv_aio_cancel_async(acb);
}
int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs)
{
return bdrv_aio_multiwrite(blk->bs, reqs, num_reqs);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
{
return bdrv_ioctl(blk->bs, req, buf);
}
BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
}
int blk_flush(BlockBackend *blk)
{
return bdrv_flush(blk->bs);
}
int blk_flush_all(void)
{
return bdrv_flush_all();
}
void blk_drain_all(void)
{
bdrv_drain_all();
}
BlockdevOnError blk_get_on_error(BlockBackend *blk, bool is_read)
{
return bdrv_get_on_error(blk->bs, is_read);
}
BlockErrorAction blk_get_error_action(BlockBackend *blk, bool is_read,
int error)
{
return bdrv_get_error_action(blk->bs, is_read, error);
}
void blk_error_action(BlockBackend *blk, BlockErrorAction action,
bool is_read, int error)
{
bdrv_error_action(blk->bs, action, is_read, error);
}
int blk_is_read_only(BlockBackend *blk)
{
return bdrv_is_read_only(blk->bs);
}
int blk_is_sg(BlockBackend *blk)
{
return bdrv_is_sg(blk->bs);
}
int blk_enable_write_cache(BlockBackend *blk)
{
return bdrv_enable_write_cache(blk->bs);
}
void blk_set_enable_write_cache(BlockBackend *blk, bool wce)
{
bdrv_set_enable_write_cache(blk->bs, wce);
}
int blk_is_inserted(BlockBackend *blk)
{
return bdrv_is_inserted(blk->bs);
}
void blk_lock_medium(BlockBackend *blk, bool locked)
{
bdrv_lock_medium(blk->bs, locked);
}
void blk_eject(BlockBackend *blk, bool eject_flag)
{
bdrv_eject(blk->bs, eject_flag);
}
int blk_get_flags(BlockBackend *blk)
{
return bdrv_get_flags(blk->bs);
}
void blk_set_guest_block_size(BlockBackend *blk, int align)
{
bdrv_set_guest_block_size(blk->bs, align);
}
void *blk_blockalign(BlockBackend *blk, size_t size)
{
return qemu_blockalign(blk ? blk->bs : NULL, size);
}
bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp)
{
return bdrv_op_is_blocked(blk->bs, op, errp);
}
void blk_op_unblock(BlockBackend *blk, BlockOpType op, Error *reason)
{
bdrv_op_unblock(blk->bs, op, reason);
}
void blk_op_block_all(BlockBackend *blk, Error *reason)
{
bdrv_op_block_all(blk->bs, reason);
}
void blk_op_unblock_all(BlockBackend *blk, Error *reason)
{
bdrv_op_unblock_all(blk->bs, reason);
}
AioContext *blk_get_aio_context(BlockBackend *blk)
{
return bdrv_get_aio_context(blk->bs);
}
void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
{
bdrv_set_aio_context(blk->bs, new_context);
}
void blk_io_plug(BlockBackend *blk)
{
bdrv_io_plug(blk->bs);
}
void blk_io_unplug(BlockBackend *blk)
{
bdrv_io_unplug(blk->bs);
}
BlockAcctStats *blk_get_stats(BlockBackend *blk)
{
return bdrv_get_stats(blk->bs);
}
void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return qemu_aio_get(aiocb_info, blk_bs(blk), cb, opaque);
}

View File

@@ -131,7 +131,11 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EFBIG;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
error_setg(errp, "Could not allocate memory for catalog");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);

View File

@@ -116,7 +116,12 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
"try increasing block size");
return -EINVAL;
}
s->offsets = g_malloc(offsets_size);
s->offsets = g_try_malloc(offsets_size);
if (s->offsets == NULL) {
error_setg(errp, "Could not allocate offsets table");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
@@ -158,8 +163,20 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_block = g_malloc(max_compressed_block_size + 1);
s->uncompressed_block = g_malloc(s->block_size);
s->compressed_block = g_try_malloc(max_compressed_block_size + 1);
if (s->compressed_block == NULL) {
error_setg(errp, "Could not allocate compressed_block");
ret = -ENOMEM;
goto fail;
}
s->uncompressed_block = g_try_malloc(s->block_size);
if (s->uncompressed_block == NULL) {
error_setg(errp, "Could not allocate uncompressed_block");
ret = -ENOMEM;
goto fail;
}
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;

View File

@@ -60,17 +60,50 @@ static int coroutine_fn commit_populate(BlockDriverState *bs,
return 0;
}
static void coroutine_fn commit_run(void *opaque)
typedef struct {
int ret;
} CommitCompleteData;
static void commit_complete(BlockJob *job, void *opaque)
{
CommitBlockJob *s = opaque;
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *overlay_bs;
int ret = data->ret;
if (!block_job_is_cancelled(&s->common) && ret == 0) {
/* success */
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
g_free(data);
}
static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
int64_t sector_num, end;
int ret = 0;
int n = 0;
void *buf;
void *buf = NULL;
int bytes_written = 0;
int64_t base_len;
@@ -78,18 +111,18 @@ static void coroutine_fn commit_run(void *opaque)
if (s->common.len < 0) {
goto exit_restore_reopen;
goto out;
}
ret = base_len = bdrv_getlength(base);
if (base_len < 0) {
goto exit_restore_reopen;
goto out;
}
if (base_len < s->common.len) {
ret = bdrv_truncate(base, s->common.len);
if (ret) {
goto exit_restore_reopen;
goto out;
}
}
@@ -128,7 +161,7 @@ wait:
if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
s->on_error == BLOCKDEV_ON_ERROR_REPORT||
(s->on_error == BLOCKDEV_ON_ERROR_ENOSPC && ret == -ENOSPC)) {
goto exit_free_buf;
goto out;
} else {
n = 0;
continue;
@@ -140,27 +173,12 @@ wait:
ret = 0;
if (!block_job_is_cancelled(&s->common) && sector_num == end) {
/* success */
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
exit_free_buf:
out:
qemu_vfree(buf);
exit_restore_reopen:
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, commit_complete, data);
}
static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -182,7 +200,7 @@ static const BlockJobDriver commit_job_driver = {
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockDriverCompletionFunc *cb,
BlockdevOnError on_error, BlockCompletionFunc *cb,
void *opaque, const char *backing_file_str, Error **errp)
{
CommitBlockJob *s;

View File

@@ -1,432 +0,0 @@
/*
* Block driver for the COW format
*
* Copyright (c) 2004 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
/**************************************************************/
/* COW block driver using file system holes */
/* user mode linux compatible COW file */
#define COW_MAGIC 0x4f4f4f4d /* MOOO */
#define COW_VERSION 2
struct cow_header_v2 {
uint32_t magic;
uint32_t version;
char backing_file[1024];
int32_t mtime;
uint64_t size;
uint32_t sectorsize;
};
typedef struct BDRVCowState {
CoMutex lock;
int64_t cow_sectors_offset;
} BDRVCowState;
static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const struct cow_header_v2 *cow_header = (const void *)buf;
if (buf_size >= sizeof(struct cow_header_v2) &&
be32_to_cpu(cow_header->magic) == COW_MAGIC &&
be32_to_cpu(cow_header->version) == COW_VERSION)
return 100;
else
return 0;
}
static int cow_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVCowState *s = bs->opaque;
struct cow_header_v2 cow_header;
int bitmap_size;
int64_t size;
int ret;
/* see if it is a cow image */
ret = bdrv_pread(bs->file, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto fail;
}
if (be32_to_cpu(cow_header.magic) != COW_MAGIC) {
error_setg(errp, "Image not in COW format");
ret = -EINVAL;
goto fail;
}
if (be32_to_cpu(cow_header.version) != COW_VERSION) {
char version[64];
snprintf(version, sizeof(version),
"COW version %" PRIu32, cow_header.version);
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "cow", version);
ret = -ENOTSUP;
goto fail;
}
/* cow image found */
size = be64_to_cpu(cow_header.size);
bs->total_sectors = size / 512;
pstrcpy(bs->backing_file, sizeof(bs->backing_file),
cow_header.backing_file);
bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header);
s->cow_sectors_offset = (bitmap_size + 511) & ~511;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
return ret;
}
static inline void cow_set_bits(uint8_t *bitmap, int start, int64_t nb_sectors)
{
int64_t bitnum = start, last = start + nb_sectors;
while (bitnum < last) {
if ((bitnum & 7) == 0 && bitnum + 8 <= last) {
bitmap[bitnum / 8] = 0xFF;
bitnum += 8;
continue;
}
bitmap[bitnum/8] |= (1 << (bitnum % 8));
bitnum++;
}
}
#define BITS_PER_BITMAP_SECTOR (512 * 8)
/* Cannot use bitmap.c on big-endian machines. */
static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
{
return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
}
static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
{
int streak_value = value ? 0xFF : 0;
int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
int bitnum = start;
while (bitnum < last) {
if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
bitnum += 8;
continue;
}
if (cow_test_bit(bitnum, bitmap) == value) {
bitnum++;
continue;
}
break;
}
return MIN(bitnum, last) - start;
}
/* Return true if first block has been changed (ie. current version is
* in COW file). Set the number of continuous blocks for which that
* is true. */
static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
bool first = true;
int changed = 0, same = 0;
do {
int ret;
uint8_t bitmap[BDRV_SECTOR_SIZE];
bitnum &= BITS_PER_BITMAP_SECTOR - 1;
int sector_bits = MIN(nb_sectors, BITS_PER_BITMAP_SECTOR - bitnum);
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
if (first) {
changed = cow_test_bit(bitnum, bitmap);
first = false;
}
same += cow_find_streak(bitmap, changed, bitnum, nb_sectors);
bitnum += sector_bits;
nb_sectors -= sector_bits;
offset += BDRV_SECTOR_SIZE;
} while (nb_sectors);
*num_same = same;
return changed;
}
static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
BDRVCowState *s = bs->opaque;
int ret = cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
int64_t offset = s->cow_sectors_offset + (sector_num << BDRV_SECTOR_BITS);
if (ret < 0) {
return ret;
}
return (ret ? BDRV_BLOCK_DATA : 0) | offset | BDRV_BLOCK_OFFSET_VALID;
}
static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
bool first = true;
int sector_bits;
for ( ; nb_sectors;
bitnum += sector_bits,
nb_sectors -= sector_bits,
offset += BDRV_SECTOR_SIZE) {
int ret, set;
uint8_t bitmap[BDRV_SECTOR_SIZE];
bitnum &= BITS_PER_BITMAP_SECTOR - 1;
sector_bits = MIN(nb_sectors, BITS_PER_BITMAP_SECTOR - bitnum);
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
/* Skip over any already set bits */
set = cow_find_streak(bitmap, 1, bitnum, sector_bits);
bitnum += set;
sector_bits -= set;
nb_sectors -= set;
if (!sector_bits) {
continue;
}
if (first) {
ret = bdrv_flush(bs->file);
if (ret < 0) {
return ret;
}
first = false;
}
cow_set_bits(bitmap, bitnum, sector_bits);
ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
}
return 0;
}
static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret, n;
while (nb_sectors > 0) {
ret = cow_co_is_allocated(bs, sector_num, nb_sectors, &n);
if (ret < 0) {
return ret;
}
if (ret) {
ret = bdrv_pread(bs->file,
s->cow_sectors_offset + sector_num * 512,
buf, n * 512);
if (ret < 0) {
return ret;
}
} else {
if (bs->backing_hd) {
/* read from the base image */
ret = bdrv_read(bs->backing_hd, sector_num, buf, n);
if (ret < 0) {
return ret;
}
} else {
memset(buf, 0, n * 512);
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
}
return 0;
}
static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int cow_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret;
ret = bdrv_pwrite(bs->file, s->cow_sectors_offset + sector_num * 512,
buf, nb_sectors * 512);
if (ret < 0) {
return ret;
}
return cow_update_bitmap(bs, sector_num, nb_sectors);
}
static coroutine_fn int cow_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_write(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QemuOpts *opts, Error **errp)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
char *image_filename = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *cow_bs = NULL;
/* Read out options */
image_sectors = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
image_filename = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
ret = bdrv_open(&cow_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
memset(&cow_header, 0, sizeof(cow_header));
cow_header.magic = cpu_to_be32(COW_MAGIC);
cow_header.version = cpu_to_be32(COW_VERSION);
if (image_filename) {
/* Note: if no file, we put a dummy mtime */
cow_header.mtime = cpu_to_be32(0);
if (stat(image_filename, &st) != 0) {
goto mtime_fail;
}
cow_header.mtime = cpu_to_be32(st.st_mtime);
mtime_fail:
pstrcpy(cow_header.backing_file, sizeof(cow_header.backing_file),
image_filename);
}
cow_header.sectorsize = cpu_to_be32(512);
cow_header.size = cpu_to_be64(image_sectors * 512);
ret = bdrv_pwrite(cow_bs, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto exit;
}
/* resize to include at least all the bitmap */
ret = bdrv_truncate(cow_bs,
sizeof(cow_header) + ((image_sectors + 7) >> 3));
if (ret < 0) {
goto exit;
}
exit:
g_free(image_filename);
if (cow_bs) {
bdrv_unref(cow_bs);
}
return ret;
}
static QemuOptsList cow_create_opts = {
.name = "cow-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(cow_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_cow = {
.format_name = "cow",
.instance_size = sizeof(BDRVCowState),
.bdrv_probe = cow_probe,
.bdrv_open = cow_open,
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_get_block_status = cow_co_get_block_status,
.create_opts = &cow_create_opts,
};
static void bdrv_cow_init(void)
{
bdrv_register(&bdrv_cow);
}
block_init(bdrv_cow_init);

View File

@@ -26,7 +26,7 @@
#include "qapi/qmp/qbool.h"
#include <curl/curl.h>
// #define DEBUG
// #define DEBUG_CURL
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
@@ -63,6 +63,8 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_NUM_ACB 8
#define SECTOR_SIZE 512
#define READ_AHEAD_DEFAULT (256 * 1024)
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
@@ -71,11 +73,13 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
struct BDRVCURLState;
typedef struct CURLAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUBH *bh;
QEMUIOVector *qiov;
@@ -109,6 +113,8 @@ typedef struct BDRVCURLState {
char *url;
size_t readahead_size;
bool sslverify;
uint64_t timeout;
char *cookie;
bool accept_range;
AioContext *aio_context;
} BDRVCURLState;
@@ -207,7 +213,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
acb->common.cb(acb->common.opaque, 0);
qemu_aio_release(acb);
qemu_aio_unref(acb);
s->acb[i] = NULL;
}
}
@@ -299,7 +305,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
qemu_aio_unref(acb);
state->acb[i] = NULL;
}
}
@@ -352,7 +358,7 @@ static void curl_multi_timeout_do(void *arg)
#endif
}
static CURLState *curl_init_state(BDRVCURLState *s)
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
@@ -370,7 +376,7 @@ static CURLState *curl_init_state(BDRVCURLState *s)
break;
}
if (!state) {
aio_poll(state->s->aio_context, true);
aio_poll(bdrv_get_aio_context(bs), true);
}
} while(!state);
@@ -382,7 +388,10 @@ static CURLState *curl_init_state(BDRVCURLState *s)
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
(long) s->sslverify);
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, 5);
if (s->cookie) {
curl_easy_setopt(state->curl, CURLOPT_COOKIE, s->cookie);
}
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, (long)s->timeout);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION,
(void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
@@ -489,6 +498,16 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Verify SSL certificate"
},
{
.name = CURL_BLOCK_OPT_TIMEOUT,
.type = QEMU_OPT_NUMBER,
.help = "Curl timeout"
},
{
.name = CURL_BLOCK_OPT_COOKIE,
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{ /* end of list */ }
},
};
@@ -501,6 +520,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
QemuOpts *opts;
Error *local_err = NULL;
const char *file;
const char *cookie;
double d;
static int inited = 0;
@@ -525,8 +545,18 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
CURL_TIMEOUT_DEFAULT);
if (s->timeout > CURL_TIMEOUT_MAX) {
error_setg(errp, "timeout parameter is too large or negative");
goto out_noclean;
}
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
error_setg(errp, "curl block driver requires an 'url' option");
@@ -541,7 +571,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
DPRINTF("CURL: Opening %s\n", file);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(s);
state = curl_init_state(bs, s);
if (!state)
goto out_noclean;
@@ -582,19 +612,14 @@ out:
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
return -EINVAL;
}
static void curl_aio_cancel(BlockDriverAIOCB *blockacb)
{
// Do we have to implement canceling? Seems to work without...
}
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
.cancel = curl_aio_cancel,
};
@@ -616,7 +641,7 @@ static void curl_readv_bh_cb(void *p)
// we can just call the callback and be done.
switch (curl_find_buf(s, start, acb->nb_sectors * SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_release(acb);
qemu_aio_unref(acb);
// fall through
case FIND_RET_WAIT:
return;
@@ -625,10 +650,10 @@ static void curl_readv_bh_cb(void *p)
}
// No cache found, so let's start a new request
state = curl_init_state(s);
state = curl_init_state(acb->common.bs, s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
qemu_aio_unref(acb);
return;
}
@@ -640,7 +665,13 @@ static void curl_readv_bh_cb(void *p)
state->buf_start = start;
state->buf_len = acb->end + s->readahead_size;
end = MIN(start + state->buf_len, s->len) - 1;
state->orig_buf = g_malloc(state->buf_len);
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_unref(acb);
return;
}
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
@@ -654,9 +685,9 @@ static void curl_readv_bh_cb(void *p)
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
}
static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
CURLAIOCB *acb;
@@ -678,6 +709,7 @@ static void curl_close(BlockDriverState *bs)
DPRINTF("CURL: Close\n");
curl_detach_aio_context(bs);
g_free(s->cookie);
g_free(s->url);
}

View File

@@ -284,8 +284,15 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_chunk = g_malloc(max_compressed_size + 1);
s->uncompressed_chunk = g_malloc(512 * max_sectors_per_chunk);
s->compressed_chunk = qemu_try_blockalign(bs->file,
max_compressed_size + 1);
s->uncompressed_chunk = qemu_try_blockalign(bs->file,
512 * max_sectors_per_chunk);
if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
ret = -ENOMEM;
goto fail;
}
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
@@ -302,8 +309,8 @@ fail:
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
return ret;
}
@@ -426,8 +433,8 @@ static void dmg_close(BlockDriverState *bs)
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
inflateEnd(&s->zstream);
}

View File

@@ -291,7 +291,7 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
BDRVGlusterState *s = bs->opaque;
int open_flags = 0;
int ret = 0;
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
GlusterConf *gconf = g_new0(GlusterConf, 1);
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
@@ -351,12 +351,12 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
assert(state != NULL);
assert(state->bs != NULL);
state->opaque = g_malloc0(sizeof(BDRVGlusterReopenState));
state->opaque = g_new0(BDRVGlusterReopenState, 1);
reop_s = state->opaque;
qemu_gluster_parse_flags(state->flags, &open_flags);
gconf = g_malloc0(sizeof(GlusterConf));
gconf = g_new0(GlusterConf, 1);
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
@@ -486,7 +486,7 @@ static int qemu_gluster_create(const char *filename,
int prealloc = 0;
int64_t total_size = 0;
char *tmp = NULL;
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
GlusterConf *gconf = g_new0(GlusterConf, 1);
glfs = qemu_gluster_init(gconf, filename, errp);
if (!glfs) {
@@ -494,8 +494,8 @@ static int qemu_gluster_create(const char *filename,
goto out;
}
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
@@ -516,9 +516,8 @@ static int qemu_gluster_create(const char *filename,
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE)) {
if (prealloc && qemu_gluster_zerofill(fd, 0,
total_size * BDRV_SECTOR_SIZE)) {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
ret = -errno;
}
} else {

View File

@@ -34,7 +34,6 @@
#include "qemu/bitops.h"
#include "qemu/bitmap.h"
#include "block/block_int.h"
#include "trace.h"
#include "block/scsi.h"
#include "qemu/iov.h"
#include "sysemu/sysemu.h"
@@ -81,14 +80,13 @@ typedef struct IscsiTask {
} IscsiTask;
typedef struct IscsiAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUIOVector *qiov;
QEMUBH *bh;
IscsiLun *iscsilun;
struct scsi_task *task;
uint8_t *buf;
int status;
int canceled;
int64_t sector_num;
int nb_sectors;
#ifdef __linux__
@@ -120,16 +118,14 @@ iscsi_bh_cb(void *p)
g_free(acb->buf);
acb->buf = NULL;
if (acb->canceled == 0) {
acb->common.cb(acb->common.opaque, acb->status);
}
acb->common.cb(acb->common.opaque, acb->status);
if (acb->task != NULL) {
scsi_free_scsi_task(acb->task);
acb->task = NULL;
}
qemu_aio_release(acb);
qemu_aio_unref(acb);
}
static void
@@ -231,7 +227,7 @@ iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
}
static void
iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
iscsi_aio_cancel(BlockAIOCB *blockacb)
{
IscsiAIOCB *acb = (IscsiAIOCB *)blockacb;
IscsiLun *iscsilun = acb->iscsilun;
@@ -240,20 +236,15 @@ iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
return;
}
acb->canceled = 1;
/* send a task mgmt call to the target to cancel the task on the target */
iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
iscsi_abort_task_cb, acb);
while (acb->status == -EINPROGRESS) {
aio_poll(iscsilun->aio_context, true);
}
}
static const AIOCBInfo iscsi_aiocb_info = {
.aiocb_size = sizeof(IscsiAIOCB),
.cancel = iscsi_aio_cancel,
.cancel_async = iscsi_aio_cancel,
};
@@ -325,6 +316,13 @@ static bool is_request_lun_aligned(int64_t sector_num, int nb_sectors,
return 1;
}
static unsigned long *iscsi_allocationmap_init(IscsiLun *iscsilun)
{
return bitmap_try_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
iscsilun),
iscsilun->cluster_sectors));
}
static void iscsi_allocationmap_set(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
{
@@ -364,6 +362,12 @@ static int coroutine_fn iscsi_co_writev(BlockDriverState *bs,
return -EINVAL;
}
if (bs->bl.max_transfer_length && nb_sectors > bs->bl.max_transfer_length) {
error_report("iSCSI Error: Write of %d sectors exceeds max_xfer_len "
"of %d sectors", nb_sectors, bs->bl.max_transfer_length);
return -EINVAL;
}
lba = sector_qemu2lun(sector_num, iscsilun);
num_sectors = sector_qemu2lun(nb_sectors, iscsilun);
iscsi_co_init_iscsitask(iscsilun, &iTask);
@@ -531,6 +535,12 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
return -EINVAL;
}
if (bs->bl.max_transfer_length && nb_sectors > bs->bl.max_transfer_length) {
error_report("iSCSI Error: Read of %d sectors exceeds max_xfer_len "
"of %d sectors", nb_sectors, bs->bl.max_transfer_length);
return -EINVAL;
}
if (iscsilun->lbprz && nb_sectors >= ISCSI_CHECKALLOC_THRES &&
!iscsi_allocationmap_is_allocated(iscsilun, sector_num, nb_sectors)) {
int64_t ret;
@@ -638,10 +648,6 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
g_free(acb->buf);
acb->buf = NULL;
if (acb->canceled != 0) {
return;
}
acb->status = 0;
if (status < 0) {
error_report("Failed to ioctl(SG_IO) to iSCSI lun. %s",
@@ -669,9 +675,9 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
iscsi_schedule_bh(acb);
}
static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
IscsiLun *iscsilun = bs->opaque;
struct iscsi_context *iscsi = iscsilun->iscsi;
@@ -683,7 +689,6 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
acb = qemu_aio_get(&iscsi_aiocb_info, bs, cb, opaque);
acb->iscsilun = iscsilun;
acb->canceled = 0;
acb->bh = NULL;
acb->status = -EINPROGRESS;
acb->buf = NULL;
@@ -693,7 +698,7 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
if (acb->task == NULL) {
error_report("iSCSI: Failed to allocate task for scsi command. %s",
iscsi_get_error(iscsi));
qemu_aio_release(acb);
qemu_aio_unref(acb);
return NULL;
}
memset(acb->task, 0, sizeof(struct scsi_task));
@@ -731,7 +736,7 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
(data.size > 0) ? &data : NULL,
acb) != 0) {
scsi_free_scsi_task(acb->task);
qemu_aio_release(acb);
qemu_aio_unref(acb);
return NULL;
}
@@ -893,7 +898,10 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
if (iscsilun->zeroblock == NULL) {
iscsilun->zeroblock = g_malloc0(iscsilun->block_size);
iscsilun->zeroblock = g_try_malloc0(iscsilun->block_size);
if (iscsilun->zeroblock == NULL) {
return -ENOMEM;
}
}
iscsi_co_init_iscsitask(iscsilun, &iTask);
@@ -1223,6 +1231,40 @@ static void iscsi_attach_aio_context(BlockDriverState *bs,
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
}
static bool iscsi_is_write_protected(IscsiLun *iscsilun)
{
struct scsi_task *task;
struct scsi_mode_sense *ms = NULL;
bool wrprotected = false;
task = iscsi_modesense6_sync(iscsilun->iscsi, iscsilun->lun,
1, SCSI_MODESENSE_PC_CURRENT,
0x3F, 0, 255);
if (task == NULL) {
error_report("iSCSI: Failed to send MODE_SENSE(6) command: %s",
iscsi_get_error(iscsilun->iscsi));
goto out;
}
if (task->status != SCSI_STATUS_GOOD) {
error_report("iSCSI: Failed MODE_SENSE(6), LUN assumed writable");
goto out;
}
ms = scsi_datain_unmarshall(task);
if (!ms) {
error_report("iSCSI: Failed to unmarshall MODE_SENSE(6) data: %s",
iscsi_get_error(iscsilun->iscsi));
goto out;
}
wrprotected = ms->device_specific_parameter & 0x80;
out:
if (task) {
scsi_free_scsi_task(task);
}
return wrprotected;
}
/*
* We support iscsi url's on the form
* iscsi://[<username>%<password>@]<host>[:<port>]/<targetname>/<lun>
@@ -1343,6 +1385,14 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
scsi_free_scsi_task(task);
task = NULL;
/* Check the write protect flag of the LUN if we want to write */
if (iscsilun->type == TYPE_DISK && (flags & BDRV_O_RDWR) &&
iscsi_is_write_protected(iscsilun)) {
error_setg(errp, "Cannot open a write protected LUN as read-write");
ret = -EACCES;
goto out;
}
iscsi_readcapacity_sync(iscsilun, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
@@ -1413,9 +1463,10 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
iscsilun->cluster_sectors = (iscsilun->bl.opt_unmap_gran *
iscsilun->block_size) >> BDRV_SECTOR_BITS;
if (iscsilun->lbprz && !(bs->open_flags & BDRV_O_NOCACHE)) {
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(bs->total_sectors,
iscsilun->cluster_sectors));
iscsilun->allocationmap = iscsi_allocationmap_init(iscsilun);
if (iscsilun->allocationmap == NULL) {
ret = -ENOMEM;
}
}
}
@@ -1450,31 +1501,44 @@ static void iscsi_close(BlockDriverState *bs)
memset(iscsilun, 0, sizeof(IscsiLun));
}
static int sector_limits_lun2qemu(int64_t sector, IscsiLun *iscsilun)
{
return MIN(sector_lun2qemu(sector, iscsilun), INT_MAX / 2 + 1);
}
static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
/* We don't actually refresh here, but just return data queried in
* iscsi_open(): iscsi targets don't change their limits. */
IscsiLun *iscsilun = bs->opaque;
uint32_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;
if (iscsilun->bl.max_xfer_len) {
max_xfer_len = MIN(max_xfer_len, iscsilun->bl.max_xfer_len);
}
bs->bl.max_transfer_length = sector_limits_lun2qemu(max_xfer_len, iscsilun);
if (iscsilun->lbp.lbpu) {
if (iscsilun->bl.max_unmap < 0xffffffff) {
bs->bl.max_discard = sector_lun2qemu(iscsilun->bl.max_unmap,
iscsilun);
bs->bl.max_discard =
sector_limits_lun2qemu(iscsilun->bl.max_unmap, iscsilun);
}
bs->bl.discard_alignment = sector_lun2qemu(iscsilun->bl.opt_unmap_gran,
iscsilun);
bs->bl.discard_alignment =
sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
}
if (iscsilun->bl.max_ws_len < 0xffffffff) {
bs->bl.max_write_zeroes = sector_lun2qemu(iscsilun->bl.max_ws_len,
iscsilun);
bs->bl.max_write_zeroes =
sector_limits_lun2qemu(iscsilun->bl.max_ws_len, iscsilun);
}
if (iscsilun->lbp.lbpws) {
bs->bl.write_zeroes_alignment = sector_lun2qemu(iscsilun->bl.opt_unmap_gran,
iscsilun);
bs->bl.write_zeroes_alignment =
sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
}
bs->bl.opt_transfer_length = sector_lun2qemu(iscsilun->bl.opt_xfer_len,
iscsilun);
bs->bl.opt_transfer_length =
sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
}
/* Since iscsi_open() ignores bdrv_flags, there is nothing to do here in
@@ -1508,10 +1572,7 @@ static int iscsi_truncate(BlockDriverState *bs, int64_t offset)
if (iscsilun->allocationmap != NULL) {
g_free(iscsilun->allocationmap);
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
iscsilun),
iscsilun->cluster_sectors));
iscsilun->allocationmap = iscsi_allocationmap_init(iscsilun);
}
return 0;
@@ -1525,12 +1586,12 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
IscsiLun *iscsilun = NULL;
QDict *bs_options;
bs = bdrv_new("", &error_abort);
bs = bdrv_new();
/* Read out options */
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
bs->opaque = g_malloc0(sizeof(struct IscsiLun));
total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
bs->opaque = g_new0(struct IscsiLun, 1);
iscsilun = bs->opaque;
bs_options = qdict_new();

View File

@@ -28,7 +28,7 @@
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockDriverAIOCB common;
BlockAIOCB common;
struct qemu_laio_state *ctx;
struct iocb iocb;
ssize_t ret;
@@ -51,6 +51,12 @@ struct qemu_laio_state {
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
};
static inline ssize_t io_event_ret(struct io_event *ev)
@@ -79,72 +85,89 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
ret = -EINVAL;
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
}
qemu_aio_release(laiocb);
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
}
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
while (event_notifier_test_and_clear(&s->e)) {
struct io_event events[MAX_EVENTS];
struct timespec ts = { 0 };
int nevents, i;
do {
nevents = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS, events, &ts);
} while (nevents == -EINTR);
for (i = 0; i < nevents; i++) {
struct iocb *iocb = events[i].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&events[i]);
qemu_laio_process_completion(s, laiocb);
}
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
}
}
static void laio_cancel(BlockDriverAIOCB *blockacb)
static void laio_cancel(BlockAIOCB *blockacb)
{
struct qemu_laiocb *laiocb = (struct qemu_laiocb *)blockacb;
struct io_event event;
int ret;
if (laiocb->ret != -EINPROGRESS)
if (laiocb->ret != -EINPROGRESS) {
return;
/*
* Note that as of Linux 2.6.31 neither the block device code nor any
* filesystem implements cancellation of AIO request.
* Thus the polling loop below is the normal code path.
*/
}
ret = io_cancel(laiocb->ctx->ctx, &laiocb->iocb, &event);
if (ret == 0) {
laiocb->ret = -ECANCELED;
laiocb->ret = -ECANCELED;
if (ret != 0) {
/* iocb is not cancelled, cb will be called by the event loop later */
return;
}
/*
* We have to wait for the iocb to finish.
*
* The only way to get the iocb status update is by polling the io context.
* We might be able to do this slightly more optimal by removing the
* O_NONBLOCK flag.
*/
while (laiocb->ret == -EINPROGRESS) {
qemu_laio_completion_cb(&laiocb->ctx->e);
}
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
}
static const AIOCBInfo laio_aiocb_info = {
.aiocb_size = sizeof(struct qemu_laiocb),
.cancel = laio_cancel,
.cancel_async = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
@@ -220,9 +243,9 @@ int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
return ret;
}
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
BlockCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laio_state *s = aio_ctx;
struct qemu_laiocb *laiocb;
@@ -263,7 +286,7 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
return &laiocb->common;
out_free_aiocb:
qemu_aio_release(laiocb);
qemu_aio_unref(laiocb);
return NULL;
}
@@ -272,12 +295,14 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
}

View File

@@ -45,6 +45,7 @@ typedef struct MirrorBlockJob {
int64_t sector_num;
int64_t granularity;
size_t buf_size;
int64_t bdev_length;
unsigned long *cow_bitmap;
BdrvDirtyBitmap *dirty_bitmap;
HBitmapIter hbi;
@@ -54,6 +55,7 @@ typedef struct MirrorBlockJob {
unsigned long *in_flight_bitmap;
int in_flight;
int sectors_in_flight;
int ret;
} MirrorBlockJob;
@@ -87,6 +89,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
s->in_flight--;
s->sectors_in_flight -= op->nb_sectors;
iov = op->qiov.iov;
for (i = 0; i < op->qiov.niov; i++) {
MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
@@ -98,8 +101,11 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
chunk_num = op->sector_num / sectors_per_chunk;
nb_chunks = op->nb_sectors / sectors_per_chunk;
bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
if (s->cow_bitmap && ret >= 0) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
if (ret >= 0) {
if (s->cow_bitmap) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
}
s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
}
qemu_iovec_destroy(&op->qiov);
@@ -157,7 +163,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns;
uint64_t delay_ns = 0;
MirrorOp *op;
s->sector_num = hbitmap_iter_next(&s->hbi);
@@ -172,7 +178,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
hbitmap_next_sector = s->sector_num;
sector_num = s->sector_num;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
end = s->common.len >> BDRV_SECTOR_BITS;
end = s->bdev_length / BDRV_SECTOR_SIZE;
/* Extend the QEMUIOVector to include all adjacent blocks that will
* be copied in this operation.
@@ -247,8 +253,6 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
} else {
delay_ns = 0;
}
} while (delay_ns == 0 && next_sector < end);
@@ -286,6 +290,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
/* Copy the dirty cluster. */
s->in_flight++;
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
@@ -316,9 +321,56 @@ static void mirror_drain(MirrorBlockJob *s)
}
}
typedef struct {
int ret;
} MirrorExitData;
static void mirror_exit(BlockJob *job, void *opaque)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
MirrorExitData *data = opaque;
AioContext *replace_aio_context = NULL;
if (s->to_replace) {
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
}
if (s->should_complete && data->ret == 0) {
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
BlockDriverState *p = s->base->backing_hd;
bdrv_set_backing_hd(s->base, NULL);
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
if (replace_aio_context) {
aio_context_release(replace_aio_context);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, data->ret);
g_free(data);
}
static void coroutine_fn mirror_run(void *opaque)
{
MirrorBlockJob *s = opaque;
MirrorExitData *data;
BlockDriverState *bs = s->common.bs;
int64_t sector_num, end, sectors_per_chunk, length;
uint64_t last_pause_ns;
@@ -331,11 +383,11 @@ static void coroutine_fn mirror_run(void *opaque)
goto immediate_exit;
}
s->common.len = bdrv_getlength(bs);
if (s->common.len < 0) {
ret = s->common.len;
s->bdev_length = bdrv_getlength(bs);
if (s->bdev_length < 0) {
ret = s->bdev_length;
goto immediate_exit;
} else if (s->common.len == 0) {
} else if (s->bdev_length == 0) {
/* Report BLOCK_JOB_READY and wait for complete. */
block_job_event_ready(&s->common);
s->synced = true;
@@ -346,7 +398,7 @@ static void coroutine_fn mirror_run(void *opaque)
goto immediate_exit;
}
length = DIV_ROUND_UP(s->common.len, s->granularity);
length = DIV_ROUND_UP(s->bdev_length, s->granularity);
s->in_flight_bitmap = bitmap_new(length);
/* If we have no backing file yet in the destination, we cannot let
@@ -366,8 +418,13 @@ static void coroutine_fn mirror_run(void *opaque)
}
}
end = s->common.len >> BDRV_SECTOR_BITS;
s->buf = qemu_blockalign(bs, s->buf_size);
end = s->bdev_length / BDRV_SECTOR_SIZE;
s->buf = qemu_try_blockalign(bs, s->buf_size);
if (s->buf == NULL) {
ret = -ENOMEM;
goto immediate_exit;
}
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
mirror_free_init(s);
@@ -406,6 +463,12 @@ static void coroutine_fn mirror_run(void *opaque)
}
cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
/* s->common.offset contains the number of bytes already processed so
* far, cnt is the number of dirty sectors remaining and
* s->sectors_in_flight is the number of sectors currently being
* processed; together those are the current total operation length */
s->common.len = s->common.offset +
(cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
/* Note that even when no rate limit is applied we need to yield
* periodically with no pending I/O so that qemu_aio_flush() returns.
@@ -442,7 +505,6 @@ static void coroutine_fn mirror_run(void *opaque)
* report completion. This way, block-job-cancel will leave
* the target in a consistent state.
*/
s->common.offset = end * BDRV_SECTOR_SIZE;
if (!s->synced) {
block_job_event_ready(&s->common);
s->synced = true;
@@ -464,15 +526,13 @@ static void coroutine_fn mirror_run(void *opaque)
* mirror_populate runs.
*/
trace_mirror_before_drain(s, cnt);
bdrv_drain_all();
bdrv_drain(bs);
cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
}
ret = 0;
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
if (!s->synced) {
/* Publish progress */
s->common.offset = (end - cnt) * BDRV_SECTOR_SIZE;
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
@@ -507,31 +567,10 @@ immediate_exit:
g_free(s->in_flight_bitmap);
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
bdrv_iostatus_disable(s->target);
if (s->should_complete && ret == 0) {
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
BlockDriverState *p = s->base->backing_hd;
bdrv_set_backing_hd(s->base, NULL);
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, ret);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, mirror_exit, data);
}
static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -564,22 +603,30 @@ static void mirror_complete(BlockJob *job, Error **errp)
return;
}
if (!s->synced) {
error_set(errp, QERR_BLOCK_JOB_NOT_READY, job->bs->device_name);
error_set(errp, QERR_BLOCK_JOB_NOT_READY,
bdrv_get_device_name(job->bs));
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->replaces) {
AioContext *replace_aio_context;
s->to_replace = check_to_replace_node(s->replaces, &local_err);
if (!s->to_replace) {
error_propagate(errp, local_err);
return;
}
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
error_setg(&s->replace_blocker,
"block device is in use by block-job-complete");
bdrv_op_block_all(s->to_replace, s->replace_blocker);
bdrv_ref(s->to_replace);
aio_context_release(replace_aio_context);
}
s->should_complete = true;
@@ -609,7 +656,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
@@ -669,7 +716,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, int64_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
bool is_none_mode;
@@ -686,7 +733,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
int64_t speed,
BlockdevOnError on_error,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
int64_t length, base_length;

View File

@@ -31,8 +31,10 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include <sys/types.h>
#include <unistd.h>
@@ -338,6 +340,51 @@ static void nbd_attach_aio_context(BlockDriverState *bs,
nbd_client_session_attach_aio_context(&s->client, new_context);
}
static void nbd_refresh_filename(BlockDriverState *bs)
{
QDict *opts = qdict_new();
const char *path = qdict_get_try_str(bs->options, "path");
const char *host = qdict_get_try_str(bs->options, "host");
const char *port = qdict_get_try_str(bs->options, "port");
const char *export = qdict_get_try_str(bs->options, "export");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
if (path && export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix:///%s?socket=%s", export, path);
} else if (path && !export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix://?socket=%s", path);
} else if (!path && export && port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s:%s/%s", host, port, export);
} else if (!path && export && !port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s/%s", host, export);
} else if (!path && !export && port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s:%s", host, port);
} else if (!path && !export && !port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s", host);
}
if (path) {
qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(path)));
} else if (port) {
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
} else {
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
}
if (export) {
qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_nbd = {
.format_name = "nbd",
.protocol_name = "nbd",
@@ -352,6 +399,7 @@ static BlockDriver bdrv_nbd = {
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static BlockDriver bdrv_nbd_tcp = {
@@ -368,6 +416,7 @@ static BlockDriver bdrv_nbd_tcp = {
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static BlockDriver bdrv_nbd_unix = {
@@ -384,6 +433,7 @@ static BlockDriver bdrv_nbd_unix = {
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static void bdrv_nbd_init(void)

View File

@@ -172,7 +172,11 @@ static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
nfs_co_init_task(client, &task);
buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE);
buf = g_try_malloc(nb_sectors * BDRV_SECTOR_SIZE);
if (nb_sectors && buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
if (nfs_pwrite_async(client->context, client->fh,
@@ -389,28 +393,33 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
ret = -EINVAL;
goto out;
}
ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp);
if (ret < 0) {
return ret;
goto out;
}
bs->total_sectors = ret;
return 0;
ret = 0;
out:
qemu_opts_del(opts);
return ret;
}
static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
NFSClient *client = g_malloc0(sizeof(NFSClient));
NFSClient *client = g_new0(NFSClient, 1);
client->aio_context = qemu_get_aio_context();
/* Read out options */
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
ret = nfs_client_open(client, url, O_CREAT, errp);
if (ret < 0) {

168
block/null.c Normal file
View File

@@ -0,0 +1,168 @@
/*
* Null block driver
*
* Authors:
* Fam Zheng <famz@redhat.com>
*
* Copyright (C) 2014 Red Hat, Inc.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "block/block_int.h"
typedef struct {
int64_t length;
} BDRVNullState;
static QemuOptsList runtime_opts = {
.name = "null",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "",
},
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "size of the null block",
},
{ /* end of list */ }
},
};
static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
QemuOpts *opts;
BDRVNullState *s = bs->opaque;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &error_abort);
s->length =
qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 1 << 30);
qemu_opts_del(opts);
return 0;
}
static void null_close(BlockDriverState *bs)
{
}
static int64_t null_getlength(BlockDriverState *bs)
{
BDRVNullState *s = bs->opaque;
return s->length;
}
static coroutine_fn int null_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return 0;
}
static coroutine_fn int null_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return 0;
}
static coroutine_fn int null_co_flush(BlockDriverState *bs)
{
return 0;
}
typedef struct {
BlockAIOCB common;
QEMUBH *bh;
} NullAIOCB;
static const AIOCBInfo null_aiocb_info = {
.aiocb_size = sizeof(NullAIOCB),
};
static void null_bh_cb(void *opaque)
{
NullAIOCB *acb = opaque;
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
}
static inline BlockAIOCB *null_aio_common(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
NullAIOCB *acb;
acb = qemu_aio_get(&null_aiocb_info, bs, cb, opaque);
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), null_bh_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
static BlockAIOCB *null_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockAIOCB *null_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockAIOCB *null_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockDriver bdrv_null_co = {
.format_name = "null-co",
.protocol_name = "null-co",
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
.bdrv_co_readv = null_co_readv,
.bdrv_co_writev = null_co_writev,
.bdrv_co_flush_to_disk = null_co_flush,
};
static BlockDriver bdrv_null_aio = {
.format_name = "null-aio",
.protocol_name = "null-aio",
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
.bdrv_aio_readv = null_aio_readv,
.bdrv_aio_writev = null_aio_writev,
.bdrv_aio_flush = null_aio_flush,
};
static void bdrv_null_init(void)
{
bdrv_register(&bdrv_null_co);
bdrv_register(&bdrv_null_aio);
}
block_init(bdrv_null_init);

View File

@@ -30,6 +30,7 @@
/**************************************************************/
#define HEADER_MAGIC "WithoutFreeSpace"
#define HEADER_MAGIC2 "WithouFreSpacExt"
#define HEADER_VERSION 2
#define HEADER_SIZE 64
@@ -41,8 +42,10 @@ struct parallels_header {
uint32_t cylinders;
uint32_t tracks;
uint32_t catalog_entries;
uint32_t nb_sectors;
char padding[24];
uint64_t nb_sectors;
uint32_t inuse;
uint32_t data_off;
char padding[12];
} QEMU_PACKED;
typedef struct BDRVParallelsState {
@@ -52,6 +55,8 @@ typedef struct BDRVParallelsState {
unsigned int catalog_size;
unsigned int tracks;
unsigned int off_multiplier;
} BDRVParallelsState;
static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -59,11 +64,12 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam
const struct parallels_header *ph = (const void *)buf;
if (buf_size < HEADER_SIZE)
return 0;
return 0;
if (!memcmp(ph->magic, HEADER_MAGIC, 16) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
if ((!memcmp(ph->magic, HEADER_MAGIC, 16) ||
!memcmp(ph->magic, HEADER_MAGIC2, 16)) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
return 0;
}
@@ -83,14 +89,19 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (memcmp(ph.magic, HEADER_MAGIC, 16) ||
(le32_to_cpu(ph.version) != HEADER_VERSION)) {
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
goto fail;
}
bs->total_sectors = le64_to_cpu(ph.nb_sectors);
bs->total_sectors = le32_to_cpu(ph.nb_sectors);
if (le32_to_cpu(ph.version) != HEADER_VERSION) {
goto fail_format;
}
if (!memcmp(ph.magic, HEADER_MAGIC, 16)) {
s->off_multiplier = 1;
bs->total_sectors = 0xffffffff & bs->total_sectors;
} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) {
s->off_multiplier = le32_to_cpu(ph.tracks);
} else {
goto fail_format;
}
s->tracks = le32_to_cpu(ph.tracks);
if (s->tracks == 0) {
@@ -98,6 +109,11 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EINVAL;
goto fail;
}
if (s->tracks > INT32_MAX/513) {
error_setg(errp, "Invalid image: Too big cluster");
ret = -EFBIG;
goto fail;
}
s->catalog_size = le32_to_cpu(ph.catalog_entries);
if (s->catalog_size > INT_MAX / 4) {
@@ -105,7 +121,11 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EFBIG;
goto fail;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);
if (ret < 0) {
@@ -113,11 +133,14 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
}
for (i = 0; i < s->catalog_size; i++)
le32_to_cpus(&s->catalog_bitmap[i]);
le32_to_cpus(&s->catalog_bitmap[i]);
qemu_co_mutex_init(&s->lock);
return 0;
fail_format:
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
fail:
g_free(s->catalog_bitmap);
return ret;
@@ -132,9 +155,10 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
offset = sector_num % s->tracks;
/* not allocated */
if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0))
return -1;
return (uint64_t)(s->catalog_bitmap[index] + offset) * 512;
if ((index >= s->catalog_size) || (s->catalog_bitmap[index] == 0))
return -1;
return
((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) * 512;
}
static int parallels_read(BlockDriverState *bs, int64_t sector_num,

View File

@@ -28,6 +28,7 @@
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/qmp/types.h"
#include "sysemu/block-backend.h"
BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
{
@@ -165,19 +166,25 @@ void bdrv_query_image_info(BlockDriverState *bs,
ImageInfo **p_info,
Error **errp)
{
uint64_t total_sectors;
int64_t size;
const char *backing_filename;
char backing_filename2[1024];
BlockDriverInfo bdi;
int ret;
Error *err = NULL;
ImageInfo *info = g_new0(ImageInfo, 1);
ImageInfo *info;
bdrv_get_geometry(bs, &total_sectors);
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Can't get size of device '%s'",
bdrv_get_device_name(bs));
return;
}
info = g_new0(ImageInfo, 1);
info->filename = g_strdup(bs->filename);
info->format = g_strdup(bdrv_get_format_name(bs));
info->virtual_size = total_sectors * 512;
info->virtual_size = size;
info->actual_size = bdrv_get_allocated_file_size(bs);
info->has_actual_size = info->actual_size >= 0;
if (bdrv_is_encrypted(bs)) {
@@ -236,22 +243,22 @@ void bdrv_query_image_info(BlockDriverState *bs,
}
/* @p_info will be set only on success. */
void bdrv_query_info(BlockDriverState *bs,
BlockInfo **p_info,
Error **errp)
static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
Error **errp)
{
BlockInfo *info = g_malloc0(sizeof(*info));
BlockDriverState *bs = blk_bs(blk);
BlockDriverState *bs0;
ImageInfo **p_image_info;
Error *local_err = NULL;
info->device = g_strdup(bs->device_name);
info->device = g_strdup(blk_name(blk));
info->type = g_strdup("unknown");
info->locked = bdrv_dev_is_medium_locked(bs);
info->removable = bdrv_dev_has_removable_media(bs);
info->locked = blk_dev_is_medium_locked(blk);
info->removable = blk_dev_has_removable_media(blk);
if (bdrv_dev_has_removable_media(bs)) {
if (blk_dev_has_removable_media(blk)) {
info->has_tray_open = true;
info->tray_open = bdrv_dev_is_tray_open(bs);
info->tray_open = blk_dev_is_tray_open(blk);
}
if (bdrv_iostatus_is_enabled(bs)) {
@@ -299,21 +306,22 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs)
s = g_malloc0(sizeof(*s));
if (bs->device_name[0]) {
if (bdrv_get_device_name(bs)[0]) {
s->has_device = true;
s->device = g_strdup(bs->device_name);
s->device = g_strdup(bdrv_get_device_name(bs));
}
s->stats = g_malloc0(sizeof(*s->stats));
s->stats->rd_bytes = bs->nr_bytes[BDRV_ACCT_READ];
s->stats->wr_bytes = bs->nr_bytes[BDRV_ACCT_WRITE];
s->stats->rd_operations = bs->nr_ops[BDRV_ACCT_READ];
s->stats->wr_operations = bs->nr_ops[BDRV_ACCT_WRITE];
s->stats->wr_highest_offset = bs->wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->nr_ops[BDRV_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->total_time_ns[BDRV_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->total_time_ns[BDRV_ACCT_READ];
s->stats->flush_total_time_ns = bs->total_time_ns[BDRV_ACCT_FLUSH];
s->stats->rd_bytes = bs->stats.nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = bs->stats.nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = bs->stats.nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = bs->stats.nr_ops[BLOCK_ACCT_WRITE];
s->stats->wr_highest_offset =
bs->stats.wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->stats.nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_FLUSH];
if (bs->file) {
s->has_parent = true;
@@ -331,12 +339,12 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs)
BlockInfoList *qmp_query_block(Error **errp)
{
BlockInfoList *head = NULL, **p_next = &head;
BlockDriverState *bs = NULL;
BlockBackend *blk;
Error *local_err = NULL;
while ((bs = bdrv_next(bs))) {
for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
BlockInfoList *info = g_malloc0(sizeof(*info));
bdrv_query_info(bs, &info->value, &local_err);
bdrv_query_info(blk, &info->value, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto err;

View File

@@ -124,7 +124,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
snprintf(version, sizeof(version), "QCOW version %" PRIu32,
header.version);
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "qcow", version);
bdrv_get_device_name(bs), "qcow", version);
ret = -ENOTSUP;
goto fail;
}
@@ -182,7 +182,12 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
}
s->l1_table_offset = header.l1_table_offset;
s->l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
s->l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate memory for L1 table");
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
@@ -193,8 +198,16 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
for(i = 0;i < s->l1_size; i++) {
be64_to_cpus(&s->l1_table[i]);
}
/* alloc L2 cache */
s->l2_cache = g_malloc(s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
/* alloc L2 cache (max. 64k * 16 * 8 = 8 MB) */
s->l2_cache =
qemu_try_blockalign(bs->file,
s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
if (s->l2_cache == NULL) {
error_setg(errp, "Could not allocate L2 table cache");
ret = -ENOMEM;
goto fail;
}
s->cluster_cache = g_malloc(s->cluster_size);
s->cluster_data = g_malloc(s->cluster_size);
s->cluster_cache_offset = -1;
@@ -218,7 +231,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
/* Disable migration when qcow images are used */
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"qcow", bs->device_name, "live migration");
"qcow", bdrv_get_device_name(bs), "live migration");
migrate_add_blocker(s->migration_blocker);
qemu_co_mutex_init(&s->lock);
@@ -226,7 +239,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
fail:
g_free(s->l1_table);
g_free(s->l2_cache);
qemu_vfree(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
return ret;
@@ -517,7 +530,10 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
void *orig_buf;
if (qiov->niov > 1) {
buf = orig_buf = qemu_blockalign(bs, qiov->size);
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
} else {
orig_buf = NULL;
buf = (uint8_t *)qiov->iov->iov_base;
@@ -619,7 +635,10 @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
s->cluster_cache_offset = -1; /* disable compressed cache */
if (qiov->niov > 1) {
buf = orig_buf = qemu_blockalign(bs, qiov->size);
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
} else {
orig_buf = NULL;
@@ -685,7 +704,7 @@ static void qcow_close(BlockDriverState *bs)
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
g_free(s->l2_cache);
qemu_vfree(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
@@ -706,7 +725,8 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
BlockDriverState *qcow_bs;
/* Read out options */
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
flags |= BLOCK_FLAG_ENCRYPT;
@@ -734,7 +754,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
memset(&header, 0, sizeof(header));
header.magic = cpu_to_be32(QCOW_MAGIC);
header.version = cpu_to_be32(QCOW_VERSION);
header.size = cpu_to_be64(total_size * 512);
header.size = cpu_to_be64(total_size);
header_size = sizeof(header);
backing_filename_len = 0;
if (backing_file) {
@@ -756,7 +776,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
}
header_size = (header_size + 7) & ~7;
shift = header.cluster_bits + header.l2_bits;
l1_size = ((total_size * 512) + (1LL << shift) - 1) >> shift;
l1_size = (total_size + (1LL << shift) - 1) >> shift;
header.l1_table_offset = cpu_to_be64(header_size);
if (flags & BLOCK_FLAG_ENCRYPT) {

View File

@@ -48,15 +48,31 @@ Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
Qcow2Cache *c;
int i;
c = g_malloc0(sizeof(*c));
c = g_new0(Qcow2Cache, 1);
c->size = num_tables;
c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
c->entries = g_try_new0(Qcow2CachedTable, num_tables);
if (!c->entries) {
goto fail;
}
for (i = 0; i < c->size; i++) {
c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
c->entries[i].table = qemu_try_blockalign(bs->file, s->cluster_size);
if (c->entries[i].table == NULL) {
goto fail;
}
}
return c;
fail:
if (c->entries) {
for (i = 0; i < c->size; i++) {
qemu_vfree(c->entries[i].table);
}
}
g_free(c->entries);
g_free(c);
return NULL;
}
int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)

View File

@@ -72,14 +72,20 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
#endif
new_l1_size2 = sizeof(uint64_t) * new_l1_size;
new_l1_table = g_malloc0(align_offset(new_l1_size2, 512));
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_size2, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
/* write new table (align to cluster) */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
if (new_l1_table_offset < 0) {
g_free(new_l1_table);
qemu_vfree(new_l1_table);
return new_l1_table_offset;
}
@@ -113,7 +119,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
if (ret < 0) {
goto fail;
}
g_free(s->l1_table);
qemu_vfree(s->l1_table);
old_l1_table_offset = s->l1_table_offset;
s->l1_table_offset = new_l1_table_offset;
s->l1_table = new_l1_table;
@@ -123,7 +129,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
QCOW2_DISCARD_OTHER);
return 0;
fail:
g_free(new_l1_table);
qemu_vfree(new_l1_table);
qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
QCOW2_DISCARD_OTHER);
return ret;
@@ -158,12 +164,14 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
{
BDRVQcowState *s = bs->opaque;
uint64_t buf[L1_ENTRIES_PER_SECTOR];
uint64_t buf[L1_ENTRIES_PER_SECTOR] = { 0 };
int l1_start_index;
int i, ret;
l1_start_index = l1_index & ~(L1_ENTRIES_PER_SECTOR - 1);
for (i = 0; i < L1_ENTRIES_PER_SECTOR; i++) {
for (i = 0; i < L1_ENTRIES_PER_SECTOR && l1_start_index + i < s->l1_size;
i++)
{
buf[i] = cpu_to_be64(s->l1_table[l1_start_index + i]);
}
@@ -372,7 +380,10 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
iov.iov_len = n * BDRV_SECTOR_SIZE;
iov.iov_base = qemu_blockalign(bs, iov.iov_len);
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
}
qemu_iovec_init_external(&qiov, &iov, 1);
@@ -477,6 +488,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
goto out;
}
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
" unaligned (L1 index: %#" PRIx64 ")",
l2_offset, l1_index);
return -EIO;
}
/* load the l2 table in memory */
ret = l2_load(bs, l2_offset, &l2_table);
@@ -499,8 +517,11 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
return -EIO;
qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
" in pre-v3 image (L2 offset: %#" PRIx64
", L2 index: %#x)", l2_offset, l2_index);
ret = -EIO;
goto fail;
}
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
@@ -516,6 +537,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
*cluster_offset &= L2E_OFFSET_MASK;
if (offset_into_cluster(s, *cluster_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
PRIx64 " unaligned (L2 offset: %#" PRIx64
", L2 index: %#x)", *cluster_offset,
l2_offset, l2_index);
ret = -EIO;
goto fail;
}
break;
default:
abort();
@@ -532,6 +561,10 @@ out:
*num = nb_available - index_in_cluster;
return ret;
fail:
qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
return ret;
}
/*
@@ -567,6 +600,12 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
assert(l1_index < s->l1_size);
l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
" unaligned (L1 index: %#" PRIx64 ")",
l2_offset, l1_index);
return -EIO;
}
/* seek the l2 table of the given l2 offset */
@@ -702,7 +741,11 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
trace_qcow2_cluster_link_l2(qemu_coroutine_self(), m->nb_clusters);
assert(m->nb_clusters > 0);
old_cluster = g_malloc(m->nb_clusters * sizeof(uint64_t));
old_cluster = g_try_new(uint64_t, m->nb_clusters);
if (old_cluster == NULL) {
ret = -ENOMEM;
goto err;
}
/* copy content of unmodified sectors */
ret = perform_cow(bs, m, &m->cow_start);
@@ -935,6 +978,15 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
bool offset_matches =
(cluster_offset & L2E_OFFSET_MASK) == *host_offset;
if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
"%#llx unaligned (guest offset: %#" PRIx64
")", cluster_offset & L2E_OFFSET_MASK,
guest_offset);
ret = -EIO;
goto out;
}
if (*host_offset != 0 && !offset_matches) {
*bytes = 0;
ret = 0;
@@ -966,7 +1018,7 @@ out:
/* Only return a host offset if we actually made progress. Otherwise we
* would make requirements for handle_alloc() that it can't fulfill */
if (ret) {
if (ret > 0) {
*host_offset = (cluster_offset & L2E_OFFSET_MASK)
+ offset_into_cluster(s, guest_offset);
}
@@ -1106,6 +1158,17 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
return 0;
}
/* !*host_offset would overwrite the image header and is reserved for "no
* host offset preferred". If 0 was a valid host offset, it'd trigger the
* following overlap check; do that now to avoid having an invalid value in
* *host_offset. */
if (!alloc_cluster_offset) {
ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
nb_clusters * s->cluster_size);
assert(ret < 0);
goto fail;
}
/*
* Save info needed for meta data update.
*
@@ -1351,7 +1414,7 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
* clusters.
*/
static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
unsigned int nb_clusters, enum qcow2_discard_type type)
unsigned int nb_clusters, enum qcow2_discard_type type, bool full_discard)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l2_table;
@@ -1373,23 +1436,30 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
old_l2_entry = be64_to_cpu(l2_table[l2_index + i]);
/*
* Make sure that a discarded area reads back as zeroes for v3 images
* (we cannot do it for v2 without actually writing a zero-filled
* buffer). We can skip the operation if the cluster is already marked
* as zero, or if it's unallocated and we don't have a backing file.
* If full_discard is false, make sure that a discarded area reads back
* as zeroes for v3 images (we cannot do it for v2 without actually
* writing a zero-filled buffer). We can skip the operation if the
* cluster is already marked as zero, or if it's unallocated and we
* don't have a backing file.
*
* TODO We might want to use bdrv_get_block_status(bs) here, but we're
* holding s->lock, so that doesn't work today.
*
* If full_discard is true, the sector should not read back as zeroes,
* but rather fall through to the backing file.
*/
switch (qcow2_get_cluster_type(old_l2_entry)) {
case QCOW2_CLUSTER_UNALLOCATED:
if (!bs->backing_hd) {
if (full_discard || !bs->backing_hd) {
continue;
}
break;
case QCOW2_CLUSTER_ZERO:
continue;
if (!full_discard) {
continue;
}
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_COMPRESSED:
@@ -1401,7 +1471,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
/* First remove L2 entries */
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
if (s->qcow_version >= 3) {
if (!full_discard && s->qcow_version >= 3) {
l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
} else {
l2_table[l2_index + i] = cpu_to_be64(0);
@@ -1420,7 +1490,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
}
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors, enum qcow2_discard_type type)
int nb_sectors, enum qcow2_discard_type type, bool full_discard)
{
BDRVQcowState *s = bs->opaque;
uint64_t end_offset;
@@ -1443,7 +1513,7 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
/* Each L2 table is handled by its own loop iteration */
while (nb_clusters > 0) {
ret = discard_single_l2(bs, offset, nb_clusters, type);
ret = discard_single_l2(bs, offset, nb_clusters, type, full_discard);
if (ret < 0) {
goto fail;
}
@@ -1543,15 +1613,14 @@ fail:
* Expands all zero clusters in a specific L1 table (or deallocates them, for
* non-backed non-pre-allocated zero clusters).
*
* expanded_clusters is a bitmap where every bit corresponds to one cluster in
* the image file; a bit gets set if the corresponding cluster has been used for
* zero expansion (i.e., has been filled with zeroes and is referenced from an
* L2 table). nb_clusters contains the total cluster count of the image file,
* i.e., the number of bits in expanded_clusters.
* l1_entries and *visited_l1_entries are used to keep track of progress for
* status_cb(). l1_entries contains the total number of L1 entries and
* *visited_l1_entries counts all visited L1 entries.
*/
static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
int l1_size, uint8_t **expanded_clusters,
uint64_t *nb_clusters)
int l1_size, int64_t *visited_l1_entries,
int64_t l1_entries,
BlockDriverAmendStatusCB *status_cb)
{
BDRVQcowState *s = bs->opaque;
bool is_active_l1 = (l1_table == s->l1_table);
@@ -1562,15 +1631,23 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
if (!is_active_l1) {
/* inactive L2 tables require a buffer to be stored in when loading
* them from disk */
l2_table = qemu_blockalign(bs, s->cluster_size);
l2_table = qemu_try_blockalign(bs->file, s->cluster_size);
if (l2_table == NULL) {
return -ENOMEM;
}
}
for (i = 0; i < l1_size; i++) {
uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
bool l2_dirty = false;
int l2_refcount;
if (!l2_offset) {
/* unallocated */
(*visited_l1_entries)++;
if (status_cb) {
status_cb(bs, *visited_l1_entries, l1_entries);
}
continue;
}
@@ -1587,33 +1664,19 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
goto fail;
}
l2_refcount = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits);
if (l2_refcount < 0) {
ret = l2_refcount;
goto fail;
}
for (j = 0; j < s->l2_size; j++) {
uint64_t l2_entry = be64_to_cpu(l2_table[j]);
int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
int64_t offset = l2_entry & L2E_OFFSET_MASK;
int cluster_type = qcow2_get_cluster_type(l2_entry);
bool preallocated = offset != 0;
if (cluster_type == QCOW2_CLUSTER_NORMAL) {
cluster_index = offset >> s->cluster_bits;
assert((cluster_index >= 0) && (cluster_index < *nb_clusters));
if ((*expanded_clusters)[cluster_index / 8] &
(1 << (cluster_index % 8))) {
/* Probably a shared L2 table; this cluster was a zero
* cluster which has been expanded, its refcount
* therefore most likely requires an update. */
ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
QCOW2_DISCARD_NEVER);
if (ret < 0) {
goto fail;
}
/* Since we just increased the refcount, the COPIED flag may
* no longer be set. */
l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
l2_dirty = true;
}
continue;
}
else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) {
if (cluster_type != QCOW2_CLUSTER_ZERO) {
continue;
}
@@ -1631,6 +1694,19 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
ret = offset;
goto fail;
}
if (l2_refcount > 1) {
/* For shared L2 tables, set the refcount accordingly (it is
* already 1 and needs to be l2_refcount) */
ret = qcow2_update_cluster_refcount(bs,
offset >> s->cluster_bits, l2_refcount - 1,
QCOW2_DISCARD_OTHER);
if (ret < 0) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_OTHER);
goto fail;
}
}
}
ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size);
@@ -1652,29 +1728,12 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
goto fail;
}
l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
l2_dirty = true;
cluster_index = offset >> s->cluster_bits;
if (cluster_index >= *nb_clusters) {
uint64_t old_bitmap_size = (*nb_clusters + 7) / 8;
uint64_t new_bitmap_size;
/* The offset may lie beyond the old end of the underlying image
* file for growable files only */
assert(bs->file->growable);
*nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
new_bitmap_size = (*nb_clusters + 7) / 8;
*expanded_clusters = g_realloc(*expanded_clusters,
new_bitmap_size);
/* clear the newly allocated space */
memset(&(*expanded_clusters)[old_bitmap_size], 0,
new_bitmap_size - old_bitmap_size);
if (l2_refcount == 1) {
l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
} else {
l2_table[j] = cpu_to_be64(offset);
}
assert((cluster_index >= 0) && (cluster_index < *nb_clusters));
(*expanded_clusters)[cluster_index / 8] |= 1 << (cluster_index % 8);
l2_dirty = true;
}
if (is_active_l1) {
@@ -1703,6 +1762,11 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
}
}
}
(*visited_l1_entries)++;
if (status_cb) {
status_cb(bs, *visited_l1_entries, l1_entries);
}
}
ret = 0;
@@ -1729,21 +1793,25 @@ fail:
* allocation for pre-allocated ones). This is important for downgrading to a
* qcow2 version which doesn't yet support metadata zero clusters.
*/
int qcow2_expand_zero_clusters(BlockDriverState *bs)
int qcow2_expand_zero_clusters(BlockDriverState *bs,
BlockDriverAmendStatusCB *status_cb)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table = NULL;
uint64_t nb_clusters;
uint8_t *expanded_clusters;
int64_t l1_entries = 0, visited_l1_entries = 0;
int ret;
int i, j;
nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
expanded_clusters = g_malloc0((nb_clusters + 7) / 8);
if (status_cb) {
l1_entries = s->l1_size;
for (i = 0; i < s->nb_snapshots; i++) {
l1_entries += s->snapshots[i].l1_size;
}
}
ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
&expanded_clusters, &nb_clusters);
&visited_l1_entries, l1_entries,
status_cb);
if (ret < 0) {
goto fail;
}
@@ -1777,7 +1845,8 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs)
}
ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
&expanded_clusters, &nb_clusters);
&visited_l1_entries, l1_entries,
status_cb);
if (ret < 0) {
goto fail;
}
@@ -1786,7 +1855,6 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs)
ret = 0;
fail:
g_free(expanded_clusters);
g_free(l1_table);
return ret;
}

File diff suppressed because it is too large Load Diff

View File

@@ -58,7 +58,7 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset = s->snapshots_offset;
s->snapshots = g_malloc0(s->nb_snapshots * sizeof(QCowSnapshot));
s->snapshots = g_new0(QCowSnapshot, s->nb_snapshots);
for(i = 0; i < s->nb_snapshots; i++) {
/* Read statically sized part of the snapshot header */
@@ -381,7 +381,12 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
sn->l1_table_offset = l1_table_offset;
sn->l1_size = s->l1_size;
l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_size && l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
for(i = 0; i < s->l1_size; i++) {
l1_table[i] = cpu_to_be64(s->l1_table[i]);
}
@@ -412,7 +417,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
}
/* Append the new snapshot to the snapshot list */
new_snapshot_list = g_malloc((s->nb_snapshots + 1) * sizeof(QCowSnapshot));
new_snapshot_list = g_new(QCowSnapshot, s->nb_snapshots + 1);
if (s->snapshots) {
memcpy(new_snapshot_list, s->snapshots,
s->nb_snapshots * sizeof(QCowSnapshot));
@@ -436,7 +441,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
qcow2_discard_clusters(bs, qcow2_vm_state_offset(s),
align_offset(sn->vm_state_size, s->cluster_size)
>> BDRV_SECTOR_BITS,
QCOW2_DISCARD_NEVER);
QCOW2_DISCARD_NEVER, false);
#ifdef DEBUG_ALLOC
{
@@ -499,7 +504,11 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
* Decrease the refcount referenced by the old one only when the L1
* table is overwritten.
*/
sn_l1_table = g_malloc0(cur_l1_bytes);
sn_l1_table = g_try_malloc0(cur_l1_bytes);
if (cur_l1_bytes && sn_l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, sn->l1_table_offset, sn_l1_table, sn_l1_bytes);
if (ret < 0) {
@@ -652,7 +661,7 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
return s->nb_snapshots;
}
sn_tab = g_malloc0(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
sn_tab = g_new0(QEMUSnapshotInfo, s->nb_snapshots);
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
sn = s->snapshots + i;
@@ -698,17 +707,21 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
return -EFBIG;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_bytes, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);
if (ret < 0) {
error_setg(errp, "Failed to read l1 table for snapshot");
g_free(new_l1_table);
qemu_vfree(new_l1_table);
return ret;
}
/* Switch the L1 table */
g_free(s->l1_table);
qemu_vfree(s->l1_table);
s->l1_size = sn->l1_size;
s->l1_table_offset = sn->l1_table_offset;

View File

@@ -30,6 +30,9 @@
#include "qemu/error-report.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qbool.h"
#include "qapi/util.h"
#include "qapi/qmp/types.h"
#include "qapi-event.h"
#include "trace.h"
#include "qemu/option_int.h"
@@ -203,8 +206,8 @@ static void GCC_FMT_ATTR(3, 4) report_unsupported(BlockDriverState *bs,
vsnprintf(msg, sizeof(msg), fmt, ap);
va_end(ap);
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, bs->device_name, "qcow2",
msg);
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_name(bs), "qcow2", msg);
}
static void report_unsupported_feature(BlockDriverState *bs,
@@ -402,6 +405,12 @@ static QemuOptsList qcow2_runtime_opts = {
.help = "Selects which overlap checks to perform from a range of "
"templates (none, constant, cached, all)",
},
{
.name = QCOW2_OPT_OVERLAP_TEMPLATE,
.type = QEMU_OPT_STRING,
.help = "Selects which overlap checks to perform from a range of "
"templates (none, constant, cached, all)",
},
{
.name = QCOW2_OPT_OVERLAP_MAIN_HEADER,
.type = QEMU_OPT_BOOL,
@@ -442,6 +451,22 @@ static QemuOptsList qcow2_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Check for unintended writes into an inactive L2 table",
},
{
.name = QCOW2_OPT_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum combined metadata (L2 tables and refcount blocks) "
"cache size",
},
{
.name = QCOW2_OPT_L2_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum L2 table cache size",
},
{
.name = QCOW2_OPT_REFCOUNT_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum refcount block cache size",
},
{ /* end of list */ }
},
};
@@ -457,6 +482,61 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = {
[QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2,
};
static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size,
uint64_t *refcount_cache_size, Error **errp)
{
uint64_t combined_cache_size;
bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE);
combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
*refcount_cache_size = qemu_opt_get_size(opts,
QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
if (combined_cache_size_set) {
if (l2_cache_size_set && refcount_cache_size_set) {
error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"the same time");
return;
} else if (*l2_cache_size > combined_cache_size) {
error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
return;
} else if (*refcount_cache_size > combined_cache_size) {
error_setg(errp, QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
return;
}
if (l2_cache_size_set) {
*refcount_cache_size = combined_cache_size - *l2_cache_size;
} else if (refcount_cache_size_set) {
*l2_cache_size = combined_cache_size - *refcount_cache_size;
} else {
*refcount_cache_size = combined_cache_size
/ (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
*l2_cache_size = combined_cache_size - *refcount_cache_size;
}
} else {
if (!l2_cache_size_set && !refcount_cache_size_set) {
*l2_cache_size = DEFAULT_L2_CACHE_BYTE_SIZE;
*refcount_cache_size = *l2_cache_size
/ DEFAULT_L2_REFCOUNT_SIZE_RATIO;
} else if (!l2_cache_size_set) {
*l2_cache_size = *refcount_cache_size
* DEFAULT_L2_REFCOUNT_SIZE_RATIO;
} else if (!refcount_cache_size_set) {
*refcount_cache_size = *l2_cache_size
/ DEFAULT_L2_REFCOUNT_SIZE_RATIO;
}
}
}
static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -464,12 +544,13 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
unsigned int len, i;
int ret = 0;
QCowHeader header;
QemuOpts *opts;
QemuOpts *opts = NULL;
Error *local_err = NULL;
uint64_t ext_end;
uint64_t l1_vm_state_index;
const char *opt_overlap_check;
const char *opt_overlap_check, *opt_overlap_check_template;
int overlap_check_template = 0;
uint64_t l2_cache_size, refcount_cache_size;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
@@ -617,6 +698,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
s->l2_size = 1 << s->l2_bits;
/* 2^(s->refcount_order - 3) is the refcount width in bytes */
s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
s->refcount_block_size = 1 << s->refcount_block_bits;
bs->total_sectors = header.size / 512;
s->csize_shift = (62 - (s->cluster_bits - 8));
s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
@@ -688,8 +772,13 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
if (s->l1_size > 0) {
s->l1_table = g_malloc0(
s->l1_table = qemu_try_blockalign(bs->file,
align_offset(s->l1_size * sizeof(uint64_t), 512));
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate L1 table");
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
if (ret < 0) {
@@ -701,14 +790,61 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
}
/* get L2 table/refcount block cache size from command line options */
opts = qemu_opts_create(&qcow2_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
read_cache_sizes(opts, &l2_cache_size, &refcount_cache_size, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
l2_cache_size /= s->cluster_size;
if (l2_cache_size < MIN_L2_CACHE_SIZE) {
l2_cache_size = MIN_L2_CACHE_SIZE;
}
if (l2_cache_size > INT_MAX) {
error_setg(errp, "L2 cache size too big");
ret = -EINVAL;
goto fail;
}
refcount_cache_size /= s->cluster_size;
if (refcount_cache_size < MIN_REFCOUNT_CACHE_SIZE) {
refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE;
}
if (refcount_cache_size > INT_MAX) {
error_setg(errp, "Refcount cache size too big");
ret = -EINVAL;
goto fail;
}
/* alloc L2 table/refcount block cache */
s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE);
s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE);
s->l2_table_cache = qcow2_cache_create(bs, l2_cache_size);
s->refcount_block_cache = qcow2_cache_create(bs, refcount_cache_size);
if (s->l2_table_cache == NULL || s->refcount_block_cache == NULL) {
error_setg(errp, "Could not allocate metadata caches");
ret = -ENOMEM;
goto fail;
}
s->cluster_cache = g_malloc(s->cluster_size);
/* one more sector for decompressed data alignment */
s->cluster_data = qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+ 512);
s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size + 512);
if (s->cluster_data == NULL) {
error_setg(errp, "Could not allocate temporary cluster buffer");
ret = -ENOMEM;
goto fail;
}
s->cluster_cache_offset = -1;
s->flags = flags;
@@ -774,7 +910,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
(s->incompatible_features & QCOW2_INCOMPAT_DIRTY)) {
BdrvCheckResult result = {0};
ret = qcow2_check(bs, &result, BDRV_FIX_ERRORS);
ret = qcow2_check(bs, &result, BDRV_FIX_ERRORS | BDRV_FIX_LEAKS);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not repair dirty image");
goto fail;
@@ -782,14 +918,6 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
/* Enable lazy_refcounts according to image and command line options */
opts = qemu_opts_create(&qcow2_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
(s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
@@ -803,7 +931,21 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
s->discard_passthrough[QCOW2_DISCARD_OTHER] =
qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false);
opt_overlap_check = qemu_opt_get(opts, "overlap-check") ?: "cached";
opt_overlap_check = qemu_opt_get(opts, QCOW2_OPT_OVERLAP);
opt_overlap_check_template = qemu_opt_get(opts, QCOW2_OPT_OVERLAP_TEMPLATE);
if (opt_overlap_check_template && opt_overlap_check &&
strcmp(opt_overlap_check_template, opt_overlap_check))
{
error_setg(errp, "Conflicting values for qcow2 options '"
QCOW2_OPT_OVERLAP "' ('%s') and '" QCOW2_OPT_OVERLAP_TEMPLATE
"' ('%s')", opt_overlap_check, opt_overlap_check_template);
ret = -EINVAL;
goto fail;
}
if (!opt_overlap_check) {
opt_overlap_check = opt_overlap_check_template ?: "cached";
}
if (!strcmp(opt_overlap_check, "none")) {
overlap_check_template = 0;
} else if (!strcmp(opt_overlap_check, "constant")) {
@@ -816,7 +958,6 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
error_setg(errp, "Unsupported value '%s' for qcow2 option "
"'overlap-check'. Allowed are either of the following: "
"none, constant, cached, all", opt_overlap_check);
qemu_opts_del(opts);
ret = -EINVAL;
goto fail;
}
@@ -831,6 +972,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
qemu_opts_del(opts);
opts = NULL;
if (s->use_lazy_refcounts && s->qcow_version < 3) {
error_setg(errp, "Lazy refcounts require a qcow2 image with at least "
@@ -848,11 +990,12 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
fail:
qemu_opts_del(opts);
g_free(s->unknown_header_fields);
cleanup_unknown_header_ext(bs);
qcow2_free_snapshots(bs);
qcow2_refcount_close(bs);
g_free(s->l1_table);
qemu_vfree(s->l1_table);
/* else pre-write overlap checks in cache_destroy may crash */
s->l1_table = NULL;
if (s->l2_table_cache) {
@@ -1082,7 +1225,12 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
*/
if (!cluster_data) {
cluster_data =
qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size);
if (cluster_data == NULL) {
ret = -ENOMEM;
goto fail;
}
}
assert(cur_nr_sectors <=
@@ -1182,8 +1330,13 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
if (s->crypt_method) {
if (!cluster_data) {
cluster_data = qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS *
s->cluster_size);
cluster_data = qemu_try_blockalign(bs->file,
QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size);
if (cluster_data == NULL) {
ret = -ENOMEM;
goto fail;
}
}
assert(hd_qiov.size <=
@@ -1270,7 +1423,7 @@ fail:
static void qcow2_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
qemu_vfree(s->l1_table);
/* else pre-write overlap checks in cache_destroy may crash */
s->l1_table = NULL;
@@ -1557,7 +1710,7 @@ static int preallocate(BlockDriverState *bs)
int ret;
QCowL2Meta *meta;
nb_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
nb_sectors = bdrv_nb_sectors(bs);
offset = 0;
while (nb_sectors) {
@@ -1612,7 +1765,7 @@ static int preallocate(BlockDriverState *bs)
static int qcow2_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, int prealloc,
int flags, size_t cluster_size, PreallocMode prealloc,
QemuOpts *opts, int version,
Error **errp)
{
@@ -1645,6 +1798,56 @@ static int qcow2_create2(const char *filename, int64_t total_size,
Error *local_err = NULL;
int ret;
if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
int64_t meta_size = 0;
uint64_t nreftablee, nrefblocke, nl1e, nl2e;
int64_t aligned_total_size = align_offset(total_size, cluster_size);
/* header: 1 cluster */
meta_size += cluster_size;
/* total size of L2 tables */
nl2e = aligned_total_size / cluster_size;
nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
meta_size += nl2e * sizeof(uint64_t);
/* total size of L1 tables */
nl1e = nl2e * sizeof(uint64_t) / cluster_size;
nl1e = align_offset(nl1e, cluster_size / sizeof(uint64_t));
meta_size += nl1e * sizeof(uint64_t);
/* total size of refcount blocks
*
* note: every host cluster is reference-counted, including metadata
* (even refcount blocks are recursively included).
* Let:
* a = total_size (this is the guest disk size)
* m = meta size not including refcount blocks and refcount tables
* c = cluster size
* y1 = number of refcount blocks entries
* y2 = meta size including everything
* then,
* y1 = (y2 + a)/c
* y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m
* we can get y1:
* y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c)
*/
nrefblocke = (aligned_total_size + meta_size + cluster_size) /
(cluster_size - sizeof(uint16_t) -
1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size);
nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t));
meta_size += nrefblocke * sizeof(uint16_t);
/* total size of refcount tables */
nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size;
nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t));
meta_size += nreftablee * sizeof(uint64_t);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
aligned_total_size + meta_size);
qemu_opt_set(opts, BLOCK_OPT_PREALLOC, PreallocMode_lookup[prealloc]);
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
@@ -1671,7 +1874,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
.l1_size = cpu_to_be32(0),
.refcount_table_offset = cpu_to_be64(cluster_size),
.refcount_table_clusters = cpu_to_be32(1),
.refcount_order = cpu_to_be32(3 + REFCOUNT_SHIFT),
.refcount_order = cpu_to_be32(4),
.header_length = cpu_to_be32(sizeof(*header)),
};
@@ -1733,7 +1936,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* Okay, now that we have a valid image, let's give it the right size */
ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
ret = bdrv_truncate(bs, total_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not resize image");
goto out;
@@ -1750,7 +1953,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* And if we're supposed to preallocate metadata, do that now */
if (prealloc) {
if (prealloc != PREALLOC_MODE_OFF) {
BDRVQcowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = preallocate(bs);
@@ -1786,16 +1989,17 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
char *backing_file = NULL;
char *backing_fmt = NULL;
char *buf = NULL;
uint64_t sectors = 0;
uint64_t size = 0;
int flags = 0;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
int prealloc = 0;
PreallocMode prealloc;
int version = 3;
Error *local_err = NULL;
int ret;
/* Read out options */
sectors = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
@@ -1804,12 +2008,11 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!buf || !strcmp(buf, "off")) {
prealloc = 0;
} else if (!strcmp(buf, "metadata")) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'", buf);
prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto finish;
}
@@ -1831,7 +2034,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
}
if (backing_file && prealloc) {
if (backing_file && prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Backing file and preallocation cannot be used at "
"the same time");
ret = -EINVAL;
@@ -1845,7 +2048,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
goto finish;
}
ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
cluster_size, prealloc, opts, version, &local_err);
if (local_err) {
error_propagate(errp, local_err);
@@ -1886,7 +2089,7 @@ static coroutine_fn int qcow2_co_discard(BlockDriverState *bs,
qemu_co_mutex_lock(&s->lock);
ret = qcow2_discard_clusters(bs, sector_num << BDRV_SECTOR_BITS,
nb_sectors, QCOW2_DISCARD_REQUEST);
nb_sectors, QCOW2_DISCARD_REQUEST, false);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
@@ -1947,7 +2150,6 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
/* align end of file to a sector boundary to ease reading with
sector based I/Os */
cluster_offset = bdrv_getlength(bs->file);
cluster_offset = (cluster_offset + 511) & ~511;
bdrv_truncate(bs->file, cluster_offset);
return 0;
}
@@ -2028,6 +2230,195 @@ fail:
return ret;
}
static int make_completely_empty(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
int ret, l1_clusters;
int64_t offset;
uint64_t *new_reftable = NULL;
uint64_t rt_entry, l1_size2;
struct {
uint64_t l1_offset;
uint64_t reftable_offset;
uint32_t reftable_clusters;
} QEMU_PACKED l1_ofs_rt_ofs_cls;
ret = qcow2_cache_empty(bs, s->l2_table_cache);
if (ret < 0) {
goto fail;
}
ret = qcow2_cache_empty(bs, s->refcount_block_cache);
if (ret < 0) {
goto fail;
}
/* Refcounts will be broken utterly */
ret = qcow2_mark_dirty(bs);
if (ret < 0) {
goto fail;
}
BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));
l1_size2 = (uint64_t)s->l1_size * sizeof(uint64_t);
/* After this call, neither the in-memory nor the on-disk refcount
* information accurately describe the actual references */
ret = bdrv_write_zeroes(bs->file, s->l1_table_offset / BDRV_SECTOR_SIZE,
l1_clusters * s->cluster_sectors, 0);
if (ret < 0) {
goto fail_broken_refcounts;
}
memset(s->l1_table, 0, l1_size2);
BLKDBG_EVENT(bs->file, BLKDBG_EMPTY_IMAGE_PREPARE);
/* Overwrite enough clusters at the beginning of the sectors to place
* the refcount table, a refcount block and the L1 table in; this may
* overwrite parts of the existing refcount and L1 table, which is not
* an issue because the dirty flag is set, complete data loss is in fact
* desired and partial data loss is consequently fine as well */
ret = bdrv_write_zeroes(bs->file, s->cluster_size / BDRV_SECTOR_SIZE,
(2 + l1_clusters) * s->cluster_size /
BDRV_SECTOR_SIZE, 0);
/* This call (even if it failed overall) may have overwritten on-disk
* refcount structures; in that case, the in-memory refcount information
* will probably differ from the on-disk information which makes the BDS
* unusable */
if (ret < 0) {
goto fail_broken_refcounts;
}
BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_UPDATE);
/* "Create" an empty reftable (one cluster) directly after the image
* header and an empty L1 table three clusters after the image header;
* the cluster between those two will be used as the first refblock */
cpu_to_be64w(&l1_ofs_rt_ofs_cls.l1_offset, 3 * s->cluster_size);
cpu_to_be64w(&l1_ofs_rt_ofs_cls.reftable_offset, s->cluster_size);
cpu_to_be32w(&l1_ofs_rt_ofs_cls.reftable_clusters, 1);
ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_table_offset),
&l1_ofs_rt_ofs_cls, sizeof(l1_ofs_rt_ofs_cls));
if (ret < 0) {
goto fail_broken_refcounts;
}
s->l1_table_offset = 3 * s->cluster_size;
new_reftable = g_try_new0(uint64_t, s->cluster_size / sizeof(uint64_t));
if (!new_reftable) {
ret = -ENOMEM;
goto fail_broken_refcounts;
}
s->refcount_table_offset = s->cluster_size;
s->refcount_table_size = s->cluster_size / sizeof(uint64_t);
g_free(s->refcount_table);
s->refcount_table = new_reftable;
new_reftable = NULL;
/* Now the in-memory refcount information again corresponds to the on-disk
* information (reftable is empty and no refblocks (the refblock cache is
* empty)); however, this means some clusters (e.g. the image header) are
* referenced, but not refcounted, but the normal qcow2 code assumes that
* the in-memory information is always correct */
BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC);
/* Enter the first refblock into the reftable */
rt_entry = cpu_to_be64(2 * s->cluster_size);
ret = bdrv_pwrite_sync(bs->file, s->cluster_size,
&rt_entry, sizeof(rt_entry));
if (ret < 0) {
goto fail_broken_refcounts;
}
s->refcount_table[0] = 2 * s->cluster_size;
s->free_cluster_index = 0;
assert(3 + l1_clusters <= s->refcount_block_size);
offset = qcow2_alloc_clusters(bs, 3 * s->cluster_size + l1_size2);
if (offset < 0) {
ret = offset;
goto fail_broken_refcounts;
} else if (offset > 0) {
error_report("First cluster in emptied image is in use");
abort();
}
/* Now finally the in-memory information corresponds to the on-disk
* structures and is correct */
ret = qcow2_mark_clean(bs);
if (ret < 0) {
goto fail;
}
ret = bdrv_truncate(bs->file, (3 + l1_clusters) * s->cluster_size);
if (ret < 0) {
goto fail;
}
return 0;
fail_broken_refcounts:
/* The BDS is unusable at this point. If we wanted to make it usable, we
* would have to call qcow2_refcount_close(), qcow2_refcount_init(),
* qcow2_check_refcounts(), qcow2_refcount_close() and qcow2_refcount_init()
* again. However, because the functions which could have caused this error
* path to be taken are used by those functions as well, it's very likely
* that that sequence will fail as well. Therefore, just eject the BDS. */
bs->drv = NULL;
fail:
g_free(new_reftable);
return ret;
}
static int qcow2_make_empty(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
uint64_t start_sector;
int sector_step = INT_MAX / BDRV_SECTOR_SIZE;
int l1_clusters, ret = 0;
l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));
if (s->qcow_version >= 3 && !s->snapshots &&
3 + l1_clusters <= s->refcount_block_size) {
/* The following function only works for qcow2 v3 images (it requires
* the dirty flag) and only as long as there are no snapshots (because
* it completely empties the image). Furthermore, the L1 table and three
* additional clusters (image header, refcount table, one refcount
* block) have to fit inside one refcount block. */
return make_completely_empty(bs);
}
/* This fallback code simply discards every active cluster; this is slow,
* but works in all cases */
for (start_sector = 0; start_sector < bs->total_sectors;
start_sector += sector_step)
{
/* As this function is generally used after committing an external
* snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the
* default action for this kind of discard is to pass the discard,
* which will ideally result in an actually smaller image file, as
* is probably desired. */
ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE,
MIN(sector_step,
bs->total_sectors - start_sector),
QCOW2_DISCARD_SNAPSHOT, true);
if (ret < 0) {
break;
}
}
return ret;
}
static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
@@ -2083,6 +2474,9 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
.lazy_refcounts = s->compatible_features &
QCOW2_COMPAT_LAZY_REFCOUNTS,
.has_lazy_refcounts = true,
.corrupt = s->incompatible_features &
QCOW2_INCOMPAT_CORRUPT,
.has_corrupt = true,
};
}
@@ -2156,7 +2550,8 @@ static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
* Downgrades an image's version. To achieve this, any incompatible features
* have to be removed.
*/
static int qcow2_downgrade(BlockDriverState *bs, int target_version)
static int qcow2_downgrade(BlockDriverState *bs, int target_version,
BlockDriverAmendStatusCB *status_cb)
{
BDRVQcowState *s = bs->opaque;
int current_version = s->qcow_version;
@@ -2205,7 +2600,7 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version)
/* clearing autoclear features is trivial */
s->autoclear_features = 0;
ret = qcow2_expand_zero_clusters(bs);
ret = qcow2_expand_zero_clusters(bs, status_cb);
if (ret < 0) {
return ret;
}
@@ -2219,7 +2614,8 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version)
return 0;
}
static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
BlockDriverAmendStatusCB *status_cb)
{
BDRVQcowState *s = bs->opaque;
int old_version = s->qcow_version, new_version = old_version;
@@ -2297,7 +2693,7 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
return ret;
}
} else {
ret = qcow2_downgrade(bs, new_version);
ret = qcow2_downgrade(bs, new_version, status_cb);
if (ret < 0) {
return ret;
}
@@ -2353,6 +2749,52 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
return 0;
}
/*
* If offset or size are negative, respectively, they will not be included in
* the BLOCK_IMAGE_CORRUPTED event emitted.
* fatal will be ignored for read-only BDS; corruptions found there will always
* be considered non-fatal.
*/
void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
int64_t size, const char *message_format, ...)
{
BDRVQcowState *s = bs->opaque;
char *message;
va_list ap;
fatal = fatal && !bs->read_only;
if (s->signaled_corruption &&
(!fatal || (s->incompatible_features & QCOW2_INCOMPAT_CORRUPT)))
{
return;
}
va_start(ap, message_format);
message = g_strdup_vprintf(message_format, ap);
va_end(ap);
if (fatal) {
fprintf(stderr, "qcow2: Marking image as corrupt: %s; further "
"corruption events will be suppressed\n", message);
} else {
fprintf(stderr, "qcow2: Image is corrupt: %s; further non-fatal "
"corruption events will be suppressed\n", message);
}
qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs), message,
offset >= 0, offset, size >= 0, size,
fatal, &error_abort);
g_free(message);
if (fatal) {
qcow2_mark_corrupt(bs);
bs->drv = NULL; /* make BDS unusable */
}
s->signaled_corruption = true;
}
static QemuOptsList qcow2_create_opts = {
.name = "qcow2-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
@@ -2392,7 +2834,8 @@ static QemuOptsList qcow2_create_opts = {
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, metadata)"
.help = "Preallocation mode (allowed values: off, metadata, "
"falloc, full)"
},
{
.name = BLOCK_OPT_LAZY_REFCOUNTS,
@@ -2424,6 +2867,7 @@ static BlockDriver bdrv_qcow2 = {
.bdrv_co_discard = qcow2_co_discard,
.bdrv_truncate = qcow2_truncate,
.bdrv_write_compressed = qcow2_write_compressed,
.bdrv_make_empty = qcow2_make_empty,
.bdrv_snapshot_create = qcow2_snapshot_create,
.bdrv_snapshot_goto = qcow2_snapshot_goto,

View File

@@ -59,15 +59,19 @@
/* The cluster reads as all zeros */
#define QCOW_OFLAG_ZERO (1ULL << 0)
#define REFCOUNT_SHIFT 1 /* refcount size is 2 bytes */
#define MIN_CLUSTER_BITS 9
#define MAX_CLUSTER_BITS 21
#define L2_CACHE_SIZE 16
#define MIN_L2_CACHE_SIZE 1 /* cluster */
/* Must be at least 4 to cover all cases of refcount table growth */
#define REFCOUNT_CACHE_SIZE 4
#define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
/* The refblock cache needs only a fourth of the L2 cache size to cover as many
* clusters */
#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
#define DEFAULT_CLUSTER_SIZE 65536
@@ -77,6 +81,7 @@
#define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
#define QCOW2_OPT_DISCARD_OTHER "pass-discard-other"
#define QCOW2_OPT_OVERLAP "overlap-check"
#define QCOW2_OPT_OVERLAP_TEMPLATE "overlap-check.template"
#define QCOW2_OPT_OVERLAP_MAIN_HEADER "overlap-check.main-header"
#define QCOW2_OPT_OVERLAP_ACTIVE_L1 "overlap-check.active-l1"
#define QCOW2_OPT_OVERLAP_ACTIVE_L2 "overlap-check.active-l2"
@@ -85,6 +90,9 @@
#define QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE "overlap-check.snapshot-table"
#define QCOW2_OPT_OVERLAP_INACTIVE_L1 "overlap-check.inactive-l1"
#define QCOW2_OPT_OVERLAP_INACTIVE_L2 "overlap-check.inactive-l2"
#define QCOW2_OPT_CACHE_SIZE "cache-size"
#define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size"
#define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size"
typedef struct QCowHeader {
uint32_t magic;
@@ -213,6 +221,8 @@ typedef struct BDRVQcowState {
int l2_size;
int l1_size;
int l1_vm_state_index;
int refcount_block_bits;
int refcount_block_size;
int csize_shift;
int csize_mask;
uint64_t cluster_offset_mask;
@@ -252,6 +262,7 @@ typedef struct BDRVQcowState {
bool discard_passthrough[QCOW2_DISCARD_MAX];
int overlap_check; /* bitmask of Qcow2MetadataOverlap values */
bool signaled_corruption;
uint64_t incompatible_features;
uint64_t compatible_features;
@@ -468,10 +479,16 @@ int qcow2_mark_corrupt(BlockDriverState *bs);
int qcow2_mark_consistent(BlockDriverState *bs);
int qcow2_update_header(BlockDriverState *bs);
void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
int64_t size, const char *message_format, ...)
GCC_FMT_ATTR(5, 6);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index);
int qcow2_update_cluster_refcount(BlockDriverState *bs, int64_t cluster_index,
int addend, enum qcow2_discard_type type);
@@ -519,10 +536,11 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors, enum qcow2_discard_type type);
int nb_sectors, enum qcow2_discard_type type, bool full_discard);
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
int qcow2_expand_zero_clusters(BlockDriverState *bs);
int qcow2_expand_zero_clusters(BlockDriverState *bs,
BlockDriverAmendStatusCB *status_cb);
/* qcow2-snapshot.c functions */
int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);

View File

@@ -227,8 +227,10 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
};
int ret;
check.used_clusters = g_malloc0(((check.nclusters + 31) / 32) *
sizeof(check.used_clusters[0]));
check.used_clusters = g_try_new0(uint32_t, (check.nclusters + 31) / 32);
if (check.nclusters && check.used_clusters == NULL) {
return -ENOMEM;
}
check.result->bfi.total_clusters =
(s->header.image_size + s->header.cluster_size - 1) /

View File

@@ -13,7 +13,7 @@
#include "qed.h"
void *gencb_alloc(size_t len, BlockDriverCompletionFunc *cb, void *opaque)
void *gencb_alloc(size_t len, BlockCompletionFunc *cb, void *opaque)
{
GenericCB *gencb = g_malloc(len);
gencb->cb = cb;
@@ -24,7 +24,7 @@ void *gencb_alloc(size_t len, BlockDriverCompletionFunc *cb, void *opaque)
void gencb_complete(void *opaque, int ret)
{
GenericCB *gencb = opaque;
BlockDriverCompletionFunc *cb = gencb->cb;
BlockCompletionFunc *cb = gencb->cb;
void *user_opaque = gencb->opaque;
g_free(gencb);

View File

@@ -49,7 +49,7 @@ out:
}
static void qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
QEDReadTableCB *read_table_cb = gencb_alloc(sizeof(*read_table_cb),
cb, opaque);
@@ -119,7 +119,7 @@ out:
*/
static void qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
unsigned int index, unsigned int n, bool flush,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
QEDWriteTableCB *write_table_cb;
unsigned int sector_mask = BDRV_SECTOR_SIZE / sizeof(uint64_t) - 1;
@@ -180,7 +180,7 @@ int qed_read_l1_table_sync(BDRVQEDState *s)
}
void qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BLKDBG_EVENT(s->bs->file, BLKDBG_L1_UPDATE);
qed_write_table(s, s->header.l1_table_offset,
@@ -235,7 +235,7 @@ static void qed_read_l2_table_cb(void *opaque, int ret)
}
void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
QEDReadL2TableCB *read_l2_table_cb;
@@ -275,7 +275,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BLKDBG_EVENT(s->bs->file, BLKDBG_L2_UPDATE);
qed_write_table(s, request->l2_table->offset,

View File

@@ -18,22 +18,8 @@
#include "qapi/qmp/qerror.h"
#include "migration/migration.h"
static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEDAIOCB *acb = (QEDAIOCB *)blockacb;
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
bool finished = false;
/* Wait for the request to finish */
acb->finished = &finished;
while (!finished) {
aio_poll(aio_context, true);
}
}
static const AIOCBInfo qed_aiocb_info = {
.aiocb_size = sizeof(QEDAIOCB),
.cancel = qed_aio_cancel,
};
static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
@@ -144,7 +130,7 @@ static void qed_write_header_read_cb(void *opaque, int ret)
* This function only updates known header fields in-place and does not affect
* extra data after the QED header.
*/
static void qed_write_header(BDRVQEDState *s, BlockDriverCompletionFunc cb,
static void qed_write_header(BDRVQEDState *s, BlockCompletionFunc cb,
void *opaque)
{
/* We must write full sectors for O_DIRECT but cannot necessarily generate
@@ -422,7 +408,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
snprintf(buf, sizeof(buf), "%" PRIx64,
s->header.features & ~QED_FEATURE_MASK);
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "QED", buf);
bdrv_get_device_name(bs), "QED", buf);
return -ENOTSUP;
}
if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
@@ -648,7 +634,8 @@ static int bdrv_qed_create(const char *filename, QemuOpts *opts, Error **errp)
char *backing_fmt = NULL;
int ret;
image_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
cluster_size = qemu_opt_get_size_del(opts,
@@ -772,7 +759,7 @@ static BDRVQEDState *acb_to_s(QEDAIOCB *acb)
static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
QEMUIOVector *qiov,
QEMUIOVector **backing_qiov,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
uint64_t backing_length = 0;
size_t size;
@@ -864,7 +851,7 @@ static void qed_copy_from_backing_file_write(void *opaque, int ret)
*/
static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
uint64_t len, uint64_t offset,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque)
{
CopyFromBackingFileCB *copy_cb;
@@ -915,21 +902,15 @@ static void qed_update_l2_table(BDRVQEDState *s, QEDTable *table, int index,
static void qed_aio_complete_bh(void *opaque)
{
QEDAIOCB *acb = opaque;
BlockDriverCompletionFunc *cb = acb->common.cb;
BlockCompletionFunc *cb = acb->common.cb;
void *user_opaque = acb->common.opaque;
int ret = acb->bh_ret;
bool *finished = acb->finished;
qemu_bh_delete(acb->bh);
qemu_aio_release(acb);
qemu_aio_unref(acb);
/* Invoke callback */
cb(user_opaque, ret);
/* Signal cancel completion */
if (finished) {
*finished = true;
}
}
static void qed_aio_complete(QEDAIOCB *acb, int ret)
@@ -1083,7 +1064,7 @@ static void qed_aio_write_main(void *opaque, int ret)
BDRVQEDState *s = acb_to_s(acb);
uint64_t offset = acb->cur_cluster +
qed_offset_into_cluster(s, acb->cur_pos);
BlockDriverCompletionFunc *next_fn;
BlockCompletionFunc *next_fn;
trace_qed_aio_write_main(s, acb, ret, offset, acb->cur_qiov.size);
@@ -1183,7 +1164,7 @@ static void qed_aio_write_zero_cluster(void *opaque, int ret)
static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
{
BDRVQEDState *s = acb_to_s(acb);
BlockDriverCompletionFunc *cb;
BlockCompletionFunc *cb;
/* Cancel timer when the first allocating request comes in */
if (QSIMPLEQ_EMPTY(&s->allocating_write_reqs)) {
@@ -1240,7 +1221,11 @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
struct iovec *iov = acb->qiov->iov;
if (!iov->iov_base) {
iov->iov_base = qemu_blockalign(acb->common.bs, iov->iov_len);
iov->iov_base = qemu_try_blockalign(acb->common.bs, iov->iov_len);
if (iov->iov_base == NULL) {
qed_aio_complete(acb, -ENOMEM);
return;
}
memset(iov->iov_base, 0, iov->iov_len);
}
}
@@ -1380,11 +1365,11 @@ static void qed_aio_next_io(void *opaque, int ret)
io_fn, acb);
}
static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque, int flags)
static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb,
void *opaque, int flags)
{
QEDAIOCB *acb = qemu_aio_get(&qed_aiocb_info, bs, cb, opaque);
@@ -1392,7 +1377,6 @@ static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
opaque, flags);
acb->flags = flags;
acb->finished = NULL;
acb->qiov = qiov;
acb->qiov_offset = 0;
acb->cur_pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
@@ -1406,20 +1390,20 @@ static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
return &acb->common;
}
static BlockDriverAIOCB *bdrv_qed_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *bdrv_qed_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
}
static BlockDriverAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb,
opaque, QED_AIOCB_WRITE);
@@ -1447,7 +1431,7 @@ static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
int nb_sectors,
BdrvRequestFlags flags)
{
BlockDriverAIOCB *blockacb;
BlockAIOCB *blockacb;
BDRVQEDState *s = bs->opaque;
QEDWriteZeroesCB cb = { .done = false };
QEMUIOVector qiov;

View File

@@ -128,7 +128,7 @@ enum {
};
typedef struct QEDAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUBH *bh;
int bh_ret; /* final return status for completion bh */
QSIMPLEQ_ENTRY(QEDAIOCB) next; /* next request */
@@ -203,11 +203,11 @@ typedef void QEDFindClusterFunc(void *opaque, int ret, uint64_t offset, size_t l
* Generic callback for chaining async callbacks
*/
typedef struct {
BlockDriverCompletionFunc *cb;
BlockCompletionFunc *cb;
void *opaque;
} GenericCB;
void *gencb_alloc(size_t len, BlockDriverCompletionFunc *cb, void *opaque);
void *gencb_alloc(size_t len, BlockCompletionFunc *cb, void *opaque);
void gencb_complete(void *opaque, int ret);
/**
@@ -230,16 +230,16 @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table);
*/
int qed_read_l1_table_sync(BDRVQEDState *s);
void qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n,
BlockDriverCompletionFunc *cb, void *opaque);
BlockCompletionFunc *cb, void *opaque);
int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
unsigned int n);
int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
uint64_t offset);
void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
BlockDriverCompletionFunc *cb, void *opaque);
BlockCompletionFunc *cb, void *opaque);
void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush,
BlockDriverCompletionFunc *cb, void *opaque);
BlockCompletionFunc *cb, void *opaque);
int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush);

View File

@@ -16,7 +16,12 @@
#include <gnutls/gnutls.h>
#include <gnutls/crypto.h>
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qlist.h"
#include "qapi/qmp/qstring.h"
#include "qapi-event.h"
#define HASH_LENGTH 32
@@ -24,6 +29,7 @@
#define QUORUM_OPT_VOTE_THRESHOLD "vote-threshold"
#define QUORUM_OPT_BLKVERIFY "blkverify"
#define QUORUM_OPT_REWRITE "rewrite-corrupted"
#define QUORUM_OPT_READ_PATTERN "read-pattern"
/* This union holds a vote hash value */
typedef union QuorumVoteValue {
@@ -74,6 +80,8 @@ typedef struct BDRVQuorumState {
bool rewrite_corrupted;/* true if the driver must rewrite-on-read corrupted
* block if Quorum is reached.
*/
QuorumReadPattern read_pattern;
} BDRVQuorumState;
typedef struct QuorumAIOCB QuorumAIOCB;
@@ -84,7 +92,7 @@ typedef struct QuorumAIOCB QuorumAIOCB;
* $children_count QuorumChildRequest.
*/
typedef struct QuorumChildRequest {
BlockDriverAIOCB *aiocb;
BlockAIOCB *aiocb;
QEMUIOVector qiov;
uint8_t *buf;
int ret;
@@ -97,7 +105,7 @@ typedef struct QuorumChildRequest {
* used to do operations on each children and track overall progress.
*/
struct QuorumAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
/* Request metadata */
uint64_t sector_num;
@@ -117,11 +125,12 @@ struct QuorumAIOCB {
bool is_read;
int vote_ret;
int child_iter; /* which child to read in fifo pattern */
};
static bool quorum_vote(QuorumAIOCB *acb);
static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
static void quorum_aio_cancel(BlockAIOCB *blockacb)
{
QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common);
BDRVQuorumState *s = acb->common.bs->opaque;
@@ -129,21 +138,19 @@ static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
/* cancel all callbacks */
for (i = 0; i < s->num_children; i++) {
bdrv_aio_cancel(acb->qcrs[i].aiocb);
if (acb->qcrs[i].aiocb) {
bdrv_aio_cancel_async(acb->qcrs[i].aiocb);
}
}
g_free(acb->qcrs);
qemu_aio_release(acb);
}
static AIOCBInfo quorum_aiocb_info = {
.aiocb_size = sizeof(QuorumAIOCB),
.cancel = quorum_aio_cancel,
.cancel_async = quorum_aio_cancel,
};
static void quorum_aio_finalize(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
int i, ret = 0;
if (acb->vote_ret) {
@@ -153,14 +160,15 @@ static void quorum_aio_finalize(QuorumAIOCB *acb)
acb->common.cb(acb->common.opaque, ret);
if (acb->is_read) {
for (i = 0; i < s->num_children; i++) {
/* on the quorum case acb->child_iter == s->num_children - 1 */
for (i = 0; i <= acb->child_iter; i++) {
qemu_vfree(acb->qcrs[i].buf);
qemu_iovec_destroy(&acb->qcrs[i].qiov);
}
}
g_free(acb->qcrs);
qemu_aio_release(acb);
qemu_aio_unref(acb);
}
static bool quorum_sha256_compare(QuorumVoteValue *a, QuorumVoteValue *b)
@@ -178,7 +186,7 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
QEMUIOVector *qiov,
uint64_t sector_num,
int nb_sectors,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque)
{
QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
@@ -218,8 +226,8 @@ static void quorum_report_bad(QuorumAIOCB *acb, char *node_name, int ret)
static void quorum_report_failure(QuorumAIOCB *acb)
{
const char *reference = acb->common.bs->device_name[0] ?
acb->common.bs->device_name :
const char *reference = bdrv_get_device_name(acb->common.bs)[0] ?
bdrv_get_device_name(acb->common.bs) :
acb->common.bs->node_name;
qapi_event_send_quorum_failure(reference, acb->sector_num,
@@ -256,6 +264,21 @@ static void quorum_rewrite_aio_cb(void *opaque, int ret)
quorum_aio_finalize(acb);
}
static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb);
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
{
int i;
assert(dest->niov == source->niov);
assert(dest->size == source->size);
for (i = 0; i < source->niov; i++) {
assert(dest->iov[i].iov_len == source->iov[i].iov_len);
memcpy(dest->iov[i].iov_base,
source->iov[i].iov_base,
source->iov[i].iov_len);
}
}
static void quorum_aio_cb(void *opaque, int ret)
{
QuorumChildRequest *sacb = opaque;
@@ -263,6 +286,21 @@ static void quorum_aio_cb(void *opaque, int ret)
BDRVQuorumState *s = acb->common.bs->opaque;
bool rewrite = false;
if (acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO) {
/* We try to read next child in FIFO order if we fail to read */
if (ret < 0 && ++acb->child_iter < s->num_children) {
read_fifo_child(acb);
return;
}
if (ret == 0) {
quorum_copy_qiov(acb->qiov, &acb->qcrs[acb->child_iter].qiov);
}
acb->vote_ret = ret;
quorum_aio_finalize(acb);
return;
}
sacb->ret = ret;
acb->count++;
if (ret == 0) {
@@ -343,19 +381,6 @@ static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
return count;
}
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
{
int i;
assert(dest->niov == source->niov);
assert(dest->size == source->size);
for (i = 0; i < source->niov; i++) {
assert(dest->iov[i].iov_len == source->iov[i].iov_len);
memcpy(dest->iov[i].iov_base,
source->iov[i].iov_base,
source->iov[i].iov_len);
}
}
static void quorum_count_vote(QuorumVotes *votes,
QuorumVoteValue *value,
int index)
@@ -615,40 +640,68 @@ free_exit:
return rewrite;
}
static BlockDriverAIOCB *quorum_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *read_quorum_children(QuorumAIOCB *acb)
{
BDRVQuorumState *s = bs->opaque;
QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num,
nb_sectors, cb, opaque);
BDRVQuorumState *s = acb->common.bs->opaque;
int i;
acb->is_read = true;
for (i = 0; i < s->num_children; i++) {
acb->qcrs[i].buf = qemu_blockalign(s->bs[i], qiov->size);
qemu_iovec_init(&acb->qcrs[i].qiov, qiov->niov);
qemu_iovec_clone(&acb->qcrs[i].qiov, qiov, acb->qcrs[i].buf);
acb->qcrs[i].buf = qemu_blockalign(s->bs[i], acb->qiov->size);
qemu_iovec_init(&acb->qcrs[i].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[i].qiov, acb->qiov, acb->qcrs[i].buf);
}
for (i = 0; i < s->num_children; i++) {
bdrv_aio_readv(s->bs[i], sector_num, &acb->qcrs[i].qiov, nb_sectors,
quorum_aio_cb, &acb->qcrs[i]);
bdrv_aio_readv(s->bs[i], acb->sector_num, &acb->qcrs[i].qiov,
acb->nb_sectors, quorum_aio_cb, &acb->qcrs[i]);
}
return &acb->common;
}
static BlockDriverAIOCB *quorum_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
acb->qcrs[acb->child_iter].buf = qemu_blockalign(s->bs[acb->child_iter],
acb->qiov->size);
qemu_iovec_init(&acb->qcrs[acb->child_iter].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[acb->child_iter].qiov, acb->qiov,
acb->qcrs[acb->child_iter].buf);
bdrv_aio_readv(s->bs[acb->child_iter], acb->sector_num,
&acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
quorum_aio_cb, &acb->qcrs[acb->child_iter]);
return &acb->common;
}
static BlockAIOCB *quorum_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
BDRVQuorumState *s = bs->opaque;
QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num,
nb_sectors, cb, opaque);
acb->is_read = true;
if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
acb->child_iter = s->num_children - 1;
return read_quorum_children(acb);
}
acb->child_iter = 0;
return read_fifo_child(acb);
}
static BlockAIOCB *quorum_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
BDRVQuorumState *s = bs->opaque;
QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num, nb_sectors,
@@ -782,16 +835,39 @@ static QemuOptsList quorum_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Rewrite corrupted block on read quorum",
},
{
.name = QUORUM_OPT_READ_PATTERN,
.type = QEMU_OPT_STRING,
.help = "Allowed pattern: quorum, fifo. Quorum is default",
},
{ /* end of list */ }
},
};
static int parse_read_pattern(const char *opt)
{
int i;
if (!opt) {
/* Set quorum as default */
return QUORUM_READ_PATTERN_QUORUM;
}
for (i = 0; i < QUORUM_READ_PATTERN_MAX; i++) {
if (!strcmp(opt, QuorumReadPattern_lookup[i])) {
return i;
}
}
return -EINVAL;
}
static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVQuorumState *s = bs->opaque;
Error *local_err = NULL;
QemuOpts *opts;
QemuOpts *opts = NULL;
bool *opened;
QDict *sub = NULL;
QList *list = NULL;
@@ -827,28 +903,37 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
}
s->threshold = qemu_opt_get_number(opts, QUORUM_OPT_VOTE_THRESHOLD, 0);
/* and validate it against s->num_children */
ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
ret = parse_read_pattern(qemu_opt_get(opts, QUORUM_OPT_READ_PATTERN));
if (ret < 0) {
error_setg(&local_err, "Please set read-pattern as fifo or quorum");
goto exit;
}
s->read_pattern = ret;
/* is the driver in blkverify mode */
if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
s->num_children == 2 && s->threshold == 2) {
s->is_blkverify = true;
} else if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false)) {
fprintf(stderr, "blkverify mode is set by setting blkverify=on "
"and using two files with vote_threshold=2\n");
}
if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
/* and validate it against s->num_children */
ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
if (ret < 0) {
goto exit;
}
s->rewrite_corrupted = qemu_opt_get_bool(opts, QUORUM_OPT_REWRITE, false);
if (s->rewrite_corrupted && s->is_blkverify) {
error_setg(&local_err,
"rewrite-corrupted=on cannot be used with blkverify=on");
ret = -EINVAL;
goto exit;
/* is the driver in blkverify mode */
if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
s->num_children == 2 && s->threshold == 2) {
s->is_blkverify = true;
} else if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false)) {
fprintf(stderr, "blkverify mode is set by setting blkverify=on "
"and using two files with vote_threshold=2\n");
}
s->rewrite_corrupted = qemu_opt_get_bool(opts, QUORUM_OPT_REWRITE,
false);
if (s->rewrite_corrupted && s->is_blkverify) {
error_setg(&local_err,
"rewrite-corrupted=on cannot be used with blkverify=on");
ret = -EINVAL;
goto exit;
}
}
/* allocate the children BlockDriverState array */
@@ -903,6 +988,7 @@ close_exit:
g_free(s->bs);
g_free(opened);
exit:
qemu_opts_del(opts);
/* propagate error */
if (local_err) {
error_propagate(errp, local_err);
@@ -945,6 +1031,39 @@ static void quorum_attach_aio_context(BlockDriverState *bs,
}
}
static void quorum_refresh_filename(BlockDriverState *bs)
{
BDRVQuorumState *s = bs->opaque;
QDict *opts;
QList *children;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_refresh_filename(s->bs[i]);
if (!s->bs[i]->full_open_options) {
return;
}
}
children = qlist_new();
for (i = 0; i < s->num_children; i++) {
QINCREF(s->bs[i]->full_open_options);
qlist_append_obj(children, QOBJECT(s->bs[i]->full_open_options));
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("quorum")));
qdict_put_obj(opts, QUORUM_OPT_VOTE_THRESHOLD,
QOBJECT(qint_from_int(s->threshold)));
qdict_put_obj(opts, QUORUM_OPT_BLKVERIFY,
QOBJECT(qbool_from_int(s->is_blkverify)));
qdict_put_obj(opts, QUORUM_OPT_REWRITE,
QOBJECT(qbool_from_int(s->rewrite_corrupted)));
qdict_put_obj(opts, "children", QOBJECT(children));
bs->full_open_options = opts;
}
static BlockDriver bdrv_quorum = {
.format_name = "quorum",
.protocol_name = "quorum",
@@ -953,6 +1072,7 @@ static BlockDriver bdrv_quorum = {
.bdrv_file_open = quorum_open,
.bdrv_close = quorum_close,
.bdrv_refresh_filename = quorum_refresh_filename,
.bdrv_co_flush_to_disk = quorum_co_flush,

View File

@@ -35,9 +35,9 @@
#ifdef CONFIG_LINUX_AIO
void *laio_init(void);
void laio_cleanup(void *s);
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type);
BlockCompletionFunc *cb, void *opaque, int type);
void laio_detach_aio_context(void *s, AioContext *old_context);
void laio_attach_aio_context(void *s, AioContext *new_context);
void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
@@ -49,10 +49,10 @@ typedef struct QEMUWin32AIOState QEMUWin32AIOState;
QEMUWin32AIOState *win32_aio_init(void);
void win32_aio_cleanup(QEMUWin32AIOState *aio);
int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile);
BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
QEMUWin32AIOState *aio, HANDLE hfile,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type);
BlockCompletionFunc *cb, void *opaque, int type);
void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
AioContext *old_context);
void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,

View File

@@ -30,6 +30,7 @@
#include "block/thread-pool.h"
#include "qemu/iov.h"
#include "raw-aio.h"
#include "qapi/util.h"
#if defined(__APPLE__) && (__MACH__)
#include <paths.h>
@@ -59,9 +60,6 @@
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#ifdef CONFIG_FIEMAP
#include <linux/fiemap.h>
#endif
#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
#include <linux/falloc.h>
#endif
@@ -149,9 +147,7 @@ typedef struct BDRVRawState {
bool has_discard:1;
bool has_write_zeroes:1;
bool discard_zeroes:1;
#ifdef CONFIG_FIEMAP
bool skip_fiemap;
#endif
bool needs_alignment;
} BDRVRawState;
typedef struct BDRVRawReopenState {
@@ -229,7 +225,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
/* For /dev/sg devices the alignment is not really used.
With buffered I/O, we don't have any restrictions. */
if (bs->sg || !(s->open_flags & O_DIRECT)) {
if (bs->sg || !s->needs_alignment) {
bs->request_alignment = 1;
s->buf_align = 1;
return;
@@ -445,6 +441,9 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
s->has_discard = true;
s->has_write_zeroes = true;
if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
s->needs_alignment = true;
}
if (fstat(s->fd, &st) < 0) {
error_setg_errno(errp, errno, "Could not stat file");
@@ -471,6 +470,17 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
}
#endif
}
#ifdef __FreeBSD__
if (S_ISCHR(st.st_mode)) {
/*
* The file is a char device (disk), which on FreeBSD isn't behind
* a pager, so force all requests to be aligned. This is needed
* so QEMU makes sure all IO operations on the device are aligned
* to sector size, or else FreeBSD will reject them with EINVAL.
*/
s->needs_alignment = true;
}
#endif
#ifdef CONFIG_XFS
if (platform_test_xfs_fd(s->fd)) {
@@ -517,7 +527,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
s = state->bs->opaque;
state->opaque = g_malloc0(sizeof(BDRVRawReopenState));
state->opaque = g_new0(BDRVRawReopenState, 1);
raw_s = state->opaque;
#ifdef CONFIG_LINUX_AIO
@@ -807,7 +817,11 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
* Ok, we have to do it the hard way, copy all segments into
* a single aligned buffer.
*/
buf = qemu_blockalign(aiocb->bs, aiocb->aio_nbytes);
buf = qemu_try_blockalign(aiocb->bs, aiocb->aio_nbytes);
if (buf == NULL) {
return -ENOMEM;
}
if (aiocb->aio_type & QEMU_AIO_WRITE) {
char *p = buf;
int i;
@@ -1036,9 +1050,9 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
return thread_pool_submit_co(pool, aio_worker, acb);
}
static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
BlockCompletionFunc *cb, void *opaque, int type)
{
RawPosixAIOData *acb = g_slice_new(RawPosixAIOData);
ThreadPool *pool;
@@ -1061,9 +1075,9 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
}
static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
static BlockAIOCB *raw_aio_submit(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
BlockCompletionFunc *cb, void *opaque, int type)
{
BDRVRawState *s = bs->opaque;
@@ -1071,11 +1085,12 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
return NULL;
/*
* If O_DIRECT is used the buffer needs to be aligned on a sector
* boundary. Check if this is the case or tell the low-level
* driver that it needs to copy the buffer.
* Check if the underlying device requires requests to be aligned,
* and if the request we are trying to submit is aligned or not.
* If this is the case tell the low-level driver that it needs
* to copy the buffer.
*/
if ((bs->open_flags & BDRV_O_NOCACHE)) {
if (s->needs_alignment) {
if (!bdrv_qiov_is_aligned(bs, qiov)) {
type |= QEMU_AIO_MISALIGNED;
#ifdef CONFIG_LINUX_AIO
@@ -1120,24 +1135,24 @@ static void raw_aio_flush_io_queue(BlockDriverState *bs)
#endif
}
static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
return raw_aio_submit(bs, sector_num, qiov, nb_sectors,
cb, opaque, QEMU_AIO_READ);
}
static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
return raw_aio_submit(bs, sector_num, qiov, nb_sectors,
cb, opaque, QEMU_AIO_WRITE);
}
static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
@@ -1361,128 +1376,199 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
int result = 0;
int64_t total_size = 0;
bool nocow = false;
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
strstart(filename, "file:", &filename);
/* Read out options */
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
&local_err);
g_free(buf);
if (local_err) {
error_propagate(errp, local_err);
result = -EINVAL;
goto out;
}
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not create file");
} else {
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
* will be ignored since any failure of this operation should not
* block the left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
}
if (qemu_close(fd) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
}
goto out;
}
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
* will be ignored since any failure of this operation should not
* block the left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
if (ftruncate(fd, total_size) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
goto out_close;
}
switch (prealloc) {
#ifdef CONFIG_POSIX_FALLOCATE
case PREALLOC_MODE_FALLOC:
/* posix_fallocate() doesn't set errno. */
result = -posix_fallocate(fd, 0, total_size);
if (result != 0) {
error_setg_errno(errp, -result,
"Could not preallocate data for the new file");
}
break;
#endif
case PREALLOC_MODE_FULL:
{
int64_t num = 0, left = total_size;
buf = g_malloc0(65536);
while (left > 0) {
num = MIN(left, 65536);
result = write(fd, buf, num);
if (result < 0) {
result = -errno;
error_setg_errno(errp, -result,
"Could not write to the new file");
break;
}
left -= result;
}
if (result >= 0) {
result = fsync(fd);
if (result < 0) {
result = -errno;
error_setg_errno(errp, -result,
"Could not flush new file to disk");
}
}
g_free(buf);
break;
}
case PREALLOC_MODE_OFF:
break;
default:
result = -EINVAL;
error_setg(errp, "Unsupported preallocation mode: %s",
PreallocMode_lookup[prealloc]);
break;
}
out_close:
if (qemu_close(fd) != 0 && result == 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
}
out:
return result;
}
static int64_t try_fiemap(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int nb_sectors, int *pnum)
{
#ifdef CONFIG_FIEMAP
BDRVRawState *s = bs->opaque;
int64_t ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
struct {
struct fiemap fm;
struct fiemap_extent fe;
} f;
if (s->skip_fiemap) {
return -ENOTSUP;
}
f.fm.fm_start = start;
f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
f.fm.fm_flags = 0;
f.fm.fm_extent_count = 1;
f.fm.fm_reserved = 0;
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
s->skip_fiemap = true;
return -errno;
}
if (f.fm.fm_mapped_extents == 0) {
/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length.
* f.fm.fm_start + f.fm.fm_length must be clamped to the file size!
*/
off_t length = lseek(s->fd, 0, SEEK_END);
*hole = f.fm.fm_start;
*data = MIN(f.fm.fm_start + f.fm.fm_length, length);
} else {
*data = f.fe.fe_logical;
*hole = f.fe.fe_logical + f.fe.fe_length;
if (f.fe.fe_flags & FIEMAP_EXTENT_UNWRITTEN) {
ret |= BDRV_BLOCK_ZERO;
}
}
return ret;
#else
return -ENOTSUP;
#endif
}
static int64_t try_seek_hole(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int *pnum)
/*
* Find allocation range in @bs around offset @start.
* May change underlying file descriptor's file offset.
* If @start is not in a hole, store @start in @data, and the
* beginning of the next hole in @hole, and return 0.
* If @start is in a non-trailing hole, store @start in @hole and the
* beginning of the next non-hole in @data, and return 0.
* If @start is in a trailing hole or beyond EOF, return -ENXIO.
* If we can't find out, return a negative errno other than -ENXIO.
*/
static int find_allocation(BlockDriverState *bs, off_t start,
off_t *data, off_t *hole)
{
#if defined SEEK_HOLE && defined SEEK_DATA
BDRVRawState *s = bs->opaque;
off_t offs;
*hole = lseek(s->fd, start, SEEK_HOLE);
if (*hole == -1) {
/* -ENXIO indicates that sector_num was past the end of the file.
* There is a virtual hole there. */
assert(errno != -ENXIO);
/*
* SEEK_DATA cases:
* D1. offs == start: start is in data
* D2. offs > start: start is in a hole, next data at offs
* D3. offs < 0, errno = ENXIO: either start is in a trailing hole
* or start is beyond EOF
* If the latter happens, the file has been truncated behind
* our back since we opened it. All bets are off then.
* Treating like a trailing hole is simplest.
* D4. offs < 0, errno != ENXIO: we learned nothing
*/
offs = lseek(s->fd, start, SEEK_DATA);
if (offs < 0) {
return -errno; /* D3 or D4 */
}
assert(offs >= start);
return -errno;
if (offs > start) {
/* D2: in hole, next data at offs */
*hole = start;
*data = offs;
return 0;
}
if (*hole > start) {
/* D1: in data, end not yet known */
/*
* SEEK_HOLE cases:
* H1. offs == start: start is in a hole
* If this happens here, a hole has been dug behind our back
* since the previous lseek().
* H2. offs > start: either start is in data, next hole at offs,
* or start is in trailing hole, EOF at offs
* Linux treats trailing holes like any other hole: offs ==
* start. Solaris seeks to EOF instead: offs > start (blech).
* If that happens here, a hole has been dug behind our back
* since the previous lseek().
* H3. offs < 0, errno = ENXIO: start is beyond EOF
* If this happens, the file has been truncated behind our
* back since we opened it. Treat it like a trailing hole.
* H4. offs < 0, errno != ENXIO: we learned nothing
* Pretend we know nothing at all, i.e. "forget" about D1.
*/
offs = lseek(s->fd, start, SEEK_HOLE);
if (offs < 0) {
return -errno; /* D1 and (H3 or H4) */
}
assert(offs >= start);
if (offs > start) {
/*
* D1 and H2: either in data, next hole at offs, or it was in
* data but is now in a trailing hole. In the latter case,
* all bets are off. Treating it as if it there was data all
* the way to EOF is safe, so simply do that.
*/
*data = start;
} else {
/* On a hole. We need another syscall to find its end. */
*data = lseek(s->fd, start, SEEK_DATA);
if (*data == -1) {
*data = lseek(s->fd, 0, SEEK_END);
}
*hole = offs;
return 0;
}
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
/* D1 and H1 */
return -EBUSY;
#else
return -ENOTSUP;
#endif
}
/*
* Returns true iff the specified sector is present in the disk image. Drivers
* not implementing the functionality are assumed to not support backing files,
* hence all their sectors are reported as allocated.
* Returns the allocation status of the specified sectors.
*
* If 'sector_num' is beyond the end of the disk image the return value is 0
* and 'pnum' is set to 0.
@@ -1499,7 +1585,8 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
int nb_sectors, int *pnum)
{
off_t start, data = 0, hole = 0;
int64_t ret;
int64_t total_size;
int ret;
ret = fd_open(bs);
if (ret < 0) {
@@ -1507,34 +1594,41 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
}
start = sector_num * BDRV_SECTOR_SIZE;
ret = try_fiemap(bs, start, &data, &hole, nb_sectors, pnum);
if (ret < 0) {
ret = try_seek_hole(bs, start, &data, &hole, pnum);
if (ret < 0) {
/* Assume everything is allocated. */
data = 0;
hole = start + nb_sectors * BDRV_SECTOR_SIZE;
ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
}
total_size = bdrv_getlength(bs);
if (total_size < 0) {
return total_size;
} else if (start >= total_size) {
*pnum = 0;
return 0;
} else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
}
if (data <= start) {
ret = find_allocation(bs, start, &data, &hole);
if (ret == -ENXIO) {
/* Trailing hole */
*pnum = nb_sectors;
ret = BDRV_BLOCK_ZERO;
} else if (ret < 0) {
/* No info available, so pretend there are no holes */
*pnum = nb_sectors;
ret = BDRV_BLOCK_DATA;
} else if (data == start) {
/* On a data extent, compute sectors to the end of the extent. */
*pnum = MIN(nb_sectors, (hole - start) / BDRV_SECTOR_SIZE);
ret = BDRV_BLOCK_DATA;
} else {
/* On a hole, compute sectors to the beginning of the next extent. */
assert(hole == start);
*pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
ret &= ~BDRV_BLOCK_DATA;
ret |= BDRV_BLOCK_ZERO;
ret = BDRV_BLOCK_ZERO;
}
return ret;
return ret | BDRV_BLOCK_OFFSET_VALID | start;
}
static coroutine_fn BlockDriverAIOCB *raw_aio_discard(BlockDriverState *bs,
static coroutine_fn BlockAIOCB *raw_aio_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
@@ -1581,6 +1675,11 @@ static QemuOptsList raw_create_opts = {
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, falloc, full)"
},
{ /* end of list */ }
}
};
@@ -1867,9 +1966,9 @@ static int hdev_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
return ioctl(s->fd, req, buf);
}
static BlockDriverAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
static BlockAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
RawPosixAIOData *acb;
@@ -1908,9 +2007,9 @@ static int fd_open(BlockDriverState *bs)
#endif /* !linux && !FreeBSD */
static coroutine_fn BlockDriverAIOCB *hdev_aio_discard(BlockDriverState *bs,
static coroutine_fn BlockAIOCB *hdev_aio_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
@@ -1962,8 +2061,8 @@ static int hdev_create(const char *filename, QemuOpts *opts,
(void)has_prefix;
/* Read out options */
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
fd = qemu_open(filename, O_WRONLY | O_BINARY);
if (fd < 0) {
@@ -1979,7 +2078,7 @@ static int hdev_create(const char *filename, QemuOpts *opts,
error_setg(errp,
"The given file is neither a block nor a character device");
ret = -ENODEV;
} else if (lseek(fd, 0, SEEK_END) < total_size * BDRV_SECTOR_SIZE) {
} else if (lseek(fd, 0, SEEK_END) < total_size) {
error_setg(errp, "Device is too small");
ret = -ENOSPC;
}

View File

@@ -138,9 +138,9 @@ static int aio_worker(void *arg)
return ret;
}
static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
BlockCompletionFunc *cb, void *opaque, int type)
{
RawWin32AIOData *acb = g_slice_new(RawWin32AIOData);
ThreadPool *pool;
@@ -369,9 +369,9 @@ fail:
return ret;
}
static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
@@ -383,9 +383,9 @@ static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
}
}
static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
@@ -397,8 +397,8 @@ static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
}
}
static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
return paio_submit(bs, s->hfile, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
@@ -511,8 +511,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
strstart(filename, "file:", &filename);
/* Read out options */
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
@@ -521,7 +521,7 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size * 512);
ftruncate(fd, total_size);
qemu_close(fd);
return 0;
}

View File

@@ -129,10 +129,10 @@ static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
return bdrv_ioctl(bs->file, req, buf);
}
static BlockDriverAIOCB *raw_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockCompletionFunc *cb,
void *opaque)
{
return bdrv_aio_ioctl(bs->file, req, buf, cb, opaque);
}

View File

@@ -68,7 +68,7 @@ typedef enum {
} RBDAIOCmd;
typedef struct RBDAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUBH *bh;
int64_t ret;
QEMUIOVector *qiov;
@@ -77,7 +77,6 @@ typedef struct RBDAIOCB {
int64_t sector_num;
int error;
struct BDRVRBDState *s;
int cancelled;
int status;
} RBDAIOCB;
@@ -314,7 +313,8 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
}
/* Read out options */
bytes = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
objsize = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE, 0);
if (objsize) {
if ((objsize - 1) & objsize) { /* not a power of 2? */
@@ -407,9 +407,7 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
acb->status = 0;
if (!acb->cancelled) {
qemu_aio_release(acb);
}
qemu_aio_unref(acb);
}
/* TODO Convert to fine grained options */
@@ -538,25 +536,8 @@ static void qemu_rbd_close(BlockDriverState *bs)
rados_shutdown(s->cluster);
}
/*
* Cancel aio. Since we don't reference acb in a non qemu threads,
* it is safe to access it here.
*/
static void qemu_rbd_aio_cancel(BlockDriverAIOCB *blockacb)
{
RBDAIOCB *acb = (RBDAIOCB *) blockacb;
acb->cancelled = 1;
while (acb->status == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(acb->common.bs), true);
}
qemu_aio_release(acb);
}
static const AIOCBInfo rbd_aiocb_info = {
.aiocb_size = sizeof(RBDAIOCB),
.cancel = qemu_rbd_aio_cancel,
};
static void rbd_finish_bh(void *opaque)
@@ -608,16 +589,16 @@ static int rbd_aio_flush_wrapper(rbd_image_t image,
#endif
}
static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque,
RBDAIOCmd cmd)
static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque,
RBDAIOCmd cmd)
{
RBDAIOCB *acb;
RADOSCB *rcb;
RADOSCB *rcb = NULL;
rbd_completion_t c;
int64_t off, size;
char *buf;
@@ -631,12 +612,14 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
acb->bounce = NULL;
} else {
acb->bounce = qemu_blockalign(bs, qiov->size);
acb->bounce = qemu_try_blockalign(bs, qiov->size);
if (acb->bounce == NULL) {
goto failed;
}
}
acb->ret = 0;
acb->error = 0;
acb->s = s;
acb->cancelled = 0;
acb->bh = NULL;
acb->status = -EINPROGRESS;
@@ -649,7 +632,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
off = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
rcb = g_malloc(sizeof(RADOSCB));
rcb = g_new(RADOSCB, 1);
rcb->done = 0;
rcb->acb = acb;
rcb->buf = buf;
@@ -688,36 +671,36 @@ failed_completion:
failed:
g_free(rcb);
qemu_vfree(acb->bounce);
qemu_aio_release(acb);
qemu_aio_unref(acb);
return NULL;
}
static BlockDriverAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
RBD_AIO_READ);
}
static BlockDriverAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
RBD_AIO_WRITE);
}
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
static BlockDriverAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
}
@@ -859,7 +842,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
int max_snaps = RBD_MAX_SNAPS;
do {
snaps = g_malloc(sizeof(*snaps) * max_snaps);
snaps = g_new(rbd_snap_info_t, max_snaps);
snap_count = rbd_snap_list(s->image, snaps, &max_snaps);
if (snap_count <= 0) {
g_free(snaps);
@@ -870,7 +853,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
goto done;
}
sn_tab = g_malloc0(snap_count * sizeof(QEMUSnapshotInfo));
sn_tab = g_new0(QEMUSnapshotInfo, snap_count);
for (i = 0; i < snap_count; i++) {
const char *snap_name = snaps[i].name;
@@ -893,17 +876,29 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
}
#ifdef LIBRBD_SUPPORTS_DISCARD
static BlockDriverAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
static BlockAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, sector_num, NULL, nb_sectors, cb, opaque,
RBD_AIO_DISCARD);
}
#endif
#ifdef LIBRBD_SUPPORTS_INVALIDATE
static void qemu_rbd_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
int r = rbd_invalidate_cache(s->image);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to invalidate the cache");
}
}
#endif
static QemuOptsList qemu_rbd_create_opts = {
.name = "rbd-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_rbd_create_opts.head),
@@ -953,6 +948,9 @@ static BlockDriver bdrv_rbd = {
.bdrv_snapshot_delete = qemu_rbd_snap_remove,
.bdrv_snapshot_list = qemu_rbd_snap_list,
.bdrv_snapshot_goto = qemu_rbd_snap_rollback,
#ifdef LIBRBD_SUPPORTS_INVALIDATE
.bdrv_invalidate_cache = qemu_rbd_invalidate_cache,
#endif
};
static void bdrv_rbd_init(void)

View File

@@ -103,6 +103,9 @@
#define SD_INODE_SIZE (sizeof(SheepdogInode))
#define CURRENT_VDI_ID 0
#define LOCK_TYPE_NORMAL 0
#define LOCK_TYPE_SHARED 1 /* for iSCSI multipath */
typedef struct SheepdogReq {
uint8_t proto_ver;
uint8_t opcode;
@@ -166,7 +169,8 @@ typedef struct SheepdogVdiReq {
uint8_t copy_policy;
uint8_t reserved[2];
uint32_t snapid;
uint32_t pad[3];
uint32_t type;
uint32_t pad[2];
} SheepdogVdiReq;
typedef struct SheepdogVdiRsp {
@@ -297,7 +301,7 @@ enum AIOCBState {
};
struct SheepdogAIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
QEMUIOVector *qiov;
@@ -311,7 +315,6 @@ struct SheepdogAIOCB {
void (*aio_done_func)(SheepdogAIOCB *);
bool cancelable;
bool *finished;
int nr_pending;
};
@@ -442,10 +445,7 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
{
qemu_coroutine_enter(acb->coroutine, NULL);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
qemu_aio_unref(acb);
}
/*
@@ -473,41 +473,38 @@ static bool sd_acb_cancelable(const SheepdogAIOCB *acb)
return true;
}
static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
static void sd_aio_cancel(BlockAIOCB *blockacb)
{
SheepdogAIOCB *acb = (SheepdogAIOCB *)blockacb;
BDRVSheepdogState *s = acb->common.bs->opaque;
AIOReq *aioreq, *next;
bool finished = false;
acb->finished = &finished;
while (!finished) {
if (sd_acb_cancelable(acb)) {
/* Remove outstanding requests from pending and failed queues. */
QLIST_FOREACH_SAFE(aioreq, &s->pending_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
if (sd_acb_cancelable(acb)) {
/* Remove outstanding requests from pending and failed queues. */
QLIST_FOREACH_SAFE(aioreq, &s->pending_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
QLIST_FOREACH_SAFE(aioreq, &s->failed_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
}
assert(acb->nr_pending == 0);
sd_finish_aiocb(acb);
return;
}
aio_poll(s->aio_context, true);
QLIST_FOREACH_SAFE(aioreq, &s->failed_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
}
assert(acb->nr_pending == 0);
if (acb->common.cb) {
acb->common.cb(acb->common.opaque, -ECANCELED);
}
sd_finish_aiocb(acb);
}
}
static const AIOCBInfo sd_aiocb_info = {
.aiocb_size = sizeof(SheepdogAIOCB),
.cancel = sd_aio_cancel,
.aiocb_size = sizeof(SheepdogAIOCB),
.cancel_async = sd_aio_cancel,
};
static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
@@ -524,7 +521,6 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
acb->aio_done_func = NULL;
acb->cancelable = true;
acb->finished = NULL;
acb->coroutine = qemu_coroutine_self();
acb->ret = 0;
acb->nr_pending = 0;
@@ -712,7 +708,6 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid)
static coroutine_fn void reconnect_to_sdog(void *opaque)
{
Error *local_err = NULL;
BDRVSheepdogState *s = opaque;
AIOReq *aio_req, *next;
@@ -727,6 +722,7 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
/* Try to reconnect the sheepdog server every one second. */
while (s->fd < 0) {
Error *local_err = NULL;
s->fd = get_sheep_fd(s, &local_err);
if (s->fd < 0) {
DPRINTF("Wait for connection to be established\n");
@@ -1090,6 +1086,7 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
memset(&hdr, 0, sizeof(hdr));
if (lock) {
hdr.opcode = SD_OP_LOCK_VDI;
hdr.type = LOCK_TYPE_NORMAL;
} else {
hdr.opcode = SD_OP_GET_VDI_INFO;
}
@@ -1110,6 +1107,8 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
sd_strerror(rsp->result), filename, snapid, tag);
if (rsp->result == SD_RES_NO_VDI) {
ret = -ENOENT;
} else if (rsp->result == SD_RES_VDI_LOCKED) {
ret = -EBUSY;
} else {
ret = -EIO;
}
@@ -1682,7 +1681,7 @@ static int sd_create(const char *filename, QemuOpts *opts,
uint32_t snapid;
bool prealloc = false;
s = g_malloc0(sizeof(BDRVSheepdogState));
s = g_new0(BDRVSheepdogState, 1);
memset(tag, 0, sizeof(tag));
if (strstr(filename, "://")) {
@@ -1695,7 +1694,8 @@ static int sd_create(const char *filename, QemuOpts *opts,
goto out;
}
s->inode.vdi_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
s->inode.vdi_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!buf || !strcmp(buf, "off")) {
@@ -1793,6 +1793,7 @@ static void sd_close(BlockDriverState *bs)
memset(&hdr, 0, sizeof(hdr));
hdr.opcode = SD_OP_RELEASE_VDI;
hdr.type = LOCK_TYPE_NORMAL;
hdr.base_vdi_id = s->inode.vdi_id;
wlen = strlen(s->name) + 1;
hdr.data_length = wlen;
@@ -2129,7 +2130,7 @@ static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
ret = sd_co_rw_vector(acb);
if (ret <= 0) {
qemu_aio_release(acb);
qemu_aio_unref(acb);
return ret;
}
@@ -2150,7 +2151,7 @@ static coroutine_fn int sd_co_readv(BlockDriverState *bs, int64_t sector_num,
ret = sd_co_rw_vector(acb);
if (ret <= 0) {
qemu_aio_release(acb);
qemu_aio_unref(acb);
return ret;
}
@@ -2273,7 +2274,7 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
uint32_t snapid = 0;
int ret = 0;
old_s = g_malloc(sizeof(BDRVSheepdogState));
old_s = g_new(BDRVSheepdogState, 1);
memcpy(old_s, s, sizeof(BDRVSheepdogState));
@@ -2357,7 +2358,7 @@ static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
goto out;
}
sn_tab = g_malloc0(nr * sizeof(*sn_tab));
sn_tab = g_new0(QEMUSnapshotInfo, nr);
/* calculate a vdi id with hash function */
hval = fnv_64a_buf(s->name, strlen(s->name), FNV1A_64_INIT);
@@ -2509,7 +2510,7 @@ static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
ret = sd_co_rw_vector(acb);
if (ret <= 0) {
qemu_aio_release(acb);
qemu_aio_unref(acb);
return ret;
}

View File

@@ -236,6 +236,10 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
error_setg(errp, "snapshot_id and name are both NULL");
return -EINVAL;
}
/* drain all pending i/o before deleting snapshot */
bdrv_drain_all();
if (drv->bdrv_snapshot_delete) {
return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
}

View File

@@ -517,6 +517,11 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
const char *host, *user, *path, *host_key_check;
int port;
if (!qdict_haskey(options, "host")) {
ret = -EINVAL;
error_setg(errp, "No hostname was specified");
goto err;
}
host = qdict_get_str(options, "host");
if (qdict_haskey(options, "port")) {
@@ -525,6 +530,11 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
port = 22;
}
if (!qdict_haskey(options, "path")) {
ret = -EINVAL;
error_setg(errp, "No path was specified");
goto err;
}
path = qdict_get_str(options, "path");
if (qdict_haskey(options, "user")) {
@@ -700,7 +710,8 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
ssh_state_init(&s);
/* Get desired file size. */
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
DPRINTF("total_size=%" PRIi64, total_size);
uri_options = qdict_new();

View File

@@ -79,9 +79,39 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
bdrv_refresh_limits(top, NULL);
}
typedef struct {
int ret;
bool reached_end;
} StreamCompleteData;
static void stream_complete(BlockJob *job, void *opaque)
{
StreamBlockJob *s = container_of(job, StreamBlockJob, common);
StreamCompleteData *data = opaque;
BlockDriverState *base = s->base;
if (!block_job_is_cancelled(&s->common) && data->reached_end &&
data->ret == 0) {
const char *base_id = NULL, *base_fmt = NULL;
if (base) {
base_id = s->backing_file_str;
if (base->drv) {
base_fmt = base->drv->format_name;
}
}
data->ret = bdrv_change_backing_file(job->bs, base_id, base_fmt);
close_unused_images(job->bs, base, base_id);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, data->ret);
g_free(data);
}
static void coroutine_fn stream_run(void *opaque)
{
StreamBlockJob *s = opaque;
StreamCompleteData *data;
BlockDriverState *bs = s->common.bs;
BlockDriverState *base = s->base;
int64_t sector_num, end;
@@ -183,21 +213,13 @@ wait:
/* Do not remove the backing file if an error was there but ignored. */
ret = error;
if (!block_job_is_cancelled(&s->common) && sector_num == end && ret == 0) {
const char *base_id = NULL, *base_fmt = NULL;
if (base) {
base_id = s->backing_file_str;
if (base->drv) {
base_fmt = base->drv->format_name;
}
}
ret = bdrv_change_backing_file(bs, base_id, base_fmt);
close_unused_images(bs, base, base_id);
}
qemu_vfree(buf);
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
/* Modify backing chain and close BDSes in main loop */
data = g_malloc(sizeof(*data));
data->ret = ret;
data->reached_end = sector_num == end;
block_job_defer_to_main_loop(&s->common, stream_complete, data);
}
static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -220,7 +242,7 @@ static const BlockJobDriver stream_job_driver = {
void stream_start(BlockDriverState *bs, BlockDriverState *base,
const char *backing_file_str, int64_t speed,
BlockdevOnError on_error,
BlockDriverCompletionFunc *cb,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
StreamBlockJob *s;

View File

@@ -53,13 +53,6 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "migration/migration.h"
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
@@ -127,8 +120,18 @@ typedef unsigned char uuid_t[16];
#define VDI_IS_ALLOCATED(X) ((X) < VDI_DISCARDED)
/* max blocks in image is (0xffffffff / 4) */
#define VDI_BLOCKS_IN_IMAGE_MAX 0x3fffffff
/* The bmap will take up VDI_BLOCKS_IN_IMAGE_MAX * sizeof(uint32_t) bytes; since
* the bmap is read and written in a single operation, its size needs to be
* limited to INT_MAX; furthermore, when opening an image, the bmap size is
* rounded up to be aligned on BDRV_SECTOR_SIZE.
* Therefore this should satisfy the following:
* VDI_BLOCKS_IN_IMAGE_MAX * sizeof(uint32_t) + BDRV_SECTOR_SIZE == INT_MAX + 1
* (INT_MAX + 1 is the first value not representable as an int)
* This guarantees that any value below or equal to the constant will, when
* multiplied by sizeof(uint32_t) and rounded up to a BDRV_SECTOR_SIZE boundary,
* still be below or equal to INT_MAX. */
#define VDI_BLOCKS_IN_IMAGE_MAX \
((unsigned)((INT_MAX + 1u - BDRV_SECTOR_SIZE) / sizeof(uint32_t)))
#define VDI_DISK_SIZE_MAX ((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
(uint64_t)DEFAULT_CLUSTER_SIZE)
@@ -144,12 +147,14 @@ static inline int uuid_is_null(const uuid_t uu)
return memcmp(uu, null_uuid, sizeof(uuid_t)) == 0;
}
# if defined(CONFIG_VDI_DEBUG)
static inline void uuid_unparse(const uuid_t uu, char *out)
{
snprintf(out, 37, UUID_FMT,
uu[0], uu[1], uu[2], uu[3], uu[4], uu[5], uu[6], uu[7],
uu[8], uu[9], uu[10], uu[11], uu[12], uu[13], uu[14], uu[15]);
}
# endif
#endif
typedef struct {
@@ -299,7 +304,12 @@ static int vdi_check(BlockDriverState *bs, BdrvCheckResult *res,
return -ENOTSUP;
}
bmap = g_malloc(s->header.blocks_in_image * sizeof(uint32_t));
bmap = g_try_new(uint32_t, s->header.blocks_in_image);
if (s->header.blocks_in_image && bmap == NULL) {
res->check_errors++;
return -ENOMEM;
}
memset(bmap, 0xff, s->header.blocks_in_image * sizeof(uint32_t));
/* Check block map and value of blocks_allocated. */
@@ -357,23 +367,23 @@ static int vdi_make_empty(BlockDriverState *bs)
static int vdi_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const VdiHeader *header = (const VdiHeader *)buf;
int result = 0;
int ret = 0;
logout("\n");
if (buf_size < sizeof(*header)) {
/* Header too small, no VDI. */
} else if (le32_to_cpu(header->signature) == VDI_SIGNATURE) {
result = 100;
ret = 100;
}
if (result == 0) {
if (ret == 0) {
logout("no vdi image\n");
} else {
logout("%s", header->text);
}
return result;
return ret;
}
static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
@@ -409,8 +419,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
We accept them but round the disk size to the next multiple of
SECTOR_SIZE. */
logout("odd disk size %" PRIu64 " B, round up\n", header.disk_size);
header.disk_size += SECTOR_SIZE - 1;
header.disk_size &= ~(SECTOR_SIZE - 1);
header.disk_size = ROUND_UP(header.disk_size, SECTOR_SIZE);
}
if (header.signature != VDI_SIGNATURE) {
@@ -477,8 +486,13 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
s->header = header;
bmap_size = header.blocks_in_image * sizeof(uint32_t);
bmap_size = (bmap_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
s->bmap = g_malloc(bmap_size * SECTOR_SIZE);
bmap_size = DIV_ROUND_UP(bmap_size, SECTOR_SIZE);
s->bmap = qemu_try_blockalign(bs->file, bmap_size * SECTOR_SIZE);
if (s->bmap == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_read(bs->file, s->bmap_sector, (uint8_t *)s->bmap, bmap_size);
if (ret < 0) {
goto fail_free_bmap;
@@ -487,13 +501,13 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
/* Disable migration when vdi images are used */
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vdi", bs->device_name, "live migration");
"vdi", bdrv_get_device_name(bs), "live migration");
migrate_add_blocker(s->migration_blocker);
return 0;
fail_free_bmap:
g_free(s->bmap);
qemu_vfree(s->bmap);
fail:
return ret;
@@ -681,8 +695,7 @@ static int vdi_co_write(BlockDriverState *bs,
static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
{
int fd;
int result = 0;
int ret = 0;
uint64_t bytes = 0;
uint32_t blocks;
size_t block_size = DEFAULT_CLUSTER_SIZE;
@@ -690,12 +703,16 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
VdiHeader header;
size_t i;
size_t bmap_size;
bool nocow = false;
int64_t offset = 0;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
uint32_t *bmap = NULL;
logout("\n");
/* Read out options. */
bytes = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
#if defined(CONFIG_VDI_BLOCK_SIZE)
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = qemu_opt_get_size_del(opts,
@@ -707,45 +724,33 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
image_type = VDI_TYPE_STATIC;
}
#endif
nocow = qemu_opt_get_bool_del(opts, BLOCK_OPT_NOCOW, false);
if (bytes > VDI_DISK_SIZE_MAX) {
result = -ENOTSUP;
ret = -ENOTSUP;
error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")",
bytes, VDI_DISK_SIZE_MAX);
goto exit;
}
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
0644);
if (fd < 0) {
result = -errno;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value will
* be ignored since any failure of this operation should not block the
* left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
/* We need enough blocks to store the given disk size,
so always round up. */
blocks = (bytes + block_size - 1) / block_size;
blocks = DIV_ROUND_UP(bytes, block_size);
bmap_size = blocks * sizeof(uint32_t);
bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1));
bmap_size = ROUND_UP(bmap_size, SECTOR_SIZE);
memset(&header, 0, sizeof(header));
pstrcpy(header.text, sizeof(header.text), VDI_TEXT);
@@ -769,13 +774,20 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
vdi_header_print(&header);
#endif
vdi_header_to_le(&header);
if (write(fd, &header, sizeof(header)) < 0) {
result = -errno;
goto close_and_exit;
ret = bdrv_pwrite_sync(bs, offset, &header, sizeof(header));
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
goto exit;
}
offset += sizeof(header);
if (bmap_size > 0) {
uint32_t *bmap = g_malloc0(bmap_size);
bmap = g_try_malloc0(bmap_size);
if (bmap == NULL) {
ret = -ENOMEM;
error_setg(errp, "Could not allocate bmap");
goto exit;
}
for (i = 0; i < blocks; i++) {
if (image_type == VDI_TYPE_STATIC) {
bmap[i] = i;
@@ -783,35 +795,33 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
bmap[i] = VDI_UNALLOCATED;
}
}
if (write(fd, bmap, bmap_size) < 0) {
result = -errno;
g_free(bmap);
goto close_and_exit;
ret = bdrv_pwrite_sync(bs, offset, bmap, bmap_size);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
goto exit;
}
g_free(bmap);
offset += bmap_size;
}
if (image_type == VDI_TYPE_STATIC) {
if (ftruncate(fd, sizeof(header) + bmap_size + blocks * block_size)) {
result = -errno;
goto close_and_exit;
ret = bdrv_truncate(bs, offset + blocks * block_size);
if (ret < 0) {
error_setg(errp, "Failed to statically allocate %s", filename);
goto exit;
}
}
close_and_exit:
if ((close(fd) < 0) && !result) {
result = -errno;
}
exit:
return result;
bdrv_unref(bs);
g_free(bmap);
return ret;
}
static void vdi_close(BlockDriverState *bs)
{
BDRVVdiState *s = bs->opaque;
g_free(s->bmap);
qemu_vfree(s->bmap);
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);

View File

@@ -82,8 +82,6 @@ void vhdx_log_desc_le_import(VHDXLogDescriptor *d)
assert(d != NULL);
le32_to_cpus(&d->signature);
le32_to_cpus(&d->trailing_bytes);
le64_to_cpus(&d->leading_bytes);
le64_to_cpus(&d->file_offset);
le64_to_cpus(&d->sequence_number);
}
@@ -99,6 +97,15 @@ void vhdx_log_desc_le_export(VHDXLogDescriptor *d)
cpu_to_le64s(&d->sequence_number);
}
void vhdx_log_data_le_import(VHDXLogDataSector *d)
{
assert(d != NULL);
le32_to_cpus(&d->data_signature);
le32_to_cpus(&d->sequence_high);
le32_to_cpus(&d->sequence_low);
}
void vhdx_log_data_le_export(VHDXLogDataSector *d)
{
assert(d != NULL);

View File

@@ -84,6 +84,7 @@ static int vhdx_log_peek_hdr(BlockDriverState *bs, VHDXLogEntries *log,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(hdr);
exit:
return ret;
@@ -211,7 +212,7 @@ static bool vhdx_log_hdr_is_valid(VHDXLogEntries *log, VHDXLogEntryHeader *hdr,
{
int valid = false;
if (memcmp(&hdr->signature, "loge", 4)) {
if (hdr->signature != VHDX_LOG_SIGNATURE) {
goto exit;
}
@@ -275,12 +276,12 @@ static bool vhdx_log_desc_is_valid(VHDXLogDescriptor *desc,
goto exit;
}
if (!memcmp(&desc->signature, "zero", 4)) {
if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
if (desc->zero_length % VHDX_LOG_SECTOR_SIZE == 0) {
/* valid */
ret = true;
}
} else if (!memcmp(&desc->signature, "desc", 4)) {
} else if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
/* valid */
ret = true;
}
@@ -327,13 +328,15 @@ static int vhdx_compute_desc_sectors(uint32_t desc_cnt)
* passed into this function. Each descriptor will also be validated,
* and error returned if any are invalid. */
static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
VHDXLogEntries *log, VHDXLogDescEntries **buffer)
VHDXLogEntries *log, VHDXLogDescEntries **buffer,
bool convert_endian)
{
int ret = 0;
uint32_t desc_sectors;
uint32_t sectors_read;
VHDXLogEntryHeader hdr;
VHDXLogDescEntries *desc_entries = NULL;
VHDXLogDescriptor desc;
int i;
assert(*buffer == NULL);
@@ -342,14 +345,19 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
ret = -EINVAL;
goto exit;
}
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
desc_entries = qemu_blockalign(bs, desc_sectors * VHDX_LOG_SECTOR_SIZE);
desc_entries = qemu_try_blockalign(bs->file,
desc_sectors * VHDX_LOG_SECTOR_SIZE);
if (desc_entries == NULL) {
ret = -ENOMEM;
goto exit;
}
ret = vhdx_log_read_sectors(bs, log, &sectors_read, desc_entries,
desc_sectors, false);
@@ -363,12 +371,19 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
/* put in proper endianness, and validate each desc */
for (i = 0; i < hdr.descriptor_count; i++) {
vhdx_log_desc_le_import(&desc_entries->desc[i]);
if (vhdx_log_desc_is_valid(&desc_entries->desc[i], &hdr) == false) {
desc = desc_entries->desc[i];
vhdx_log_desc_le_import(&desc);
if (convert_endian) {
desc_entries->desc[i] = desc;
}
if (vhdx_log_desc_is_valid(&desc, &hdr) == false) {
ret = -EINVAL;
goto free_and_exit;
}
}
if (convert_endian) {
desc_entries->hdr = hdr;
}
*buffer = desc_entries;
goto exit;
@@ -403,7 +418,7 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
buffer = qemu_blockalign(bs, VHDX_LOG_SECTOR_SIZE);
if (!memcmp(&desc->signature, "desc", 4)) {
if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
/* data sector */
if (data == NULL) {
ret = -EFAULT;
@@ -431,10 +446,15 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
memcpy(buffer+offset, &desc->trailing_bytes, 4);
} else if (!memcmp(&desc->signature, "zero", 4)) {
} else if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
/* write 'count' sectors of sector */
memset(buffer, 0, VHDX_LOG_SECTOR_SIZE);
count = desc->zero_length / VHDX_LOG_SECTOR_SIZE;
} else {
error_report("Invalid VHDX log descriptor entry signature 0x%" PRIx32,
desc->signature);
ret = -EINVAL;
goto exit;
}
file_offset = desc->file_offset;
@@ -493,13 +513,13 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
goto exit;
}
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries);
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries, true);
if (ret < 0) {
goto exit;
}
for (i = 0; i < desc_entries->hdr.descriptor_count; i++) {
if (!memcmp(&desc_entries->desc[i].signature, "desc", 4)) {
if (desc_entries->desc[i].signature == VHDX_LOG_DESC_SIGNATURE) {
/* data sector, so read a sector to flush */
ret = vhdx_log_read_sectors(bs, &logs->log, &sectors_read,
data, 1, false);
@@ -510,6 +530,7 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
ret = -EINVAL;
goto exit;
}
vhdx_log_data_le_import(data);
}
ret = vhdx_log_flush_desc(bs, &desc_entries->desc[i], data);
@@ -558,9 +579,6 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
goto inc_and_exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
goto inc_and_exit;
}
@@ -573,13 +591,13 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
/* Read desc sectors, and calculate log checksum */
/* Read all log sectors, and calculate log checksum */
total_sectors = hdr.entry_length / VHDX_LOG_SECTOR_SIZE;
/* read_desc() will increment the read idx */
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer);
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer, false);
if (ret < 0) {
goto free_and_exit;
}
@@ -602,7 +620,7 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
}
}
crc ^= 0xffffffff;
if (crc != desc_buffer->hdr.checksum) {
if (crc != hdr.checksum) {
goto free_and_exit;
}
@@ -905,7 +923,7 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
buffer = qemu_blockalign(bs, total_length);
memcpy(buffer, &new_hdr, sizeof(new_hdr));
new_desc = (VHDXLogDescriptor *) (buffer + sizeof(new_hdr));
new_desc = buffer + sizeof(new_hdr);
data_sector = buffer + (desc_sectors * VHDX_LOG_SECTOR_SIZE);
data_tmp = data;
@@ -962,7 +980,6 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
* last data sector */
vhdx_update_checksum(buffer, total_length,
offsetof(VHDXLogEntryHeader, checksum));
cpu_to_le32s((uint32_t *)(buffer + 4));
/* now write to the log */
ret = vhdx_log_write_sectors(bs, &s->log, &sectors_written, buffer,

View File

@@ -99,7 +99,8 @@ static const MSGUID logical_sector_guid = { .data1 = 0x8141bf1d,
/* Each parent type must have a valid GUID; this is for parent images
* of type 'VHDX'. If we were to allow e.g. a QCOW2 parent, we would
* need to make up our own QCOW2 GUID type */
static const MSGUID parent_vhdx_guid = { .data1 = 0xb04aefb7,
static const MSGUID parent_vhdx_guid __attribute__((unused))
= { .data1 = 0xb04aefb7,
.data2 = 0xd19e,
.data3 = 0x4a81,
.data4 = { 0xb7, 0x89, 0x25, 0xb8,
@@ -135,10 +136,8 @@ typedef struct VHDXSectorInfo {
* buf: buffer pointer
* size: size of buffer (must be > crc_offset+4)
*
* Note: The resulting checksum is in the CPU endianness, not necessarily
* in the file format endianness (LE). Any header export to disk should
* make sure that vhdx_header_le_export() is used to convert to the
* correct endianness
* Note: The buffer should have all multi-byte data in little-endian format,
* and the resulting checksum is in little endian format.
*/
uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
{
@@ -149,6 +148,7 @@ uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
memset(buf + crc_offset, 0, sizeof(crc));
crc = crc32c(0xffffffff, buf, size);
cpu_to_le32s(&crc);
memcpy(buf + crc_offset, &crc, sizeof(crc));
return crc;
@@ -300,7 +300,7 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
{
uint8_t *buffer = NULL;
int ret;
VHDXHeader header_le;
VHDXHeader *header_le;
assert(bs_file != NULL);
assert(hdr != NULL);
@@ -321,11 +321,12 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
}
/* overwrite the actual VHDXHeader portion */
memcpy(buffer, hdr, sizeof(VHDXHeader));
hdr->checksum = vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
vhdx_header_le_export(hdr, &header_le);
ret = bdrv_pwrite_sync(bs_file, offset, &header_le, sizeof(VHDXHeader));
header_le = (VHDXHeader *)buffer;
memcpy(header_le, hdr, sizeof(VHDXHeader));
vhdx_header_le_export(hdr, header_le);
vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
ret = bdrv_pwrite_sync(bs_file, offset, header_le, sizeof(VHDXHeader));
exit:
qemu_vfree(buffer);
@@ -432,13 +433,14 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header1, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header1);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header1->signature, "head", 4) &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header1);
if (header1->signature == VHDX_HEADER_SIGNATURE &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
}
}
ret = bdrv_pread(bs->file, VHDX_HEADER2_OFFSET, buffer, VHDX_HEADER_SIZE);
@@ -447,13 +449,14 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header2, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header2);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header2->signature, "head", 4) &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header2);
if (header2->signature == VHDX_HEADER_SIGNATURE &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
}
}
/* If there is only 1 valid header (or no valid headers), we
@@ -519,15 +522,21 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
goto fail;
}
memcpy(&s->rt, buffer, sizeof(s->rt));
vhdx_region_header_le_import(&s->rt);
offset += sizeof(s->rt);
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4) ||
memcmp(&s->rt.signature, "regi", 4)) {
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4)) {
ret = -EINVAL;
goto fail;
}
vhdx_region_header_le_import(&s->rt);
if (s->rt.signature != VHDX_REGION_SIGNATURE) {
ret = -EINVAL;
goto fail;
}
/* Per spec, maximum region table entry count is 2047 */
if (s->rt.entry_count > 2047) {
ret = -EINVAL;
@@ -630,7 +639,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
vhdx_metadata_header_le_import(&s->metadata_hdr);
if (memcmp(&s->metadata_hdr.signature, "metadata", 8)) {
if (s->metadata_hdr.signature != VHDX_METADATA_SIGNATURE) {
ret = -EINVAL;
goto exit;
}
@@ -950,7 +959,11 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
}
/* s->bat is freed in vhdx_close() */
s->bat = qemu_blockalign(bs, s->bat_rt.length);
s->bat = qemu_try_blockalign(bs->file, s->bat_rt.length);
if (s->bat == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->bat_offset, s->bat, s->bat_rt.length);
if (ret < 0) {
@@ -991,7 +1004,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
/* Disable migration when VHDX images are used */
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vhdx", bs->device_name, "live migration");
"vhdx", bdrv_get_device_name(bs), "live migration");
migrate_add_blocker(s->migration_blocker);
return 0;
@@ -1369,7 +1382,7 @@ static int vhdx_create_new_headers(BlockDriverState *bs, uint64_t image_size,
int ret = 0;
VHDXHeader *hdr = NULL;
hdr = g_malloc0(sizeof(VHDXHeader));
hdr = g_new0(VHDXHeader, 1);
hdr->signature = VHDX_HEADER_SIGNATURE;
hdr->sequence_number = g_random_int();
@@ -1395,6 +1408,12 @@ exit:
return ret;
}
#define VHDX_METADATA_ENTRY_BUFFER_SIZE \
(sizeof(VHDXFileParameters) +\
sizeof(VHDXVirtualDiskSize) +\
sizeof(VHDXPage83Data) +\
sizeof(VHDXVirtualDiskLogicalSectorSize) +\
sizeof(VHDXVirtualDiskPhysicalSectorSize))
/*
* Create the Metadata entries.
@@ -1433,11 +1452,7 @@ static int vhdx_create_new_metadata(BlockDriverState *bs,
VHDXVirtualDiskLogicalSectorSize *mt_log_sector_size;
VHDXVirtualDiskPhysicalSectorSize *mt_phys_sector_size;
entry_buffer = g_malloc0(sizeof(VHDXFileParameters) +
sizeof(VHDXVirtualDiskSize) +
sizeof(VHDXPage83Data) +
sizeof(VHDXVirtualDiskLogicalSectorSize) +
sizeof(VHDXVirtualDiskPhysicalSectorSize));
entry_buffer = g_malloc0(VHDX_METADATA_ENTRY_BUFFER_SIZE);
mt_file_params = entry_buffer;
offset += sizeof(VHDXFileParameters);
@@ -1518,7 +1533,7 @@ static int vhdx_create_new_metadata(BlockDriverState *bs,
}
ret = bdrv_pwrite(bs, metadata_offset + (64 * KiB), entry_buffer,
VHDX_HEADER_BLOCK_SIZE);
VHDX_METADATA_ENTRY_BUFFER_SIZE);
if (ret < 0) {
goto exit;
}
@@ -1540,7 +1555,8 @@ exit:
*/
static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
uint64_t image_size, VHDXImageType type,
bool use_zero_blocks, VHDXRegionTableEntry *rt_bat)
bool use_zero_blocks, uint64_t file_offset,
uint32_t length)
{
int ret = 0;
uint64_t data_file_offset;
@@ -1555,7 +1571,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
/* this gives a data start after BAT/bitmap entries, and well
* past any metadata entries (with a 4 MB buffer for future
* expansion */
data_file_offset = rt_bat->file_offset + rt_bat->length + 5 * MiB;
data_file_offset = file_offset + length + 5 * MiB;
total_sectors = image_size >> s->logical_sector_size_bits;
if (type == VHDX_TYPE_DYNAMIC) {
@@ -1579,7 +1595,11 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
use_zero_blocks ||
bdrv_has_zero_init(bs) == 0) {
/* for a fixed file, the default BAT entry is not zero */
s->bat = g_malloc0(rt_bat->length);
s->bat = g_try_malloc0(length);
if (length && s->bat == NULL) {
ret = -ENOMEM;
goto exit;
}
block_state = type == VHDX_TYPE_FIXED ? PAYLOAD_BLOCK_FULLY_PRESENT :
PAYLOAD_BLOCK_NOT_PRESENT;
block_state = use_zero_blocks ? PAYLOAD_BLOCK_ZERO : block_state;
@@ -1594,7 +1614,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
cpu_to_le64s(&s->bat[sinfo.bat_idx]);
sector_num += s->sectors_per_block;
}
ret = bdrv_pwrite(bs, rt_bat->file_offset, s->bat, rt_bat->length);
ret = bdrv_pwrite(bs, file_offset, s->bat, length);
if (ret < 0) {
goto exit;
}
@@ -1626,6 +1646,8 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
int ret = 0;
uint32_t offset = 0;
void *buffer = NULL;
uint64_t bat_file_offset;
uint32_t bat_length;
BDRVVHDXState *s = NULL;
VHDXRegionTableHeader *region_table;
VHDXRegionTableEntry *rt_bat;
@@ -1635,7 +1657,7 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
/* Populate enough of the BDRVVHDXState to be able to use the
* pre-existing BAT calculation, translation, and update functions */
s = g_malloc0(sizeof(BDRVVHDXState));
s = g_new0(BDRVVHDXState, 1);
s->chunk_ratio = (VHDX_MAX_SECTORS_PER_BLOCK) *
(uint64_t) sector_size / (uint64_t) block_size;
@@ -1674,19 +1696,26 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
rt_metadata->length = 1 * MiB; /* min size, and more than enough */
*metadata_offset = rt_metadata->file_offset;
bat_file_offset = rt_bat->file_offset;
bat_length = rt_bat->length;
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
vhdx_update_checksum(buffer, VHDX_HEADER_BLOCK_SIZE,
offsetof(VHDXRegionTableHeader, checksum));
/* The region table gives us the data we need to create the BAT,
* so do that now */
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks, rt_bat);
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks,
bat_file_offset, bat_length);
if (ret < 0) {
goto exit;
}
/* Now write out the region headers to disk */
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
ret = bdrv_pwrite(bs, VHDX_REGION_TABLE_OFFSET, buffer,
VHDX_HEADER_BLOCK_SIZE);
if (ret < 0) {
@@ -1699,7 +1728,6 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
goto exit;
}
exit:
g_free(s);
g_free(buffer);
@@ -1740,7 +1768,8 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
VHDXImageType image_type;
Error *local_err = NULL;
image_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
@@ -1849,7 +1878,6 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
}
delete_and_exit:
bdrv_unref(bs);
exit:

View File

@@ -435,6 +435,7 @@ void vhdx_header_le_import(VHDXHeader *h);
void vhdx_header_le_export(VHDXHeader *orig_h, VHDXHeader *new_h);
void vhdx_log_desc_le_import(VHDXLogDescriptor *d);
void vhdx_log_desc_le_export(VHDXLogDescriptor *d);
void vhdx_log_data_le_import(VHDXLogDataSector *d);
void vhdx_log_data_le_export(VHDXLogDataSector *d);
void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr);
void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr);

View File

@@ -106,6 +106,7 @@ typedef struct VmdkExtent {
uint32_t l2_cache_counts[L2_CACHE_SIZE];
int64_t cluster_sectors;
int64_t next_cluster_sector;
char *type;
} VmdkExtent;
@@ -124,7 +125,6 @@ typedef struct BDRVVmdkState {
} BDRVVmdkState;
typedef struct VmdkMetaData {
uint32_t offset;
unsigned int l1_index;
unsigned int l2_index;
unsigned int l2_offset;
@@ -233,7 +233,7 @@ static void vmdk_free_last_extent(BlockDriverState *bs)
return;
}
s->num_extents--;
s->extents = g_realloc(s->extents, s->num_extents * sizeof(VmdkExtent));
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents);
}
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
@@ -397,6 +397,7 @@ static int vmdk_add_extent(BlockDriverState *bs,
{
VmdkExtent *extent;
BDRVVmdkState *s = bs->opaque;
int64_t nb_sectors;
if (cluster_sectors > 0x200000) {
/* 0x200000 * 512Bytes = 1GB for one cluster is unrealistic */
@@ -412,8 +413,12 @@ static int vmdk_add_extent(BlockDriverState *bs,
return -EFBIG;
}
s->extents = g_realloc(s->extents,
(s->num_extents + 1) * sizeof(VmdkExtent));
nb_sectors = bdrv_nb_sectors(file);
if (nb_sectors < 0) {
return nb_sectors;
}
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents + 1);
extent = &s->extents[s->num_extents];
s->num_extents++;
@@ -427,6 +432,7 @@ static int vmdk_add_extent(BlockDriverState *bs,
extent->l1_entry_sectors = l2_size * cluster_sectors;
extent->l2_size = l2_size;
extent->cluster_sectors = flat ? sectors : cluster_sectors;
extent->next_cluster_sector = ROUND_UP(nb_sectors, cluster_sectors);
if (s->num_extents > 1) {
extent->end_sector = (*(extent - 1)).end_sector + extent->sectors;
@@ -448,7 +454,11 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
/* read the L1 table */
l1_size = extent->l1_size * sizeof(uint32_t);
extent->l1_table = g_malloc(l1_size);
extent->l1_table = g_try_malloc(l1_size);
if (l1_size && extent->l1_table == NULL) {
return -ENOMEM;
}
ret = bdrv_pread(extent->file,
extent->l1_table_offset,
extent->l1_table,
@@ -464,7 +474,11 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
}
if (extent->l1_backup_table_offset) {
extent->l1_backup_table = g_malloc(l1_size);
extent->l1_backup_table = g_try_malloc(l1_size);
if (l1_size && extent->l1_backup_table == NULL) {
ret = -ENOMEM;
goto fail_l1;
}
ret = bdrv_pread(extent->file,
extent->l1_backup_table_offset,
extent->l1_backup_table,
@@ -481,7 +495,7 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
}
extent->l2_cache =
g_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
g_new(uint32_t, extent->l2_size * L2_CACHE_SIZE);
return 0;
fail_l1b:
g_free(extent->l1_backup_table);
@@ -643,7 +657,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
snprintf(buf, sizeof(buf), "VMDK version %" PRId32,
le32_to_cpu(header.version));
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "vmdk", buf);
bdrv_get_device_name(bs), "vmdk", buf);
return -ENOTSUP;
} else if (le32_to_cpu(header.version) == 3 && (flags & BDRV_O_RDWR)) {
/* VMware KB 2064959 explains that version 3 added support for
@@ -669,8 +683,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
if (le32_to_cpu(header.flags) & VMDK4_FLAG_RGD) {
l1_backup_offset = le64_to_cpu(header.rgd_offset) << 9;
}
if (bdrv_getlength(file) <
le64_to_cpu(header.grain_offset) * BDRV_SECTOR_SIZE) {
if (bdrv_nb_sectors(file) < le64_to_cpu(header.grain_offset)) {
error_setg(errp, "File truncated, expecting at least %" PRId64 " bytes",
(int64_t)(le64_to_cpu(header.grain_offset)
* BDRV_SECTOR_SIZE));
@@ -821,6 +834,7 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
ret = vmdk_add_extent(bs, extent_file, true, sectors,
0, 0, 0, 0, 0, &extent, errp);
if (ret < 0) {
bdrv_unref(extent_file);
return ret;
}
extent->flat_start_offset = flat_offset << 9;
@@ -832,14 +846,15 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
} else {
ret = vmdk_open_sparse(bs, extent_file, bs->open_flags, buf, errp);
}
g_free(buf);
if (ret) {
g_free(buf);
bdrv_unref(extent_file);
return ret;
}
extent = &s->extents[s->num_extents - 1];
} else {
error_setg(errp, "Unsupported extent type '%s'", type);
bdrv_unref(extent_file);
return -ENOTSUP;
}
extent->type = g_strdup(type);
@@ -924,7 +939,7 @@ static int vmdk_open(BlockDriverState *bs, QDict *options, int flags,
/* Disable migration when VMDK images are used */
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vmdk", bs->device_name, "live migration");
"vmdk", bdrv_get_device_name(bs), "live migration");
migrate_add_blocker(s->migration_blocker);
g_free(buf);
return 0;
@@ -952,57 +967,97 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
}
}
/**
* get_whole_cluster
*
* Copy backing file's cluster that covers @sector_num, otherwise write zero,
* to the cluster at @cluster_sector_num.
*
* If @skip_start_sector < @skip_end_sector, the relative range
* [@skip_start_sector, @skip_end_sector) is not copied or written, and leave
* it for call to write user data in the request.
*/
static int get_whole_cluster(BlockDriverState *bs,
VmdkExtent *extent,
uint64_t cluster_offset,
uint64_t offset,
bool allocate)
VmdkExtent *extent,
uint64_t cluster_sector_num,
uint64_t sector_num,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
{
int ret = VMDK_OK;
uint8_t *whole_grain = NULL;
int64_t cluster_bytes;
uint8_t *whole_grain;
/* For COW, align request sector_num to cluster start */
sector_num = QEMU_ALIGN_DOWN(sector_num, extent->cluster_sectors);
cluster_bytes = extent->cluster_sectors << BDRV_SECTOR_BITS;
whole_grain = qemu_blockalign(bs, cluster_bytes);
if (!bs->backing_hd) {
memset(whole_grain, 0, skip_start_sector << BDRV_SECTOR_BITS);
memset(whole_grain + (skip_end_sector << BDRV_SECTOR_BITS), 0,
cluster_bytes - (skip_end_sector << BDRV_SECTOR_BITS));
}
assert(skip_end_sector <= extent->cluster_sectors);
/* we will be here if it's first write on non-exist grain(cluster).
* try to read from parent image, if exist */
if (bs->backing_hd) {
whole_grain =
qemu_blockalign(bs, extent->cluster_sectors << BDRV_SECTOR_BITS);
if (!vmdk_is_cid_valid(bs)) {
ret = VMDK_ERROR;
goto exit;
}
if (bs->backing_hd && !vmdk_is_cid_valid(bs)) {
ret = VMDK_ERROR;
goto exit;
}
/* floor offset to cluster */
offset -= offset % (extent->cluster_sectors * 512);
ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
extent->cluster_sectors);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
/* Read backing data before skip range */
if (skip_start_sector > 0) {
if (bs->backing_hd) {
ret = bdrv_read(bs->backing_hd, sector_num,
whole_grain, skip_start_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
/* Write grain only into the active image */
ret = bdrv_write(extent->file, cluster_offset, whole_grain,
extent->cluster_sectors);
ret = bdrv_write(extent->file, cluster_sector_num, whole_grain,
skip_start_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
/* Read backing data after skip range */
if (skip_end_sector < extent->cluster_sectors) {
if (bs->backing_hd) {
ret = bdrv_read(bs->backing_hd, sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
ret = bdrv_write(extent->file, cluster_sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
exit:
qemu_vfree(whole_grain);
return ret;
}
static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
uint32_t offset)
{
uint32_t offset;
QEMU_BUILD_BUG_ON(sizeof(offset) != sizeof(m_data->offset));
offset = cpu_to_le32(m_data->offset);
offset = cpu_to_le32(offset);
/* update L2 table */
if (bdrv_pwrite_sync(
extent->file,
((int64_t)m_data->l2_offset * 512)
+ (m_data->l2_index * sizeof(m_data->offset)),
+ (m_data->l2_index * sizeof(offset)),
&offset, sizeof(offset)) < 0) {
return VMDK_ERROR;
}
@@ -1012,7 +1067,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
if (bdrv_pwrite_sync(
extent->file,
((int64_t)m_data->l2_offset * 512)
+ (m_data->l2_index * sizeof(m_data->offset)),
+ (m_data->l2_index * sizeof(offset)),
&offset, sizeof(offset)) < 0) {
return VMDK_ERROR;
}
@@ -1024,17 +1079,41 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
return VMDK_OK;
}
/**
* get_cluster_offset
*
* Look up cluster offset in extent file by sector number, and store in
* @cluster_offset.
*
* For flat extents, the start offset as parsed from the description file is
* returned.
*
* For sparse extents, look up in L1, L2 table. If allocate is true, return an
* offset for a new cluster and update L2 cache. If there is a backing file,
* COW is done before returning; otherwise, zeroes are written to the allocated
* cluster. Both COW and zero writing skips the sector range
* [@skip_start_sector, @skip_end_sector) passed in by caller, because caller
* has new data to write there.
*
* Returns: VMDK_OK if cluster exists and mapped in the image.
* VMDK_UNALLOC if cluster is not mapped and @allocate is false.
* VMDK_ERROR if failed.
*/
static int get_cluster_offset(BlockDriverState *bs,
VmdkExtent *extent,
VmdkMetaData *m_data,
uint64_t offset,
int allocate,
uint64_t *cluster_offset)
VmdkExtent *extent,
VmdkMetaData *m_data,
uint64_t offset,
bool allocate,
uint64_t *cluster_offset,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
{
unsigned int l1_index, l2_offset, l2_index;
int min_index, i, j;
uint32_t min_count, *l2_table;
bool zeroed = false;
int64_t ret;
int64_t cluster_sector;
if (m_data) {
m_data->valid = 0;
@@ -1088,52 +1167,41 @@ static int get_cluster_offset(BlockDriverState *bs,
extent->l2_cache_counts[min_index] = 1;
found:
l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
*cluster_offset = le32_to_cpu(l2_table[l2_index]);
cluster_sector = le32_to_cpu(l2_table[l2_index]);
if (m_data) {
m_data->valid = 1;
m_data->l1_index = l1_index;
m_data->l2_index = l2_index;
m_data->offset = *cluster_offset;
m_data->l2_offset = l2_offset;
m_data->l2_cache_entry = &l2_table[l2_index];
}
if (extent->has_zero_grain && *cluster_offset == VMDK_GTE_ZEROED) {
if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
zeroed = true;
}
if (!*cluster_offset || zeroed) {
if (!cluster_sector || zeroed) {
if (!allocate) {
return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
}
/* Avoid the L2 tables update for the images that have snapshots. */
*cluster_offset = bdrv_getlength(extent->file);
if (!extent->compressed) {
bdrv_truncate(
extent->file,
*cluster_offset + (extent->cluster_sectors << 9)
);
}
*cluster_offset >>= 9;
l2_table[l2_index] = cpu_to_le32(*cluster_offset);
cluster_sector = extent->next_cluster_sector;
extent->next_cluster_sector += extent->cluster_sectors;
/* First of all we write grain itself, to avoid race condition
* that may to corrupt the image.
* This problem may occur because of insufficient space on host disk
* or inappropriate VM shutdown.
*/
if (get_whole_cluster(
bs, extent, *cluster_offset, offset, allocate) == -1) {
return VMDK_ERROR;
}
if (m_data) {
m_data->offset = *cluster_offset;
ret = get_whole_cluster(bs, extent,
cluster_sector,
offset >> BDRV_SECTOR_BITS,
skip_start_sector, skip_end_sector);
if (ret) {
return ret;
}
}
*cluster_offset <<= 9;
*cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
return VMDK_OK;
}
@@ -1168,7 +1236,8 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
}
qemu_co_mutex_lock(&s->lock);
ret = get_cluster_offset(bs, extent, NULL,
sector_num * 512, 0, &offset);
sector_num * 512, false, &offset,
0, 0);
qemu_co_mutex_unlock(&s->lock);
switch (ret) {
@@ -1321,9 +1390,9 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
if (!extent) {
return -EIO;
}
ret = get_cluster_offset(
bs, extent, NULL,
sector_num << 9, 0, &cluster_offset);
ret = get_cluster_offset(bs, extent, NULL,
sector_num << 9, false, &cluster_offset,
0, 0);
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
@@ -1404,12 +1473,17 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
if (!extent) {
return -EIO;
}
ret = get_cluster_offset(
bs,
extent,
&m_data,
sector_num << 9, !extent->compressed,
&cluster_offset);
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
!(extent->compressed || zeroed),
&cluster_offset,
index_in_cluster, index_in_cluster + n);
if (extent->compressed) {
if (ret == VMDK_OK) {
/* Refuse write to allocated cluster for streamOptimized */
@@ -1418,24 +1492,13 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
return -EIO;
} else {
/* allocate */
ret = get_cluster_offset(
bs,
extent,
&m_data,
sector_num << 9, 1,
&cluster_offset);
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
true, &cluster_offset, 0, 0);
}
}
if (ret == VMDK_ERROR) {
return -EINVAL;
}
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
if (zeroed) {
/* Do zeroed write, buf is ignored */
if (extent->has_zero_grain &&
@@ -1443,9 +1506,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
n >= extent->cluster_sectors) {
n = extent->cluster_sectors;
if (!zero_dry_run) {
m_data.offset = VMDK_GTE_ZEROED;
/* update L2 tables */
if (vmdk_L2update(extent, &m_data) != VMDK_OK) {
if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
!= VMDK_OK) {
return -EIO;
}
}
@@ -1461,7 +1524,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
}
if (m_data.valid) {
/* update L2 tables */
if (vmdk_L2update(extent, &m_data) != VMDK_OK) {
if (vmdk_L2update(extent, &m_data,
cluster_offset >> BDRV_SECTOR_BITS)
!= VMDK_OK) {
return -EIO;
}
}
@@ -1742,7 +1807,8 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
/* Read out options */
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
adapter_type = qemu_opt_get_del(opts, BLOCK_OPT_ADAPTER_TYPE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_COMPAT6, false)) {
@@ -1999,7 +2065,7 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
BDRVVmdkState *s = bs->opaque;
VmdkExtent *extent = NULL;
int64_t sector_num = 0;
int64_t total_sectors = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
int64_t total_sectors = bdrv_nb_sectors(bs);
int ret;
uint64_t cluster_offset;
@@ -2020,7 +2086,7 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
}
ret = get_cluster_offset(bs, extent, NULL,
sector_num << BDRV_SECTOR_BITS,
0, &cluster_offset);
false, &cluster_offset, 0, 0);
if (ret == VMDK_ERROR) {
fprintf(stderr,
"ERROR: could not get cluster_offset for sector %"
@@ -2071,23 +2137,29 @@ static ImageInfoSpecific *vmdk_get_specific_info(BlockDriverState *bs)
return spec_info;
}
static bool vmdk_extents_type_eq(const VmdkExtent *a, const VmdkExtent *b)
{
return a->flat == b->flat &&
a->compressed == b->compressed &&
(a->flat || a->cluster_sectors == b->cluster_sectors);
}
static int vmdk_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
int i;
BDRVVmdkState *s = bs->opaque;
assert(s->num_extents);
/* See if we have multiple extents but they have different cases */
for (i = 1; i < s->num_extents; i++) {
if (!vmdk_extents_type_eq(&s->extents[0], &s->extents[i])) {
return -ENOTSUP;
}
}
bdi->needs_compressed_writes = s->extents[0].compressed;
if (!s->extents[0].flat) {
bdi->cluster_size = s->extents[0].cluster_sectors << BDRV_SECTOR_BITS;
}
/* See if we have multiple extents but they have different cases */
for (i = 1; i < s->num_extents; i++) {
if (bdi->needs_compressed_writes != s->extents[i].compressed ||
(bdi->cluster_size && bdi->cluster_size !=
s->extents[i].cluster_sectors << BDRV_SECTOR_BITS)) {
return -ENOTSUP;
}
}
return 0;
}

View File

@@ -29,13 +29,6 @@
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
#endif
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
/**************************************************************/
@@ -214,7 +207,7 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
"incorrect.\n", bs->filename);
/* Write 'checksum' back to footer, or else will leave it with zero. */
footer->checksum = be32_to_cpu(checksum);
footer->checksum = cpu_to_be32(checksum);
// The visible size of a image in Virtual PC depends on the geometry
// rather than on the size stored in the footer (the size in the footer
@@ -276,7 +269,11 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
s->pagetable = qemu_blockalign(bs, s->max_table_entries * 4);
s->pagetable = qemu_try_blockalign(bs->file, s->max_table_entries * 4);
if (s->pagetable == NULL) {
ret = -ENOMEM;
goto fail;
}
s->bat_offset = be64_to_cpu(dyndisk_header->table_offset);
@@ -323,7 +320,7 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
/* Disable migration when VHD images are used */
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vpc", bs->device_name, "live migration");
"vpc", bdrv_get_device_name(bs), "live migration");
migrate_add_blocker(s->migration_blocker);
return 0;
@@ -475,7 +472,7 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
// Write BAT entry to disk
bat_offset = s->bat_offset + (4 * index);
bat_value = be32_to_cpu(s->pagetable[index]);
bat_value = cpu_to_be32(s->pagetable[index]);
ret = bdrv_pwrite_sync(bs->file, bat_offset, &bat_value, 4);
if (ret < 0)
goto fail;
@@ -492,7 +489,7 @@ static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
BDRVVPCState *s = (BDRVVPCState *)bs->opaque;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) != VHD_FIXED) {
if (be32_to_cpu(footer->type) != VHD_FIXED) {
bdi->cluster_size = s->block_size;
}
@@ -509,7 +506,7 @@ static int vpc_read(BlockDriverState *bs, int64_t sector_num,
int64_t sectors, sectors_per_block;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_read(bs->file, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
@@ -558,7 +555,7 @@ static int vpc_write(BlockDriverState *bs, int64_t sector_num,
int ret;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_write(bs->file, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
@@ -656,39 +653,41 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
return 0;
}
static int create_dynamic_disk(int fd, uint8_t *buf, int64_t total_sectors)
static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
int64_t total_sectors)
{
VHDDynDiskHeader *dyndisk_header =
(VHDDynDiskHeader *) buf;
size_t block_size, num_bat_entries;
int i;
int ret = -EIO;
int ret;
int64_t offset = 0;
// Write the footer (twice: at the beginning and at the end)
block_size = 0x200000;
num_bat_entries = (total_sectors + block_size / 512) / (block_size / 512);
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
if (ret) {
goto fail;
}
if (lseek(fd, 1536 + ((num_bat_entries * 4 + 511) & ~511), SEEK_SET) < 0) {
goto fail;
}
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
offset = 1536 + ((num_bat_entries * 4 + 511) & ~511);
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
if (ret < 0) {
goto fail;
}
// Write the initial BAT
if (lseek(fd, 3 * 512, SEEK_SET) < 0) {
goto fail;
}
offset = 3 * 512;
memset(buf, 0xFF, 512);
for (i = 0; i < (num_bat_entries * 4 + 511) / 512; i++) {
if (write(fd, buf, 512) != 512) {
ret = bdrv_pwrite_sync(bs, offset, buf, 512);
if (ret < 0) {
goto fail;
}
offset += 512;
}
// Prepare the Dynamic Disk Header
@@ -700,48 +699,44 @@ static int create_dynamic_disk(int fd, uint8_t *buf, int64_t total_sectors)
* Note: The spec is actually wrong here for data_offset, it says
* 0xFFFFFFFF, but MS tools expect all 64 bits to be set.
*/
dyndisk_header->data_offset = be64_to_cpu(0xFFFFFFFFFFFFFFFFULL);
dyndisk_header->table_offset = be64_to_cpu(3 * 512);
dyndisk_header->version = be32_to_cpu(0x00010000);
dyndisk_header->block_size = be32_to_cpu(block_size);
dyndisk_header->max_table_entries = be32_to_cpu(num_bat_entries);
dyndisk_header->data_offset = cpu_to_be64(0xFFFFFFFFFFFFFFFFULL);
dyndisk_header->table_offset = cpu_to_be64(3 * 512);
dyndisk_header->version = cpu_to_be32(0x00010000);
dyndisk_header->block_size = cpu_to_be32(block_size);
dyndisk_header->max_table_entries = cpu_to_be32(num_bat_entries);
dyndisk_header->checksum = be32_to_cpu(vpc_checksum(buf, 1024));
dyndisk_header->checksum = cpu_to_be32(vpc_checksum(buf, 1024));
// Write the header
if (lseek(fd, 512, SEEK_SET) < 0) {
goto fail;
}
offset = 512;
if (write(fd, buf, 1024) != 1024) {
ret = bdrv_pwrite_sync(bs, offset, buf, 1024);
if (ret < 0) {
goto fail;
}
ret = 0;
fail:
return ret;
}
static int create_fixed_disk(int fd, uint8_t *buf, int64_t total_size)
static int create_fixed_disk(BlockDriverState *bs, uint8_t *buf,
int64_t total_size)
{
int ret = -EIO;
int ret;
/* Add footer to total size */
total_size += 512;
if (ftruncate(fd, total_size) != 0) {
ret = -errno;
goto fail;
}
if (lseek(fd, -512, SEEK_END) < 0) {
goto fail;
}
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
goto fail;
total_size += HEADER_SIZE;
ret = bdrv_truncate(bs, total_size);
if (ret < 0) {
return ret;
}
ret = 0;
ret = bdrv_pwrite_sync(bs, total_size - HEADER_SIZE, buf, HEADER_SIZE);
if (ret < 0) {
return ret;
}
fail:
return ret;
}
@@ -750,7 +745,7 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
uint8_t buf[1024];
VHDFooter *footer = (VHDFooter *) buf;
char *disk_type_param;
int fd, i;
int i;
uint16_t cyls = 0;
uint8_t heads = 0;
uint8_t secs_per_cyl = 0;
@@ -758,10 +753,12 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t total_size;
int disk_type;
int ret = -EIO;
bool nocow = false;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
/* Read out options */
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
if (disk_type_param) {
if (!strcmp(disk_type_param, "dynamic")) {
@@ -775,28 +772,17 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
} else {
disk_type = VHD_DYNAMIC;
}
nocow = qemu_opt_get_bool_del(opts, BLOCK_OPT_NOCOW, false);
/* Create the file */
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
if (fd < 0) {
ret = -EIO;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
}
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value will
* be ignored since any failure of this operation should not block the
* left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
}
/*
@@ -810,7 +796,7 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
&secs_per_cyl))
{
ret = -EFBIG;
goto fail;
goto out;
}
}
@@ -824,46 +810,45 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
memcpy(footer->creator_app, "qemu", 4);
memcpy(footer->creator_os, "Wi2k", 4);
footer->features = be32_to_cpu(0x02);
footer->version = be32_to_cpu(0x00010000);
footer->features = cpu_to_be32(0x02);
footer->version = cpu_to_be32(0x00010000);
if (disk_type == VHD_DYNAMIC) {
footer->data_offset = be64_to_cpu(HEADER_SIZE);
footer->data_offset = cpu_to_be64(HEADER_SIZE);
} else {
footer->data_offset = be64_to_cpu(0xFFFFFFFFFFFFFFFFULL);
footer->data_offset = cpu_to_be64(0xFFFFFFFFFFFFFFFFULL);
}
footer->timestamp = be32_to_cpu(time(NULL) - VHD_TIMESTAMP_BASE);
footer->timestamp = cpu_to_be32(time(NULL) - VHD_TIMESTAMP_BASE);
/* Version of Virtual PC 2007 */
footer->major = be16_to_cpu(0x0005);
footer->minor = be16_to_cpu(0x0003);
footer->major = cpu_to_be16(0x0005);
footer->minor = cpu_to_be16(0x0003);
if (disk_type == VHD_DYNAMIC) {
footer->orig_size = be64_to_cpu(total_sectors * 512);
footer->size = be64_to_cpu(total_sectors * 512);
footer->orig_size = cpu_to_be64(total_sectors * 512);
footer->size = cpu_to_be64(total_sectors * 512);
} else {
footer->orig_size = be64_to_cpu(total_size);
footer->size = be64_to_cpu(total_size);
footer->orig_size = cpu_to_be64(total_size);
footer->size = cpu_to_be64(total_size);
}
footer->cyls = be16_to_cpu(cyls);
footer->cyls = cpu_to_be16(cyls);
footer->heads = heads;
footer->secs_per_cyl = secs_per_cyl;
footer->type = be32_to_cpu(disk_type);
footer->type = cpu_to_be32(disk_type);
#if defined(CONFIG_UUID)
uuid_generate(footer->uuid);
#endif
footer->checksum = be32_to_cpu(vpc_checksum(buf, HEADER_SIZE));
footer->checksum = cpu_to_be32(vpc_checksum(buf, HEADER_SIZE));
if (disk_type == VHD_DYNAMIC) {
ret = create_dynamic_disk(fd, buf, total_sectors);
ret = create_dynamic_disk(bs, buf, total_sectors);
} else {
ret = create_fixed_disk(fd, buf, total_size);
ret = create_fixed_disk(bs, buf, total_size);
}
fail:
qemu_close(fd);
out:
bdrv_unref(bs);
g_free(disk_type_param);
return ret;
}
@@ -873,7 +858,7 @@ static int vpc_has_zero_init(BlockDriverState *bs)
BDRVVPCState *s = bs->opaque;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_has_zero_init(bs->file);
} else {
return 1;

View File

@@ -52,10 +52,6 @@
#define DLOG(a) a
#undef stderr
#define stderr STDERR
FILE* stderr = NULL;
static void checkpoint(void);
#ifdef __MINGW32__
@@ -732,7 +728,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
if(first_cluster == 0 && (is_dotdot || is_dot))
continue;
buffer=(char*)g_malloc(length);
buffer = g_malloc(length);
snprintf(buffer,length,"%s/%s",dirname,entry->d_name);
if(stat(buffer,&st)<0) {
@@ -767,7 +763,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
/* create mapping for this file */
if(!is_dot && !is_dotdot && (S_ISDIR(st.st_mode) || st.st_size)) {
s->current_mapping=(mapping_t*)array_get_next(&(s->mapping));
s->current_mapping = array_get_next(&(s->mapping));
s->current_mapping->begin=0;
s->current_mapping->end=st.st_size;
/*
@@ -811,12 +807,12 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
}
/* reget the mapping, since s->mapping was possibly realloc()ed */
mapping = (mapping_t*)array_get(&(s->mapping), mapping_index);
mapping = array_get(&(s->mapping), mapping_index);
first_cluster += (s->directory.next - mapping->info.dir.first_dir_index)
* 0x20 / s->cluster_size;
mapping->end = first_cluster;
direntry = (direntry_t*)array_get(&(s->directory), mapping->dir_index);
direntry = array_get(&(s->directory), mapping->dir_index);
set_begin_of_direntry(direntry, mapping->begin);
return 0;
@@ -1082,11 +1078,6 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
vvv = s;
#endif
DLOG(if (stderr == NULL) {
stderr = fopen("vvfat.log", "a");
setbuf(stderr, NULL);
})
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -1191,7 +1182,7 @@ DLOG(if (stderr == NULL) {
if (s->qcow) {
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vvfat (rw)", bs->device_name, "live migration");
"vvfat (rw)", bdrv_get_device_name(bs), "live migration");
migrate_add_blocker(s->migration_blocker);
}
@@ -2948,9 +2939,9 @@ static int enable_write_target(BDRVVVFATState *s, Error **errp)
unlink(s->qcow_filename);
#endif
bdrv_set_backing_hd(s->bs, bdrv_new("", &error_abort));
bdrv_set_backing_hd(s->bs, bdrv_new());
s->bs->backing_hd->drv = &vvfat_write_target;
s->bs->backing_hd->opaque = g_malloc(sizeof(void*));
s->bs->backing_hd->opaque = g_new(void *, 1);
*(void**)s->bs->backing_hd->opaque = s;
return 0;

View File

@@ -44,7 +44,7 @@ struct QEMUWin32AIOState {
};
typedef struct QEMUWin32AIOCB {
BlockDriverAIOCB common;
BlockAIOCB common;
struct QEMUWin32AIOState *ctx;
int nbytes;
OVERLAPPED ov;
@@ -88,7 +88,7 @@ static void win32_aio_process_completion(QEMUWin32AIOState *s,
waiocb->common.cb(waiocb->common.opaque, ret);
qemu_aio_release(waiocb);
qemu_aio_unref(waiocb);
}
static void win32_aio_completion_cb(EventNotifier *e)
@@ -106,28 +106,14 @@ static void win32_aio_completion_cb(EventNotifier *e)
}
}
static void win32_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEMUWin32AIOCB *waiocb = (QEMUWin32AIOCB *)blockacb;
/*
* CancelIoEx is only supported in Vista and newer. For now, just
* wait for completion.
*/
while (!HasOverlappedIoCompleted(&waiocb->ov)) {
aio_poll(bdrv_get_aio_context(blockacb->bs), true);
}
}
static const AIOCBInfo win32_aiocb_info = {
.aiocb_size = sizeof(QEMUWin32AIOCB),
.cancel = win32_aio_cancel,
};
BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
QEMUWin32AIOState *aio, HANDLE hfile,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
BlockCompletionFunc *cb, void *opaque, int type)
{
struct QEMUWin32AIOCB *waiocb;
uint64_t offset = sector_num * 512;
@@ -139,7 +125,10 @@ BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
waiocb->is_read = (type == QEMU_AIO_READ);
if (qiov->niov > 1) {
waiocb->buf = qemu_blockalign(bs, qiov->size);
waiocb->buf = qemu_try_blockalign(bs, qiov->size);
if (waiocb->buf == NULL) {
goto out;
}
if (type & QEMU_AIO_WRITE) {
iov_to_buf(qiov->iov, qiov->niov, 0, waiocb->buf, qiov->size);
}
@@ -168,7 +157,8 @@ BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
out_dec_count:
aio->count--;
qemu_aio_release(waiocb);
out:
qemu_aio_unref(waiocb);
return NULL;
}

View File

@@ -108,7 +108,7 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
nbd_export_set_name(exp, device);
n = g_malloc0(sizeof(NBDCloseNotifier));
n = g_new0(NBDCloseNotifier, 1);
n->n.notify = nbd_close_notifier;
n->exp = exp;
bdrv_add_close_notifier(bs, &n->n);

File diff suppressed because it is too large Load Diff

View File

@@ -36,7 +36,7 @@
#include "qapi-event.h"
void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockDriverCompletionFunc *cb,
int64_t speed, BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
BlockJob *job;
@@ -50,6 +50,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
error_setg(&job->blocker, "block device is in use by block job: %s",
BlockJobType_lookup[driver->job_type]);
bdrv_op_block_all(bs, job->blocker);
bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
job->driver = driver;
job->bs = bs;
@@ -107,7 +108,8 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
void block_job_complete(BlockJob *job, Error **errp)
{
if (job->paused || job->cancelled || !job->driver->complete) {
error_set(errp, QERR_BLOCK_JOB_NOT_READY, job->bs->device_name);
error_set(errp, QERR_BLOCK_JOB_NOT_READY,
bdrv_get_device_name(job->bs));
return;
}
@@ -152,27 +154,30 @@ void block_job_iostatus_reset(BlockJob *job)
}
}
struct BlockCancelData {
struct BlockFinishData {
BlockJob *job;
BlockDriverCompletionFunc *cb;
BlockCompletionFunc *cb;
void *opaque;
bool cancelled;
int ret;
};
static void block_job_cancel_cb(void *opaque, int ret)
static void block_job_finish_cb(void *opaque, int ret)
{
struct BlockCancelData *data = opaque;
struct BlockFinishData *data = opaque;
data->cancelled = block_job_is_cancelled(data->job);
data->ret = ret;
data->cb(data->opaque, ret);
}
int block_job_cancel_sync(BlockJob *job)
static int block_job_finish_sync(BlockJob *job,
void (*finish)(BlockJob *, Error **errp),
Error **errp)
{
struct BlockCancelData data;
struct BlockFinishData data;
BlockDriverState *bs = job->bs;
Error *local_err = NULL;
assert(bs->job == job);
@@ -183,15 +188,37 @@ int block_job_cancel_sync(BlockJob *job)
data.cb = job->cb;
data.opaque = job->opaque;
data.ret = -EINPROGRESS;
job->cb = block_job_cancel_cb;
job->cb = block_job_finish_cb;
job->opaque = &data;
block_job_cancel(job);
finish(job, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EBUSY;
}
while (data.ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(bs), true);
}
return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
}
/* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
* used with block_job_finish_sync() without the need for (rather nasty)
* function pointer casts there. */
static void block_job_cancel_err(BlockJob *job, Error **errp)
{
block_job_cancel(job);
}
int block_job_cancel_sync(BlockJob *job)
{
return block_job_finish_sync(job, &block_job_cancel_err, NULL);
}
int block_job_complete_sync(BlockJob *job, Error **errp)
{
return block_job_finish_sync(job, &block_job_complete, errp);
}
void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
{
assert(job->busy);
@@ -205,7 +232,7 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
if (block_job_is_paused(job)) {
qemu_coroutine_yield();
} else {
co_sleep_ns(type, ns);
co_aio_sleep_ns(bdrv_get_aio_context(job->bs), type, ns);
}
job->busy = true;
}
@@ -235,6 +262,7 @@ BlockJobInfo *block_job_query(BlockJob *job)
info->offset = job->offset;
info->speed = job->speed;
info->io_status = job->iostatus;
info->ready = job->ready;
return info;
}
@@ -270,6 +298,8 @@ void block_job_event_completed(BlockJob *job, const char *msg)
void block_job_event_ready(BlockJob *job)
{
job->ready = true;
qapi_event_send_block_job_ready(job->driver->job_type,
bdrv_get_device_name(job->bs),
job->len,
@@ -313,3 +343,48 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
}
return action;
}
typedef struct {
BlockJob *job;
QEMUBH *bh;
AioContext *aio_context;
BlockJobDeferToMainLoopFn *fn;
void *opaque;
} BlockJobDeferToMainLoopData;
static void block_job_defer_to_main_loop_bh(void *opaque)
{
BlockJobDeferToMainLoopData *data = opaque;
AioContext *aio_context;
qemu_bh_delete(data->bh);
/* Prevent race with block_job_defer_to_main_loop() */
aio_context_acquire(data->aio_context);
/* Fetch BDS AioContext again, in case it has changed */
aio_context = bdrv_get_aio_context(data->job->bs);
aio_context_acquire(aio_context);
data->fn(data->job, data->opaque);
aio_context_release(aio_context);
aio_context_release(data->aio_context);
g_free(data);
}
void block_job_defer_to_main_loop(BlockJob *job,
BlockJobDeferToMainLoopFn *fn,
void *opaque)
{
BlockJobDeferToMainLoopData *data = g_malloc(sizeof(*data));
data->job = job;
data->bh = qemu_bh_new(block_job_defer_to_main_loop_bh, data);
data->aio_context = bdrv_get_aio_context(job->bs);
data->fn = fn;
data->opaque = opaque;
qemu_bh_schedule(data->bh);
}

258
bootdevice.c Normal file
View File

@@ -0,0 +1,258 @@
/*
* QEMU Boot Device Implement
*
* Copyright (c) 2014 HUAWEI TECHNOLOGIES CO.,LTD.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "sysemu/sysemu.h"
#include "qapi/visitor.h"
#include "qemu/error-report.h"
typedef struct FWBootEntry FWBootEntry;
struct FWBootEntry {
QTAILQ_ENTRY(FWBootEntry) link;
int32_t bootindex;
DeviceState *dev;
char *suffix;
};
static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
QTAILQ_HEAD_INITIALIZER(fw_boot_order);
void check_boot_index(int32_t bootindex, Error **errp)
{
FWBootEntry *i;
if (bootindex >= 0) {
QTAILQ_FOREACH(i, &fw_boot_order, link) {
if (i->bootindex == bootindex) {
error_setg(errp, "The bootindex %d has already been used",
bootindex);
return;
}
}
}
}
void del_boot_device_path(DeviceState *dev, const char *suffix)
{
FWBootEntry *i;
if (dev == NULL) {
return;
}
QTAILQ_FOREACH(i, &fw_boot_order, link) {
if ((!suffix || !g_strcmp0(i->suffix, suffix)) &&
i->dev == dev) {
QTAILQ_REMOVE(&fw_boot_order, i, link);
g_free(i->suffix);
g_free(i);
break;
}
}
}
void add_boot_device_path(int32_t bootindex, DeviceState *dev,
const char *suffix)
{
FWBootEntry *node, *i;
if (bootindex < 0) {
del_boot_device_path(dev, suffix);
return;
}
assert(dev != NULL || suffix != NULL);
del_boot_device_path(dev, suffix);
node = g_malloc0(sizeof(FWBootEntry));
node->bootindex = bootindex;
node->suffix = g_strdup(suffix);
node->dev = dev;
QTAILQ_FOREACH(i, &fw_boot_order, link) {
if (i->bootindex == bootindex) {
error_report("Two devices with same boot index %d", bootindex);
exit(1);
} else if (i->bootindex < bootindex) {
continue;
}
QTAILQ_INSERT_BEFORE(i, node, link);
return;
}
QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
}
DeviceState *get_boot_device(uint32_t position)
{
uint32_t counter = 0;
FWBootEntry *i = NULL;
DeviceState *res = NULL;
if (!QTAILQ_EMPTY(&fw_boot_order)) {
QTAILQ_FOREACH(i, &fw_boot_order, link) {
if (counter == position) {
res = i->dev;
break;
}
counter++;
}
}
return res;
}
/*
* This function returns null terminated string that consist of new line
* separated device paths.
*
* memory pointed by "size" is assigned total length of the array in bytes
*
*/
char *get_boot_devices_list(size_t *size, bool ignore_suffixes)
{
FWBootEntry *i;
size_t total = 0;
char *list = NULL;
QTAILQ_FOREACH(i, &fw_boot_order, link) {
char *devpath = NULL, *bootpath;
size_t len;
if (i->dev) {
devpath = qdev_get_fw_dev_path(i->dev);
assert(devpath);
}
if (i->suffix && !ignore_suffixes && devpath) {
size_t bootpathlen = strlen(devpath) + strlen(i->suffix) + 1;
bootpath = g_malloc(bootpathlen);
snprintf(bootpath, bootpathlen, "%s%s", devpath, i->suffix);
g_free(devpath);
} else if (devpath) {
bootpath = devpath;
} else if (!ignore_suffixes) {
assert(i->suffix);
bootpath = g_strdup(i->suffix);
} else {
bootpath = g_strdup("");
}
if (total) {
list[total-1] = '\n';
}
len = strlen(bootpath) + 1;
list = g_realloc(list, total + len);
memcpy(&list[total], bootpath, len);
total += len;
g_free(bootpath);
}
*size = total;
if (boot_strict && *size > 0) {
list[total-1] = '\n';
list = g_realloc(list, total + 5);
memcpy(&list[total], "HALT", 5);
*size = total + 5;
}
return list;
}
typedef struct {
int32_t *bootindex;
const char *suffix;
DeviceState *dev;
} BootIndexProperty;
static void device_get_bootindex(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
BootIndexProperty *prop = opaque;
visit_type_int32(v, prop->bootindex, name, errp);
}
static void device_set_bootindex(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
BootIndexProperty *prop = opaque;
int32_t boot_index;
Error *local_err = NULL;
visit_type_int32(v, &boot_index, name, &local_err);
if (local_err) {
goto out;
}
/* check whether bootindex is present in fw_boot_order list */
check_boot_index(boot_index, &local_err);
if (local_err) {
goto out;
}
/* change bootindex to a new one */
*prop->bootindex = boot_index;
add_boot_device_path(*prop->bootindex, prop->dev, prop->suffix);
out:
if (local_err) {
error_propagate(errp, local_err);
}
}
static void property_release_bootindex(Object *obj, const char *name,
void *opaque)
{
BootIndexProperty *prop = opaque;
del_boot_device_path(prop->dev, prop->suffix);
g_free(prop);
}
void device_add_bootindex_property(Object *obj, int32_t *bootindex,
const char *name, const char *suffix,
DeviceState *dev, Error **errp)
{
Error *local_err = NULL;
BootIndexProperty *prop = g_malloc0(sizeof(*prop));
prop->bootindex = bootindex;
prop->suffix = suffix;
prop->dev = dev;
object_property_add(obj, name, "int32",
device_get_bootindex,
device_set_bootindex,
property_release_bootindex,
prop, &local_err);
if (local_err) {
error_propagate(errp, local_err);
g_free(prop);
return;
}
/* initialize devices' bootindex property to -1 */
object_property_set_int(obj, -1, name, NULL);
}

170
configure vendored
View File

@@ -326,7 +326,7 @@ seccomp=""
glusterfs=""
glusterfs_discard="no"
glusterfs_zerofill="no"
virtio_blk_data_plane=""
archipelago=""
gtk=""
gtkabi=""
vte=""
@@ -388,6 +388,7 @@ cpp="${CPP-$cc -E}"
objcopy="${OBJCOPY-${cross_prefix}objcopy}"
ld="${LD-${cross_prefix}ld}"
libtool="${LIBTOOL-${cross_prefix}libtool}"
nm="${NM-${cross_prefix}nm}"
strip="${STRIP-${cross_prefix}strip}"
windres="${WINDRES-${cross_prefix}windres}"
pkg_config_exe="${PKG_CONFIG-${cross_prefix}pkg-config}"
@@ -1087,9 +1088,12 @@ for opt do
;;
--enable-glusterfs) glusterfs="yes"
;;
--disable-virtio-blk-data-plane) virtio_blk_data_plane="no"
--disable-archipelago) archipelago="no"
;;
--enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
--enable-archipelago) archipelago="yes"
;;
--disable-virtio-blk-data-plane|--enable-virtio-blk-data-plane)
echo "$0: $opt is obsolete, virtio-blk data-plane is always on" >&2
;;
--disable-gtk) gtk="no"
;;
@@ -1344,7 +1348,7 @@ Advanced options (experts only):
--enable-linux-aio enable Linux AIO support
--disable-cap-ng disable libcap-ng support
--enable-cap-ng enable libcap-ng support
--disable-attr disables attr and xattr support
--disable-attr disable attr and xattr support
--enable-attr enable attr and xattr support
--disable-blobs disable installing provided firmware blobs
--enable-docs enable documentation build
@@ -1375,20 +1379,22 @@ Advanced options (experts only):
--with-vss-sdk=SDK-path enable Windows VSS support in QEMU Guest Agent
--with-win-sdk=SDK-path path to Windows Platform SDK (to build VSS .tlb)
--disable-seccomp disable seccomp support
--enable-seccomp enables seccomp support
--enable-seccomp enable seccomp support
--with-coroutine=BACKEND coroutine backend. Supported options:
gthread, ucontext, sigaltstack, windows
--disable-coroutine-pool disable coroutine freelist (worse performance)
--enable-coroutine-pool enable coroutine freelist (better performance)
--enable-glusterfs enable GlusterFS backend
--disable-glusterfs disable GlusterFS backend
--enable-archipelago enable Archipelago backend
--disable-archipelago disable Archipelago backend
--enable-gcov enable test coverage analysis with gcov
--gcov=GCOV use specified gcov [$gcov_tool]
--disable-tpm disable TPM support
--enable-tpm enable TPM support
--disable-libssh2 disable ssh block device support
--enable-libssh2 enable ssh block device support
--disable-vhdx disables support for the Microsoft VHDX image format
--disable-vhdx disable support for the Microsoft VHDX image format
--enable-vhdx enable support for the Microsoft VHDX image format
--disable-quorum disable quorum block filter support
--enable-quorum enable quorum block filter support
@@ -1817,7 +1823,8 @@ fi
# libseccomp check
if test "$seccomp" != "no" ; then
if $pkg_config --atleast-version=2.1.0 libseccomp; then
if test "$cpu" = "i386" || test "$cpu" = "x86_64" &&
$pkg_config --atleast-version=2.1.1 libseccomp; then
libs_softmmu="$libs_softmmu `$pkg_config --libs libseccomp`"
QEMU_CFLAGS="$QEMU_CFLAGS `$pkg_config --cflags libseccomp`"
seccomp="yes"
@@ -2709,6 +2716,12 @@ for i in $glib_modules; do
fi
done
# g_test_trap_subprocess added in 2.38. Used by some tests.
glib_subprocess=yes
if ! $pkg_config --atleast-version=2.38 glib-2.0; then
glib_subprocess=no
fi
##########################################
# SHA command probe for modules
if test "$modules" = yes; then
@@ -2730,7 +2743,7 @@ fi
if test "$pixman" = ""; then
if test "$want_tools" = "no" -a "$softmmu" = "no"; then
pixman="none"
elif $pkg_config pixman-1 > /dev/null 2>&1; then
elif $pkg_config --atleast-version=0.21.8 pixman-1 > /dev/null 2>&1; then
pixman="system"
else
pixman="internal"
@@ -2746,11 +2759,12 @@ if test "$pixman" = "none"; then
pixman_cflags=
pixman_libs=
elif test "$pixman" = "system"; then
# pixman version has been checked above
pixman_cflags=`$pkg_config --cflags pixman-1`
pixman_libs=`$pkg_config --libs pixman-1`
else
if test ! -d ${source_path}/pixman/pixman; then
error_exit "pixman not present. Your options:" \
error_exit "pixman >= 0.21.8 not present. Your options:" \
" (1) Preferred: Install the pixman devel package (any recent" \
" distro should have packages as Xorg needs pixman too)." \
" (2) Fetch the pixman submodule, using:" \
@@ -2928,16 +2942,6 @@ else
tpm_passthrough=no
fi
##########################################
# adjust virtio-blk-data-plane based on linux-aio
if test "$virtio_blk_data_plane" = "yes" -a \
"$linux_aio" != "yes" ; then
error_exit "virtio-blk-data-plane requires Linux AIO, please try --enable-linux-aio"
elif test -z "$virtio_blk_data_plane" ; then
virtio_blk_data_plane=$linux_aio
fi
##########################################
# attr probe
@@ -3072,6 +3076,33 @@ EOF
fi
fi
##########################################
# archipelago probe
if test "$archipelago" != "no" ; then
cat > $TMPC <<EOF
#include <stdio.h>
#include <xseg/xseg.h>
#include <xseg/protocol.h>
int main(void) {
xseg_initialize();
return 0;
}
EOF
archipelago_libs=-lxseg
if compile_prog "" "$archipelago_libs"; then
archipelago="yes"
libs_tools="$archipelago_libs $libs_tools"
libs_softmmu="$archipelago_libs $libs_softmmu"
else
if test "$archipelago" = "yes" ; then
feature_not_found "Archipelago backend support" "Install libxseg devel"
fi
archipelago="no"
fi
fi
##########################################
# glusterfs probe
if test "$glusterfs" != "no" ; then
@@ -3087,7 +3118,8 @@ if test "$glusterfs" != "no" ; then
fi
else
if test "$glusterfs" = "yes" ; then
feature_not_found "GlusterFS backend support" "Install glusterfs-api devel"
feature_not_found "GlusterFS backend support" \
"Install glusterfs-api devel >= 3"
fi
glusterfs="no"
fi
@@ -3277,6 +3309,21 @@ if compile_prog "" "" ; then
fallocate_punch_hole=yes
fi
# check for posix_fallocate
posix_fallocate=no
cat > $TMPC << EOF
#include <fcntl.h>
int main(void)
{
posix_fallocate(0, 0, 0);
return 0;
}
EOF
if compile_prog "" "" ; then
posix_fallocate=yes
fi
# check for sync_file_range
sync_file_range=no
cat > $TMPC << EOF
@@ -3421,6 +3468,37 @@ if compile_prog "" "" ; then
sendfile=yes
fi
# check for timerfd support (glibc 2.8 and newer)
timerfd=no
cat > $TMPC << EOF
#include <sys/timerfd.h>
int main(void)
{
return(timerfd_create(CLOCK_REALTIME, 0));
}
EOF
if compile_prog "" "" ; then
timerfd=yes
fi
# check for setns and unshare support
setns=no
cat > $TMPC << EOF
#include <sched.h>
int main(void)
{
int ret;
ret = setns(0, 0);
ret = unshare(0);
return ret;
}
EOF
if compile_prog "" "" ; then
setns=yes
fi
# Check if tools are available to build documentation.
if test "$docs" != "no" ; then
if has makeinfo && has pod2man; then
@@ -3532,7 +3610,8 @@ EOF
spice_server_version=$($pkg_config --modversion spice-server)
else
if test "$spice" = "yes" ; then
feature_not_found "spice" "Install spice-server and spice-protocol devel"
feature_not_found "spice" \
"Install spice-server(>=0.12.0) and spice-protocol(>=0.12.3) devel"
fi
spice="no"
fi
@@ -3563,7 +3642,7 @@ EOF
smartcard_nss="yes"
else
if test "$smartcard_nss" = "yes"; then
feature_not_found "nss"
feature_not_found "nss" "Install nss devel >= 3.12.8"
fi
smartcard_nss="no"
fi
@@ -3579,7 +3658,7 @@ if test "$libusb" != "no" ; then
libs_softmmu="$libs_softmmu $libusb_libs"
else
if test "$libusb" = "yes"; then
feature_not_found "libusb" "Install libusb devel"
feature_not_found "libusb" "Install libusb devel >= 1.0.13"
fi
libusb="no"
fi
@@ -3893,12 +3972,11 @@ else
fi
########################################
# check if we have valgrind/valgrind.h and valgrind/memcheck.h
# check if we have valgrind/valgrind.h
valgrind_h=no
cat > $TMPC << EOF
#include <valgrind/valgrind.h>
#include <valgrind/memcheck.h>
int main(void) {
return 0;
}
@@ -4004,7 +4082,7 @@ if test "$libnfs" != "no" ; then
LIBS="$LIBS $libnfs_libs"
else
if test "$libnfs" = "yes" ; then
feature_not_found "libnfs"
feature_not_found "libnfs" "Install libnfs devel >= 1.9.3"
fi
libnfs="no"
fi
@@ -4134,9 +4212,9 @@ EOF
fi
fi
# add pixman flags after all config tests are done
QEMU_CFLAGS="$QEMU_CFLAGS $pixman_cflags $fdt_cflags"
libs_softmmu="$libs_softmmu $pixman_libs"
# prepend pixman and ftd flags after all config tests are done
QEMU_CFLAGS="$pixman_cflags $fdt_cflags $QEMU_CFLAGS"
libs_softmmu="$pixman_libs $libs_softmmu"
echo "Install prefix $prefix"
echo "BIOS directory `eval echo $qemu_datadir`"
@@ -4251,7 +4329,7 @@ echo "seccomp support $seccomp"
echo "coroutine backend $coroutine"
echo "coroutine pool $coroutine_pool"
echo "GlusterFS support $glusterfs"
echo "virtio-blk-data-plane $virtio_blk_data_plane"
echo "Archipelago support $archipelago"
echo "gcov $gcov_tool"
echo "gcov enabled $gcov"
echo "TPM support $tpm"
@@ -4460,6 +4538,9 @@ fi
if test "$fallocate_punch_hole" = "yes" ; then
echo "CONFIG_FALLOCATE_PUNCH_HOLE=y" >> $config_host_mak
fi
if test "$posix_fallocate" = "yes" ; then
echo "CONFIG_POSIX_FALLOCATE=y" >> $config_host_mak
fi
if test "$sync_file_range" = "yes" ; then
echo "CONFIG_SYNC_FILE_RANGE=y" >> $config_host_mak
fi
@@ -4487,6 +4568,12 @@ fi
if test "$sendfile" = "yes" ; then
echo "CONFIG_SENDFILE=y" >> $config_host_mak
fi
if test "$timerfd" = "yes" ; then
echo "CONFIG_TIMERFD=y" >> $config_host_mak
fi
if test "$setns" = "yes" ; then
echo "CONFIG_SETNS=y" >> $config_host_mak
fi
if test "$inotify" = "yes" ; then
echo "CONFIG_INOTIFY=y" >> $config_host_mak
fi
@@ -4511,6 +4598,9 @@ if test "$bluez" = "yes" ; then
echo "CONFIG_BLUEZ=y" >> $config_host_mak
echo "BLUEZ_CFLAGS=$bluez_cflags" >> $config_host_mak
fi
if test "glib_subprocess" = "yes" ; then
echo "CONFIG_HAS_GLIB_SUBPROCESS_TESTS=y" >> $config_host_mak
fi
echo "GLIB_CFLAGS=$glib_cflags" >> $config_host_mak
if test "$gtk" = "yes" ; then
echo "CONFIG_GTK=y" >> $config_host_mak
@@ -4689,6 +4779,11 @@ if test "$glusterfs_zerofill" = "yes" ; then
echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
fi
if test "$archipelago" = "yes" ; then
echo "CONFIG_ARCHIPELAGO=m" >> $config_host_mak
echo "ARCHIPELAGO_LIBS=$archipelago_libs" >> $config_host_mak
fi
if test "$libssh2" = "yes" ; then
echo "CONFIG_LIBSSH2=m" >> $config_host_mak
echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
@@ -4699,10 +4794,6 @@ if test "$quorum" = "yes" ; then
echo "CONFIG_QUORUM=y" >> $config_host_mak
fi
if test "$virtio_blk_data_plane" = "yes" ; then
echo 'CONFIG_VIRTIO_BLK_DATA_PLANE=$(CONFIG_VIRTIO)' >> $config_host_mak
fi
if test "$vhdx" = "yes" ; then
echo "CONFIG_VHDX=y" >> $config_host_mak
fi
@@ -4809,6 +4900,7 @@ echo "AS=$as" >> $config_host_mak
echo "CPP=$cpp" >> $config_host_mak
echo "OBJCOPY=$objcopy" >> $config_host_mak
echo "LD=$ld" >> $config_host_mak
echo "NM=$nm" >> $config_host_mak
echo "WINDRES=$windres" >> $config_host_mak
echo "LIBTOOL=$libtool" >> $config_host_mak
echo "CFLAGS=$CFLAGS" >> $config_host_mak
@@ -4817,6 +4909,7 @@ echo "QEMU_CFLAGS=$QEMU_CFLAGS" >> $config_host_mak
echo "QEMU_INCLUDES=$QEMU_INCLUDES" >> $config_host_mak
if test "$sparse" = "yes" ; then
echo "CC := REAL_CC=\"\$(CC)\" cgcc" >> $config_host_mak
echo "CXX := REAL_CC=\"\$(CXX)\" cgcc" >> $config_host_mak
echo "HOST_CC := REAL_CC=\"\$(HOST_CC)\" cgcc" >> $config_host_mak
echo "QEMU_CFLAGS += -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-non-pointer-null" >> $config_host_mak
fi
@@ -4937,7 +5030,7 @@ case "$target_name" in
aarch64)
TARGET_BASE_ARCH=arm
bflt="yes"
gdb_xml_files="aarch64-core.xml aarch64-fpu.xml"
gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
;;
cris)
;;
@@ -4966,6 +5059,8 @@ case "$target_name" in
TARGET_BASE_ARCH=mips
echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
;;
tricore)
;;
moxie)
;;
or32)
@@ -5014,6 +5109,7 @@ case "$target_name" in
echo "TARGET_ABI32=y" >> $config_target_mak
;;
s390x)
gdb_xml_files="s390x-core64.xml s390-acr.xml s390-fpr.xml"
;;
unicore32)
;;
@@ -5293,10 +5389,6 @@ for rom in seabios vgabios ; do
echo "LD=$ld" >> $config_mak
done
if test "$docs" = "yes" ; then
mkdir -p QMP
fi
# set up qemu-iotests in this build directory
iotests_common_env="tests/qemu-iotests/common.env"
iotests_check="tests/qemu-iotests/check"

View File

@@ -155,7 +155,7 @@ Coroutine *qemu_coroutine_new(void)
stack_t oss;
sigset_t sigs;
sigset_t osigs;
jmp_buf old_env;
sigjmp_buf old_env;
/* The way to manipulate stack is with the sigaltstack function. We
* prepare a stack, with it delivering a signal to ourselves and then

View File

@@ -18,10 +18,114 @@
*/
#include "config.h"
#include "cpu.h"
#include "trace.h"
#include "disas/disas.h"
#include "tcg.h"
#include "qemu/atomic.h"
#include "sysemu/qtest.h"
#include "qemu/timer.h"
/* -icount align implementation. */
typedef struct SyncClocks {
int64_t diff_clk;
int64_t last_cpu_icount;
int64_t realtime_clock;
} SyncClocks;
#if !defined(CONFIG_USER_ONLY)
/* Allow the guest to have a max 3ms advance.
* The difference between the 2 clocks could therefore
* oscillate around 0.
*/
#define VM_CLOCK_ADVANCE 3000000
#define THRESHOLD_REDUCE 1.5
#define MAX_DELAY_PRINT_RATE 2000000000LL
#define MAX_NB_PRINTS 100
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
int64_t cpu_icount;
if (!icount_align_option) {
return;
}
cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
sc->diff_clk += cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
sc->last_cpu_icount = cpu_icount;
if (sc->diff_clk > VM_CLOCK_ADVANCE) {
#ifndef _WIN32
struct timespec sleep_delay, rem_delay;
sleep_delay.tv_sec = sc->diff_clk / 1000000000LL;
sleep_delay.tv_nsec = sc->diff_clk % 1000000000LL;
if (nanosleep(&sleep_delay, &rem_delay) < 0) {
sc->diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
sc->diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
} else {
sc->diff_clk = 0;
}
#else
Sleep(sc->diff_clk / SCALE_MS);
sc->diff_clk = 0;
#endif
}
}
static void print_delay(const SyncClocks *sc)
{
static float threshold_delay;
static int64_t last_realtime_clock;
static int nb_prints;
if (icount_align_option &&
sc->realtime_clock - last_realtime_clock >= MAX_DELAY_PRINT_RATE &&
nb_prints < MAX_NB_PRINTS) {
if ((-sc->diff_clk / (float)1000000000LL > threshold_delay) ||
(-sc->diff_clk / (float)1000000000LL <
(threshold_delay - THRESHOLD_REDUCE))) {
threshold_delay = (-sc->diff_clk / 1000000000LL) + 1;
printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
threshold_delay - 1,
threshold_delay);
nb_prints++;
last_realtime_clock = sc->realtime_clock;
}
}
}
static void init_delay_params(SyncClocks *sc,
const CPUState *cpu)
{
if (!icount_align_option) {
return;
}
sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
sc->realtime_clock +
cpu_get_clock_offset();
sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
if (sc->diff_clk < max_delay) {
max_delay = sc->diff_clk;
}
if (sc->diff_clk > max_advance) {
max_advance = sc->diff_clk;
}
/* Print every 2s max if the guest is late. We limit the number
of printed messages to NB_PRINT_MAX(currently 100) */
print_delay(sc);
}
#else
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
}
static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
{
}
#endif /* CONFIG USER ONLY */
void cpu_loop_exit(CPUState *cpu)
{
@@ -65,6 +169,9 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr)
#endif /* DEBUG_DISAS */
next_tb = tcg_qemu_tb_exec(env, tb_ptr);
trace_exec_tb_exit((void *) (next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
@@ -105,6 +212,7 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
cpu->current_tb = tb;
/* execute the generated code */
trace_exec_tb_nocache(tb, tb->pc);
cpu_tb_exec(cpu, tb->tc_ptr);
cpu->current_tb = NULL;
tb_phys_invalidate(tb, -1);
@@ -187,16 +295,10 @@ static inline TranslationBlock *tb_find_fast(CPUArchState *env)
return tb;
}
static CPUDebugExcpHandler *debug_excp_handler;
void cpu_set_debug_excp_handler(CPUDebugExcpHandler *handler)
{
debug_excp_handler = handler;
}
static void cpu_handle_debug_exception(CPUArchState *env)
{
CPUState *cpu = ENV_GET_CPU(env);
CPUClass *cc = CPU_GET_CLASS(cpu);
CPUWatchpoint *wp;
if (!cpu->watchpoint_hit) {
@@ -204,9 +306,8 @@ static void cpu_handle_debug_exception(CPUArchState *env)
wp->flags &= ~BP_WATCHPOINT_HIT;
}
}
if (debug_excp_handler) {
debug_excp_handler(env);
}
cc->debug_excp_handler(cpu);
}
/* main execution loop */
@@ -216,10 +317,7 @@ volatile sig_atomic_t exit_request;
int cpu_exec(CPUArchState *env)
{
CPUState *cpu = ENV_GET_CPU(env);
#if !(defined(CONFIG_USER_ONLY) && \
(defined(TARGET_M68K) || defined(TARGET_PPC) || defined(TARGET_S390X)))
CPUClass *cc = CPU_GET_CLASS(cpu);
#endif
#ifdef TARGET_I386
X86CPU *x86_cpu = X86_CPU(cpu);
#endif
@@ -227,6 +325,8 @@ int cpu_exec(CPUArchState *env)
TranslationBlock *tb;
uint8_t *tc_ptr;
uintptr_t next_tb;
SyncClocks sc;
/* This must be volatile so it is not trashed by longjmp() */
volatile bool have_tb_lock = false;
@@ -252,37 +352,16 @@ int cpu_exec(CPUArchState *env)
cpu->exit_request = 1;
}
#if defined(TARGET_I386)
/* put eflags in CPU temporary format */
CC_SRC = env->eflags & (CC_O | CC_S | CC_Z | CC_A | CC_P | CC_C);
env->df = 1 - (2 * ((env->eflags >> 10) & 1));
CC_OP = CC_OP_EFLAGS;
env->eflags &= ~(DF_MASK | CC_O | CC_S | CC_Z | CC_A | CC_P | CC_C);
#elif defined(TARGET_SPARC)
#elif defined(TARGET_M68K)
env->cc_op = CC_OP_FLAGS;
env->cc_dest = env->sr & 0xf;
env->cc_x = (env->sr >> 4) & 1;
#elif defined(TARGET_ALPHA)
#elif defined(TARGET_ARM)
#elif defined(TARGET_UNICORE32)
#elif defined(TARGET_PPC)
env->reserve_addr = -1;
#elif defined(TARGET_LM32)
#elif defined(TARGET_MICROBLAZE)
#elif defined(TARGET_MIPS)
#elif defined(TARGET_MOXIE)
#elif defined(TARGET_OPENRISC)
#elif defined(TARGET_SH4)
#elif defined(TARGET_CRIS)
#elif defined(TARGET_S390X)
#elif defined(TARGET_XTENSA)
/* XXXXX */
#else
#error unsupported target CPU
#endif
cc->cpu_exec_enter(cpu);
cpu->exception_index = -1;
/* Calculate difference between guest clock and host clock.
* This delay includes the delay of the last cycle, so
* what we have to do is sleep until it is 0. As for the
* advance/delay we gain here, we try to fix it next time.
*/
init_delay_params(&sc, cpu);
/* prepare setjmp context for exception handling */
for(;;) {
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -325,16 +404,12 @@ int cpu_exec(CPUArchState *env)
cpu->exception_index = EXCP_DEBUG;
cpu_loop_exit(cpu);
}
#if defined(TARGET_ARM) || defined(TARGET_SPARC) || defined(TARGET_MIPS) || \
defined(TARGET_PPC) || defined(TARGET_ALPHA) || defined(TARGET_CRIS) || \
defined(TARGET_MICROBLAZE) || defined(TARGET_LM32) || defined(TARGET_UNICORE32)
if (interrupt_request & CPU_INTERRUPT_HALT) {
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
cpu->exception_index = EXCP_HLT;
cpu_loop_exit(cpu);
}
#endif
#if defined(TARGET_I386)
if (interrupt_request & CPU_INTERRUPT_INIT) {
cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
@@ -347,251 +422,15 @@ int cpu_exec(CPUArchState *env)
cpu_reset(cpu);
}
#endif
#if defined(TARGET_I386)
#if !defined(CONFIG_USER_ONLY)
if (interrupt_request & CPU_INTERRUPT_POLL) {
cpu->interrupt_request &= ~CPU_INTERRUPT_POLL;
apic_poll_irq(x86_cpu->apic_state);
}
#endif
if (interrupt_request & CPU_INTERRUPT_SIPI) {
do_cpu_sipi(x86_cpu);
} else if (env->hflags2 & HF2_GIF_MASK) {
if ((interrupt_request & CPU_INTERRUPT_SMI) &&
!(env->hflags & HF_SMM_MASK)) {
cpu_svm_check_intercept_param(env, SVM_EXIT_SMI,
0);
cpu->interrupt_request &= ~CPU_INTERRUPT_SMI;
do_smm_enter(x86_cpu);
next_tb = 0;
} else if ((interrupt_request & CPU_INTERRUPT_NMI) &&
!(env->hflags2 & HF2_NMI_MASK)) {
cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
env->hflags2 |= HF2_NMI_MASK;
do_interrupt_x86_hardirq(env, EXCP02_NMI, 1);
next_tb = 0;
} else if (interrupt_request & CPU_INTERRUPT_MCE) {
cpu->interrupt_request &= ~CPU_INTERRUPT_MCE;
do_interrupt_x86_hardirq(env, EXCP12_MCHK, 0);
next_tb = 0;
} else if ((interrupt_request & CPU_INTERRUPT_HARD) &&
(((env->hflags2 & HF2_VINTR_MASK) &&
(env->hflags2 & HF2_HIF_MASK)) ||
(!(env->hflags2 & HF2_VINTR_MASK) &&
(env->eflags & IF_MASK &&
!(env->hflags & HF_INHIBIT_IRQ_MASK))))) {
int intno;
cpu_svm_check_intercept_param(env, SVM_EXIT_INTR,
0);
cpu->interrupt_request &= ~(CPU_INTERRUPT_HARD |
CPU_INTERRUPT_VIRQ);
intno = cpu_get_pic_interrupt(env);
qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing hardware INT=0x%02x\n", intno);
do_interrupt_x86_hardirq(env, intno, 1);
/* ensure that no TB jump will be modified as
the program flow was changed */
next_tb = 0;
#if !defined(CONFIG_USER_ONLY)
} else if ((interrupt_request & CPU_INTERRUPT_VIRQ) &&
(env->eflags & IF_MASK) &&
!(env->hflags & HF_INHIBIT_IRQ_MASK)) {
int intno;
/* FIXME: this should respect TPR */
cpu_svm_check_intercept_param(env, SVM_EXIT_VINTR,
0);
intno = ldl_phys(cpu->as,
env->vm_vmcb
+ offsetof(struct vmcb,
control.int_vector));
qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing virtual hardware INT=0x%02x\n", intno);
do_interrupt_x86_hardirq(env, intno, 1);
cpu->interrupt_request &= ~CPU_INTERRUPT_VIRQ;
next_tb = 0;
#endif
}
}
#elif defined(TARGET_PPC)
if (interrupt_request & CPU_INTERRUPT_HARD) {
ppc_hw_interrupt(env);
if (env->pending_interrupts == 0) {
cpu->interrupt_request &= ~CPU_INTERRUPT_HARD;
}
/* The target hook has 3 exit conditions:
False when the interrupt isn't processed,
True when it is, and we should restart on a new TB,
and via longjmp via cpu_loop_exit. */
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
next_tb = 0;
}
#elif defined(TARGET_LM32)
if ((interrupt_request & CPU_INTERRUPT_HARD)
&& (env->ie & IE_IE)) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_MICROBLAZE)
if ((interrupt_request & CPU_INTERRUPT_HARD)
&& (env->sregs[SR_MSR] & MSR_IE)
&& !(env->sregs[SR_MSR] & (MSR_EIP | MSR_BIP))
&& !(env->iflags & (D_FLAG | IMM_FLAG))) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_MIPS)
if ((interrupt_request & CPU_INTERRUPT_HARD) &&
cpu_mips_hw_interrupts_pending(env)) {
/* Raise it */
cpu->exception_index = EXCP_EXT_INTERRUPT;
env->error_code = 0;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_OPENRISC)
{
int idx = -1;
if ((interrupt_request & CPU_INTERRUPT_HARD)
&& (env->sr & SR_IEE)) {
idx = EXCP_INT;
}
if ((interrupt_request & CPU_INTERRUPT_TIMER)
&& (env->sr & SR_TEE)) {
idx = EXCP_TICK;
}
if (idx >= 0) {
cpu->exception_index = idx;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
#elif defined(TARGET_SPARC)
if (interrupt_request & CPU_INTERRUPT_HARD) {
if (cpu_interrupts_enabled(env) &&
env->interrupt_index > 0) {
int pil = env->interrupt_index & 0xf;
int type = env->interrupt_index & 0xf0;
if (((type == TT_EXTINT) &&
cpu_pil_allowed(env, pil)) ||
type != TT_EXTINT) {
cpu->exception_index = env->interrupt_index;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
}
#elif defined(TARGET_ARM)
if (interrupt_request & CPU_INTERRUPT_FIQ
&& !(env->daif & PSTATE_F)) {
cpu->exception_index = EXCP_FIQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
/* ARMv7-M interrupt return works by loading a magic value
into the PC. On real hardware the load causes the
return to occur. The qemu implementation performs the
jump normally, then does the exception return when the
CPU tries to execute code at the magic address.
This will cause the magic PC value to be pushed to
the stack if an interrupt occurred at the wrong time.
We avoid this by disabling interrupts when
pc contains a magic address. */
if (interrupt_request & CPU_INTERRUPT_HARD
&& ((IS_M(env) && env->regs[15] < 0xfffffff0)
|| !(env->daif & PSTATE_I))) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_UNICORE32)
if (interrupt_request & CPU_INTERRUPT_HARD
&& !(env->uncached_asr & ASR_I)) {
cpu->exception_index = UC32_EXCP_INTR;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_SH4)
if (interrupt_request & CPU_INTERRUPT_HARD) {
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_ALPHA)
{
int idx = -1;
/* ??? This hard-codes the OSF/1 interrupt levels. */
switch (env->pal_mode ? 7 : env->ps & PS_INT_MASK) {
case 0 ... 3:
if (interrupt_request & CPU_INTERRUPT_HARD) {
idx = EXCP_DEV_INTERRUPT;
}
/* FALLTHRU */
case 4:
if (interrupt_request & CPU_INTERRUPT_TIMER) {
idx = EXCP_CLK_INTERRUPT;
}
/* FALLTHRU */
case 5:
if (interrupt_request & CPU_INTERRUPT_SMP) {
idx = EXCP_SMP_INTERRUPT;
}
/* FALLTHRU */
case 6:
if (interrupt_request & CPU_INTERRUPT_MCHK) {
idx = EXCP_MCHK;
}
}
if (idx >= 0) {
cpu->exception_index = idx;
env->error_code = 0;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
#elif defined(TARGET_CRIS)
if (interrupt_request & CPU_INTERRUPT_HARD
&& (env->pregs[PR_CCS] & I_FLAG)
&& !env->locked_irq) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
if (interrupt_request & CPU_INTERRUPT_NMI) {
unsigned int m_flag_archval;
if (env->pregs[PR_VR] < 32) {
m_flag_archval = M_FLAG_V10;
} else {
m_flag_archval = M_FLAG_V32;
}
if ((env->pregs[PR_CCS] & m_flag_archval)) {
cpu->exception_index = EXCP_NMI;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
#elif defined(TARGET_M68K)
if (interrupt_request & CPU_INTERRUPT_HARD
&& ((env->sr & SR_I) >> SR_I_SHIFT)
< env->pending_level) {
/* Real hardware gets the interrupt vector via an
IACK cycle at this point. Current emulated
hardware doesn't rely on this, so we
provide/save the vector when the interrupt is
first signalled. */
cpu->exception_index = env->pending_vector;
do_interrupt_m68k_hardirq(env);
next_tb = 0;
}
#elif defined(TARGET_S390X) && !defined(CONFIG_USER_ONLY)
if ((interrupt_request & CPU_INTERRUPT_HARD) &&
(env->psw.mask & PSW_MASK_EXT)) {
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_XTENSA)
if (interrupt_request & CPU_INTERRUPT_HARD) {
cpu->exception_index = EXC_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#endif
/* Don't use the cached interrupt_request value,
do_interrupt may have updated the EXITTB flag. */
/* Don't use the cached interrupt_request value,
do_interrupt may have updated the EXITTB flag. */
if (cpu->interrupt_request & CPU_INTERRUPT_EXITTB) {
cpu->interrupt_request &= ~CPU_INTERRUPT_EXITTB;
/* ensure that no TB jump will be modified as
@@ -637,6 +476,7 @@ int cpu_exec(CPUArchState *env)
cpu->current_tb = tb;
barrier();
if (likely(!cpu->exit_request)) {
trace_exec_tb(tb, tb->pc);
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = cpu_tb_exec(cpu, tc_ptr);
@@ -672,6 +512,7 @@ int cpu_exec(CPUArchState *env)
if (insns_left > 0) {
/* Execute remaining instructions. */
cpu_exec_nocache(env, insns_left, tb);
align_clocks(&sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
next_tb = 0;
@@ -684,6 +525,9 @@ int cpu_exec(CPUArchState *env)
}
}
cpu->current_tb = NULL;
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
/* reset soft MMU for next block (it can currently
only be set by a memory fault) */
} /* for(;;) */
@@ -692,10 +536,7 @@ int cpu_exec(CPUArchState *env)
* local variables as longjmp is marked 'noreturn'. */
cpu = current_cpu;
env = cpu->env_ptr;
#if !(defined(CONFIG_USER_ONLY) && \
(defined(TARGET_M68K) || defined(TARGET_PPC) || defined(TARGET_S390X)))
cc = CPU_GET_CLASS(cpu);
#endif
#ifdef TARGET_I386
x86_cpu = X86_CPU(cpu);
#endif
@@ -706,35 +547,7 @@ int cpu_exec(CPUArchState *env)
}
} /* for(;;) */
#if defined(TARGET_I386)
/* restore flags in standard format */
env->eflags = env->eflags | cpu_cc_compute_all(env, CC_OP)
| (env->df & DF_MASK);
#elif defined(TARGET_ARM)
/* XXX: Save/restore host fpu exception state?. */
#elif defined(TARGET_UNICORE32)
#elif defined(TARGET_SPARC)
#elif defined(TARGET_PPC)
#elif defined(TARGET_LM32)
#elif defined(TARGET_M68K)
cpu_m68k_flush_flags(env, env->cc_op);
env->cc_op = CC_OP_FLAGS;
env->sr = (env->sr & 0xffe0)
| env->cc_dest | (env->cc_x << 4);
#elif defined(TARGET_MICROBLAZE)
#elif defined(TARGET_MIPS)
#elif defined(TARGET_MOXIE)
#elif defined(TARGET_OPENRISC)
#elif defined(TARGET_SH4)
#elif defined(TARGET_ALPHA)
#elif defined(TARGET_CRIS)
#elif defined(TARGET_S390X)
#elif defined(TARGET_XTENSA)
/* XXXXX */
#else
#error unsupported target CPU
#endif
cc->cpu_exec_exit(cpu);
/* fail safe : never use current_cpu outside cpu_exec() */
current_cpu = NULL;

154
cpus.c
View File

@@ -40,6 +40,7 @@
#include "qemu/bitmap.h"
#include "qemu/seqlock.h"
#include "qapi-event.h"
#include "hw/nmi.h"
#ifndef _WIN32
#include "qemu/compatfd.h"
@@ -64,6 +65,8 @@
#endif /* CONFIG_LINUX */
static CPUState *next_cpu;
int64_t max_delay;
int64_t max_advance;
bool cpu_is_stopped(CPUState *cpu)
{
@@ -102,17 +105,12 @@ static bool all_cpu_threads_idle(void)
/* Protected by TimersState seqlock */
/* Compensate for varying guest execution speed. */
static int64_t qemu_icount_bias;
static int64_t vm_clock_warp_start;
static int64_t vm_clock_warp_start = -1;
/* Conversion factor from emulated instructions to virtual clock ticks. */
static int icount_time_shift;
/* Arbitrarily pick 1MIPS as the minimum allowable speed. */
#define MAX_ICOUNT_SHIFT 10
/* Only written by TCG thread */
static int64_t qemu_icount;
static QEMUTimer *icount_rt_timer;
static QEMUTimer *icount_vm_timer;
static QEMUTimer *icount_warp_timer;
@@ -129,6 +127,11 @@ typedef struct TimersState {
int64_t cpu_clock_offset;
int32_t cpu_ticks_enabled;
int64_t dummy;
/* Compensate for varying guest execution speed. */
int64_t qemu_icount_bias;
/* Only written by TCG thread */
int64_t qemu_icount;
} TimersState;
static TimersState timers_state;
@@ -139,14 +142,14 @@ static int64_t cpu_get_icount_locked(void)
int64_t icount;
CPUState *cpu = current_cpu;
icount = qemu_icount;
icount = timers_state.qemu_icount;
if (cpu) {
if (!cpu_can_do_io(cpu)) {
fprintf(stderr, "Bad clock read\n");
}
icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
}
return qemu_icount_bias + (icount << icount_time_shift);
return timers_state.qemu_icount_bias + cpu_icount_to_ns(icount);
}
int64_t cpu_get_icount(void)
@@ -162,6 +165,11 @@ int64_t cpu_get_icount(void)
return icount;
}
int64_t cpu_icount_to_ns(int64_t icount)
{
return icount << icount_time_shift;
}
/* return the host CPU cycle counter and handle stop/restart */
/* Caller must hold the BQL */
int64_t cpu_get_ticks(void)
@@ -214,6 +222,23 @@ int64_t cpu_get_clock(void)
return ti;
}
/* return the offset between the host clock and virtual CPU clock */
int64_t cpu_get_clock_offset(void)
{
int64_t ti;
unsigned start;
do {
start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
ti = timers_state.cpu_clock_offset;
if (!timers_state.cpu_ticks_enabled) {
ti -= get_clock();
}
} while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
return -ti;
}
/* enable cpu_get_ticks()
* Caller must hold BQL which server as mutex for vm_clock_seqlock.
*/
@@ -284,7 +309,8 @@ static void icount_adjust(void)
icount_time_shift++;
}
last_delta = delta;
qemu_icount_bias = cur_icount - (qemu_icount << icount_time_shift);
timers_state.qemu_icount_bias = cur_icount
- (timers_state.qemu_icount << icount_time_shift);
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
}
@@ -333,7 +359,7 @@ static void icount_warp_rt(void *opaque)
int64_t delta = cur_time - cur_icount;
warp_delta = MIN(warp_delta, delta);
}
qemu_icount_bias += warp_delta;
timers_state.qemu_icount_bias += warp_delta;
}
vm_clock_warp_start = -1;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
@@ -351,7 +377,7 @@ void qtest_clock_warp(int64_t dest)
int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
seqlock_write_lock(&timers_state.vm_clock_seqlock);
qemu_icount_bias += warp;
timers_state.qemu_icount_bias += warp;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
@@ -428,6 +454,25 @@ void qemu_clock_warp(QEMUClockType type)
}
}
static bool icount_state_needed(void *opaque)
{
return use_icount;
}
/*
* This is a subsection for icount migration.
*/
static const VMStateDescription icount_vmstate_timers = {
.name = "timer/icount",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_INT64(qemu_icount_bias, TimersState),
VMSTATE_INT64(qemu_icount, TimersState),
VMSTATE_END_OF_LIST()
}
};
static const VMStateDescription vmstate_timers = {
.name = "timer",
.version_id = 2,
@@ -437,23 +482,48 @@ static const VMStateDescription vmstate_timers = {
VMSTATE_INT64(dummy, TimersState),
VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
VMSTATE_END_OF_LIST()
},
.subsections = (VMStateSubsection[]) {
{
.vmsd = &icount_vmstate_timers,
.needed = icount_state_needed,
}, {
/* empty */
}
}
};
void configure_icount(const char *option)
void cpu_ticks_init(void)
{
seqlock_init(&timers_state.vm_clock_seqlock, NULL);
vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
}
void configure_icount(QemuOpts *opts, Error **errp)
{
const char *option;
char *rem_str = NULL;
option = qemu_opt_get(opts, "shift");
if (!option) {
if (qemu_opt_get(opts, "align") != NULL) {
error_setg(errp, "Please specify shift option when using align");
}
return;
}
icount_align_option = qemu_opt_get_bool(opts, "align", false);
icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
icount_warp_rt, NULL);
if (strcmp(option, "auto") != 0) {
icount_time_shift = strtol(option, NULL, 0);
errno = 0;
icount_time_shift = strtol(option, &rem_str, 0);
if (errno != 0 || *rem_str != '\0' || !strlen(option)) {
error_setg(errp, "icount: Invalid shift value");
}
use_icount = 1;
return;
} else if (icount_align_option) {
error_setg(errp, "shift=auto and align=on are incompatible");
}
use_icount = 2;
@@ -523,6 +593,15 @@ void cpu_synchronize_all_post_init(void)
}
}
void cpu_clean_all_dirty(void)
{
CPUState *cpu;
CPU_FOREACH(cpu) {
cpu_clean_state(cpu);
}
}
static int do_vm_stop(RunState state)
{
int ret = 0;
@@ -1250,7 +1329,8 @@ static int tcg_cpu_exec(CPUArchState *env)
int64_t count;
int64_t deadline;
int decr;
qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
cpu->icount_decr.u16.low = 0;
cpu->icount_extra = 0;
deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
@@ -1265,7 +1345,7 @@ static int tcg_cpu_exec(CPUArchState *env)
}
count = qemu_icount_round(deadline);
qemu_icount += count;
timers_state.qemu_icount += count;
decr = (count > 0xffff) ? 0xffff : count;
count -= decr;
cpu->icount_decr.u16.low = decr;
@@ -1278,7 +1358,8 @@ static int tcg_cpu_exec(CPUArchState *env)
if (use_icount) {
/* Fold pending instructions back into the
instruction counter, and clear the interrupt flag. */
qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
cpu->icount_decr.u32 = 0;
cpu->icount_extra = 0;
}
@@ -1342,6 +1423,9 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_MIPS)
MIPSCPU *mips_cpu = MIPS_CPU(cpu);
CPUMIPSState *env = &mips_cpu->env;
#elif defined(TARGET_TRICORE)
TriCoreCPU *tricore_cpu = TRICORE_CPU(cpu);
CPUTriCoreState *env = &tricore_cpu->env;
#endif
cpu_synchronize_state(cpu);
@@ -1366,6 +1450,9 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_MIPS)
info->value->has_PC = true;
info->value->PC = env->active_tc.PC;
#elif defined(TARGET_TRICORE)
info->value->has_PC = true;
info->value->PC = env->PC;
#endif
/* XXX: waiting for the qapi to support GSList */
@@ -1469,21 +1556,24 @@ void qmp_inject_nmi(Error **errp)
apic_deliver_nmi(cpu->apic_state);
}
}
#elif defined(TARGET_S390X)
CPUState *cs;
S390CPU *cpu;
CPU_FOREACH(cs) {
cpu = S390_CPU(cs);
if (cpu->env.cpu_num == monitor_get_cpu_index()) {
if (s390_cpu_restart(S390_CPU(cs)) == -1) {
error_set(errp, QERR_UNSUPPORTED);
return;
}
break;
}
}
#else
error_set(errp, QERR_UNSUPPORTED);
nmi_monitor_handle(monitor_get_cpu_index(), errp);
#endif
}
void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)
{
if (!use_icount) {
return;
}
cpu_fprintf(f, "Host - Guest clock %"PRIi64" ms\n",
(cpu_get_clock() - cpu_get_icount())/SCALE_MS);
if (icount_align_option) {
cpu_fprintf(f, "Max guest delay %"PRIi64" ms\n", -max_delay/SCALE_MS);
cpu_fprintf(f, "Max guest advance %"PRIi64" ms\n", max_advance/SCALE_MS);
} else {
cpu_fprintf(f, "Max guest delay NA\n");
cpu_fprintf(f, "Max guest advance NA\n");
}
}

View File

@@ -60,8 +60,10 @@ void tlb_flush(CPUState *cpu, int flush_global)
cpu->current_tb = NULL;
memset(env->tlb_table, -1, sizeof(env->tlb_table));
memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
env->vtlb_index = 0;
env->tlb_flush_addr = -1;
env->tlb_flush_mask = 0;
tlb_flush_count++;
@@ -108,6 +110,14 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
}
/* check whether there are entries that need to be flushed in the vtlb */
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
int k;
for (k = 0; k < CPU_VTLB_SIZE; k++) {
tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
}
}
tb_flush_jmp_cache(cpu, addr);
}
@@ -172,6 +182,11 @@ void cpu_tlb_reset_dirty_all(ram_addr_t start1, ram_addr_t length)
tlb_reset_dirty_range(&env->tlb_table[mmu_idx][i],
start1, length);
}
for (i = 0; i < CPU_VTLB_SIZE; i++) {
tlb_reset_dirty_range(&env->tlb_v_table[mmu_idx][i],
start1, length);
}
}
}
}
@@ -195,6 +210,13 @@ void tlb_set_dirty(CPUArchState *env, target_ulong vaddr)
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
tlb_set_dirty1(&env->tlb_table[mmu_idx][i], vaddr);
}
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
int k;
for (k = 0; k < CPU_VTLB_SIZE; k++) {
tlb_set_dirty1(&env->tlb_v_table[mmu_idx][k], vaddr);
}
}
}
/* Our TLB does not support large pages, so remember the area covered by
@@ -235,6 +257,7 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
uintptr_t addend;
CPUTLBEntry *te;
hwaddr iotlb, xlat, sz;
unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
assert(size >= TARGET_PAGE_SIZE);
if (size != TARGET_PAGE_SIZE) {
@@ -267,8 +290,14 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
prot, &address);
index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
env->iotlb[mmu_idx][index] = iotlb - vaddr;
te = &env->tlb_table[mmu_idx][index];
/* do not discard the translation in te, evict it into a victim tlb */
env->tlb_v_table[mmu_idx][vidx] = *te;
env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
/* refill the tlb */
env->iotlb[mmu_idx][index] = iotlb - vaddr;
te->addend = addend - vaddr;
if (prot & PAGE_READ) {
te->addr_read = address;

View File

@@ -32,6 +32,5 @@ CONFIG_G364FB=y
CONFIG_I8259=y
CONFIG_JAZZ_LED=y
CONFIG_MC146818RTC=y
CONFIG_VT82C686=y
CONFIG_ISA_TESTDEV=y
CONFIG_EMPTY_SLOT=y

View File

@@ -32,6 +32,5 @@ CONFIG_G364FB=y
CONFIG_I8259=y
CONFIG_JAZZ_LED=y
CONFIG_MC146818RTC=y
CONFIG_VT82C686=y
CONFIG_ISA_TESTDEV=y
CONFIG_EMPTY_SLOT=y

View File

@@ -32,6 +32,5 @@ CONFIG_G364FB=y
CONFIG_I8259=y
CONFIG_JAZZ_LED=y
CONFIG_MC146818RTC=y
CONFIG_VT82C686=y
CONFIG_ISA_TESTDEV=y
CONFIG_EMPTY_SLOT=y

View File

@@ -45,8 +45,8 @@ CONFIG_PREP=y
CONFIG_MAC=y
CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For PReP
CONFIG_MC146818RTC=y
CONFIG_ETSEC=y
CONFIG_ISA_TESTDEV=y
CONFIG_LIBDECNUMBER=y

View File

@@ -46,6 +46,8 @@ CONFIG_PREP=y
CONFIG_MAC=y
CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For pSeries
CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
@@ -58,4 +60,3 @@ CONFIG_I82374=y
CONFIG_I8257=y
CONFIG_MC146818RTC=y
CONFIG_ISA_TESTDEV=y
CONFIG_LIBDECNUMBER=y

View File

View File

@@ -24,6 +24,7 @@
#include "hw/hw.h"
#include "hw/boards.h"
#include "sysemu/block-backend.h"
#include "sysemu/blockdev.h"
#include "qemu/config-file.h"
#include "sysemu/sysemu.h"
@@ -76,6 +77,6 @@ void drive_hot_add(Monitor *mon, const QDict *qdict)
err:
if (dinfo) {
drive_del(dinfo);
blk_unref(blk_by_legacy_dinfo(dinfo));
}
}

View File

@@ -20,6 +20,7 @@
#include "config.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "sysemu/device_tree.h"
#include "sysemu/sysemu.h"
#include "hw/loader.h"
@@ -59,13 +60,13 @@ void *create_device_tree(int *sizep)
}
ret = fdt_open_into(fdt, fdt, *sizep);
if (ret) {
fprintf(stderr, "Unable to copy device tree in memory\n");
error_report("Unable to copy device tree in memory");
exit(1);
}
return fdt;
fail:
fprintf(stderr, "%s Couldn't create dt: %s\n", __func__, fdt_strerror(ret));
error_report("%s Couldn't create dt: %s", __func__, fdt_strerror(ret));
exit(1);
}
@@ -79,8 +80,8 @@ void *load_device_tree(const char *filename_path, int *sizep)
*sizep = 0;
dt_size = get_image_size(filename_path);
if (dt_size < 0) {
printf("Unable to get size of device tree file '%s'\n",
filename_path);
error_report("Unable to get size of device tree file '%s'",
filename_path);
goto fail;
}
@@ -92,21 +93,21 @@ void *load_device_tree(const char *filename_path, int *sizep)
dt_file_load_size = load_image(filename_path, fdt);
if (dt_file_load_size < 0) {
printf("Unable to open device tree file '%s'\n",
filename_path);
error_report("Unable to open device tree file '%s'",
filename_path);
goto fail;
}
ret = fdt_open_into(fdt, fdt, dt_size);
if (ret) {
printf("Unable to copy device tree in memory\n");
error_report("Unable to copy device tree in memory");
goto fail;
}
/* Check sanity of device tree */
if (fdt_check_header(fdt)) {
printf ("Device tree file loaded into memory is invalid: %s\n",
filename_path);
error_report("Device tree file loaded into memory is invalid: %s",
filename_path);
goto fail;
}
*sizep = dt_size;
@@ -123,8 +124,8 @@ static int findnode_nofail(void *fdt, const char *node_path)
offset = fdt_path_offset(fdt, node_path);
if (offset < 0) {
fprintf(stderr, "%s Couldn't find node %s: %s\n", __func__, node_path,
fdt_strerror(offset));
error_report("%s Couldn't find node %s: %s", __func__, node_path,
fdt_strerror(offset));
exit(1);
}
@@ -138,8 +139,8 @@ int qemu_fdt_setprop(void *fdt, const char *node_path,
r = fdt_setprop(fdt, findnode_nofail(fdt, node_path), property, val, size);
if (r < 0) {
fprintf(stderr, "%s: Couldn't set %s/%s: %s\n", __func__, node_path,
property, fdt_strerror(r));
error_report("%s: Couldn't set %s/%s: %s", __func__, node_path,
property, fdt_strerror(r));
exit(1);
}
@@ -153,8 +154,8 @@ int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
r = fdt_setprop_cell(fdt, findnode_nofail(fdt, node_path), property, val);
if (r < 0) {
fprintf(stderr, "%s: Couldn't set %s/%s = %#08x: %s\n", __func__,
node_path, property, val, fdt_strerror(r));
error_report("%s: Couldn't set %s/%s = %#08x: %s", __func__,
node_path, property, val, fdt_strerror(r));
exit(1);
}
@@ -175,8 +176,8 @@ int qemu_fdt_setprop_string(void *fdt, const char *node_path,
r = fdt_setprop_string(fdt, findnode_nofail(fdt, node_path), property, string);
if (r < 0) {
fprintf(stderr, "%s: Couldn't set %s/%s = %s: %s\n", __func__,
node_path, property, string, fdt_strerror(r));
error_report("%s: Couldn't set %s/%s = %s: %s", __func__,
node_path, property, string, fdt_strerror(r));
exit(1);
}
@@ -193,8 +194,8 @@ const void *qemu_fdt_getprop(void *fdt, const char *node_path,
}
r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
if (!r) {
fprintf(stderr, "%s: Couldn't get %s/%s: %s\n", __func__,
node_path, property, fdt_strerror(*lenp));
error_report("%s: Couldn't get %s/%s: %s", __func__,
node_path, property, fdt_strerror(*lenp));
exit(1);
}
return r;
@@ -206,8 +207,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
int len;
const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len);
if (len != 4) {
fprintf(stderr, "%s: %s/%s not 4 bytes long (not a cell?)\n",
__func__, node_path, property);
error_report("%s: %s/%s not 4 bytes long (not a cell?)",
__func__, node_path, property);
exit(1);
}
return be32_to_cpu(*p);
@@ -219,8 +220,8 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path)
r = fdt_get_phandle(fdt, findnode_nofail(fdt, path));
if (r == 0) {
fprintf(stderr, "%s: Couldn't get phandle for %s: %s\n", __func__,
path, fdt_strerror(r));
error_report("%s: Couldn't get phandle for %s: %s", __func__,
path, fdt_strerror(r));
exit(1);
}
@@ -265,8 +266,8 @@ int qemu_fdt_nop_node(void *fdt, const char *node_path)
r = fdt_nop_node(fdt, findnode_nofail(fdt, node_path));
if (r < 0) {
fprintf(stderr, "%s: Couldn't nop node %s: %s\n", __func__, node_path,
fdt_strerror(r));
error_report("%s: Couldn't nop node %s: %s", __func__, node_path,
fdt_strerror(r));
exit(1);
}
@@ -294,8 +295,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
retval = fdt_add_subnode(fdt, parent, basename);
if (retval < 0) {
fprintf(stderr, "FDT: Failed to create subnode %s: %s\n", name,
fdt_strerror(retval));
error_report("FDT: Failed to create subnode %s: %s", name,
fdt_strerror(retval));
exit(1);
}

View File

@@ -39,7 +39,7 @@ public:
~QEMUDisassembler() { }
protected:
void ProcessOutput(Instruction *instr) {
virtual void ProcessOutput(const Instruction *instr) {
fprintf(stream_, "%08" PRIx32 " %s",
instr->InstructionBits(), GetOutput());
}

View File

@@ -2,7 +2,7 @@
The code in this directory is a subset of libvixl:
https://github.com/armvixl/vixl
(specifically, it is the set of files needed for disassembly only,
taken from libvixl 1.4).
taken from libvixl 1.6).
Bugfixes should preferably be sent upstream initially.
The disassembler does not currently support the entire A64 instruction

View File

@@ -28,9 +28,11 @@
#define VIXL_A64_ASSEMBLER_A64_H_
#include <list>
#include <stack>
#include "globals.h"
#include "utils.h"
#include "code-buffer.h"
#include "a64/instructions-a64.h"
namespace vixl {
@@ -167,6 +169,11 @@ class CPURegister {
return type_ == kFPRegister;
}
bool IsW() const { return IsValidRegister() && Is32Bits(); }
bool IsX() const { return IsValidRegister() && Is64Bits(); }
bool IsS() const { return IsValidFPRegister() && Is32Bits(); }
bool IsD() const { return IsValidFPRegister() && Is64Bits(); }
const Register& W() const;
const Register& X() const;
const FPRegister& S() const;
@@ -190,12 +197,12 @@ class CPURegister {
class Register : public CPURegister {
public:
explicit Register() : CPURegister() {}
Register() : CPURegister() {}
inline explicit Register(const CPURegister& other)
: CPURegister(other.code(), other.size(), other.type()) {
VIXL_ASSERT(IsValidRegister());
}
explicit Register(unsigned code, unsigned size)
Register(unsigned code, unsigned size)
: CPURegister(code, size, kRegister) {}
bool IsValid() const {
@@ -535,7 +542,7 @@ class Operand {
class MemOperand {
public:
explicit MemOperand(Register base,
ptrdiff_t offset = 0,
int64_t offset = 0,
AddrMode addrmode = Offset);
explicit MemOperand(Register base,
Register regoffset,
@@ -551,7 +558,7 @@ class MemOperand {
const Register& base() const { return base_; }
const Register& regoffset() const { return regoffset_; }
ptrdiff_t offset() const { return offset_; }
int64_t offset() const { return offset_; }
AddrMode addrmode() const { return addrmode_; }
Shift shift() const { return shift_; }
Extend extend() const { return extend_; }
@@ -564,7 +571,7 @@ class MemOperand {
private:
Register base_;
Register regoffset_;
ptrdiff_t offset_;
int64_t offset_;
AddrMode addrmode_;
Shift shift_;
Extend extend_;
@@ -574,71 +581,233 @@ class MemOperand {
class Label {
public:
Label() : is_bound_(false), link_(NULL), target_(NULL) {}
Label() : location_(kLocationUnbound) {}
~Label() {
// If the label has been linked to, it needs to be bound to a target.
VIXL_ASSERT(!IsLinked() || IsBound());
}
inline Instruction* link() const { return link_; }
inline Instruction* target() const { return target_; }
inline bool IsBound() const { return is_bound_; }
inline bool IsLinked() const { return link_ != NULL; }
inline void set_link(Instruction* new_link) { link_ = new_link; }
static const int kEndOfChain = 0;
inline bool IsBound() const { return location_ >= 0; }
inline bool IsLinked() const { return !links_.empty(); }
private:
// Indicates if the label has been bound, ie its location is fixed.
bool is_bound_;
// Branches instructions branching to this label form a chained list, with
// their offset indicating where the next instruction is located.
// link_ points to the latest branch instruction generated branching to this
// branch.
// If link_ is not NULL, the label has been linked to.
Instruction* link_;
// The list of linked instructions is stored in a stack-like structure. We
// don't use std::stack directly because it's slow for the common case where
// only one or two instructions refer to a label, and labels themselves are
// short-lived. This class behaves like std::stack, but the first few links
// are preallocated (configured by kPreallocatedLinks).
//
// If more than N links are required, this falls back to std::stack.
class LinksStack {
public:
LinksStack() : size_(0), links_extended_(NULL) {}
~LinksStack() {
delete links_extended_;
}
size_t size() const {
return size_;
}
bool empty() const {
return size_ == 0;
}
void push(ptrdiff_t value) {
if (size_ < kPreallocatedLinks) {
links_[size_] = value;
} else {
if (links_extended_ == NULL) {
links_extended_ = new std::stack<ptrdiff_t>();
}
VIXL_ASSERT(size_ == (links_extended_->size() + kPreallocatedLinks));
links_extended_->push(value);
}
size_++;
}
ptrdiff_t top() const {
return (size_ <= kPreallocatedLinks) ? links_[size_ - 1]
: links_extended_->top();
}
void pop() {
size_--;
if (size_ >= kPreallocatedLinks) {
links_extended_->pop();
VIXL_ASSERT(size_ == (links_extended_->size() + kPreallocatedLinks));
}
}
private:
static const size_t kPreallocatedLinks = 4;
size_t size_;
ptrdiff_t links_[kPreallocatedLinks];
std::stack<ptrdiff_t> * links_extended_;
};
inline ptrdiff_t location() const { return location_; }
inline void Bind(ptrdiff_t location) {
// Labels can only be bound once.
VIXL_ASSERT(!IsBound());
location_ = location;
}
inline void AddLink(ptrdiff_t instruction) {
// If a label is bound, the assembler already has the information it needs
// to write the instruction, so there is no need to add it to links_.
VIXL_ASSERT(!IsBound());
links_.push(instruction);
}
inline ptrdiff_t GetAndRemoveNextLink() {
VIXL_ASSERT(IsLinked());
ptrdiff_t link = links_.top();
links_.pop();
return link;
}
// The offsets of the instructions that have linked to this label.
LinksStack links_;
// The label location.
Instruction* target_;
ptrdiff_t location_;
static const ptrdiff_t kLocationUnbound = -1;
// It is not safe to copy labels, so disable the copy constructor by declaring
// it private (without an implementation).
Label(const Label&);
// The Assembler class is responsible for binding and linking labels, since
// the stored offsets need to be consistent with the Assembler's buffer.
friend class Assembler;
};
// TODO: Obtain better values for these, based on real-world data.
const int kLiteralPoolCheckInterval = 4 * KBytes;
const int kRecommendedLiteralPoolRange = 2 * kLiteralPoolCheckInterval;
// Control whether a branch over the literal pool should also be emitted. This
// is needed if the literal pool has to be emitted in the middle of the JITted
// code.
enum LiteralPoolEmitOption {
JumpRequired,
NoJumpRequired
};
// Literal pool entry.
class Literal {
// A literal is a 32-bit or 64-bit piece of data stored in the instruction
// stream and loaded through a pc relative load. The same literal can be
// referred to by multiple instructions but a literal can only reside at one
// place in memory. A literal can be used by a load before or after being
// placed in memory.
//
// Internally an offset of 0 is associated with a literal which has been
// neither used nor placed. Then two possibilities arise:
// 1) the label is placed, the offset (stored as offset + 1) is used to
// resolve any subsequent load using the label.
// 2) the label is not placed and offset is the offset of the last load using
// the literal (stored as -offset -1). If multiple loads refer to this
// literal then the last load holds the offset of the preceding load and
// all loads form a chain. Once the offset is placed all the loads in the
// chain are resolved and future loads fall back to possibility 1.
class RawLiteral {
public:
Literal(Instruction* pc, uint64_t imm, unsigned size)
: pc_(pc), value_(imm), size_(size) {}
RawLiteral() : size_(0), offset_(0), raw_value_(0) {}
private:
Instruction* pc_;
int64_t value_;
unsigned size_;
size_t size() {
VIXL_STATIC_ASSERT(kDRegSizeInBytes == kXRegSizeInBytes);
VIXL_STATIC_ASSERT(kSRegSizeInBytes == kWRegSizeInBytes);
VIXL_ASSERT((size_ == kXRegSizeInBytes) || (size_ == kWRegSizeInBytes));
return size_;
}
uint64_t raw_value64() {
VIXL_ASSERT(size_ == kXRegSizeInBytes);
return raw_value_;
}
uint32_t raw_value32() {
VIXL_ASSERT(size_ == kWRegSizeInBytes);
VIXL_ASSERT(is_uint32(raw_value_) || is_int32(raw_value_));
return static_cast<uint32_t>(raw_value_);
}
bool IsUsed() { return offset_ < 0; }
bool IsPlaced() { return offset_ > 0; }
protected:
ptrdiff_t offset() {
VIXL_ASSERT(IsPlaced());
return offset_ - 1;
}
void set_offset(ptrdiff_t offset) {
VIXL_ASSERT(offset >= 0);
VIXL_ASSERT(IsWordAligned(offset));
VIXL_ASSERT(!IsPlaced());
offset_ = offset + 1;
}
ptrdiff_t last_use() {
VIXL_ASSERT(IsUsed());
return -offset_ - 1;
}
void set_last_use(ptrdiff_t offset) {
VIXL_ASSERT(offset >= 0);
VIXL_ASSERT(IsWordAligned(offset));
VIXL_ASSERT(!IsPlaced());
offset_ = -offset - 1;
}
size_t size_;
ptrdiff_t offset_;
uint64_t raw_value_;
friend class Assembler;
};
template <typename T>
class Literal : public RawLiteral {
public:
explicit Literal(T value) {
size_ = sizeof(value);
memcpy(&raw_value_, &value, sizeof(value));
}
};
// Control whether or not position-independent code should be emitted.
enum PositionIndependentCodeOption {
// All code generated will be position-independent; all branches and
// references to labels generated with the Label class will use PC-relative
// addressing.
PositionIndependentCode,
// Allow VIXL to generate code that refers to absolute addresses. With this
// option, it will not be possible to copy the code buffer and run it from a
// different address; code must be generated in its final location.
PositionDependentCode,
// Allow VIXL to assume that the bottom 12 bits of the address will be
// constant, but that the top 48 bits may change. This allows `adrp` to
// function in systems which copy code between pages, but otherwise maintain
// 4KB page alignment.
PageOffsetDependentCode
};
// Control how scaled- and unscaled-offset loads and stores are generated.
enum LoadStoreScalingOption {
// Prefer scaled-immediate-offset instructions, but emit unscaled-offset,
// register-offset, pre-index or post-index instructions if necessary.
PreferScaledOffset,
// Prefer unscaled-immediate-offset instructions, but emit scaled-offset,
// register-offset, pre-index or post-index instructions if necessary.
PreferUnscaledOffset,
// Require scaled-immediate-offset instructions.
RequireScaledOffset,
// Require unscaled-immediate-offset instructions.
RequireUnscaledOffset
};
// Assembler.
class Assembler {
public:
Assembler(byte* buffer, unsigned buffer_size);
Assembler(size_t capacity,
PositionIndependentCodeOption pic = PositionIndependentCode);
Assembler(byte* buffer, size_t capacity,
PositionIndependentCodeOption pic = PositionIndependentCode);
// The destructor asserts that one of the following is true:
// * The Assembler object has not been used.
@@ -650,9 +819,6 @@ class Assembler {
// Start generating code from the beginning of the buffer, discarding any code
// and data that has already been emitted into the buffer.
//
// In order to avoid any accidental transfer of state, Reset ASSERTs that the
// constant pool is not blocked.
void Reset();
// Finalize a code buffer of generated instructions. This function must be
@@ -662,12 +828,49 @@ class Assembler {
// Label.
// Bind a label to the current PC.
void bind(Label* label);
int UpdateAndGetByteOffsetTo(Label* label);
inline int UpdateAndGetInstructionOffsetTo(Label* label) {
VIXL_ASSERT(Label::kEndOfChain == 0);
return UpdateAndGetByteOffsetTo(label) >> kInstructionSizeLog2;
// Bind a label to a specified offset from the start of the buffer.
void BindToOffset(Label* label, ptrdiff_t offset);
// Place a literal at the current PC.
void place(RawLiteral* literal);
ptrdiff_t CursorOffset() const {
return buffer_->CursorOffset();
}
ptrdiff_t BufferEndOffset() const {
return static_cast<ptrdiff_t>(buffer_->capacity());
}
// Return the address of an offset in the buffer.
template <typename T>
inline T GetOffsetAddress(ptrdiff_t offset) {
VIXL_STATIC_ASSERT(sizeof(T) >= sizeof(uintptr_t));
return buffer_->GetOffsetAddress<T>(offset);
}
// Return the address of a bound label.
template <typename T>
inline T GetLabelAddress(const Label * label) {
VIXL_ASSERT(label->IsBound());
VIXL_STATIC_ASSERT(sizeof(T) >= sizeof(uintptr_t));
return GetOffsetAddress<T>(label->location());
}
// Return the address of the cursor.
template <typename T>
inline T GetCursorAddress() {
VIXL_STATIC_ASSERT(sizeof(T) >= sizeof(uintptr_t));
return GetOffsetAddress<T>(CursorOffset());
}
// Return the address of the start of the buffer.
template <typename T>
inline T GetStartAddress() {
VIXL_STATIC_ASSERT(sizeof(T) >= sizeof(uintptr_t));
return GetOffsetAddress<T>(0);
}
// Instruction set functions.
@@ -733,6 +936,12 @@ class Assembler {
// Calculate the address of a PC offset.
void adr(const Register& rd, int imm21);
// Calculate the page address of a label.
void adrp(const Register& rd, Label* label);
// Calculate the page address of a PC offset.
void adrp(const Register& rd, int imm21);
// Data Processing instructions.
// Add.
void add(const Register& rd,
@@ -1112,31 +1321,76 @@ class Assembler {
// Memory instructions.
// Load integer or FP register.
void ldr(const CPURegister& rt, const MemOperand& src);
void ldr(const CPURegister& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Store integer or FP register.
void str(const CPURegister& rt, const MemOperand& dst);
void str(const CPURegister& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferScaledOffset);
// Load word with sign extension.
void ldrsw(const Register& rt, const MemOperand& src);
void ldrsw(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Load byte.
void ldrb(const Register& rt, const MemOperand& src);
void ldrb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Store byte.
void strb(const Register& rt, const MemOperand& dst);
void strb(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferScaledOffset);
// Load byte with sign extension.
void ldrsb(const Register& rt, const MemOperand& src);
void ldrsb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Load half-word.
void ldrh(const Register& rt, const MemOperand& src);
void ldrh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Store half-word.
void strh(const Register& rt, const MemOperand& dst);
void strh(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferScaledOffset);
// Load half-word with sign extension.
void ldrsh(const Register& rt, const MemOperand& src);
void ldrsh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Load integer or FP register (with unscaled offset).
void ldur(const CPURegister& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Store integer or FP register (with unscaled offset).
void stur(const CPURegister& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load word with sign extension.
void ldursw(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load byte (with unscaled offset).
void ldurb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Store byte (with unscaled offset).
void sturb(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load byte with sign extension (and unscaled offset).
void ldursb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load half-word (with unscaled offset).
void ldurh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Store half-word (with unscaled offset).
void sturh(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load half-word with sign extension (and unscaled offset).
void ldursh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load integer or FP register pair.
void ldp(const CPURegister& rt, const CPURegister& rt2,
@@ -1157,14 +1411,90 @@ class Assembler {
void stnp(const CPURegister& rt, const CPURegister& rt2,
const MemOperand& dst);
// Load literal to register.
void ldr(const Register& rt, uint64_t imm);
// Load integer or FP register from literal pool.
void ldr(const CPURegister& rt, RawLiteral* literal);
// Load double precision floating point literal to FP register.
void ldr(const FPRegister& ft, double imm);
// Load word with sign extension from literal pool.
void ldrsw(const Register& rt, RawLiteral* literal);
// Load integer or FP register from pc + imm19 << 2.
void ldr(const CPURegister& rt, int imm19);
// Load word with sign extension from pc + imm19 << 2.
void ldrsw(const Register& rt, int imm19);
// Store exclusive byte.
void stxrb(const Register& rs, const Register& rt, const MemOperand& dst);
// Store exclusive half-word.
void stxrh(const Register& rs, const Register& rt, const MemOperand& dst);
// Store exclusive register.
void stxr(const Register& rs, const Register& rt, const MemOperand& dst);
// Load exclusive byte.
void ldxrb(const Register& rt, const MemOperand& src);
// Load exclusive half-word.
void ldxrh(const Register& rt, const MemOperand& src);
// Load exclusive register.
void ldxr(const Register& rt, const MemOperand& src);
// Store exclusive register pair.
void stxp(const Register& rs,
const Register& rt,
const Register& rt2,
const MemOperand& dst);
// Load exclusive register pair.
void ldxp(const Register& rt, const Register& rt2, const MemOperand& src);
// Store-release exclusive byte.
void stlxrb(const Register& rs, const Register& rt, const MemOperand& dst);
// Store-release exclusive half-word.
void stlxrh(const Register& rs, const Register& rt, const MemOperand& dst);
// Store-release exclusive register.
void stlxr(const Register& rs, const Register& rt, const MemOperand& dst);
// Load-acquire exclusive byte.
void ldaxrb(const Register& rt, const MemOperand& src);
// Load-acquire exclusive half-word.
void ldaxrh(const Register& rt, const MemOperand& src);
// Load-acquire exclusive register.
void ldaxr(const Register& rt, const MemOperand& src);
// Store-release exclusive register pair.
void stlxp(const Register& rs,
const Register& rt,
const Register& rt2,
const MemOperand& dst);
// Load-acquire exclusive register pair.
void ldaxp(const Register& rt, const Register& rt2, const MemOperand& src);
// Store-release byte.
void stlrb(const Register& rt, const MemOperand& dst);
// Store-release half-word.
void stlrh(const Register& rt, const MemOperand& dst);
// Store-release register.
void stlr(const Register& rt, const MemOperand& dst);
// Load-acquire byte.
void ldarb(const Register& rt, const MemOperand& src);
// Load-acquire half-word.
void ldarh(const Register& rt, const MemOperand& src);
// Load-acquire register.
void ldar(const Register& rt, const MemOperand& src);
// Load single precision floating point literal to FP register.
void ldr(const FPRegister& ft, float imm);
// Move instructions. The default shift of -1 indicates that the move
// instruction will calculate an appropriate 16-bit immediate and left shift
@@ -1214,6 +1544,9 @@ class Assembler {
// System hint.
void hint(SystemHint code);
// Clear exclusive monitor.
void clrex(int imm4 = 0xf);
// Data memory barrier.
void dmb(BarrierDomain domain, BarrierType type);
@@ -1375,25 +1708,26 @@ class Assembler {
inline void dci(Instr raw_inst) { Emit(raw_inst); }
// Emit 32 bits of data into the instruction stream.
inline void dc32(uint32_t data) { EmitData(&data, sizeof(data)); }
inline void dc32(uint32_t data) {
VIXL_ASSERT(buffer_monitor_ > 0);
buffer_->Emit32(data);
}
// Emit 64 bits of data into the instruction stream.
inline void dc64(uint64_t data) { EmitData(&data, sizeof(data)); }
inline void dc64(uint64_t data) {
VIXL_ASSERT(buffer_monitor_ > 0);
buffer_->Emit64(data);
}
// Copy a string into the instruction stream, including the terminating NULL
// character. The instruction pointer (pc_) is then aligned correctly for
// character. The instruction pointer is then aligned correctly for
// subsequent instructions.
void EmitStringData(const char * string) {
void EmitString(const char * string) {
VIXL_ASSERT(string != NULL);
VIXL_ASSERT(buffer_monitor_ > 0);
size_t len = strlen(string) + 1;
EmitData(string, len);
// Pad with NULL characters until pc_ is aligned.
const char pad[] = {'\0', '\0', '\0', '\0'};
VIXL_STATIC_ASSERT(sizeof(pad) == kInstructionSize);
Instruction* next_pc = AlignUp(pc_, kInstructionSize);
EmitData(&pad, next_pc - pc_);
buffer_->EmitString(string);
buffer_->Align();
}
// Code generation helpers.
@@ -1429,6 +1763,11 @@ class Assembler {
return rt2.code() << Rt2_offset;
}
static Instr Rs(CPURegister rs) {
VIXL_ASSERT(rs.code() != kSPRegInternalCode);
return rs.code() << Rs_offset;
}
// These encoding functions allow the stack pointer to be encoded, and
// disallow the zero register.
static Instr RdSP(Register rd) {
@@ -1619,6 +1958,11 @@ class Assembler {
return imm7 << ImmHint_offset;
}
static Instr CRm(int imm4) {
VIXL_ASSERT(is_uint4(imm4));
return imm4 << CRm_offset;
}
static Instr ImmBarrierDomain(int imm2) {
VIXL_ASSERT(is_uint2(imm2));
return imm2 << ImmBarrierDomain_offset;
@@ -1659,55 +2003,73 @@ class Assembler {
return scale << FPScale_offset;
}
// Size of the code generated in bytes
uint64_t SizeOfCodeGenerated() const {
VIXL_ASSERT((pc_ >= buffer_) && (pc_ < (buffer_ + buffer_size_)));
return pc_ - buffer_;
}
// Size of the code generated since label to the current position.
uint64_t SizeOfCodeGeneratedSince(Label* label) const {
size_t SizeOfCodeGeneratedSince(Label* label) const {
VIXL_ASSERT(label->IsBound());
VIXL_ASSERT((pc_ >= label->target()) && (pc_ < (buffer_ + buffer_size_)));
return pc_ - label->target();
return buffer_->OffsetFrom(label->location());
}
size_t BufferCapacity() const { return buffer_->capacity(); }
inline void BlockLiteralPool() {
literal_pool_monitor_++;
}
size_t RemainingBufferSpace() const { return buffer_->RemainingBytes(); }
inline void ReleaseLiteralPool() {
if (--literal_pool_monitor_ == 0) {
// Has the literal pool been blocked for too long?
VIXL_ASSERT(literals_.empty() ||
(pc_ < (literals_.back()->pc_ + kMaxLoadLiteralRange)));
void EnsureSpaceFor(size_t amount) {
if (buffer_->RemainingBytes() < amount) {
size_t capacity = buffer_->capacity();
size_t size = buffer_->CursorOffset();
do {
// TODO(all): refine.
capacity *= 2;
} while ((capacity - size) < amount);
buffer_->Grow(capacity);
}
}
inline bool IsLiteralPoolBlocked() {
return literal_pool_monitor_ != 0;
#ifdef DEBUG
void AcquireBuffer() {
VIXL_ASSERT(buffer_monitor_ >= 0);
buffer_monitor_++;
}
void CheckLiteralPool(LiteralPoolEmitOption option = JumpRequired);
void EmitLiteralPool(LiteralPoolEmitOption option = NoJumpRequired);
size_t LiteralPoolSize();
void ReleaseBuffer() {
buffer_monitor_--;
VIXL_ASSERT(buffer_monitor_ >= 0);
}
#endif
protected:
inline const Register& AppropriateZeroRegFor(const CPURegister& reg) const {
inline PositionIndependentCodeOption pic() {
return pic_;
}
inline bool AllowPageOffsetDependentCode() {
return (pic() == PageOffsetDependentCode) ||
(pic() == PositionDependentCode);
}
static inline const Register& AppropriateZeroRegFor(const CPURegister& reg) {
return reg.Is64Bits() ? xzr : wzr;
}
protected:
void LoadStore(const CPURegister& rt,
const MemOperand& addr,
LoadStoreOp op);
static bool IsImmLSUnscaled(ptrdiff_t offset);
static bool IsImmLSScaled(ptrdiff_t offset, LSDataSize size);
LoadStoreOp op,
LoadStoreScalingOption option = PreferScaledOffset);
static bool IsImmLSUnscaled(int64_t offset);
static bool IsImmLSScaled(int64_t offset, LSDataSize size);
void LoadStorePair(const CPURegister& rt,
const CPURegister& rt2,
const MemOperand& addr,
LoadStorePairOp op);
static bool IsImmLSPair(int64_t offset, LSDataSize size);
// TODO(all): The third parameter should be passed by reference but gcc 4.8.2
// reports a bogus uninitialised warning then.
void Logical(const Register& rd,
const Register& rn,
const Operand& operand,
const Operand operand,
LogicalOp op);
void LogicalImmediate(const Register& rd,
const Register& rn,
@@ -1717,9 +2079,9 @@ class Assembler {
LogicalOp op);
static bool IsImmLogical(uint64_t value,
unsigned width,
unsigned* n,
unsigned* imm_s,
unsigned* imm_r);
unsigned* n = NULL,
unsigned* imm_s = NULL,
unsigned* imm_r = NULL);
void ConditionalCompare(const Register& rn,
const Operand& operand,
@@ -1768,6 +2130,7 @@ class Assembler {
const CPURegister& rt, const CPURegister& rt2);
static LoadStorePairNonTemporalOp StorePairNonTemporalOpFor(
const CPURegister& rt, const CPURegister& rt2);
static LoadLiteralOp LoadLiteralOpFor(const CPURegister& rt);
private:
@@ -1786,10 +2149,6 @@ class Assembler {
const Operand& operand,
FlagsUpdate S,
Instr op);
void LoadStorePair(const CPURegister& rt,
const CPURegister& rt2,
const MemOperand& addr,
LoadStorePairOp op);
void LoadStorePairNonTemporal(const CPURegister& rt,
const CPURegister& rt2,
const MemOperand& addr,
@@ -1821,75 +2180,110 @@ class Assembler {
const FPRegister& fa,
FPDataProcessing3SourceOp op);
void RecordLiteral(int64_t imm, unsigned size);
// Link the current (not-yet-emitted) instruction to the specified label, then
// return an offset to be encoded in the instruction. If the label is not yet
// bound, an offset of 0 is returned.
ptrdiff_t LinkAndGetByteOffsetTo(Label * label);
ptrdiff_t LinkAndGetInstructionOffsetTo(Label * label);
ptrdiff_t LinkAndGetPageOffsetTo(Label * label);
// Emit the instruction at pc_.
// A common implementation for the LinkAndGet<Type>OffsetTo helpers.
template <int element_shift>
ptrdiff_t LinkAndGetOffsetTo(Label* label);
// Literal load offset are in words (32-bit).
ptrdiff_t LinkAndGetWordOffsetTo(RawLiteral* literal);
// Emit the instruction in buffer_.
void Emit(Instr instruction) {
VIXL_STATIC_ASSERT(sizeof(*pc_) == 1);
VIXL_STATIC_ASSERT(sizeof(instruction) == kInstructionSize);
VIXL_ASSERT((pc_ + sizeof(instruction)) <= (buffer_ + buffer_size_));
#ifdef DEBUG
finalized_ = false;
#endif
memcpy(pc_, &instruction, sizeof(instruction));
pc_ += sizeof(instruction);
CheckBufferSpace();
VIXL_ASSERT(buffer_monitor_ > 0);
buffer_->Emit32(instruction);
}
// Emit data inline in the instruction stream.
void EmitData(void const * data, unsigned size) {
VIXL_STATIC_ASSERT(sizeof(*pc_) == 1);
VIXL_ASSERT((pc_ + size) <= (buffer_ + buffer_size_));
// Buffer where the code is emitted.
CodeBuffer* buffer_;
PositionIndependentCodeOption pic_;
#ifdef DEBUG
finalized_ = false;
#endif
// TODO: Record this 'instruction' as data, so that it can be disassembled
// correctly.
memcpy(pc_, data, size);
pc_ += size;
CheckBufferSpace();
}
inline void CheckBufferSpace() {
VIXL_ASSERT(pc_ < (buffer_ + buffer_size_));
if (pc_ > next_literal_pool_check_) {
CheckLiteralPool();
}
}
// The buffer into which code and relocation info are generated.
Instruction* buffer_;
// Buffer size, in bytes.
unsigned buffer_size_;
Instruction* pc_;
std::list<Literal*> literals_;
Instruction* next_literal_pool_check_;
unsigned literal_pool_monitor_;
friend class BlockLiteralPoolScope;
#ifdef DEBUG
bool finalized_;
int64_t buffer_monitor_;
#endif
};
class BlockLiteralPoolScope {
// All Assembler emits MUST acquire/release the underlying code buffer. The
// helper scope below will do so and optionally ensure the buffer is big enough
// to receive the emit. It is possible to request the scope not to perform any
// checks (kNoCheck) if for example it is known in advance the buffer size is
// adequate or there is some other size checking mechanism in place.
class CodeBufferCheckScope {
public:
explicit BlockLiteralPoolScope(Assembler* assm) : assm_(assm) {
assm_->BlockLiteralPool();
// Tell whether or not the scope needs to ensure the associated CodeBuffer
// has enough space for the requested size.
enum CheckPolicy {
kNoCheck,
kCheck
};
// Tell whether or not the scope should assert the amount of code emitted
// within the scope is consistent with the requested amount.
enum AssertPolicy {
kNoAssert, // No assert required.
kExactSize, // The code emitted must be exactly size bytes.
kMaximumSize // The code emitted must be at most size bytes.
};
CodeBufferCheckScope(Assembler* assm,
size_t size,
CheckPolicy check_policy = kCheck,
AssertPolicy assert_policy = kMaximumSize)
: assm_(assm) {
if (check_policy == kCheck) assm->EnsureSpaceFor(size);
#ifdef DEBUG
assm->bind(&start_);
size_ = size;
assert_policy_ = assert_policy;
assm->AcquireBuffer();
#else
USE(assert_policy);
#endif
}
~BlockLiteralPoolScope() {
assm_->ReleaseLiteralPool();
// This is a shortcut for CodeBufferCheckScope(assm, 0, kNoCheck, kNoAssert).
explicit CodeBufferCheckScope(Assembler* assm) : assm_(assm) {
#ifdef DEBUG
size_ = 0;
assert_policy_ = kNoAssert;
assm->AcquireBuffer();
#endif
}
private:
~CodeBufferCheckScope() {
#ifdef DEBUG
assm_->ReleaseBuffer();
switch (assert_policy_) {
case kNoAssert: break;
case kExactSize:
VIXL_ASSERT(assm_->SizeOfCodeGeneratedSince(&start_) == size_);
break;
case kMaximumSize:
VIXL_ASSERT(assm_->SizeOfCodeGeneratedSince(&start_) <= size_);
break;
default:
VIXL_UNREACHABLE();
}
#endif
}
protected:
Assembler* assm_;
#ifdef DEBUG
Label start_;
size_t size_;
AssertPolicy assert_policy_;
#endif
};
} // namespace vixl
#endif // VIXL_A64_ASSEMBLER_A64_H_

View File

@@ -46,13 +46,13 @@ R(24) R(25) R(26) R(27) R(28) R(29) R(30) R(31)
#define INSTRUCTION_FIELDS_LIST(V_) \
/* Register fields */ \
V_(Rd, 4, 0, Bits) /* Destination register. */ \
V_(Rn, 9, 5, Bits) /* First source register. */ \
V_(Rm, 20, 16, Bits) /* Second source register. */ \
V_(Ra, 14, 10, Bits) /* Third source register. */ \
V_(Rt, 4, 0, Bits) /* Load dest / store source. */ \
V_(Rt2, 14, 10, Bits) /* Load second dest / */ \
/* store second source. */ \
V_(Rd, 4, 0, Bits) /* Destination register. */ \
V_(Rn, 9, 5, Bits) /* First source register. */ \
V_(Rm, 20, 16, Bits) /* Second source register. */ \
V_(Ra, 14, 10, Bits) /* Third source register. */ \
V_(Rt, 4, 0, Bits) /* Load/store register. */ \
V_(Rt2, 14, 10, Bits) /* Load/store second register. */ \
V_(Rs, 20, 16, Bits) /* Exclusive access status. */ \
V_(PrefetchMode, 4, 0, Bits) \
\
/* Common bits */ \
@@ -126,6 +126,13 @@ V_(SysOp1, 18, 16, Bits) \
V_(SysOp2, 7, 5, Bits) \
V_(CRn, 15, 12, Bits) \
V_(CRm, 11, 8, Bits) \
\
/* Load-/store-exclusive */ \
V_(LdStXLoad, 22, 22, Bits) \
V_(LdStXNotExclusive, 23, 23, Bits) \
V_(LdStXAcquireRelease, 15, 15, Bits) \
V_(LdStXSizeLog2, 31, 30, Bits) \
V_(LdStXPair, 21, 21, Bits) \
#define SYSTEM_REGISTER_FIELDS_LIST(V_, M_) \
@@ -585,6 +592,13 @@ enum MemBarrierOp {
ISB = MemBarrierFixed | 0x00000040
};
enum SystemExclusiveMonitorOp {
SystemExclusiveMonitorFixed = 0xD503305F,
SystemExclusiveMonitorFMask = 0xFFFFF0FF,
SystemExclusiveMonitorMask = 0xFFFFF0FF,
CLREX = SystemExclusiveMonitorFixed
};
// Any load or store.
enum LoadStoreAnyOp {
LoadStoreAnyFMask = 0x0a000000,
@@ -702,7 +716,7 @@ enum LoadStoreUnscaledOffsetOp {
// Load/store (post, pre, offset and unsigned.)
enum LoadStoreOp {
LoadStoreOpMask = 0xC4C00000,
LoadStoreOpMask = 0xC4C00000,
#define LOAD_STORE(A, B, C, D) \
A##B##_##C = D
LOAD_STORE_OP_LIST(LOAD_STORE),
@@ -756,6 +770,44 @@ enum LoadStoreRegisterOffset {
#undef LOAD_STORE_REGISTER_OFFSET
};
enum LoadStoreExclusive {
LoadStoreExclusiveFixed = 0x08000000,
LoadStoreExclusiveFMask = 0x3F000000,
LoadStoreExclusiveMask = 0xFFE08000,
STXRB_w = LoadStoreExclusiveFixed | 0x00000000,
STXRH_w = LoadStoreExclusiveFixed | 0x40000000,
STXR_w = LoadStoreExclusiveFixed | 0x80000000,
STXR_x = LoadStoreExclusiveFixed | 0xC0000000,
LDXRB_w = LoadStoreExclusiveFixed | 0x00400000,
LDXRH_w = LoadStoreExclusiveFixed | 0x40400000,
LDXR_w = LoadStoreExclusiveFixed | 0x80400000,
LDXR_x = LoadStoreExclusiveFixed | 0xC0400000,
STXP_w = LoadStoreExclusiveFixed | 0x80200000,
STXP_x = LoadStoreExclusiveFixed | 0xC0200000,
LDXP_w = LoadStoreExclusiveFixed | 0x80600000,
LDXP_x = LoadStoreExclusiveFixed | 0xC0600000,
STLXRB_w = LoadStoreExclusiveFixed | 0x00008000,
STLXRH_w = LoadStoreExclusiveFixed | 0x40008000,
STLXR_w = LoadStoreExclusiveFixed | 0x80008000,
STLXR_x = LoadStoreExclusiveFixed | 0xC0008000,
LDAXRB_w = LoadStoreExclusiveFixed | 0x00408000,
LDAXRH_w = LoadStoreExclusiveFixed | 0x40408000,
LDAXR_w = LoadStoreExclusiveFixed | 0x80408000,
LDAXR_x = LoadStoreExclusiveFixed | 0xC0408000,
STLXP_w = LoadStoreExclusiveFixed | 0x80208000,
STLXP_x = LoadStoreExclusiveFixed | 0xC0208000,
LDAXP_w = LoadStoreExclusiveFixed | 0x80608000,
LDAXP_x = LoadStoreExclusiveFixed | 0xC0608000,
STLRB_w = LoadStoreExclusiveFixed | 0x00808000,
STLRH_w = LoadStoreExclusiveFixed | 0x40808000,
STLR_w = LoadStoreExclusiveFixed | 0x80808000,
STLR_x = LoadStoreExclusiveFixed | 0xC0808000,
LDARB_w = LoadStoreExclusiveFixed | 0x00C08000,
LDARH_w = LoadStoreExclusiveFixed | 0x40C08000,
LDAR_w = LoadStoreExclusiveFixed | 0x80C08000,
LDAR_x = LoadStoreExclusiveFixed | 0xC0C08000
};
// Conditional compare.
enum ConditionalCompareOp {
ConditionalCompareMask = 0x60000000,

View File

@@ -28,6 +28,7 @@
#define VIXL_CPU_A64_H
#include "globals.h"
#include "instructions-a64.h"
namespace vixl {
@@ -42,6 +43,32 @@ class CPU {
// safely run.
static void EnsureIAndDCacheCoherency(void *address, size_t length);
// Handle tagged pointers.
template <typename T>
static T SetPointerTag(T pointer, uint64_t tag) {
VIXL_ASSERT(is_uintn(kAddressTagWidth, tag));
// Use C-style casts to get static_cast behaviour for integral types (T),
// and reinterpret_cast behaviour for other types.
uint64_t raw = (uint64_t)pointer;
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(raw));
raw = (raw & ~kAddressTagMask) | (tag << kAddressTagOffset);
return (T)raw;
}
template <typename T>
static uint64_t GetPointerTag(T pointer) {
// Use C-style casts to get static_cast behaviour for integral types (T),
// and reinterpret_cast behaviour for other types.
uint64_t raw = (uint64_t)pointer;
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(raw));
return (raw & kAddressTagMask) >> kAddressTagOffset;
}
private:
// Return the content of the cache type register.
static uint32_t GetCacheType();

View File

@@ -29,8 +29,8 @@
#include "a64/decoder-a64.h"
namespace vixl {
// Top-level instruction decode function.
void Decoder::Decode(Instruction *instr) {
void Decoder::DecodeInstruction(const Instruction *instr) {
if (instr->Bits(28, 27) == 0) {
VisitUnallocated(instr);
} else {
@@ -109,20 +109,17 @@ void Decoder::Decode(Instruction *instr) {
}
void Decoder::AppendVisitor(DecoderVisitor* new_visitor) {
visitors_.remove(new_visitor);
visitors_.push_front(new_visitor);
visitors_.push_back(new_visitor);
}
void Decoder::PrependVisitor(DecoderVisitor* new_visitor) {
visitors_.remove(new_visitor);
visitors_.push_back(new_visitor);
visitors_.push_front(new_visitor);
}
void Decoder::InsertVisitorBefore(DecoderVisitor* new_visitor,
DecoderVisitor* registered_visitor) {
visitors_.remove(new_visitor);
std::list<DecoderVisitor*>::iterator it;
for (it = visitors_.begin(); it != visitors_.end(); it++) {
if (*it == registered_visitor) {
@@ -139,7 +136,6 @@ void Decoder::InsertVisitorBefore(DecoderVisitor* new_visitor,
void Decoder::InsertVisitorAfter(DecoderVisitor* new_visitor,
DecoderVisitor* registered_visitor) {
visitors_.remove(new_visitor);
std::list<DecoderVisitor*>::iterator it;
for (it = visitors_.begin(); it != visitors_.end(); it++) {
if (*it == registered_visitor) {
@@ -160,7 +156,7 @@ void Decoder::RemoveVisitor(DecoderVisitor* visitor) {
}
void Decoder::DecodePCRelAddressing(Instruction* instr) {
void Decoder::DecodePCRelAddressing(const Instruction* instr) {
VIXL_ASSERT(instr->Bits(27, 24) == 0x0);
// We know bit 28 is set, as <b28:b27> = 0 is filtered out at the top level
// decode.
@@ -169,11 +165,11 @@ void Decoder::DecodePCRelAddressing(Instruction* instr) {
}
void Decoder::DecodeBranchSystemException(Instruction* instr) {
void Decoder::DecodeBranchSystemException(const Instruction* instr) {
VIXL_ASSERT((instr->Bits(27, 24) == 0x4) ||
(instr->Bits(27, 24) == 0x5) ||
(instr->Bits(27, 24) == 0x6) ||
(instr->Bits(27, 24) == 0x7) );
(instr->Bits(27, 24) == 0x5) ||
(instr->Bits(27, 24) == 0x6) ||
(instr->Bits(27, 24) == 0x7) );
switch (instr->Bits(31, 29)) {
case 0:
@@ -270,18 +266,17 @@ void Decoder::DecodeBranchSystemException(Instruction* instr) {
}
void Decoder::DecodeLoadStore(Instruction* instr) {
void Decoder::DecodeLoadStore(const Instruction* instr) {
VIXL_ASSERT((instr->Bits(27, 24) == 0x8) ||
(instr->Bits(27, 24) == 0x9) ||
(instr->Bits(27, 24) == 0xC) ||
(instr->Bits(27, 24) == 0xD) );
(instr->Bits(27, 24) == 0x9) ||
(instr->Bits(27, 24) == 0xC) ||
(instr->Bits(27, 24) == 0xD) );
if (instr->Bit(24) == 0) {
if (instr->Bit(28) == 0) {
if (instr->Bit(29) == 0) {
if (instr->Bit(26) == 0) {
// TODO: VisitLoadStoreExclusive.
VisitUnimplemented(instr);
VisitLoadStoreExclusive(instr);
} else {
DecodeAdvSIMDLoadStore(instr);
}
@@ -389,7 +384,7 @@ void Decoder::DecodeLoadStore(Instruction* instr) {
}
void Decoder::DecodeLogical(Instruction* instr) {
void Decoder::DecodeLogical(const Instruction* instr) {
VIXL_ASSERT(instr->Bits(27, 24) == 0x2);
if (instr->Mask(0x80400000) == 0x00400000) {
@@ -408,7 +403,7 @@ void Decoder::DecodeLogical(Instruction* instr) {
}
void Decoder::DecodeBitfieldExtract(Instruction* instr) {
void Decoder::DecodeBitfieldExtract(const Instruction* instr) {
VIXL_ASSERT(instr->Bits(27, 24) == 0x3);
if ((instr->Mask(0x80400000) == 0x80000000) ||
@@ -433,7 +428,7 @@ void Decoder::DecodeBitfieldExtract(Instruction* instr) {
}
void Decoder::DecodeAddSubImmediate(Instruction* instr) {
void Decoder::DecodeAddSubImmediate(const Instruction* instr) {
VIXL_ASSERT(instr->Bits(27, 24) == 0x1);
if (instr->Bit(23) == 1) {
VisitUnallocated(instr);
@@ -443,7 +438,7 @@ void Decoder::DecodeAddSubImmediate(Instruction* instr) {
}
void Decoder::DecodeDataProcessing(Instruction* instr) {
void Decoder::DecodeDataProcessing(const Instruction* instr) {
VIXL_ASSERT((instr->Bits(27, 24) == 0xA) ||
(instr->Bits(27, 24) == 0xB));
@@ -558,7 +553,7 @@ void Decoder::DecodeDataProcessing(Instruction* instr) {
}
void Decoder::DecodeFP(Instruction* instr) {
void Decoder::DecodeFP(const Instruction* instr) {
VIXL_ASSERT((instr->Bits(27, 24) == 0xE) ||
(instr->Bits(27, 24) == 0xF));
@@ -685,14 +680,14 @@ void Decoder::DecodeFP(Instruction* instr) {
}
void Decoder::DecodeAdvSIMDLoadStore(Instruction* instr) {
void Decoder::DecodeAdvSIMDLoadStore(const Instruction* instr) {
// TODO: Implement Advanced SIMD load/store instruction decode.
VIXL_ASSERT(instr->Bits(29, 25) == 0x6);
VisitUnimplemented(instr);
}
void Decoder::DecodeAdvSIMDDataProcessing(Instruction* instr) {
void Decoder::DecodeAdvSIMDDataProcessing(const Instruction* instr) {
// TODO: Implement Advanced SIMD data processing instruction decode.
VIXL_ASSERT(instr->Bits(27, 25) == 0x7);
VisitUnimplemented(instr);
@@ -700,7 +695,7 @@ void Decoder::DecodeAdvSIMDDataProcessing(Instruction* instr) {
#define DEFINE_VISITOR_CALLERS(A) \
void Decoder::Visit##A(Instruction *instr) { \
void Decoder::Visit##A(const Instruction *instr) { \
VIXL_ASSERT(instr->Mask(A##FMask) == A##Fixed); \
std::list<DecoderVisitor*>::iterator it; \
for (it = visitors_.begin(); it != visitors_.end(); it++) { \

View File

@@ -59,6 +59,7 @@
V(LoadStorePreIndex) \
V(LoadStoreRegisterOffset) \
V(LoadStoreUnsignedOffset) \
V(LoadStoreExclusive) \
V(LogicalShifted) \
V(AddSubShifted) \
V(AddSubExtended) \
@@ -87,112 +88,152 @@ namespace vixl {
// must provide implementations for all of these functions.
class DecoderVisitor {
public:
#define DECLARE(A) virtual void Visit##A(Instruction* instr) = 0;
VISITOR_LIST(DECLARE)
#undef DECLARE
enum VisitorConstness {
kConstVisitor,
kNonConstVisitor
};
explicit DecoderVisitor(VisitorConstness constness = kConstVisitor)
: constness_(constness) {}
virtual ~DecoderVisitor() {}
private:
// Visitors are registered in a list.
std::list<DecoderVisitor*> visitors_;
#define DECLARE(A) virtual void Visit##A(const Instruction* instr) = 0;
VISITOR_LIST(DECLARE)
#undef DECLARE
friend class Decoder;
bool IsConstVisitor() const { return constness_ == kConstVisitor; }
Instruction* MutableInstruction(const Instruction* instr) {
VIXL_ASSERT(!IsConstVisitor());
return const_cast<Instruction*>(instr);
}
private:
VisitorConstness constness_;
};
class Decoder: public DecoderVisitor {
class Decoder {
public:
Decoder() {}
// Top-level instruction decoder function. Decodes an instruction and calls
// the visitor functions registered with the Decoder class.
void Decode(Instruction *instr);
// Top-level wrappers around the actual decoding function.
void Decode(const Instruction* instr) {
std::list<DecoderVisitor*>::iterator it;
for (it = visitors_.begin(); it != visitors_.end(); it++) {
VIXL_ASSERT((*it)->IsConstVisitor());
}
DecodeInstruction(instr);
}
void Decode(Instruction* instr) {
DecodeInstruction(const_cast<const Instruction*>(instr));
}
// Register a new visitor class with the decoder.
// Decode() will call the corresponding visitor method from all registered
// visitor classes when decoding reaches the leaf node of the instruction
// decode tree.
// Visitors are called in the order.
// A visitor can only be registered once.
// Registering an already registered visitor will update its position.
// Visitors are called in order.
// A visitor can be registered multiple times.
//
// d.AppendVisitor(V1);
// d.AppendVisitor(V2);
// d.PrependVisitor(V2); // Move V2 at the start of the list.
// d.InsertVisitorBefore(V3, V2);
// d.AppendVisitor(V4);
// d.AppendVisitor(V4); // No effect.
// d.PrependVisitor(V2);
// d.AppendVisitor(V3);
//
// d.Decode(i);
//
// will call in order visitor methods in V3, V2, V1, V4.
// will call in order visitor methods in V2, V1, V2, V3.
void AppendVisitor(DecoderVisitor* visitor);
void PrependVisitor(DecoderVisitor* visitor);
// These helpers register `new_visitor` before or after the first instance of
// `registered_visiter` in the list.
// So if
// V1, V2, V1, V2
// are registered in this order in the decoder, calls to
// d.InsertVisitorAfter(V3, V1);
// d.InsertVisitorBefore(V4, V2);
// will yield the order
// V1, V3, V4, V2, V1, V2
//
// For more complex modifications of the order of registered visitors, one can
// directly access and modify the list of visitors via the `visitors()'
// accessor.
void InsertVisitorBefore(DecoderVisitor* new_visitor,
DecoderVisitor* registered_visitor);
void InsertVisitorAfter(DecoderVisitor* new_visitor,
DecoderVisitor* registered_visitor);
// Remove a previously registered visitor class from the list of visitors
// stored by the decoder.
// Remove all instances of a previously registered visitor class from the list
// of visitors stored by the decoder.
void RemoveVisitor(DecoderVisitor* visitor);
#define DECLARE(A) void Visit##A(Instruction* instr);
#define DECLARE(A) void Visit##A(const Instruction* instr);
VISITOR_LIST(DECLARE)
#undef DECLARE
std::list<DecoderVisitor*>* visitors() { return &visitors_; }
private:
// Decodes an instruction and calls the visitor functions registered with the
// Decoder class.
void DecodeInstruction(const Instruction* instr);
// Decode the PC relative addressing instruction, and call the corresponding
// visitors.
// On entry, instruction bits 27:24 = 0x0.
void DecodePCRelAddressing(Instruction* instr);
void DecodePCRelAddressing(const Instruction* instr);
// Decode the add/subtract immediate instruction, and call the correspoding
// visitors.
// On entry, instruction bits 27:24 = 0x1.
void DecodeAddSubImmediate(Instruction* instr);
void DecodeAddSubImmediate(const Instruction* instr);
// Decode the branch, system command, and exception generation parts of
// the instruction tree, and call the corresponding visitors.
// On entry, instruction bits 27:24 = {0x4, 0x5, 0x6, 0x7}.
void DecodeBranchSystemException(Instruction* instr);
void DecodeBranchSystemException(const Instruction* instr);
// Decode the load and store parts of the instruction tree, and call
// the corresponding visitors.
// On entry, instruction bits 27:24 = {0x8, 0x9, 0xC, 0xD}.
void DecodeLoadStore(Instruction* instr);
void DecodeLoadStore(const Instruction* instr);
// Decode the logical immediate and move wide immediate parts of the
// instruction tree, and call the corresponding visitors.
// On entry, instruction bits 27:24 = 0x2.
void DecodeLogical(Instruction* instr);
void DecodeLogical(const Instruction* instr);
// Decode the bitfield and extraction parts of the instruction tree,
// and call the corresponding visitors.
// On entry, instruction bits 27:24 = 0x3.
void DecodeBitfieldExtract(Instruction* instr);
void DecodeBitfieldExtract(const Instruction* instr);
// Decode the data processing parts of the instruction tree, and call the
// corresponding visitors.
// On entry, instruction bits 27:24 = {0x1, 0xA, 0xB}.
void DecodeDataProcessing(Instruction* instr);
void DecodeDataProcessing(const Instruction* instr);
// Decode the floating point parts of the instruction tree, and call the
// corresponding visitors.
// On entry, instruction bits 27:24 = {0xE, 0xF}.
void DecodeFP(Instruction* instr);
void DecodeFP(const Instruction* instr);
// Decode the Advanced SIMD (NEON) load/store part of the instruction tree,
// and call the corresponding visitors.
// On entry, instruction bits 29:25 = 0x6.
void DecodeAdvSIMDLoadStore(Instruction* instr);
void DecodeAdvSIMDLoadStore(const Instruction* instr);
// Decode the Advanced SIMD (NEON) data processing part of the instruction
// tree, and call the corresponding visitors.
// On entry, instruction bits 27:25 = 0x7.
void DecodeAdvSIMDDataProcessing(Instruction* instr);
void DecodeAdvSIMDDataProcessing(const Instruction* instr);
private:
// Visitors are registered in a list.
std::list<DecoderVisitor*> visitors_;
};
} // namespace vixl
#endif // VIXL_A64_DECODER_A64_H_

View File

@@ -24,6 +24,7 @@
// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include <cstdlib>
#include "a64/disasm-a64.h"
namespace vixl {
@@ -56,7 +57,7 @@ char* Disassembler::GetOutput() {
}
void Disassembler::VisitAddSubImmediate(Instruction* instr) {
void Disassembler::VisitAddSubImmediate(const Instruction* instr) {
bool rd_is_zr = RdIsZROrSP(instr);
bool stack_op = (rd_is_zr || RnIsZROrSP(instr)) &&
(instr->ImmAddSub() == 0) ? true : false;
@@ -101,7 +102,7 @@ void Disassembler::VisitAddSubImmediate(Instruction* instr) {
}
void Disassembler::VisitAddSubShifted(Instruction* instr) {
void Disassembler::VisitAddSubShifted(const Instruction* instr) {
bool rd_is_zr = RdIsZROrSP(instr);
bool rn_is_zr = RnIsZROrSP(instr);
const char *mnemonic = "";
@@ -148,7 +149,7 @@ void Disassembler::VisitAddSubShifted(Instruction* instr) {
}
void Disassembler::VisitAddSubExtended(Instruction* instr) {
void Disassembler::VisitAddSubExtended(const Instruction* instr) {
bool rd_is_zr = RdIsZROrSP(instr);
const char *mnemonic = "";
Extend mode = static_cast<Extend>(instr->ExtendMode());
@@ -186,7 +187,7 @@ void Disassembler::VisitAddSubExtended(Instruction* instr) {
}
void Disassembler::VisitAddSubWithCarry(Instruction* instr) {
void Disassembler::VisitAddSubWithCarry(const Instruction* instr) {
bool rn_is_zr = RnIsZROrSP(instr);
const char *mnemonic = "";
const char *form = "'Rd, 'Rn, 'Rm";
@@ -221,7 +222,7 @@ void Disassembler::VisitAddSubWithCarry(Instruction* instr) {
}
void Disassembler::VisitLogicalImmediate(Instruction* instr) {
void Disassembler::VisitLogicalImmediate(const Instruction* instr) {
bool rd_is_zr = RdIsZROrSP(instr);
bool rn_is_zr = RnIsZROrSP(instr);
const char *mnemonic = "";
@@ -293,7 +294,7 @@ bool Disassembler::IsMovzMovnImm(unsigned reg_size, uint64_t value) {
}
void Disassembler::VisitLogicalShifted(Instruction* instr) {
void Disassembler::VisitLogicalShifted(const Instruction* instr) {
bool rd_is_zr = RdIsZROrSP(instr);
bool rn_is_zr = RnIsZROrSP(instr);
const char *mnemonic = "";
@@ -344,7 +345,7 @@ void Disassembler::VisitLogicalShifted(Instruction* instr) {
}
void Disassembler::VisitConditionalCompareRegister(Instruction* instr) {
void Disassembler::VisitConditionalCompareRegister(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rn, 'Rm, 'INzcv, 'Cond";
@@ -359,7 +360,7 @@ void Disassembler::VisitConditionalCompareRegister(Instruction* instr) {
}
void Disassembler::VisitConditionalCompareImmediate(Instruction* instr) {
void Disassembler::VisitConditionalCompareImmediate(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rn, 'IP, 'INzcv, 'Cond";
@@ -374,7 +375,7 @@ void Disassembler::VisitConditionalCompareImmediate(Instruction* instr) {
}
void Disassembler::VisitConditionalSelect(Instruction* instr) {
void Disassembler::VisitConditionalSelect(const Instruction* instr) {
bool rnm_is_zr = (RnIsZROrSP(instr) && RmIsZROrSP(instr));
bool rn_is_rm = (instr->Rn() == instr->Rm());
const char *mnemonic = "";
@@ -427,7 +428,7 @@ void Disassembler::VisitConditionalSelect(Instruction* instr) {
}
void Disassembler::VisitBitfield(Instruction* instr) {
void Disassembler::VisitBitfield(const Instruction* instr) {
unsigned s = instr->ImmS();
unsigned r = instr->ImmR();
unsigned rd_size_minus_1 =
@@ -505,7 +506,7 @@ void Disassembler::VisitBitfield(Instruction* instr) {
}
void Disassembler::VisitExtract(Instruction* instr) {
void Disassembler::VisitExtract(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rd, 'Rn, 'Rm, 'IExtract";
@@ -526,16 +527,16 @@ void Disassembler::VisitExtract(Instruction* instr) {
}
void Disassembler::VisitPCRelAddressing(Instruction* instr) {
void Disassembler::VisitPCRelAddressing(const Instruction* instr) {
switch (instr->Mask(PCRelAddressingMask)) {
case ADR: Format(instr, "adr", "'Xd, 'AddrPCRelByte"); break;
// ADRP is not implemented.
case ADRP: Format(instr, "adrp", "'Xd, 'AddrPCRelPage"); break;
default: Format(instr, "unimplemented", "(PCRelAddressing)");
}
}
void Disassembler::VisitConditionalBranch(Instruction* instr) {
void Disassembler::VisitConditionalBranch(const Instruction* instr) {
switch (instr->Mask(ConditionalBranchMask)) {
case B_cond: Format(instr, "b.'CBrn", "'BImmCond"); break;
default: VIXL_UNREACHABLE();
@@ -543,7 +544,8 @@ void Disassembler::VisitConditionalBranch(Instruction* instr) {
}
void Disassembler::VisitUnconditionalBranchToRegister(Instruction* instr) {
void Disassembler::VisitUnconditionalBranchToRegister(
const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'Xn";
@@ -563,7 +565,7 @@ void Disassembler::VisitUnconditionalBranchToRegister(Instruction* instr) {
}
void Disassembler::VisitUnconditionalBranch(Instruction* instr) {
void Disassembler::VisitUnconditionalBranch(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'BImmUncn";
@@ -576,7 +578,7 @@ void Disassembler::VisitUnconditionalBranch(Instruction* instr) {
}
void Disassembler::VisitDataProcessing1Source(Instruction* instr) {
void Disassembler::VisitDataProcessing1Source(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rd, 'Rn";
@@ -597,7 +599,7 @@ void Disassembler::VisitDataProcessing1Source(Instruction* instr) {
}
void Disassembler::VisitDataProcessing2Source(Instruction* instr) {
void Disassembler::VisitDataProcessing2Source(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'Rd, 'Rn, 'Rm";
@@ -618,7 +620,7 @@ void Disassembler::VisitDataProcessing2Source(Instruction* instr) {
}
void Disassembler::VisitDataProcessing3Source(Instruction* instr) {
void Disassembler::VisitDataProcessing3Source(const Instruction* instr) {
bool ra_is_zr = RaIsZROrSP(instr);
const char *mnemonic = "";
const char *form = "'Xd, 'Wn, 'Wm, 'Xa";
@@ -696,7 +698,7 @@ void Disassembler::VisitDataProcessing3Source(Instruction* instr) {
}
void Disassembler::VisitCompareBranch(Instruction* instr) {
void Disassembler::VisitCompareBranch(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rt, 'BImmCmpa";
@@ -711,7 +713,7 @@ void Disassembler::VisitCompareBranch(Instruction* instr) {
}
void Disassembler::VisitTestBranch(Instruction* instr) {
void Disassembler::VisitTestBranch(const Instruction* instr) {
const char *mnemonic = "";
// If the top bit of the immediate is clear, the tested register is
// disassembled as Wt, otherwise Xt. As the top bit of the immediate is
@@ -728,7 +730,7 @@ void Disassembler::VisitTestBranch(Instruction* instr) {
}
void Disassembler::VisitMoveWideImmediate(Instruction* instr) {
void Disassembler::VisitMoveWideImmediate(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rd, 'IMoveImm";
@@ -767,7 +769,7 @@ void Disassembler::VisitMoveWideImmediate(Instruction* instr) {
V(LDR_s, "ldr", "'St") \
V(LDR_d, "ldr", "'Dt")
void Disassembler::VisitLoadStorePreIndex(Instruction* instr) {
void Disassembler::VisitLoadStorePreIndex(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStorePreIndex)";
@@ -781,7 +783,7 @@ void Disassembler::VisitLoadStorePreIndex(Instruction* instr) {
}
void Disassembler::VisitLoadStorePostIndex(Instruction* instr) {
void Disassembler::VisitLoadStorePostIndex(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStorePostIndex)";
@@ -795,7 +797,7 @@ void Disassembler::VisitLoadStorePostIndex(Instruction* instr) {
}
void Disassembler::VisitLoadStoreUnsignedOffset(Instruction* instr) {
void Disassembler::VisitLoadStoreUnsignedOffset(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStoreUnsignedOffset)";
@@ -810,7 +812,7 @@ void Disassembler::VisitLoadStoreUnsignedOffset(Instruction* instr) {
}
void Disassembler::VisitLoadStoreRegisterOffset(Instruction* instr) {
void Disassembler::VisitLoadStoreRegisterOffset(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStoreRegisterOffset)";
@@ -825,7 +827,7 @@ void Disassembler::VisitLoadStoreRegisterOffset(Instruction* instr) {
}
void Disassembler::VisitLoadStoreUnscaledOffset(Instruction* instr) {
void Disassembler::VisitLoadStoreUnscaledOffset(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'Wt, ['Xns'ILS]";
const char *form_x = "'Xt, ['Xns'ILS]";
@@ -856,7 +858,7 @@ void Disassembler::VisitLoadStoreUnscaledOffset(Instruction* instr) {
}
void Disassembler::VisitLoadLiteral(Instruction* instr) {
void Disassembler::VisitLoadLiteral(const Instruction* instr) {
const char *mnemonic = "ldr";
const char *form = "(LoadLiteral)";
@@ -865,6 +867,11 @@ void Disassembler::VisitLoadLiteral(Instruction* instr) {
case LDR_x_lit: form = "'Xt, 'ILLiteral 'LValue"; break;
case LDR_s_lit: form = "'St, 'ILLiteral 'LValue"; break;
case LDR_d_lit: form = "'Dt, 'ILLiteral 'LValue"; break;
case LDRSW_x_lit: {
mnemonic = "ldrsw";
form = "'Xt, 'ILLiteral 'LValue";
break;
}
default: mnemonic = "unimplemented";
}
Format(instr, mnemonic, form);
@@ -882,7 +889,7 @@ void Disassembler::VisitLoadLiteral(Instruction* instr) {
V(STP_d, "stp", "'Dt, 'Dt2", "8") \
V(LDP_d, "ldp", "'Dt, 'Dt2", "8")
void Disassembler::VisitLoadStorePairPostIndex(Instruction* instr) {
void Disassembler::VisitLoadStorePairPostIndex(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStorePairPostIndex)";
@@ -896,7 +903,7 @@ void Disassembler::VisitLoadStorePairPostIndex(Instruction* instr) {
}
void Disassembler::VisitLoadStorePairPreIndex(Instruction* instr) {
void Disassembler::VisitLoadStorePairPreIndex(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStorePairPreIndex)";
@@ -910,7 +917,7 @@ void Disassembler::VisitLoadStorePairPreIndex(Instruction* instr) {
}
void Disassembler::VisitLoadStorePairOffset(Instruction* instr) {
void Disassembler::VisitLoadStorePairOffset(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(LoadStorePairOffset)";
@@ -924,7 +931,7 @@ void Disassembler::VisitLoadStorePairOffset(Instruction* instr) {
}
void Disassembler::VisitLoadStorePairNonTemporal(Instruction* instr) {
void Disassembler::VisitLoadStorePairNonTemporal(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form;
@@ -943,7 +950,50 @@ void Disassembler::VisitLoadStorePairNonTemporal(Instruction* instr) {
}
void Disassembler::VisitFPCompare(Instruction* instr) {
void Disassembler::VisitLoadStoreExclusive(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form;
switch (instr->Mask(LoadStoreExclusiveMask)) {
case STXRB_w: mnemonic = "stxrb"; form = "'Ws, 'Wt, ['Xns]"; break;
case STXRH_w: mnemonic = "stxrh"; form = "'Ws, 'Wt, ['Xns]"; break;
case STXR_w: mnemonic = "stxr"; form = "'Ws, 'Wt, ['Xns]"; break;
case STXR_x: mnemonic = "stxr"; form = "'Ws, 'Xt, ['Xns]"; break;
case LDXRB_w: mnemonic = "ldxrb"; form = "'Wt, ['Xns]"; break;
case LDXRH_w: mnemonic = "ldxrh"; form = "'Wt, ['Xns]"; break;
case LDXR_w: mnemonic = "ldxr"; form = "'Wt, ['Xns]"; break;
case LDXR_x: mnemonic = "ldxr"; form = "'Xt, ['Xns]"; break;
case STXP_w: mnemonic = "stxp"; form = "'Ws, 'Wt, 'Wt2, ['Xns]"; break;
case STXP_x: mnemonic = "stxp"; form = "'Ws, 'Xt, 'Xt2, ['Xns]"; break;
case LDXP_w: mnemonic = "ldxp"; form = "'Wt, 'Wt2, ['Xns]"; break;
case LDXP_x: mnemonic = "ldxp"; form = "'Xt, 'Xt2, ['Xns]"; break;
case STLXRB_w: mnemonic = "stlxrb"; form = "'Ws, 'Wt, ['Xns]"; break;
case STLXRH_w: mnemonic = "stlxrh"; form = "'Ws, 'Wt, ['Xns]"; break;
case STLXR_w: mnemonic = "stlxr"; form = "'Ws, 'Wt, ['Xns]"; break;
case STLXR_x: mnemonic = "stlxr"; form = "'Ws, 'Xt, ['Xns]"; break;
case LDAXRB_w: mnemonic = "ldaxrb"; form = "'Wt, ['Xns]"; break;
case LDAXRH_w: mnemonic = "ldaxrh"; form = "'Wt, ['Xns]"; break;
case LDAXR_w: mnemonic = "ldaxr"; form = "'Wt, ['Xns]"; break;
case LDAXR_x: mnemonic = "ldaxr"; form = "'Xt, ['Xns]"; break;
case STLXP_w: mnemonic = "stlxp"; form = "'Ws, 'Wt, 'Wt2, ['Xns]"; break;
case STLXP_x: mnemonic = "stlxp"; form = "'Ws, 'Xt, 'Xt2, ['Xns]"; break;
case LDAXP_w: mnemonic = "ldaxp"; form = "'Wt, 'Wt2, ['Xns]"; break;
case LDAXP_x: mnemonic = "ldaxp"; form = "'Xt, 'Xt2, ['Xns]"; break;
case STLRB_w: mnemonic = "stlrb"; form = "'Wt, ['Xns]"; break;
case STLRH_w: mnemonic = "stlrh"; form = "'Wt, ['Xns]"; break;
case STLR_w: mnemonic = "stlr"; form = "'Wt, ['Xns]"; break;
case STLR_x: mnemonic = "stlr"; form = "'Xt, ['Xns]"; break;
case LDARB_w: mnemonic = "ldarb"; form = "'Wt, ['Xns]"; break;
case LDARH_w: mnemonic = "ldarh"; form = "'Wt, ['Xns]"; break;
case LDAR_w: mnemonic = "ldar"; form = "'Wt, ['Xns]"; break;
case LDAR_x: mnemonic = "ldar"; form = "'Xt, ['Xns]"; break;
default: form = "(LoadStoreExclusive)";
}
Format(instr, mnemonic, form);
}
void Disassembler::VisitFPCompare(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'Fn, 'Fm";
const char *form_zero = "'Fn, #0.0";
@@ -959,7 +1009,7 @@ void Disassembler::VisitFPCompare(Instruction* instr) {
}
void Disassembler::VisitFPConditionalCompare(Instruction* instr) {
void Disassembler::VisitFPConditionalCompare(const Instruction* instr) {
const char *mnemonic = "unmplemented";
const char *form = "'Fn, 'Fm, 'INzcv, 'Cond";
@@ -974,7 +1024,7 @@ void Disassembler::VisitFPConditionalCompare(Instruction* instr) {
}
void Disassembler::VisitFPConditionalSelect(Instruction* instr) {
void Disassembler::VisitFPConditionalSelect(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Fd, 'Fn, 'Fm, 'Cond";
@@ -987,7 +1037,7 @@ void Disassembler::VisitFPConditionalSelect(Instruction* instr) {
}
void Disassembler::VisitFPDataProcessing1Source(Instruction* instr) {
void Disassembler::VisitFPDataProcessing1Source(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'Fd, 'Fn";
@@ -1015,7 +1065,7 @@ void Disassembler::VisitFPDataProcessing1Source(Instruction* instr) {
}
void Disassembler::VisitFPDataProcessing2Source(Instruction* instr) {
void Disassembler::VisitFPDataProcessing2Source(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Fd, 'Fn, 'Fm";
@@ -1039,7 +1089,7 @@ void Disassembler::VisitFPDataProcessing2Source(Instruction* instr) {
}
void Disassembler::VisitFPDataProcessing3Source(Instruction* instr) {
void Disassembler::VisitFPDataProcessing3Source(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Fd, 'Fn, 'Fm, 'Fa";
@@ -1058,7 +1108,7 @@ void Disassembler::VisitFPDataProcessing3Source(Instruction* instr) {
}
void Disassembler::VisitFPImmediate(Instruction* instr) {
void Disassembler::VisitFPImmediate(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "(FPImmediate)";
@@ -1071,7 +1121,7 @@ void Disassembler::VisitFPImmediate(Instruction* instr) {
}
void Disassembler::VisitFPIntegerConvert(Instruction* instr) {
void Disassembler::VisitFPIntegerConvert(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(FPIntegerConvert)";
const char *form_rf = "'Rd, 'Fn";
@@ -1127,7 +1177,7 @@ void Disassembler::VisitFPIntegerConvert(Instruction* instr) {
}
void Disassembler::VisitFPFixedPointConvert(Instruction* instr) {
void Disassembler::VisitFPFixedPointConvert(const Instruction* instr) {
const char *mnemonic = "";
const char *form = "'Rd, 'Fn, 'IFPFBits";
const char *form_fr = "'Fd, 'Rn, 'IFPFBits";
@@ -1155,14 +1205,22 @@ void Disassembler::VisitFPFixedPointConvert(Instruction* instr) {
}
void Disassembler::VisitSystem(Instruction* instr) {
void Disassembler::VisitSystem(const Instruction* instr) {
// Some system instructions hijack their Op and Cp fields to represent a
// range of immediates instead of indicating a different instruction. This
// makes the decoding tricky.
const char *mnemonic = "unimplemented";
const char *form = "(System)";
if (instr->Mask(SystemSysRegFMask) == SystemSysRegFixed) {
if (instr->Mask(SystemExclusiveMonitorFMask) == SystemExclusiveMonitorFixed) {
switch (instr->Mask(SystemExclusiveMonitorMask)) {
case CLREX: {
mnemonic = "clrex";
form = (instr->CRm() == 0xf) ? NULL : "'IX";
break;
}
}
} else if (instr->Mask(SystemSysRegFMask) == SystemSysRegFixed) {
switch (instr->Mask(SystemSysRegMask)) {
case MRS: {
mnemonic = "mrs";
@@ -1184,7 +1242,6 @@ void Disassembler::VisitSystem(Instruction* instr) {
}
}
} else if (instr->Mask(SystemHintFMask) == SystemHintFixed) {
VIXL_ASSERT(instr->Mask(SystemHintMask) == HINT);
switch (instr->ImmHint()) {
case NOP: {
mnemonic = "nop";
@@ -1216,7 +1273,7 @@ void Disassembler::VisitSystem(Instruction* instr) {
}
void Disassembler::VisitException(Instruction* instr) {
void Disassembler::VisitException(const Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'IDebug";
@@ -1235,22 +1292,75 @@ void Disassembler::VisitException(Instruction* instr) {
}
void Disassembler::VisitUnimplemented(Instruction* instr) {
void Disassembler::VisitUnimplemented(const Instruction* instr) {
Format(instr, "unimplemented", "(Unimplemented)");
}
void Disassembler::VisitUnallocated(Instruction* instr) {
void Disassembler::VisitUnallocated(const Instruction* instr) {
Format(instr, "unallocated", "(Unallocated)");
}
void Disassembler::ProcessOutput(Instruction* /*instr*/) {
void Disassembler::ProcessOutput(const Instruction* /*instr*/) {
// The base disasm does nothing more than disassembling into a buffer.
}
void Disassembler::Format(Instruction* instr, const char* mnemonic,
void Disassembler::AppendRegisterNameToOutput(const Instruction* instr,
const CPURegister& reg) {
USE(instr);
VIXL_ASSERT(reg.IsValid());
char reg_char;
if (reg.IsRegister()) {
reg_char = reg.Is64Bits() ? 'x' : 'w';
} else {
VIXL_ASSERT(reg.IsFPRegister());
reg_char = reg.Is64Bits() ? 'd' : 's';
}
if (reg.IsFPRegister() || !(reg.Aliases(sp) || reg.Aliases(xzr))) {
// A normal register: w0 - w30, x0 - x30, s0 - s31, d0 - d31.
AppendToOutput("%c%d", reg_char, reg.code());
} else if (reg.Aliases(sp)) {
// Disassemble w31/x31 as stack pointer wsp/sp.
AppendToOutput("%s", reg.Is64Bits() ? "sp" : "wsp");
} else {
// Disassemble w31/x31 as zero register wzr/xzr.
AppendToOutput("%czr", reg_char);
}
}
void Disassembler::AppendPCRelativeOffsetToOutput(const Instruction* instr,
int64_t offset) {
USE(instr);
char sign = (offset < 0) ? '-' : '+';
AppendToOutput("#%c0x%" PRIx64, sign, std::abs(offset));
}
void Disassembler::AppendAddressToOutput(const Instruction* instr,
const void* addr) {
USE(instr);
AppendToOutput("(addr %p)", addr);
}
void Disassembler::AppendCodeAddressToOutput(const Instruction* instr,
const void* addr) {
AppendAddressToOutput(instr, addr);
}
void Disassembler::AppendDataAddressToOutput(const Instruction* instr,
const void* addr) {
AppendAddressToOutput(instr, addr);
}
void Disassembler::Format(const Instruction* instr, const char* mnemonic,
const char* format) {
VIXL_ASSERT(mnemonic != NULL);
ResetOutput();
@@ -1264,7 +1374,7 @@ void Disassembler::Format(Instruction* instr, const char* mnemonic,
}
void Disassembler::Substitute(Instruction* instr, const char* string) {
void Disassembler::Substitute(const Instruction* instr, const char* string) {
char chr = *string++;
while (chr != '\0') {
if (chr == '\'') {
@@ -1277,7 +1387,8 @@ void Disassembler::Substitute(Instruction* instr, const char* string) {
}
int Disassembler::SubstituteField(Instruction* instr, const char* format) {
int Disassembler::SubstituteField(const Instruction* instr,
const char* format) {
switch (format[0]) {
case 'R': // Register. X or W, selected by sf bit.
case 'F': // FP Register. S or D, selected by type field.
@@ -1303,7 +1414,7 @@ int Disassembler::SubstituteField(Instruction* instr, const char* format) {
}
int Disassembler::SubstituteRegisterField(Instruction* instr,
int Disassembler::SubstituteRegisterField(const Instruction* instr,
const char* format) {
unsigned reg_num = 0;
unsigned field_len = 2;
@@ -1312,6 +1423,7 @@ int Disassembler::SubstituteRegisterField(Instruction* instr,
case 'n': reg_num = instr->Rn(); break;
case 'm': reg_num = instr->Rm(); break;
case 'a': reg_num = instr->Ra(); break;
case 's': reg_num = instr->Rs(); break;
case 't': {
if (format[2] == '2') {
reg_num = instr->Rt2();
@@ -1329,34 +1441,47 @@ int Disassembler::SubstituteRegisterField(Instruction* instr,
field_len = 3;
}
char reg_type;
CPURegister::RegisterType reg_type;
unsigned reg_size;
if (format[0] == 'R') {
// Register type is R: use sf bit to choose X and W.
reg_type = instr->SixtyFourBits() ? 'x' : 'w';
reg_type = CPURegister::kRegister;
reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
} else if (format[0] == 'F') {
// Floating-point register: use type field to choose S or D.
reg_type = ((instr->FPType() & 1) == 0) ? 's' : 'd';
reg_type = CPURegister::kFPRegister;
reg_size = ((instr->FPType() & 1) == 0) ? kSRegSize : kDRegSize;
} else {
// Register type is specified. Make it lower case.
reg_type = format[0] + 0x20;
// The register type is specified.
switch (format[0]) {
case 'W':
reg_type = CPURegister::kRegister; reg_size = kWRegSize; break;
case 'X':
reg_type = CPURegister::kRegister; reg_size = kXRegSize; break;
case 'S':
reg_type = CPURegister::kFPRegister; reg_size = kSRegSize; break;
case 'D':
reg_type = CPURegister::kFPRegister; reg_size = kDRegSize; break;
default:
VIXL_UNREACHABLE();
reg_type = CPURegister::kRegister;
reg_size = kXRegSize;
}
}
if ((reg_num != kZeroRegCode) || (reg_type == 's') || (reg_type == 'd')) {
// A normal register: w0 - w30, x0 - x30, s0 - s31, d0 - d31.
AppendToOutput("%c%d", reg_type, reg_num);
} else if (format[2] == 's') {
// Disassemble w31/x31 as stack pointer wsp/sp.
AppendToOutput("%s", (reg_type == 'w') ? "wsp" : "sp");
} else {
// Disassemble w31/x31 as zero register wzr/xzr.
AppendToOutput("%czr", reg_type);
if ((reg_type == CPURegister::kRegister) &&
(reg_num == kZeroRegCode) && (format[2] == 's')) {
reg_num = kSPRegInternalCode;
}
AppendRegisterNameToOutput(instr, CPURegister(reg_num, reg_size, reg_type));
return field_len;
}
int Disassembler::SubstituteImmediateField(Instruction* instr,
int Disassembler::SubstituteImmediateField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(format[0] == 'I');
@@ -1406,8 +1531,7 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
}
case 'C': { // ICondB - Immediate Conditional Branch.
int64_t offset = instr->ImmCondBranch() << 2;
char sign = (offset >= 0) ? '+' : '-';
AppendToOutput("#%c0x%" PRIx64, sign, offset);
AppendPCRelativeOffsetToOutput(instr, offset);
return 6;
}
case 'A': { // IAddSub.
@@ -1458,6 +1582,10 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
AppendToOutput("#0x%" PRIx64, instr->ImmException());
return 6;
}
case 'X': { // IX - CLREX instruction.
AppendToOutput("#0x%" PRIx64, instr->CRm());
return 2;
}
default: {
VIXL_UNIMPLEMENTED();
return 0;
@@ -1466,7 +1594,7 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
}
int Disassembler::SubstituteBitfieldImmediateField(Instruction* instr,
int Disassembler::SubstituteBitfieldImmediateField(const Instruction* instr,
const char* format) {
VIXL_ASSERT((format[0] == 'I') && (format[1] == 'B'));
unsigned r = instr->ImmR();
@@ -1501,7 +1629,7 @@ int Disassembler::SubstituteBitfieldImmediateField(Instruction* instr,
}
int Disassembler::SubstituteLiteralField(Instruction* instr,
int Disassembler::SubstituteLiteralField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(strncmp(format, "LValue", 6) == 0);
USE(format);
@@ -1509,16 +1637,21 @@ int Disassembler::SubstituteLiteralField(Instruction* instr,
switch (instr->Mask(LoadLiteralMask)) {
case LDR_w_lit:
case LDR_x_lit:
case LDRSW_x_lit:
case LDR_s_lit:
case LDR_d_lit: AppendToOutput("(addr %p)", instr->LiteralAddress()); break;
default: VIXL_UNREACHABLE();
case LDR_d_lit:
AppendDataAddressToOutput(instr, instr->LiteralAddress());
break;
default:
VIXL_UNREACHABLE();
}
return 6;
}
int Disassembler::SubstituteShiftField(Instruction* instr, const char* format) {
int Disassembler::SubstituteShiftField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(format[0] == 'H');
VIXL_ASSERT(instr->ShiftDP() <= 0x3);
@@ -1541,7 +1674,7 @@ int Disassembler::SubstituteShiftField(Instruction* instr, const char* format) {
}
int Disassembler::SubstituteConditionField(Instruction* instr,
int Disassembler::SubstituteConditionField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(format[0] == 'C');
const char* condition_code[] = { "eq", "ne", "hs", "lo",
@@ -1562,28 +1695,28 @@ int Disassembler::SubstituteConditionField(Instruction* instr,
}
int Disassembler::SubstitutePCRelAddressField(Instruction* instr,
int Disassembler::SubstitutePCRelAddressField(const Instruction* instr,
const char* format) {
USE(format);
VIXL_ASSERT(strncmp(format, "AddrPCRel", 9) == 0);
VIXL_ASSERT((strcmp(format, "AddrPCRelByte") == 0) || // Used by `adr`.
(strcmp(format, "AddrPCRelPage") == 0)); // Used by `adrp`.
int offset = instr->ImmPCRel();
int64_t offset = instr->ImmPCRel();
const Instruction * base = instr;
// Only ADR (AddrPCRelByte) is supported.
VIXL_ASSERT(strcmp(format, "AddrPCRelByte") == 0);
char sign = '+';
if (offset < 0) {
offset = -offset;
sign = '-';
if (format[9] == 'P') {
offset *= kPageSize;
base = AlignDown(base, kPageSize);
}
VIXL_STATIC_ASSERT(sizeof(*instr) == 1);
AppendToOutput("#%c0x%x (addr %p)", sign, offset, instr + offset);
const void* target = reinterpret_cast<const void*>(base + offset);
AppendPCRelativeOffsetToOutput(instr, offset);
AppendToOutput(" ");
AppendAddressToOutput(instr, target);
return 13;
}
int Disassembler::SubstituteBranchTargetField(Instruction* instr,
int Disassembler::SubstituteBranchTargetField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(strncmp(format, "BImm", 4) == 0);
@@ -1600,18 +1733,18 @@ int Disassembler::SubstituteBranchTargetField(Instruction* instr,
default: VIXL_UNIMPLEMENTED();
}
offset <<= kInstructionSizeLog2;
char sign = '+';
if (offset < 0) {
offset = -offset;
sign = '-';
}
const void* target_address = reinterpret_cast<const void*>(instr + offset);
VIXL_STATIC_ASSERT(sizeof(*instr) == 1);
AppendToOutput("#%c0x%" PRIx64 " (addr %p)", sign, offset, instr + offset);
AppendPCRelativeOffsetToOutput(instr, offset);
AppendToOutput(" ");
AppendCodeAddressToOutput(instr, target_address);
return 8;
}
int Disassembler::SubstituteExtendField(Instruction* instr,
int Disassembler::SubstituteExtendField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(strncmp(format, "Ext", 3) == 0);
VIXL_ASSERT(instr->ExtendMode() <= 7);
@@ -1638,7 +1771,7 @@ int Disassembler::SubstituteExtendField(Instruction* instr,
}
int Disassembler::SubstituteLSRegOffsetField(Instruction* instr,
int Disassembler::SubstituteLSRegOffsetField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(strncmp(format, "Offsetreg", 9) == 0);
const char* extend_mode[] = { "undefined", "undefined", "uxtw", "lsl",
@@ -1667,7 +1800,7 @@ int Disassembler::SubstituteLSRegOffsetField(Instruction* instr,
}
int Disassembler::SubstitutePrefetchField(Instruction* instr,
int Disassembler::SubstitutePrefetchField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(format[0] == 'P');
USE(format);
@@ -1682,7 +1815,7 @@ int Disassembler::SubstitutePrefetchField(Instruction* instr,
return 6;
}
int Disassembler::SubstituteBarrierField(Instruction* instr,
int Disassembler::SubstituteBarrierField(const Instruction* instr,
const char* format) {
VIXL_ASSERT(format[0] == 'M');
USE(format);
@@ -1714,7 +1847,7 @@ void Disassembler::AppendToOutput(const char* format, ...) {
}
void PrintDisassembler::ProcessOutput(Instruction* instr) {
void PrintDisassembler::ProcessOutput(const Instruction* instr) {
fprintf(stream_, "0x%016" PRIx64 " %08" PRIx32 "\t\t%s\n",
reinterpret_cast<uint64_t>(instr),
instr->InstructionBits(),

View File

@@ -31,6 +31,7 @@
#include "utils.h"
#include "instructions-a64.h"
#include "decoder-a64.h"
#include "assembler-a64.h"
namespace vixl {
@@ -42,50 +43,85 @@ class Disassembler: public DecoderVisitor {
char* GetOutput();
// Declare all Visitor functions.
#define DECLARE(A) void Visit##A(Instruction* instr);
#define DECLARE(A) void Visit##A(const Instruction* instr);
VISITOR_LIST(DECLARE)
#undef DECLARE
protected:
virtual void ProcessOutput(Instruction* instr);
virtual void ProcessOutput(const Instruction* instr);
// Default output functions. The functions below implement a default way of
// printing elements in the disassembly. A sub-class can override these to
// customize the disassembly output.
// Prints the name of a register.
virtual void AppendRegisterNameToOutput(const Instruction* instr,
const CPURegister& reg);
// Prints a PC-relative offset. This is used for example when disassembling
// branches to immediate offsets.
virtual void AppendPCRelativeOffsetToOutput(const Instruction* instr,
int64_t offset);
// Prints an address, in the general case. It can be code or data. This is
// used for example to print the target address of an ADR instruction.
virtual void AppendAddressToOutput(const Instruction* instr,
const void* addr);
// Prints the address of some code.
// This is used for example to print the target address of a branch to an
// immediate offset.
// A sub-class can for example override this method to lookup the address and
// print an appropriate name.
virtual void AppendCodeAddressToOutput(const Instruction* instr,
const void* addr);
// Prints the address of some data.
// This is used for example to print the source address of a load literal
// instruction.
virtual void AppendDataAddressToOutput(const Instruction* instr,
const void* addr);
private:
void Format(Instruction* instr, const char* mnemonic, const char* format);
void Substitute(Instruction* instr, const char* string);
int SubstituteField(Instruction* instr, const char* format);
int SubstituteRegisterField(Instruction* instr, const char* format);
int SubstituteImmediateField(Instruction* instr, const char* format);
int SubstituteLiteralField(Instruction* instr, const char* format);
int SubstituteBitfieldImmediateField(Instruction* instr, const char* format);
int SubstituteShiftField(Instruction* instr, const char* format);
int SubstituteExtendField(Instruction* instr, const char* format);
int SubstituteConditionField(Instruction* instr, const char* format);
int SubstitutePCRelAddressField(Instruction* instr, const char* format);
int SubstituteBranchTargetField(Instruction* instr, const char* format);
int SubstituteLSRegOffsetField(Instruction* instr, const char* format);
int SubstitutePrefetchField(Instruction* instr, const char* format);
int SubstituteBarrierField(Instruction* instr, const char* format);
void Format(
const Instruction* instr, const char* mnemonic, const char* format);
void Substitute(const Instruction* instr, const char* string);
int SubstituteField(const Instruction* instr, const char* format);
int SubstituteRegisterField(const Instruction* instr, const char* format);
int SubstituteImmediateField(const Instruction* instr, const char* format);
int SubstituteLiteralField(const Instruction* instr, const char* format);
int SubstituteBitfieldImmediateField(
const Instruction* instr, const char* format);
int SubstituteShiftField(const Instruction* instr, const char* format);
int SubstituteExtendField(const Instruction* instr, const char* format);
int SubstituteConditionField(const Instruction* instr, const char* format);
int SubstitutePCRelAddressField(const Instruction* instr, const char* format);
int SubstituteBranchTargetField(const Instruction* instr, const char* format);
int SubstituteLSRegOffsetField(const Instruction* instr, const char* format);
int SubstitutePrefetchField(const Instruction* instr, const char* format);
int SubstituteBarrierField(const Instruction* instr, const char* format);
inline bool RdIsZROrSP(Instruction* instr) const {
inline bool RdIsZROrSP(const Instruction* instr) const {
return (instr->Rd() == kZeroRegCode);
}
inline bool RnIsZROrSP(Instruction* instr) const {
inline bool RnIsZROrSP(const Instruction* instr) const {
return (instr->Rn() == kZeroRegCode);
}
inline bool RmIsZROrSP(Instruction* instr) const {
inline bool RmIsZROrSP(const Instruction* instr) const {
return (instr->Rm() == kZeroRegCode);
}
inline bool RaIsZROrSP(Instruction* instr) const {
inline bool RaIsZROrSP(const Instruction* instr) const {
return (instr->Ra() == kZeroRegCode);
}
bool IsMovzMovnImm(unsigned reg_size, uint64_t value);
protected:
void ResetOutput();
void AppendToOutput(const char* string, ...);
void AppendToOutput(const char* string, ...) PRINTF_CHECK(2, 3);
char* buffer_;
uint32_t buffer_pos_;
@@ -97,10 +133,10 @@ class Disassembler: public DecoderVisitor {
class PrintDisassembler: public Disassembler {
public:
explicit PrintDisassembler(FILE* stream) : stream_(stream) { }
~PrintDisassembler() { }
virtual ~PrintDisassembler() { }
protected:
virtual void ProcessOutput(Instruction* instr);
virtual void ProcessOutput(const Instruction* instr);
private:
FILE *stream_;

Some files were not shown because too many files have changed in this diff Show More