Compare commits

..

59 Commits

Author SHA1 Message Date
Michael Roth
562d6b4f7f Update version for v2.1.2 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-25 14:52:04 -05:00
Petr Matousek
9a72433843 slirp: udp: fix NULL pointer dereference because of uninitialized socket
When guest sends udp packet with source port and source addr 0,
uninitialized socket is picked up when looking for matching and already
created udp sockets, and later passed to sosendto() where NULL pointer
dereference is hit during so->slirp->vnetwork_mask.s_addr access.

Fix this by checking that the socket is not just a socket stub.

This is CVE-2014-3640.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Xavier Mehrenberger <xavier.mehrenberger@airbus.com>
Reported-by: Stephane Duverger <stephane.duverger@eads.net>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20140918063537.GX9321@dhcp-25-225.brq.redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 01f7cecf00)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-24 11:11:52 -05:00
Michael S. Tsirkin
00dd2b22f6 pc: leave more space for BIOS allocations
Since QEMU 2.1, we are allocating more space for ACPI tables, so no
space is left after initrd for the BIOS to allocate memory.

Besides ACPI tables, there are a few other uses of high memory in
SeaBIOS: SMBIOS tables and USB drivers use it in particular.  These uses
allocate a very small amount of memory.  Malloc metadata also lives
there.  So we need _some_ extra padding there to avoid initrd breakage,
but not much.

John Snow found a case where RHEL5 was broken by the recent change to
ACPI_TABLE_SIZE; in his case 4KB of extra padding are fine, but just to
be safe I am adding 32KB, which is roughly the same amount of padding
that was left by QEMU 2.0 and earlier.

Move initrd to leave some space for the BIOS.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 438f92ee9f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
Michael S. Tsirkin
80f4d021f0 Revert "virtio: don't call device on !vm_running"
This reverts commit a1bc7b827e422e1ff065640d8ec5347c4aadfcd8.
    virtio: don't call device on !vm_running
It turns out that virtio net assumes that vm_running
is updated before device status callback in many places,
so this change leads to asserts.
Previous commit fixes the root issue that motivated
a1bc7b827e422e1ff065640d8ec5347c4aadfcd8 differently,
so there's no longer a need for this change.

In the future, we might be able to drop checking vm_running
completely, and check vm state directly.

Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9e8e8c4865)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
Michael S. Tsirkin
074e347138 virtio-net: drop assert on vm stop
On vm stop, vm_running state set to stopped
before device is notified, so callbacks can get envoked with
vm_running = false; and this is not an error.

Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 131c5221fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
Eduardo Habkost
9e8d994111 Revert "rng-egd: remove redundant free"
This reverts commit 5e490b6a50.

Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit abb4d5f2e2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
Eduardo Habkost
a56b9cfd86 hw/machine: Free old values of string properties
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 556068eed0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
Greg Kurz
07178559a9 Revert "spapr_pci: map the MSI window in each PHB"
This patch is predicated on cc943c, which was dropped from
stable tree for other reasons.

This reverts commit 0824ca6bd1.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
Michael Roth
3cb451edb2 Update version for v2.1.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 14:30:45 -05:00
Eduardo Habkost
82d80e1f0b target-i386: Support migratable=no properly
When the "migratable" property was implemented, the behavior was tested
by changing the default on the code, but actually using the option on
the command-line (e.g. "-cpu host,migratable=false") doesn't work as
expected. This is a regression for a common use case of "-cpu host",
which is to enable features that are supported by the host CPU + kernel
before feature-specific code is added to QEMU.

Fix this by initializing the feature words for "-cpu host" on
x86_cpu_parse_featurestr(), right after parsing the CPU options.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 4d1b279b06)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Pavel Dovgaluk
5dd076a9f8 exec: Save CPUState::exception_index field
This patch adds a subsection with exception_index field to the VMState for
correct saving the CPU state.
Without this patch, simulator could miss the pending exception in the saved
virtual machine state.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 6c3bff0ed8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Sebastian Tanase
257e9cfce2 pty: Fix byte loss bug when connecting to pty
When trying to print data to the pty, we first check if it is connected.
If not, we try to reconnect, but we drop the pending data even if we
have successfully reconnected; this makes us lose the first byte of the very
first transmission.
This small fix addresses the issue by checking once more if the pty is connected
after having tried to reconnect.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cf7330c759)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Gerd Hoffmann
1aa87d3689 spice: make sure we don't overflow ssd->buf
Related spice-only bug.  We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card.  It's also used with qxl in vga mode.

When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer.  In theory the guest can write,
indirectly via spice-server.  The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.

Fix that by switching to dynamic allocation for the buffer.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit ab9509ccea)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Gerd Hoffmann
7fe5418d9f vbe: rework sanity checks
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers.  Call that
unconditionally on every register write.  That way we should catch
everything, even changing one register affecting the valid range of
another register.

Some of the holes have been added by commit
e9c6149f6a.  Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.

Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.

Security impact:

(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source  ->  host memory leak.  Memory isn't leaked to
the guest but to the vnc client though.

(2) Qemu will segfault in case the memory range happens to include
unmapped areas  ->  Guest can DoS itself.

The guest can not modify host memory, so I don't think this can be used
by the guest to escape.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Gerd Hoffmann
c5042f04f7 vbe: make bochs dispi interface return the correct memory size with qxl
VgaState->vram_size is the size of the pci bar.  In case of qxl not the
whole pci bar can be used as vga framebuffer.  Add a new variable
vbe_size to handle that case.  By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.

This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit 54a85d4624)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
cf29a88391 virtio-net: purge outstanding packets when starting vhost
whenever we start vhost, virtio could have outstanding packets
queued, when they complete later we'll modify the ring
while vhost is processing it.

To prevent this, purge outstanding packets on vhost start.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 086abc1ccd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
08743db463 net: complete all queued packets on VM stop
This completes all packets, ensuring that callbacks
will not run when VM is stopped.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ca77d85e1d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
d9c06c0d79 net: invoke callback when purging queue
devices rely on packet callbacks eventually running,
but we violate this rule whenever we purge the queue.
To fix, invoke callbacks on all packets on purge.
Set length to 0, this way callers can detect that
this happened and re-queue if necessary.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 07d8084624)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
f321710cd4 virtio: don't call device on !vm_running
On vm stop, virtio changes vm_running state
too soon, so callbacks can get envoked with
vm_running = false;

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 269bd822e7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
zhanghailiang
ec48bfd57b net: Forbid dealing with packets when VM is not running
For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.

If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.

To avoid this, we forbid receiving packets in generic net code when
VM is not running.

Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e1d64c084b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
zhanghailiang
eb36f79d59 acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.

The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.

If we use cluster destination model, the problem will be solved.

Note:
This flag was introduced later than ACPI v1.0 specification while QEMU
generates v1.0 tables only, but...

linux kernel ignores this flag, so patch has no influence on it.

Tested with Win[XPsp3|Srv2003EE|Srv2008DC|Srv2008R2|Srv2012R2], there
isn't BSODs and guests boot just fine. In cases guest doesn't support
cpu-hotplug, cpu becomes visible after reboot and in case the guest
supports cpu-hotplug, it works as expected with this patch.

Cc: qemu-stable@nongnu.org
Signed-off-by: huangzhichao <huangzhichao@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit 07b81ed937)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:58 -05:00
Michael S. Tsirkin
34d41c1a20 vhost-scsi: init backend features earlier
As vhost core can use backend_features during init, clear it earlier to
avoid using uninitialized memory.
This use would be harmless since vhost scsi ignores the result
anyway, but initializing earlier will help prevent valgrind errors,
and make scsi and net behave similarly.

Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3a1655fc53)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Jason Wang
6f8d05a8f8 vhost_net: init acked_features to backend_features
commit 2e6d46d77e (vhost: add
vhost_get_features and vhost_ack_features) removes the step that
initializes the acked_features to backend_features.

As this field is now uninitialized, vhost initialization will sometimes
fail.

To fix, initialize acked_features on each ack.

Tested-by: Andrey Korolyov <andrey@xdel.ru>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b49ae9138d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Jason Wang
5e83dae44e vhost_net: start/stop guest notifiers properly
commit a9f98bb5eb "vhost: multiqueue
support" changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.

To fix this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.

MST: fix up error handling.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <william@gandi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cd7d1d26b0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Knut Omang
ff34ca00fd pci: avoid losing config updates to MSI/MSIX cap regs
Since
commit 95d6580024
    msi: Invoke msi/msix_write_config from PCI core
msix config writes are lost, the value written is always 0.

Fix pci_default_write_config to avoid this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d7efb7e08e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Michael S. Tsirkin
e685d2abf7 virtio-net: don't run bh on vm stopped
commit 783e770693
    virtio-net: stop/start bh when appropriate

is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e8bcf84200)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Gerd Hoffmann
67cfda8776 qxl-render: add more sanity checks
Damn, the dirty rectangle values are signed integers.  So the checks
added by commit 788fbf042f are not good
enough, we also have to make sure they are not negative.

[ Note: There must be something broken in spice-server so we get
  negative values in the first place.  Bug opened:
  https://bugzilla.redhat.com/show_bug.cgi?id=1135372 ]

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 503b3b33fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Peter Maydell
4fd144f8f5 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
We implement the crypto extensions but were incorrectly reporting
ID register values for the Cortex-A57 which did not advertise
crypto. Use the correct values as described in the TRM.
With this fix Linux correctly detects presence of the crypto
features and advertises them in /proc/cpuinfo.

Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408718660-7295-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c379621451)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Peter Maydell
ea774b8dd0 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
Commit 2c7ffc414 added support for honouring the CPACR coprocessor
access control register bits which may disable access to VFP
and Neon instructions. However it failed to account for the
fact that the CPACR is only present starting from the ARMv6
architecture version, so it accidentally disabled VFP completely
for ARMv5 CPUs like the ARM926. Linux would detect this as
"no VFP present" and probably fall back to its own emulation,
but other guest OSes might crash or misbehave.

This fixes bug LP:1359930.

Reported-by: Jakub Jermar <jakub@jermar.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408714940-7192-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit ed1f13d607)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Alex Williamson
3e8966df02 x86: Clear MTRRs on vCPU reset
The SDM specifies (June 2014 Vol3 11.11.5):

    On a hardware reset, the P6 and more recent processors clear the
    valid flags in variable-range MTRRs and clear the E flag in the
    IA32_MTRR_DEF_TYPE MSR to disable all MTRRs. All other bits in the
    MTRRs are undefined.

We currently do none of that, so whatever MTRR settings you had prior
to reset is what you have after reset.  Usually this doesn't matter
because KVM often ignores the guest mappings and uses write-back
anyway.  However, if you have an assigned device and an IOMMU that
allows NoSnoop for that device, KVM defers to the guest memory
mappings which are now stale after reset.  The result is that OVMF
rebooting on such a configuration takes a full minute to LZMA
decompress the firmware volume, a process that is nearly instant on
the initial boot.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9db2efd95e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Alex Williamson
ba8576f338 x86: kvm: Add MTRR support for kvm_get|put_msrs()
The MTRR state in KVM currently runs completely independent of the
QEMU state in CPUX86State.mtrr_*.  This means that on migration, the
target loses MTRR state from the source.  Generally that's ok though
because KVM ignores it and maps everything as write-back anyway.  The
exception to this rule is when we have an assigned device and an IOMMU
that doesn't promote NoSnoop transactions from that device to be cache
coherent.  In that case KVM trusts the guest mapping of memory as
configured in the MTRR.

This patch updates kvm_get|put_msrs() so that we retrieve the actual
vCPU MTRR settings and therefore keep CPUX86State synchronized for
migration.  kvm_put_msrs() is also used on vCPU reset and therefore
allows future modificaitons of MTRR state at reset to be realized.

Note that the entries array used by both functions was already
slightly undersized for holding every possible MSR, so this patch
increases it beyond the 28 new entries necessary for MTRR state.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d1ae67f626)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Alex Williamson
07f8c97f84 x86: Use common variable range MTRR counts
We currently define the number of variable range MTRR registers as 8
in the CPUX86State structure and vmstate, but use MSR_MTRRcap_VCNT
(also 8) to report to guests the number available.  Change this to
use MSR_MTRRcap_VCNT consistently.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d8b5c67b05)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
William Grant
72c9c9a05e target-i386: Don't forbid NX bit on PAE PDEs and PTEs
Commit e8f6d00c30 ("target-i386: raise
page fault for reserved physical address bits") added a check that the
NX bit is not set on PAE PDPEs, but it also added it to rsvd_mask for
the rest of the function. This caused any PDEs or PTEs with NX set to be
erroneously rejected, making PAE guests with NX support unusable.

Signed-off-by: William Grant <wgrant@ubuntu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1844e68eca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:57 -05:00
Paolo Bonzini
3d8cc86e4f vl: process -object after other backend options
QOM backends can refer to chardevs, but not vice versa.  So
process -chardev and -fsdev options before -object

This fixes the rng-egd backend to virtio-rng.

Reported-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7b71758d79)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:56 -05:00
Greg Kurz
0824ca6bd1 spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8c46f7ec85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-10 09:30:28 -05:00
Stefan Hajnoczi
feb633411f thread-pool: avoid deadlock in nested aio_poll() calls
The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:

  If element A's callback waits for element B using aio_poll() it will
  deadlock since pool->completion_bh is not marked scheduled when the
  nested aio_poll() runs.

Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing.  This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3c80ca158c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:06 -05:00
Stefan Hajnoczi
75ada6b763 thread-pool: avoid per-thread-pool EventNotifier
EventNotifier is implemented using an eventfd or pipe.  It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.

Switch from EventNotifier to QEMUBH in thread-pool.c.  Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c2e50e3d11)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:06 -05:00
Michael S. Tsirkin
be3af755ac pc: reserve more memory for ACPI for new machine types
commit 868270f23d
    acpi-build: tweak acpi migration limits
broke kernel loading with -kernel/-initrd: it doubled
the size of ACPI tables but did not reserve
enough memory.

As a result, issues on boot and halt are observed.

Fix this up by doubling reserved memory for new machine types.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 927766c7d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Gonglei
bfe3e6f5e3 pcihp: fix possible array out of bounds
Prevent out-of-bounds array access on
acpi_pcihp_pci_status.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
(cherry picked from commit fa365d7cd1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Michael S. Tsirkin
cd4acff8d0 hostmem: set MPOL_MF_MOVE
When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with mlock=on
realtime feature.

The code comment actually says: "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 288d332202)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Ben Draper
4b59161253 vmxnet3: Pad short frames to minimum size (60 bytes)
When running VMware ESXi under qemu-kvm the guest discards frames
that are too short. Short ARP Requests will be dropped, this prevents
guests on the same bridge as VMware ESXi from communicating. This patch
simply adds the padding on the network device itself.

Signed-off-by: Ben Draper <ben@xrsa.net>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 40a87c6c9b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Fam Zheng
fab7560c35 blkdebug: Delete BH in bdrv_aio_cancel
Otherwise error_callback_bh will access the already released acb.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cbf95a0b11)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Stefan Hajnoczi
16c92cd629 qemu-iotests: add test case 101 for short file I/O
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8d9eb33ca0)

Conflicts:
	tests/qemu-iotests/group

*fix up context mismatches due to lack of 099 and 103 tests

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Stefan Hajnoczi
dea6efe883 raw-posix: fix O_DIRECT short reads
The following O_DIRECT read from a <512 byte file fails:

  $ truncate -s 320 test.img
  $ qemu-io -n -c 'read -P 0 0 512' test.img
  qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument

Note that qemu-io completes successfully without the -n (O_DIRECT)
option.

This patch fixes qemu-iotests ./check -nocache -vmdk 059.

Cc: qemu-stable@nongnu.org
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 61ed73cff4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Peter Lieven
8c4edd743c block/iscsi: fix memory corruption on iscsi resize
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d832fb4d66)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Christoffer Dall
504e2a7139 arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.

This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT.  Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.

Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts.  After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.

Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only).  Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels.  Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 863714ba6c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Christoffer Dall
2f6d5e1c9c target-arm: Rename QEMU PSCI v0.1 definitions
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>.  To build using these defines in non-KVM code,
QEMU defines these IDs locally and check their correctness against the
KVM headers when those are available.

However, the naming scheme used for QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux so to avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prependend and clearly identify the PSCI version as v0.1 in the
defines.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a65c9c17ce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
Peter Maydell
20463dc874 target-arm: Fix return address for A64 BRK instructions
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.

(We do get this correct for the A32/T32 BKPT insns.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 229a138d74)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:05 -05:00
zhanghailiang
2a575c450e virtio-blk: fix reference a pointer which might be freed
In function virtio_blk_handle_request, it may freed memory pointed by req,
So do not access member of req after calling this function.

Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1bdb176ac5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Michael S. Tsirkin
1ad9dcec47 acpi: align RSDP
RSDP should be aligned at a 16-byte boundary.
This would by chance at the moment, fix up acpi build
to make it robust.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit d67aadccfa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Hu Tao
ba1bc81991 numa: show hex number in error message for consistency and prefix them with 0x
The error messages before and after patch are:

before:
qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)

after:
qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit c68233aee8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Michael S. Tsirkin
948574e0d2 pc-dimm: fix up error message
- int should be printed using %d
- print actual wrong value for property

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 988eba0f68)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Hu Tao
044af98ea8 pc-dimm: validate node property
If user specifies a node number that exceeds the available numa nodes in
emulated system for pc-dimm device, the device will report an invalid _PXM
to OSPM. Fix this by checking the node property value.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cfe0ffd027)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Hu Tao
7c68c5402a hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 41d2f71376)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-08 11:23:04 -05:00
Michael Tokarev
bd4740621c ide: only constrain read/write requests to drive size, not other types
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size.  However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1.  So check the range only for reads and writes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d66168ed68)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:58:56 -05:00
Michael Tokarev
e22d5dc073 l2tpv3 (configure): it is linux-specific
Some non-linux systems, for example a system with
FreeBSD kernel and glibc, may declare struct mmsghdr
(in glibc) but may not have linux-specific header
file linux/ip.h.  The actual implementation in qemu
includes this linux-specific header file unconditionally,
so compilation fails if it is not present.  Include
this header in the configure test too.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bff6cb7296)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:57:28 -05:00
Alex Williamson
dfd4808222 vfio: Fix MSI-X vector expansion
When new MSI-X vectors are enabled we need to disable MSI-X and
re-enable it with the correct number of vectors.  That means we need
to reprogram the eventfd triggers for each vector.  Prior to f4d45d47
vector->use tracked whether a vector was masked or unmasked and we
could always pick the KVM path when available for unmasked vectors.
Now vfio doesn't track mask state itself and vector->use and virq
remains configured even for masked vectors.  Therefore we need to ask
the MSI-X code whether a vector is masked in order to select the
correct signaling path.  As noted in the comment, MSI relies on
hardware to handle masking.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # QEMU 2.1
(cherry picked from commit c048be5cc9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:48:12 -05:00
Stefan Hajnoczi
5f26e63b17 qdev-monitor: include QOM properties in -device FOO, help output
Update -device FOO,help to include QOM properties in addition to qdev
properties.  Devices are gradually adding more QOM properties that are
not reflected as qdev properties.

It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.

This patch reuses the device-list-properties QMP machinery to avoid code
duplication.

Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
(cherry picked from commit ef523587da)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:46:19 -05:00
Stefan Hajnoczi
42f7a13178 qmp: hide "hotplugged" device property from device-list-properties
The "hotplugged" device property was not reported before commit
f4eb32b590 ("qmp: show QOM properties in
device-list-properties").  Fix this difference.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 4115dd6527)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-26 16:46:01 -05:00
783 changed files with 7989 additions and 34526 deletions

4
.gitignore vendored
View File

@@ -11,10 +11,6 @@
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/libcacard/trace/generated-tracers.c

View File

@@ -12,7 +12,7 @@ notifications:
on_failure: always
env:
global:
- TEST_CMD=""
- TEST_CMD="make check"
- EXTRA_CONFIG=""
# Development packages, EXTRA_PKGS saved for additional builds
- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
@@ -20,51 +20,31 @@ env:
- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
- EXTRA_PKGS=""
matrix:
# Group major targets together with their linux-user counterparts
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu,cris-linux-user
- TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
- TARGETS=m68k-softmmu,m68k-linux-user
- TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=arm-softmmu,arm-linux-user
- TARGETS=aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu
- TARGETS=i386-softmmu,x86_64-softmmu
- TARGETS=lm32-softmmu
- TARGETS=m68k-softmmu
- TARGETS=microblaze-softmmu,microblazeel-softmmu
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user
- TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
- TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
- TARGETS=unicore32-softmmu,unicore32-linux-user
# Group remaining softmmu only targets into one build
- TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
git:
# we want to do this ourselves
submodules: false
- TARGETS=moxie-softmmu
- TARGETS=or32-softmmu,
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu
- TARGETS=s390x-softmmu
- TARGETS=sh4-softmmu,sh4eb-softmmu
- TARGETS=sparc-softmmu,sparc64-softmmu
- TARGETS=unicore32-softmmu
- TARGETS=xtensa-softmmu,xtensaeb-softmmu
before_install:
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
- sudo apt-get update -qq
- sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
before_script:
- ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
script:
- make -j2 && ${TEST_CMD}
script: "./configure --target-list=${TARGETS} ${EXTRA_CONFIG} && make && ${TEST_CMD}"
matrix:
# We manually include a number of additional build for non-standard bits
include:
# Make check target (we only do this once)
- env:
- TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,
i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,
microblazeel-softmmu,mips-softmmu,mips64-softmmu,
mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,
ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,
sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,
unicore32-softmmu,unicore32-linux-user,
lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,
xtensaeb-softmmu
TEST_CMD="make check"
compiler: gcc
# Debug related options
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug"
@@ -93,6 +73,7 @@ matrix:
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
TEST_CMD=""
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"

View File

@@ -91,17 +91,3 @@ Mixed declarations (interleaving statements and declarations within blocks)
are not allowed; declarations should be at the beginning of blocks. In other
words, the code should not generate warnings if using GCC's
-Wdeclaration-after-statement option.
6. Conditional statements
When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:
if (a == 1) {
/* Reads like: "If a equals 1" */
do_something();
}
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

View File

@@ -161,12 +161,6 @@ S: Maintained
F: target-xtensa/
F: hw/xtensa/
TriCore
M: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
S: Maintained
F: target-tricore/
F: hw/tricore/
Guest CPU Cores (KVM):
----------------------
@@ -620,7 +614,7 @@ USB
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: hw/usb/*
F: tests/usb-*-test.c
F: tests/usb-hcd-ehci-test.c
VFIO
M: Alex Williamson <alex.williamson@redhat.com>
@@ -684,12 +678,6 @@ S: Maintained
F: hw/*/xilinx_*
F: include/hw/xilinx.h
Vmware
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: hw/net/vmxnet*
F: hw/scsi/vmw_pvscsi*
Subsystems
----------
Audio
@@ -980,7 +968,7 @@ S: Supported
F: block/rbd.c
Sheepdog
M: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
M: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
M: Liu Yuan <namei.unix@gmail.com>
L: sheepdog@lists.wpkg.org
S: Supported
@@ -1012,9 +1000,3 @@ SSH
M: Richard W.M. Jones <rjones@redhat.com>
S: Supported
F: block/ssh.c
ARCHIPELAGO
M: Chrysostomos Nanakos <cnanakos@grnet.gr>
M: Chrysostomos Nanakos <chris@include.gr>
S: Maintained
F: block/archipelago.c

View File

@@ -57,12 +57,6 @@ GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
@@ -208,7 +202,7 @@ Makefile: $(version-obj-y) $(version-lobj-y)
# Build libraries
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
libqemuutil.a: $(util-obj-y) qapi-types.o qapi-visit.o qapi-event.o
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
@@ -418,7 +412,6 @@ endif
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
done

View File

@@ -1,7 +1,7 @@
#######################################################################
# Common libraries for tools and emulators
stub-obj-y = stubs/
util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
util-obj-y = util/ qobject/ qapi/ trace/
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
@@ -12,6 +12,7 @@ block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qapi-types.o qapi-visit.o qapi-event.o
block-obj-y += qemu-io-cmds.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
@@ -62,7 +63,6 @@ common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
@@ -88,6 +88,11 @@ common-obj-y += qmp-marshal.o
common-obj-y += qmp.o hmp.o
endif
######################################################################
# some qapi visitors are used by both system and user emulation:
common-obj-y += qapi-visit.o qapi-types.o
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
@@ -101,15 +106,10 @@ common-obj-y += disas/
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
util-obj-y += trace/
target-obj-y += trace/
######################################################################
# guest agent
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/
qga-obj-y = qga/ qapi-types.o qapi-visit.o
qga-vss-dll-obj-y = qga/

View File

@@ -38,7 +38,7 @@ config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
ifdef CONFIG_TRACE_SYSTEMTAP
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp $(QEMU_PROG)-simpletrace.stp
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp
ifdef CONFIG_USER_ONLY
TARGET_TYPE=user
@@ -64,13 +64,6 @@ $(QEMU_PROG).stp: $(SRC_PATH)/trace-events
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
else
stap:
endif
@@ -159,20 +152,15 @@ endif # CONFIG_SOFTMMU
dummy := $(call unnest-vars,,obj-y)
all-obj-y := $(obj-y)
target-obj-y :=
block-obj-y :=
common-obj-y :=
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
# build either PROG or PROGW
@@ -203,7 +191,6 @@ endif
ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
$(INSTALL_DATA) $(QEMU_PROG).stp-installed "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG).stp"
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
endif
GENERATED_HEADERS += config-target.h

View File

@@ -1 +1 @@
2.1.50
2.1.2

157
accel.c
View File

@@ -1,157 +0,0 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "hw/boards.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
.parent = TYPE_OBJECT,
.class_size = sizeof(AccelClass),
.instance_size = sizeof(AccelState),
};
/* Lookup AccelClass from opt_name. Returns NULL if not found */
static AccelClass *accel_find(const char *opt_name)
{
char *class_name = g_strdup_printf(ACCEL_CLASS_NAME("%s"), opt_name);
AccelClass *ac = ACCEL_CLASS(object_class_by_name(class_name));
g_free(class_name);
return ac;
}
static int accel_init_machine(AccelClass *acc, MachineState *ms)
{
ObjectClass *oc = OBJECT_CLASS(acc);
const char *cname = object_class_get_name(oc);
AccelState *accel = ACCEL(object_new(cname));
int ret;
ms->accelerator = accel;
*(acc->allowed) = true;
ret = acc->init_machine(ms);
if (ret < 0) {
ms->accelerator = NULL;
*(acc->allowed) = false;
object_unref(OBJECT(accel));
}
return ret;
}
int configure_accelerator(MachineState *ms)
{
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
p = "tcg";
}
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
}
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
printf("%s not supported for this target\n",
acc->name);
continue;
}
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
}
if (!accel_initialised) {
if (!init_failed) {
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);

View File

@@ -100,11 +100,6 @@ void aio_set_event_notifier(AioContext *ctx,
(IOHandler *)io_read, NULL, notifier);
}
bool aio_prepare(AioContext *ctx)
{
return false;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -124,20 +119,11 @@ bool aio_pending(AioContext *ctx)
return false;
}
bool aio_dispatch(AioContext *ctx)
static bool aio_dispatch(AioContext *ctx)
{
AioHandler *node;
bool progress = false;
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
@@ -198,9 +184,22 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
* be re-evaluated before the next blocking poll(). This happens
* in two cases:
*
* 1) when aio_poll is called with blocking == false
*
* 2) when we are called after poll(). If we are called before
* poll(), bottom halves will not be re-evaluated and we need
* aio_notify() if blocking == true.
*
* The first aio_dispatch() only does something when AioContext is
* running as a GSource, and in that case aio_poll is used only
* with blocking == false, so this optimization is already quite
* effective. However, the code is ugly and should be restructured
* to have a single aio_dispatch() call. To do this, we need to
* reorganize aio_poll into a prepare/poll/dispatch model like
* glib's.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
@@ -208,6 +207,26 @@ bool aio_poll(AioContext *ctx, bool blocking)
*/
aio_set_dispatching(ctx, !blocking);
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Re-evaluate condition (1) above. */
aio_set_dispatching(ctx, !blocking);
if (aio_dispatch(ctx)) {
progress = true;
}
if (progress && !blocking) {
goto out;
}
ctx->walking_handlers++;
g_array_set_size(ctx->pollfds, 0);
@@ -230,7 +249,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* wait until next event */
ret = qemu_poll_ns((GPollFD *)ctx->pollfds->data,
ctx->pollfds->len,
blocking ? aio_compute_timeout(ctx) : 0);
blocking ? timerlistgroup_deadline_ns(&ctx->tlg) : 0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
@@ -249,6 +268,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress = true;
}
out:
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -22,80 +22,12 @@
struct AioHandler {
EventNotifier *e;
IOHandler *io_read;
IOHandler *io_write;
EventNotifierHandler *io_notify;
GPollFD pfd;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
HANDLE event;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_malloc0(sizeof(AioHandler));
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
if (node->io_read) {
node->pfd.events |= G_IO_IN;
}
if (node->io_write) {
node->pfd.events |= G_IO_OUT;
}
node->e = &ctx->notifier;
/* Update handler with latest information */
node->opaque = opaque;
node->io_read = io_read;
node->io_write = io_write;
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify)
@@ -144,43 +76,6 @@ void aio_set_event_notifier(AioContext *ctx,
aio_notify(ctx);
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
AioHandler *node;
bool have_select_revents = false;
fd_set rfds, wfds;
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
if (node->io_write) {
FD_SET ((SOCKET)node->pfd.fd, &wfds);
}
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
have_select_revents = true;
}
if (FD_ISSET(node->pfd.fd, &wfds)) {
node->pfd.revents |= G_IO_OUT;
have_select_revents = true;
}
}
}
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -189,37 +84,47 @@ bool aio_pending(AioContext *ctx)
if (node->pfd.revents && node->io_notify) {
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
return true;
}
}
return false;
}
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool progress = false;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool progress;
int count;
int timeout;
progress = false;
/*
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Run timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
/*
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
if (node->pfd.revents && node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
@@ -229,28 +134,6 @@ static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
}
}
if (!node->deleted &&
(node->io_read || node->io_write)) {
node->pfd.revents = 0;
if ((revents & G_IO_IN) && node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if ((revents & G_IO_OUT) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
/* if the next select() will return an event, we have progressed */
if (event == event_notifier_get_handle(&ctx->notifier)) {
WSANETWORKEVENTS ev;
WSAEnumNetworkEvents(node->pfd.fd, event, &ev);
if (ev.lNetworkEvents) {
progress = true;
}
}
}
tmp = node;
node = QLIST_NEXT(node, node);
@@ -262,47 +145,10 @@ static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
}
}
return progress;
}
bool aio_dispatch(AioContext *ctx)
{
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool was_dispatching, progress, have_select_revents, first;
int count;
int timeout;
have_select_revents = aio_prepare(ctx);
if (have_select_revents) {
blocking = false;
if (progress && !blocking) {
return true;
}
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
/* fill fd sets */
@@ -314,40 +160,64 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
ctx->walking_handlers--;
first = true;
/* wait until next event */
while (count > 0) {
HANDLE event;
int ret;
timeout = blocking
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
timeout = blocking ?
qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg)) : 0;
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
aio_set_dispatching(ctx, true);
if (first && aio_bh_poll(ctx)) {
progress = true;
}
first = false;
/* if we have any signaled events, dispatch event */
event = NULL;
if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
event = events[ret - WAIT_OBJECT_0];
events[ret - WAIT_OBJECT_0] = events[--count];
} else if (!have_select_revents) {
if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
break;
}
have_select_revents = false;
blocking = false;
progress |= aio_dispatch_handlers(ctx, event);
/* we have to walk very carefully in case
* aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
ctx->walking_handlers++;
if (!node->deleted &&
event_notifier_get_handle(node->e) == events[ret - WAIT_OBJECT_0] &&
node->io_notify) {
node->io_notify(node->e);
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
/* Try again, but only call each handler once. */
events[ret - WAIT_OBJECT_0] = events[--count];
}
progress |= timerlistgroup_run_timers(&ctx->tlg);
if (blocking) {
/* Run the timers a second time. We do this because otherwise aio_wait
* will not note progress - and will stop a drain early - if we have
* a timer that was not ready to run entering g_poll but is ready
* after g_poll. This will only do anything if a timer has expired.
*/
progress |= timerlistgroup_run_timers(&ctx->tlg);
}
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -104,8 +104,6 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_XTENSA
#elif defined(TARGET_UNICORE32)
#define QEMU_ARCH QEMU_ARCH_UNICORE32
#elif defined(TARGET_TRICORE)
#define QEMU_ARCH QEMU_ARCH_TRICORE
#endif
const uint32_t arch_type = QEMU_ARCH;
@@ -1074,8 +1072,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id))) {
if (block->length != length) {
error_report("Length mismatch: %s: 0x" RAM_ADDR_FMT
" in != 0x" RAM_ADDR_FMT, id, length,
error_report("Length mismatch: %s: " RAM_ADDR_FMT
" in != " RAM_ADDR_FMT, id, length,
block->length);
ret = -EINVAL;
}
@@ -1337,6 +1335,11 @@ void cpudef_init(void)
#endif
}
int tcg_available(void)
{
return 1;
}
int kvm_available(void)
{
#ifdef CONFIG_KVM

55
async.c
View File

@@ -152,48 +152,39 @@ void qemu_bh_delete(QEMUBH *bh)
bh->deleted = 1;
}
int64_t
aio_compute_timeout(AioContext *ctx)
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
int64_t deadline;
int timeout = -1;
AioContext *ctx = (AioContext *) source;
QEMUBH *bh;
int deadline;
/* We assume there is no timeout already supplied */
*timeout = -1;
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
timeout = 10000000;
*timeout = 10;
} else {
/* non-idle bottom halves will be executed
* immediately */
return 0;
*timeout = 0;
return true;
}
}
}
deadline = timerlistgroup_deadline_ns(&ctx->tlg);
deadline = qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg));
if (deadline == 0) {
return 0;
} else {
return qemu_soonest_timeout(timeout, deadline);
}
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
/* We assume there is no timeout already supplied */
*timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
if (aio_prepare(ctx)) {
*timeout = 0;
return true;
} else {
*timeout = qemu_soonest_timeout(*timeout, deadline);
}
return *timeout == 0;
return false;
}
static gboolean
@@ -218,7 +209,7 @@ aio_ctx_dispatch(GSource *source,
AioContext *ctx = (AioContext *) source;
assert(callback == NULL);
aio_dispatch(ctx);
aio_poll(ctx, false);
return true;
}
@@ -289,24 +280,18 @@ static void aio_rfifolock_cb(void *opaque)
aio_notify(opaque);
}
AioContext *aio_context_new(Error **errp)
AioContext *aio_context_new(void)
{
int ret;
AioContext *ctx;
ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
ret = event_notifier_init(&ctx->notifier, false);
if (ret < 0) {
g_source_destroy(&ctx->source);
error_setg_errno(errp, -ret, "Failed to initialize event notifier");
return NULL;
}
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
ctx->thread_pool = NULL;
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
event_notifier_init(&ctx->notifier, false);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
return ctx;

View File

@@ -1,7 +1,7 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o testdev.o
common-obj-y += msmouse.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)

View File

@@ -629,7 +629,7 @@ fail_handle:
static void register_types(void)
{
register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
register_char_driver_qapi("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
}
type_init(register_types);

View File

@@ -27,7 +27,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
backend->size);
g_free(path);
}

View File

@@ -257,6 +257,15 @@ static void host_memory_backend_init(Object *obj)
host_memory_backend_set_policy, NULL, NULL, NULL);
}
static void host_memory_backend_finalize(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (memory_region_size(&backend->mr)) {
memory_region_destroy(&backend->mr);
}
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
@@ -351,6 +360,7 @@ static const TypeInfo host_memory_backend_info = {
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.instance_finalize = host_memory_backend_finalize,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }

View File

@@ -79,7 +79,7 @@ CharDriverState *qemu_chr_open_msmouse(void)
static void register_types(void)
{
register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
register_char_driver_qapi("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
}
type_init(register_types);

View File

@@ -1,131 +0,0 @@
/*
* QEMU Char Device for testsuite control
*
* Copyright (c) 2014 Red Hat, Inc.
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
uint8_t c;
int arg;
#define EAT(c) do { \
if (!len--) { \
return 0; \
} \
c = *cur++; \
} while (0)
EAT(c);
while (isspace(c)) {
EAT(c);
}
arg = 0;
while (isdigit(c)) {
arg = arg * 10 + c - '0';
EAT(c);
}
while (isspace(c)) {
EAT(c);
}
switch (c) {
case 'q':
exit((arg << 1) | 1);
break;
default:
break;
}
return cur - testdev->in_buf;
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
/* Complete our buffer as much as possible */
tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
testdev->in_buf_used += tocopy;
buf += tocopy;
len -= tocopy;
/* Interpret it as much as possible */
while (testdev->in_buf_used > 0 &&
(eaten = testdev_eat_packet(testdev)) > 0) {
memmove(testdev->in_buf, testdev->in_buf + eaten,
testdev->in_buf_used - eaten);
testdev->in_buf_used -= eaten;
}
}
return orig_len;
}
static void testdev_close(struct CharDriverState *chr)
{
TestdevCharState *testdev = chr->opaque;
g_free(testdev);
}
CharDriverState *chr_testdev_init(void)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_malloc0(sizeof(TestdevCharState));
testdev->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
}
type_init(register_types);

View File

@@ -186,7 +186,7 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
if (sector < bdrv_nb_sectors(bmds->bs)) {
if ((sector << BDRV_SECTOR_BITS) < bdrv_getlength(bmds->bs)) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
@@ -223,7 +223,8 @@ static void alloc_aio_bitmap(BlkMigDevState *bmds)
BlockDriverState *bs = bmds->bs;
int64_t bitmap_size;
bitmap_size = bdrv_nb_sectors(bs) + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
@@ -283,7 +284,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
nr_sectors = total_sectors - cur_sector;
}
blk = g_new(BlkMigBlock, 1);
blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = cur_sector;
@@ -349,12 +350,12 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
int64_t sectors;
if (!bdrv_is_read_only(bs)) {
sectors = bdrv_nb_sectors(bs);
sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (sectors <= 0) {
return;
}
bmds = g_new0(BlkMigDevState, 1);
bmds = g_malloc0(sizeof(BlkMigDevState));
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
@@ -465,7 +466,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
blk = g_new(BlkMigBlock, 1);
blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = sector;
@@ -798,7 +799,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
if (bs != bs_prev) {
bs_prev = bs;
total_sectors = bdrv_nb_sectors(bs);
total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);

515
block.c
View File

@@ -29,7 +29,6 @@
#include "qemu/module.h"
#include "qapi/qmp/qjson.h"
#include "sysemu/sysemu.h"
#include "sysemu/blockdev.h" /* FIXME layering violation */
#include "qemu/notify.h"
#include "block/coroutine.h"
#include "block/qapi.h"
@@ -58,8 +57,6 @@ struct BdrvDirtyBitmap {
#define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
#define COROUTINE_POOL_RESERVATION 64 /* number of coroutines to reserve */
static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
@@ -341,24 +338,18 @@ BlockDriverState *bdrv_new(const char *device_name, Error **errp)
BlockDriverState *bs;
int i;
if (*device_name && !id_wellformed(device_name)) {
error_setg(errp, "Invalid device name");
return NULL;
}
if (bdrv_find(device_name)) {
error_setg(errp, "Device with id '%s' already exists",
device_name);
return NULL;
}
if (bdrv_find_node(device_name)) {
error_setg(errp,
"Device name '%s' conflicts with an existing node name",
error_setg(errp, "Device with node-name '%s' already exists",
device_name);
return NULL;
}
bs = g_new0(BlockDriverState, 1);
bs = g_malloc0(sizeof(BlockDriverState));
QLIST_INIT(&bs->dirty_bitmaps);
pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
if (device_name[0] != '\0') {
@@ -710,7 +701,6 @@ static int find_image_format(BlockDriverState *bs, const char *filename,
/**
* Set the current 'total_sectors' value
* Return 0 on success, -errno on error.
*/
static int refresh_total_sectors(BlockDriverState *bs, int64_t hint)
{
@@ -868,9 +858,9 @@ static void bdrv_assign_node_name(BlockDriverState *bs,
return;
}
/* Check for empty string or invalid characters */
if (!id_wellformed(node_name)) {
error_setg(errp, "Invalid node name");
/* empty string node name is invalid */
if (node_name[0] == '\0') {
error_setg(errp, "Empty node name");
return;
}
@@ -971,7 +961,6 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
} else {
bs->filename[0] = '\0';
}
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename), bs->filename);
bs->drv = drv;
bs->opaque = g_malloc0(drv->instance_size);
@@ -1324,6 +1313,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
error_setg_errno(errp, -total_size, "Could not get image size");
goto out;
}
total_size &= BDRV_SECTOR_MASK;
/* Create the temporary image */
ret = get_tmp_filename(tmp_filename, PATH_MAX + 1);
@@ -1513,8 +1503,6 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
}
}
bdrv_refresh_filename(bs);
/* For snapshot=on, create a temporary qcow2 overlay. bs points to the
* temporary snapshot afterwards. */
if (snapshot_flags) {
@@ -1826,8 +1814,6 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
void bdrv_close(BlockDriverState *bs)
{
BdrvAioNotifier *ban, *ban_next;
if (bs->job) {
block_job_cancel_sync(bs->job);
}
@@ -1857,8 +1843,6 @@ void bdrv_close(BlockDriverState *bs)
bs->zero_beyond_eof = false;
QDECREF(bs->options);
bs->options = NULL;
QDECREF(bs->full_open_options);
bs->full_open_options = NULL;
if (bs->file != NULL) {
bdrv_unref(bs->file);
@@ -1872,11 +1856,6 @@ void bdrv_close(BlockDriverState *bs)
if (bs->io_limits_enabled) {
bdrv_io_limits_disable(bs);
}
QLIST_FOREACH_SAFE(ban, &bs->aio_notifiers, list, ban_next) {
g_free(ban);
}
QLIST_INIT(&bs->aio_notifiers);
}
void bdrv_close_all(void)
@@ -2117,7 +2096,6 @@ static void bdrv_delete(BlockDriverState *bs)
/* remove from list, if necessary */
bdrv_make_anon(bs);
drive_info_del(drive_get_by_blockdev(bs));
g_free(bs);
}
@@ -2129,9 +2107,6 @@ int bdrv_attach_dev(BlockDriverState *bs, void *dev)
}
bs->dev = dev;
bdrv_iostatus_reset(bs);
/* We're expecting I/O from the device so bump up coroutine pool size */
qemu_coroutine_adjust_pool_size(COROUTINE_POOL_RESERVATION);
return 0;
}
@@ -2151,7 +2126,6 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev)
bs->dev_ops = NULL;
bs->dev_opaque = NULL;
bs->guest_block_size = 512;
qemu_coroutine_adjust_pool_size(-COROUTINE_POOL_RESERVATION);
}
/* TODO change to return DeviceState * when all users are qdevified */
@@ -2229,9 +2203,6 @@ bool bdrv_dev_is_medium_locked(BlockDriverState *bs)
*/
int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix)
{
if (bs->drv == NULL) {
return -ENOMEDIUM;
}
if (bs->drv->bdrv_check == NULL) {
return -ENOTSUP;
}
@@ -2254,7 +2225,7 @@ int bdrv_commit(BlockDriverState *bs)
if (!drv)
return -ENOMEDIUM;
if (!bs->backing_hd) {
return -ENOTSUP;
}
@@ -2298,14 +2269,7 @@ int bdrv_commit(BlockDriverState *bs)
}
total_sectors = length >> BDRV_SECTOR_BITS;
/* qemu_try_blockalign() for bs will choose an alignment that works for
* bs->backing_hd as well, so no need to compare the alignment manually. */
buf = qemu_try_blockalign(bs, COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE);
if (buf == NULL) {
ret = -ENOMEM;
goto ro_cleanup;
}
buf = g_malloc(COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE);
for (sector = 0; sector < total_sectors; sector += n) {
ret = bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, &n);
@@ -2343,7 +2307,7 @@ int bdrv_commit(BlockDriverState *bs)
ret = 0;
ro_cleanup:
qemu_vfree(buf);
g_free(buf);
if (ro) {
/* ignoring error return here */
@@ -2648,7 +2612,7 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
* into our deletion queue, until we hit the 'base'
*/
while (intermediate) {
intermediate_state = g_new0(BlkIntermediateStates, 1);
intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
intermediate_state->bs = intermediate;
QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
@@ -2863,16 +2827,18 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
*/
int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
{
int64_t target_sectors, ret, nb_sectors, sector_num = 0;
int64_t target_size;
int64_t ret, nb_sectors, sector_num = 0;
int n;
target_sectors = bdrv_nb_sectors(bs);
if (target_sectors < 0) {
return target_sectors;
target_size = bdrv_getlength(bs);
if (target_size < 0) {
return target_size;
}
target_size /= BDRV_SECTOR_SIZE;
for (;;) {
nb_sectors = target_sectors - sector_num;
nb_sectors = target_size - sector_num;
if (nb_sectors <= 0) {
return 0;
}
@@ -3002,12 +2968,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
cluster_sector_num, cluster_nb_sectors);
iov.iov_len = cluster_nb_sectors * BDRV_SECTOR_SIZE;
iov.iov_base = bounce_buffer = qemu_try_blockalign(bs, iov.iov_len);
if (bounce_buffer == NULL) {
ret = -ENOMEM;
goto err;
}
iov.iov_base = bounce_buffer = qemu_blockalign(bs, iov.iov_len);
qemu_iovec_init_external(&bounce_qiov, &iov, 1);
ret = drv->bdrv_co_readv(bs, cluster_sector_num, cluster_nb_sectors,
@@ -3095,14 +3056,15 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
ret = drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
} else {
/* Read zeros after EOF of growable BDSes */
int64_t total_sectors, max_nb_sectors;
int64_t len, total_sectors, max_nb_sectors;
total_sectors = bdrv_nb_sectors(bs);
if (total_sectors < 0) {
ret = total_sectors;
len = bdrv_getlength(bs);
if (len < 0) {
ret = len;
goto out;
}
total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);
max_nb_sectors = ROUND_UP(MAX(0, total_sectors - sector_num),
align >> BDRV_SECTOR_BITS);
if (max_nb_sectors > 0) {
@@ -3291,11 +3253,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
/* Fall back to bounce buffer if write zeroes is unsupported */
iov.iov_len = num * BDRV_SECTOR_SIZE;
if (iov.iov_base == NULL) {
iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
if (iov.iov_base == NULL) {
ret = -ENOMEM;
goto fail;
}
iov.iov_base = qemu_blockalign(bs, num * BDRV_SECTOR_SIZE);
memset(iov.iov_base, 0, num * BDRV_SECTOR_SIZE);
}
qemu_iovec_init_external(&qiov, &iov, 1);
@@ -3315,7 +3273,6 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
nb_sectors -= num;
}
fail:
qemu_vfree(iov.iov_base);
return ret;
}
@@ -3371,8 +3328,9 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
bdrv_set_dirty(bs, sector_num, nb_sectors);
block_acct_highest_sector(&bs->stats, sector_num, nb_sectors);
if (bs->wr_highest_sector < sector_num + nb_sectors - 1) {
bs->wr_highest_sector = sector_num + nb_sectors - 1;
}
if (bs->growable && ret >= 0) {
bs->total_sectors = MAX(bs->total_sectors, sector_num + nb_sectors);
}
@@ -3578,12 +3536,11 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
}
/**
* Return number of sectors on success, -errno on error.
* Length of a file in bytes. Return < 0 if error or unknown.
*/
int64_t bdrv_nb_sectors(BlockDriverState *bs)
int64_t bdrv_getlength(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
if (!drv)
return -ENOMEDIUM;
@@ -3593,26 +3550,19 @@ int64_t bdrv_nb_sectors(BlockDriverState *bs)
return ret;
}
}
return bs->total_sectors;
}
/**
* Return length in bytes on success, -errno on error.
* The length is always a multiple of BDRV_SECTOR_SIZE.
*/
int64_t bdrv_getlength(BlockDriverState *bs)
{
int64_t ret = bdrv_nb_sectors(bs);
return ret < 0 ? ret : ret * BDRV_SECTOR_SIZE;
return bs->total_sectors * BDRV_SECTOR_SIZE;
}
/* return 0 as number of sectors if no device present or error */
void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr)
{
int64_t nb_sectors = bdrv_nb_sectors(bs);
*nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
int64_t length;
length = bdrv_getlength(bs);
if (length < 0)
length = 0;
else
length = length >> BDRV_SECTOR_BITS;
*nb_sectors_ptr = length;
}
void bdrv_set_on_error(BlockDriverState *bs, BlockdevOnError on_read_error,
@@ -3646,19 +3596,6 @@ BlockErrorAction bdrv_get_error_action(BlockDriverState *bs, bool is_read, int e
}
}
static void send_qmp_error_event(BlockDriverState *bs,
BlockErrorAction action,
bool is_read, int error)
{
BlockErrorAction ac;
ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
qapi_event_send_block_io_error(bdrv_get_device_name(bs), ac, action,
bdrv_iostatus_is_enabled(bs),
error == ENOSPC, strerror(error),
&error_abort);
}
/* This is done by device models because, while the block layer knows
* about the error, it does not know whether an operation comes from
* the device or the block layer (from a job, for example).
@@ -3684,10 +3621,16 @@ void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
* also ensures that the STOP/RESUME pair of events is emitted.
*/
qemu_system_vmstop_request_prepare();
send_qmp_error_event(bs, action, is_read, error);
qapi_event_send_block_io_error(bdrv_get_device_name(bs),
is_read ? IO_OPERATION_TYPE_READ :
IO_OPERATION_TYPE_WRITE,
action, &error_abort);
qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
} else {
send_qmp_error_event(bs, action, is_read, error);
qapi_event_send_block_io_error(bdrv_get_device_name(bs),
is_read ? IO_OPERATION_TYPE_READ :
IO_OPERATION_TYPE_WRITE,
action, &error_abort);
}
}
@@ -3765,17 +3708,11 @@ const char *bdrv_get_format_name(BlockDriverState *bs)
return bs->drv ? bs->drv->format_name : NULL;
}
static int qsort_strcmp(const void *a, const void *b)
{
return strcmp(a, b);
}
void bdrv_iterate_format(void (*it)(void *opaque, const char *name),
void *opaque)
{
BlockDriver *drv;
int count = 0;
int i;
const char **formats = NULL;
QLIST_FOREACH(drv, &bdrv_drivers, list) {
@@ -3787,18 +3724,12 @@ void bdrv_iterate_format(void (*it)(void *opaque, const char *name),
}
if (!found) {
formats = g_renew(const char *, formats, count + 1);
formats = g_realloc(formats, (count + 1) * sizeof(char *));
formats[count++] = drv->format_name;
it(opaque, drv->format_name);
}
}
}
qsort(formats, count, sizeof(formats[0]), qsort_strcmp);
for (i = 0; i < count; i++) {
it(opaque, formats[i]);
}
g_free(formats);
}
@@ -4014,21 +3945,21 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
int64_t total_sectors;
int64_t length;
int64_t n;
int64_t ret, ret2;
total_sectors = bdrv_nb_sectors(bs);
if (total_sectors < 0) {
return total_sectors;
length = bdrv_getlength(bs);
if (length < 0) {
return length;
}
if (sector_num >= total_sectors) {
if (sector_num >= (length >> BDRV_SECTOR_BITS)) {
*pnum = 0;
return 0;
}
n = total_sectors - sector_num;
n = bs->total_sectors - sector_num;
if (n < nb_sectors) {
nb_sectors = n;
}
@@ -4063,8 +3994,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
ret |= BDRV_BLOCK_ZERO;
} else if (bs->backing_hd) {
BlockDriverState *bs2 = bs->backing_hd;
int64_t nb_sectors2 = bdrv_nb_sectors(bs2);
if (nb_sectors2 >= 0 && sector_num >= nb_sectors2) {
int64_t length2 = bdrv_getlength(bs2);
if (length2 >= 0 && sector_num >= (length2 >> BDRV_SECTOR_BITS)) {
ret |= BDRV_BLOCK_ZERO;
}
}
@@ -4567,12 +4498,6 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
// Add the second request
qemu_iovec_concat(qiov, reqs[i].qiov, 0, reqs[i].qiov->size);
// Add tail of first request, if necessary
if (qiov->size < reqs[outidx].qiov->size) {
qemu_iovec_concat(qiov, reqs[outidx].qiov, qiov->size,
reqs[outidx].qiov->size - qiov->size);
}
reqs[outidx].nb_sectors = qiov->size >> 9;
reqs[outidx].qiov = qiov;
@@ -4648,28 +4573,7 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
void bdrv_aio_cancel(BlockDriverAIOCB *acb)
{
qemu_aio_ref(acb);
bdrv_aio_cancel_async(acb);
while (acb->refcnt > 1) {
if (acb->aiocb_info->get_aio_context) {
aio_poll(acb->aiocb_info->get_aio_context(acb), true);
} else if (acb->bs) {
aio_poll(bdrv_get_aio_context(acb->bs), true);
} else {
abort();
}
}
qemu_aio_unref(acb);
}
/* Async version of aio cancel. The caller is not blocked if the acb implements
* cancel_async, otherwise we do nothing and let the request normally complete.
* In either case the completion callback must be called. */
void bdrv_aio_cancel_async(BlockDriverAIOCB *acb)
{
if (acb->aiocb_info->cancel_async) {
acb->aiocb_info->cancel_async(acb);
}
acb->aiocb_info->cancel(acb);
}
/**************************************************************/
@@ -4685,22 +4589,31 @@ typedef struct BlockDriverAIOCBSync {
int is_write;
} BlockDriverAIOCBSync;
static void bdrv_aio_cancel_em(BlockDriverAIOCB *blockacb)
{
BlockDriverAIOCBSync *acb =
container_of(blockacb, BlockDriverAIOCBSync, common);
qemu_bh_delete(acb->bh);
acb->bh = NULL;
qemu_aio_release(acb);
}
static const AIOCBInfo bdrv_em_aiocb_info = {
.aiocb_size = sizeof(BlockDriverAIOCBSync),
.cancel = bdrv_aio_cancel_em,
};
static void bdrv_aio_bh_cb(void *opaque)
{
BlockDriverAIOCBSync *acb = opaque;
if (!acb->is_write && acb->ret >= 0) {
if (!acb->is_write)
qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
}
qemu_vfree(acb->bounce);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_bh_delete(acb->bh);
acb->bh = NULL;
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static BlockDriverAIOCB *bdrv_aio_rw_vector(BlockDriverState *bs,
@@ -4717,12 +4630,10 @@ static BlockDriverAIOCB *bdrv_aio_rw_vector(BlockDriverState *bs,
acb = qemu_aio_get(&bdrv_em_aiocb_info, bs, cb, opaque);
acb->is_write = is_write;
acb->qiov = qiov;
acb->bounce = qemu_try_blockalign(bs, qiov->size);
acb->bounce = qemu_blockalign(bs, qiov->size);
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_aio_bh_cb, acb);
if (acb->bounce == NULL) {
acb->ret = -ENOMEM;
} else if (is_write) {
if (is_write) {
qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
acb->ret = bs->drv->bdrv_write(bs, sector_num, acb->bounce, nb_sectors);
} else {
@@ -4757,8 +4668,22 @@ typedef struct BlockDriverAIOCBCoroutine {
QEMUBH* bh;
} BlockDriverAIOCBCoroutine;
static void bdrv_aio_co_cancel_em(BlockDriverAIOCB *blockacb)
{
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
BlockDriverAIOCBCoroutine *acb =
container_of(blockacb, BlockDriverAIOCBCoroutine, common);
bool done = false;
acb->done = &done;
while (!done) {
aio_poll(aio_context, true);
}
}
static const AIOCBInfo bdrv_em_co_aiocb_info = {
.aiocb_size = sizeof(BlockDriverAIOCBCoroutine),
.cancel = bdrv_aio_co_cancel_em,
};
static void bdrv_co_em_bh(void *opaque)
@@ -4767,8 +4692,12 @@ static void bdrv_co_em_bh(void *opaque)
acb->common.cb(acb->common.opaque, acb->req.error);
if (acb->done) {
*acb->done = true;
}
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
/* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
@@ -4807,6 +4736,7 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
acb->req.qiov = qiov;
acb->req.flags = flags;
acb->is_write = is_write;
acb->done = NULL;
co = qemu_coroutine_create(bdrv_co_do_rw);
qemu_coroutine_enter(co, acb);
@@ -4833,6 +4763,7 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
BlockDriverAIOCBCoroutine *acb;
acb = qemu_aio_get(&bdrv_em_co_aiocb_info, bs, cb, opaque);
acb->done = NULL;
co = qemu_coroutine_create(bdrv_aio_flush_co_entry);
qemu_coroutine_enter(co, acb);
@@ -4862,6 +4793,7 @@ BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
acb = qemu_aio_get(&bdrv_em_co_aiocb_info, bs, cb, opaque);
acb->req.sector = sector_num;
acb->req.nb_sectors = nb_sectors;
acb->done = NULL;
co = qemu_coroutine_create(bdrv_aio_discard_co_entry);
qemu_coroutine_enter(co, acb);
@@ -4889,23 +4821,13 @@ void *qemu_aio_get(const AIOCBInfo *aiocb_info, BlockDriverState *bs,
acb->bs = bs;
acb->cb = cb;
acb->opaque = opaque;
acb->refcnt = 1;
return acb;
}
void qemu_aio_ref(void *p)
void qemu_aio_release(void *p)
{
BlockDriverAIOCB *acb = p;
acb->refcnt++;
}
void qemu_aio_unref(void *p)
{
BlockDriverAIOCB *acb = p;
assert(acb->refcnt > 0);
if (--acb->refcnt == 0) {
g_slice_free1(acb->aiocb_info->aiocb_size, acb);
}
g_slice_free1(acb->aiocb_info->aiocb_size, acb);
}
/**************************************************************/
@@ -5325,19 +5247,6 @@ void *qemu_blockalign(BlockDriverState *bs, size_t size)
return qemu_memalign(bdrv_opt_mem_align(bs), size);
}
void *qemu_try_blockalign(BlockDriverState *bs, size_t size)
{
size_t align = bdrv_opt_mem_align(bs);
/* Ensure that NULL is never returned on success */
assert(align > 0);
if (size == 0) {
size = align;
}
return qemu_try_memalign(align, size);
}
/*
* Check if all memory in this vector is sector aligned.
*/
@@ -5368,13 +5277,14 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
granularity >>= BDRV_SECTOR_BITS;
assert(granularity);
bitmap_size = bdrv_nb_sectors(bs);
bitmap_size = bdrv_getlength(bs);
if (bitmap_size < 0) {
error_setg_errno(errp, -bitmap_size, "could not get length of device");
errno = -bitmap_size;
return NULL;
}
bitmap = g_new0(BdrvDirtyBitmap, 1);
bitmap_size >>= BDRV_SECTOR_BITS;
bitmap = g_malloc0(sizeof(BdrvDirtyBitmap));
bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
return bitmap;
@@ -5400,8 +5310,8 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
BlockDirtyInfoList **plist = &list;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
BlockDirtyInfo *info = g_malloc0(sizeof(BlockDirtyInfo));
BlockDirtyInfoList *entry = g_malloc0(sizeof(BlockDirtyInfoList));
info->count = bdrv_get_dirty_count(bs, bm);
info->granularity =
((int64_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bm->bitmap));
@@ -5461,9 +5371,6 @@ void bdrv_ref(BlockDriverState *bs)
* deleted. */
void bdrv_unref(BlockDriverState *bs)
{
if (!bs) {
return;
}
assert(bs->refcnt > 0);
if (--bs->refcnt == 0) {
bdrv_delete(bs);
@@ -5495,7 +5402,7 @@ void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
BdrvOpBlocker *blocker;
assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
blocker = g_new0(BdrvOpBlocker, 1);
blocker = g_malloc0(sizeof(BdrvOpBlocker));
blocker->reason = reason;
QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
}
@@ -5580,6 +5487,27 @@ void bdrv_iostatus_set_err(BlockDriverState *bs, int error)
}
}
void
bdrv_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie, int64_t bytes,
enum BlockAcctType type)
{
assert(type < BDRV_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = get_clock();
cookie->type = type;
}
void
bdrv_acct_done(BlockDriverState *bs, BlockAcctCookie *cookie)
{
assert(cookie->type < BDRV_MAX_IOTYPE);
bs->nr_bytes[cookie->type] += cookie->bytes;
bs->nr_ops[cookie->type]++;
bs->total_time_ns[cookie->type] += get_clock() - cookie->start_time_ns;
}
void bdrv_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags,
@@ -5663,7 +5591,7 @@ void bdrv_img_create(const char *filename, const char *fmt,
if (size == -1) {
if (backing_file) {
BlockDriverState *bs;
int64_t size;
uint64_t size;
int back_flags;
/* backing files always opened read-only */
@@ -5681,13 +5609,8 @@ void bdrv_img_create(const char *filename, const char *fmt,
local_err = NULL;
goto out;
}
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Could not get size of '%s'",
backing_file);
bdrv_unref(bs);
goto out;
}
bdrv_get_geometry(bs, &size);
size *= 512;
qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size);
@@ -5735,16 +5658,10 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
void bdrv_detach_aio_context(BlockDriverState *bs)
{
BdrvAioNotifier *baf;
if (!bs->drv) {
return;
}
QLIST_FOREACH(baf, &bs->aio_notifiers, list) {
baf->detach_aio_context(baf->opaque);
}
if (bs->io_limits_enabled) {
throttle_detach_aio_context(&bs->throttle_state);
}
@@ -5764,8 +5681,6 @@ void bdrv_detach_aio_context(BlockDriverState *bs)
void bdrv_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BdrvAioNotifier *ban;
if (!bs->drv) {
return;
}
@@ -5784,10 +5699,6 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
if (bs->io_limits_enabled) {
throttle_attach_aio_context(&bs->throttle_state, new_context);
}
QLIST_FOREACH(ban, &bs->aio_notifiers, list) {
ban->attached_aio_context(new_context, ban->opaque);
}
}
void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
@@ -5804,43 +5715,6 @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
aio_context_release(new_context);
}
void bdrv_add_aio_context_notifier(BlockDriverState *bs,
void (*attached_aio_context)(AioContext *new_context, void *opaque),
void (*detach_aio_context)(void *opaque), void *opaque)
{
BdrvAioNotifier *ban = g_new(BdrvAioNotifier, 1);
*ban = (BdrvAioNotifier){
.attached_aio_context = attached_aio_context,
.detach_aio_context = detach_aio_context,
.opaque = opaque
};
QLIST_INSERT_HEAD(&bs->aio_notifiers, ban, list);
}
void bdrv_remove_aio_context_notifier(BlockDriverState *bs,
void (*attached_aio_context)(AioContext *,
void *),
void (*detach_aio_context)(void *),
void *opaque)
{
BdrvAioNotifier *ban, *ban_next;
QLIST_FOREACH_SAFE(ban, &bs->aio_notifiers, list, ban_next) {
if (ban->attached_aio_context == attached_aio_context &&
ban->detach_aio_context == detach_aio_context &&
ban->opaque == opaque)
{
QLIST_REMOVE(ban, list);
g_free(ban);
return;
}
}
abort();
}
void bdrv_add_before_write_notifier(BlockDriverState *bs,
NotifierWithReturn *notifier)
{
@@ -5966,144 +5840,3 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
bdrv_flush_io_queue(bs->file);
}
}
static bool append_open_options(QDict *d, BlockDriverState *bs)
{
const QDictEntry *entry;
bool found_any = false;
for (entry = qdict_first(bs->options); entry;
entry = qdict_next(bs->options, entry))
{
/* Only take options for this level and exclude all non-driver-specific
* options */
if (!strchr(qdict_entry_key(entry), '.') &&
strcmp(qdict_entry_key(entry), "node-name"))
{
qobject_incref(qdict_entry_value(entry));
qdict_put_obj(d, qdict_entry_key(entry), qdict_entry_value(entry));
found_any = true;
}
}
return found_any;
}
/* Updates the following BDS fields:
* - exact_filename: A filename which may be used for opening a block device
* which (mostly) equals the given BDS (even without any
* other options; so reading and writing must return the same
* results, but caching etc. may be different)
* - full_open_options: Options which, when given when opening a block device
* (without a filename), result in a BDS (mostly)
* equalling the given one
* - filename: If exact_filename is set, it is copied here. Otherwise,
* full_open_options is converted to a JSON object, prefixed with
* "json:" (for use through the JSON pseudo protocol) and put here.
*/
void bdrv_refresh_filename(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
QDict *opts;
if (!drv) {
return;
}
/* This BDS's file name will most probably depend on its file's name, so
* refresh that first */
if (bs->file) {
bdrv_refresh_filename(bs->file);
}
if (drv->bdrv_refresh_filename) {
/* Obsolete information is of no use here, so drop the old file name
* information before refreshing it */
bs->exact_filename[0] = '\0';
if (bs->full_open_options) {
QDECREF(bs->full_open_options);
bs->full_open_options = NULL;
}
drv->bdrv_refresh_filename(bs);
} else if (bs->file) {
/* Try to reconstruct valid information from the underlying file */
bool has_open_options;
bs->exact_filename[0] = '\0';
if (bs->full_open_options) {
QDECREF(bs->full_open_options);
bs->full_open_options = NULL;
}
opts = qdict_new();
has_open_options = append_open_options(opts, bs);
/* If no specific options have been given for this BDS, the filename of
* the underlying file should suffice for this one as well */
if (bs->file->exact_filename[0] && !has_open_options) {
strcpy(bs->exact_filename, bs->file->exact_filename);
}
/* Reconstructing the full options QDict is simple for most format block
* drivers, as long as the full options are known for the underlying
* file BDS. The full options QDict of that file BDS should somehow
* contain a representation of the filename, therefore the following
* suffices without querying the (exact_)filename of this BDS. */
if (bs->file->full_open_options) {
qdict_put_obj(opts, "driver",
QOBJECT(qstring_from_str(drv->format_name)));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "file", QOBJECT(bs->file->full_open_options));
bs->full_open_options = opts;
} else {
QDECREF(opts);
}
} else if (!bs->full_open_options && qdict_size(bs->options)) {
/* There is no underlying file BDS (at least referenced by BDS.file),
* so the full options QDict should be equal to the options given
* specifically for this block device when it was opened (plus the
* driver specification).
* Because those options don't change, there is no need to update
* full_open_options when it's already set. */
opts = qdict_new();
append_open_options(opts, bs);
qdict_put_obj(opts, "driver",
QOBJECT(qstring_from_str(drv->format_name)));
if (bs->exact_filename[0]) {
/* This may not work for all block protocol drivers (some may
* require this filename to be parsed), but we have to find some
* default solution here, so just include it. If some block driver
* does not support pure options without any filename at all or
* needs some special format of the options QDict, it needs to
* implement the driver-specific bdrv_refresh_filename() function.
*/
qdict_put_obj(opts, "filename",
QOBJECT(qstring_from_str(bs->exact_filename)));
}
bs->full_open_options = opts;
}
if (bs->exact_filename[0]) {
pstrcpy(bs->filename, sizeof(bs->filename), bs->exact_filename);
} else if (bs->full_open_options) {
QString *json = qobject_to_json(QOBJECT(bs->full_open_options));
snprintf(bs->filename, sizeof(bs->filename), "json:%s",
qstring_get_str(json));
QDECREF(json);
}
}
/* This accessor function purpose is to allow the device models to access the
* BlockAcctStats structure embedded inside a BlockDriverState without being
* aware of the BlockDriverState structure layout.
* It will go away when the BlockAcctStats structure will be moved inside
* the device models.
*/
BlockAcctStats *bdrv_get_stats(BlockDriverState *bs)
{
return &bs->stats;
}

View File

@@ -1,4 +1,4 @@
block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += raw_bsd.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
@@ -9,17 +9,16 @@ block-obj-y += snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o
ifeq ($(CONFIG_POSIX),y)
block-obj-y += nbd.o nbd-client.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o
endif
common-obj-y += stream.o
common-obj-y += commit.o
@@ -36,6 +35,5 @@ gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio

View File

@@ -1,54 +0,0 @@
/*
* QEMU System Emulator block accounting
*
* Copyright (c) 2011 Christoph Hellwig
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "block/accounting.h"
#include "block/block_int.h"
void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = get_clock();
cookie->type = type;
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] += get_clock() - cookie->start_time_ns;
}
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors)
{
if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -26,10 +26,6 @@
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
typedef struct BDRVBlkdebugState {
int state;
@@ -52,8 +48,11 @@ typedef struct BlkdebugSuspendedReq {
QLIST_ENTRY(BlkdebugSuspendedReq) next;
} BlkdebugSuspendedReq;
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb);
static const AIOCBInfo blkdebug_aiocb_info = {
.aiocb_size = sizeof(BlkdebugAIOCB),
.aiocb_size = sizeof(BlkdebugAIOCB),
.cancel = blkdebug_aio_cancel,
};
enum {
@@ -214,7 +213,6 @@ static int get_event_by_name(const char *name, BlkDebugEvent *event)
struct add_rule_data {
BDRVBlkdebugState *s;
int action;
Error **errp;
};
static int add_rule(QemuOpts *opts, void *opaque)
@@ -227,11 +225,7 @@ static int add_rule(QemuOpts *opts, void *opaque)
/* Find the right event for the rule */
event_name = qemu_opt_get(opts, "event");
if (!event_name) {
error_setg(d->errp, "Missing event name for rule");
return -1;
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(d->errp, "Invalid event name \"%s\"", event_name);
if (!event_name || get_event_by_name(event_name, &event) < 0) {
return -1;
}
@@ -317,21 +311,10 @@ static int read_config(BDRVBlkdebugState *s, const char *filename,
d.s = s;
d.action = ACTION_INJECT_ERROR;
d.errp = &local_err;
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 0);
d.action = ACTION_SET_STATE;
qemu_opts_foreach(&set_state_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&set_state_opts, add_rule, &d, 0);
ret = 0;
fail:
@@ -460,7 +443,17 @@ static void error_callback_bh(void *opaque)
struct BlkdebugAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
if (acb->bh) {
qemu_bh_delete(acb->bh);
acb->bh = NULL;
}
qemu_aio_release(acb);
}
static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
@@ -533,25 +526,6 @@ static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockDriverAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file, cb, opaque);
}
static void blkdebug_close(BlockDriverState *bs)
{
@@ -717,98 +691,6 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file);
}
static void blkdebug_refresh_filename(BlockDriverState *bs)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
QDict *opts;
QList *inject_error_list = NULL, *set_state_list = NULL;
QList *suspend_list = NULL;
int event;
if (!bs->file->full_open_options) {
/* The config file cannot be recreated, so creating a plain filename
* is impossible */
return;
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));
for (event = 0; event < BLKDBG_EVENT_MAX; event++) {
QLIST_FOREACH(rule, &s->rules[event], next) {
if (rule->action == ACTION_INJECT_ERROR) {
QDict *inject_error = qdict_new();
qdict_put_obj(inject_error, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(inject_error, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(inject_error, "errno", QOBJECT(qint_from_int(
rule->options.inject.error)));
qdict_put_obj(inject_error, "sector", QOBJECT(qint_from_int(
rule->options.inject.sector)));
qdict_put_obj(inject_error, "once", QOBJECT(qbool_from_int(
rule->options.inject.once)));
qdict_put_obj(inject_error, "immediately",
QOBJECT(qbool_from_int(
rule->options.inject.immediately)));
if (!inject_error_list) {
inject_error_list = qlist_new();
}
qlist_append_obj(inject_error_list, QOBJECT(inject_error));
} else if (rule->action == ACTION_SET_STATE) {
QDict *set_state = qdict_new();
qdict_put_obj(set_state, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(set_state, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(set_state, "new_state", QOBJECT(qint_from_int(
rule->options.set_state.new_state)));
if (!set_state_list) {
set_state_list = qlist_new();
}
qlist_append_obj(set_state_list, QOBJECT(set_state));
} else if (rule->action == ACTION_SUSPEND) {
QDict *suspend = qdict_new();
qdict_put_obj(suspend, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(suspend, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(suspend, "tag", QOBJECT(qstring_from_str(
rule->options.suspend.tag)));
if (!suspend_list) {
suspend_list = qlist_new();
}
qlist_append_obj(suspend_list, QOBJECT(suspend));
}
}
}
if (inject_error_list) {
qdict_put_obj(opts, "inject-error", QOBJECT(inject_error_list));
}
if (set_state_list) {
qdict_put_obj(opts, "set-state", QOBJECT(set_state_list));
}
if (suspend_list) {
qdict_put_obj(opts, "suspend", QOBJECT(suspend_list));
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
@@ -818,11 +700,9 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,

View File

@@ -10,8 +10,6 @@
#include <stdarg.h>
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
typedef struct {
BlockDriverState *test_file;
@@ -29,6 +27,7 @@ struct BlkverifyAIOCB {
int ret; /* first completed request's result */
unsigned int done; /* completion counter */
bool *finished; /* completion signal for cancel */
QEMUIOVector *qiov; /* user I/O vector */
QEMUIOVector raw_qiov; /* cloned I/O vector for raw file */
@@ -37,8 +36,22 @@ struct BlkverifyAIOCB {
void (*verify)(BlkverifyAIOCB *acb);
};
static void blkverify_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkverifyAIOCB *acb = (BlkverifyAIOCB *)blockacb;
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
bool finished = false;
/* Wait until request completes, invokes its callback, and frees itself */
acb->finished = &finished;
while (!finished) {
aio_poll(aio_context, true);
}
}
static const AIOCBInfo blkverify_aiocb_info = {
.aiocb_size = sizeof(BlkverifyAIOCB),
.cancel = blkverify_aio_cancel,
};
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
@@ -143,7 +156,6 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
}
@@ -179,6 +191,7 @@ static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
acb->qiov = qiov;
acb->buf = NULL;
acb->verify = NULL;
acb->finished = NULL;
return acb;
}
@@ -192,7 +205,10 @@ static void blkverify_aio_bh(void *opaque)
qemu_vfree(acb->buf);
}
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
}
static void blkverify_aio_cb(void *opaque, int ret)
@@ -304,32 +320,6 @@ static void blkverify_attach_aio_context(BlockDriverState *bs,
bdrv_attach_aio_context(s->test_file, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
/* bs->file has already been refreshed */
bdrv_refresh_filename(s->test_file);
if (bs->file->full_open_options && s->test_file->full_open_options) {
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
QINCREF(s->test_file->full_open_options);
qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
bs->full_open_options = opts;
}
if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->exact_filename, s->test_file->exact_filename);
}
}
static BlockDriver bdrv_blkverify = {
.format_name = "blkverify",
.protocol_name = "blkverify",
@@ -339,7 +329,6 @@ static BlockDriver bdrv_blkverify = {
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,

View File

@@ -131,11 +131,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EFBIG;
}
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
error_setg(errp, "Could not allocate memory for catalog");
return -ENOMEM;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);

View File

@@ -116,12 +116,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
"try increasing block size");
return -EINVAL;
}
s->offsets = g_try_malloc(offsets_size);
if (s->offsets == NULL) {
error_setg(errp, "Could not allocate offsets table");
return -ENOMEM;
}
s->offsets = g_malloc(offsets_size);
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
@@ -163,20 +158,8 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_block = g_try_malloc(max_compressed_block_size + 1);
if (s->compressed_block == NULL) {
error_setg(errp, "Could not allocate compressed_block");
ret = -ENOMEM;
goto fail;
}
s->uncompressed_block = g_try_malloc(s->block_size);
if (s->uncompressed_block == NULL) {
error_setg(errp, "Could not allocate uncompressed_block");
ret = -ENOMEM;
goto fail;
}
s->compressed_block = g_malloc(max_compressed_block_size + 1);
s->uncompressed_block = g_malloc(s->block_size);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;

432
block/cow.c Normal file
View File

@@ -0,0 +1,432 @@
/*
* Block driver for the COW format
*
* Copyright (c) 2004 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
/**************************************************************/
/* COW block driver using file system holes */
/* user mode linux compatible COW file */
#define COW_MAGIC 0x4f4f4f4d /* MOOO */
#define COW_VERSION 2
struct cow_header_v2 {
uint32_t magic;
uint32_t version;
char backing_file[1024];
int32_t mtime;
uint64_t size;
uint32_t sectorsize;
};
typedef struct BDRVCowState {
CoMutex lock;
int64_t cow_sectors_offset;
} BDRVCowState;
static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const struct cow_header_v2 *cow_header = (const void *)buf;
if (buf_size >= sizeof(struct cow_header_v2) &&
be32_to_cpu(cow_header->magic) == COW_MAGIC &&
be32_to_cpu(cow_header->version) == COW_VERSION)
return 100;
else
return 0;
}
static int cow_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVCowState *s = bs->opaque;
struct cow_header_v2 cow_header;
int bitmap_size;
int64_t size;
int ret;
/* see if it is a cow image */
ret = bdrv_pread(bs->file, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto fail;
}
if (be32_to_cpu(cow_header.magic) != COW_MAGIC) {
error_setg(errp, "Image not in COW format");
ret = -EINVAL;
goto fail;
}
if (be32_to_cpu(cow_header.version) != COW_VERSION) {
char version[64];
snprintf(version, sizeof(version),
"COW version %" PRIu32, cow_header.version);
error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "cow", version);
ret = -ENOTSUP;
goto fail;
}
/* cow image found */
size = be64_to_cpu(cow_header.size);
bs->total_sectors = size / 512;
pstrcpy(bs->backing_file, sizeof(bs->backing_file),
cow_header.backing_file);
bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header);
s->cow_sectors_offset = (bitmap_size + 511) & ~511;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
return ret;
}
static inline void cow_set_bits(uint8_t *bitmap, int start, int64_t nb_sectors)
{
int64_t bitnum = start, last = start + nb_sectors;
while (bitnum < last) {
if ((bitnum & 7) == 0 && bitnum + 8 <= last) {
bitmap[bitnum / 8] = 0xFF;
bitnum += 8;
continue;
}
bitmap[bitnum/8] |= (1 << (bitnum % 8));
bitnum++;
}
}
#define BITS_PER_BITMAP_SECTOR (512 * 8)
/* Cannot use bitmap.c on big-endian machines. */
static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
{
return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
}
static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
{
int streak_value = value ? 0xFF : 0;
int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
int bitnum = start;
while (bitnum < last) {
if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
bitnum += 8;
continue;
}
if (cow_test_bit(bitnum, bitmap) == value) {
bitnum++;
continue;
}
break;
}
return MIN(bitnum, last) - start;
}
/* Return true if first block has been changed (ie. current version is
* in COW file). Set the number of continuous blocks for which that
* is true. */
static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
bool first = true;
int changed = 0, same = 0;
do {
int ret;
uint8_t bitmap[BDRV_SECTOR_SIZE];
bitnum &= BITS_PER_BITMAP_SECTOR - 1;
int sector_bits = MIN(nb_sectors, BITS_PER_BITMAP_SECTOR - bitnum);
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
if (first) {
changed = cow_test_bit(bitnum, bitmap);
first = false;
}
same += cow_find_streak(bitmap, changed, bitnum, nb_sectors);
bitnum += sector_bits;
nb_sectors -= sector_bits;
offset += BDRV_SECTOR_SIZE;
} while (nb_sectors);
*num_same = same;
return changed;
}
static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
BDRVCowState *s = bs->opaque;
int ret = cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
int64_t offset = s->cow_sectors_offset + (sector_num << BDRV_SECTOR_BITS);
if (ret < 0) {
return ret;
}
return (ret ? BDRV_BLOCK_DATA : 0) | offset | BDRV_BLOCK_OFFSET_VALID;
}
static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
bool first = true;
int sector_bits;
for ( ; nb_sectors;
bitnum += sector_bits,
nb_sectors -= sector_bits,
offset += BDRV_SECTOR_SIZE) {
int ret, set;
uint8_t bitmap[BDRV_SECTOR_SIZE];
bitnum &= BITS_PER_BITMAP_SECTOR - 1;
sector_bits = MIN(nb_sectors, BITS_PER_BITMAP_SECTOR - bitnum);
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
/* Skip over any already set bits */
set = cow_find_streak(bitmap, 1, bitnum, sector_bits);
bitnum += set;
sector_bits -= set;
nb_sectors -= set;
if (!sector_bits) {
continue;
}
if (first) {
ret = bdrv_flush(bs->file);
if (ret < 0) {
return ret;
}
first = false;
}
cow_set_bits(bitmap, bitnum, sector_bits);
ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
}
return 0;
}
static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret, n;
while (nb_sectors > 0) {
ret = cow_co_is_allocated(bs, sector_num, nb_sectors, &n);
if (ret < 0) {
return ret;
}
if (ret) {
ret = bdrv_pread(bs->file,
s->cow_sectors_offset + sector_num * 512,
buf, n * 512);
if (ret < 0) {
return ret;
}
} else {
if (bs->backing_hd) {
/* read from the base image */
ret = bdrv_read(bs->backing_hd, sector_num, buf, n);
if (ret < 0) {
return ret;
}
} else {
memset(buf, 0, n * 512);
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
}
return 0;
}
static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int cow_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret;
ret = bdrv_pwrite(bs->file, s->cow_sectors_offset + sector_num * 512,
buf, nb_sectors * 512);
if (ret < 0) {
return ret;
}
return cow_update_bitmap(bs, sector_num, nb_sectors);
}
static coroutine_fn int cow_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_write(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QemuOpts *opts, Error **errp)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
char *image_filename = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *cow_bs = NULL;
/* Read out options */
image_sectors = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
image_filename = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
ret = bdrv_open(&cow_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
memset(&cow_header, 0, sizeof(cow_header));
cow_header.magic = cpu_to_be32(COW_MAGIC);
cow_header.version = cpu_to_be32(COW_VERSION);
if (image_filename) {
/* Note: if no file, we put a dummy mtime */
cow_header.mtime = cpu_to_be32(0);
if (stat(image_filename, &st) != 0) {
goto mtime_fail;
}
cow_header.mtime = cpu_to_be32(st.st_mtime);
mtime_fail:
pstrcpy(cow_header.backing_file, sizeof(cow_header.backing_file),
image_filename);
}
cow_header.sectorsize = cpu_to_be32(512);
cow_header.size = cpu_to_be64(image_sectors * 512);
ret = bdrv_pwrite(cow_bs, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto exit;
}
/* resize to include at least all the bitmap */
ret = bdrv_truncate(cow_bs,
sizeof(cow_header) + ((image_sectors + 7) >> 3));
if (ret < 0) {
goto exit;
}
exit:
g_free(image_filename);
if (cow_bs) {
bdrv_unref(cow_bs);
}
return ret;
}
static QemuOptsList cow_create_opts = {
.name = "cow-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(cow_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_cow = {
.format_name = "cow",
.instance_size = sizeof(BDRVCowState),
.bdrv_probe = cow_probe,
.bdrv_open = cow_open,
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_get_block_status = cow_co_get_block_status,
.create_opts = &cow_create_opts,
};
static void bdrv_cow_init(void)
{
bdrv_register(&bdrv_cow);
}
block_init(bdrv_cow_init);

View File

@@ -26,7 +26,7 @@
#include "qapi/qmp/qbool.h"
#include <curl/curl.h>
// #define DEBUG_CURL
// #define DEBUG
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
@@ -63,7 +63,6 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_NUM_ACB 8
#define SECTOR_SIZE 512
#define READ_AHEAD_DEFAULT (256 * 1024)
#define CURL_TIMEOUT_DEFAULT 5
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
@@ -72,8 +71,6 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
struct BDRVCURLState;
@@ -112,8 +109,6 @@ typedef struct BDRVCURLState {
char *url;
size_t readahead_size;
bool sslverify;
int timeout;
char *cookie;
bool accept_range;
AioContext *aio_context;
} BDRVCURLState;
@@ -212,7 +207,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
acb->common.cb(acb->common.opaque, 0);
qemu_aio_unref(acb);
qemu_aio_release(acb);
s->acb[i] = NULL;
}
}
@@ -304,7 +299,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
qemu_aio_release(acb);
state->acb[i] = NULL;
}
}
@@ -357,7 +352,7 @@ static void curl_multi_timeout_do(void *arg)
#endif
}
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
static CURLState *curl_init_state(BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
@@ -375,7 +370,7 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
break;
}
if (!state) {
aio_poll(bdrv_get_aio_context(bs), true);
aio_poll(state->s->aio_context, true);
}
} while(!state);
@@ -387,10 +382,7 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
(long) s->sslverify);
if (s->cookie) {
curl_easy_setopt(state->curl, CURLOPT_COOKIE, s->cookie);
}
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, s->timeout);
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, 5);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION,
(void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
@@ -497,16 +489,6 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Verify SSL certificate"
},
{
.name = CURL_BLOCK_OPT_TIMEOUT,
.type = QEMU_OPT_NUMBER,
.help = "Curl timeout"
},
{
.name = CURL_BLOCK_OPT_COOKIE,
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{ /* end of list */ }
},
};
@@ -519,7 +501,6 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
QemuOpts *opts;
Error *local_err = NULL;
const char *file;
const char *cookie;
double d;
static int inited = 0;
@@ -544,14 +525,8 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
CURL_TIMEOUT_DEFAULT);
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
error_setg(errp, "curl block driver requires an 'url' option");
@@ -566,7 +541,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
DPRINTF("CURL: Opening %s\n", file);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(bs, s);
state = curl_init_state(s);
if (!state)
goto out_noclean;
@@ -607,14 +582,19 @@ out:
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
return -EINVAL;
}
static void curl_aio_cancel(BlockDriverAIOCB *blockacb)
{
// Do we have to implement canceling? Seems to work without...
}
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
.cancel = curl_aio_cancel,
};
@@ -636,7 +616,7 @@ static void curl_readv_bh_cb(void *p)
// we can just call the callback and be done.
switch (curl_find_buf(s, start, acb->nb_sectors * SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_unref(acb);
qemu_aio_release(acb);
// fall through
case FIND_RET_WAIT:
return;
@@ -645,10 +625,10 @@ static void curl_readv_bh_cb(void *p)
}
// No cache found, so let's start a new request
state = curl_init_state(acb->common.bs, s);
state = curl_init_state(s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return;
}
@@ -660,13 +640,7 @@ static void curl_readv_bh_cb(void *p)
state->buf_start = start;
state->buf_len = acb->end + s->readahead_size;
end = MIN(start + state->buf_len, s->len) - 1;
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_unref(acb);
return;
}
state->orig_buf = g_malloc(state->buf_len);
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
@@ -704,7 +678,6 @@ static void curl_close(BlockDriverState *bs)
DPRINTF("CURL: Close\n");
curl_detach_aio_context(bs);
g_free(s->cookie);
g_free(s->url);
}

View File

@@ -284,15 +284,8 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_chunk = qemu_try_blockalign(bs->file,
max_compressed_size + 1);
s->uncompressed_chunk = qemu_try_blockalign(bs->file,
512 * max_sectors_per_chunk);
if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
ret = -ENOMEM;
goto fail;
}
s->compressed_chunk = g_malloc(max_compressed_size + 1);
s->uncompressed_chunk = g_malloc(512 * max_sectors_per_chunk);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
@@ -309,8 +302,8 @@ fail:
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
return ret;
}
@@ -433,8 +426,8 @@ static void dmg_close(BlockDriverState *bs)
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
inflateEnd(&s->zstream);
}

View File

@@ -291,7 +291,7 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
BDRVGlusterState *s = bs->opaque;
int open_flags = 0;
int ret = 0;
GlusterConf *gconf = g_new0(GlusterConf, 1);
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
@@ -351,12 +351,12 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
assert(state != NULL);
assert(state->bs != NULL);
state->opaque = g_new0(BDRVGlusterReopenState, 1);
state->opaque = g_malloc0(sizeof(BDRVGlusterReopenState));
reop_s = state->opaque;
qemu_gluster_parse_flags(state->flags, &open_flags);
gconf = g_new0(GlusterConf, 1);
gconf = g_malloc0(sizeof(GlusterConf));
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
@@ -486,7 +486,7 @@ static int qemu_gluster_create(const char *filename,
int prealloc = 0;
int64_t total_size = 0;
char *tmp = NULL;
GlusterConf *gconf = g_new0(GlusterConf, 1);
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
glfs = qemu_gluster_init(gconf, filename, errp);
if (!glfs) {
@@ -494,8 +494,8 @@ static int qemu_gluster_create(const char *filename,
goto out;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
@@ -516,8 +516,9 @@ static int qemu_gluster_create(const char *filename,
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
if (!glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE)) {
if (prealloc && qemu_gluster_zerofill(fd, 0,
total_size * BDRV_SECTOR_SIZE)) {
ret = -errno;
}
} else {

View File

@@ -34,6 +34,7 @@
#include "qemu/bitops.h"
#include "qemu/bitmap.h"
#include "block/block_int.h"
#include "trace.h"
#include "block/scsi.h"
#include "qemu/iov.h"
#include "sysemu/sysemu.h"
@@ -87,6 +88,7 @@ typedef struct IscsiAIOCB {
struct scsi_task *task;
uint8_t *buf;
int status;
int canceled;
int64_t sector_num;
int nb_sectors;
#ifdef __linux__
@@ -118,14 +120,16 @@ iscsi_bh_cb(void *p)
g_free(acb->buf);
acb->buf = NULL;
acb->common.cb(acb->common.opaque, acb->status);
if (acb->canceled == 0) {
acb->common.cb(acb->common.opaque, acb->status);
}
if (acb->task != NULL) {
scsi_free_scsi_task(acb->task);
acb->task = NULL;
}
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static void
@@ -236,15 +240,20 @@ iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
return;
}
acb->canceled = 1;
/* send a task mgmt call to the target to cancel the task on the target */
iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
iscsi_abort_task_cb, acb);
while (acb->status == -EINPROGRESS) {
aio_poll(iscsilun->aio_context, true);
}
}
static const AIOCBInfo iscsi_aiocb_info = {
.aiocb_size = sizeof(IscsiAIOCB),
.cancel_async = iscsi_aio_cancel,
.cancel = iscsi_aio_cancel,
};
@@ -316,13 +325,6 @@ static bool is_request_lun_aligned(int64_t sector_num, int nb_sectors,
return 1;
}
static unsigned long *iscsi_allocationmap_init(IscsiLun *iscsilun)
{
return bitmap_try_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
iscsilun),
iscsilun->cluster_sectors));
}
static void iscsi_allocationmap_set(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
{
@@ -636,6 +638,10 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
g_free(acb->buf);
acb->buf = NULL;
if (acb->canceled != 0) {
return;
}
acb->status = 0;
if (status < 0) {
error_report("Failed to ioctl(SG_IO) to iSCSI lun. %s",
@@ -677,6 +683,7 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
acb = qemu_aio_get(&iscsi_aiocb_info, bs, cb, opaque);
acb->iscsilun = iscsilun;
acb->canceled = 0;
acb->bh = NULL;
acb->status = -EINPROGRESS;
acb->buf = NULL;
@@ -686,7 +693,7 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
if (acb->task == NULL) {
error_report("iSCSI: Failed to allocate task for scsi command. %s",
iscsi_get_error(iscsi));
qemu_aio_unref(acb);
qemu_aio_release(acb);
return NULL;
}
memset(acb->task, 0, sizeof(struct scsi_task));
@@ -724,7 +731,7 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
(data.size > 0) ? &data : NULL,
acb) != 0) {
scsi_free_scsi_task(acb->task);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return NULL;
}
@@ -886,10 +893,7 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
if (iscsilun->zeroblock == NULL) {
iscsilun->zeroblock = g_try_malloc0(iscsilun->block_size);
if (iscsilun->zeroblock == NULL) {
return -ENOMEM;
}
iscsilun->zeroblock = g_malloc0(iscsilun->block_size);
}
iscsi_co_init_iscsitask(iscsilun, &iTask);
@@ -1409,10 +1413,9 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
iscsilun->cluster_sectors = (iscsilun->bl.opt_unmap_gran *
iscsilun->block_size) >> BDRV_SECTOR_BITS;
if (iscsilun->lbprz && !(bs->open_flags & BDRV_O_NOCACHE)) {
iscsilun->allocationmap = iscsi_allocationmap_init(iscsilun);
if (iscsilun->allocationmap == NULL) {
ret = -ENOMEM;
}
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(bs->total_sectors,
iscsilun->cluster_sectors));
}
}
@@ -1505,7 +1508,10 @@ static int iscsi_truncate(BlockDriverState *bs, int64_t offset)
if (iscsilun->allocationmap != NULL) {
g_free(iscsilun->allocationmap);
iscsilun->allocationmap = iscsi_allocationmap_init(iscsilun);
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
iscsilun),
iscsilun->cluster_sectors));
}
return 0;
@@ -1522,9 +1528,9 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
bs = bdrv_new("", &error_abort);
/* Read out options */
total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
bs->opaque = g_new0(struct IscsiLun, 1);
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
bs->opaque = g_malloc0(sizeof(struct IscsiLun));
iscsilun = bs->opaque;
bs_options = qdict_new();

View File

@@ -51,12 +51,6 @@ struct qemu_laio_state {
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
};
static inline ssize_t io_event_ret(struct io_event *ev)
@@ -85,64 +79,34 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
ret = -EINVAL;
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
laiocb->common.cb(laiocb->common.opaque, ret);
}
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
}
qemu_aio_release(laiocb);
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
while (event_notifier_test_and_clear(&s->e)) {
struct io_event events[MAX_EVENTS];
struct timespec ts = { 0 };
int nevents, i;
do {
nevents = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS, events, &ts);
} while (nevents == -EINTR);
for (i = 0; i < nevents; i++) {
struct iocb *iocb = events[i].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&events[i]);
qemu_laio_process_completion(s, laiocb);
}
}
}
@@ -152,22 +116,35 @@ static void laio_cancel(BlockDriverAIOCB *blockacb)
struct io_event event;
int ret;
if (laiocb->ret != -EINPROGRESS) {
if (laiocb->ret != -EINPROGRESS)
return;
}
/*
* Note that as of Linux 2.6.31 neither the block device code nor any
* filesystem implements cancellation of AIO request.
* Thus the polling loop below is the normal code path.
*/
ret = io_cancel(laiocb->ctx->ctx, &laiocb->iocb, &event);
laiocb->ret = -ECANCELED;
if (ret != 0) {
/* iocb is not cancelled, cb will be called by the event loop later */
if (ret == 0) {
laiocb->ret = -ECANCELED;
return;
}
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
/*
* We have to wait for the iocb to finish.
*
* The only way to get the iocb status update is by polling the io context.
* We might be able to do this slightly more optimal by removing the
* O_NONBLOCK flag.
*/
while (laiocb->ret == -EINPROGRESS) {
qemu_laio_completion_cb(&laiocb->ctx->e);
}
}
static const AIOCBInfo laio_aiocb_info = {
.aiocb_size = sizeof(struct qemu_laiocb),
.cancel_async = laio_cancel,
.cancel = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
@@ -286,7 +263,7 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
return &laiocb->common;
out_free_aiocb:
qemu_aio_unref(laiocb);
qemu_aio_release(laiocb);
return NULL;
}
@@ -295,14 +272,12 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
}

View File

@@ -157,7 +157,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns = 0;
uint64_t delay_ns;
MirrorOp *op;
s->sector_num = hbitmap_iter_next(&s->hbi);
@@ -247,6 +247,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
} else {
delay_ns = 0;
}
} while (delay_ns == 0 && next_sector < end);
@@ -365,12 +367,7 @@ static void coroutine_fn mirror_run(void *opaque)
}
end = s->common.len >> BDRV_SECTOR_BITS;
s->buf = qemu_try_blockalign(bs, s->buf_size);
if (s->buf == NULL) {
ret = -ENOMEM;
goto immediate_exit;
}
s->buf = qemu_blockalign(bs, s->buf_size);
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
mirror_free_init(s);

View File

@@ -31,10 +31,8 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include <sys/types.h>
#include <unistd.h>
@@ -340,37 +338,6 @@ static void nbd_attach_aio_context(BlockDriverState *bs,
nbd_client_session_attach_aio_context(&s->client, new_context);
}
static void nbd_refresh_filename(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
QDict *opts = qdict_new();
const char *path = qemu_opt_get(s->socket_opts, "path");
const char *host = qemu_opt_get(s->socket_opts, "host");
const char *port = qemu_opt_get(s->socket_opts, "port");
const char *export = qemu_opt_get(s->socket_opts, "export");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
if (path) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix:%s", path);
qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(path)));
} else if (export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd:%s:%s/%s", host, port, export);
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
} else {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd:%s:%s", host, port);
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_nbd = {
.format_name = "nbd",
.protocol_name = "nbd",
@@ -385,7 +352,6 @@ static BlockDriver bdrv_nbd = {
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static BlockDriver bdrv_nbd_tcp = {
@@ -402,7 +368,6 @@ static BlockDriver bdrv_nbd_tcp = {
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static BlockDriver bdrv_nbd_unix = {
@@ -419,7 +384,6 @@ static BlockDriver bdrv_nbd_unix = {
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static void bdrv_nbd_init(void)

View File

@@ -172,11 +172,7 @@ static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
nfs_co_init_task(client, &task);
buf = g_try_malloc(nb_sectors * BDRV_SECTOR_SIZE);
if (nb_sectors && buf == NULL) {
return -ENOMEM;
}
buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE);
qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
if (nfs_pwrite_async(client->context, client->fh,
@@ -393,33 +389,28 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
return -EINVAL;
}
ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp);
if (ret < 0) {
goto out;
return ret;
}
bs->total_sectors = ret;
ret = 0;
out:
qemu_opts_del(opts);
return ret;
return 0;
}
static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
NFSClient *client = g_new0(NFSClient, 1);
NFSClient *client = g_malloc0(sizeof(NFSClient));
client->aio_context = qemu_get_aio_context();
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
ret = nfs_client_open(client, url, O_CREAT, errp);
if (ret < 0) {

View File

@@ -1,168 +0,0 @@
/*
* Null block driver
*
* Authors:
* Fam Zheng <famz@redhat.com>
*
* Copyright (C) 2014 Red Hat, Inc.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "block/block_int.h"
typedef struct {
int64_t length;
} BDRVNullState;
static QemuOptsList runtime_opts = {
.name = "null",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "",
},
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "size of the null block",
},
{ /* end of list */ }
},
};
static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
QemuOpts *opts;
BDRVNullState *s = bs->opaque;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &error_abort);
s->length =
qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 1 << 30);
qemu_opts_del(opts);
return 0;
}
static void null_close(BlockDriverState *bs)
{
}
static int64_t null_getlength(BlockDriverState *bs)
{
BDRVNullState *s = bs->opaque;
return s->length;
}
static coroutine_fn int null_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return 0;
}
static coroutine_fn int null_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return 0;
}
static coroutine_fn int null_co_flush(BlockDriverState *bs)
{
return 0;
}
typedef struct {
BlockDriverAIOCB common;
QEMUBH *bh;
} NullAIOCB;
static const AIOCBInfo null_aiocb_info = {
.aiocb_size = sizeof(NullAIOCB),
};
static void null_bh_cb(void *opaque)
{
NullAIOCB *acb = opaque;
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
}
static inline BlockDriverAIOCB *null_aio_common(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
NullAIOCB *acb;
acb = qemu_aio_get(&null_aiocb_info, bs, cb, opaque);
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), null_bh_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
static BlockDriverAIOCB *null_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockDriverAIOCB *null_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockDriverAIOCB *null_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockDriver bdrv_null_co = {
.format_name = "null-co",
.protocol_name = "null-co",
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
.bdrv_co_readv = null_co_readv,
.bdrv_co_writev = null_co_writev,
.bdrv_co_flush_to_disk = null_co_flush,
};
static BlockDriver bdrv_null_aio = {
.format_name = "null-aio",
.protocol_name = "null-aio",
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
.bdrv_aio_readv = null_aio_readv,
.bdrv_aio_writev = null_aio_writev,
.bdrv_aio_flush = null_aio_flush,
};
static void bdrv_null_init(void)
{
bdrv_register(&bdrv_null_co);
bdrv_register(&bdrv_null_aio);
}
block_init(bdrv_null_init);

View File

@@ -30,7 +30,6 @@
/**************************************************************/
#define HEADER_MAGIC "WithoutFreeSpace"
#define HEADER_MAGIC2 "WithouFreSpacExt"
#define HEADER_VERSION 2
#define HEADER_SIZE 64
@@ -42,10 +41,8 @@ struct parallels_header {
uint32_t cylinders;
uint32_t tracks;
uint32_t catalog_entries;
uint64_t nb_sectors;
uint32_t inuse;
uint32_t data_off;
char padding[12];
uint32_t nb_sectors;
char padding[24];
} QEMU_PACKED;
typedef struct BDRVParallelsState {
@@ -55,8 +52,6 @@ typedef struct BDRVParallelsState {
unsigned int catalog_size;
unsigned int tracks;
unsigned int off_multiplier;
} BDRVParallelsState;
static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -64,12 +59,11 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam
const struct parallels_header *ph = (const void *)buf;
if (buf_size < HEADER_SIZE)
return 0;
return 0;
if ((!memcmp(ph->magic, HEADER_MAGIC, 16) ||
!memcmp(ph->magic, HEADER_MAGIC2, 16)) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
if (!memcmp(ph->magic, HEADER_MAGIC, 16) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
return 0;
}
@@ -89,19 +83,14 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
bs->total_sectors = le64_to_cpu(ph.nb_sectors);
if (memcmp(ph.magic, HEADER_MAGIC, 16) ||
(le32_to_cpu(ph.version) != HEADER_VERSION)) {
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
goto fail;
}
if (le32_to_cpu(ph.version) != HEADER_VERSION) {
goto fail_format;
}
if (!memcmp(ph.magic, HEADER_MAGIC, 16)) {
s->off_multiplier = 1;
bs->total_sectors = 0xffffffff & bs->total_sectors;
} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) {
s->off_multiplier = le32_to_cpu(ph.tracks);
} else {
goto fail_format;
}
bs->total_sectors = le32_to_cpu(ph.nb_sectors);
s->tracks = le32_to_cpu(ph.tracks);
if (s->tracks == 0) {
@@ -109,11 +98,6 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EINVAL;
goto fail;
}
if (s->tracks > INT32_MAX/513) {
error_setg(errp, "Invalid image: Too big cluster");
ret = -EFBIG;
goto fail;
}
s->catalog_size = le32_to_cpu(ph.catalog_entries);
if (s->catalog_size > INT_MAX / 4) {
@@ -121,11 +105,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EFBIG;
goto fail;
}
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
ret = -ENOMEM;
goto fail;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);
if (ret < 0) {
@@ -133,14 +113,11 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
}
for (i = 0; i < s->catalog_size; i++)
le32_to_cpus(&s->catalog_bitmap[i]);
le32_to_cpus(&s->catalog_bitmap[i]);
qemu_co_mutex_init(&s->lock);
return 0;
fail_format:
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
fail:
g_free(s->catalog_bitmap);
return ret;
@@ -156,9 +133,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
/* not allocated */
if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0))
return -1;
return
((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) * 512;
return -1;
return (uint64_t)(s->catalog_bitmap[index] + offset) * 512;
}
static int parallels_read(BlockDriverState *bs, int64_t sector_num,

View File

@@ -28,13 +28,6 @@
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/qmp/types.h"
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
{
@@ -172,28 +165,19 @@ void bdrv_query_image_info(BlockDriverState *bs,
ImageInfo **p_info,
Error **errp)
{
int64_t size;
uint64_t total_sectors;
const char *backing_filename;
char backing_filename2[1024];
BlockDriverInfo bdi;
int ret;
Error *err = NULL;
ImageInfo *info;
#ifdef __linux__
int fd, attr;
#endif
ImageInfo *info = g_new0(ImageInfo, 1);
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Can't get size of device '%s'",
bdrv_get_device_name(bs));
return;
}
bdrv_get_geometry(bs, &total_sectors);
info = g_new0(ImageInfo, 1);
info->filename = g_strdup(bs->filename);
info->format = g_strdup(bdrv_get_format_name(bs));
info->virtual_size = size;
info->virtual_size = total_sectors * 512;
info->actual_size = bdrv_get_allocated_file_size(bs);
info->has_actual_size = info->actual_size >= 0;
if (bdrv_is_encrypted(bs)) {
@@ -211,18 +195,6 @@ void bdrv_query_image_info(BlockDriverState *bs,
info->format_specific = bdrv_get_specific_info(bs);
info->has_format_specific = info->format_specific != NULL;
#ifdef __linux__
/* get NOCOW info */
fd = qemu_open(bs->filename, O_RDONLY | O_NONBLOCK);
if (fd >= 0) {
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0 && (attr & FS_NOCOW_FL)) {
info->has_nocow = true;
info->nocow = true;
}
qemu_close(fd);
}
#endif
backing_filename = bs->backing_file;
if (backing_filename[0] != '\0') {
info->backing_filename = g_strdup(backing_filename);
@@ -333,16 +305,15 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs)
}
s->stats = g_malloc0(sizeof(*s->stats));
s->stats->rd_bytes = bs->stats.nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = bs->stats.nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = bs->stats.nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = bs->stats.nr_ops[BLOCK_ACCT_WRITE];
s->stats->wr_highest_offset =
bs->stats.wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->stats.nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_FLUSH];
s->stats->rd_bytes = bs->nr_bytes[BDRV_ACCT_READ];
s->stats->wr_bytes = bs->nr_bytes[BDRV_ACCT_WRITE];
s->stats->rd_operations = bs->nr_ops[BDRV_ACCT_READ];
s->stats->wr_operations = bs->nr_ops[BDRV_ACCT_WRITE];
s->stats->wr_highest_offset = bs->wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->nr_ops[BDRV_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->total_time_ns[BDRV_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->total_time_ns[BDRV_ACCT_READ];
s->stats->flush_total_time_ns = bs->total_time_ns[BDRV_ACCT_FLUSH];
if (bs->file) {
s->has_parent = true;
@@ -654,8 +625,4 @@ void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,
func_fprintf(f, "Format specific information:\n");
bdrv_image_info_specific_dump(func_fprintf, f, info->format_specific);
}
if (info->has_nocow && info->nocow) {
func_fprintf(f, "NOCOW flag: set\n");
}
}

View File

@@ -182,12 +182,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
}
s->l1_table_offset = header.l1_table_offset;
s->l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate memory for L1 table");
ret = -ENOMEM;
goto fail;
}
s->l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
@@ -198,16 +193,8 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
for(i = 0;i < s->l1_size; i++) {
be64_to_cpus(&s->l1_table[i]);
}
/* alloc L2 cache (max. 64k * 16 * 8 = 8 MB) */
s->l2_cache =
qemu_try_blockalign(bs->file,
s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
if (s->l2_cache == NULL) {
error_setg(errp, "Could not allocate L2 table cache");
ret = -ENOMEM;
goto fail;
}
/* alloc L2 cache */
s->l2_cache = g_malloc(s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
s->cluster_cache = g_malloc(s->cluster_size);
s->cluster_data = g_malloc(s->cluster_size);
s->cluster_cache_offset = -1;
@@ -239,7 +226,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
fail:
g_free(s->l1_table);
qemu_vfree(s->l2_cache);
g_free(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
return ret;
@@ -530,10 +517,7 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
void *orig_buf;
if (qiov->niov > 1) {
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
buf = orig_buf = qemu_blockalign(bs, qiov->size);
} else {
orig_buf = NULL;
buf = (uint8_t *)qiov->iov->iov_base;
@@ -635,10 +619,7 @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
s->cluster_cache_offset = -1; /* disable compressed cache */
if (qiov->niov > 1) {
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
buf = orig_buf = qemu_blockalign(bs, qiov->size);
qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
} else {
orig_buf = NULL;
@@ -704,7 +685,7 @@ static void qcow_close(BlockDriverState *bs)
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
qemu_vfree(s->l2_cache);
g_free(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
@@ -725,8 +706,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
BlockDriverState *qcow_bs;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
flags |= BLOCK_FLAG_ENCRYPT;
@@ -754,7 +734,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
memset(&header, 0, sizeof(header));
header.magic = cpu_to_be32(QCOW_MAGIC);
header.version = cpu_to_be32(QCOW_VERSION);
header.size = cpu_to_be64(total_size);
header.size = cpu_to_be64(total_size * 512);
header_size = sizeof(header);
backing_filename_len = 0;
if (backing_file) {
@@ -776,7 +756,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
}
header_size = (header_size + 7) & ~7;
shift = header.cluster_bits + header.l2_bits;
l1_size = (total_size + (1LL << shift) - 1) >> shift;
l1_size = ((total_size * 512) + (1LL << shift) - 1) >> shift;
header.l1_table_offset = cpu_to_be64(header_size);
if (flags & BLOCK_FLAG_ENCRYPT) {

View File

@@ -48,31 +48,15 @@ Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
Qcow2Cache *c;
int i;
c = g_new0(Qcow2Cache, 1);
c = g_malloc0(sizeof(*c));
c->size = num_tables;
c->entries = g_try_new0(Qcow2CachedTable, num_tables);
if (!c->entries) {
goto fail;
}
c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
for (i = 0; i < c->size; i++) {
c->entries[i].table = qemu_try_blockalign(bs->file, s->cluster_size);
if (c->entries[i].table == NULL) {
goto fail;
}
c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
}
return c;
fail:
if (c->entries) {
for (i = 0; i < c->size; i++) {
qemu_vfree(c->entries[i].table);
}
}
g_free(c->entries);
g_free(c);
return NULL;
}
int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)

View File

@@ -72,20 +72,14 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
#endif
new_l1_size2 = sizeof(uint64_t) * new_l1_size;
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_size2, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
new_l1_table = g_malloc0(align_offset(new_l1_size2, 512));
memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
/* write new table (align to cluster) */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
if (new_l1_table_offset < 0) {
qemu_vfree(new_l1_table);
g_free(new_l1_table);
return new_l1_table_offset;
}
@@ -119,7 +113,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
if (ret < 0) {
goto fail;
}
qemu_vfree(s->l1_table);
g_free(s->l1_table);
old_l1_table_offset = s->l1_table_offset;
s->l1_table_offset = new_l1_table_offset;
s->l1_table = new_l1_table;
@@ -129,7 +123,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
QCOW2_DISCARD_OTHER);
return 0;
fail:
qemu_vfree(new_l1_table);
g_free(new_l1_table);
qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
QCOW2_DISCARD_OTHER);
return ret;
@@ -378,10 +372,7 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
iov.iov_len = n * BDRV_SECTOR_SIZE;
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
}
iov.iov_base = qemu_blockalign(bs, iov.iov_len);
qemu_iovec_init_external(&qiov, &iov, 1);
@@ -486,13 +477,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
goto out;
}
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
" unaligned (L1 index: %#" PRIx64 ")",
l2_offset, l1_index);
return -EIO;
}
/* load the l2 table in memory */
ret = l2_load(bs, l2_offset, &l2_table);
@@ -515,11 +499,8 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
" in pre-v3 image (L2 offset: %#" PRIx64
", L2 index: %#x)", l2_offset, l2_index);
ret = -EIO;
goto fail;
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
return -EIO;
}
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
@@ -535,14 +516,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
*cluster_offset &= L2E_OFFSET_MASK;
if (offset_into_cluster(s, *cluster_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
PRIx64 " unaligned (L2 offset: %#" PRIx64
", L2 index: %#x)", *cluster_offset,
l2_offset, l2_index);
ret = -EIO;
goto fail;
}
break;
default:
abort();
@@ -559,10 +532,6 @@ out:
*num = nb_available - index_in_cluster;
return ret;
fail:
qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
return ret;
}
/*
@@ -598,12 +567,6 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
assert(l1_index < s->l1_size);
l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
" unaligned (L1 index: %#" PRIx64 ")",
l2_offset, l1_index);
return -EIO;
}
/* seek the l2 table of the given l2 offset */
@@ -739,11 +702,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
trace_qcow2_cluster_link_l2(qemu_coroutine_self(), m->nb_clusters);
assert(m->nb_clusters > 0);
old_cluster = g_try_new(uint64_t, m->nb_clusters);
if (old_cluster == NULL) {
ret = -ENOMEM;
goto err;
}
old_cluster = g_malloc(m->nb_clusters * sizeof(uint64_t));
/* copy content of unmodified sectors */
ret = perform_cow(bs, m, &m->cow_start);
@@ -976,15 +935,6 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
bool offset_matches =
(cluster_offset & L2E_OFFSET_MASK) == *host_offset;
if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
"%#llx unaligned (guest offset: %#" PRIx64
")", cluster_offset & L2E_OFFSET_MASK,
guest_offset);
ret = -EIO;
goto out;
}
if (*host_offset != 0 && !offset_matches) {
*bytes = 0;
ret = 0;
@@ -1016,7 +966,7 @@ out:
/* Only return a host offset if we actually made progress. Otherwise we
* would make requirements for handle_alloc() that it can't fulfill */
if (ret > 0) {
if (ret) {
*host_offset = (cluster_offset & L2E_OFFSET_MASK)
+ offset_into_cluster(s, guest_offset);
}
@@ -1156,17 +1106,6 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
return 0;
}
/* !*host_offset would overwrite the image header and is reserved for "no
* host offset preferred". If 0 was a valid host offset, it'd trigger the
* following overlap check; do that now to avoid having an invalid value in
* *host_offset. */
if (!alloc_cluster_offset) {
ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
nb_clusters * s->cluster_size);
assert(ret < 0);
goto fail;
}
/*
* Save info needed for meta data update.
*
@@ -1623,10 +1562,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
if (!is_active_l1) {
/* inactive L2 tables require a buffer to be stored in when loading
* them from disk */
l2_table = qemu_try_blockalign(bs->file, s->cluster_size);
if (l2_table == NULL) {
return -ENOMEM;
}
l2_table = qemu_blockalign(bs, s->cluster_size);
}
for (i = 0; i < l1_size; i++) {
@@ -1804,11 +1740,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs)
nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
expanded_clusters = g_try_malloc0((nb_clusters + 7) / 8);
if (expanded_clusters == NULL) {
ret = -ENOMEM;
goto fail;
}
expanded_clusters = g_malloc0((nb_clusters + 7) / 8);
ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
&expanded_clusters, &nb_clusters);

View File

@@ -26,6 +26,8 @@
#include "block/block_int.h"
#include "block/qcow2.h"
#include "qemu/range.h"
#include "qapi/qmp/types.h"
#include "qapi-event.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -44,25 +46,19 @@ int qcow2_refcount_init(BlockDriverState *bs)
assert(s->refcount_table_size <= INT_MAX / sizeof(uint64_t));
refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
s->refcount_table = g_try_malloc(refcount_table_size2);
s->refcount_table = g_malloc(refcount_table_size2);
if (s->refcount_table_size > 0) {
if (s->refcount_table == NULL) {
ret = -ENOMEM;
goto fail;
}
BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
ret = bdrv_pread(bs->file, s->refcount_table_offset,
s->refcount_table, refcount_table_size2);
if (ret < 0) {
if (ret != refcount_table_size2)
goto fail;
}
for(i = 0; i < s->refcount_table_size; i++)
be64_to_cpus(&s->refcount_table[i]);
}
return 0;
fail:
return ret;
return -ENOMEM;
}
void qcow2_refcount_close(BlockDriverState *bs)
@@ -108,13 +104,6 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
if (!refcount_block_offset)
return 0;
if (offset_into_cluster(s, refcount_block_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#" PRIx64
" unaligned (reftable index: %#" PRIx64 ")",
refcount_block_offset, refcount_table_index);
return -EIO;
}
ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
(void**) &refcount_block);
if (ret < 0) {
@@ -188,14 +177,6 @@ static int alloc_refcount_block(BlockDriverState *bs,
/* If it's already there, we're done */
if (refcount_block_offset) {
if (offset_into_cluster(s, refcount_block_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
PRIx64 " unaligned (reftable index: "
"%#x)", refcount_block_offset,
refcount_table_index);
return -EIO;
}
return load_refcount_block(bs, refcount_block_offset,
(void**) refcount_block);
}
@@ -363,14 +344,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
uint64_t meta_offset = (blocks_used * refcount_block_clusters) *
s->cluster_size;
uint64_t table_offset = meta_offset + blocks_clusters * s->cluster_size;
uint64_t *new_table = g_try_new0(uint64_t, table_size);
uint16_t *new_blocks = g_try_malloc0(blocks_clusters * s->cluster_size);
assert(table_size > 0 && blocks_clusters > 0);
if (new_table == NULL || new_blocks == NULL) {
ret = -ENOMEM;
goto fail_table;
}
uint16_t *new_blocks = g_malloc0(blocks_clusters * s->cluster_size);
uint64_t *new_table = g_malloc0(table_size * sizeof(uint64_t));
/* Fill the new refcount table */
memcpy(new_table, s->refcount_table,
@@ -394,7 +369,6 @@ static int alloc_refcount_block(BlockDriverState *bs,
ret = bdrv_pwrite_sync(bs->file, meta_offset, new_blocks,
blocks_clusters * s->cluster_size);
g_free(new_blocks);
new_blocks = NULL;
if (ret < 0) {
goto fail_table;
}
@@ -450,7 +424,6 @@ static int alloc_refcount_block(BlockDriverState *bs,
return -EAGAIN;
fail_table:
g_free(new_blocks);
g_free(new_table);
fail_block:
if (*refcount_block != NULL) {
@@ -851,14 +824,8 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_ZERO:
if (l2_entry & L2E_OFFSET_MASK) {
if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, false, -1, -1,
"Cannot free unaligned cluster %#llx",
l2_entry & L2E_OFFSET_MASK);
} else {
qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
nb_clusters << s->cluster_bits, type);
}
qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
nb_clusters << s->cluster_bits, type);
}
break;
case QCOW2_CLUSTER_UNALLOCATED:
@@ -880,8 +847,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
int64_t l1_table_offset, int l1_size, int addend)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2;
bool l1_allocated = false;
uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2, l1_allocated;
int64_t old_offset, old_l2_offset;
int i, j, l1_modified = 0, nb_csectors, refcount;
int ret;
@@ -896,12 +862,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
* l1_table_offset when it is the current s->l1_table_offset! Be careful
* when changing this! */
if (l1_table_offset != s->l1_table_offset) {
l1_table = g_try_malloc0(align_offset(l1_size2, 512));
if (l1_size2 && l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
l1_allocated = true;
l1_table = g_malloc0(align_offset(l1_size2, 512));
l1_allocated = 1;
ret = bdrv_pread(bs->file, l1_table_offset, l1_table, l1_size2);
if (ret < 0) {
@@ -913,7 +875,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
} else {
assert(l1_size == s->l1_size);
l1_table = s->l1_table;
l1_allocated = false;
l1_allocated = 0;
}
for(i = 0; i < l1_size; i++) {
@@ -922,14 +884,6 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
old_l2_offset = l2_offset;
l2_offset &= L1E_OFFSET_MASK;
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#"
PRIx64 " unaligned (L1 index: %#x)",
l2_offset, i);
ret = -EIO;
goto fail;
}
ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
(void**) &l2_table);
if (ret < 0) {
@@ -962,17 +916,6 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_ZERO:
if (offset_into_cluster(s, offset & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data "
"cluster offset %#llx "
"unaligned (L2 offset: %#"
PRIx64 ", L2 index: %#x)",
offset & L2E_OFFSET_MASK,
l2_offset, j);
ret = -EIO;
goto fail;
}
cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
if (!cluster_index) {
/* unallocated */
@@ -1254,11 +1197,7 @@ static int check_refcounts_l1(BlockDriverState *bs,
if (l1_size2 == 0) {
l1_table = NULL;
} else {
l1_table = g_try_malloc(l1_size2);
if (l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
l1_table = g_malloc(l1_size2);
if (bdrv_pread(bs->file, l1_table_offset,
l1_table, l1_size2) != l1_size2)
goto fail;
@@ -1562,11 +1501,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
return -EFBIG;
}
refcount_table = g_try_new0(uint16_t, nb_clusters);
if (nb_clusters && refcount_table == NULL) {
res->check_errors++;
return -ENOMEM;
}
refcount_table = g_malloc0(nb_clusters * sizeof(uint16_t));
res->bfi.total_clusters =
size_to_clusters(s, bs->total_sectors * BDRV_SECTOR_SIZE);
@@ -1643,8 +1578,8 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
/* increase refcount_table size if necessary */
int old_nb_clusters = nb_clusters;
nb_clusters = (new_offset >> s->cluster_bits) + 1;
refcount_table = g_renew(uint16_t, refcount_table,
nb_clusters);
refcount_table = g_realloc(refcount_table,
nb_clusters * sizeof(uint16_t));
memset(&refcount_table[old_nb_clusters], 0, (nb_clusters
- old_nb_clusters) * sizeof(uint16_t));
}
@@ -1818,13 +1753,9 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
uint64_t l1_ofs = s->snapshots[i].l1_table_offset;
uint32_t l1_sz = s->snapshots[i].l1_size;
uint64_t l1_sz2 = l1_sz * sizeof(uint64_t);
uint64_t *l1 = g_try_malloc(l1_sz2);
uint64_t *l1 = g_malloc(l1_sz2);
int ret;
if (l1_sz2 && l1 == NULL) {
return -ENOMEM;
}
ret = bdrv_pread(bs->file, l1_ofs, l1, l1_sz2);
if (ret < 0) {
g_free(l1);
@@ -1876,11 +1807,26 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
return ret;
} else if (ret > 0) {
int metadata_ol_bitnr = ffs(ret) - 1;
char *message;
assert(metadata_ol_bitnr < QCOW2_OL_MAX_BITNR);
qcow2_signal_corruption(bs, true, offset, size, "Preventing invalid "
"write on metadata (overlaps with %s)",
metadata_ol_names[metadata_ol_bitnr]);
fprintf(stderr, "qcow2: Preventing invalid write on metadata (overlaps "
"with %s); image marked as corrupt.\n",
metadata_ol_names[metadata_ol_bitnr]);
message = g_strdup_printf("Prevented %s overwrite",
metadata_ol_names[metadata_ol_bitnr]);
qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs),
message,
true,
offset,
true,
size,
&error_abort);
g_free(message);
qcow2_mark_corrupt(bs);
bs->drv = NULL; /* make BDS unusable */
return -EIO;
}

View File

@@ -58,7 +58,7 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset = s->snapshots_offset;
s->snapshots = g_new0(QCowSnapshot, s->nb_snapshots);
s->snapshots = g_malloc0(s->nb_snapshots * sizeof(QCowSnapshot));
for(i = 0; i < s->nb_snapshots; i++) {
/* Read statically sized part of the snapshot header */
@@ -381,12 +381,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
sn->l1_table_offset = l1_table_offset;
sn->l1_size = s->l1_size;
l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_size && l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
for(i = 0; i < s->l1_size; i++) {
l1_table[i] = cpu_to_be64(s->l1_table[i]);
}
@@ -417,7 +412,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
}
/* Append the new snapshot to the snapshot list */
new_snapshot_list = g_new(QCowSnapshot, s->nb_snapshots + 1);
new_snapshot_list = g_malloc((s->nb_snapshots + 1) * sizeof(QCowSnapshot));
if (s->snapshots) {
memcpy(new_snapshot_list, s->snapshots,
s->nb_snapshots * sizeof(QCowSnapshot));
@@ -504,11 +499,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
* Decrease the refcount referenced by the old one only when the L1
* table is overwritten.
*/
sn_l1_table = g_try_malloc0(cur_l1_bytes);
if (cur_l1_bytes && sn_l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
sn_l1_table = g_malloc0(cur_l1_bytes);
ret = bdrv_pread(bs->file, sn->l1_table_offset, sn_l1_table, sn_l1_bytes);
if (ret < 0) {
@@ -661,7 +652,7 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
return s->nb_snapshots;
}
sn_tab = g_new0(QEMUSnapshotInfo, s->nb_snapshots);
sn_tab = g_malloc0(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
sn = s->snapshots + i;
@@ -707,21 +698,17 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
return -EFBIG;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_bytes, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);
if (ret < 0) {
error_setg(errp, "Failed to read l1 table for snapshot");
qemu_vfree(new_l1_table);
g_free(new_l1_table);
return ret;
}
/* Switch the L1 table */
qemu_vfree(s->l1_table);
g_free(s->l1_table);
s->l1_size = sn->l1_size;
s->l1_table_offset = sn->l1_table_offset;

View File

@@ -30,9 +30,6 @@
#include "qemu/error-report.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qbool.h"
#include "qapi/util.h"
#include "qapi/qmp/types.h"
#include "qapi-event.h"
#include "trace.h"
#include "qemu/option_int.h"
@@ -405,12 +402,6 @@ static QemuOptsList qcow2_runtime_opts = {
.help = "Selects which overlap checks to perform from a range of "
"templates (none, constant, cached, all)",
},
{
.name = QCOW2_OPT_OVERLAP_TEMPLATE,
.type = QEMU_OPT_STRING,
.help = "Selects which overlap checks to perform from a range of "
"templates (none, constant, cached, all)",
},
{
.name = QCOW2_OPT_OVERLAP_MAIN_HEADER,
.type = QEMU_OPT_BOOL,
@@ -451,22 +442,6 @@ static QemuOptsList qcow2_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Check for unintended writes into an inactive L2 table",
},
{
.name = QCOW2_OPT_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum combined metadata (L2 tables and refcount blocks) "
"cache size",
},
{
.name = QCOW2_OPT_L2_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum L2 table cache size",
},
{
.name = QCOW2_OPT_REFCOUNT_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum refcount block cache size",
},
{ /* end of list */ }
},
};
@@ -482,61 +457,6 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = {
[QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2,
};
static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size,
uint64_t *refcount_cache_size, Error **errp)
{
uint64_t combined_cache_size;
bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE);
combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
*refcount_cache_size = qemu_opt_get_size(opts,
QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
if (combined_cache_size_set) {
if (l2_cache_size_set && refcount_cache_size_set) {
error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"the same time");
return;
} else if (*l2_cache_size > combined_cache_size) {
error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
return;
} else if (*refcount_cache_size > combined_cache_size) {
error_setg(errp, QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
return;
}
if (l2_cache_size_set) {
*refcount_cache_size = combined_cache_size - *l2_cache_size;
} else if (refcount_cache_size_set) {
*l2_cache_size = combined_cache_size - *refcount_cache_size;
} else {
*refcount_cache_size = combined_cache_size
/ (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
*l2_cache_size = combined_cache_size - *refcount_cache_size;
}
} else {
if (!l2_cache_size_set && !refcount_cache_size_set) {
*l2_cache_size = DEFAULT_L2_CACHE_BYTE_SIZE;
*refcount_cache_size = *l2_cache_size
/ DEFAULT_L2_REFCOUNT_SIZE_RATIO;
} else if (!l2_cache_size_set) {
*l2_cache_size = *refcount_cache_size
* DEFAULT_L2_REFCOUNT_SIZE_RATIO;
} else if (!refcount_cache_size_set) {
*refcount_cache_size = *l2_cache_size
/ DEFAULT_L2_REFCOUNT_SIZE_RATIO;
}
}
}
static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -544,13 +464,12 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
unsigned int len, i;
int ret = 0;
QCowHeader header;
QemuOpts *opts = NULL;
QemuOpts *opts;
Error *local_err = NULL;
uint64_t ext_end;
uint64_t l1_vm_state_index;
const char *opt_overlap_check, *opt_overlap_check_template;
const char *opt_overlap_check;
int overlap_check_template = 0;
uint64_t l2_cache_size, refcount_cache_size;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
@@ -769,13 +688,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
if (s->l1_size > 0) {
s->l1_table = qemu_try_blockalign(bs->file,
s->l1_table = g_malloc0(
align_offset(s->l1_size * sizeof(uint64_t), 512));
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate L1 table");
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
if (ret < 0) {
@@ -787,61 +701,14 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
}
/* get L2 table/refcount block cache size from command line options */
opts = qemu_opts_create(&qcow2_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
read_cache_sizes(opts, &l2_cache_size, &refcount_cache_size, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
l2_cache_size /= s->cluster_size;
if (l2_cache_size < MIN_L2_CACHE_SIZE) {
l2_cache_size = MIN_L2_CACHE_SIZE;
}
if (l2_cache_size > INT_MAX) {
error_setg(errp, "L2 cache size too big");
ret = -EINVAL;
goto fail;
}
refcount_cache_size /= s->cluster_size;
if (refcount_cache_size < MIN_REFCOUNT_CACHE_SIZE) {
refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE;
}
if (refcount_cache_size > INT_MAX) {
error_setg(errp, "Refcount cache size too big");
ret = -EINVAL;
goto fail;
}
/* alloc L2 table/refcount block cache */
s->l2_table_cache = qcow2_cache_create(bs, l2_cache_size);
s->refcount_block_cache = qcow2_cache_create(bs, refcount_cache_size);
if (s->l2_table_cache == NULL || s->refcount_block_cache == NULL) {
error_setg(errp, "Could not allocate metadata caches");
ret = -ENOMEM;
goto fail;
}
s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE);
s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE);
s->cluster_cache = g_malloc(s->cluster_size);
/* one more sector for decompressed data alignment */
s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size + 512);
if (s->cluster_data == NULL) {
error_setg(errp, "Could not allocate temporary cluster buffer");
ret = -ENOMEM;
goto fail;
}
s->cluster_data = qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+ 512);
s->cluster_cache_offset = -1;
s->flags = flags;
@@ -915,6 +782,14 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
/* Enable lazy_refcounts according to image and command line options */
opts = qemu_opts_create(&qcow2_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
(s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
@@ -928,21 +803,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
s->discard_passthrough[QCOW2_DISCARD_OTHER] =
qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false);
opt_overlap_check = qemu_opt_get(opts, QCOW2_OPT_OVERLAP);
opt_overlap_check_template = qemu_opt_get(opts, QCOW2_OPT_OVERLAP_TEMPLATE);
if (opt_overlap_check_template && opt_overlap_check &&
strcmp(opt_overlap_check_template, opt_overlap_check))
{
error_setg(errp, "Conflicting values for qcow2 options '"
QCOW2_OPT_OVERLAP "' ('%s') and '" QCOW2_OPT_OVERLAP_TEMPLATE
"' ('%s')", opt_overlap_check, opt_overlap_check_template);
ret = -EINVAL;
goto fail;
}
if (!opt_overlap_check) {
opt_overlap_check = opt_overlap_check_template ?: "cached";
}
opt_overlap_check = qemu_opt_get(opts, "overlap-check") ?: "cached";
if (!strcmp(opt_overlap_check, "none")) {
overlap_check_template = 0;
} else if (!strcmp(opt_overlap_check, "constant")) {
@@ -955,6 +816,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
error_setg(errp, "Unsupported value '%s' for qcow2 option "
"'overlap-check'. Allowed are either of the following: "
"none, constant, cached, all", opt_overlap_check);
qemu_opts_del(opts);
ret = -EINVAL;
goto fail;
}
@@ -969,7 +831,6 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
qemu_opts_del(opts);
opts = NULL;
if (s->use_lazy_refcounts && s->qcow_version < 3) {
error_setg(errp, "Lazy refcounts require a qcow2 image with at least "
@@ -987,12 +848,11 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
fail:
qemu_opts_del(opts);
g_free(s->unknown_header_fields);
cleanup_unknown_header_ext(bs);
qcow2_free_snapshots(bs);
qcow2_refcount_close(bs);
qemu_vfree(s->l1_table);
g_free(s->l1_table);
/* else pre-write overlap checks in cache_destroy may crash */
s->l1_table = NULL;
if (s->l2_table_cache) {
@@ -1222,12 +1082,7 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
*/
if (!cluster_data) {
cluster_data =
qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size);
if (cluster_data == NULL) {
ret = -ENOMEM;
goto fail;
}
qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
}
assert(cur_nr_sectors <=
@@ -1327,13 +1182,8 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
if (s->crypt_method) {
if (!cluster_data) {
cluster_data = qemu_try_blockalign(bs->file,
QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size);
if (cluster_data == NULL) {
ret = -ENOMEM;
goto fail;
}
cluster_data = qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS *
s->cluster_size);
}
assert(hd_qiov.size <=
@@ -1420,7 +1270,7 @@ fail:
static void qcow2_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
qemu_vfree(s->l1_table);
g_free(s->l1_table);
/* else pre-write overlap checks in cache_destroy may crash */
s->l1_table = NULL;
@@ -1707,7 +1557,7 @@ static int preallocate(BlockDriverState *bs)
int ret;
QCowL2Meta *meta;
nb_sectors = bdrv_nb_sectors(bs);
nb_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
offset = 0;
while (nb_sectors) {
@@ -1762,7 +1612,7 @@ static int preallocate(BlockDriverState *bs)
static int qcow2_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, PreallocMode prealloc,
int flags, size_t cluster_size, int prealloc,
QemuOpts *opts, int version,
Error **errp)
{
@@ -1795,56 +1645,6 @@ static int qcow2_create2(const char *filename, int64_t total_size,
Error *local_err = NULL;
int ret;
if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
int64_t meta_size = 0;
uint64_t nreftablee, nrefblocke, nl1e, nl2e;
int64_t aligned_total_size = align_offset(total_size, cluster_size);
/* header: 1 cluster */
meta_size += cluster_size;
/* total size of L2 tables */
nl2e = aligned_total_size / cluster_size;
nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
meta_size += nl2e * sizeof(uint64_t);
/* total size of L1 tables */
nl1e = nl2e * sizeof(uint64_t) / cluster_size;
nl1e = align_offset(nl1e, cluster_size / sizeof(uint64_t));
meta_size += nl1e * sizeof(uint64_t);
/* total size of refcount blocks
*
* note: every host cluster is reference-counted, including metadata
* (even refcount blocks are recursively included).
* Let:
* a = total_size (this is the guest disk size)
* m = meta size not including refcount blocks and refcount tables
* c = cluster size
* y1 = number of refcount blocks entries
* y2 = meta size including everything
* then,
* y1 = (y2 + a)/c
* y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m
* we can get y1:
* y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c)
*/
nrefblocke = (aligned_total_size + meta_size + cluster_size) /
(cluster_size - sizeof(uint16_t) -
1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size);
nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t));
meta_size += nrefblocke * sizeof(uint16_t);
/* total size of refcount tables */
nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size;
nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t));
meta_size += nreftablee * sizeof(uint64_t);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
aligned_total_size + meta_size);
qemu_opt_set(opts, BLOCK_OPT_PREALLOC, PreallocMode_lookup[prealloc]);
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
@@ -1933,7 +1733,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* Okay, now that we have a valid image, let's give it the right size */
ret = bdrv_truncate(bs, total_size);
ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not resize image");
goto out;
@@ -1950,7 +1750,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* And if we're supposed to preallocate metadata, do that now */
if (prealloc != PREALLOC_MODE_OFF) {
if (prealloc) {
BDRVQcowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = preallocate(bs);
@@ -1986,17 +1786,16 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
char *backing_file = NULL;
char *backing_fmt = NULL;
char *buf = NULL;
uint64_t size = 0;
uint64_t sectors = 0;
int flags = 0;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
PreallocMode prealloc;
int prealloc = 0;
int version = 3;
Error *local_err = NULL;
int ret;
/* Read out options */
size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
sectors = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
@@ -2005,11 +1804,12 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
if (!buf || !strcmp(buf, "off")) {
prealloc = 0;
} else if (!strcmp(buf, "metadata")) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'", buf);
ret = -EINVAL;
goto finish;
}
@@ -2031,7 +1831,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
}
if (backing_file && prealloc != PREALLOC_MODE_OFF) {
if (backing_file && prealloc) {
error_setg(errp, "Backing file and preallocation cannot be used at "
"the same time");
ret = -EINVAL;
@@ -2045,7 +1845,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
goto finish;
}
ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
cluster_size, prealloc, opts, version, &local_err);
if (local_err) {
error_propagate(errp, local_err);
@@ -2147,6 +1947,7 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
/* align end of file to a sector boundary to ease reading with
sector based I/Os */
cluster_offset = bdrv_getlength(bs->file);
cluster_offset = (cluster_offset + 511) & ~511;
bdrv_truncate(bs->file, cluster_offset);
return 0;
}
@@ -2282,9 +2083,6 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
.lazy_refcounts = s->compatible_features &
QCOW2_COMPAT_LAZY_REFCOUNTS,
.has_lazy_refcounts = true,
.corrupt = s->incompatible_features &
QCOW2_INCOMPAT_CORRUPT,
.has_corrupt = true,
};
}
@@ -2555,52 +2353,6 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
return 0;
}
/*
* If offset or size are negative, respectively, they will not be included in
* the BLOCK_IMAGE_CORRUPTED event emitted.
* fatal will be ignored for read-only BDS; corruptions found there will always
* be considered non-fatal.
*/
void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
int64_t size, const char *message_format, ...)
{
BDRVQcowState *s = bs->opaque;
char *message;
va_list ap;
fatal = fatal && !bs->read_only;
if (s->signaled_corruption &&
(!fatal || (s->incompatible_features & QCOW2_INCOMPAT_CORRUPT)))
{
return;
}
va_start(ap, message_format);
message = g_strdup_vprintf(message_format, ap);
va_end(ap);
if (fatal) {
fprintf(stderr, "qcow2: Marking image as corrupt: %s; further "
"corruption events will be suppressed\n", message);
} else {
fprintf(stderr, "qcow2: Image is corrupt: %s; further non-fatal "
"corruption events will be suppressed\n", message);
}
qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs), message,
offset >= 0, offset, size >= 0, size,
fatal, &error_abort);
g_free(message);
if (fatal) {
qcow2_mark_corrupt(bs);
bs->drv = NULL; /* make BDS unusable */
}
s->signaled_corruption = true;
}
static QemuOptsList qcow2_create_opts = {
.name = "qcow2-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
@@ -2640,8 +2392,7 @@ static QemuOptsList qcow2_create_opts = {
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, metadata, "
"falloc, full)"
.help = "Preallocation mode (allowed values: off, metadata)"
},
{
.name = BLOCK_OPT_LAZY_REFCOUNTS,

View File

@@ -64,16 +64,10 @@
#define MIN_CLUSTER_BITS 9
#define MAX_CLUSTER_BITS 21
#define MIN_L2_CACHE_SIZE 1 /* cluster */
#define L2_CACHE_SIZE 16
/* Must be at least 4 to cover all cases of refcount table growth */
#define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
/* The refblock cache needs only a fourth of the L2 cache size to cover as many
* clusters */
#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
#define REFCOUNT_CACHE_SIZE 4
#define DEFAULT_CLUSTER_SIZE 65536
@@ -83,7 +77,6 @@
#define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
#define QCOW2_OPT_DISCARD_OTHER "pass-discard-other"
#define QCOW2_OPT_OVERLAP "overlap-check"
#define QCOW2_OPT_OVERLAP_TEMPLATE "overlap-check.template"
#define QCOW2_OPT_OVERLAP_MAIN_HEADER "overlap-check.main-header"
#define QCOW2_OPT_OVERLAP_ACTIVE_L1 "overlap-check.active-l1"
#define QCOW2_OPT_OVERLAP_ACTIVE_L2 "overlap-check.active-l2"
@@ -92,9 +85,6 @@
#define QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE "overlap-check.snapshot-table"
#define QCOW2_OPT_OVERLAP_INACTIVE_L1 "overlap-check.inactive-l1"
#define QCOW2_OPT_OVERLAP_INACTIVE_L2 "overlap-check.inactive-l2"
#define QCOW2_OPT_CACHE_SIZE "cache-size"
#define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size"
#define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size"
typedef struct QCowHeader {
uint32_t magic;
@@ -262,7 +252,6 @@ typedef struct BDRVQcowState {
bool discard_passthrough[QCOW2_DISCARD_MAX];
int overlap_check; /* bitmask of Qcow2MetadataOverlap values */
bool signaled_corruption;
uint64_t incompatible_features;
uint64_t compatible_features;
@@ -479,10 +468,6 @@ int qcow2_mark_corrupt(BlockDriverState *bs);
int qcow2_mark_consistent(BlockDriverState *bs);
int qcow2_update_header(BlockDriverState *bs);
void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
int64_t size, const char *message_format, ...)
GCC_FMT_ATTR(5, 6);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);

View File

@@ -227,10 +227,8 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
};
int ret;
check.used_clusters = g_try_new0(uint32_t, (check.nclusters + 31) / 32);
if (check.nclusters && check.used_clusters == NULL) {
return -ENOMEM;
}
check.used_clusters = g_malloc0(((check.nclusters + 31) / 32) *
sizeof(check.used_clusters[0]));
check.result->bfi.total_clusters =
(s->header.image_size + s->header.cluster_size - 1) /

View File

@@ -18,8 +18,22 @@
#include "qapi/qmp/qerror.h"
#include "migration/migration.h"
static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEDAIOCB *acb = (QEDAIOCB *)blockacb;
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
bool finished = false;
/* Wait for the request to finish */
acb->finished = &finished;
while (!finished) {
aio_poll(aio_context, true);
}
}
static const AIOCBInfo qed_aiocb_info = {
.aiocb_size = sizeof(QEDAIOCB),
.cancel = qed_aio_cancel,
};
static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
@@ -634,8 +648,7 @@ static int bdrv_qed_create(const char *filename, QemuOpts *opts, Error **errp)
char *backing_fmt = NULL;
int ret;
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
image_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
cluster_size = qemu_opt_get_size_del(opts,
@@ -905,12 +918,18 @@ static void qed_aio_complete_bh(void *opaque)
BlockDriverCompletionFunc *cb = acb->common.cb;
void *user_opaque = acb->common.opaque;
int ret = acb->bh_ret;
bool *finished = acb->finished;
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
qemu_aio_release(acb);
/* Invoke callback */
cb(user_opaque, ret);
/* Signal cancel completion */
if (finished) {
*finished = true;
}
}
static void qed_aio_complete(QEDAIOCB *acb, int ret)
@@ -1221,11 +1240,7 @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
struct iovec *iov = acb->qiov->iov;
if (!iov->iov_base) {
iov->iov_base = qemu_try_blockalign(acb->common.bs, iov->iov_len);
if (iov->iov_base == NULL) {
qed_aio_complete(acb, -ENOMEM);
return;
}
iov->iov_base = qemu_blockalign(acb->common.bs, iov->iov_len);
memset(iov->iov_base, 0, iov->iov_len);
}
}
@@ -1377,6 +1392,7 @@ static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
opaque, flags);
acb->flags = flags;
acb->finished = NULL;
acb->qiov = qiov;
acb->qiov_offset = 0;
acb->cur_pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;

View File

@@ -16,12 +16,7 @@
#include <gnutls/gnutls.h>
#include <gnutls/crypto.h>
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qlist.h"
#include "qapi/qmp/qstring.h"
#include "qapi-event.h"
#define HASH_LENGTH 32
@@ -29,7 +24,6 @@
#define QUORUM_OPT_VOTE_THRESHOLD "vote-threshold"
#define QUORUM_OPT_BLKVERIFY "blkverify"
#define QUORUM_OPT_REWRITE "rewrite-corrupted"
#define QUORUM_OPT_READ_PATTERN "read-pattern"
/* This union holds a vote hash value */
typedef union QuorumVoteValue {
@@ -80,8 +74,6 @@ typedef struct BDRVQuorumState {
bool rewrite_corrupted;/* true if the driver must rewrite-on-read corrupted
* block if Quorum is reached.
*/
QuorumReadPattern read_pattern;
} BDRVQuorumState;
typedef struct QuorumAIOCB QuorumAIOCB;
@@ -125,7 +117,6 @@ struct QuorumAIOCB {
bool is_read;
int vote_ret;
int child_iter; /* which child to read in fifo pattern */
};
static bool quorum_vote(QuorumAIOCB *acb);
@@ -138,19 +129,21 @@ static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
/* cancel all callbacks */
for (i = 0; i < s->num_children; i++) {
if (acb->qcrs[i].aiocb) {
bdrv_aio_cancel_async(acb->qcrs[i].aiocb);
}
bdrv_aio_cancel(acb->qcrs[i].aiocb);
}
g_free(acb->qcrs);
qemu_aio_release(acb);
}
static AIOCBInfo quorum_aiocb_info = {
.aiocb_size = sizeof(QuorumAIOCB),
.cancel_async = quorum_aio_cancel,
.cancel = quorum_aio_cancel,
};
static void quorum_aio_finalize(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
int i, ret = 0;
if (acb->vote_ret) {
@@ -160,15 +153,14 @@ static void quorum_aio_finalize(QuorumAIOCB *acb)
acb->common.cb(acb->common.opaque, ret);
if (acb->is_read) {
/* on the quorum case acb->child_iter == s->num_children - 1 */
for (i = 0; i <= acb->child_iter; i++) {
for (i = 0; i < s->num_children; i++) {
qemu_vfree(acb->qcrs[i].buf);
qemu_iovec_destroy(&acb->qcrs[i].qiov);
}
}
g_free(acb->qcrs);
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static bool quorum_sha256_compare(QuorumVoteValue *a, QuorumVoteValue *b)
@@ -264,21 +256,6 @@ static void quorum_rewrite_aio_cb(void *opaque, int ret)
quorum_aio_finalize(acb);
}
static BlockDriverAIOCB *read_fifo_child(QuorumAIOCB *acb);
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
{
int i;
assert(dest->niov == source->niov);
assert(dest->size == source->size);
for (i = 0; i < source->niov; i++) {
assert(dest->iov[i].iov_len == source->iov[i].iov_len);
memcpy(dest->iov[i].iov_base,
source->iov[i].iov_base,
source->iov[i].iov_len);
}
}
static void quorum_aio_cb(void *opaque, int ret)
{
QuorumChildRequest *sacb = opaque;
@@ -286,21 +263,6 @@ static void quorum_aio_cb(void *opaque, int ret)
BDRVQuorumState *s = acb->common.bs->opaque;
bool rewrite = false;
if (acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO) {
/* We try to read next child in FIFO order if we fail to read */
if (ret < 0 && ++acb->child_iter < s->num_children) {
read_fifo_child(acb);
return;
}
if (ret == 0) {
quorum_copy_qiov(acb->qiov, &acb->qcrs[acb->child_iter].qiov);
}
acb->vote_ret = ret;
quorum_aio_finalize(acb);
return;
}
sacb->ret = ret;
acb->count++;
if (ret == 0) {
@@ -381,6 +343,19 @@ static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
return count;
}
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
{
int i;
assert(dest->niov == source->niov);
assert(dest->size == source->size);
for (i = 0; i < source->niov; i++) {
assert(dest->iov[i].iov_len == source->iov[i].iov_len);
memcpy(dest->iov[i].iov_base,
source->iov[i].iov_base,
source->iov[i].iov_len);
}
}
static void quorum_count_vote(QuorumVotes *votes,
QuorumVoteValue *value,
int index)
@@ -640,60 +615,32 @@ free_exit:
return rewrite;
}
static BlockDriverAIOCB *read_quorum_children(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
int i;
for (i = 0; i < s->num_children; i++) {
acb->qcrs[i].buf = qemu_blockalign(s->bs[i], acb->qiov->size);
qemu_iovec_init(&acb->qcrs[i].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[i].qiov, acb->qiov, acb->qcrs[i].buf);
}
for (i = 0; i < s->num_children; i++) {
bdrv_aio_readv(s->bs[i], acb->sector_num, &acb->qcrs[i].qiov,
acb->nb_sectors, quorum_aio_cb, &acb->qcrs[i]);
}
return &acb->common;
}
static BlockDriverAIOCB *read_fifo_child(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
acb->qcrs[acb->child_iter].buf = qemu_blockalign(s->bs[acb->child_iter],
acb->qiov->size);
qemu_iovec_init(&acb->qcrs[acb->child_iter].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[acb->child_iter].qiov, acb->qiov,
acb->qcrs[acb->child_iter].buf);
bdrv_aio_readv(s->bs[acb->child_iter], acb->sector_num,
&acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
quorum_aio_cb, &acb->qcrs[acb->child_iter]);
return &acb->common;
}
static BlockDriverAIOCB *quorum_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BDRVQuorumState *s = bs->opaque;
QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num,
nb_sectors, cb, opaque);
int i;
acb->is_read = true;
if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
acb->child_iter = s->num_children - 1;
return read_quorum_children(acb);
for (i = 0; i < s->num_children; i++) {
acb->qcrs[i].buf = qemu_blockalign(s->bs[i], qiov->size);
qemu_iovec_init(&acb->qcrs[i].qiov, qiov->niov);
qemu_iovec_clone(&acb->qcrs[i].qiov, qiov, acb->qcrs[i].buf);
}
acb->child_iter = 0;
return read_fifo_child(acb);
for (i = 0; i < s->num_children; i++) {
bdrv_aio_readv(s->bs[i], sector_num, &acb->qcrs[i].qiov, nb_sectors,
quorum_aio_cb, &acb->qcrs[i]);
}
return &acb->common;
}
static BlockDriverAIOCB *quorum_aio_writev(BlockDriverState *bs,
@@ -835,39 +782,16 @@ static QemuOptsList quorum_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Rewrite corrupted block on read quorum",
},
{
.name = QUORUM_OPT_READ_PATTERN,
.type = QEMU_OPT_STRING,
.help = "Allowed pattern: quorum, fifo. Quorum is default",
},
{ /* end of list */ }
},
};
static int parse_read_pattern(const char *opt)
{
int i;
if (!opt) {
/* Set quorum as default */
return QUORUM_READ_PATTERN_QUORUM;
}
for (i = 0; i < QUORUM_READ_PATTERN_MAX; i++) {
if (!strcmp(opt, QuorumReadPattern_lookup[i])) {
return i;
}
}
return -EINVAL;
}
static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVQuorumState *s = bs->opaque;
Error *local_err = NULL;
QemuOpts *opts = NULL;
QemuOpts *opts;
bool *opened;
QDict *sub = NULL;
QList *list = NULL;
@@ -903,37 +827,28 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
}
s->threshold = qemu_opt_get_number(opts, QUORUM_OPT_VOTE_THRESHOLD, 0);
ret = parse_read_pattern(qemu_opt_get(opts, QUORUM_OPT_READ_PATTERN));
/* and validate it against s->num_children */
ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
if (ret < 0) {
error_setg(&local_err, "Please set read-pattern as fifo or quorum");
goto exit;
}
s->read_pattern = ret;
if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
/* and validate it against s->num_children */
ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
if (ret < 0) {
goto exit;
}
/* is the driver in blkverify mode */
if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
s->num_children == 2 && s->threshold == 2) {
s->is_blkverify = true;
} else if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false)) {
fprintf(stderr, "blkverify mode is set by setting blkverify=on "
"and using two files with vote_threshold=2\n");
}
/* is the driver in blkverify mode */
if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
s->num_children == 2 && s->threshold == 2) {
s->is_blkverify = true;
} else if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false)) {
fprintf(stderr, "blkverify mode is set by setting blkverify=on "
"and using two files with vote_threshold=2\n");
}
s->rewrite_corrupted = qemu_opt_get_bool(opts, QUORUM_OPT_REWRITE,
false);
if (s->rewrite_corrupted && s->is_blkverify) {
error_setg(&local_err,
"rewrite-corrupted=on cannot be used with blkverify=on");
ret = -EINVAL;
goto exit;
}
s->rewrite_corrupted = qemu_opt_get_bool(opts, QUORUM_OPT_REWRITE, false);
if (s->rewrite_corrupted && s->is_blkverify) {
error_setg(&local_err,
"rewrite-corrupted=on cannot be used with blkverify=on");
ret = -EINVAL;
goto exit;
}
/* allocate the children BlockDriverState array */
@@ -988,7 +903,6 @@ close_exit:
g_free(s->bs);
g_free(opened);
exit:
qemu_opts_del(opts);
/* propagate error */
if (local_err) {
error_propagate(errp, local_err);
@@ -1031,39 +945,6 @@ static void quorum_attach_aio_context(BlockDriverState *bs,
}
}
static void quorum_refresh_filename(BlockDriverState *bs)
{
BDRVQuorumState *s = bs->opaque;
QDict *opts;
QList *children;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_refresh_filename(s->bs[i]);
if (!s->bs[i]->full_open_options) {
return;
}
}
children = qlist_new();
for (i = 0; i < s->num_children; i++) {
QINCREF(s->bs[i]->full_open_options);
qlist_append_obj(children, QOBJECT(s->bs[i]->full_open_options));
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("quorum")));
qdict_put_obj(opts, QUORUM_OPT_VOTE_THRESHOLD,
QOBJECT(qint_from_int(s->threshold)));
qdict_put_obj(opts, QUORUM_OPT_BLKVERIFY,
QOBJECT(qbool_from_int(s->is_blkverify)));
qdict_put_obj(opts, QUORUM_OPT_REWRITE,
QOBJECT(qbool_from_int(s->rewrite_corrupted)));
qdict_put_obj(opts, "children", QOBJECT(children));
bs->full_open_options = opts;
}
static BlockDriver bdrv_quorum = {
.format_name = "quorum",
.protocol_name = "quorum",
@@ -1072,7 +953,6 @@ static BlockDriver bdrv_quorum = {
.bdrv_file_open = quorum_open,
.bdrv_close = quorum_close,
.bdrv_refresh_filename = quorum_refresh_filename,
.bdrv_co_flush_to_disk = quorum_co_flush,

View File

@@ -30,7 +30,6 @@
#include "block/thread-pool.h"
#include "qemu/iov.h"
#include "raw-aio.h"
#include "qapi/util.h"
#if defined(__APPLE__) && (__MACH__)
#include <paths.h>
@@ -518,7 +517,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
s = state->bs->opaque;
state->opaque = g_new0(BDRVRawReopenState, 1);
state->opaque = g_malloc0(sizeof(BDRVRawReopenState));
raw_s = state->opaque;
#ifdef CONFIG_LINUX_AIO
@@ -808,11 +807,7 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
* Ok, we have to do it the hard way, copy all segments into
* a single aligned buffer.
*/
buf = qemu_try_blockalign(aiocb->bs, aiocb->aio_nbytes);
if (buf == NULL) {
return -ENOMEM;
}
buf = qemu_blockalign(aiocb->bs, aiocb->aio_nbytes);
if (aiocb->aio_type & QEMU_AIO_WRITE) {
char *p = buf;
int i;
@@ -1366,102 +1361,44 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
int result = 0;
int64_t total_size = 0;
bool nocow = false;
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
strstart(filename, "file:", &filename);
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
&local_err);
g_free(buf);
if (local_err) {
error_propagate(errp, local_err);
result = -EINVAL;
goto out;
}
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not create file");
goto out;
}
if (nocow) {
} else {
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
* will be ignored since any failure of this operation should not
* block the left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
if (ftruncate(fd, total_size) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
goto out_close;
}
switch (prealloc) {
#ifdef CONFIG_POSIX_FALLOCATE
case PREALLOC_MODE_FALLOC:
/* posix_fallocate() doesn't set errno. */
result = -posix_fallocate(fd, 0, total_size);
if (result != 0) {
error_setg_errno(errp, -result,
"Could not preallocate data for the new file");
}
break;
#endif
case PREALLOC_MODE_FULL:
{
int64_t num = 0, left = total_size;
buf = g_malloc0(65536);
while (left > 0) {
num = MIN(left, 65536);
result = write(fd, buf, num);
if (result < 0) {
result = -errno;
error_setg_errno(errp, -result,
"Could not write to the new file");
break;
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
* will be ignored since any failure of this operation should not
* block the left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
left -= num;
#endif
}
fsync(fd);
g_free(buf);
break;
}
case PREALLOC_MODE_OFF:
break;
default:
result = -EINVAL;
error_setg(errp, "Unsupported preallocation mode: %s",
PreallocMode_lookup[prealloc]);
break;
}
out_close:
if (qemu_close(fd) != 0 && result == 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
}
if (qemu_close(fd) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
}
}
out:
return result;
}
@@ -1644,11 +1581,6 @@ static QemuOptsList raw_create_opts = {
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, falloc, full)"
},
{ /* end of list */ }
}
};
@@ -2030,8 +1962,8 @@ static int hdev_create(const char *filename, QemuOpts *opts,
(void)has_prefix;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
fd = qemu_open(filename, O_WRONLY | O_BINARY);
if (fd < 0) {
@@ -2047,7 +1979,7 @@ static int hdev_create(const char *filename, QemuOpts *opts,
error_setg(errp,
"The given file is neither a block nor a character device");
ret = -ENODEV;
} else if (lseek(fd, 0, SEEK_END) < total_size) {
} else if (lseek(fd, 0, SEEK_END) < total_size * BDRV_SECTOR_SIZE) {
error_setg(errp, "Device is too small");
ret = -ENOSPC;
}

View File

@@ -511,8 +511,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
strstart(filename, "file:", &filename);
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size =
qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
@@ -521,7 +521,7 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size);
ftruncate(fd, total_size * 512);
qemu_close(fd);
return 0;
}

View File

@@ -77,6 +77,7 @@ typedef struct RBDAIOCB {
int64_t sector_num;
int error;
struct BDRVRBDState *s;
int cancelled;
int status;
} RBDAIOCB;
@@ -313,8 +314,7 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
}
/* Read out options */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
bytes = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
objsize = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE, 0);
if (objsize) {
if ((objsize - 1) & objsize) { /* not a power of 2? */
@@ -407,7 +407,9 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
acb->status = 0;
qemu_aio_unref(acb);
if (!acb->cancelled) {
qemu_aio_release(acb);
}
}
/* TODO Convert to fine grained options */
@@ -536,8 +538,25 @@ static void qemu_rbd_close(BlockDriverState *bs)
rados_shutdown(s->cluster);
}
/*
* Cancel aio. Since we don't reference acb in a non qemu threads,
* it is safe to access it here.
*/
static void qemu_rbd_aio_cancel(BlockDriverAIOCB *blockacb)
{
RBDAIOCB *acb = (RBDAIOCB *) blockacb;
acb->cancelled = 1;
while (acb->status == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(acb->common.bs), true);
}
qemu_aio_release(acb);
}
static const AIOCBInfo rbd_aiocb_info = {
.aiocb_size = sizeof(RBDAIOCB),
.cancel = qemu_rbd_aio_cancel,
};
static void rbd_finish_bh(void *opaque)
@@ -598,7 +617,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
RBDAIOCmd cmd)
{
RBDAIOCB *acb;
RADOSCB *rcb = NULL;
RADOSCB *rcb;
rbd_completion_t c;
int64_t off, size;
char *buf;
@@ -612,14 +631,12 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
acb->bounce = NULL;
} else {
acb->bounce = qemu_try_blockalign(bs, qiov->size);
if (acb->bounce == NULL) {
goto failed;
}
acb->bounce = qemu_blockalign(bs, qiov->size);
}
acb->ret = 0;
acb->error = 0;
acb->s = s;
acb->cancelled = 0;
acb->bh = NULL;
acb->status = -EINPROGRESS;
@@ -632,7 +649,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
off = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
rcb = g_new(RADOSCB, 1);
rcb = g_malloc(sizeof(RADOSCB));
rcb->done = 0;
rcb->acb = acb;
rcb->buf = buf;
@@ -671,7 +688,7 @@ failed_completion:
failed:
g_free(rcb);
qemu_vfree(acb->bounce);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return NULL;
}
@@ -842,7 +859,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
int max_snaps = RBD_MAX_SNAPS;
do {
snaps = g_new(rbd_snap_info_t, max_snaps);
snaps = g_malloc(sizeof(*snaps) * max_snaps);
snap_count = rbd_snap_list(s->image, snaps, &max_snaps);
if (snap_count <= 0) {
g_free(snaps);
@@ -853,7 +870,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
goto done;
}
sn_tab = g_new0(QEMUSnapshotInfo, snap_count);
sn_tab = g_malloc0(snap_count * sizeof(QEMUSnapshotInfo));
for (i = 0; i < snap_count; i++) {
const char *snap_name = snaps[i].name;

View File

@@ -103,9 +103,6 @@
#define SD_INODE_SIZE (sizeof(SheepdogInode))
#define CURRENT_VDI_ID 0
#define LOCK_TYPE_NORMAL 0
#define LOCK_TYPE_SHARED 1 /* for iSCSI multipath */
typedef struct SheepdogReq {
uint8_t proto_ver;
uint8_t opcode;
@@ -169,8 +166,7 @@ typedef struct SheepdogVdiReq {
uint8_t copy_policy;
uint8_t reserved[2];
uint32_t snapid;
uint32_t type;
uint32_t pad[2];
uint32_t pad[3];
} SheepdogVdiReq;
typedef struct SheepdogVdiRsp {
@@ -315,6 +311,7 @@ struct SheepdogAIOCB {
void (*aio_done_func)(SheepdogAIOCB *);
bool cancelable;
bool *finished;
int nr_pending;
};
@@ -445,7 +442,10 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
{
qemu_coroutine_enter(acb->coroutine, NULL);
qemu_aio_unref(acb);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
}
/*
@@ -478,33 +478,36 @@ static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
SheepdogAIOCB *acb = (SheepdogAIOCB *)blockacb;
BDRVSheepdogState *s = acb->common.bs->opaque;
AIOReq *aioreq, *next;
bool finished = false;
if (sd_acb_cancelable(acb)) {
/* Remove outstanding requests from pending and failed queues. */
QLIST_FOREACH_SAFE(aioreq, &s->pending_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
acb->finished = &finished;
while (!finished) {
if (sd_acb_cancelable(acb)) {
/* Remove outstanding requests from pending and failed queues. */
QLIST_FOREACH_SAFE(aioreq, &s->pending_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
}
}
QLIST_FOREACH_SAFE(aioreq, &s->failed_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
QLIST_FOREACH_SAFE(aioreq, &s->failed_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
}
}
assert(acb->nr_pending == 0);
if (acb->common.cb) {
acb->common.cb(acb->common.opaque, -ECANCELED);
assert(acb->nr_pending == 0);
sd_finish_aiocb(acb);
return;
}
sd_finish_aiocb(acb);
aio_poll(s->aio_context, true);
}
}
static const AIOCBInfo sd_aiocb_info = {
.aiocb_size = sizeof(SheepdogAIOCB),
.cancel_async = sd_aio_cancel,
.aiocb_size = sizeof(SheepdogAIOCB),
.cancel = sd_aio_cancel,
};
static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
@@ -521,6 +524,7 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
acb->aio_done_func = NULL;
acb->cancelable = true;
acb->finished = NULL;
acb->coroutine = qemu_coroutine_self();
acb->ret = 0;
acb->nr_pending = 0;
@@ -708,6 +712,7 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid)
static coroutine_fn void reconnect_to_sdog(void *opaque)
{
Error *local_err = NULL;
BDRVSheepdogState *s = opaque;
AIOReq *aio_req, *next;
@@ -722,7 +727,6 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
/* Try to reconnect the sheepdog server every one second. */
while (s->fd < 0) {
Error *local_err = NULL;
s->fd = get_sheep_fd(s, &local_err);
if (s->fd < 0) {
DPRINTF("Wait for connection to be established\n");
@@ -1086,7 +1090,6 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
memset(&hdr, 0, sizeof(hdr));
if (lock) {
hdr.opcode = SD_OP_LOCK_VDI;
hdr.type = LOCK_TYPE_NORMAL;
} else {
hdr.opcode = SD_OP_GET_VDI_INFO;
}
@@ -1107,8 +1110,6 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
sd_strerror(rsp->result), filename, snapid, tag);
if (rsp->result == SD_RES_NO_VDI) {
ret = -ENOENT;
} else if (rsp->result == SD_RES_VDI_LOCKED) {
ret = -EBUSY;
} else {
ret = -EIO;
}
@@ -1681,7 +1682,7 @@ static int sd_create(const char *filename, QemuOpts *opts,
uint32_t snapid;
bool prealloc = false;
s = g_new0(BDRVSheepdogState, 1);
s = g_malloc0(sizeof(BDRVSheepdogState));
memset(tag, 0, sizeof(tag));
if (strstr(filename, "://")) {
@@ -1694,8 +1695,7 @@ static int sd_create(const char *filename, QemuOpts *opts,
goto out;
}
s->inode.vdi_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
s->inode.vdi_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!buf || !strcmp(buf, "off")) {
@@ -1793,7 +1793,6 @@ static void sd_close(BlockDriverState *bs)
memset(&hdr, 0, sizeof(hdr));
hdr.opcode = SD_OP_RELEASE_VDI;
hdr.type = LOCK_TYPE_NORMAL;
hdr.base_vdi_id = s->inode.vdi_id;
wlen = strlen(s->name) + 1;
hdr.data_length = wlen;
@@ -2130,7 +2129,7 @@ static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
ret = sd_co_rw_vector(acb);
if (ret <= 0) {
qemu_aio_unref(acb);
qemu_aio_release(acb);
return ret;
}
@@ -2151,7 +2150,7 @@ static coroutine_fn int sd_co_readv(BlockDriverState *bs, int64_t sector_num,
ret = sd_co_rw_vector(acb);
if (ret <= 0) {
qemu_aio_unref(acb);
qemu_aio_release(acb);
return ret;
}
@@ -2274,7 +2273,7 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
uint32_t snapid = 0;
int ret = 0;
old_s = g_new(BDRVSheepdogState, 1);
old_s = g_malloc(sizeof(BDRVSheepdogState));
memcpy(old_s, s, sizeof(BDRVSheepdogState));
@@ -2358,7 +2357,7 @@ static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
goto out;
}
sn_tab = g_new0(QEMUSnapshotInfo, nr);
sn_tab = g_malloc0(nr * sizeof(*sn_tab));
/* calculate a vdi id with hash function */
hval = fnv_64a_buf(s->name, strlen(s->name), FNV1A_64_INIT);
@@ -2510,7 +2509,7 @@ static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
ret = sd_co_rw_vector(acb);
if (ret <= 0) {
qemu_aio_unref(acb);
qemu_aio_release(acb);
return ret;
}

View File

@@ -517,11 +517,6 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
const char *host, *user, *path, *host_key_check;
int port;
if (!qdict_haskey(options, "host")) {
ret = -EINVAL;
error_setg(errp, "No hostname was specified");
goto err;
}
host = qdict_get_str(options, "host");
if (qdict_haskey(options, "port")) {
@@ -530,11 +525,6 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
port = 22;
}
if (!qdict_haskey(options, "path")) {
ret = -EINVAL;
error_setg(errp, "No path was specified");
goto err;
}
path = qdict_get_str(options, "path");
if (qdict_haskey(options, "user")) {
@@ -710,8 +700,7 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
ssh_state_init(&s);
/* Get desired file size. */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
DPRINTF("total_size=%" PRIi64, total_size);
uri_options = qdict_new();

View File

@@ -53,6 +53,13 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "migration/migration.h"
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
@@ -292,12 +299,7 @@ static int vdi_check(BlockDriverState *bs, BdrvCheckResult *res,
return -ENOTSUP;
}
bmap = g_try_new(uint32_t, s->header.blocks_in_image);
if (s->header.blocks_in_image && bmap == NULL) {
res->check_errors++;
return -ENOMEM;
}
bmap = g_malloc(s->header.blocks_in_image * sizeof(uint32_t));
memset(bmap, 0xff, s->header.blocks_in_image * sizeof(uint32_t));
/* Check block map and value of blocks_allocated. */
@@ -355,23 +357,23 @@ static int vdi_make_empty(BlockDriverState *bs)
static int vdi_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const VdiHeader *header = (const VdiHeader *)buf;
int ret = 0;
int result = 0;
logout("\n");
if (buf_size < sizeof(*header)) {
/* Header too small, no VDI. */
} else if (le32_to_cpu(header->signature) == VDI_SIGNATURE) {
ret = 100;
result = 100;
}
if (ret == 0) {
if (result == 0) {
logout("no vdi image\n");
} else {
logout("%s", header->text);
}
return ret;
return result;
}
static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
@@ -476,12 +478,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
bmap_size = header.blocks_in_image * sizeof(uint32_t);
bmap_size = (bmap_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
s->bmap = qemu_try_blockalign(bs->file, bmap_size * SECTOR_SIZE);
if (s->bmap == NULL) {
ret = -ENOMEM;
goto fail;
}
s->bmap = g_malloc(bmap_size * SECTOR_SIZE);
ret = bdrv_read(bs->file, s->bmap_sector, (uint8_t *)s->bmap, bmap_size);
if (ret < 0) {
goto fail_free_bmap;
@@ -496,7 +493,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
return 0;
fail_free_bmap:
qemu_vfree(s->bmap);
g_free(s->bmap);
fail:
return ret;
@@ -684,7 +681,8 @@ static int vdi_co_write(BlockDriverState *bs,
static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
{
int ret = 0;
int fd;
int result = 0;
uint64_t bytes = 0;
uint32_t blocks;
size_t block_size = DEFAULT_CLUSTER_SIZE;
@@ -692,16 +690,12 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
VdiHeader header;
size_t i;
size_t bmap_size;
int64_t offset = 0;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
uint32_t *bmap = NULL;
bool nocow = false;
logout("\n");
/* Read out options. */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
bytes = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
#if defined(CONFIG_VDI_BLOCK_SIZE)
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = qemu_opt_get_size_del(opts,
@@ -713,25 +707,37 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
image_type = VDI_TYPE_STATIC;
}
#endif
nocow = qemu_opt_get_bool_del(opts, BLOCK_OPT_NOCOW, false);
if (bytes > VDI_DISK_SIZE_MAX) {
ret = -ENOTSUP;
result = -ENOTSUP;
error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")",
bytes, VDI_DISK_SIZE_MAX);
goto exit;
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
0644);
if (fd < 0) {
result = -errno;
goto exit;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value will
* be ignored since any failure of this operation should not block the
* left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
/* We need enough blocks to store the given disk size,
@@ -763,20 +769,13 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
vdi_header_print(&header);
#endif
vdi_header_to_le(&header);
ret = bdrv_pwrite_sync(bs, offset, &header, sizeof(header));
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
goto exit;
if (write(fd, &header, sizeof(header)) < 0) {
result = -errno;
goto close_and_exit;
}
offset += sizeof(header);
if (bmap_size > 0) {
bmap = g_try_malloc0(bmap_size);
if (bmap == NULL) {
ret = -ENOMEM;
error_setg(errp, "Could not allocate bmap");
goto exit;
}
uint32_t *bmap = g_malloc0(bmap_size);
for (i = 0; i < blocks; i++) {
if (image_type == VDI_TYPE_STATIC) {
bmap[i] = i;
@@ -784,33 +783,35 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
bmap[i] = VDI_UNALLOCATED;
}
}
ret = bdrv_pwrite_sync(bs, offset, bmap, bmap_size);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
goto exit;
if (write(fd, bmap, bmap_size) < 0) {
result = -errno;
g_free(bmap);
goto close_and_exit;
}
offset += bmap_size;
g_free(bmap);
}
if (image_type == VDI_TYPE_STATIC) {
ret = bdrv_truncate(bs, offset + blocks * block_size);
if (ret < 0) {
error_setg(errp, "Failed to statically allocate %s", filename);
goto exit;
if (ftruncate(fd, sizeof(header) + bmap_size + blocks * block_size)) {
result = -errno;
goto close_and_exit;
}
}
close_and_exit:
if ((close(fd) < 0) && !result) {
result = -errno;
}
exit:
bdrv_unref(bs);
g_free(bmap);
return ret;
return result;
}
static void vdi_close(BlockDriverState *bs)
{
BDRVVdiState *s = bs->opaque;
qemu_vfree(s->bmap);
g_free(s->bmap);
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);

View File

@@ -82,6 +82,8 @@ void vhdx_log_desc_le_import(VHDXLogDescriptor *d)
assert(d != NULL);
le32_to_cpus(&d->signature);
le32_to_cpus(&d->trailing_bytes);
le64_to_cpus(&d->leading_bytes);
le64_to_cpus(&d->file_offset);
le64_to_cpus(&d->sequence_number);
}
@@ -97,15 +99,6 @@ void vhdx_log_desc_le_export(VHDXLogDescriptor *d)
cpu_to_le64s(&d->sequence_number);
}
void vhdx_log_data_le_import(VHDXLogDataSector *d)
{
assert(d != NULL);
le32_to_cpus(&d->data_signature);
le32_to_cpus(&d->sequence_high);
le32_to_cpus(&d->sequence_low);
}
void vhdx_log_data_le_export(VHDXLogDataSector *d)
{
assert(d != NULL);

View File

@@ -84,7 +84,6 @@ static int vhdx_log_peek_hdr(BlockDriverState *bs, VHDXLogEntries *log,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(hdr);
exit:
return ret;
@@ -212,7 +211,7 @@ static bool vhdx_log_hdr_is_valid(VHDXLogEntries *log, VHDXLogEntryHeader *hdr,
{
int valid = false;
if (hdr->signature != VHDX_LOG_SIGNATURE) {
if (memcmp(&hdr->signature, "loge", 4)) {
goto exit;
}
@@ -276,12 +275,12 @@ static bool vhdx_log_desc_is_valid(VHDXLogDescriptor *desc,
goto exit;
}
if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
if (!memcmp(&desc->signature, "zero", 4)) {
if (desc->zero_length % VHDX_LOG_SECTOR_SIZE == 0) {
/* valid */
ret = true;
}
} else if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
} else if (!memcmp(&desc->signature, "desc", 4)) {
/* valid */
ret = true;
}
@@ -328,15 +327,13 @@ static int vhdx_compute_desc_sectors(uint32_t desc_cnt)
* passed into this function. Each descriptor will also be validated,
* and error returned if any are invalid. */
static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
VHDXLogEntries *log, VHDXLogDescEntries **buffer,
bool convert_endian)
VHDXLogEntries *log, VHDXLogDescEntries **buffer)
{
int ret = 0;
uint32_t desc_sectors;
uint32_t sectors_read;
VHDXLogEntryHeader hdr;
VHDXLogDescEntries *desc_entries = NULL;
VHDXLogDescriptor desc;
int i;
assert(*buffer == NULL);
@@ -345,19 +342,14 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
ret = -EINVAL;
goto exit;
}
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
desc_entries = qemu_try_blockalign(bs->file,
desc_sectors * VHDX_LOG_SECTOR_SIZE);
if (desc_entries == NULL) {
ret = -ENOMEM;
goto exit;
}
desc_entries = qemu_blockalign(bs, desc_sectors * VHDX_LOG_SECTOR_SIZE);
ret = vhdx_log_read_sectors(bs, log, &sectors_read, desc_entries,
desc_sectors, false);
@@ -371,19 +363,12 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
/* put in proper endianness, and validate each desc */
for (i = 0; i < hdr.descriptor_count; i++) {
desc = desc_entries->desc[i];
vhdx_log_desc_le_import(&desc);
if (convert_endian) {
desc_entries->desc[i] = desc;
}
if (vhdx_log_desc_is_valid(&desc, &hdr) == false) {
vhdx_log_desc_le_import(&desc_entries->desc[i]);
if (vhdx_log_desc_is_valid(&desc_entries->desc[i], &hdr) == false) {
ret = -EINVAL;
goto free_and_exit;
}
}
if (convert_endian) {
desc_entries->hdr = hdr;
}
*buffer = desc_entries;
goto exit;
@@ -418,7 +403,7 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
buffer = qemu_blockalign(bs, VHDX_LOG_SECTOR_SIZE);
if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
if (!memcmp(&desc->signature, "desc", 4)) {
/* data sector */
if (data == NULL) {
ret = -EFAULT;
@@ -446,15 +431,10 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
memcpy(buffer+offset, &desc->trailing_bytes, 4);
} else if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
} else if (!memcmp(&desc->signature, "zero", 4)) {
/* write 'count' sectors of sector */
memset(buffer, 0, VHDX_LOG_SECTOR_SIZE);
count = desc->zero_length / VHDX_LOG_SECTOR_SIZE;
} else {
error_report("Invalid VHDX log descriptor entry signature 0x%" PRIx32,
desc->signature);
ret = -EINVAL;
goto exit;
}
file_offset = desc->file_offset;
@@ -513,13 +493,13 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
goto exit;
}
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries, true);
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries);
if (ret < 0) {
goto exit;
}
for (i = 0; i < desc_entries->hdr.descriptor_count; i++) {
if (desc_entries->desc[i].signature == VHDX_LOG_DESC_SIGNATURE) {
if (!memcmp(&desc_entries->desc[i].signature, "desc", 4)) {
/* data sector, so read a sector to flush */
ret = vhdx_log_read_sectors(bs, &logs->log, &sectors_read,
data, 1, false);
@@ -530,7 +510,6 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
ret = -EINVAL;
goto exit;
}
vhdx_log_data_le_import(data);
}
ret = vhdx_log_flush_desc(bs, &desc_entries->desc[i], data);
@@ -579,6 +558,9 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
goto inc_and_exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
goto inc_and_exit;
}
@@ -591,13 +573,13 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
/* Read all log sectors, and calculate log checksum */
/* Read desc sectors, and calculate log checksum */
total_sectors = hdr.entry_length / VHDX_LOG_SECTOR_SIZE;
/* read_desc() will increment the read idx */
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer, false);
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer);
if (ret < 0) {
goto free_and_exit;
}
@@ -620,7 +602,7 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
}
}
crc ^= 0xffffffff;
if (crc != hdr.checksum) {
if (crc != desc_buffer->hdr.checksum) {
goto free_and_exit;
}
@@ -923,7 +905,7 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
buffer = qemu_blockalign(bs, total_length);
memcpy(buffer, &new_hdr, sizeof(new_hdr));
new_desc = buffer + sizeof(new_hdr);
new_desc = (VHDXLogDescriptor *) (buffer + sizeof(new_hdr));
data_sector = buffer + (desc_sectors * VHDX_LOG_SECTOR_SIZE);
data_tmp = data;
@@ -980,6 +962,7 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
* last data sector */
vhdx_update_checksum(buffer, total_length,
offsetof(VHDXLogEntryHeader, checksum));
cpu_to_le32s((uint32_t *)(buffer + 4));
/* now write to the log */
ret = vhdx_log_write_sectors(bs, &s->log, &sectors_written, buffer,

View File

@@ -99,8 +99,7 @@ static const MSGUID logical_sector_guid = { .data1 = 0x8141bf1d,
/* Each parent type must have a valid GUID; this is for parent images
* of type 'VHDX'. If we were to allow e.g. a QCOW2 parent, we would
* need to make up our own QCOW2 GUID type */
static const MSGUID parent_vhdx_guid __attribute__((unused))
= { .data1 = 0xb04aefb7,
static const MSGUID parent_vhdx_guid = { .data1 = 0xb04aefb7,
.data2 = 0xd19e,
.data3 = 0x4a81,
.data4 = { 0xb7, 0x89, 0x25, 0xb8,
@@ -136,8 +135,10 @@ typedef struct VHDXSectorInfo {
* buf: buffer pointer
* size: size of buffer (must be > crc_offset+4)
*
* Note: The buffer should have all multi-byte data in little-endian format,
* and the resulting checksum is in little endian format.
* Note: The resulting checksum is in the CPU endianness, not necessarily
* in the file format endianness (LE). Any header export to disk should
* make sure that vhdx_header_le_export() is used to convert to the
* correct endianness
*/
uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
{
@@ -148,7 +149,6 @@ uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
memset(buf + crc_offset, 0, sizeof(crc));
crc = crc32c(0xffffffff, buf, size);
cpu_to_le32s(&crc);
memcpy(buf + crc_offset, &crc, sizeof(crc));
return crc;
@@ -300,7 +300,7 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
{
uint8_t *buffer = NULL;
int ret;
VHDXHeader *header_le;
VHDXHeader header_le;
assert(bs_file != NULL);
assert(hdr != NULL);
@@ -321,12 +321,11 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
}
/* overwrite the actual VHDXHeader portion */
header_le = (VHDXHeader *)buffer;
memcpy(header_le, hdr, sizeof(VHDXHeader));
vhdx_header_le_export(hdr, header_le);
vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
ret = bdrv_pwrite_sync(bs_file, offset, header_le, sizeof(VHDXHeader));
memcpy(buffer, hdr, sizeof(VHDXHeader));
hdr->checksum = vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
vhdx_header_le_export(hdr, &header_le);
ret = bdrv_pwrite_sync(bs_file, offset, &header_le, sizeof(VHDXHeader));
exit:
qemu_vfree(buffer);
@@ -433,14 +432,13 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header1, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header1);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header1);
if (header1->signature == VHDX_HEADER_SIGNATURE &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
}
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header1->signature, "head", 4) &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
}
ret = bdrv_pread(bs->file, VHDX_HEADER2_OFFSET, buffer, VHDX_HEADER_SIZE);
@@ -449,14 +447,13 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header2, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header2);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header2);
if (header2->signature == VHDX_HEADER_SIGNATURE &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
}
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header2->signature, "head", 4) &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
}
/* If there is only 1 valid header (or no valid headers), we
@@ -522,21 +519,15 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
goto fail;
}
memcpy(&s->rt, buffer, sizeof(s->rt));
vhdx_region_header_le_import(&s->rt);
offset += sizeof(s->rt);
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4)) {
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4) ||
memcmp(&s->rt.signature, "regi", 4)) {
ret = -EINVAL;
goto fail;
}
vhdx_region_header_le_import(&s->rt);
if (s->rt.signature != VHDX_REGION_SIGNATURE) {
ret = -EINVAL;
goto fail;
}
/* Per spec, maximum region table entry count is 2047 */
if (s->rt.entry_count > 2047) {
ret = -EINVAL;
@@ -639,7 +630,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
vhdx_metadata_header_le_import(&s->metadata_hdr);
if (s->metadata_hdr.signature != VHDX_METADATA_SIGNATURE) {
if (memcmp(&s->metadata_hdr.signature, "metadata", 8)) {
ret = -EINVAL;
goto exit;
}
@@ -959,11 +950,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
}
/* s->bat is freed in vhdx_close() */
s->bat = qemu_try_blockalign(bs->file, s->bat_rt.length);
if (s->bat == NULL) {
ret = -ENOMEM;
goto fail;
}
s->bat = qemu_blockalign(bs, s->bat_rt.length);
ret = bdrv_pread(bs->file, s->bat_offset, s->bat, s->bat_rt.length);
if (ret < 0) {
@@ -1382,7 +1369,7 @@ static int vhdx_create_new_headers(BlockDriverState *bs, uint64_t image_size,
int ret = 0;
VHDXHeader *hdr = NULL;
hdr = g_new0(VHDXHeader, 1);
hdr = g_malloc0(sizeof(VHDXHeader));
hdr->signature = VHDX_HEADER_SIGNATURE;
hdr->sequence_number = g_random_int();
@@ -1408,12 +1395,6 @@ exit:
return ret;
}
#define VHDX_METADATA_ENTRY_BUFFER_SIZE \
(sizeof(VHDXFileParameters) +\
sizeof(VHDXVirtualDiskSize) +\
sizeof(VHDXPage83Data) +\
sizeof(VHDXVirtualDiskLogicalSectorSize) +\
sizeof(VHDXVirtualDiskPhysicalSectorSize))
/*
* Create the Metadata entries.
@@ -1452,7 +1433,11 @@ static int vhdx_create_new_metadata(BlockDriverState *bs,
VHDXVirtualDiskLogicalSectorSize *mt_log_sector_size;
VHDXVirtualDiskPhysicalSectorSize *mt_phys_sector_size;
entry_buffer = g_malloc0(VHDX_METADATA_ENTRY_BUFFER_SIZE);
entry_buffer = g_malloc0(sizeof(VHDXFileParameters) +
sizeof(VHDXVirtualDiskSize) +
sizeof(VHDXPage83Data) +
sizeof(VHDXVirtualDiskLogicalSectorSize) +
sizeof(VHDXVirtualDiskPhysicalSectorSize));
mt_file_params = entry_buffer;
offset += sizeof(VHDXFileParameters);
@@ -1533,7 +1518,7 @@ static int vhdx_create_new_metadata(BlockDriverState *bs,
}
ret = bdrv_pwrite(bs, metadata_offset + (64 * KiB), entry_buffer,
VHDX_METADATA_ENTRY_BUFFER_SIZE);
VHDX_HEADER_BLOCK_SIZE);
if (ret < 0) {
goto exit;
}
@@ -1555,8 +1540,7 @@ exit:
*/
static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
uint64_t image_size, VHDXImageType type,
bool use_zero_blocks, uint64_t file_offset,
uint32_t length)
bool use_zero_blocks, VHDXRegionTableEntry *rt_bat)
{
int ret = 0;
uint64_t data_file_offset;
@@ -1571,7 +1555,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
/* this gives a data start after BAT/bitmap entries, and well
* past any metadata entries (with a 4 MB buffer for future
* expansion */
data_file_offset = file_offset + length + 5 * MiB;
data_file_offset = rt_bat->file_offset + rt_bat->length + 5 * MiB;
total_sectors = image_size >> s->logical_sector_size_bits;
if (type == VHDX_TYPE_DYNAMIC) {
@@ -1595,11 +1579,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
use_zero_blocks ||
bdrv_has_zero_init(bs) == 0) {
/* for a fixed file, the default BAT entry is not zero */
s->bat = g_try_malloc0(length);
if (length && s->bat == NULL) {
ret = -ENOMEM;
goto exit;
}
s->bat = g_malloc0(rt_bat->length);
block_state = type == VHDX_TYPE_FIXED ? PAYLOAD_BLOCK_FULLY_PRESENT :
PAYLOAD_BLOCK_NOT_PRESENT;
block_state = use_zero_blocks ? PAYLOAD_BLOCK_ZERO : block_state;
@@ -1614,7 +1594,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
cpu_to_le64s(&s->bat[sinfo.bat_idx]);
sector_num += s->sectors_per_block;
}
ret = bdrv_pwrite(bs, file_offset, s->bat, length);
ret = bdrv_pwrite(bs, rt_bat->file_offset, s->bat, rt_bat->length);
if (ret < 0) {
goto exit;
}
@@ -1646,8 +1626,6 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
int ret = 0;
uint32_t offset = 0;
void *buffer = NULL;
uint64_t bat_file_offset;
uint32_t bat_length;
BDRVVHDXState *s = NULL;
VHDXRegionTableHeader *region_table;
VHDXRegionTableEntry *rt_bat;
@@ -1657,7 +1635,7 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
/* Populate enough of the BDRVVHDXState to be able to use the
* pre-existing BAT calculation, translation, and update functions */
s = g_new0(BDRVVHDXState, 1);
s = g_malloc0(sizeof(BDRVVHDXState));
s->chunk_ratio = (VHDX_MAX_SECTORS_PER_BLOCK) *
(uint64_t) sector_size / (uint64_t) block_size;
@@ -1696,26 +1674,19 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
rt_metadata->length = 1 * MiB; /* min size, and more than enough */
*metadata_offset = rt_metadata->file_offset;
bat_file_offset = rt_bat->file_offset;
bat_length = rt_bat->length;
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
vhdx_update_checksum(buffer, VHDX_HEADER_BLOCK_SIZE,
offsetof(VHDXRegionTableHeader, checksum));
/* The region table gives us the data we need to create the BAT,
* so do that now */
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks,
bat_file_offset, bat_length);
if (ret < 0) {
goto exit;
}
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks, rt_bat);
/* Now write out the region headers to disk */
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
ret = bdrv_pwrite(bs, VHDX_REGION_TABLE_OFFSET, buffer,
VHDX_HEADER_BLOCK_SIZE);
if (ret < 0) {
@@ -1728,6 +1699,7 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
goto exit;
}
exit:
g_free(s);
g_free(buffer);
@@ -1768,8 +1740,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
VHDXImageType image_type;
Error *local_err = NULL;
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
image_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
@@ -1878,6 +1849,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
}
delete_and_exit:
bdrv_unref(bs);
exit:

View File

@@ -435,7 +435,6 @@ void vhdx_header_le_import(VHDXHeader *h);
void vhdx_header_le_export(VHDXHeader *orig_h, VHDXHeader *new_h);
void vhdx_log_desc_le_import(VHDXLogDescriptor *d);
void vhdx_log_desc_le_export(VHDXLogDescriptor *d);
void vhdx_log_data_le_import(VHDXLogDataSector *d);
void vhdx_log_data_le_export(VHDXLogDataSector *d);
void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr);
void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr);

View File

@@ -106,7 +106,6 @@ typedef struct VmdkExtent {
uint32_t l2_cache_counts[L2_CACHE_SIZE];
int64_t cluster_sectors;
int64_t next_cluster_sector;
char *type;
} VmdkExtent;
@@ -125,6 +124,7 @@ typedef struct BDRVVmdkState {
} BDRVVmdkState;
typedef struct VmdkMetaData {
uint32_t offset;
unsigned int l1_index;
unsigned int l2_index;
unsigned int l2_offset;
@@ -233,7 +233,7 @@ static void vmdk_free_last_extent(BlockDriverState *bs)
return;
}
s->num_extents--;
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents);
s->extents = g_realloc(s->extents, s->num_extents * sizeof(VmdkExtent));
}
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
@@ -397,7 +397,6 @@ static int vmdk_add_extent(BlockDriverState *bs,
{
VmdkExtent *extent;
BDRVVmdkState *s = bs->opaque;
int64_t nb_sectors;
if (cluster_sectors > 0x200000) {
/* 0x200000 * 512Bytes = 1GB for one cluster is unrealistic */
@@ -413,12 +412,8 @@ static int vmdk_add_extent(BlockDriverState *bs,
return -EFBIG;
}
nb_sectors = bdrv_nb_sectors(file);
if (nb_sectors < 0) {
return nb_sectors;
}
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents + 1);
s->extents = g_realloc(s->extents,
(s->num_extents + 1) * sizeof(VmdkExtent));
extent = &s->extents[s->num_extents];
s->num_extents++;
@@ -432,7 +427,6 @@ static int vmdk_add_extent(BlockDriverState *bs,
extent->l1_entry_sectors = l2_size * cluster_sectors;
extent->l2_size = l2_size;
extent->cluster_sectors = flat ? sectors : cluster_sectors;
extent->next_cluster_sector = ROUND_UP(nb_sectors, cluster_sectors);
if (s->num_extents > 1) {
extent->end_sector = (*(extent - 1)).end_sector + extent->sectors;
@@ -454,11 +448,7 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
/* read the L1 table */
l1_size = extent->l1_size * sizeof(uint32_t);
extent->l1_table = g_try_malloc(l1_size);
if (l1_size && extent->l1_table == NULL) {
return -ENOMEM;
}
extent->l1_table = g_malloc(l1_size);
ret = bdrv_pread(extent->file,
extent->l1_table_offset,
extent->l1_table,
@@ -474,11 +464,7 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
}
if (extent->l1_backup_table_offset) {
extent->l1_backup_table = g_try_malloc(l1_size);
if (l1_size && extent->l1_backup_table == NULL) {
ret = -ENOMEM;
goto fail_l1;
}
extent->l1_backup_table = g_malloc(l1_size);
ret = bdrv_pread(extent->file,
extent->l1_backup_table_offset,
extent->l1_backup_table,
@@ -495,7 +481,7 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
}
extent->l2_cache =
g_new(uint32_t, extent->l2_size * L2_CACHE_SIZE);
g_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
return 0;
fail_l1b:
g_free(extent->l1_backup_table);
@@ -683,7 +669,8 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
if (le32_to_cpu(header.flags) & VMDK4_FLAG_RGD) {
l1_backup_offset = le64_to_cpu(header.rgd_offset) << 9;
}
if (bdrv_nb_sectors(file) < le64_to_cpu(header.grain_offset)) {
if (bdrv_getlength(file) <
le64_to_cpu(header.grain_offset) * BDRV_SECTOR_SIZE) {
error_setg(errp, "File truncated, expecting at least %" PRId64 " bytes",
(int64_t)(le64_to_cpu(header.grain_offset)
* BDRV_SECTOR_SIZE));
@@ -834,7 +821,6 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
ret = vmdk_add_extent(bs, extent_file, true, sectors,
0, 0, 0, 0, 0, &extent, errp);
if (ret < 0) {
bdrv_unref(extent_file);
return ret;
}
extent->flat_start_offset = flat_offset << 9;
@@ -846,15 +832,14 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
} else {
ret = vmdk_open_sparse(bs, extent_file, bs->open_flags, buf, errp);
}
g_free(buf);
if (ret) {
g_free(buf);
bdrv_unref(extent_file);
return ret;
}
extent = &s->extents[s->num_extents - 1];
} else {
error_setg(errp, "Unsupported extent type '%s'", type);
bdrv_unref(extent_file);
return -ENOTSUP;
}
extent->type = g_strdup(type);
@@ -967,97 +952,57 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
}
}
/**
* get_whole_cluster
*
* Copy backing file's cluster that covers @sector_num, otherwise write zero,
* to the cluster at @cluster_sector_num.
*
* If @skip_start_sector < @skip_end_sector, the relative range
* [@skip_start_sector, @skip_end_sector) is not copied or written, and leave
* it for call to write user data in the request.
*/
static int get_whole_cluster(BlockDriverState *bs,
VmdkExtent *extent,
uint64_t cluster_sector_num,
uint64_t sector_num,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
VmdkExtent *extent,
uint64_t cluster_offset,
uint64_t offset,
bool allocate)
{
int ret = VMDK_OK;
int64_t cluster_bytes;
uint8_t *whole_grain;
uint8_t *whole_grain = NULL;
/* For COW, align request sector_num to cluster start */
sector_num = QEMU_ALIGN_DOWN(sector_num, extent->cluster_sectors);
cluster_bytes = extent->cluster_sectors << BDRV_SECTOR_BITS;
whole_grain = qemu_blockalign(bs, cluster_bytes);
if (!bs->backing_hd) {
memset(whole_grain, 0, skip_start_sector << BDRV_SECTOR_BITS);
memset(whole_grain + (skip_end_sector << BDRV_SECTOR_BITS), 0,
cluster_bytes - (skip_end_sector << BDRV_SECTOR_BITS));
}
assert(skip_end_sector <= extent->cluster_sectors);
/* we will be here if it's first write on non-exist grain(cluster).
* try to read from parent image, if exist */
if (bs->backing_hd && !vmdk_is_cid_valid(bs)) {
ret = VMDK_ERROR;
goto exit;
}
/* Read backing data before skip range */
if (skip_start_sector > 0) {
if (bs->backing_hd) {
ret = bdrv_read(bs->backing_hd, sector_num,
whole_grain, skip_start_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
if (bs->backing_hd) {
whole_grain =
qemu_blockalign(bs, extent->cluster_sectors << BDRV_SECTOR_BITS);
if (!vmdk_is_cid_valid(bs)) {
ret = VMDK_ERROR;
goto exit;
}
ret = bdrv_write(extent->file, cluster_sector_num, whole_grain,
skip_start_sector);
/* floor offset to cluster */
offset -= offset % (extent->cluster_sectors * 512);
ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
extent->cluster_sectors);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
/* Write grain only into the active image */
ret = bdrv_write(extent->file, cluster_offset, whole_grain,
extent->cluster_sectors);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
/* Read backing data after skip range */
if (skip_end_sector < extent->cluster_sectors) {
if (bs->backing_hd) {
ret = bdrv_read(bs->backing_hd, sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
ret = bdrv_write(extent->file, cluster_sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
exit:
qemu_vfree(whole_grain);
return ret;
}
static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
uint32_t offset)
static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
{
offset = cpu_to_le32(offset);
uint32_t offset;
QEMU_BUILD_BUG_ON(sizeof(offset) != sizeof(m_data->offset));
offset = cpu_to_le32(m_data->offset);
/* update L2 table */
if (bdrv_pwrite_sync(
extent->file,
((int64_t)m_data->l2_offset * 512)
+ (m_data->l2_index * sizeof(offset)),
+ (m_data->l2_index * sizeof(m_data->offset)),
&offset, sizeof(offset)) < 0) {
return VMDK_ERROR;
}
@@ -1067,7 +1012,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
if (bdrv_pwrite_sync(
extent->file,
((int64_t)m_data->l2_offset * 512)
+ (m_data->l2_index * sizeof(offset)),
+ (m_data->l2_index * sizeof(m_data->offset)),
&offset, sizeof(offset)) < 0) {
return VMDK_ERROR;
}
@@ -1079,41 +1024,17 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
return VMDK_OK;
}
/**
* get_cluster_offset
*
* Look up cluster offset in extent file by sector number, and store in
* @cluster_offset.
*
* For flat extents, the start offset as parsed from the description file is
* returned.
*
* For sparse extents, look up in L1, L2 table. If allocate is true, return an
* offset for a new cluster and update L2 cache. If there is a backing file,
* COW is done before returning; otherwise, zeroes are written to the allocated
* cluster. Both COW and zero writing skips the sector range
* [@skip_start_sector, @skip_end_sector) passed in by caller, because caller
* has new data to write there.
*
* Returns: VMDK_OK if cluster exists and mapped in the image.
* VMDK_UNALLOC if cluster is not mapped and @allocate is false.
* VMDK_ERROR if failed.
*/
static int get_cluster_offset(BlockDriverState *bs,
VmdkExtent *extent,
VmdkMetaData *m_data,
uint64_t offset,
bool allocate,
uint64_t *cluster_offset,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
VmdkExtent *extent,
VmdkMetaData *m_data,
uint64_t offset,
int allocate,
uint64_t *cluster_offset)
{
unsigned int l1_index, l2_offset, l2_index;
int min_index, i, j;
uint32_t min_count, *l2_table;
bool zeroed = false;
int64_t ret;
int64_t cluster_sector;
if (m_data) {
m_data->valid = 0;
@@ -1167,41 +1088,52 @@ static int get_cluster_offset(BlockDriverState *bs,
extent->l2_cache_counts[min_index] = 1;
found:
l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
cluster_sector = le32_to_cpu(l2_table[l2_index]);
*cluster_offset = le32_to_cpu(l2_table[l2_index]);
if (m_data) {
m_data->valid = 1;
m_data->l1_index = l1_index;
m_data->l2_index = l2_index;
m_data->offset = *cluster_offset;
m_data->l2_offset = l2_offset;
m_data->l2_cache_entry = &l2_table[l2_index];
}
if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
if (extent->has_zero_grain && *cluster_offset == VMDK_GTE_ZEROED) {
zeroed = true;
}
if (!cluster_sector || zeroed) {
if (!*cluster_offset || zeroed) {
if (!allocate) {
return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
}
cluster_sector = extent->next_cluster_sector;
extent->next_cluster_sector += extent->cluster_sectors;
/* Avoid the L2 tables update for the images that have snapshots. */
*cluster_offset = bdrv_getlength(extent->file);
if (!extent->compressed) {
bdrv_truncate(
extent->file,
*cluster_offset + (extent->cluster_sectors << 9)
);
}
*cluster_offset >>= 9;
l2_table[l2_index] = cpu_to_le32(*cluster_offset);
/* First of all we write grain itself, to avoid race condition
* that may to corrupt the image.
* This problem may occur because of insufficient space on host disk
* or inappropriate VM shutdown.
*/
ret = get_whole_cluster(bs, extent,
cluster_sector,
offset >> BDRV_SECTOR_BITS,
skip_start_sector, skip_end_sector);
if (ret) {
return ret;
if (get_whole_cluster(
bs, extent, *cluster_offset, offset, allocate) == -1) {
return VMDK_ERROR;
}
if (m_data) {
m_data->offset = *cluster_offset;
}
}
*cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
*cluster_offset <<= 9;
return VMDK_OK;
}
@@ -1236,8 +1168,7 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
}
qemu_co_mutex_lock(&s->lock);
ret = get_cluster_offset(bs, extent, NULL,
sector_num * 512, false, &offset,
0, 0);
sector_num * 512, 0, &offset);
qemu_co_mutex_unlock(&s->lock);
switch (ret) {
@@ -1390,9 +1321,9 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
if (!extent) {
return -EIO;
}
ret = get_cluster_offset(bs, extent, NULL,
sector_num << 9, false, &cluster_offset,
0, 0);
ret = get_cluster_offset(
bs, extent, NULL,
sector_num << 9, 0, &cluster_offset);
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
@@ -1473,17 +1404,12 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
if (!extent) {
return -EIO;
}
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
!(extent->compressed || zeroed),
&cluster_offset,
index_in_cluster, index_in_cluster + n);
ret = get_cluster_offset(
bs,
extent,
&m_data,
sector_num << 9, !extent->compressed,
&cluster_offset);
if (extent->compressed) {
if (ret == VMDK_OK) {
/* Refuse write to allocated cluster for streamOptimized */
@@ -1492,13 +1418,24 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
return -EIO;
} else {
/* allocate */
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
true, &cluster_offset, 0, 0);
ret = get_cluster_offset(
bs,
extent,
&m_data,
sector_num << 9, 1,
&cluster_offset);
}
}
if (ret == VMDK_ERROR) {
return -EINVAL;
}
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
if (zeroed) {
/* Do zeroed write, buf is ignored */
if (extent->has_zero_grain &&
@@ -1506,9 +1443,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
n >= extent->cluster_sectors) {
n = extent->cluster_sectors;
if (!zero_dry_run) {
m_data.offset = VMDK_GTE_ZEROED;
/* update L2 tables */
if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
!= VMDK_OK) {
if (vmdk_L2update(extent, &m_data) != VMDK_OK) {
return -EIO;
}
}
@@ -1524,9 +1461,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
}
if (m_data.valid) {
/* update L2 tables */
if (vmdk_L2update(extent, &m_data,
cluster_offset >> BDRV_SECTOR_BITS)
!= VMDK_OK) {
if (vmdk_L2update(extent, &m_data) != VMDK_OK) {
return -EIO;
}
}
@@ -1807,8 +1742,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
adapter_type = qemu_opt_get_del(opts, BLOCK_OPT_ADAPTER_TYPE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_COMPAT6, false)) {
@@ -2065,7 +1999,7 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
BDRVVmdkState *s = bs->opaque;
VmdkExtent *extent = NULL;
int64_t sector_num = 0;
int64_t total_sectors = bdrv_nb_sectors(bs);
int64_t total_sectors = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
int ret;
uint64_t cluster_offset;
@@ -2086,7 +2020,7 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
}
ret = get_cluster_offset(bs, extent, NULL,
sector_num << BDRV_SECTOR_BITS,
false, &cluster_offset, 0, 0);
0, &cluster_offset);
if (ret == VMDK_ERROR) {
fprintf(stderr,
"ERROR: could not get cluster_offset for sector %"

View File

@@ -29,6 +29,13 @@
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
#endif
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
/**************************************************************/
@@ -207,7 +214,7 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
"incorrect.\n", bs->filename);
/* Write 'checksum' back to footer, or else will leave it with zero. */
footer->checksum = cpu_to_be32(checksum);
footer->checksum = be32_to_cpu(checksum);
// The visible size of a image in Virtual PC depends on the geometry
// rather than on the size stored in the footer (the size in the footer
@@ -269,11 +276,7 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
s->pagetable = qemu_try_blockalign(bs->file, s->max_table_entries * 4);
if (s->pagetable == NULL) {
ret = -ENOMEM;
goto fail;
}
s->pagetable = qemu_blockalign(bs, s->max_table_entries * 4);
s->bat_offset = be64_to_cpu(dyndisk_header->table_offset);
@@ -472,7 +475,7 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
// Write BAT entry to disk
bat_offset = s->bat_offset + (4 * index);
bat_value = cpu_to_be32(s->pagetable[index]);
bat_value = be32_to_cpu(s->pagetable[index]);
ret = bdrv_pwrite_sync(bs->file, bat_offset, &bat_value, 4);
if (ret < 0)
goto fail;
@@ -489,7 +492,7 @@ static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
BDRVVPCState *s = (BDRVVPCState *)bs->opaque;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (be32_to_cpu(footer->type) != VHD_FIXED) {
if (cpu_to_be32(footer->type) != VHD_FIXED) {
bdi->cluster_size = s->block_size;
}
@@ -506,7 +509,7 @@ static int vpc_read(BlockDriverState *bs, int64_t sector_num,
int64_t sectors, sectors_per_block;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (be32_to_cpu(footer->type) == VHD_FIXED) {
if (cpu_to_be32(footer->type) == VHD_FIXED) {
return bdrv_read(bs->file, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
@@ -555,7 +558,7 @@ static int vpc_write(BlockDriverState *bs, int64_t sector_num,
int ret;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (be32_to_cpu(footer->type) == VHD_FIXED) {
if (cpu_to_be32(footer->type) == VHD_FIXED) {
return bdrv_write(bs->file, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
@@ -653,41 +656,39 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
return 0;
}
static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
int64_t total_sectors)
static int create_dynamic_disk(int fd, uint8_t *buf, int64_t total_sectors)
{
VHDDynDiskHeader *dyndisk_header =
(VHDDynDiskHeader *) buf;
size_t block_size, num_bat_entries;
int i;
int ret;
int64_t offset = 0;
int ret = -EIO;
// Write the footer (twice: at the beginning and at the end)
block_size = 0x200000;
num_bat_entries = (total_sectors + block_size / 512) / (block_size / 512);
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
if (ret) {
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
goto fail;
}
offset = 1536 + ((num_bat_entries * 4 + 511) & ~511);
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
if (ret < 0) {
if (lseek(fd, 1536 + ((num_bat_entries * 4 + 511) & ~511), SEEK_SET) < 0) {
goto fail;
}
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
goto fail;
}
// Write the initial BAT
offset = 3 * 512;
if (lseek(fd, 3 * 512, SEEK_SET) < 0) {
goto fail;
}
memset(buf, 0xFF, 512);
for (i = 0; i < (num_bat_entries * 4 + 511) / 512; i++) {
ret = bdrv_pwrite_sync(bs, offset, buf, 512);
if (ret < 0) {
if (write(fd, buf, 512) != 512) {
goto fail;
}
offset += 512;
}
// Prepare the Dynamic Disk Header
@@ -699,44 +700,48 @@ static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
* Note: The spec is actually wrong here for data_offset, it says
* 0xFFFFFFFF, but MS tools expect all 64 bits to be set.
*/
dyndisk_header->data_offset = cpu_to_be64(0xFFFFFFFFFFFFFFFFULL);
dyndisk_header->table_offset = cpu_to_be64(3 * 512);
dyndisk_header->version = cpu_to_be32(0x00010000);
dyndisk_header->block_size = cpu_to_be32(block_size);
dyndisk_header->max_table_entries = cpu_to_be32(num_bat_entries);
dyndisk_header->data_offset = be64_to_cpu(0xFFFFFFFFFFFFFFFFULL);
dyndisk_header->table_offset = be64_to_cpu(3 * 512);
dyndisk_header->version = be32_to_cpu(0x00010000);
dyndisk_header->block_size = be32_to_cpu(block_size);
dyndisk_header->max_table_entries = be32_to_cpu(num_bat_entries);
dyndisk_header->checksum = cpu_to_be32(vpc_checksum(buf, 1024));
dyndisk_header->checksum = be32_to_cpu(vpc_checksum(buf, 1024));
// Write the header
offset = 512;
ret = bdrv_pwrite_sync(bs, offset, buf, 1024);
if (ret < 0) {
if (lseek(fd, 512, SEEK_SET) < 0) {
goto fail;
}
if (write(fd, buf, 1024) != 1024) {
goto fail;
}
ret = 0;
fail:
return ret;
}
static int create_fixed_disk(BlockDriverState *bs, uint8_t *buf,
int64_t total_size)
static int create_fixed_disk(int fd, uint8_t *buf, int64_t total_size)
{
int ret;
int ret = -EIO;
/* Add footer to total size */
total_size += HEADER_SIZE;
ret = bdrv_truncate(bs, total_size);
if (ret < 0) {
return ret;
total_size += 512;
if (ftruncate(fd, total_size) != 0) {
ret = -errno;
goto fail;
}
if (lseek(fd, -512, SEEK_END) < 0) {
goto fail;
}
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
goto fail;
}
ret = bdrv_pwrite_sync(bs, total_size - HEADER_SIZE, buf, HEADER_SIZE);
if (ret < 0) {
return ret;
}
ret = 0;
fail:
return ret;
}
@@ -745,7 +750,7 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
uint8_t buf[1024];
VHDFooter *footer = (VHDFooter *) buf;
char *disk_type_param;
int i;
int fd, i;
uint16_t cyls = 0;
uint8_t heads = 0;
uint8_t secs_per_cyl = 0;
@@ -753,12 +758,10 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t total_size;
int disk_type;
int ret = -EIO;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
bool nocow = false;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
if (disk_type_param) {
if (!strcmp(disk_type_param, "dynamic")) {
@@ -772,17 +775,28 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
} else {
disk_type = VHD_DYNAMIC;
}
nocow = qemu_opt_get_bool_del(opts, BLOCK_OPT_NOCOW, false);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
/* Create the file */
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
if (fd < 0) {
ret = -EIO;
goto out;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value will
* be ignored since any failure of this operation should not block the
* left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
/*
@@ -796,7 +810,7 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
&secs_per_cyl))
{
ret = -EFBIG;
goto out;
goto fail;
}
}
@@ -810,45 +824,46 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
memcpy(footer->creator_app, "qemu", 4);
memcpy(footer->creator_os, "Wi2k", 4);
footer->features = cpu_to_be32(0x02);
footer->version = cpu_to_be32(0x00010000);
footer->features = be32_to_cpu(0x02);
footer->version = be32_to_cpu(0x00010000);
if (disk_type == VHD_DYNAMIC) {
footer->data_offset = cpu_to_be64(HEADER_SIZE);
footer->data_offset = be64_to_cpu(HEADER_SIZE);
} else {
footer->data_offset = cpu_to_be64(0xFFFFFFFFFFFFFFFFULL);
footer->data_offset = be64_to_cpu(0xFFFFFFFFFFFFFFFFULL);
}
footer->timestamp = cpu_to_be32(time(NULL) - VHD_TIMESTAMP_BASE);
footer->timestamp = be32_to_cpu(time(NULL) - VHD_TIMESTAMP_BASE);
/* Version of Virtual PC 2007 */
footer->major = cpu_to_be16(0x0005);
footer->minor = cpu_to_be16(0x0003);
footer->major = be16_to_cpu(0x0005);
footer->minor = be16_to_cpu(0x0003);
if (disk_type == VHD_DYNAMIC) {
footer->orig_size = cpu_to_be64(total_sectors * 512);
footer->size = cpu_to_be64(total_sectors * 512);
footer->orig_size = be64_to_cpu(total_sectors * 512);
footer->size = be64_to_cpu(total_sectors * 512);
} else {
footer->orig_size = cpu_to_be64(total_size);
footer->size = cpu_to_be64(total_size);
footer->orig_size = be64_to_cpu(total_size);
footer->size = be64_to_cpu(total_size);
}
footer->cyls = cpu_to_be16(cyls);
footer->cyls = be16_to_cpu(cyls);
footer->heads = heads;
footer->secs_per_cyl = secs_per_cyl;
footer->type = cpu_to_be32(disk_type);
footer->type = be32_to_cpu(disk_type);
#if defined(CONFIG_UUID)
uuid_generate(footer->uuid);
#endif
footer->checksum = cpu_to_be32(vpc_checksum(buf, HEADER_SIZE));
footer->checksum = be32_to_cpu(vpc_checksum(buf, HEADER_SIZE));
if (disk_type == VHD_DYNAMIC) {
ret = create_dynamic_disk(bs, buf, total_sectors);
ret = create_dynamic_disk(fd, buf, total_sectors);
} else {
ret = create_fixed_disk(bs, buf, total_size);
ret = create_fixed_disk(fd, buf, total_size);
}
fail:
qemu_close(fd);
out:
bdrv_unref(bs);
g_free(disk_type_param);
return ret;
}
@@ -858,7 +873,7 @@ static int vpc_has_zero_init(BlockDriverState *bs)
BDRVVPCState *s = bs->opaque;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (be32_to_cpu(footer->type) == VHD_FIXED) {
if (cpu_to_be32(footer->type) == VHD_FIXED) {
return bdrv_has_zero_init(bs->file);
} else {
return 1;

View File

@@ -52,6 +52,10 @@
#define DLOG(a) a
#undef stderr
#define stderr STDERR
FILE* stderr = NULL;
static void checkpoint(void);
#ifdef __MINGW32__
@@ -728,7 +732,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
if(first_cluster == 0 && (is_dotdot || is_dot))
continue;
buffer = g_malloc(length);
buffer=(char*)g_malloc(length);
snprintf(buffer,length,"%s/%s",dirname,entry->d_name);
if(stat(buffer,&st)<0) {
@@ -763,7 +767,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
/* create mapping for this file */
if(!is_dot && !is_dotdot && (S_ISDIR(st.st_mode) || st.st_size)) {
s->current_mapping = array_get_next(&(s->mapping));
s->current_mapping=(mapping_t*)array_get_next(&(s->mapping));
s->current_mapping->begin=0;
s->current_mapping->end=st.st_size;
/*
@@ -807,12 +811,12 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
}
/* reget the mapping, since s->mapping was possibly realloc()ed */
mapping = array_get(&(s->mapping), mapping_index);
mapping = (mapping_t*)array_get(&(s->mapping), mapping_index);
first_cluster += (s->directory.next - mapping->info.dir.first_dir_index)
* 0x20 / s->cluster_size;
mapping->end = first_cluster;
direntry = array_get(&(s->directory), mapping->dir_index);
direntry = (direntry_t*)array_get(&(s->directory), mapping->dir_index);
set_begin_of_direntry(direntry, mapping->begin);
return 0;
@@ -1078,6 +1082,11 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
vvv = s;
#endif
DLOG(if (stderr == NULL) {
stderr = fopen("vvfat.log", "a");
setbuf(stderr, NULL);
})
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -2941,7 +2950,7 @@ static int enable_write_target(BDRVVVFATState *s, Error **errp)
bdrv_set_backing_hd(s->bs, bdrv_new("", &error_abort));
s->bs->backing_hd->drv = &vvfat_write_target;
s->bs->backing_hd->opaque = g_new(void *, 1);
s->bs->backing_hd->opaque = g_malloc(sizeof(void*));
*(void**)s->bs->backing_hd->opaque = s;
return 0;

View File

@@ -88,7 +88,7 @@ static void win32_aio_process_completion(QEMUWin32AIOState *s,
waiocb->common.cb(waiocb->common.opaque, ret);
qemu_aio_unref(waiocb);
qemu_aio_release(waiocb);
}
static void win32_aio_completion_cb(EventNotifier *e)
@@ -106,8 +106,22 @@ static void win32_aio_completion_cb(EventNotifier *e)
}
}
static void win32_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEMUWin32AIOCB *waiocb = (QEMUWin32AIOCB *)blockacb;
/*
* CancelIoEx is only supported in Vista and newer. For now, just
* wait for completion.
*/
while (!HasOverlappedIoCompleted(&waiocb->ov)) {
aio_poll(bdrv_get_aio_context(blockacb->bs), true);
}
}
static const AIOCBInfo win32_aiocb_info = {
.aiocb_size = sizeof(QEMUWin32AIOCB),
.cancel = win32_aio_cancel,
};
BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
@@ -125,10 +139,7 @@ BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
waiocb->is_read = (type == QEMU_AIO_READ);
if (qiov->niov > 1) {
waiocb->buf = qemu_try_blockalign(bs, qiov->size);
if (waiocb->buf == NULL) {
goto out;
}
waiocb->buf = qemu_blockalign(bs, qiov->size);
if (type & QEMU_AIO_WRITE) {
iov_to_buf(qiov->iov, qiov->niov, 0, waiocb->buf, qiov->size);
}
@@ -157,8 +168,7 @@ BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
out_dec_count:
aio->count--;
out:
qemu_aio_unref(waiocb);
qemu_aio_release(waiocb);
return NULL;
}

View File

@@ -108,7 +108,7 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
nbd_export_set_name(exp, device);
n = g_new0(NBDCloseNotifier, 1);
n = g_malloc0(sizeof(NBDCloseNotifier));
n->n.notify = nbd_close_notifier;
n->exp = exp;
bdrv_add_close_notifier(bs, &n->n);

View File

@@ -39,7 +39,6 @@
#include "qapi/qmp/types.h"
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/util.h"
#include "sysemu/sysemu.h"
#include "block/block_int.h"
#include "qmp-commands.h"
@@ -60,7 +59,7 @@ static const char *const if_name[IF_COUNT] = {
[IF_XEN] = "xen",
};
static int if_max_devs[IF_COUNT] = {
static const int if_max_devs[IF_COUNT] = {
/*
* Do not change these numbers! They govern how drive option
* index maps to unit and bus. That mapping is ABI.
@@ -79,30 +78,6 @@ static int if_max_devs[IF_COUNT] = {
[IF_SCSI] = 7,
};
/**
* Boards may call this to offer board-by-board overrides
* of the default, global values.
*/
void override_max_devs(BlockInterfaceType type, int max_devs)
{
DriveInfo *dinfo;
if (max_devs <= 0) {
return;
}
QTAILQ_FOREACH(dinfo, &drives, next) {
if (dinfo->type == type) {
fprintf(stderr, "Cannot override units-per-bus property of"
" the %s interface, because a drive of that type has"
" already been added.\n", if_name[type]);
g_assert_not_reached();
}
}
if_max_devs[type] = max_devs;
}
/*
* We automatically delete the drive when a device using it gets
* unplugged. Questionable feature, but we can't just drop it.
@@ -135,23 +110,6 @@ void blockdev_auto_del(BlockDriverState *bs)
}
}
/**
* Returns the current mapping of how many units per bus
* a particular interface can support.
*
* A positive integer indicates n units per bus.
* 0 implies the mapping has not been established.
* -1 indicates an invalid BlockInterfaceType was given.
*/
int drive_get_max_devs(BlockInterfaceType type)
{
if (type >= IF_IDE && type < IF_COUNT) {
return if_max_devs[type];
}
return -1;
}
static int drive_index_to_bus_id(BlockInterfaceType type, int index)
{
int max_devs = if_max_devs[type];
@@ -207,27 +165,6 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit)
return NULL;
}
bool drive_check_orphaned(void)
{
DriveInfo *dinfo;
bool rs = false;
QTAILQ_FOREACH(dinfo, &drives, next) {
/* If dinfo->bdrv->dev is NULL, it has no device attached. */
/* Unless this is a default drive, this may be an oversight. */
if (!dinfo->bdrv->dev && !dinfo->is_default &&
dinfo->type != IF_NONE) {
fprintf(stderr, "Warning: Orphaned drive without device: "
"id=%s,file=%s,if=%s,bus=%d,unit=%d\n",
dinfo->id, dinfo->bdrv->filename, if_name[dinfo->type],
dinfo->bus, dinfo->unit);
rs = true;
}
}
return rs;
}
DriveInfo *drive_get_by_index(BlockInterfaceType type, int index)
{
return drive_get(type,
@@ -278,15 +215,11 @@ static void bdrv_format_print(void *opaque, const char *name)
void drive_del(DriveInfo *dinfo)
{
bdrv_unref(dinfo->bdrv);
}
void drive_info_del(DriveInfo *dinfo)
{
if (!dinfo) {
return;
if (dinfo->opts) {
qemu_opts_del(dinfo->opts);
}
qemu_opts_del(dinfo->opts);
bdrv_unref(dinfo->bdrv);
g_free(dinfo->id);
QTAILQ_REMOVE(&drives, dinfo, next);
g_free(dinfo->serial);
@@ -341,6 +274,25 @@ static int parse_block_error_action(const char *buf, bool is_read, Error **errp)
}
}
static inline int parse_enum_option(const char *lookup[], const char *buf,
int max, int def, Error **errp)
{
int i;
if (!buf) {
return def;
}
for (i = 0; i < max; i++) {
if (!strcmp(buf, lookup[i])) {
return i;
}
}
error_setg(errp, "invalid parameter value: %s", buf);
return def;
}
static bool check_throttle_config(ThrottleConfig *cfg, Error **errp)
{
if (throttle_conflicting(cfg)) {
@@ -367,7 +319,6 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
int ro = 0;
int bdrv_flags = 0;
int on_read_error, on_write_error;
BlockDriverState *bs;
DriveInfo *dinfo;
ThrottleConfig cfg;
int snapshot = 0;
@@ -505,11 +456,11 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
}
detect_zeroes =
qapi_enum_parse(BlockdevDetectZeroesOptions_lookup,
qemu_opt_get(opts, "detect-zeroes"),
BLOCKDEV_DETECT_ZEROES_OPTIONS_MAX,
BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
&error);
parse_enum_option(BlockdevDetectZeroesOptions_lookup,
qemu_opt_get(opts, "detect-zeroes"),
BLOCKDEV_DETECT_ZEROES_OPTIONS_MAX,
BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
&error);
if (error) {
error_propagate(errp, error);
goto early_err;
@@ -523,27 +474,26 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
}
/* init */
bs = bdrv_new(qemu_opts_id(opts), errp);
if (!bs) {
goto early_err;
dinfo = g_malloc0(sizeof(*dinfo));
dinfo->id = g_strdup(qemu_opts_id(opts));
dinfo->bdrv = bdrv_new(dinfo->id, &error);
if (error) {
error_propagate(errp, error);
goto bdrv_new_err;
}
bs->open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
bs->read_only = ro;
bs->detect_zeroes = detect_zeroes;
dinfo->bdrv->open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
dinfo->bdrv->read_only = ro;
dinfo->bdrv->detect_zeroes = detect_zeroes;
QTAILQ_INSERT_TAIL(&drives, dinfo, next);
bdrv_set_on_error(bs, on_read_error, on_write_error);
bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error);
/* disk I/O throttling */
if (throttle_enabled(&cfg)) {
bdrv_io_limits_enable(bs);
bdrv_set_io_limits(bs, &cfg);
bdrv_io_limits_enable(dinfo->bdrv);
bdrv_set_io_limits(dinfo->bdrv, &cfg);
}
dinfo = g_malloc0(sizeof(*dinfo));
dinfo->id = g_strdup(qemu_opts_id(opts));
dinfo->bdrv = bs;
QTAILQ_INSERT_TAIL(&drives, dinfo, next);
if (!file || !*file) {
if (has_driver_specific_opts) {
file = NULL;
@@ -570,8 +520,7 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
QINCREF(bs_opts);
ret = bdrv_open(&bs, file, NULL, bs_opts, bdrv_flags, drv, &error);
assert(bs == dinfo->bdrv);
ret = bdrv_open(&dinfo->bdrv, file, NULL, bs_opts, bdrv_flags, drv, &error);
if (ret < 0) {
error_setg(errp, "could not open disk image %s: %s",
@@ -580,9 +529,8 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
goto err;
}
if (bdrv_key_required(bs)) {
if (bdrv_key_required(dinfo->bdrv))
autostart = 0;
}
QDECREF(bs_opts);
qemu_opts_del(opts);
@@ -590,7 +538,11 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
return dinfo;
err:
bdrv_unref(bs);
bdrv_unref(dinfo->bdrv);
QTAILQ_REMOVE(&drives, dinfo, next);
bdrv_new_err:
g_free(dinfo->id);
g_free(dinfo);
early_err:
qemu_opts_del(opts);
err_no_opts:
@@ -598,22 +550,12 @@ err_no_opts:
return NULL;
}
static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to,
Error **errp)
static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to)
{
const char *value;
value = qemu_opt_get(opts, from);
if (value) {
if (qemu_opt_find(opts, to)) {
error_setg(errp, "'%s' and its alias '%s' can't be used at the "
"same time", to, from);
return;
}
}
/* rename all items in opts */
while ((value = qemu_opt_get(opts, from))) {
qemu_opt_set(opts, to, value);
qemu_opt_unset(opts, from);
}
@@ -717,43 +659,28 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
const char *serial;
const char *filename;
Error *local_err = NULL;
int i;
/* Change legacy command line options into QMP ones */
static const struct {
const char *from;
const char *to;
} opt_renames[] = {
{ "iops", "throttling.iops-total" },
{ "iops_rd", "throttling.iops-read" },
{ "iops_wr", "throttling.iops-write" },
qemu_opt_rename(all_opts, "iops", "throttling.iops-total");
qemu_opt_rename(all_opts, "iops_rd", "throttling.iops-read");
qemu_opt_rename(all_opts, "iops_wr", "throttling.iops-write");
{ "bps", "throttling.bps-total" },
{ "bps_rd", "throttling.bps-read" },
{ "bps_wr", "throttling.bps-write" },
qemu_opt_rename(all_opts, "bps", "throttling.bps-total");
qemu_opt_rename(all_opts, "bps_rd", "throttling.bps-read");
qemu_opt_rename(all_opts, "bps_wr", "throttling.bps-write");
{ "iops_max", "throttling.iops-total-max" },
{ "iops_rd_max", "throttling.iops-read-max" },
{ "iops_wr_max", "throttling.iops-write-max" },
qemu_opt_rename(all_opts, "iops_max", "throttling.iops-total-max");
qemu_opt_rename(all_opts, "iops_rd_max", "throttling.iops-read-max");
qemu_opt_rename(all_opts, "iops_wr_max", "throttling.iops-write-max");
{ "bps_max", "throttling.bps-total-max" },
{ "bps_rd_max", "throttling.bps-read-max" },
{ "bps_wr_max", "throttling.bps-write-max" },
qemu_opt_rename(all_opts, "bps_max", "throttling.bps-total-max");
qemu_opt_rename(all_opts, "bps_rd_max", "throttling.bps-read-max");
qemu_opt_rename(all_opts, "bps_wr_max", "throttling.bps-write-max");
{ "iops_size", "throttling.iops-size" },
qemu_opt_rename(all_opts,
"iops_size", "throttling.iops-size");
{ "readonly", "read-only" },
};
for (i = 0; i < ARRAY_SIZE(opt_renames); i++) {
qemu_opt_rename(all_opts, opt_renames[i].from, opt_renames[i].to,
&local_err);
if (local_err) {
error_report("%s", error_get_pretty(local_err));
error_free(local_err);
return NULL;
}
}
qemu_opt_rename(all_opts, "readonly", "read-only");
value = qemu_opt_get(all_opts, "cache");
if (value) {
@@ -1167,7 +1094,7 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device,
return NULL;
}
info = g_new0(SnapshotInfo, 1);
info = g_malloc0(sizeof(SnapshotInfo));
info->id = g_strdup(sn.id_str);
info->name = g_strdup(sn.name);
info->date_nsec = sn.date_nsec;
@@ -1830,8 +1757,6 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
{
const char *id = qdict_get_str(qdict, "id");
BlockDriverState *bs;
DriveInfo *dinfo;
AioContext *aio_context;
Error *local_err = NULL;
bs = bdrv_find(id);
@@ -1839,21 +1764,9 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
error_report("Device '%s' not found", id);
return -1;
}
dinfo = drive_get_by_blockdev(bs);
if (dinfo && !dinfo->enable_auto_del) {
error_report("Deleting device added with blockdev-add"
" is not supported");
return -1;
}
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, &local_err)) {
error_report("%s", error_get_pretty(local_err));
error_free(local_err);
aio_context_release(aio_context);
return -1;
}
@@ -1874,10 +1787,9 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
bdrv_set_on_error(bs, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT);
} else {
drive_del(dinfo);
drive_del(drive_get_by_blockdev(bs));
}
aio_context_release(aio_context);
return 0;
}
@@ -1887,7 +1799,6 @@ void qmp_block_resize(bool has_device, const char *device,
{
Error *local_err = NULL;
BlockDriverState *bs;
AioContext *aio_context;
int ret;
bs = bdrv_lookup_bs(has_device ? device : NULL,
@@ -1898,22 +1809,19 @@ void qmp_block_resize(bool has_device, const char *device,
return;
}
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (!bdrv_is_first_non_filter(bs)) {
error_set(errp, QERR_FEATURE_DISABLED, "resize");
goto out;
return;
}
if (size < 0) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "size", "a >0 size");
goto out;
return;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {
error_set(errp, QERR_DEVICE_IN_USE, device);
goto out;
return;
}
/* complete all in-flight operations before resizing the device */
@@ -1939,9 +1847,6 @@ void qmp_block_resize(bool has_device, const char *device,
error_setg_errno(errp, -ret, "Could not resize");
break;
}
out:
aio_context_release(aio_context);
}
static void block_job_cb(void *opaque, int ret)
@@ -2267,12 +2172,11 @@ void qmp_drive_mirror(const char *device, const char *target,
}
if (granularity != 0 && (granularity < 512 || granularity > 1048576 * 64)) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "granularity",
"a value in range [512B, 64MB]");
error_set(errp, QERR_INVALID_PARAMETER, device);
return;
}
if (granularity & (granularity - 1)) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "granularity", "power of 2");
error_set(errp, QERR_INVALID_PARAMETER, device);
return;
}

View File

@@ -205,7 +205,7 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
if (block_job_is_paused(job)) {
qemu_coroutine_yield();
} else {
co_aio_sleep_ns(bdrv_get_aio_context(job->bs), type, ns);
co_sleep_ns(type, ns);
}
job->busy = true;
}

160
configure vendored
View File

@@ -326,7 +326,7 @@ seccomp=""
glusterfs=""
glusterfs_discard="no"
glusterfs_zerofill="no"
archipelago=""
virtio_blk_data_plane=""
gtk=""
gtkabi=""
vte=""
@@ -388,7 +388,6 @@ cpp="${CPP-$cc -E}"
objcopy="${OBJCOPY-${cross_prefix}objcopy}"
ld="${LD-${cross_prefix}ld}"
libtool="${LIBTOOL-${cross_prefix}libtool}"
nm="${NM-${cross_prefix}nm}"
strip="${STRIP-${cross_prefix}strip}"
windres="${WINDRES-${cross_prefix}windres}"
pkg_config_exe="${PKG_CONFIG-${cross_prefix}pkg-config}"
@@ -1088,12 +1087,9 @@ for opt do
;;
--enable-glusterfs) glusterfs="yes"
;;
--disable-archipelago) archipelago="no"
--disable-virtio-blk-data-plane) virtio_blk_data_plane="no"
;;
--enable-archipelago) archipelago="yes"
;;
--disable-virtio-blk-data-plane|--enable-virtio-blk-data-plane)
echo "$0: $opt is obsolete, virtio-blk data-plane is always on" >&2
--enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
;;
--disable-gtk) gtk="no"
;;
@@ -1348,7 +1344,7 @@ Advanced options (experts only):
--enable-linux-aio enable Linux AIO support
--disable-cap-ng disable libcap-ng support
--enable-cap-ng enable libcap-ng support
--disable-attr disable attr and xattr support
--disable-attr disables attr and xattr support
--enable-attr enable attr and xattr support
--disable-blobs disable installing provided firmware blobs
--enable-docs enable documentation build
@@ -1379,22 +1375,20 @@ Advanced options (experts only):
--with-vss-sdk=SDK-path enable Windows VSS support in QEMU Guest Agent
--with-win-sdk=SDK-path path to Windows Platform SDK (to build VSS .tlb)
--disable-seccomp disable seccomp support
--enable-seccomp enable seccomp support
--enable-seccomp enables seccomp support
--with-coroutine=BACKEND coroutine backend. Supported options:
gthread, ucontext, sigaltstack, windows
--disable-coroutine-pool disable coroutine freelist (worse performance)
--enable-coroutine-pool enable coroutine freelist (better performance)
--enable-glusterfs enable GlusterFS backend
--disable-glusterfs disable GlusterFS backend
--enable-archipelago enable Archipelago backend
--disable-archipelago disable Archipelago backend
--enable-gcov enable test coverage analysis with gcov
--gcov=GCOV use specified gcov [$gcov_tool]
--disable-tpm disable TPM support
--enable-tpm enable TPM support
--disable-libssh2 disable ssh block device support
--enable-libssh2 enable ssh block device support
--disable-vhdx disable support for the Microsoft VHDX image format
--disable-vhdx disables support for the Microsoft VHDX image format
--enable-vhdx enable support for the Microsoft VHDX image format
--disable-quorum disable quorum block filter support
--enable-quorum enable quorum block filter support
@@ -2715,12 +2709,6 @@ for i in $glib_modules; do
fi
done
# g_test_trap_subprocess added in 2.38. Used by some tests.
glib_subprocess=yes
if ! $pkg_config --atleast-version=2.38 glib-2.0; then
glib_subprocess=no
fi
##########################################
# SHA command probe for modules
if test "$modules" = yes; then
@@ -2742,7 +2730,7 @@ fi
if test "$pixman" = ""; then
if test "$want_tools" = "no" -a "$softmmu" = "no"; then
pixman="none"
elif $pkg_config --atleast-version=0.21.8 pixman-1 > /dev/null 2>&1; then
elif $pkg_config pixman-1 > /dev/null 2>&1; then
pixman="system"
else
pixman="internal"
@@ -2758,12 +2746,11 @@ if test "$pixman" = "none"; then
pixman_cflags=
pixman_libs=
elif test "$pixman" = "system"; then
# pixman version has been checked above
pixman_cflags=`$pkg_config --cflags pixman-1`
pixman_libs=`$pkg_config --libs pixman-1`
else
if test ! -d ${source_path}/pixman/pixman; then
error_exit "pixman >= 0.21.8 not present. Your options:" \
error_exit "pixman not present. Your options:" \
" (1) Preferred: Install the pixman devel package (any recent" \
" distro should have packages as Xorg needs pixman too)." \
" (2) Fetch the pixman submodule, using:" \
@@ -2941,6 +2928,16 @@ else
tpm_passthrough=no
fi
##########################################
# adjust virtio-blk-data-plane based on linux-aio
if test "$virtio_blk_data_plane" = "yes" -a \
"$linux_aio" != "yes" ; then
error_exit "virtio-blk-data-plane requires Linux AIO, please try --enable-linux-aio"
elif test -z "$virtio_blk_data_plane" ; then
virtio_blk_data_plane=$linux_aio
fi
##########################################
# attr probe
@@ -3075,33 +3072,6 @@ EOF
fi
fi
##########################################
# archipelago probe
if test "$archipelago" != "no" ; then
cat > $TMPC <<EOF
#include <stdio.h>
#include <xseg/xseg.h>
#include <xseg/protocol.h>
int main(void) {
xseg_initialize();
return 0;
}
EOF
archipelago_libs=-lxseg
if compile_prog "" "$archipelago_libs"; then
archipelago="yes"
libs_tools="$archipelago_libs $libs_tools"
libs_softmmu="$archipelago_libs $libs_softmmu"
else
if test "$archipelago" = "yes" ; then
feature_not_found "Archipelago backend support" "Install libxseg devel"
fi
archipelago="no"
fi
fi
##########################################
# glusterfs probe
if test "$glusterfs" != "no" ; then
@@ -3117,8 +3087,7 @@ if test "$glusterfs" != "no" ; then
fi
else
if test "$glusterfs" = "yes" ; then
feature_not_found "GlusterFS backend support" \
"Install glusterfs-api devel >= 3"
feature_not_found "GlusterFS backend support" "Install glusterfs-api devel"
fi
glusterfs="no"
fi
@@ -3308,21 +3277,6 @@ if compile_prog "" "" ; then
fallocate_punch_hole=yes
fi
# check for posix_fallocate
posix_fallocate=no
cat > $TMPC << EOF
#include <fcntl.h>
int main(void)
{
posix_fallocate(0, 0, 0);
return 0;
}
EOF
if compile_prog "" "" ; then
posix_fallocate=yes
fi
# check for sync_file_range
sync_file_range=no
cat > $TMPC << EOF
@@ -3467,37 +3421,6 @@ if compile_prog "" "" ; then
sendfile=yes
fi
# check for timerfd support (glibc 2.8 and newer)
timerfd=no
cat > $TMPC << EOF
#include <sys/timerfd.h>
int main(void)
{
return(timerfd_create(CLOCK_REALTIME, 0));
}
EOF
if compile_prog "" "" ; then
timerfd=yes
fi
# check for setns and unshare support
setns=no
cat > $TMPC << EOF
#include <sched.h>
int main(void)
{
int ret;
ret = setns(0, 0);
ret = unshare(0);
return ret;
}
EOF
if compile_prog "" "" ; then
setns=yes
fi
# Check if tools are available to build documentation.
if test "$docs" != "no" ; then
if has makeinfo && has pod2man; then
@@ -3609,8 +3532,7 @@ EOF
spice_server_version=$($pkg_config --modversion spice-server)
else
if test "$spice" = "yes" ; then
feature_not_found "spice" \
"Install spice-server(>=0.12.0) and spice-protocol(>=0.12.3) devel"
feature_not_found "spice" "Install spice-server and spice-protocol devel"
fi
spice="no"
fi
@@ -3641,7 +3563,7 @@ EOF
smartcard_nss="yes"
else
if test "$smartcard_nss" = "yes"; then
feature_not_found "nss" "Install nss devel >= 3.12.8"
feature_not_found "nss"
fi
smartcard_nss="no"
fi
@@ -3657,7 +3579,7 @@ if test "$libusb" != "no" ; then
libs_softmmu="$libs_softmmu $libusb_libs"
else
if test "$libusb" = "yes"; then
feature_not_found "libusb" "Install libusb devel >= 1.0.13"
feature_not_found "libusb" "Install libusb devel"
fi
libusb="no"
fi
@@ -3971,11 +3893,12 @@ else
fi
########################################
# check if we have valgrind/valgrind.h
# check if we have valgrind/valgrind.h and valgrind/memcheck.h
valgrind_h=no
cat > $TMPC << EOF
#include <valgrind/valgrind.h>
#include <valgrind/memcheck.h>
int main(void) {
return 0;
}
@@ -4081,7 +4004,7 @@ if test "$libnfs" != "no" ; then
LIBS="$LIBS $libnfs_libs"
else
if test "$libnfs" = "yes" ; then
feature_not_found "libnfs" "Install libnfs devel >= 1.9.3"
feature_not_found "libnfs"
fi
libnfs="no"
fi
@@ -4328,7 +4251,7 @@ echo "seccomp support $seccomp"
echo "coroutine backend $coroutine"
echo "coroutine pool $coroutine_pool"
echo "GlusterFS support $glusterfs"
echo "Archipelago support $archipelago"
echo "virtio-blk-data-plane $virtio_blk_data_plane"
echo "gcov $gcov_tool"
echo "gcov enabled $gcov"
echo "TPM support $tpm"
@@ -4537,9 +4460,6 @@ fi
if test "$fallocate_punch_hole" = "yes" ; then
echo "CONFIG_FALLOCATE_PUNCH_HOLE=y" >> $config_host_mak
fi
if test "$posix_fallocate" = "yes" ; then
echo "CONFIG_POSIX_FALLOCATE=y" >> $config_host_mak
fi
if test "$sync_file_range" = "yes" ; then
echo "CONFIG_SYNC_FILE_RANGE=y" >> $config_host_mak
fi
@@ -4567,12 +4487,6 @@ fi
if test "$sendfile" = "yes" ; then
echo "CONFIG_SENDFILE=y" >> $config_host_mak
fi
if test "$timerfd" = "yes" ; then
echo "CONFIG_TIMERFD=y" >> $config_host_mak
fi
if test "$setns" = "yes" ; then
echo "CONFIG_SETNS=y" >> $config_host_mak
fi
if test "$inotify" = "yes" ; then
echo "CONFIG_INOTIFY=y" >> $config_host_mak
fi
@@ -4597,9 +4511,6 @@ if test "$bluez" = "yes" ; then
echo "CONFIG_BLUEZ=y" >> $config_host_mak
echo "BLUEZ_CFLAGS=$bluez_cflags" >> $config_host_mak
fi
if test "glib_subprocess" = "yes" ; then
echo "CONFIG_HAS_GLIB_SUBPROCESS_TESTS=y" >> $config_host_mak
fi
echo "GLIB_CFLAGS=$glib_cflags" >> $config_host_mak
if test "$gtk" = "yes" ; then
echo "CONFIG_GTK=y" >> $config_host_mak
@@ -4778,11 +4689,6 @@ if test "$glusterfs_zerofill" = "yes" ; then
echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
fi
if test "$archipelago" = "yes" ; then
echo "CONFIG_ARCHIPELAGO=m" >> $config_host_mak
echo "ARCHIPELAGO_LIBS=$archipelago_libs" >> $config_host_mak
fi
if test "$libssh2" = "yes" ; then
echo "CONFIG_LIBSSH2=m" >> $config_host_mak
echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
@@ -4793,6 +4699,10 @@ if test "$quorum" = "yes" ; then
echo "CONFIG_QUORUM=y" >> $config_host_mak
fi
if test "$virtio_blk_data_plane" = "yes" ; then
echo 'CONFIG_VIRTIO_BLK_DATA_PLANE=$(CONFIG_VIRTIO)' >> $config_host_mak
fi
if test "$vhdx" = "yes" ; then
echo "CONFIG_VHDX=y" >> $config_host_mak
fi
@@ -4899,7 +4809,6 @@ echo "AS=$as" >> $config_host_mak
echo "CPP=$cpp" >> $config_host_mak
echo "OBJCOPY=$objcopy" >> $config_host_mak
echo "LD=$ld" >> $config_host_mak
echo "NM=$nm" >> $config_host_mak
echo "WINDRES=$windres" >> $config_host_mak
echo "LIBTOOL=$libtool" >> $config_host_mak
echo "CFLAGS=$CFLAGS" >> $config_host_mak
@@ -5028,7 +4937,7 @@ case "$target_name" in
aarch64)
TARGET_BASE_ARCH=arm
bflt="yes"
gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
gdb_xml_files="aarch64-core.xml aarch64-fpu.xml"
;;
cris)
;;
@@ -5057,8 +4966,6 @@ case "$target_name" in
TARGET_BASE_ARCH=mips
echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
;;
tricore)
;;
moxie)
;;
or32)
@@ -5107,7 +5014,6 @@ case "$target_name" in
echo "TARGET_ABI32=y" >> $config_target_mak
;;
s390x)
gdb_xml_files="s390x-core64.xml s390-acr.xml s390-fpr.xml"
;;
unicore32)
;;
@@ -5387,6 +5293,10 @@ for rom in seabios vgabios ; do
echo "LD=$ld" >> $config_mak
done
if test "$docs" = "yes" ; then
mkdir -p QMP
fi
# set up qemu-iotests in this build directory
iotests_common_env="tests/qemu-iotests/common.env"
iotests_check="tests/qemu-iotests/check"

View File

@@ -18,114 +18,10 @@
*/
#include "config.h"
#include "cpu.h"
#include "trace.h"
#include "disas/disas.h"
#include "tcg.h"
#include "qemu/atomic.h"
#include "sysemu/qtest.h"
#include "qemu/timer.h"
/* -icount align implementation. */
typedef struct SyncClocks {
int64_t diff_clk;
int64_t last_cpu_icount;
int64_t realtime_clock;
} SyncClocks;
#if !defined(CONFIG_USER_ONLY)
/* Allow the guest to have a max 3ms advance.
* The difference between the 2 clocks could therefore
* oscillate around 0.
*/
#define VM_CLOCK_ADVANCE 3000000
#define THRESHOLD_REDUCE 1.5
#define MAX_DELAY_PRINT_RATE 2000000000LL
#define MAX_NB_PRINTS 100
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
int64_t cpu_icount;
if (!icount_align_option) {
return;
}
cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
sc->diff_clk += cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
sc->last_cpu_icount = cpu_icount;
if (sc->diff_clk > VM_CLOCK_ADVANCE) {
#ifndef _WIN32
struct timespec sleep_delay, rem_delay;
sleep_delay.tv_sec = sc->diff_clk / 1000000000LL;
sleep_delay.tv_nsec = sc->diff_clk % 1000000000LL;
if (nanosleep(&sleep_delay, &rem_delay) < 0) {
sc->diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
sc->diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
} else {
sc->diff_clk = 0;
}
#else
Sleep(sc->diff_clk / SCALE_MS);
sc->diff_clk = 0;
#endif
}
}
static void print_delay(const SyncClocks *sc)
{
static float threshold_delay;
static int64_t last_realtime_clock;
static int nb_prints;
if (icount_align_option &&
sc->realtime_clock - last_realtime_clock >= MAX_DELAY_PRINT_RATE &&
nb_prints < MAX_NB_PRINTS) {
if ((-sc->diff_clk / (float)1000000000LL > threshold_delay) ||
(-sc->diff_clk / (float)1000000000LL <
(threshold_delay - THRESHOLD_REDUCE))) {
threshold_delay = (-sc->diff_clk / 1000000000LL) + 1;
printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
threshold_delay - 1,
threshold_delay);
nb_prints++;
last_realtime_clock = sc->realtime_clock;
}
}
}
static void init_delay_params(SyncClocks *sc,
const CPUState *cpu)
{
if (!icount_align_option) {
return;
}
sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
sc->realtime_clock +
cpu_get_clock_offset();
sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
if (sc->diff_clk < max_delay) {
max_delay = sc->diff_clk;
}
if (sc->diff_clk > max_advance) {
max_advance = sc->diff_clk;
}
/* Print every 2s max if the guest is late. We limit the number
of printed messages to NB_PRINT_MAX(currently 100) */
print_delay(sc);
}
#else
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
}
static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
{
}
#endif /* CONFIG USER ONLY */
void cpu_loop_exit(CPUState *cpu)
{
@@ -169,9 +65,6 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr)
#endif /* DEBUG_DISAS */
next_tb = tcg_qemu_tb_exec(env, tb_ptr);
trace_exec_tb_exit((void *) (next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
@@ -212,7 +105,6 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
cpu->current_tb = tb;
/* execute the generated code */
trace_exec_tb_nocache(tb, tb->pc);
cpu_tb_exec(cpu, tb->tc_ptr);
cpu->current_tb = NULL;
tb_phys_invalidate(tb, -1);
@@ -295,10 +187,16 @@ static inline TranslationBlock *tb_find_fast(CPUArchState *env)
return tb;
}
static CPUDebugExcpHandler *debug_excp_handler;
void cpu_set_debug_excp_handler(CPUDebugExcpHandler *handler)
{
debug_excp_handler = handler;
}
static void cpu_handle_debug_exception(CPUArchState *env)
{
CPUState *cpu = ENV_GET_CPU(env);
CPUClass *cc = CPU_GET_CLASS(cpu);
CPUWatchpoint *wp;
if (!cpu->watchpoint_hit) {
@@ -306,8 +204,9 @@ static void cpu_handle_debug_exception(CPUArchState *env)
wp->flags &= ~BP_WATCHPOINT_HIT;
}
}
cc->debug_excp_handler(cpu);
if (debug_excp_handler) {
debug_excp_handler(env);
}
}
/* main execution loop */
@@ -317,7 +216,10 @@ volatile sig_atomic_t exit_request;
int cpu_exec(CPUArchState *env)
{
CPUState *cpu = ENV_GET_CPU(env);
#if !(defined(CONFIG_USER_ONLY) && \
(defined(TARGET_M68K) || defined(TARGET_PPC) || defined(TARGET_S390X)))
CPUClass *cc = CPU_GET_CLASS(cpu);
#endif
#ifdef TARGET_I386
X86CPU *x86_cpu = X86_CPU(cpu);
#endif
@@ -325,8 +227,6 @@ int cpu_exec(CPUArchState *env)
TranslationBlock *tb;
uint8_t *tc_ptr;
uintptr_t next_tb;
SyncClocks sc;
/* This must be volatile so it is not trashed by longjmp() */
volatile bool have_tb_lock = false;
@@ -352,16 +252,37 @@ int cpu_exec(CPUArchState *env)
cpu->exit_request = 1;
}
cc->cpu_exec_enter(cpu);
#if defined(TARGET_I386)
/* put eflags in CPU temporary format */
CC_SRC = env->eflags & (CC_O | CC_S | CC_Z | CC_A | CC_P | CC_C);
env->df = 1 - (2 * ((env->eflags >> 10) & 1));
CC_OP = CC_OP_EFLAGS;
env->eflags &= ~(DF_MASK | CC_O | CC_S | CC_Z | CC_A | CC_P | CC_C);
#elif defined(TARGET_SPARC)
#elif defined(TARGET_M68K)
env->cc_op = CC_OP_FLAGS;
env->cc_dest = env->sr & 0xf;
env->cc_x = (env->sr >> 4) & 1;
#elif defined(TARGET_ALPHA)
#elif defined(TARGET_ARM)
#elif defined(TARGET_UNICORE32)
#elif defined(TARGET_PPC)
env->reserve_addr = -1;
#elif defined(TARGET_LM32)
#elif defined(TARGET_MICROBLAZE)
#elif defined(TARGET_MIPS)
#elif defined(TARGET_MOXIE)
#elif defined(TARGET_OPENRISC)
#elif defined(TARGET_SH4)
#elif defined(TARGET_CRIS)
#elif defined(TARGET_S390X)
#elif defined(TARGET_XTENSA)
/* XXXXX */
#else
#error unsupported target CPU
#endif
cpu->exception_index = -1;
/* Calculate difference between guest clock and host clock.
* This delay includes the delay of the last cycle, so
* what we have to do is sleep until it is 0. As for the
* advance/delay we gain here, we try to fix it next time.
*/
init_delay_params(&sc, cpu);
/* prepare setjmp context for exception handling */
for(;;) {
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -404,12 +325,16 @@ int cpu_exec(CPUArchState *env)
cpu->exception_index = EXCP_DEBUG;
cpu_loop_exit(cpu);
}
#if defined(TARGET_ARM) || defined(TARGET_SPARC) || defined(TARGET_MIPS) || \
defined(TARGET_PPC) || defined(TARGET_ALPHA) || defined(TARGET_CRIS) || \
defined(TARGET_MICROBLAZE) || defined(TARGET_LM32) || defined(TARGET_UNICORE32)
if (interrupt_request & CPU_INTERRUPT_HALT) {
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
cpu->exception_index = EXCP_HLT;
cpu_loop_exit(cpu);
}
#endif
#if defined(TARGET_I386)
if (interrupt_request & CPU_INTERRUPT_INIT) {
cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
@@ -422,15 +347,251 @@ int cpu_exec(CPUArchState *env)
cpu_reset(cpu);
}
#endif
/* The target hook has 3 exit conditions:
False when the interrupt isn't processed,
True when it is, and we should restart on a new TB,
and via longjmp via cpu_loop_exit. */
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
#if defined(TARGET_I386)
#if !defined(CONFIG_USER_ONLY)
if (interrupt_request & CPU_INTERRUPT_POLL) {
cpu->interrupt_request &= ~CPU_INTERRUPT_POLL;
apic_poll_irq(x86_cpu->apic_state);
}
#endif
if (interrupt_request & CPU_INTERRUPT_SIPI) {
do_cpu_sipi(x86_cpu);
} else if (env->hflags2 & HF2_GIF_MASK) {
if ((interrupt_request & CPU_INTERRUPT_SMI) &&
!(env->hflags & HF_SMM_MASK)) {
cpu_svm_check_intercept_param(env, SVM_EXIT_SMI,
0);
cpu->interrupt_request &= ~CPU_INTERRUPT_SMI;
do_smm_enter(x86_cpu);
next_tb = 0;
} else if ((interrupt_request & CPU_INTERRUPT_NMI) &&
!(env->hflags2 & HF2_NMI_MASK)) {
cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
env->hflags2 |= HF2_NMI_MASK;
do_interrupt_x86_hardirq(env, EXCP02_NMI, 1);
next_tb = 0;
} else if (interrupt_request & CPU_INTERRUPT_MCE) {
cpu->interrupt_request &= ~CPU_INTERRUPT_MCE;
do_interrupt_x86_hardirq(env, EXCP12_MCHK, 0);
next_tb = 0;
} else if ((interrupt_request & CPU_INTERRUPT_HARD) &&
(((env->hflags2 & HF2_VINTR_MASK) &&
(env->hflags2 & HF2_HIF_MASK)) ||
(!(env->hflags2 & HF2_VINTR_MASK) &&
(env->eflags & IF_MASK &&
!(env->hflags & HF_INHIBIT_IRQ_MASK))))) {
int intno;
cpu_svm_check_intercept_param(env, SVM_EXIT_INTR,
0);
cpu->interrupt_request &= ~(CPU_INTERRUPT_HARD |
CPU_INTERRUPT_VIRQ);
intno = cpu_get_pic_interrupt(env);
qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing hardware INT=0x%02x\n", intno);
do_interrupt_x86_hardirq(env, intno, 1);
/* ensure that no TB jump will be modified as
the program flow was changed */
next_tb = 0;
#if !defined(CONFIG_USER_ONLY)
} else if ((interrupt_request & CPU_INTERRUPT_VIRQ) &&
(env->eflags & IF_MASK) &&
!(env->hflags & HF_INHIBIT_IRQ_MASK)) {
int intno;
/* FIXME: this should respect TPR */
cpu_svm_check_intercept_param(env, SVM_EXIT_VINTR,
0);
intno = ldl_phys(cpu->as,
env->vm_vmcb
+ offsetof(struct vmcb,
control.int_vector));
qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing virtual hardware INT=0x%02x\n", intno);
do_interrupt_x86_hardirq(env, intno, 1);
cpu->interrupt_request &= ~CPU_INTERRUPT_VIRQ;
next_tb = 0;
#endif
}
}
#elif defined(TARGET_PPC)
if (interrupt_request & CPU_INTERRUPT_HARD) {
ppc_hw_interrupt(env);
if (env->pending_interrupts == 0) {
cpu->interrupt_request &= ~CPU_INTERRUPT_HARD;
}
next_tb = 0;
}
/* Don't use the cached interrupt_request value,
do_interrupt may have updated the EXITTB flag. */
#elif defined(TARGET_LM32)
if ((interrupt_request & CPU_INTERRUPT_HARD)
&& (env->ie & IE_IE)) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_MICROBLAZE)
if ((interrupt_request & CPU_INTERRUPT_HARD)
&& (env->sregs[SR_MSR] & MSR_IE)
&& !(env->sregs[SR_MSR] & (MSR_EIP | MSR_BIP))
&& !(env->iflags & (D_FLAG | IMM_FLAG))) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_MIPS)
if ((interrupt_request & CPU_INTERRUPT_HARD) &&
cpu_mips_hw_interrupts_pending(env)) {
/* Raise it */
cpu->exception_index = EXCP_EXT_INTERRUPT;
env->error_code = 0;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_OPENRISC)
{
int idx = -1;
if ((interrupt_request & CPU_INTERRUPT_HARD)
&& (env->sr & SR_IEE)) {
idx = EXCP_INT;
}
if ((interrupt_request & CPU_INTERRUPT_TIMER)
&& (env->sr & SR_TEE)) {
idx = EXCP_TICK;
}
if (idx >= 0) {
cpu->exception_index = idx;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
#elif defined(TARGET_SPARC)
if (interrupt_request & CPU_INTERRUPT_HARD) {
if (cpu_interrupts_enabled(env) &&
env->interrupt_index > 0) {
int pil = env->interrupt_index & 0xf;
int type = env->interrupt_index & 0xf0;
if (((type == TT_EXTINT) &&
cpu_pil_allowed(env, pil)) ||
type != TT_EXTINT) {
cpu->exception_index = env->interrupt_index;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
}
#elif defined(TARGET_ARM)
if (interrupt_request & CPU_INTERRUPT_FIQ
&& !(env->daif & PSTATE_F)) {
cpu->exception_index = EXCP_FIQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
/* ARMv7-M interrupt return works by loading a magic value
into the PC. On real hardware the load causes the
return to occur. The qemu implementation performs the
jump normally, then does the exception return when the
CPU tries to execute code at the magic address.
This will cause the magic PC value to be pushed to
the stack if an interrupt occurred at the wrong time.
We avoid this by disabling interrupts when
pc contains a magic address. */
if (interrupt_request & CPU_INTERRUPT_HARD
&& ((IS_M(env) && env->regs[15] < 0xfffffff0)
|| !(env->daif & PSTATE_I))) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_UNICORE32)
if (interrupt_request & CPU_INTERRUPT_HARD
&& !(env->uncached_asr & ASR_I)) {
cpu->exception_index = UC32_EXCP_INTR;
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_SH4)
if (interrupt_request & CPU_INTERRUPT_HARD) {
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_ALPHA)
{
int idx = -1;
/* ??? This hard-codes the OSF/1 interrupt levels. */
switch (env->pal_mode ? 7 : env->ps & PS_INT_MASK) {
case 0 ... 3:
if (interrupt_request & CPU_INTERRUPT_HARD) {
idx = EXCP_DEV_INTERRUPT;
}
/* FALLTHRU */
case 4:
if (interrupt_request & CPU_INTERRUPT_TIMER) {
idx = EXCP_CLK_INTERRUPT;
}
/* FALLTHRU */
case 5:
if (interrupt_request & CPU_INTERRUPT_SMP) {
idx = EXCP_SMP_INTERRUPT;
}
/* FALLTHRU */
case 6:
if (interrupt_request & CPU_INTERRUPT_MCHK) {
idx = EXCP_MCHK;
}
}
if (idx >= 0) {
cpu->exception_index = idx;
env->error_code = 0;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
#elif defined(TARGET_CRIS)
if (interrupt_request & CPU_INTERRUPT_HARD
&& (env->pregs[PR_CCS] & I_FLAG)
&& !env->locked_irq) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
if (interrupt_request & CPU_INTERRUPT_NMI) {
unsigned int m_flag_archval;
if (env->pregs[PR_VR] < 32) {
m_flag_archval = M_FLAG_V10;
} else {
m_flag_archval = M_FLAG_V32;
}
if ((env->pregs[PR_CCS] & m_flag_archval)) {
cpu->exception_index = EXCP_NMI;
cc->do_interrupt(cpu);
next_tb = 0;
}
}
#elif defined(TARGET_M68K)
if (interrupt_request & CPU_INTERRUPT_HARD
&& ((env->sr & SR_I) >> SR_I_SHIFT)
< env->pending_level) {
/* Real hardware gets the interrupt vector via an
IACK cycle at this point. Current emulated
hardware doesn't rely on this, so we
provide/save the vector when the interrupt is
first signalled. */
cpu->exception_index = env->pending_vector;
do_interrupt_m68k_hardirq(env);
next_tb = 0;
}
#elif defined(TARGET_S390X) && !defined(CONFIG_USER_ONLY)
if ((interrupt_request & CPU_INTERRUPT_HARD) &&
(env->psw.mask & PSW_MASK_EXT)) {
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_XTENSA)
if (interrupt_request & CPU_INTERRUPT_HARD) {
cpu->exception_index = EXC_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
}
#endif
/* Don't use the cached interrupt_request value,
do_interrupt may have updated the EXITTB flag. */
if (cpu->interrupt_request & CPU_INTERRUPT_EXITTB) {
cpu->interrupt_request &= ~CPU_INTERRUPT_EXITTB;
/* ensure that no TB jump will be modified as
@@ -476,7 +637,6 @@ int cpu_exec(CPUArchState *env)
cpu->current_tb = tb;
barrier();
if (likely(!cpu->exit_request)) {
trace_exec_tb(tb, tb->pc);
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = cpu_tb_exec(cpu, tc_ptr);
@@ -512,7 +672,6 @@ int cpu_exec(CPUArchState *env)
if (insns_left > 0) {
/* Execute remaining instructions. */
cpu_exec_nocache(env, insns_left, tb);
align_clocks(&sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
next_tb = 0;
@@ -525,9 +684,6 @@ int cpu_exec(CPUArchState *env)
}
}
cpu->current_tb = NULL;
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
/* reset soft MMU for next block (it can currently
only be set by a memory fault) */
} /* for(;;) */
@@ -536,7 +692,10 @@ int cpu_exec(CPUArchState *env)
* local variables as longjmp is marked 'noreturn'. */
cpu = current_cpu;
env = cpu->env_ptr;
#if !(defined(CONFIG_USER_ONLY) && \
(defined(TARGET_M68K) || defined(TARGET_PPC) || defined(TARGET_S390X)))
cc = CPU_GET_CLASS(cpu);
#endif
#ifdef TARGET_I386
x86_cpu = X86_CPU(cpu);
#endif
@@ -547,7 +706,35 @@ int cpu_exec(CPUArchState *env)
}
} /* for(;;) */
cc->cpu_exec_exit(cpu);
#if defined(TARGET_I386)
/* restore flags in standard format */
env->eflags = env->eflags | cpu_cc_compute_all(env, CC_OP)
| (env->df & DF_MASK);
#elif defined(TARGET_ARM)
/* XXX: Save/restore host fpu exception state?. */
#elif defined(TARGET_UNICORE32)
#elif defined(TARGET_SPARC)
#elif defined(TARGET_PPC)
#elif defined(TARGET_LM32)
#elif defined(TARGET_M68K)
cpu_m68k_flush_flags(env, env->cc_op);
env->cc_op = CC_OP_FLAGS;
env->sr = (env->sr & 0xffe0)
| env->cc_dest | (env->cc_x << 4);
#elif defined(TARGET_MICROBLAZE)
#elif defined(TARGET_MIPS)
#elif defined(TARGET_MOXIE)
#elif defined(TARGET_OPENRISC)
#elif defined(TARGET_SH4)
#elif defined(TARGET_ALPHA)
#elif defined(TARGET_CRIS)
#elif defined(TARGET_S390X)
#elif defined(TARGET_XTENSA)
/* XXXXX */
#else
#error unsupported target CPU
#endif
/* fail safe : never use current_cpu outside cpu_exec() */
current_cpu = NULL;

154
cpus.c
View File

@@ -40,7 +40,6 @@
#include "qemu/bitmap.h"
#include "qemu/seqlock.h"
#include "qapi-event.h"
#include "hw/nmi.h"
#ifndef _WIN32
#include "qemu/compatfd.h"
@@ -65,8 +64,6 @@
#endif /* CONFIG_LINUX */
static CPUState *next_cpu;
int64_t max_delay;
int64_t max_advance;
bool cpu_is_stopped(CPUState *cpu)
{
@@ -105,12 +102,17 @@ static bool all_cpu_threads_idle(void)
/* Protected by TimersState seqlock */
static int64_t vm_clock_warp_start = -1;
/* Compensate for varying guest execution speed. */
static int64_t qemu_icount_bias;
static int64_t vm_clock_warp_start;
/* Conversion factor from emulated instructions to virtual clock ticks. */
static int icount_time_shift;
/* Arbitrarily pick 1MIPS as the minimum allowable speed. */
#define MAX_ICOUNT_SHIFT 10
/* Only written by TCG thread */
static int64_t qemu_icount;
static QEMUTimer *icount_rt_timer;
static QEMUTimer *icount_vm_timer;
static QEMUTimer *icount_warp_timer;
@@ -127,11 +129,6 @@ typedef struct TimersState {
int64_t cpu_clock_offset;
int32_t cpu_ticks_enabled;
int64_t dummy;
/* Compensate for varying guest execution speed. */
int64_t qemu_icount_bias;
/* Only written by TCG thread */
int64_t qemu_icount;
} TimersState;
static TimersState timers_state;
@@ -142,14 +139,14 @@ static int64_t cpu_get_icount_locked(void)
int64_t icount;
CPUState *cpu = current_cpu;
icount = timers_state.qemu_icount;
icount = qemu_icount;
if (cpu) {
if (!cpu_can_do_io(cpu)) {
fprintf(stderr, "Bad clock read\n");
}
icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
}
return timers_state.qemu_icount_bias + cpu_icount_to_ns(icount);
return qemu_icount_bias + (icount << icount_time_shift);
}
int64_t cpu_get_icount(void)
@@ -165,11 +162,6 @@ int64_t cpu_get_icount(void)
return icount;
}
int64_t cpu_icount_to_ns(int64_t icount)
{
return icount << icount_time_shift;
}
/* return the host CPU cycle counter and handle stop/restart */
/* Caller must hold the BQL */
int64_t cpu_get_ticks(void)
@@ -222,23 +214,6 @@ int64_t cpu_get_clock(void)
return ti;
}
/* return the offset between the host clock and virtual CPU clock */
int64_t cpu_get_clock_offset(void)
{
int64_t ti;
unsigned start;
do {
start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
ti = timers_state.cpu_clock_offset;
if (!timers_state.cpu_ticks_enabled) {
ti -= get_clock();
}
} while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
return -ti;
}
/* enable cpu_get_ticks()
* Caller must hold BQL which server as mutex for vm_clock_seqlock.
*/
@@ -309,8 +284,7 @@ static void icount_adjust(void)
icount_time_shift++;
}
last_delta = delta;
timers_state.qemu_icount_bias = cur_icount
- (timers_state.qemu_icount << icount_time_shift);
qemu_icount_bias = cur_icount - (qemu_icount << icount_time_shift);
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
}
@@ -359,7 +333,7 @@ static void icount_warp_rt(void *opaque)
int64_t delta = cur_time - cur_icount;
warp_delta = MIN(warp_delta, delta);
}
timers_state.qemu_icount_bias += warp_delta;
qemu_icount_bias += warp_delta;
}
vm_clock_warp_start = -1;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
@@ -377,7 +351,7 @@ void qtest_clock_warp(int64_t dest)
int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
seqlock_write_lock(&timers_state.vm_clock_seqlock);
timers_state.qemu_icount_bias += warp;
qemu_icount_bias += warp;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
@@ -454,25 +428,6 @@ void qemu_clock_warp(QEMUClockType type)
}
}
static bool icount_state_needed(void *opaque)
{
return use_icount;
}
/*
* This is a subsection for icount migration.
*/
static const VMStateDescription icount_vmstate_timers = {
.name = "timer/icount",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_INT64(qemu_icount_bias, TimersState),
VMSTATE_INT64(qemu_icount, TimersState),
VMSTATE_END_OF_LIST()
}
};
static const VMStateDescription vmstate_timers = {
.name = "timer",
.version_id = 2,
@@ -482,48 +437,23 @@ static const VMStateDescription vmstate_timers = {
VMSTATE_INT64(dummy, TimersState),
VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
VMSTATE_END_OF_LIST()
},
.subsections = (VMStateSubsection[]) {
{
.vmsd = &icount_vmstate_timers,
.needed = icount_state_needed,
}, {
/* empty */
}
}
};
void cpu_ticks_init(void)
void configure_icount(const char *option)
{
seqlock_init(&timers_state.vm_clock_seqlock, NULL);
vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
}
void configure_icount(QemuOpts *opts, Error **errp)
{
const char *option;
char *rem_str = NULL;
option = qemu_opt_get(opts, "shift");
if (!option) {
if (qemu_opt_get(opts, "align") != NULL) {
error_setg(errp, "Please specify shift option when using align");
}
return;
}
icount_align_option = qemu_opt_get_bool(opts, "align", false);
icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
icount_warp_rt, NULL);
if (strcmp(option, "auto") != 0) {
errno = 0;
icount_time_shift = strtol(option, &rem_str, 0);
if (errno != 0 || *rem_str != '\0' || !strlen(option)) {
error_setg(errp, "icount: Invalid shift value");
}
icount_time_shift = strtol(option, NULL, 0);
use_icount = 1;
return;
} else if (icount_align_option) {
error_setg(errp, "shift=auto and align=on are incompatible");
}
use_icount = 2;
@@ -593,15 +523,6 @@ void cpu_synchronize_all_post_init(void)
}
}
void cpu_clean_all_dirty(void)
{
CPUState *cpu;
CPU_FOREACH(cpu) {
cpu_clean_state(cpu);
}
}
static int do_vm_stop(RunState state)
{
int ret = 0;
@@ -1329,8 +1250,7 @@ static int tcg_cpu_exec(CPUArchState *env)
int64_t count;
int64_t deadline;
int decr;
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
cpu->icount_decr.u16.low = 0;
cpu->icount_extra = 0;
deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
@@ -1345,7 +1265,7 @@ static int tcg_cpu_exec(CPUArchState *env)
}
count = qemu_icount_round(deadline);
timers_state.qemu_icount += count;
qemu_icount += count;
decr = (count > 0xffff) ? 0xffff : count;
count -= decr;
cpu->icount_decr.u16.low = decr;
@@ -1358,8 +1278,7 @@ static int tcg_cpu_exec(CPUArchState *env)
if (use_icount) {
/* Fold pending instructions back into the
instruction counter, and clear the interrupt flag. */
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
cpu->icount_decr.u32 = 0;
cpu->icount_extra = 0;
}
@@ -1423,9 +1342,6 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_MIPS)
MIPSCPU *mips_cpu = MIPS_CPU(cpu);
CPUMIPSState *env = &mips_cpu->env;
#elif defined(TARGET_TRICORE)
TriCoreCPU *tricore_cpu = TRICORE_CPU(cpu);
CPUTriCoreState *env = &tricore_cpu->env;
#endif
cpu_synchronize_state(cpu);
@@ -1450,9 +1366,6 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_MIPS)
info->value->has_PC = true;
info->value->PC = env->active_tc.PC;
#elif defined(TARGET_TRICORE)
info->value->has_PC = true;
info->value->PC = env->PC;
#endif
/* XXX: waiting for the qapi to support GSList */
@@ -1556,24 +1469,21 @@ void qmp_inject_nmi(Error **errp)
apic_deliver_nmi(cpu->apic_state);
}
}
#elif defined(TARGET_S390X)
CPUState *cs;
S390CPU *cpu;
CPU_FOREACH(cs) {
cpu = S390_CPU(cs);
if (cpu->env.cpu_num == monitor_get_cpu_index()) {
if (s390_cpu_restart(S390_CPU(cs)) == -1) {
error_set(errp, QERR_UNSUPPORTED);
return;
}
break;
}
}
#else
nmi_monitor_handle(monitor_get_cpu_index(), errp);
error_set(errp, QERR_UNSUPPORTED);
#endif
}
void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)
{
if (!use_icount) {
return;
}
cpu_fprintf(f, "Host - Guest clock %"PRIi64" ms\n",
(cpu_get_clock() - cpu_get_icount())/SCALE_MS);
if (icount_align_option) {
cpu_fprintf(f, "Max guest delay %"PRIi64" ms\n", -max_delay/SCALE_MS);
cpu_fprintf(f, "Max guest advance %"PRIi64" ms\n", max_advance/SCALE_MS);
} else {
cpu_fprintf(f, "Max guest delay NA\n");
cpu_fprintf(f, "Max guest advance NA\n");
}
}

View File

@@ -60,10 +60,8 @@ void tlb_flush(CPUState *cpu, int flush_global)
cpu->current_tb = NULL;
memset(env->tlb_table, -1, sizeof(env->tlb_table));
memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
env->vtlb_index = 0;
env->tlb_flush_addr = -1;
env->tlb_flush_mask = 0;
tlb_flush_count++;
@@ -110,14 +108,6 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
}
/* check whether there are entries that need to be flushed in the vtlb */
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
int k;
for (k = 0; k < CPU_VTLB_SIZE; k++) {
tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
}
}
tb_flush_jmp_cache(cpu, addr);
}
@@ -182,11 +172,6 @@ void cpu_tlb_reset_dirty_all(ram_addr_t start1, ram_addr_t length)
tlb_reset_dirty_range(&env->tlb_table[mmu_idx][i],
start1, length);
}
for (i = 0; i < CPU_VTLB_SIZE; i++) {
tlb_reset_dirty_range(&env->tlb_v_table[mmu_idx][i],
start1, length);
}
}
}
}
@@ -210,13 +195,6 @@ void tlb_set_dirty(CPUArchState *env, target_ulong vaddr)
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
tlb_set_dirty1(&env->tlb_table[mmu_idx][i], vaddr);
}
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
int k;
for (k = 0; k < CPU_VTLB_SIZE; k++) {
tlb_set_dirty1(&env->tlb_v_table[mmu_idx][k], vaddr);
}
}
}
/* Our TLB does not support large pages, so remember the area covered by
@@ -257,7 +235,6 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
uintptr_t addend;
CPUTLBEntry *te;
hwaddr iotlb, xlat, sz;
unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
assert(size >= TARGET_PAGE_SIZE);
if (size != TARGET_PAGE_SIZE) {
@@ -290,14 +267,8 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
prot, &address);
index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
te = &env->tlb_table[mmu_idx][index];
/* do not discard the translation in te, evict it into a victim tlb */
env->tlb_v_table[mmu_idx][vidx] = *te;
env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
/* refill the tlb */
env->iotlb[mmu_idx][index] = iotlb - vaddr;
te = &env->tlb_table[mmu_idx][index];
te->addend = addend - vaddr;
if (prot & PAGE_READ) {
te->addr_read = address;

View File

@@ -45,8 +45,8 @@ CONFIG_PREP=y
CONFIG_MAC=y
CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For PReP
CONFIG_MC146818RTC=y
CONFIG_ETSEC=y
CONFIG_ISA_TESTDEV=y
CONFIG_LIBDECNUMBER=y

View File

@@ -46,8 +46,6 @@ CONFIG_PREP=y
CONFIG_MAC=y
CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For pSeries
CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
@@ -60,3 +58,4 @@ CONFIG_I82374=y
CONFIG_I8257=y
CONFIG_MC146818RTC=y
CONFIG_ISA_TESTDEV=y
CONFIG_LIBDECNUMBER=y

View File

@@ -20,7 +20,6 @@
#include "config.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "sysemu/device_tree.h"
#include "sysemu/sysemu.h"
#include "hw/loader.h"
@@ -60,13 +59,13 @@ void *create_device_tree(int *sizep)
}
ret = fdt_open_into(fdt, fdt, *sizep);
if (ret) {
error_report("Unable to copy device tree in memory");
fprintf(stderr, "Unable to copy device tree in memory\n");
exit(1);
}
return fdt;
fail:
error_report("%s Couldn't create dt: %s", __func__, fdt_strerror(ret));
fprintf(stderr, "%s Couldn't create dt: %s\n", __func__, fdt_strerror(ret));
exit(1);
}
@@ -80,8 +79,8 @@ void *load_device_tree(const char *filename_path, int *sizep)
*sizep = 0;
dt_size = get_image_size(filename_path);
if (dt_size < 0) {
error_report("Unable to get size of device tree file '%s'",
filename_path);
printf("Unable to get size of device tree file '%s'\n",
filename_path);
goto fail;
}
@@ -93,21 +92,21 @@ void *load_device_tree(const char *filename_path, int *sizep)
dt_file_load_size = load_image(filename_path, fdt);
if (dt_file_load_size < 0) {
error_report("Unable to open device tree file '%s'",
filename_path);
printf("Unable to open device tree file '%s'\n",
filename_path);
goto fail;
}
ret = fdt_open_into(fdt, fdt, dt_size);
if (ret) {
error_report("Unable to copy device tree in memory");
printf("Unable to copy device tree in memory\n");
goto fail;
}
/* Check sanity of device tree */
if (fdt_check_header(fdt)) {
error_report("Device tree file loaded into memory is invalid: %s",
filename_path);
printf ("Device tree file loaded into memory is invalid: %s\n",
filename_path);
goto fail;
}
*sizep = dt_size;
@@ -124,8 +123,8 @@ static int findnode_nofail(void *fdt, const char *node_path)
offset = fdt_path_offset(fdt, node_path);
if (offset < 0) {
error_report("%s Couldn't find node %s: %s", __func__, node_path,
fdt_strerror(offset));
fprintf(stderr, "%s Couldn't find node %s: %s\n", __func__, node_path,
fdt_strerror(offset));
exit(1);
}
@@ -139,8 +138,8 @@ int qemu_fdt_setprop(void *fdt, const char *node_path,
r = fdt_setprop(fdt, findnode_nofail(fdt, node_path), property, val, size);
if (r < 0) {
error_report("%s: Couldn't set %s/%s: %s", __func__, node_path,
property, fdt_strerror(r));
fprintf(stderr, "%s: Couldn't set %s/%s: %s\n", __func__, node_path,
property, fdt_strerror(r));
exit(1);
}
@@ -154,8 +153,8 @@ int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
r = fdt_setprop_cell(fdt, findnode_nofail(fdt, node_path), property, val);
if (r < 0) {
error_report("%s: Couldn't set %s/%s = %#08x: %s", __func__,
node_path, property, val, fdt_strerror(r));
fprintf(stderr, "%s: Couldn't set %s/%s = %#08x: %s\n", __func__,
node_path, property, val, fdt_strerror(r));
exit(1);
}
@@ -176,8 +175,8 @@ int qemu_fdt_setprop_string(void *fdt, const char *node_path,
r = fdt_setprop_string(fdt, findnode_nofail(fdt, node_path), property, string);
if (r < 0) {
error_report("%s: Couldn't set %s/%s = %s: %s", __func__,
node_path, property, string, fdt_strerror(r));
fprintf(stderr, "%s: Couldn't set %s/%s = %s: %s\n", __func__,
node_path, property, string, fdt_strerror(r));
exit(1);
}
@@ -194,8 +193,8 @@ const void *qemu_fdt_getprop(void *fdt, const char *node_path,
}
r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
if (!r) {
error_report("%s: Couldn't get %s/%s: %s", __func__,
node_path, property, fdt_strerror(*lenp));
fprintf(stderr, "%s: Couldn't get %s/%s: %s\n", __func__,
node_path, property, fdt_strerror(*lenp));
exit(1);
}
return r;
@@ -207,8 +206,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
int len;
const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len);
if (len != 4) {
error_report("%s: %s/%s not 4 bytes long (not a cell?)",
__func__, node_path, property);
fprintf(stderr, "%s: %s/%s not 4 bytes long (not a cell?)\n",
__func__, node_path, property);
exit(1);
}
return be32_to_cpu(*p);
@@ -220,8 +219,8 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path)
r = fdt_get_phandle(fdt, findnode_nofail(fdt, path));
if (r == 0) {
error_report("%s: Couldn't get phandle for %s: %s", __func__,
path, fdt_strerror(r));
fprintf(stderr, "%s: Couldn't get phandle for %s: %s\n", __func__,
path, fdt_strerror(r));
exit(1);
}
@@ -266,8 +265,8 @@ int qemu_fdt_nop_node(void *fdt, const char *node_path)
r = fdt_nop_node(fdt, findnode_nofail(fdt, node_path));
if (r < 0) {
error_report("%s: Couldn't nop node %s: %s", __func__, node_path,
fdt_strerror(r));
fprintf(stderr, "%s: Couldn't nop node %s: %s\n", __func__, node_path,
fdt_strerror(r));
exit(1);
}
@@ -295,8 +294,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
retval = fdt_add_subnode(fdt, parent, basename);
if (retval < 0) {
error_report("FDT: Failed to create subnode %s: %s", name,
fdt_strerror(retval));
fprintf(stderr, "FDT: Failed to create subnode %s: %s\n", name,
fdt_strerror(retval));
exit(1);
}

View File

@@ -2,7 +2,7 @@
The code in this directory is a subset of libvixl:
https://github.com/armvixl/vixl
(specifically, it is the set of files needed for disassembly only,
taken from libvixl 1.5).
taken from libvixl 1.4).
Bugfixes should preferably be sent upstream initially.
The disassembler does not currently support the entire A64 instruction

View File

@@ -28,7 +28,6 @@
#define VIXL_A64_ASSEMBLER_A64_H_
#include <list>
#include <stack>
#include "globals.h"
#include "utils.h"
@@ -575,107 +574,34 @@ class MemOperand {
class Label {
public:
Label() : location_(kLocationUnbound) {}
Label() : is_bound_(false), link_(NULL), target_(NULL) {}
~Label() {
// If the label has been linked to, it needs to be bound to a target.
VIXL_ASSERT(!IsLinked() || IsBound());
}
inline bool IsBound() const { return location_ >= 0; }
inline bool IsLinked() const { return !links_.empty(); }
inline Instruction* link() const { return link_; }
inline Instruction* target() const { return target_; }
inline bool IsBound() const { return is_bound_; }
inline bool IsLinked() const { return link_ != NULL; }
inline void set_link(Instruction* new_link) { link_ = new_link; }
static const int kEndOfChain = 0;
private:
// The list of linked instructions is stored in a stack-like structure. We
// don't use std::stack directly because it's slow for the common case where
// only one or two instructions refer to a label, and labels themselves are
// short-lived. This class behaves like std::stack, but the first few links
// are preallocated (configured by kPreallocatedLinks).
//
// If more than N links are required, this falls back to std::stack.
class LinksStack {
public:
LinksStack() : size_(0), links_extended_(NULL) {}
~LinksStack() {
delete links_extended_;
}
size_t size() const {
return size_;
}
bool empty() const {
return size_ == 0;
}
void push(ptrdiff_t value) {
if (size_ < kPreallocatedLinks) {
links_[size_] = value;
} else {
if (links_extended_ == NULL) {
links_extended_ = new std::stack<ptrdiff_t>();
}
VIXL_ASSERT(size_ == (links_extended_->size() + kPreallocatedLinks));
links_extended_->push(value);
}
size_++;
}
ptrdiff_t top() const {
return (size_ <= kPreallocatedLinks) ? links_[size_ - 1]
: links_extended_->top();
}
void pop() {
size_--;
if (size_ >= kPreallocatedLinks) {
links_extended_->pop();
VIXL_ASSERT(size_ == (links_extended_->size() + kPreallocatedLinks));
}
}
private:
static const size_t kPreallocatedLinks = 4;
size_t size_;
ptrdiff_t links_[kPreallocatedLinks];
std::stack<ptrdiff_t> * links_extended_;
};
inline ptrdiff_t location() const { return location_; }
inline void Bind(ptrdiff_t location) {
// Labels can only be bound once.
VIXL_ASSERT(!IsBound());
location_ = location;
}
inline void AddLink(ptrdiff_t instruction) {
// If a label is bound, the assembler already has the information it needs
// to write the instruction, so there is no need to add it to links_.
VIXL_ASSERT(!IsBound());
links_.push(instruction);
}
inline ptrdiff_t GetAndRemoveNextLink() {
VIXL_ASSERT(IsLinked());
ptrdiff_t link = links_.top();
links_.pop();
return link;
}
// The offsets of the instructions that have linked to this label.
LinksStack links_;
// Indicates if the label has been bound, ie its location is fixed.
bool is_bound_;
// Branches instructions branching to this label form a chained list, with
// their offset indicating where the next instruction is located.
// link_ points to the latest branch instruction generated branching to this
// branch.
// If link_ is not NULL, the label has been linked to.
Instruction* link_;
// The label location.
ptrdiff_t location_;
Instruction* target_;
static const ptrdiff_t kLocationUnbound = -1;
// It is not safe to copy labels, so disable the copy constructor by declaring
// it private (without an implementation).
Label(const Label&);
// The Assembler class is responsible for binding and linking labels, since
// the stored offsets need to be consistent with the Assembler's buffer.
friend class Assembler;
};
@@ -709,49 +635,10 @@ class Literal {
};
// Control whether or not position-independent code should be emitted.
enum PositionIndependentCodeOption {
// All code generated will be position-independent; all branches and
// references to labels generated with the Label class will use PC-relative
// addressing.
PositionIndependentCode,
// Allow VIXL to generate code that refers to absolute addresses. With this
// option, it will not be possible to copy the code buffer and run it from a
// different address; code must be generated in its final location.
PositionDependentCode,
// Allow VIXL to assume that the bottom 12 bits of the address will be
// constant, but that the top 48 bits may change. This allows `adrp` to
// function in systems which copy code between pages, but otherwise maintain
// 4KB page alignment.
PageOffsetDependentCode
};
// Control how scaled- and unscaled-offset loads and stores are generated.
enum LoadStoreScalingOption {
// Prefer scaled-immediate-offset instructions, but emit unscaled-offset,
// register-offset, pre-index or post-index instructions if necessary.
PreferScaledOffset,
// Prefer unscaled-immediate-offset instructions, but emit scaled-offset,
// register-offset, pre-index or post-index instructions if necessary.
PreferUnscaledOffset,
// Require scaled-immediate-offset instructions.
RequireScaledOffset,
// Require unscaled-immediate-offset instructions.
RequireUnscaledOffset
};
// Assembler.
class Assembler {
public:
Assembler(byte* buffer, unsigned buffer_size,
PositionIndependentCodeOption pic = PositionIndependentCode);
Assembler(byte* buffer, unsigned buffer_size);
// The destructor asserts that one of the following is true:
// * The Assembler object has not been used.
@@ -775,16 +662,13 @@ class Assembler {
// Label.
// Bind a label to the current PC.
void bind(Label* label);
// Return the address of a bound label.
template <typename T>
inline T GetLabelAddress(const Label * label) {
VIXL_ASSERT(label->IsBound());
VIXL_STATIC_ASSERT(sizeof(T) >= sizeof(uintptr_t));
VIXL_STATIC_ASSERT(sizeof(*buffer_) == 1);
return reinterpret_cast<T>(buffer_ + label->location());
int UpdateAndGetByteOffsetTo(Label* label);
inline int UpdateAndGetInstructionOffsetTo(Label* label) {
VIXL_ASSERT(Label::kEndOfChain == 0);
return UpdateAndGetByteOffsetTo(label) >> kInstructionSizeLog2;
}
// Instruction set functions.
// Branch / Jump instructions.
@@ -849,12 +733,6 @@ class Assembler {
// Calculate the address of a PC offset.
void adr(const Register& rd, int imm21);
// Calculate the page address of a label.
void adrp(const Register& rd, Label* label);
// Calculate the page address of a PC offset.
void adrp(const Register& rd, int imm21);
// Data Processing instructions.
// Add.
void add(const Register& rd,
@@ -1234,76 +1112,31 @@ class Assembler {
// Memory instructions.
// Load integer or FP register.
void ldr(const CPURegister& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
void ldr(const CPURegister& rt, const MemOperand& src);
// Store integer or FP register.
void str(const CPURegister& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferScaledOffset);
void str(const CPURegister& rt, const MemOperand& dst);
// Load word with sign extension.
void ldrsw(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
void ldrsw(const Register& rt, const MemOperand& src);
// Load byte.
void ldrb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
void ldrb(const Register& rt, const MemOperand& src);
// Store byte.
void strb(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferScaledOffset);
void strb(const Register& rt, const MemOperand& dst);
// Load byte with sign extension.
void ldrsb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
void ldrsb(const Register& rt, const MemOperand& src);
// Load half-word.
void ldrh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
void ldrh(const Register& rt, const MemOperand& src);
// Store half-word.
void strh(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferScaledOffset);
void strh(const Register& rt, const MemOperand& dst);
// Load half-word with sign extension.
void ldrsh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferScaledOffset);
// Load integer or FP register (with unscaled offset).
void ldur(const CPURegister& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Store integer or FP register (with unscaled offset).
void stur(const CPURegister& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load word with sign extension.
void ldursw(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load byte (with unscaled offset).
void ldurb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Store byte (with unscaled offset).
void sturb(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load byte with sign extension (and unscaled offset).
void ldursb(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load half-word (with unscaled offset).
void ldurh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Store half-word (with unscaled offset).
void sturh(const Register& rt, const MemOperand& dst,
LoadStoreScalingOption option = PreferUnscaledOffset);
// Load half-word with sign extension (and unscaled offset).
void ldursh(const Register& rt, const MemOperand& src,
LoadStoreScalingOption option = PreferUnscaledOffset);
void ldrsh(const Register& rt, const MemOperand& src);
// Load integer or FP register pair.
void ldp(const CPURegister& rt, const CPURegister& rt2,
@@ -1333,79 +1166,6 @@ class Assembler {
// Load single precision floating point literal to FP register.
void ldr(const FPRegister& ft, float imm);
// Store exclusive byte.
void stxrb(const Register& rs, const Register& rt, const MemOperand& dst);
// Store exclusive half-word.
void stxrh(const Register& rs, const Register& rt, const MemOperand& dst);
// Store exclusive register.
void stxr(const Register& rs, const Register& rt, const MemOperand& dst);
// Load exclusive byte.
void ldxrb(const Register& rt, const MemOperand& src);
// Load exclusive half-word.
void ldxrh(const Register& rt, const MemOperand& src);
// Load exclusive register.
void ldxr(const Register& rt, const MemOperand& src);
// Store exclusive register pair.
void stxp(const Register& rs,
const Register& rt,
const Register& rt2,
const MemOperand& dst);
// Load exclusive register pair.
void ldxp(const Register& rt, const Register& rt2, const MemOperand& src);
// Store-release exclusive byte.
void stlxrb(const Register& rs, const Register& rt, const MemOperand& dst);
// Store-release exclusive half-word.
void stlxrh(const Register& rs, const Register& rt, const MemOperand& dst);
// Store-release exclusive register.
void stlxr(const Register& rs, const Register& rt, const MemOperand& dst);
// Load-acquire exclusive byte.
void ldaxrb(const Register& rt, const MemOperand& src);
// Load-acquire exclusive half-word.
void ldaxrh(const Register& rt, const MemOperand& src);
// Load-acquire exclusive register.
void ldaxr(const Register& rt, const MemOperand& src);
// Store-release exclusive register pair.
void stlxp(const Register& rs,
const Register& rt,
const Register& rt2,
const MemOperand& dst);
// Load-acquire exclusive register pair.
void ldaxp(const Register& rt, const Register& rt2, const MemOperand& src);
// Store-release byte.
void stlrb(const Register& rt, const MemOperand& dst);
// Store-release half-word.
void stlrh(const Register& rt, const MemOperand& dst);
// Store-release register.
void stlr(const Register& rt, const MemOperand& dst);
// Load-acquire byte.
void ldarb(const Register& rt, const MemOperand& src);
// Load-acquire half-word.
void ldarh(const Register& rt, const MemOperand& src);
// Load-acquire register.
void ldar(const Register& rt, const MemOperand& src);
// Move instructions. The default shift of -1 indicates that the move
// instruction will calculate an appropriate 16-bit immediate and left shift
// that is equal to the 64-bit immediate argument. If an explicit left shift
@@ -1454,9 +1214,6 @@ class Assembler {
// System hint.
void hint(SystemHint code);
// Clear exclusive monitor.
void clrex(int imm4 = 0xf);
// Data memory barrier.
void dmb(BarrierDomain domain, BarrierType type);
@@ -1672,11 +1429,6 @@ class Assembler {
return rt2.code() << Rt2_offset;
}
static Instr Rs(CPURegister rs) {
VIXL_ASSERT(rs.code() != kSPRegInternalCode);
return rs.code() << Rs_offset;
}
// These encoding functions allow the stack pointer to be encoded, and
// disallow the zero register.
static Instr RdSP(Register rd) {
@@ -1867,11 +1619,6 @@ class Assembler {
return imm7 << ImmHint_offset;
}
static Instr CRm(int imm4) {
VIXL_ASSERT(is_uint4(imm4));
return imm4 << CRm_offset;
}
static Instr ImmBarrierDomain(int imm2) {
VIXL_ASSERT(is_uint2(imm2));
return imm2 << ImmBarrierDomain_offset;
@@ -1913,20 +1660,16 @@ class Assembler {
}
// Size of the code generated in bytes
size_t SizeOfCodeGenerated() const {
uint64_t SizeOfCodeGenerated() const {
VIXL_ASSERT((pc_ >= buffer_) && (pc_ < (buffer_ + buffer_size_)));
return pc_ - buffer_;
}
// Size of the code generated since label to the current position.
size_t SizeOfCodeGeneratedSince(Label* label) const {
size_t pc_offset = SizeOfCodeGenerated();
uint64_t SizeOfCodeGeneratedSince(Label* label) const {
VIXL_ASSERT(label->IsBound());
VIXL_ASSERT(pc_offset >= static_cast<size_t>(label->location()));
VIXL_ASSERT(pc_offset < buffer_size_);
return pc_offset - label->location();
VIXL_ASSERT((pc_ >= label->target()) && (pc_ < (buffer_ + buffer_size_)));
return pc_ - label->target();
}
@@ -1950,15 +1693,6 @@ class Assembler {
void EmitLiteralPool(LiteralPoolEmitOption option = NoJumpRequired);
size_t LiteralPoolSize();
inline PositionIndependentCodeOption pic() {
return pic_;
}
inline bool AllowPageOffsetDependentCode() {
return (pic() == PageOffsetDependentCode) ||
(pic() == PositionDependentCode);
}
protected:
inline const Register& AppropriateZeroRegFor(const CPURegister& reg) const {
return reg.Is64Bits() ? xzr : wzr;
@@ -1967,8 +1701,7 @@ class Assembler {
void LoadStore(const CPURegister& rt,
const MemOperand& addr,
LoadStoreOp op,
LoadStoreScalingOption option = PreferScaledOffset);
LoadStoreOp op);
static bool IsImmLSUnscaled(ptrdiff_t offset);
static bool IsImmLSScaled(ptrdiff_t offset, LSDataSize size);
@@ -1984,9 +1717,9 @@ class Assembler {
LogicalOp op);
static bool IsImmLogical(uint64_t value,
unsigned width,
unsigned* n = NULL,
unsigned* imm_s = NULL,
unsigned* imm_r = NULL);
unsigned* n,
unsigned* imm_s,
unsigned* imm_r);
void ConditionalCompare(const Register& rn,
const Operand& operand,
@@ -2090,17 +1823,6 @@ class Assembler {
void RecordLiteral(int64_t imm, unsigned size);
// Link the current (not-yet-emitted) instruction to the specified label, then
// return an offset to be encoded in the instruction. If the label is not yet
// bound, an offset of 0 is returned.
ptrdiff_t LinkAndGetByteOffsetTo(Label * label);
ptrdiff_t LinkAndGetInstructionOffsetTo(Label * label);
ptrdiff_t LinkAndGetPageOffsetTo(Label * label);
// A common implementation for the LinkAndGet<Type>OffsetTo helpers.
template <int element_size>
ptrdiff_t LinkAndGetOffsetTo(Label* label);
// Emit the instruction at pc_.
void Emit(Instr instruction) {
VIXL_STATIC_ASSERT(sizeof(*pc_) == 1);
@@ -2142,15 +1864,12 @@ class Assembler {
// The buffer into which code and relocation info are generated.
Instruction* buffer_;
// Buffer size, in bytes.
size_t buffer_size_;
unsigned buffer_size_;
Instruction* pc_;
std::list<Literal*> literals_;
Instruction* next_literal_pool_check_;
unsigned literal_pool_monitor_;
PositionIndependentCodeOption pic_;
friend class Label;
friend class BlockLiteralPoolScope;
#ifdef DEBUG

View File

@@ -46,13 +46,13 @@ R(24) R(25) R(26) R(27) R(28) R(29) R(30) R(31)
#define INSTRUCTION_FIELDS_LIST(V_) \
/* Register fields */ \
V_(Rd, 4, 0, Bits) /* Destination register. */ \
V_(Rn, 9, 5, Bits) /* First source register. */ \
V_(Rm, 20, 16, Bits) /* Second source register. */ \
V_(Ra, 14, 10, Bits) /* Third source register. */ \
V_(Rt, 4, 0, Bits) /* Load/store register. */ \
V_(Rt2, 14, 10, Bits) /* Load/store second register. */ \
V_(Rs, 20, 16, Bits) /* Exclusive access status. */ \
V_(Rd, 4, 0, Bits) /* Destination register. */ \
V_(Rn, 9, 5, Bits) /* First source register. */ \
V_(Rm, 20, 16, Bits) /* Second source register. */ \
V_(Ra, 14, 10, Bits) /* Third source register. */ \
V_(Rt, 4, 0, Bits) /* Load dest / store source. */ \
V_(Rt2, 14, 10, Bits) /* Load second dest / */ \
/* store second source. */ \
V_(PrefetchMode, 4, 0, Bits) \
\
/* Common bits */ \
@@ -126,13 +126,6 @@ V_(SysOp1, 18, 16, Bits) \
V_(SysOp2, 7, 5, Bits) \
V_(CRn, 15, 12, Bits) \
V_(CRm, 11, 8, Bits) \
\
/* Load-/store-exclusive */ \
V_(LdStXLoad, 22, 22, Bits) \
V_(LdStXNotExclusive, 23, 23, Bits) \
V_(LdStXAcquireRelease, 15, 15, Bits) \
V_(LdStXSizeLog2, 31, 30, Bits) \
V_(LdStXPair, 21, 21, Bits) \
#define SYSTEM_REGISTER_FIELDS_LIST(V_, M_) \
@@ -592,13 +585,6 @@ enum MemBarrierOp {
ISB = MemBarrierFixed | 0x00000040
};
enum SystemExclusiveMonitorOp {
SystemExclusiveMonitorFixed = 0xD503305F,
SystemExclusiveMonitorFMask = 0xFFFFF0FF,
SystemExclusiveMonitorMask = 0xFFFFF0FF,
CLREX = SystemExclusiveMonitorFixed
};
// Any load or store.
enum LoadStoreAnyOp {
LoadStoreAnyFMask = 0x0a000000,
@@ -716,7 +702,7 @@ enum LoadStoreUnscaledOffsetOp {
// Load/store (post, pre, offset and unsigned.)
enum LoadStoreOp {
LoadStoreOpMask = 0xC4C00000,
LoadStoreOpMask = 0xC4C00000,
#define LOAD_STORE(A, B, C, D) \
A##B##_##C = D
LOAD_STORE_OP_LIST(LOAD_STORE),
@@ -770,44 +756,6 @@ enum LoadStoreRegisterOffset {
#undef LOAD_STORE_REGISTER_OFFSET
};
enum LoadStoreExclusive {
LoadStoreExclusiveFixed = 0x08000000,
LoadStoreExclusiveFMask = 0x3F000000,
LoadStoreExclusiveMask = 0xFFE08000,
STXRB_w = LoadStoreExclusiveFixed | 0x00000000,
STXRH_w = LoadStoreExclusiveFixed | 0x40000000,
STXR_w = LoadStoreExclusiveFixed | 0x80000000,
STXR_x = LoadStoreExclusiveFixed | 0xC0000000,
LDXRB_w = LoadStoreExclusiveFixed | 0x00400000,
LDXRH_w = LoadStoreExclusiveFixed | 0x40400000,
LDXR_w = LoadStoreExclusiveFixed | 0x80400000,
LDXR_x = LoadStoreExclusiveFixed | 0xC0400000,
STXP_w = LoadStoreExclusiveFixed | 0x80200000,
STXP_x = LoadStoreExclusiveFixed | 0xC0200000,
LDXP_w = LoadStoreExclusiveFixed | 0x80600000,
LDXP_x = LoadStoreExclusiveFixed | 0xC0600000,
STLXRB_w = LoadStoreExclusiveFixed | 0x00008000,
STLXRH_w = LoadStoreExclusiveFixed | 0x40008000,
STLXR_w = LoadStoreExclusiveFixed | 0x80008000,
STLXR_x = LoadStoreExclusiveFixed | 0xC0008000,
LDAXRB_w = LoadStoreExclusiveFixed | 0x00408000,
LDAXRH_w = LoadStoreExclusiveFixed | 0x40408000,
LDAXR_w = LoadStoreExclusiveFixed | 0x80408000,
LDAXR_x = LoadStoreExclusiveFixed | 0xC0408000,
STLXP_w = LoadStoreExclusiveFixed | 0x80208000,
STLXP_x = LoadStoreExclusiveFixed | 0xC0208000,
LDAXP_w = LoadStoreExclusiveFixed | 0x80608000,
LDAXP_x = LoadStoreExclusiveFixed | 0xC0608000,
STLRB_w = LoadStoreExclusiveFixed | 0x00808000,
STLRH_w = LoadStoreExclusiveFixed | 0x40808000,
STLR_w = LoadStoreExclusiveFixed | 0x80808000,
STLR_x = LoadStoreExclusiveFixed | 0xC0808000,
LDARB_w = LoadStoreExclusiveFixed | 0x00C08000,
LDARH_w = LoadStoreExclusiveFixed | 0x40C08000,
LDAR_w = LoadStoreExclusiveFixed | 0x80C08000,
LDAR_x = LoadStoreExclusiveFixed | 0xC0C08000
};
// Conditional compare.
enum ConditionalCompareOp {
ConditionalCompareMask = 0x60000000,

View File

@@ -28,7 +28,6 @@
#define VIXL_CPU_A64_H
#include "globals.h"
#include "instructions-a64.h"
namespace vixl {
@@ -43,32 +42,6 @@ class CPU {
// safely run.
static void EnsureIAndDCacheCoherency(void *address, size_t length);
// Handle tagged pointers.
template <typename T>
static T SetPointerTag(T pointer, uint64_t tag) {
VIXL_ASSERT(is_uintn(kAddressTagWidth, tag));
// Use C-style casts to get static_cast behaviour for integral types (T),
// and reinterpret_cast behaviour for other types.
uint64_t raw = (uint64_t)pointer;
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(raw));
raw = (raw & ~kAddressTagMask) | (tag << kAddressTagOffset);
return (T)raw;
}
template <typename T>
static uint64_t GetPointerTag(T pointer) {
// Use C-style casts to get static_cast behaviour for integral types (T),
// and reinterpret_cast behaviour for other types.
uint64_t raw = (uint64_t)pointer;
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(raw));
return (raw & kAddressTagMask) >> kAddressTagOffset;
}
private:
// Return the content of the cache type register.
static uint32_t GetCacheType();

View File

@@ -171,9 +171,9 @@ void Decoder::DecodePCRelAddressing(Instruction* instr) {
void Decoder::DecodeBranchSystemException(Instruction* instr) {
VIXL_ASSERT((instr->Bits(27, 24) == 0x4) ||
(instr->Bits(27, 24) == 0x5) ||
(instr->Bits(27, 24) == 0x6) ||
(instr->Bits(27, 24) == 0x7) );
(instr->Bits(27, 24) == 0x5) ||
(instr->Bits(27, 24) == 0x6) ||
(instr->Bits(27, 24) == 0x7) );
switch (instr->Bits(31, 29)) {
case 0:
@@ -272,15 +272,16 @@ void Decoder::DecodeBranchSystemException(Instruction* instr) {
void Decoder::DecodeLoadStore(Instruction* instr) {
VIXL_ASSERT((instr->Bits(27, 24) == 0x8) ||
(instr->Bits(27, 24) == 0x9) ||
(instr->Bits(27, 24) == 0xC) ||
(instr->Bits(27, 24) == 0xD) );
(instr->Bits(27, 24) == 0x9) ||
(instr->Bits(27, 24) == 0xC) ||
(instr->Bits(27, 24) == 0xD) );
if (instr->Bit(24) == 0) {
if (instr->Bit(28) == 0) {
if (instr->Bit(29) == 0) {
if (instr->Bit(26) == 0) {
VisitLoadStoreExclusive(instr);
// TODO: VisitLoadStoreExclusive.
VisitUnimplemented(instr);
} else {
DecodeAdvSIMDLoadStore(instr);
}

View File

@@ -59,7 +59,6 @@
V(LoadStorePreIndex) \
V(LoadStoreRegisterOffset) \
V(LoadStoreUnsignedOffset) \
V(LoadStoreExclusive) \
V(LogicalShifted) \
V(AddSubShifted) \
V(AddSubExtended) \

View File

@@ -24,7 +24,6 @@
// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include <cstdlib>
#include "a64/disasm-a64.h"
namespace vixl {
@@ -530,7 +529,7 @@ void Disassembler::VisitExtract(Instruction* instr) {
void Disassembler::VisitPCRelAddressing(Instruction* instr) {
switch (instr->Mask(PCRelAddressingMask)) {
case ADR: Format(instr, "adr", "'Xd, 'AddrPCRelByte"); break;
case ADRP: Format(instr, "adrp", "'Xd, 'AddrPCRelPage"); break;
// ADRP is not implemented.
default: Format(instr, "unimplemented", "(PCRelAddressing)");
}
}
@@ -944,49 +943,6 @@ void Disassembler::VisitLoadStorePairNonTemporal(Instruction* instr) {
}
void Disassembler::VisitLoadStoreExclusive(Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form;
switch (instr->Mask(LoadStoreExclusiveMask)) {
case STXRB_w: mnemonic = "stxrb"; form = "'Ws, 'Wt, ['Xns]"; break;
case STXRH_w: mnemonic = "stxrh"; form = "'Ws, 'Wt, ['Xns]"; break;
case STXR_w: mnemonic = "stxr"; form = "'Ws, 'Wt, ['Xns]"; break;
case STXR_x: mnemonic = "stxr"; form = "'Ws, 'Xt, ['Xns]"; break;
case LDXRB_w: mnemonic = "ldxrb"; form = "'Wt, ['Xns]"; break;
case LDXRH_w: mnemonic = "ldxrh"; form = "'Wt, ['Xns]"; break;
case LDXR_w: mnemonic = "ldxr"; form = "'Wt, ['Xns]"; break;
case LDXR_x: mnemonic = "ldxr"; form = "'Xt, ['Xns]"; break;
case STXP_w: mnemonic = "stxp"; form = "'Ws, 'Wt, 'Wt2, ['Xns]"; break;
case STXP_x: mnemonic = "stxp"; form = "'Ws, 'Xt, 'Xt2, ['Xns]"; break;
case LDXP_w: mnemonic = "ldxp"; form = "'Wt, 'Wt2, ['Xns]"; break;
case LDXP_x: mnemonic = "ldxp"; form = "'Xt, 'Xt2, ['Xns]"; break;
case STLXRB_w: mnemonic = "stlxrb"; form = "'Ws, 'Wt, ['Xns]"; break;
case STLXRH_w: mnemonic = "stlxrh"; form = "'Ws, 'Wt, ['Xns]"; break;
case STLXR_w: mnemonic = "stlxr"; form = "'Ws, 'Wt, ['Xns]"; break;
case STLXR_x: mnemonic = "stlxr"; form = "'Ws, 'Xt, ['Xns]"; break;
case LDAXRB_w: mnemonic = "ldaxrb"; form = "'Wt, ['Xns]"; break;
case LDAXRH_w: mnemonic = "ldaxrh"; form = "'Wt, ['Xns]"; break;
case LDAXR_w: mnemonic = "ldaxr"; form = "'Wt, ['Xns]"; break;
case LDAXR_x: mnemonic = "ldaxr"; form = "'Xt, ['Xns]"; break;
case STLXP_w: mnemonic = "stlxp"; form = "'Ws, 'Wt, 'Wt2, ['Xns]"; break;
case STLXP_x: mnemonic = "stlxp"; form = "'Ws, 'Xt, 'Xt2, ['Xns]"; break;
case LDAXP_w: mnemonic = "ldaxp"; form = "'Wt, 'Wt2, ['Xns]"; break;
case LDAXP_x: mnemonic = "ldaxp"; form = "'Xt, 'Xt2, ['Xns]"; break;
case STLRB_w: mnemonic = "stlrb"; form = "'Wt, ['Xns]"; break;
case STLRH_w: mnemonic = "stlrh"; form = "'Wt, ['Xns]"; break;
case STLR_w: mnemonic = "stlr"; form = "'Wt, ['Xns]"; break;
case STLR_x: mnemonic = "stlr"; form = "'Xt, ['Xns]"; break;
case LDARB_w: mnemonic = "ldarb"; form = "'Wt, ['Xns]"; break;
case LDARH_w: mnemonic = "ldarh"; form = "'Wt, ['Xns]"; break;
case LDAR_w: mnemonic = "ldar"; form = "'Wt, ['Xns]"; break;
case LDAR_x: mnemonic = "ldar"; form = "'Xt, ['Xns]"; break;
default: form = "(LoadStoreExclusive)";
}
Format(instr, mnemonic, form);
}
void Disassembler::VisitFPCompare(Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "'Fn, 'Fm";
@@ -1206,15 +1162,7 @@ void Disassembler::VisitSystem(Instruction* instr) {
const char *mnemonic = "unimplemented";
const char *form = "(System)";
if (instr->Mask(SystemExclusiveMonitorFMask) == SystemExclusiveMonitorFixed) {
switch (instr->Mask(SystemExclusiveMonitorMask)) {
case CLREX: {
mnemonic = "clrex";
form = (instr->CRm() == 0xf) ? NULL : "'IX";
break;
}
}
} else if (instr->Mask(SystemSysRegFMask) == SystemSysRegFixed) {
if (instr->Mask(SystemSysRegFMask) == SystemSysRegFixed) {
switch (instr->Mask(SystemSysRegMask)) {
case MRS: {
mnemonic = "mrs";
@@ -1236,6 +1184,7 @@ void Disassembler::VisitSystem(Instruction* instr) {
}
}
} else if (instr->Mask(SystemHintFMask) == SystemHintFixed) {
VIXL_ASSERT(instr->Mask(SystemHintMask) == HINT);
switch (instr->ImmHint()) {
case NOP: {
mnemonic = "nop";
@@ -1363,7 +1312,6 @@ int Disassembler::SubstituteRegisterField(Instruction* instr,
case 'n': reg_num = instr->Rn(); break;
case 'm': reg_num = instr->Rm(); break;
case 'a': reg_num = instr->Ra(); break;
case 's': reg_num = instr->Rs(); break;
case 't': {
if (format[2] == '2') {
reg_num = instr->Rt2();
@@ -1510,10 +1458,6 @@ int Disassembler::SubstituteImmediateField(Instruction* instr,
AppendToOutput("#0x%" PRIx64, instr->ImmException());
return 6;
}
case 'X': { // IX - CLREX instruction.
AppendToOutput("#0x%" PRIx64, instr->CRm());
return 2;
}
default: {
VIXL_UNIMPLEMENTED();
return 0;
@@ -1620,20 +1564,21 @@ int Disassembler::SubstituteConditionField(Instruction* instr,
int Disassembler::SubstitutePCRelAddressField(Instruction* instr,
const char* format) {
VIXL_ASSERT((strcmp(format, "AddrPCRelByte") == 0) || // Used by `adr`.
(strcmp(format, "AddrPCRelPage") == 0)); // Used by `adrp`.
USE(format);
VIXL_ASSERT(strncmp(format, "AddrPCRel", 9) == 0);
int64_t offset = instr->ImmPCRel();
Instruction * base = instr;
int offset = instr->ImmPCRel();
if (format[9] == 'P') {
offset *= kPageSize;
base = AlignDown(base, kPageSize);
// Only ADR (AddrPCRelByte) is supported.
VIXL_ASSERT(strcmp(format, "AddrPCRelByte") == 0);
char sign = '+';
if (offset < 0) {
offset = -offset;
sign = '-';
}
char sign = (offset < 0) ? '-' : '+';
void * target = reinterpret_cast<void *>(base + offset);
AppendToOutput("#%c0x%" PRIx64 " (addr %p)", sign, std::abs(offset), target);
VIXL_STATIC_ASSERT(sizeof(*instr) == 1);
AppendToOutput("#%c0x%x (addr %p)", sign, offset, instr + offset);
return 13;
}
@@ -1661,8 +1606,7 @@ int Disassembler::SubstituteBranchTargetField(Instruction* instr,
sign = '-';
}
VIXL_STATIC_ASSERT(sizeof(*instr) == 1);
void * address = reinterpret_cast<void *>(instr + offset);
AppendToOutput("#%c0x%" PRIx64 " (addr %p)", sign, offset, address);
AppendToOutput("#%c0x%" PRIx64 " (addr %p)", sign, offset, instr + offset);
return 8;
}

View File

@@ -85,7 +85,7 @@ class Disassembler: public DecoderVisitor {
bool IsMovzMovnImm(unsigned reg_size, uint64_t value);
void ResetOutput();
void AppendToOutput(const char* string, ...) PRINTF_CHECK(2, 3);
void AppendToOutput(const char* string, ...);
char* buffer_;
uint32_t buffer_pos_;

View File

@@ -149,24 +149,17 @@ LSDataSize CalcLSPairDataSize(LoadStorePairOp op) {
Instruction* Instruction::ImmPCOffsetTarget() {
Instruction * base = this;
ptrdiff_t offset;
if (IsPCRelAddressing()) {
// ADR and ADRP.
// PC-relative addressing. Only ADR is supported.
offset = ImmPCRel();
if (Mask(PCRelAddressingMask) == ADRP) {
base = AlignDown(base, kPageSize);
offset *= kPageSize;
} else {
VIXL_ASSERT(Mask(PCRelAddressingMask) == ADR);
}
} else {
// All PC-relative branches.
VIXL_ASSERT(BranchType() != UnknownBranchType);
// Relative branch offsets are instruction-size-aligned.
offset = ImmBranch() << kInstructionSizeLog2;
}
return base + offset;
return this + offset;
}
@@ -192,16 +185,10 @@ void Instruction::SetImmPCOffsetTarget(Instruction* target) {
void Instruction::SetPCRelImmTarget(Instruction* target) {
int32_t imm21;
if ((Mask(PCRelAddressingMask) == ADR)) {
imm21 = target - this;
} else {
VIXL_ASSERT(Mask(PCRelAddressingMask) == ADRP);
uintptr_t this_page = reinterpret_cast<uintptr_t>(this) / kPageSize;
uintptr_t target_page = reinterpret_cast<uintptr_t>(target) / kPageSize;
imm21 = target_page - this_page;
}
Instr imm = Assembler::ImmPCRelAddress(imm21);
// ADRP is not supported, so 'this' must point to an ADR instruction.
VIXL_ASSERT(Mask(PCRelAddressingMask) == ADR);
Instr imm = Assembler::ImmPCRelAddress(target - this);
SetInstructionBits(Mask(~ImmPCRel_mask) | imm);
}

View File

@@ -41,10 +41,6 @@ const unsigned kLiteralEntrySize = 4;
const unsigned kLiteralEntrySizeLog2 = 2;
const unsigned kMaxLoadLiteralRange = 1 * MBytes;
// This is the nominal page size (as used by the adrp instruction); the actual
// size of the memory pages allocated by the kernel is likely to differ.
const unsigned kPageSize = 4 * KBytes;
const unsigned kWRegSize = 32;
const unsigned kWRegSizeLog2 = 5;
const unsigned kWRegSizeInBytes = kWRegSize / 8;
@@ -83,12 +79,6 @@ const unsigned kZeroRegCode = 31;
const unsigned kSPRegInternalCode = 63;
const unsigned kRegCodeMask = 0x1f;
const unsigned kAddressTagOffset = 56;
const unsigned kAddressTagWidth = 8;
const uint64_t kAddressTagMask =
((UINT64_C(1) << kAddressTagWidth) - 1) << kAddressTagOffset;
VIXL_STATIC_ASSERT(kAddressTagMask == UINT64_C(0xff00000000000000));
// AArch64 floating-point specifics. These match IEEE-754.
const unsigned kDoubleMantissaBits = 52;
const unsigned kDoubleExponentBits = 11;

View File

@@ -28,10 +28,14 @@
#define PLATFORM_H
// Define platform specific functionalities.
#include <signal.h>
namespace vixl {
inline void HostBreakpoint() { raise(SIGINT); }
#ifdef USE_SIMULATOR
// Currently we assume running the simulator implies running on x86 hardware.
inline void HostBreakpoint() { asm("int3"); }
#else
inline void HostBreakpoint() { asm("brk"); }
#endif
} // namespace vixl
#endif

View File

@@ -124,14 +124,4 @@ int CountSetBits(uint64_t value, int width) {
return value;
}
uint64_t LowestSetBit(uint64_t value) {
return value & -value;
}
bool IsPowerOf2(int64_t value) {
return (value != 0) && ((value & (value - 1)) == 0);
}
} // namespace vixl

View File

@@ -33,14 +33,6 @@
namespace vixl {
// Macros for compile-time format checking.
#if defined(__GNUC__)
#define PRINTF_CHECK(format_index, varargs_index) \
__attribute__((format(printf, format_index, varargs_index)))
#else
#define PRINTF_CHECK(format_index, varargs_index)
#endif
// Check number width.
inline bool is_intn(unsigned n, int64_t x) {
VIXL_ASSERT((0 < n) && (n < 64));
@@ -163,8 +155,6 @@ int CountLeadingZeros(uint64_t value, int width);
int CountLeadingSignBits(int64_t value, int width);
int CountTrailingZeros(uint64_t value, int width);
int CountSetBits(uint64_t value, int width);
uint64_t LowestSetBit(uint64_t value);
bool IsPowerOf2(int64_t value);
// Pointer alignment
// TODO: rename/refactor to make it specific to instructions.
@@ -177,31 +167,21 @@ bool IsWordAligned(T pointer) {
// Increment a pointer until it has the specified alignment.
template<class T>
T AlignUp(T pointer, size_t alignment) {
// Use C-style casts to get static_cast behaviour for integral types (T), and
// reinterpret_cast behaviour for other types.
uintptr_t pointer_raw = (uintptr_t)pointer;
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(pointer_raw));
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(uintptr_t));
uintptr_t pointer_raw = reinterpret_cast<uintptr_t>(pointer);
size_t align_step = (alignment - pointer_raw) % alignment;
VIXL_ASSERT((pointer_raw + align_step) % alignment == 0);
return (T)(pointer_raw + align_step);
return reinterpret_cast<T>(pointer_raw + align_step);
}
// Decrement a pointer until it has the specified alignment.
template<class T>
T AlignDown(T pointer, size_t alignment) {
// Use C-style casts to get static_cast behaviour for integral types (T), and
// reinterpret_cast behaviour for other types.
uintptr_t pointer_raw = (uintptr_t)pointer;
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(pointer_raw));
VIXL_STATIC_ASSERT(sizeof(pointer) == sizeof(uintptr_t));
uintptr_t pointer_raw = reinterpret_cast<uintptr_t>(pointer);
size_t align_step = pointer_raw % alignment;
VIXL_ASSERT((pointer_raw - align_step) % alignment == 0);
return (T)(pointer_raw - align_step);
return reinterpret_cast<T>(pointer_raw - align_step);
}

View File

@@ -1175,11 +1175,15 @@ static const struct sparc_opcode sparc_opcodes[] = {
{ "subcc", F3(2, 0x14, 0), F3(~2, ~0x14, ~0)|ASI(~0), "1,2,d", 0, v6 },
{ "subcc", F3(2, 0x14, 1), F3(~2, ~0x14, ~1), "1,i,d", 0, v6 },
{ "subc", F3(2, 0x0c, 0), F3(~2, ~0x0c, ~0)|ASI(~0), "1,2,d", 0, v6 },
{ "subc", F3(2, 0x0c, 1), F3(~2, ~0x0c, ~1), "1,i,d", 0, v6 },
{ "subx", F3(2, 0x0c, 0), F3(~2, ~0x0c, ~0)|ASI(~0), "1,2,d", 0, v6notv9 },
{ "subx", F3(2, 0x0c, 1), F3(~2, ~0x0c, ~1), "1,i,d", 0, v6notv9 },
{ "subc", F3(2, 0x0c, 0), F3(~2, ~0x0c, ~0)|ASI(~0), "1,2,d", 0, v9 },
{ "subc", F3(2, 0x0c, 1), F3(~2, ~0x0c, ~1), "1,i,d", 0, v9 },
{ "subccc", F3(2, 0x1c, 0), F3(~2, ~0x1c, ~0)|ASI(~0), "1,2,d", 0, v6 },
{ "subccc", F3(2, 0x1c, 1), F3(~2, ~0x1c, ~1), "1,i,d", 0, v6 },
{ "subxcc", F3(2, 0x1c, 0), F3(~2, ~0x1c, ~0)|ASI(~0), "1,2,d", 0, v6notv9 },
{ "subxcc", F3(2, 0x1c, 1), F3(~2, ~0x1c, ~1), "1,i,d", 0, v6notv9 },
{ "subccc", F3(2, 0x1c, 0), F3(~2, ~0x1c, ~0)|ASI(~0), "1,2,d", 0, v9 },
{ "subccc", F3(2, 0x1c, 1), F3(~2, ~0x1c, ~1), "1,i,d", 0, v9 },
{ "and", F3(2, 0x01, 0), F3(~2, ~0x01, ~0)|ASI(~0), "1,2,d", 0, v6 },
{ "and", F3(2, 0x01, 1), F3(~2, ~0x01, ~1), "1,i,d", 0, v6 },
@@ -1211,13 +1215,19 @@ static const struct sparc_opcode sparc_opcodes[] = {
{ "addcc", F3(2, 0x10, 1), F3(~2, ~0x10, ~1), "1,i,d", 0, v6 },
{ "addcc", F3(2, 0x10, 1), F3(~2, ~0x10, ~1), "i,1,d", 0, v6 },
{ "addc", F3(2, 0x08, 0), F3(~2, ~0x08, ~0)|ASI(~0), "1,2,d", 0, v6 },
{ "addc", F3(2, 0x08, 1), F3(~2, ~0x08, ~1), "1,i,d", 0, v6 },
{ "addc", F3(2, 0x08, 1), F3(~2, ~0x08, ~1), "i,1,d", 0, v6 },
{ "addx", F3(2, 0x08, 0), F3(~2, ~0x08, ~0)|ASI(~0), "1,2,d", 0, v6notv9 },
{ "addx", F3(2, 0x08, 1), F3(~2, ~0x08, ~1), "1,i,d", 0, v6notv9 },
{ "addx", F3(2, 0x08, 1), F3(~2, ~0x08, ~1), "i,1,d", 0, v6notv9 },
{ "addc", F3(2, 0x08, 0), F3(~2, ~0x08, ~0)|ASI(~0), "1,2,d", 0, v9 },
{ "addc", F3(2, 0x08, 1), F3(~2, ~0x08, ~1), "1,i,d", 0, v9 },
{ "addc", F3(2, 0x08, 1), F3(~2, ~0x08, ~1), "i,1,d", 0, v9 },
{ "addccc", F3(2, 0x18, 0), F3(~2, ~0x18, ~0)|ASI(~0), "1,2,d", 0, v6 },
{ "addccc", F3(2, 0x18, 1), F3(~2, ~0x18, ~1), "1,i,d", 0, v6 },
{ "addccc", F3(2, 0x18, 1), F3(~2, ~0x18, ~1), "i,1,d", 0, v6 },
{ "addxcc", F3(2, 0x18, 0), F3(~2, ~0x18, ~0)|ASI(~0), "1,2,d", 0, v6notv9 },
{ "addxcc", F3(2, 0x18, 1), F3(~2, ~0x18, ~1), "1,i,d", 0, v6notv9 },
{ "addxcc", F3(2, 0x18, 1), F3(~2, ~0x18, ~1), "i,1,d", 0, v6notv9 },
{ "addccc", F3(2, 0x18, 0), F3(~2, ~0x18, ~0)|ASI(~0), "1,2,d", 0, v9 },
{ "addccc", F3(2, 0x18, 1), F3(~2, ~0x18, ~1), "1,i,d", 0, v9 },
{ "addccc", F3(2, 0x18, 1), F3(~2, ~0x18, ~1), "i,1,d", 0, v9 },
{ "smul", F3(2, 0x0b, 0), F3(~2, ~0x0b, ~0)|ASI(~0), "1,2,d", 0, v8 },
{ "smul", F3(2, 0x0b, 1), F3(~2, ~0x0b, ~1), "1,i,d", 0, v8 },
@@ -2032,10 +2042,6 @@ IMPDEP ("impdep2", 0x37),
#undef IMPDEP
{ "addxc", F3F(2, 0x36, 0x011), F3F(~2, ~0x36, ~0x011), "1,2,d", 0, v9b },
{ "addxccc", F3F(2, 0x36, 0x013), F3F(~2, ~0x36, ~0x013), "1,2,d", 0, v9b },
{ "umulxhi", F3F(2, 0x36, 0x016), F3F(~2, ~0x36, ~0x016), "1,2,d", 0, v9b },
};
static const int sparc_num_opcodes = ((sizeof sparc_opcodes)/(sizeof sparc_opcodes[0]));

View File

@@ -73,6 +73,7 @@ typedef struct {
QEMUSGList *sg;
uint64_t sector_num;
DMADirection dir;
bool in_cancel;
int sg_cur_index;
dma_addr_t sg_cur_byte;
QEMUIOVector iov;
@@ -124,7 +125,12 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
qemu_bh_delete(dbs->bh);
dbs->bh = NULL;
}
qemu_aio_unref(dbs);
if (!dbs->in_cancel) {
/* Requests may complete while dma_aio_cancel is in progress. In
* this case, the AIOCB should not be released because it is still
* referenced by dma_aio_cancel. */
qemu_aio_release(dbs);
}
}
static void dma_bdrv_cb(void *opaque, int ret)
@@ -180,14 +186,19 @@ static void dma_aio_cancel(BlockDriverAIOCB *acb)
trace_dma_aio_cancel(dbs);
if (dbs->acb) {
bdrv_aio_cancel_async(dbs->acb);
BlockDriverAIOCB *acb = dbs->acb;
dbs->acb = NULL;
dbs->in_cancel = true;
bdrv_aio_cancel(acb);
dbs->in_cancel = false;
}
dbs->common.cb = NULL;
dma_complete(dbs, 0);
}
static const AIOCBInfo dma_aiocb_info = {
.aiocb_size = sizeof(DMAAIOCB),
.cancel_async = dma_aio_cancel,
.cancel = dma_aio_cancel,
};
BlockDriverAIOCB *dma_bdrv_io(
@@ -206,6 +217,7 @@ BlockDriverAIOCB *dma_bdrv_io(
dbs->sg_cur_index = 0;
dbs->sg_cur_byte = 0;
dbs->dir = dir;
dbs->in_cancel = false;
dbs->io_func = io_func;
dbs->bh = NULL;
qemu_iovec_init(&dbs->iov, sg->nsg);
@@ -265,5 +277,5 @@ uint64_t dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg)
void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
QEMUSGList *sg, enum BlockAcctType type)
{
block_acct_start(bdrv_get_stats(bs), cookie, sg->size, type);
bdrv_acct_start(bs, cookie, sg->size, type);
}

View File

@@ -1,161 +0,0 @@
Block I/O error injection using blkdebug
----------------------------------------
Copyright (C) 2014 Red Hat Inc
This work is licensed under the terms of the GNU GPL, version 2 or later. See
the COPYING file in the top-level directory.
The blkdebug block driver is a rule-based error injection engine. It can be
used to exercise error code paths in block drivers including ENOSPC (out of
space) and EIO.
This document gives an overview of the features available in blkdebug.
Background
----------
Block drivers have many error code paths that handle I/O errors. Image formats
are especially complex since metadata I/O errors during cluster allocation or
while updating tables happen halfway through request processing and require
discipline to keep image files consistent.
Error injection allows test cases to trigger I/O errors at specific points.
This way, all error paths can be tested to make sure they are correct.
Rules
-----
The blkdebug block driver takes a list of "rules" that tell the error injection
engine when to fail an I/O request.
Each I/O request is evaluated against the rules. If a rule matches the request
then its "action" is executed.
Rules can be placed in a configuration file; the configuration file
follows the same .ini-like format used by QEMU's -readconfig option, and
each section of the file represents a rule.
The following configuration file defines a single rule:
$ cat blkdebug.conf
[inject-error]
event = "read_aio"
errno = "28"
This rule fails all aio read requests with ENOSPC (28). Note that the errno
value depends on the host. On Linux, see
/usr/include/asm-generic/errno-base.h for errno values.
Invoke QEMU as follows:
$ qemu-system-x86_64
-drive if=none,cache=none,file=blkdebug:blkdebug.conf:test.img,id=drive0 \
-device virtio-blk-pci,drive=drive0,id=virtio-blk-pci0
Rules support the following attributes:
event - which type of operation to match (e.g. read_aio, write_aio,
flush_to_os, flush_to_disk). See the "Events" section for
information on events.
state - (optional) the engine must be in this state number in order for this
rule to match. See the "State transitions" section for information
on states.
errno - the numeric errno value to return when a request matches this rule.
The errno values depend on the host since the numeric values are not
standarized in the POSIX specification.
sector - (optional) a sector number that the request must overlap in order to
match this rule
once - (optional, default "off") only execute this action on the first
matching request
immediately - (optional, default "off") return a NULL BlockDriverAIOCB
pointer and fail without an errno instead. This exercises the
code path where BlockDriverAIOCB fails and the caller's
BlockDriverCompletionFunc is not invoked.
Events
------
Block drivers provide information about the type of I/O request they are about
to make so rules can match specific types of requests. For example, the qcow2
block driver tells blkdebug when it accesses the L1 table so rules can match
only L1 table accesses and not other metadata or guest data requests.
The core events are:
read_aio - guest data read
write_aio - guest data write
flush_to_os - write out unwritten block driver state (e.g. cached metadata)
flush_to_disk - flush the host block device's disk cache
See block/blkdebug.c:event_names[] for the full list of events. You may need
to grep block driver source code to understand the meaning of specific events.
State transitions
-----------------
There are cases where more power is needed to match a particular I/O request in
a longer sequence of requests. For example:
write_aio
flush_to_disk
write_aio
How do we match the 2nd write_aio but not the first? This is where state
transitions come in.
The error injection engine has an integer called the "state" that always starts
initialized to 1. The state integer is internal to blkdebug and cannot be
observed from outside but rules can interact with it for powerful matching
behavior.
Rules can be conditional on the current state and they can transition to a new
state.
When a rule's "state" attribute is non-zero then the current state must equal
the attribute in order for the rule to match.
For example, to match the 2nd write_aio:
[set-state]
event = "write_aio"
state = "1"
new_state = "2"
[inject-error]
event = "write_aio"
state = "2"
errno = "5"
The first write_aio request matches the set-state rule and transitions from
state 1 to state 2. Once state 2 has been entered, the set-state rule no
longer matches since it requires state 1. But the inject-error rule now
matches the next write_aio request and injects EIO (5).
State transition rules support the following attributes:
event - which type of operation to match (e.g. read_aio, write_aio,
flush_to_os, flush_to_disk). See the "Events" section for
information on events.
state - (optional) the engine must be in this state number in order for this
rule to match
new_state - transition to this state number
Suspend and resume
------------------
Exercising code paths in block drivers may require specific ordering amongst
concurrent requests. The "breakpoint" feature allows requests to be halted on
a blkdebug event and resumed later. This makes it possible to achieve
deterministic ordering when multiple requests are in flight.
Breakpoints on blkdebug events are associated with a user-defined "tag" string.
This tag serves as an identifier by which the request can be resumed at a later
point.
See the qemu-io(1) break, resume, remove_break, and wait_break commands for
details.

View File

@@ -1,239 +0,0 @@
# Specification for the fuzz testing tool
#
# Copyright (C) 2014 Maria Kustova <maria.k@catit.be>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
Image fuzzer
============
Description
-----------
The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img
by providing to them randomly corrupted images.
Test images are generated from scratch and have valid inner structure with some
elements, e.g. L1/L2 tables, having random invalid values.
Test runner
-----------
The test runner generates test images, executes tests utilizing generated
images, indicates their results and collects all test related artifacts (logs,
core dumps, test images, backing files).
The test means execution of all available commands under test with the same
generated test image.
By default, the test runner generates new tests and executes them until
keyboard interruption. But if a test seed is specified via the '--seed' runner
parameter, then only one test with this seed will be executed, after its finish
the runner will exit.
The runner uses an external image fuzzer to generate test images. An image
generator should be specified as a mandatory parameter of the test runner.
Details about interactions between the runner and fuzzers see "Module
interfaces".
The runner activates generation of core dumps during test executions, but it
assumes that core dumps will be generated in the current working directory.
For comprehensive test results, please, set up your test environment
properly.
Paths to binaries under test (SUTs) qemu-img and qemu-io are retrieved from
environment variables. If the environment check fails the runner will
use SUTs installed in system paths.
qemu-img is required for creation of backing files, so it's mandatory to set
the related environment variable if it's not installed in the system path.
For details about environment variables see qemu-iotests/check.
The runner accepts a JSON array of fields expected to be fuzzed via the
'--config' argument, e.g.
'[["feature_name_table"], ["header", "l1_table_offset"]]'
Each sublist can have one or two strings defining image structure elements.
In the latter case a parent element should be placed on the first position,
and a field name on the second one.
The runner accepts a list of commands under test as a JSON array via
the '--command' argument. Each command is a list containing a SUT and all its
arguments, e.g.
runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]'
/tmp/test ../qcow2
For variable arguments next aliases can be used:
- $test_img for a fuzzed img
- $off for an offset in the fuzzed image
- $len for a data size
Values for last two aliases will be generated based on a size of a virtual
disk of the generated image.
In case when no commands are specified the runner will execute commands from
the default list:
- qemu-img check
- qemu-img info
- qemu-img convert
- qemu-io -c read
- qemu-io -c write
- qemu-io -c aio_read
- qemu-io -c aio_write
- qemu-io -c flush
- qemu-io -c discard
- qemu-io -c truncate
Qcow2 image generator
---------------------
The 'qcow2' generator is a Python package providing 'create_image' method as
a single public API. See details in 'Test runner/image fuzzer' chapter of
'Module interfaces'.
Qcow2 contains two submodules: fuzz.py and layout.py.
'fuzz.py' contains all fuzzing functions, one per image field. It's assumed
that after code analysis every field will have own constraints for its value.
For now only universal potentially dangerous values are used, e.g. type limits
for integers or unsafe symbols as '%s' for strings. For bitmasks random amount
of bits are set to ones. All fuzzed values are checked on non-equality to the
current valid value of the field. In case of equality the value will be
regenerated.
'layout.py' creates a random valid image, fuzzes a random subset of the image
fields by 'fuzz.py' module and writes a fuzzed image to the file specified.
If a fuzzer configuration is specified, then it has the next interpretation:
1. If a list contains a parent image element only, then some random portion
of fields of this element will be fuzzed every test.
The same behavior is applied for the entire image if no configuration is
used. This case is useful for the test specialization.
2. If a list contains a parent element and a field name, then a field
will be always fuzzed for every test. This case is useful for regression
testing.
The generator can create header fields, header extensions, L1/L2 tables and
refcount table and blocks.
Module interfaces
-----------------
* Test runner/image fuzzer
The runner calls an image generator specifying the path to a test image file,
path to a backing file and its format and a fuzzer configuration.
An image generator is expected to provide a
'create_image(test_img_path, backing_file_path=None,
backing_file_format=None, fuzz_config=None)'
method that creates a test image, writes it to the specified file and returns
the size of the virtual disk.
The file should be created if it doesn't exist or overwritten otherwise.
fuzz_config has a form of a list of lists. Every sublist can have one
or two elements: first element is a name of a parent image element, second one
if exists is a name of a field in this element.
Example,
[['header', 'l1_table_offset'],
['header', 'nb_snapshots'],
['feature_name_table']]
Random seed is set by the runner at every test execution for the regression
purpose, so an image generator is not recommended to modify it internally.
Overall fuzzer requirements
===========================
Input data:
----------
- image template (generator)
- work directory
- action vector (optional)
- seed (optional)
- SUT and its arguments (optional)
Fuzzer requirements:
-------------------
1. Should be able to inject random data
2. Should be able to select a random value from the manually pregenerated
vector (boundary values, e.g. max/min cluster size)
3. Image template should describe a general structure invariant for all
test images (image format description)
4. Image template should be autonomous and other fuzzer parts should not
rely on it
5. Image template should contain reference rules (not only block+size
description)
6. Should generate the test image with the correct structure based on an image
template
7. Should accept a seed as an argument (for regression purpose)
8. Should generate a seed if it is not specified as an input parameter.
9. The same seed should generate the same image for the same action vector,
specified or generated.
10. Should accept a vector of actions as an argument (for test reproducing and
for test case specification, e.g. group of tests for header structure,
group of test for snapshots, etc)
11. Action vector should be randomly generated from the pool of available
actions, if it is not specified as an input parameter
12. Pool of actions should be defined automatically based on an image template
13. Should accept a SUT and its call parameters as an argument or select them
randomly otherwise. As far as it's expected to be rarely changed, the list
of all possible test commands can be available in the test runner
internally.
14. Should support an external cancellation of a test run
15. Seed should be logged (for regression purpose)
16. All files related to a test result should be collected: a test image,
SUT logs, fuzzer logs and crash dumps
17. Should be compatible with python version 2.4-2.7
18. Usage of external libraries should be limited as much as possible.
Image formats:
-------------
Main target image format is qcow2, but support of image templates should
provide an ability to add any other image format.
Effectiveness:
-------------
The fuzzer can be controlled via template, seed and action vector;
it makes the fuzzer itself invariant to an image format and test logic.
It should be able to perform rather complex and precise tests, that can be
specified via an action vector. Otherwise, knowledge about an image structure
allows the fuzzer to generate the pool of all available areas can be fuzzed
and randomly select some of them and so compose its own action vector.
Also complexity of a template defines complexity of the fuzzer, so its
functionality can be varied from simple model-independent fuzzing to smart
model-based one.
Glossary:
--------
Action vector is a sequence of structure elements retrieved from an image
format, each of them will be fuzzed for the test image. It's a subset of
elements of the action pool. Example: header, refcount table, etc.
Action pool is all available elements of an image structure that generated
automatically from an image template.
Image template is a formal description of an image structure and relations
between image blocks.
Test image is an output image of the fuzzer defined by the current seed and
action vector.

View File

@@ -74,16 +74,11 @@ Region lifecycle
----------------
A region is created by one of the constructor functions (memory_region_init*())
and attached to an object. It is then destroyed by object_unparent() or simply
when the parent object dies.
In between, a region can be added to an address space
by using memory_region_add_subregion() and removed using
memory_region_del_subregion(). Destroying the region implicitly
removes the region from the address space.
Region attributes may be changed at any point; they take effect once
the region becomes exposed to the guest.
and destroyed by the destructor (memory_region_destroy()). In between,
a region can be added to an address space by using memory_region_add_subregion()
and removed using memory_region_del_subregion(). Region attributes may be
changed at any point; they take effect once the region becomes exposed to the
guest.
Overlapping regions and priority
--------------------------------

View File

@@ -1,134 +0,0 @@
Copyright (c) 2014 Red Hat Inc.
This work is licensed under the terms of the GNU GPL, version 2 or later. See
the COPYING file in the top-level directory.
This document explains the IOThread feature and how to write code that runs
outside the QEMU global mutex.
The main loop and IOThreads
---------------------------
QEMU is an event-driven program that can do several things at once using an
event loop. The VNC server and the QMP monitor are both processed from the
same event loop, which monitors their file descriptors until they become
readable and then invokes a callback.
The default event loop is called the main loop (see main-loop.c). It is
possible to create additional event loop threads using -object
iothread,id=my-iothread.
Side note: The main loop and IOThread are both event loops but their code is
not shared completely. Sometimes it is useful to remember that although they
are conceptually similar they are currently not interchangeable.
Why IOThreads are useful
------------------------
IOThreads allow the user to control the placement of work. The main loop is a
scalability bottleneck on hosts with many CPUs. Work can be spread across
several IOThreads instead of just one main loop. When set up correctly this
can improve I/O latency and reduce jitter seen by the guest.
The main loop is also deeply associated with the QEMU global mutex, which is a
scalability bottleneck in itself. vCPU threads and the main loop use the QEMU
global mutex to serialize execution of QEMU code. This mutex is necessary
because a lot of QEMU's code historically was not thread-safe.
The fact that all I/O processing is done in a single main loop and that the
QEMU global mutex is contended by all vCPU threads and the main loop explain
why it is desirable to place work into IOThreads.
The experimental virtio-blk data-plane implementation has been benchmarked and
shows these effects:
ftp://public.dhe.ibm.com/linux/pdfs/KVM_Virtualized_IO_Performance_Paper.pdf
How to program for IOThreads
----------------------------
The main difference between legacy code and new code that can run in an
IOThread is dealing explicitly with the event loop object, AioContext
(see include/block/aio.h). Code that only works in the main loop
implicitly uses the main loop's AioContext. Code that supports running
in IOThreads must be aware of its AioContext.
AioContext supports the following services:
* File descriptor monitoring (read/write/error on POSIX hosts)
* Event notifiers (inter-thread signalling)
* Timers
* Bottom Halves (BH) deferred callbacks
There are several old APIs that use the main loop AioContext:
* LEGACY qemu_aio_set_fd_handler() - monitor a file descriptor
* LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
* LEGACY timer_new_ms() - create a timer
* LEGACY qemu_bh_new() - create a BH
* LEGACY qemu_aio_wait() - run an event loop iteration
Since they implicitly work on the main loop they cannot be used in code that
runs in an IOThread. They might cause a crash or deadlock if called from an
IOThread since the QEMU global mutex is not held.
Instead, use the AioContext functions directly (see include/block/aio.h):
* aio_set_fd_handler() - monitor a file descriptor
* aio_set_event_notifier() - monitor an event notifier
* aio_timer_new() - create a timer
* aio_bh_new() - create a BH
* aio_poll() - run an event loop iteration
The AioContext can be obtained from the IOThread using
iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
Code that takes an AioContext argument works both in IOThreads or the main
loop, depending on which AioContext instance the caller passes in.
How to synchronize with an IOThread
-----------------------------------
AioContext is not thread-safe so some rules must be followed when using file
descriptors, event notifiers, timers, or BHs across threads:
1. AioContext functions can be called safely from file descriptor, event
notifier, timer, or BH callbacks invoked by the AioContext. No locking is
necessary.
2. Other threads wishing to access the AioContext must use
aio_context_acquire()/aio_context_release() for mutual exclusion. Once the
context is acquired no other thread can access it or run event loop iterations
in this AioContext.
aio_context_acquire()/aio_context_release() calls may be nested. This
means you can call them if you're not sure whether #1 applies.
There is currently no lock ordering rule if a thread needs to acquire multiple
AioContexts simultaneously. Therefore, it is only safe for code holding the
QEMU global mutex to acquire other AioContexts.
Side note: the best way to schedule a function call across threads is to create
a BH in the target AioContext beforehand and then call qemu_bh_schedule(). No
acquire/release or locking is needed for the qemu_bh_schedule() call. But be
sure to acquire the AioContext for aio_bh_new() if necessary.
The relationship between AioContext and the block layer
-------------------------------------------------------
The AioContext originates from the QEMU block layer because it provides a
scoped way of running event loop iterations until all work is done. This
feature is used to complete all in-flight block I/O requests (see
bdrv_drain_all()). Nowadays AioContext is a generic event loop that can be
used by any QEMU subsystem.
The block layer has support for AioContext integrated. Each BlockDriverState
is associated with an AioContext using bdrv_set_aio_context() and
bdrv_get_aio_context(). This allows block layer code to process I/O inside the
right AioContext. Other subsystems may wish to follow a similar approach.
Block layer code must therefore expect to run in an IOThread and avoid using
old APIs that implicitly use the main loop. See the "How to program for
IOThreads" above for information on how to do that.
If main loop code such as a QMP function wishes to access a BlockDriverState it
must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure the
IOThread does not run in parallel.
Long-running jobs (usually in the form of coroutines) are best scheduled in the
BlockDriverState's AioContext to avoid the need to acquire/release around each
bdrv_*() call. Be aware that there is currently no mechanism to get notified
when bdrv_set_aio_context() moves this BlockDriverState to a different
AioContext (see bdrv_detach_aio_context()/bdrv_attach_aio_context()), so you
may need to add this if you want to support long-running jobs.

View File

@@ -1,5 +1,10 @@
= How to use the QAPI code generator =
* Note: as of this writing, QMP does not use QAPI. Eventually QMP
commands will be converted to use QAPI internally. The following
information describes QMP/QAPI as it will exist after the
conversion.
QAPI is a native C API within QEMU which provides management-level
functionality to internal/external users. For external
users/processes, this interface is made available by a JSON-based
@@ -14,7 +19,7 @@ marshaling/dispatch code for the guest agent server running in the
guest.
This document will describe how the schemas, scripts, and resulting
code are used.
code is used.
== QMP/Guest agent schema ==
@@ -229,7 +234,6 @@ Resulting in this JSON object:
"data": { "b": "test string" },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
== Code generation ==
Schemas are fed into 3 scripts to generate all the code/files that, paired
@@ -252,8 +256,6 @@ command which takes that type as a parameter and returns the same type:
'data': {'arg1': 'UserDefOne'},
'returns': 'UserDefOne' }
{ 'event': 'MY_EVENT' }
=== scripts/qapi-types.py ===
Used to generate the C types defined by a schema. The following files are
@@ -275,7 +277,7 @@ Example:
$ cat qapi-generated/example-qapi-types.c
[Uninteresting stuff omitted...]
void qapi_free_UserDefOneList(UserDefOneList *obj)
void qapi_free_UserDefOneList(UserDefOneList * obj)
{
QapiDeallocVisitor *md;
Visitor *v;
@@ -290,7 +292,7 @@ Example:
qapi_dealloc_visitor_cleanup(md);
}
void qapi_free_UserDefOne(UserDefOne *obj)
void qapi_free_UserDefOne(UserDefOne * obj)
{
QapiDeallocVisitor *md;
Visitor *v;
@@ -329,11 +331,11 @@ Example:
struct UserDefOne
{
int64_t integer;
char *string;
char * string;
};
void qapi_free_UserDefOneList(UserDefOneList *obj);
void qapi_free_UserDefOne(UserDefOne *obj);
void qapi_free_UserDefOneList(UserDefOneList * obj);
void qapi_free_UserDefOne(UserDefOne * obj);
#endif
@@ -362,7 +364,7 @@ Example:
$ cat qapi-generated/example-qapi-visit.c
[Uninteresting stuff omitted...]
static void visit_type_UserDefOne_fields(Visitor *m, UserDefOne **obj, Error **errp)
static void visit_type_UserDefOne_fields(Visitor *m, UserDefOne ** obj, Error **errp)
{
Error *err = NULL;
visit_type_int(m, &(*obj)->integer, "integer", &err);
@@ -378,7 +380,7 @@ Example:
error_propagate(errp, err);
}
void visit_type_UserDefOne(Visitor *m, UserDefOne **obj, const char *name, Error **errp)
void visit_type_UserDefOne(Visitor *m, UserDefOne ** obj, const char *name, Error **errp)
{
Error *err = NULL;
@@ -392,7 +394,7 @@ Example:
error_propagate(errp, err);
}
void visit_type_UserDefOneList(Visitor *m, UserDefOneList **obj, const char *name, Error **errp)
void visit_type_UserDefOneList(Visitor *m, UserDefOneList ** obj, const char *name, Error **errp)
{
Error *err = NULL;
GenericList *i, **prev;
@@ -425,8 +427,8 @@ Example:
[Visitors for builtin types omitted...]
void visit_type_UserDefOne(Visitor *m, UserDefOne **obj, const char *name, Error **errp);
void visit_type_UserDefOneList(Visitor *m, UserDefOneList **obj, const char *name, Error **errp);
void visit_type_UserDefOne(Visitor *m, UserDefOne ** obj, const char *name, Error **errp);
void visit_type_UserDefOneList(Visitor *m, UserDefOneList ** obj, const char *name, Error **errp);
#endif
@@ -449,12 +451,10 @@ $(prefix)qmp-commands.h: Function prototypes for the QMP commands
Example:
$ python scripts/qapi-commands.py --output-dir="qapi-generated"
--prefix="example-" --input-file=example-schema.json
$ cat qapi-generated/example-qmp-marshal.c
[Uninteresting stuff omitted...]
static void qmp_marshal_output_my_command(UserDefOne *ret_in, QObject **ret_out, Error **errp)
static void qmp_marshal_output_my_command(UserDefOne * ret_in, QObject **ret_out, Error **errp)
{
Error *local_err = NULL;
QmpOutputVisitor *mo = qmp_output_visitor_new();
@@ -480,11 +480,11 @@ Example:
static void qmp_marshal_input_my_command(QDict *args, QObject **ret, Error **errp)
{
Error *local_err = NULL;
UserDefOne *retval = NULL;
UserDefOne * retval = NULL;
QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
QapiDeallocVisitor *md;
Visitor *v;
UserDefOne *arg1 = NULL;
UserDefOne * arg1 = NULL;
v = qmp_input_get_visitor(mi);
visit_type_UserDefOne(v, &arg1, "arg1", &local_err);
@@ -525,66 +525,6 @@ Example:
#include "qapi/qmp/qdict.h"
#include "qapi/error.h"
UserDefOne *qmp_my_command(UserDefOne *arg1, Error **errp);
#endif
=== scripts/qapi-event.py ===
Used to generate the event-related C code defined by a schema. The
following files are created:
$(prefix)qapi-event.h - Function prototypes for each event type, plus an
enumeration of all event names
$(prefix)qapi-event.c - Implementation of functions to send an event
Example:
$ python scripts/qapi-event.py --output-dir="qapi-generated"
--prefix="example-" --input-file=example-schema.json
$ cat qapi-generated/example-qapi-event.c
[Uninteresting stuff omitted...]
void qapi_event_send_my_event(Error **errp)
{
QDict *qmp;
Error *local_err = NULL;
QMPEventFuncEmit emit;
emit = qmp_event_get_func_emit();
if (!emit) {
return;
}
qmp = qmp_event_build_dict("MY_EVENT");
emit(EXAMPLE_QAPI_EVENT_MY_EVENT, qmp, &local_err);
error_propagate(errp, local_err);
QDECREF(qmp);
}
const char *EXAMPLE_QAPIEvent_lookup[] = {
"MY_EVENT",
NULL,
};
$ cat qapi-generated/example-qapi-event.h
[Uninteresting stuff omitted...]
#ifndef EXAMPLE_QAPI_EVENT_H
#define EXAMPLE_QAPI_EVENT_H
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "example-qapi-types.h"
void qapi_event_send_my_event(Error **errp);
extern const char *EXAMPLE_QAPIEvent_lookup[];
typedef enum EXAMPLE_QAPIEvent
{
EXAMPLE_QAPI_EVENT_MY_EVENT = 0,
EXAMPLE_QAPI_EVENT_MAX = 1,
} EXAMPLE_QAPIEvent;
UserDefOne * qmp_my_command(UserDefOne * arg1, Error **errp);
#endif

View File

@@ -18,7 +18,7 @@ Contents:
* RDMA Migration Protocol Description
* Versioning and Capabilities
* QEMUFileRDMA Interface
* Migration of VM's ram
* Migration of pc.ram
* Error handling
* TODO
@@ -149,7 +149,7 @@ The only difference between a SEND message and an RDMA
message is that SEND messages cause notifications
to be posted to the completion queue (CQ) on the
infiniband receiver side, whereas RDMA messages (used
for VM's ram) do not (to behave like an actual DMA).
for pc.ram) do not (to behave like an actual DMA).
Messages in infiniband require two things:
@@ -355,7 +355,7 @@ If the buffer is empty, then we follow the same steps
listed above and issue another "QEMU File" protocol command,
asking for a new SEND message to re-fill the buffer.
Migration of VM's ram:
Migration of pc.ram:
====================
At the beginning of the migration, (migration-rdma.c),

View File

@@ -135,12 +135,12 @@ be stored. Each extension has a structure like the following:
Unless stated otherwise, each header extension type shall appear at most once
in the same image.
If the image has a backing file then the backing file name should be stored in
the remaining space between the end of the header extension area and the end of
the first cluster. It is not allowed to store other data here, so that an
implementation can safely modify the header and add extensions without harming
data of compatible features that it doesn't support. Compatible features that
need space for additional data can use a header extension.
The remaining space between the end of the header extension area and the end of
the first cluster can be used for the backing file name. It is not allowed to
store other data here, so that an implementation can safely modify the header
and add extensions without harming data of compatible features that it
doesn't support. Compatible features that need space for additional data can
use a header extension.
== Feature name table ==

View File

@@ -23,7 +23,7 @@ for debugging, profiling, and observing execution.
4. Pretty-print the binary trace file:
./scripts/simpletrace.py trace-events trace-* # Override * with QEMU <pid>
./scripts/simpletrace.py trace-events trace-*
== Trace events ==
@@ -307,43 +307,3 @@ guard such computations and avoid its compilation when the event is disabled:
You can check both if the event has been disabled and is dynamically enabled at
the same time using the 'trace_event_get_state' routine (see header
"trace/control.h" for more information).
=== "tcg" ===
Guest code generated by TCG can be traced by defining an event with the "tcg"
event property. Internally, this property generates two events:
"<eventname>_trans" to trace the event at translation time, and
"<eventname>_exec" to trace the event at execution time.
Instead of using these two events, you should instead use the function
"trace_<eventname>_tcg" during translation (TCG code generation). This function
will automatically call "trace_<eventname>_trans", and will generate the
necessary TCG code to call "trace_<eventname>_exec" during guest code execution.
Events with the "tcg" property can be declared in the "trace-events" file with a
mix of native and TCG types, and "trace_<eventname>_tcg" will gracefully forward
them to the "<eventname>_trans" and "<eventname>_exec" events. Since TCG values
are not known at translation time, these are ignored by the "<eventname>_trans"
event. Because of this, the entry in the "trace-events" file needs two printing
formats (separated by a comma):
tcg foo(uint8_t a1, TCGv_i32 a2) "a1=%d", "a1=%d a2=%d"
For example:
#include "trace-tcg.h"
void some_disassembly_func (...)
{
uint8_t a1 = ...;
TCGv_i32 a2 = ...;
trace_foo_tcg(a1, a2);
}
This will immediately call:
void trace_foo_trans(uint8_t a1);
and will generate the TCG code to call:
void trace_foo(uint8_t a1, uint32_t a2);

18
dump.c
View File

@@ -71,14 +71,18 @@ uint64_t cpu_to_dump64(DumpState *s, uint64_t val)
static int dump_cleanup(DumpState *s)
{
int ret = 0;
guest_phys_blocks_free(&s->guest_phys_blocks);
memory_mapping_list_free(&s->list);
close(s->fd);
if (s->fd != -1) {
close(s->fd);
}
if (s->resume) {
vm_start();
}
return 0;
return ret;
}
static void dump_error(DumpState *s, const char *reason)
@@ -1495,8 +1499,6 @@ static int dump_init(DumpState *s, int fd, bool has_format,
s->begin = begin;
s->length = length;
memory_mapping_list_init(&s->list);
guest_phys_blocks_init(&s->guest_phys_blocks);
guest_phys_blocks_append(&s->guest_phys_blocks);
@@ -1524,6 +1526,7 @@ static int dump_init(DumpState *s, int fd, bool has_format,
}
/* get memory mapping */
memory_mapping_list_init(&s->list);
if (paging) {
qemu_get_guest_memory_mapping(&s->list, &s->guest_phys_blocks, &err);
if (err != NULL) {
@@ -1619,7 +1622,12 @@ static int dump_init(DumpState *s, int fd, bool has_format,
return 0;
cleanup:
dump_cleanup(s);
guest_phys_blocks_free(&s->guest_phys_blocks);
if (s->resume) {
vm_start();
}
return -1;
}

122
exec.c
View File

@@ -373,7 +373,7 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
break;
}
iotlb = mr->iommu_ops->translate(mr, addr, is_write);
iotlb = mr->iommu_ops->translate(mr, addr);
addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
| (addr & iotlb.addr_mask));
len = MIN(len, (addr | iotlb.addr_mask) - addr + 1);
@@ -572,16 +572,6 @@ void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
{
}
int cpu_watchpoint_remove(CPUState *cpu, vaddr addr, vaddr len,
int flags)
{
return -ENOSYS;
}
void cpu_watchpoint_remove_by_ref(CPUState *cpu, CPUWatchpoint *watchpoint)
{
}
int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
int flags, CPUWatchpoint **watchpoint)
{
@@ -592,10 +582,12 @@ int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
int flags, CPUWatchpoint **watchpoint)
{
vaddr len_mask = ~(len - 1);
CPUWatchpoint *wp;
/* forbid ranges which are empty or run off the end of the address space */
if (len == 0 || (addr + len - 1) < addr) {
/* sanity checks: allow power-of-2 lengths, deny unaligned watchpoints */
if ((len & (len - 1)) || (addr & ~len_mask) ||
len == 0 || len > TARGET_PAGE_SIZE) {
error_report("tried to set invalid watchpoint at %"
VADDR_PRIx ", len=%" VADDR_PRIu, addr, len);
return -EINVAL;
@@ -603,7 +595,7 @@ int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
wp = g_malloc(sizeof(*wp));
wp->vaddr = addr;
wp->len = len;
wp->len_mask = len_mask;
wp->flags = flags;
/* keep all GDB-injected watchpoints in front */
@@ -624,10 +616,11 @@ int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
int cpu_watchpoint_remove(CPUState *cpu, vaddr addr, vaddr len,
int flags)
{
vaddr len_mask = ~(len - 1);
CPUWatchpoint *wp;
QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
if (addr == wp->vaddr && len == wp->len
if (addr == wp->vaddr && len_mask == wp->len_mask
&& flags == (wp->flags & ~BP_WATCHPOINT_HIT)) {
cpu_watchpoint_remove_by_ref(cpu, wp);
return 0;
@@ -657,27 +650,6 @@ void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
}
}
}
/* Return true if this watchpoint address matches the specified
* access (ie the address range covered by the watchpoint overlaps
* partially or completely with the address range covered by the
* access).
*/
static inline bool cpu_watchpoint_address_matches(CPUWatchpoint *wp,
vaddr addr,
vaddr len)
{
/* We know the lengths are non-zero, but a little caution is
* required to avoid errors in the case where the range ends
* exactly at the top of the address space and so addr + len
* wraps round to zero.
*/
vaddr wpend = wp->vaddr + wp->len - 1;
vaddr addrend = addr + len - 1;
return !(addr > wpend || wp->vaddr > addrend);
}
#endif
/* Add a breakpoint. */
@@ -889,7 +861,7 @@ hwaddr memory_region_section_get_iotlb(CPUState *cpu,
/* Make accesses to pages with watchpoints go via the
watchpoint trap routines. */
QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
if (cpu_watchpoint_address_matches(wp, vaddr, TARGET_PAGE_SIZE)) {
if (vaddr == (wp->vaddr & TARGET_PAGE_MASK)) {
/* Avoid trapping reads of pages with a write breakpoint. */
if ((prot & PAGE_WRITE) || (wp->flags & BP_MEM_READ)) {
iotlb = PHYS_SECTION_WATCH + paddr;
@@ -1059,7 +1031,7 @@ void qemu_mutex_unlock_ramlist(void)
#define HUGETLBFS_MAGIC 0x958458f6
static long gethugepagesize(const char *path, Error **errp)
static long gethugepagesize(const char *path)
{
struct statfs fs;
int ret;
@@ -1069,8 +1041,7 @@ static long gethugepagesize(const char *path, Error **errp)
} while (ret != 0 && errno == EINTR);
if (ret != 0) {
error_setg_errno(errp, errno, "failed to get page size of file %s",
path);
perror(path);
return 0;
}
@@ -1088,22 +1059,17 @@ static void *file_ram_alloc(RAMBlock *block,
char *filename;
char *sanitized_name;
char *c;
void *area = NULL;
void *area;
int fd;
uint64_t hpagesize;
Error *local_err = NULL;
unsigned long hpagesize;
hpagesize = gethugepagesize(path, &local_err);
if (local_err) {
error_propagate(errp, local_err);
hpagesize = gethugepagesize(path);
if (!hpagesize) {
goto error;
}
if (memory < hpagesize) {
error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
"or larger than huge page size 0x%" PRIx64,
memory, hpagesize);
goto error;
return NULL;
}
if (kvm_enabled() && !kvm_has_sync_mmu()) {
@@ -1113,7 +1079,7 @@ static void *file_ram_alloc(RAMBlock *block,
}
/* Make name safe to use with mkstemp by replacing '/' with '_'. */
sanitized_name = g_strdup(memory_region_name(block->mr));
sanitized_name = g_strdup(block->mr->name);
for (c = sanitized_name; *c != '\0'; c++) {
if (*c == '/')
*c = '_';
@@ -1164,7 +1130,6 @@ static void *file_ram_alloc(RAMBlock *block,
error:
if (mem_prealloc) {
error_report("%s\n", error_get_pretty(*errp));
exit(1);
}
return NULL;
@@ -1294,7 +1259,7 @@ static int memory_try_enable_merging(void *addr, size_t len)
return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
}
static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
static ram_addr_t ram_block_add(RAMBlock *new_block)
{
RAMBlock *block;
ram_addr_t old_ram_size, new_ram_size;
@@ -1311,11 +1276,9 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
} else {
new_block->host = phys_mem_alloc(new_block->length);
if (!new_block->host) {
error_setg_errno(errp, errno,
"cannot set up guest memory '%s'",
memory_region_name(new_block->mr));
qemu_mutex_unlock_ramlist();
return -1;
fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
new_block->mr->name, strerror(errno));
exit(1);
}
memory_try_enable_merging(new_block->host, new_block->length);
}
@@ -1366,8 +1329,6 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
Error **errp)
{
RAMBlock *new_block;
ram_addr_t addr;
Error *local_err = NULL;
if (xen_enabled()) {
error_setg(errp, "-mem-path not supported with Xen");
@@ -1397,22 +1358,14 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
return -1;
}
addr = ram_block_add(new_block, &local_err);
if (local_err) {
g_free(new_block);
error_propagate(errp, local_err);
return -1;
}
return addr;
return ram_block_add(new_block);
}
#endif
ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
MemoryRegion *mr, Error **errp)
MemoryRegion *mr)
{
RAMBlock *new_block;
ram_addr_t addr;
Error *local_err = NULL;
size = TARGET_PAGE_ALIGN(size);
new_block = g_malloc0(sizeof(*new_block));
@@ -1423,18 +1376,12 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
if (host) {
new_block->flags |= RAM_PREALLOC;
}
addr = ram_block_add(new_block, &local_err);
if (local_err) {
g_free(new_block);
error_propagate(errp, local_err);
return -1;
}
return addr;
return ram_block_add(new_block);
}
ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp)
ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
{
return qemu_ram_alloc_from_ptr(size, NULL, mr, errp);
return qemu_ram_alloc_from_ptr(size, NULL, mr);
}
void qemu_ram_free_from_ptr(ram_addr_t addr)
@@ -1678,7 +1625,7 @@ static const MemoryRegionOps notdirty_mem_ops = {
};
/* Generate a debug exception if a watchpoint has been hit. */
static void check_watchpoint(int offset, int len, int flags)
static void check_watchpoint(int offset, int len_mask, int flags)
{
CPUState *cpu = current_cpu;
CPUArchState *env = cpu->env_ptr;
@@ -1696,14 +1643,9 @@ static void check_watchpoint(int offset, int len, int flags)
}
vaddr = (cpu->mem_io_vaddr & TARGET_PAGE_MASK) + offset;
QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
if (cpu_watchpoint_address_matches(wp, vaddr, len)
&& (wp->flags & flags)) {
if (flags == BP_MEM_READ) {
wp->flags |= BP_WATCHPOINT_HIT_READ;
} else {
wp->flags |= BP_WATCHPOINT_HIT_WRITE;
}
wp->hitaddr = vaddr;
if ((vaddr == (wp->vaddr & len_mask) ||
(vaddr & wp->len_mask) == wp->vaddr) && (wp->flags & flags)) {
wp->flags |= BP_WATCHPOINT_HIT;
if (!cpu->watchpoint_hit) {
cpu->watchpoint_hit = wp;
tb_check_watchpoint(cpu);
@@ -1728,7 +1670,7 @@ static void check_watchpoint(int offset, int len, int flags)
static uint64_t watch_mem_read(void *opaque, hwaddr addr,
unsigned size)
{
check_watchpoint(addr & ~TARGET_PAGE_MASK, size, BP_MEM_READ);
check_watchpoint(addr & ~TARGET_PAGE_MASK, ~(size - 1), BP_MEM_READ);
switch (size) {
case 1: return ldub_phys(&address_space_memory, addr);
case 2: return lduw_phys(&address_space_memory, addr);
@@ -1740,7 +1682,7 @@ static uint64_t watch_mem_read(void *opaque, hwaddr addr,
static void watch_mem_write(void *opaque, hwaddr addr,
uint64_t val, unsigned size)
{
check_watchpoint(addr & ~TARGET_PAGE_MASK, size, BP_MEM_WRITE);
check_watchpoint(addr & ~TARGET_PAGE_MASK, ~(size - 1), BP_MEM_WRITE);
switch (size) {
case 1:
stb_phys(&address_space_memory, addr, val);

Some files were not shown because too many files have changed in this diff Show More