Compare commits

..

2058 Commits

Author SHA1 Message Date
Nikunj A Dadhania
75bd0c7253 vfio: make rom read endian sensitive
All memory regions used by VFIO are LITTLE_ENDIAN and they
already take care of endiannes when accessing real device BARs
except ROM - it was broken on BE hosts.

This fixes endiannes for ROM BARs the same way as it is done
for other BARs.

This has been tested on PPC64 BE/LE host/guest in all possible
combinations including TCG.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[aik: added commit log]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-22 15:27:43 -06:00
Alexey Kardashevskiy
6758008e2c Revert "vfio: Make BARs native endian"
This reverts commit c40708176a.

The resulting code wrongly assumed target and host endianness are
the same which is not always the case for PPC64.

[aw: or potentially any host supporting VFIO and TCG]

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-22 15:26:36 -06:00
Max Filippov
07e2863d02 exec.c: fix setting 1-byte-long watchpoints
With commit 05068c0dfb 'exec.c: Relax restrictions on watchpoint length
and alignment' it's no longer possible to set 1-byte-long watchpoint
because of incorrect address range check.
Fix that by changing condition that checks for address wraparound.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1411016616-29879-1-git-send-email-jcmvbkbc@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-19 17:42:16 +01:00
Stefan Weil
4852ee95f3 Fix cross compilation (nm command)
Commit c261d774fb added one more binutils
tool: nm also needs a cross prefix.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1411070108-8954-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-19 17:20:11 +01:00
Peter Maydell
10e11f4d2b Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio, misc bugfixes

A bunch of bugfixes - some of these will make sense for 2.1.2
I put Cc: qemu-stable included where appropriate.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 18 Sep 2014 19:52:18 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pc: leave more space for BIOS allocations
  virtio-pci: fix migration for pci bus master
  vhost-user: fix VIRTIO_NET_F_MRG_RXBUF negotiation
  virtio-pci: enable bus master for old guests
  Revert "virtio: don't call device on !vm_running"
  virtio-net: drop assert on vm stop
  Revert "rng-egd: remove redundant free"
  qdev: Move global validation to a single function
  qdev: Rename qdev_prop_check_global() to qdev_prop_check_globals()
  test-qdev-global-props: Test handling of hotpluggable and non-device types
  test-qdev-global-props: Initialize not_used=true for all props
  test-qdev-global-props: Run tests on subprocess
  tests: disable global props test for old glib
  test-qdev-global-props: Trivial comment fix
  hw/machine: Free old values of string properties

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-18 20:02:01 +01:00
Michael S. Tsirkin
438f92ee9f pc: leave more space for BIOS allocations
Since QEMU 2.1, we are allocating more space for ACPI tables, so no
space is left after initrd for the BIOS to allocate memory.

Besides ACPI tables, there are a few other uses of high memory in
SeaBIOS: SMBIOS tables and USB drivers use it in particular.  These uses
allocate a very small amount of memory.  Malloc metadata also lives
there.  So we need _some_ extra padding there to avoid initrd breakage,
but not much.

John Snow found a case where RHEL5 was broken by the recent change to
ACPI_TABLE_SIZE; in his case 4KB of extra padding are fine, but just to
be safe I am adding 32KB, which is roughly the same amount of padding
that was left by QEMU 2.0 and earlier.

Move initrd to leave some space for the BIOS.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
4d43d3f3c8 virtio-pci: fix migration for pci bus master
Current support for bus master (clearing OK bit)
together with the need to support guests which do not
enable PCI bus mastering, leads to extra state in
VIRTIO_PCI_FLAG_BUS_MASTER_BUG bit, which isn't robust
in case of cross-version migration for the case when
guests use the device before setting DRIVER_OK.

Rip out VIRTIO_PCI_FLAG_BUS_MASTER_BUG and implement a simpler
work-around: treat clearing of PCI_COMMAND as a virtio reset.  Old
guests never touch this bit so they will work.

As reset clears device status, DRIVER and MASTER bits are
now in sync, so we can fix up cross-version migration simply
by synchronising them, without need to detect a buggy guest
explicitly.

Drop tracking VIRTIO_PCI_FLAG_BUS_MASTER_BUG completely.

As reset makes the device quiescent, in the future we'll be able to drop
checking OK bit in a bunch of places.

Cc: Jason Wang <jasowang@redhat.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Damjan Marion
d8e80ae37a vhost-user: fix VIRTIO_NET_F_MRG_RXBUF negotiation
Header length check should happen only if backend is kernel. For user
backend there is no reason to reset this bit.

vhost-user code does not define .has_vnet_hdr_len so
VIRTIO_NET_F_MRG_RXBUF cannot be negotiated even if both sides
support it.

Signed-off-by: Damjan Marion <damarion@cisco.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
e43c0b2ea5 virtio-pci: enable bus master for old guests
commit cc943c36fa
    pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.

Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (guests use some devices before DRIVER_OK).

Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
9e8e8c4865 Revert "virtio: don't call device on !vm_running"
This reverts commit a1bc7b827e422e1ff065640d8ec5347c4aadfcd8.
    virtio: don't call device on !vm_running
It turns out that virtio net assumes that vm_running
is updated before device status callback in many places,
so this change leads to asserts.
Previous commit fixes the root issue that motivated
a1bc7b827e422e1ff065640d8ec5347c4aadfcd8 differently,
so there's no longer a need for this change.

In the future, we might be able to drop checking vm_running
completely, and check vm state directly.

Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
131c5221fe virtio-net: drop assert on vm stop
On vm stop, vm_running state set to stopped
before device is notified, so callbacks can get envoked with
vm_running = false; and this is not an error.

Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
abb4d5f2e2 Revert "rng-egd: remove redundant free"
This reverts commit 5e490b6a50.

Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
b3ce84fea4 qdev: Move global validation to a single function
Currently GlobalProperty.not_used=false has multiple meanings:

* It may be a property for a hotpluggable device, which may or may not
  have been used by a device;
* It may be a machine-type-provided property, which may or may not have
  been used by a device.
* It may be a user-provided property that was actually not used by
  any device.

Simplify the logic by having two separate fields: 'user_provided' and
'used'. This allows the entire global property validation logic to be
contained in a single function, and allows more specific error messages.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
d828c430eb qdev: Rename qdev_prop_check_global() to qdev_prop_check_globals()
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
08ac80cd61 test-qdev-global-props: Test handling of hotpluggable and non-device types
Ensure no warning will be printed for hotpluggable types, and warnings
will be printed for non-device types.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
45de81735b test-qdev-global-props: Initialize not_used=true for all props
This will ensure we are actually testing the code which sets
not_used=false when the property is used.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Eduardo Habkost
2177801a48 test-qdev-global-props: Run tests on subprocess
There are multiple reasons for running the global property tests on a
subprocess:

* We need the global_props lists to be empty for each test case, so
  global properties from the previous test won't affect the next one;
* We don't want the qdev_prop_check_global() warnings to pollute test
  output;
* With a subprocess, we can ensure qdev_prop_check_global() is printing
  the warning messages it should.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:51:24 +03:00
Michael S. Tsirkin
9d41401b90 tests: disable global props test for old glib
follow-up patch moves global property tests to subprocesses.
Unfortunately with old glib this causes:

tests/test-qdev-global-props.c: In function
‘test_static_prop’:
tests/test-qdev-global-props.c:80:5: error: implicit
declaration of function ‘g_test_trap_subprocess’
[-Werror=implicit-function-declaration]
tests/test-qdev-global-props.c:80:5: error: nested extern
declaration of ‘g_test_trap_subprocess’ [-Werror=nested-externs]

This function was only added in glib 2.38, and our
minimum version is 2.12.

To fix, disable the test for glib < 2.38.

Apply before that patch to avoid breaking bisect.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-18 21:50:30 +03:00
Peter Maydell
bb26a1e80b Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140918-1' into staging
vnc: set TCP_NODELAY, cleanup in tlc code

# gpg: Signature made Thu 18 Sep 2014 07:02:37 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140918-1:
  vnc-tls: Clean up dead store in vnc_set_x509_credential()
  ui/vnc: set TCP_NODELAY

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-18 17:00:38 +01:00
Markus Armbruster
9d64fab422 vnc-tls: Clean up dead store in vnc_set_x509_credential()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-18 08:01:53 +02:00
Peter Lieven
86152436eb ui/vnc: set TCP_NODELAY
we currently have the Nagle algorithm enabled for all outgoing VNC updates.
This may delay sensitive updates as mouse movements or typing in the console.
As we currently prepare all data in a buffer and then send as much as we can
disabling the Nagle algorithm should not cause big trouble. Well established
VNC servers like TightVNC set TCP_NODELAY as well.
A regular framebuffer update request generates exactly one framebuffer update
which should be pushed out as fast as possible.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-17 15:14:41 +02:00
Peter Maydell
e4d50d47a9 qemu-char: Rename register_char_driver_qapi() to register_char_driver()
Now we have removed the legacy register_char_driver() we can
rename register_char_driver_qapi() to the more obvious and
shorter name.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-6-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
a61ae7f88c qemu-char: Remove register_char_driver() machinery
Now that all the char backends have been converted to the QAPI
framework we can remove the machinery for handling old style
backends.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-5-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
90a14bfe52 qemu-char: Convert udp backend to QAPI
Convert the udp char backend to the new style QAPI framework.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-4-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
8287fea321 util/qemu-sockets.c: Support specifying IPv4 or IPv6 in socket_dgram()
Currently you can specify whether you want a UDP chardev backend
to be IPv4 or IPv6 using the ipv4 or ipv6 options if you use the
QemuOpts parsing code in inet_dgram_opts(). However the QMP struct
parsing code in socket_dgram() doesn't provide this flexibility
(which in turn prevents us from converting the UDP backend handling
to the new style QAPI framework).

Use the existing inet_addr_to_opts() function to convert the
remote->inet address to option strings; this handles ipv4 and
ipv6 flags as well as host and port. (It will also convert any
'to' specification, which is harmless as it is ignored in this
context.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-3-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
dafd325dbb qemu-char: Convert socket backend to QAPI
Convert the socket char backend to the new style QAPI framework;
this allows it to return an Error ** to callers who might not
want it to print directly about socket failures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1409653457-27863-2-git-send-email-peter.maydell@linaro.org
2014-09-16 23:36:32 +01:00
Peter Maydell
8af47027eb Merge remote-tracking branch 'remotes/kraxel/tags/pull-sdl-20140916-1' into staging
Two minor sdl2 fixes.

# gpg: Signature made Tue 16 Sep 2014 07:20:37 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-sdl-20140916-1:
  sdl2: keymap fixups
  sdl2: drop sdl_zoom.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-16 18:29:40 +01:00
Peter Maydell
5f3fb5a2e2 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140916-2' into staging
spice: call qemu_spice_set_passwd() during init

# gpg: Signature made Tue 16 Sep 2014 07:11:22 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140916-2:
  spice: call qemu_spice_set_passwd() during init

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-16 18:28:31 +01:00
Marc-André Lureau
07d49a53b6 spice: call qemu_spice_set_passwd() during init
Don't call SPICE API directly to set password given in command line, but
use the internal API, saving password for later calls.

This solves losing password when changing expiration in qemu monitor.

https://bugzilla.redhat.com/show_bug.cgi?id=1138639

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-16 08:09:03 +02:00
Gerd Hoffmann
0d61f7dcc6 sdl2: keymap fixups
Make a few keys works correctly in SDL2.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-16 08:07:05 +02:00
Gerd Hoffmann
4f36e42ee9 sdl2: drop sdl_zoom.h
It isn't used.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-16 08:07:05 +02:00
Peter Maydell
cc35a44cf7 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  exec: file_ram_alloc(): print error when prealloc fails
  monitor: fix debug print compiling error

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-15 19:44:34 +01:00
Peter Maydell
f2bcdc8de0 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 12 Sep 2014 16:09:43 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (22 commits)
  qcow2: Add falloc and full preallocation option
  raw-posix: Add falloc and full preallocation option
  qapi: introduce PreallocMode and new PreallocModes full and falloc.
  block: don't convert file size to sector size
  block: round up file size to nearest sector
  iotests: Send the correct fd in socket_scm_helper
  blockdev: Refuse to drive_del something added with blockdev-add
  block: extend BLOCK_IO_ERROR with reason string
  dataplane: fix virtio_blk_data_plane_create() op blocker error path
  qemu-iotests: Run 025 for Archipelago block driver
  block/archipelago: Implement bdrv_truncate()
  block: Make the block accounting functions operate on BlockAcctStats
  block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
  block: Extract the block accounting code
  block: Extract the BlockAcctStats structure
  IDE: MMIO IDE device control should be little endian
  thread-pool: Drop unnecessary includes
  xen: Drop redundant bdrv_close() from pci_piix3_xen_ide_unplug()
  xen_disk: Plug memory leak on error path
  qemu-io: Clean up openfile() after commit 2e40134
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-15 17:35:22 +01:00
Peter Maydell
16ab5046c4 Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20140915-1' into staging
Fix pixman build failure.

# gpg: Signature made Mon 15 Sep 2014 07:15:11 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20140915-1:
  configure: check for pixman-1 version
  pixman: update internal copy to pixman-0.32.6

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-15 16:42:28 +01:00
Hu Tao
236f282c1c configure: check for pixman-1 version
commit a93a3af9 introduces use of PIXMAN_TYPE_RGBA, but it's only available
in pixman >= 0.21.8. If pixman doesn't meet the version requirement, qemu
will fail to build with following message:

qemu/ui/qemu-pixman.c: In function ‘qemu_pixelformat_from_pixman’:
qemu/ui/qemu-pixman.c:42: error: ‘PIXMAN_TYPE_RGBA’ undeclared (first use in this function)
qemu/ui/qemu-pixman.c:42: error: (Each undeclared identifier is reported only once
qemu/ui/qemu-pixman.c:42: error: for each function it appears in.)

This patch fixes the problem by checking the pixman version.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-15 08:14:19 +02:00
Hu Tao
122abbe5fc pixman: update internal copy to pixman-0.32.6
commit a93a3af9 introduces use of PIXMAN_TYPE_RGBA, but it's only available
in pixman >= 0.21.8. Although commit f27b2e1d bumped pixman to pixman-0.28.2,
but the change was reverted later by 7b1b5d19.

This patch updates internal copy of pixman to pixman-0.32.6 to fix the
problem.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-15 08:14:19 +02:00
Eduardo Habkost
70e98b230f test-qdev-global-props: Trivial comment fix
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-14 21:32:16 +03:00
Eduardo Habkost
556068eed0 hw/machine: Free old values of string properties
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
2014-09-14 21:32:16 +03:00
Peter Maydell
2b31cd4e08 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- Memory: improve error reporting and avoid crashes on hotplug
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.

# gpg: Signature made Thu 11 Sep 2014 11:21:10 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream: (21 commits)
  gdbstub: init mon_chr through qemu_chr_alloc
  pckbd: adding new fields to vmstate
  mc146818rtc: add missed field to vmstate
  piix: do not set irq while loading vmstate
  serial: fixing vmstate for save/restore
  parallel: adding vmstate for save/restore
  fdc: adding vmstate for save/restore
  cpu: init vmstate for ticks and clock offset
  apic_common: vapic_paddr synchronization fix
  vl: use QLIST_FOREACH_SAFE to visit change state handlers
  exec: add parameter errp to gethugepagesize
  exec: report error when memory < hpagesize
  hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
  memory: add parameter errp to memory_region_init_rom_device
  memory: add parameter errp to memory_region_init_ram
  exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
  rules.mak: Fix DSO build by pulling in archive symbols
  util: Don't link host-utils.o if it's empty
  util: Move general qemu_getauxval to util/getauxval.c
  trace: Only link generated-tracers.o with "simple" backend
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 16:55:49 +01:00
Luiz Capitulino
e4d9df4fb1 exec: file_ram_alloc(): print error when prealloc fails
If memory allocation fails when using the -mem-prealloc command-line
option, QEMU exits without printing any error information to
the user:

 # qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
 # echo $?
 1

This commit adds an error message, so that we print instead:

 # qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
 qemu: unable to map backing store for hugepages: Cannot allocate memory

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-09-12 11:22:21 -04:00
Gonglei
5fb9b5b9cb monitor: fix debug print compiling error
error: 'i' undeclared (first use in this function)

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-09-12 11:01:50 -04:00
Peter Maydell
4c24f40040 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140912' into staging
target-arm:
 * add "linux,stdout-path" to the virt DTB
 * fix a long standing bug with IRQ disabling on Cortex-M CPUs
 * implement input interrupt logic in the PL061
 * fix failure to load correct SP/PC on reset of Cortex-M CPUs
   if the vector table is not in a ROM-blob-in-RAM
 * provide flash devices for boot ROMs in the virt board
 * implement architectural watchpoints
 * fix misimplementation of Inner Shareable TLB operations that
   caused instability of guests in TCG SMP configurations
 * configure PL011 and PL031 in the virt board correctly with
   level-triggered interrupts rather than edge-triggered
 * support providing a device tree blob to ROM (firmware)
   images as well as to kernels

# gpg: Signature made Fri 12 Sep 2014 14:19:08 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140912: (23 commits)
  hw/arm/boot: enable DTB support when booting ELF images
  hw/arm/boot: load device tree to base of DRAM if no -kernel option was passed
  hw/arm/boot: pass an address limit to and return size from load_dtb()
  hw/arm/boot: load DTB as a ROM image
  hw/arm/virt: fix pl011 and pl031 irq flags
  target-arm: Make *IS TLB maintenance ops affect all CPUs
  target-arm: Push legacy wildcard TLB ops back into v6
  target-arm: Implement minimal DBGVCR, OSDLR_EL1, MDCCSR_EL0
  target-arm: Remove comment about MDSCR_EL1 being dummy implementation
  target-arm: Set DBGDSCR.MOE for debug exceptions taken to AArch32
  target-arm: Implement handling of fired watchpoints
  target-arm: Move extended_addresses_enabled() to internals.h
  target-arm: Implement setting of watchpoints
  cpu-exec: Make debug_excp_handler a QOM CPU method
  exec.c: Record watchpoint fault address and direction
  exec.c: Provide full set of dummy wp remove functions in user-mode
  exec.c: Relax restrictions on watchpoint length and alignment
  hw/arm/virt: Provide flash devices for boot ROMs
  target-arm: Fix broken indentation in arm_cpu_reest()
  target-arm: Fix resetting issues on ARMv7-M CPUs
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 15:12:26 +01:00
Hu Tao
0e4271b711 qcow2: Add falloc and full preallocation option
preallocation=falloc allocates disk space by posix_fallocate(),
preallocation=full allocates disk space by writing zeros to disk.
Both modes imply preallocation=metadata.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
06247428be raw-posix: Add falloc and full preallocation option
This patch adds a new option preallocation for raw format, and implements
falloc and full preallocation.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
ffeaac9b4e qapi: introduce PreallocMode and new PreallocModes full and falloc.
This patch prepares for the subsequent patches.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
180e95265e block: don't convert file size to sector size
and avoid converting it back later.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao
c2eb918e32 block: round up file size to nearest sector
Currently the file size requested by user is rounded down to nearest
sector, causing the actual file size could be a bit less than the size
user requested. Since some formats (like qcow2) record virtual disk
size in bytes, this can make the last few bytes cannot be accessed.

This patch fixes it by rounding up file size to nearest sector so that
the actual file size is no less than the requested file size.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Ard Biesheuvel
92df845070 hw/arm/boot: enable DTB support when booting ELF images
Add support for loading DTB images when booting ELF images using
-kernel. If there are no conflicts with the placement of the ELF
segments, the DTB image is loaded at the base of RAM.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-5-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Ard Biesheuvel
69e7f76f6a hw/arm/boot: load device tree to base of DRAM if no -kernel option was passed
If we are running the 'virt' machine, we may have a device tree blob but no
kernel to supply it to if no -kernel option was passed. In that case, copy it
to the base of RAM where it can be picked up by a bootloader.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-4-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Ard Biesheuvel
fee8ea12eb hw/arm/boot: pass an address limit to and return size from load_dtb()
Add an address limit input parameter to load_dtb() so that we can
tell load_dtb() how much memory the dtb is allowed to consume. If
the dtb doesn't fit, return 0, otherwise return the actual size of
the loaded dtb.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-3-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Ard Biesheuvel
4c4bf65474 hw/arm/boot: load DTB as a ROM image
In order to make the device tree blob (DTB) available in memory not only at
first boot, but also after system reset, use rom_blob_add_fixed() to install
it into memory.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1410453915-9344-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Peter Maydell
0be969a2d9 hw/arm/virt: fix pl011 and pl031 irq flags
The pl011 and pl031 devices both use level triggered interrupts,
but the device tree we construct was incorrectly telling the
kernel to configure the GIC to treat them as edge triggered.
This meant that output from the pl011 would hang after a while.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410274423-9461-1-git-send-email-peter.maydell@linaro.org
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: qemu-stable@nongnu.org
2014-09-12 14:06:50 +01:00
Peter Maydell
fa439fc5d7 target-arm: Make *IS TLB maintenance ops affect all CPUs
The ARM architecture defines that the "IS" variants of TLB
maintenance operations must affect all TLBs in the Inner Shareable
domain, which for us means all CPUs. We were incorrectly implementing
these to only affect the current CPU, which meant that SMP TCG
operation was unstable.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410274883-9578-3-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
2014-09-12 14:06:50 +01:00
Peter Maydell
995939a650 target-arm: Push legacy wildcard TLB ops back into v6
When we implemented ARMv8 in QEMU we retained our legacy loose
wildcarded decoding of the TLB maintenance operations for v7
and earlier CPUs and provided the correct stricter decode for
v8. However the loose decode is in fact wrong for v7MP, because
it doesn't correctly implement the operations which must apply
to every CPU in the Inner Shareable domain.

Move the legacy wildcarding from the not_v8 reginfo array
into the not_v7 array, and move the strictly decoded operations
from the v8 reginfo to v7 or v7mp arrays as appropriate.

Cache and TLB lockdown legacy wildcarding remains in the
not_v8 array for the moment.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1410274883-9578-2-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
2014-09-12 14:06:50 +01:00
Peter Maydell
5e8b12ffbb target-arm: Implement minimal DBGVCR, OSDLR_EL1, MDCCSR_EL0
Implement debug registers DBGVCR, OSDLR_EL1 and MDCCSR_EL0
(as dummy or limited-functionality). 32 bit Linux kernels will
access these at startup so they are required for breakpoints
and watchpoints to be supported.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:50 +01:00
Peter Maydell
17a9eb53a9 target-arm: Remove comment about MDSCR_EL1 being dummy implementation
MDSCR_EL1 has actual functionality now; remove the out of date
comment that claims it is a dummy implementation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
16a906fd6e target-arm: Set DBGDSCR.MOE for debug exceptions taken to AArch32
For debug exceptions taken to AArch32 we have to set the
DBGDSCR.MOE (Method Of Entry) bits; we can identify the
kind of debug exception from the information in
exception.syndrome.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
3ff6fc9148 target-arm: Implement handling of fired watchpoints
Implement the ARM debug exception handler for dealing with
fired watchpoints.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
73c5211ba9 target-arm: Move extended_addresses_enabled() to internals.h
Move the utility function extended_addresses_enabled() into
internals.h; we're going to need to call it from op_helper.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
9ee98ce810 target-arm: Implement setting of watchpoints
Implement support for setting QEMU watchpoints based on the
values the guest writes to the ARM architected watchpoint
registers. (We do not yet report the firing of the watchpoints
to the guest, so they will just be ignored.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:49 +01:00
Peter Maydell
86025ee443 cpu-exec: Make debug_excp_handler a QOM CPU method
Make the debug_excp_handler target specific hook into a QOM
CPU method.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
Peter Maydell
08225676b2 exec.c: Record watchpoint fault address and direction
When we check whether we've hit a watchpoint we know the address
that we were attempting to access and whether it was a read or a
write. Record this information in the CPUWatchpoint struct so that
target-specific code can report it to the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-09-12 14:06:48 +01:00
Peter Maydell
3ee887e8ff exec.c: Provide full set of dummy wp remove functions in user-mode
We already provide dummy versions of the cpu_watchpoint_insert
and cpu_watchpoint_remove_all functions when CONFIG_USER_ONLY
is defined. Complete the set by providing cpu_watchpoint_remove
and cpu_watchpoint_remove_by_ref as well.

This allows target-* code using these functions to avoid
some ifdeffery.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-09-12 14:06:48 +01:00
Peter Maydell
05068c0dfb exec.c: Relax restrictions on watchpoint length and alignment
The current implementation of watchpoints requires that they
have a power of 2 length which is not greater than TARGET_PAGE_SIZE
and that their address is a multiple of their length. Watchpoints
on ARM don't fit these restrictions, so change the implementation
so they can be relaxed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-09-12 14:06:48 +01:00
Peter Maydell
acf82361c6 hw/arm/virt: Provide flash devices for boot ROMs
Add two flash devices to the virt board, so that it can be used for
running guests which want a bootrom image such as UEFI. We provide
two flash devices to make it more convenient to provide both a
read-only UEFI image and a read-write place to store guest-set
UEFI config variables. The '-bios' command line option is set up
to provide an image for the first of the two flash devices.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1409930126-28449-2-git-send-email-ard.biesheuvel@linaro.org
2014-09-12 14:06:48 +01:00
Martin Galvan
34bf774485 target-arm: Fix broken indentation in arm_cpu_reest()
Fix a single misindented line in arm_cpu_reset().

Signed-off-by: Martin Galvan <martin.galvan@tallertechnologies.com>
[PMM: split this out from the previous commit]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
Martin Galvan
6e3cf5df01 target-arm: Fix resetting issues on ARMv7-M CPUs
When calling qemu_system_reset after startup on a Cortex-M
CPU, the initial values of PC, MSP and the Thumb bit weren't being set
correctly if the vector table was in ROM. In particular, since Thumb was 0, a
Usage Fault would arise immediately after trying to execute any instruction
on a Cortex-M.

Signed-off-by: Martin Galvan <martin.galvan@tallertechnologies.com>
Message-id: CAOKbPbaLt-LJsAKkQdOE0cs9Xx4OWrUfpDhATXPSdtuNw2xu_A@mail.gmail.com
[PMM: removed an incorrect comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
Colin Leitner
bfb27e6042 pl061: implement input interrupt logic
This patch adds the missing input interrupt logic to the pl061 GPIO device. To
keep the floating output pins to stay high, the old state variable had to be
split into two separate ones for input and output - which brings the vmstate
version to 3.

Edge level interrupts and I/O were tested under Linux 3.14. Level interrupt
handling hasn't been tested.

Signed-off-by: Colin Leitner <colin.leitner@googlemail.com>
Message-id: 54024FD2.9080204@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:48 +01:00
David Hoover
c3c8d6b3dd cpu-exec.c: Allow disabling of IRQs on ARM Cortex-M CPUs
Correct an error in the logic for deciding whether we can
take an IRQ interrupt which meant that on M profile cores
it was never possible to disable them.

The design here is still bogus in that M profile doesn't
have separate "IRQ" and "FIQ", which are an A/R profile
concept; we should ideally implement the proper priority
based scheme.

Signed-off-by: David Hoover <spm@boiteauxlettres.sent.at>
[PMM: Wrote a proper commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:47 +01:00
Ard Biesheuvel
f022b8e953 hw/arm/virt: add linux, stdout-path to /chosen DT node
Add a property "linux,stdout-path" to the /chosen DT node and make
it point to the emulated UART. This allows users such as the Linux
kernel to produce console output without the need to pass console=
or earlycon=pl011,0x... command line arguments.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1409317439-29349-1-git-send-email-ard.biesheuvel@linaro.org
Reviewed-by: Rob Herring <rob.herring@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 14:06:47 +01:00
Marc Marí
6cd14054b6 libqos virtio: Increase ISR timeout
Increase the clock step to avoid Travis failure in some builds due to
overagressive timeout.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Message-id: 1410428416-5046-1-git-send-email-marc.mari.barcelo@gmail.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 13:58:07 +01:00
Stratos Psomadakis
be2bfb9dbd iotests: Send the correct fd in socket_scm_helper
Make sure to pass the correct fd via SCM_RIGHTS in socket_scm_helper.c
(i.e. fd_to_send, not socket-fd).

Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 10:27:54 +02:00
Markus Armbruster
48f364dd0b blockdev: Refuse to drive_del something added with blockdev-add
For some device models, the guest can prevent unplug.  Some users need a
way to forcibly revoke device model access to the block backend then, so
the underlying images can be safely used for something else.

drive_del lets you do that.  Unfortunately, it conflates revoking access
with destroying the backend.

Commit 9063f81 made drive_del immediately destroy the root BDS.  Nice:
the device name becomes available for reuse immediately.  Not so nice:
the device model's pointer to the root BDS dangles, and we're prone to
crash when the memory gets reused.

Commit d22b2f4 fixed that by hiding the root BDS instead of destroying
it.  Destruction only happens on unplug.  "Hiding" means removing it
from bdrv_states and graph_bdrv_states; see bdrv_make_anon().

This "destroy on revoke" is a misfeature we don't want to carry
forward to blockdev-add, just like "destroy on unplug" (commit
2d246f0).  So make drive_del fail on anything added with blockdev-add.

We'll add separate QMP commands to revoke device model access and to
destroy backends.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-11 17:14:24 +02:00
Luiz Capitulino
624ff5736e block: extend BLOCK_IO_ERROR with reason string
BLOCK_IO_ERROR events are logged by libvirt, which helps with
post mortem analysis of guests. However, one information that
we miss today is a human readable string describing the cause
of the I/O error.

This commit adds that string it to BLOCK_IO_ERROR. Note that
this string is a debugging aid for humans, meaning that it
should not parsed by applications.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-11 17:14:13 +02:00
Stefan Hajnoczi
745a9bb9cd dataplane: fix virtio_blk_data_plane_create() op blocker error path
Commit 3718d8ab65 ("block: Replace in_use
with operation blocker") broke the error path because it consumed
local_err instead of propagating it.

The caller has no way to know that the function failed.  This caused
virtio-blk to start "successfully" even though there was a fatal
dataplane error.

Steps to reproduce:

  $ qemu-system-x86_64 -enable-kvm -object iothread,id=iothread0 \
                       -drive if=none,id=drive0,file=a.img \
  (qemu) drive_mirror drive0 /tmp/foo.img
  (qemu) device_add virtio-blk-pci,iothread=iothread0,drive=drive0

Expected result:

  Since the mirror block job is using drive0 it is not possible to start
  virtio-blk data-plane.

  device_add fails and the PCI adapter is not added.

Actual result:

  device_add completes and the PCI adapter is added.

Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-11 16:21:46 +02:00
Peter Maydell
0dfa7e3012 Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20140905-2' into staging
console: pixman switchover continued, add some infrastructure to make it
         easier using pixman in display device emulation.

# gpg: Signature made Fri 05 Sep 2014 14:38:57 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20140905-2:
  console: Remove unused QEMU_BIG_ENDIAN_FLAG
  console: add qemu_pixman_linebuf_copy
  console: add dpy_gfx_update_dirty
  console: add qemu_create_displaysurface_guestmem
  console: stop using PixelFormat
  console: reimplement qemu_default_pixelformat
  console: add qemu_default_pixman_format
  console: add qemu_pixelformat_from_pixman

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-11 11:44:17 +01:00
Pavel Dovgalyuk
462efe9e53 gdbstub: init mon_chr through qemu_chr_alloc
This patch initializes monitor for gdbstub with the qemu_chr_alloc function
instead of just allocating the memory. Initialization function call
is required, because it also creates chr_write_lock mutex, which is used
when writing to this character device.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:33 +02:00
Pavel Dovgalyuk
a28fe7e3f6 pckbd: adding new fields to vmstate
This patch adds outport to VMState to allow correct saving and restoring
the state of PC keyboard controller.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
0b102153e0 mc146818rtc: add missed field to vmstate
This patch adds irq_reinject_on_ack_count field to VMState to allow correct
saving/loading the state of MC146818 RTC.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
2c9ecdeb9f piix: do not set irq while loading vmstate
This patch avoids setting an irq while loading the state of the ISA bridge.
Because the i8259 has not been deserialized yet, raising an interrupt
could bring the system out-of-sync with the migration source.  For example,
the migration source could have masked the interrupt in the i8259. On the
destination, the i8259 device model would not know that yet and would
trigger an interrupt in the CPU.

This patch eliminates setting the irq and just restores the calculated
state fields in post_load function.  Interrupt state will be deserialized
separately through the IRR field of the i8259.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
7385b275d9 serial: fixing vmstate for save/restore
Some fields were added to VMState by this patch to preserve correct
loading of the serial port controller state.
Updating FCR value while loading was also modified to disable generating
an interrupt by loadvm.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
461a2753a1 parallel: adding vmstate for save/restore
VMState added by this patch preserves correct
loading of the parallel port controller state.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
c0b92f3037 fdc: adding vmstate for save/restore
VMState added by this patch preserves correct
loading of the FDC device state.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
4603ea0105 cpu: init vmstate for ticks and clock offset
Ticks and clock offset used by CPU timers have to be saved in vmstate.
But vmstate for these fields registered only in icount mode.
Missing registration leads to breaking the continuity when vmstate is loaded.
This patch introduces new initialization function which fixes this.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:32 +02:00
Pavel Dovgalyuk
a6dead43e6 apic_common: vapic_paddr synchronization fix
This patch postpones vapic_paddr initialization, which is performed
during migration. When vapic_paddr is synchronized within the migration
process, apic_common functions could operate with incorrect apic state,
if it hadn't loaded yet. This patch postpones the synchronization until
the virtual machine is started, ensuring that the whole virtual machine
state has been loaded.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Tested-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-11 12:20:25 +02:00
Peter Maydell
fc3b9aa876 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140910-1' into staging
xhci PCIe endpoint migration compatibility fix

# gpg: Signature made Wed 10 Sep 2014 06:35:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140910-1:
  xhci PCIe endpoint migration compatibility fix

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-11 10:36:50 +01:00
Paolo Bonzini
9b10ac869d vl: use QLIST_FOREACH_SAFE to visit change state handlers
This lets a handler delete itself.

Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-10 12:08:33 +02:00
Chrysostomos Nanakos
466c80f21f qemu-iotests: Run 025 for Archipelago block driver
Run resize grow test to ensure that existing data
is not lost during grow and new space is zeroed.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Chrysostomos Nanakos
94c80a438c block/archipelago: Implement bdrv_truncate()
Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
5366d0c8bc block: Make the block accounting functions operate on BlockAcctStats
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.

Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Michael Tokarev <mjt@tls.msk.ru>
CC: John Snow <jsnow@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>
CC: Max Reitz <mreitz@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
28298fd3d9 block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
The middle term goal is to move the BlockAcctStats structure in the device models.
(Capturing I/O accounting statistics in the device models is good for billing)
This patch make a small step in this direction by removing a reference to BDRV.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>i

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
5e5a94b605 block: Extract the block accounting code
The plan is to add new accounting metrics (latency, invalid requests, failed
requests, queue depth) and block.c is overpopulated so it will be better to work
in a separate module.

Moreover the long term plan is to have statistics in each of the BDS of the graph
for metrology purpose; this means that the device model statistics must move from
the topmost BDS to the device model.

So we need to decouple the statistic code from BlockDriverState.

This is another argument for the extraction of the code in a separate module.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Benoit Canet <benoit@irqsave.net>
CC: Fam Zheng <famz@redhat.com>
CC: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
CC: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
0ddd0ad96a block: Extract the BlockAcctStats structure
Extract the block accounting statistics into a structure so the block device
models can hold them in the future.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Valentin Manea
1a7044bb62 IDE: MMIO IDE device control should be little endian
Set the IDE MMIO memory type to little endian. The ATA specs identify
words part of the control commands encoded as little endian.
While this has no impact on little endian systems, it's required for big
endian systems(eg OpenRisc).

Signed-off-by: Valentin Manea <valentin.manea@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
a31e69cf00 thread-pool: Drop unnecessary includes
Dragging block_int.h into a header is *not* nice.  Fortunately, this
is the only offender.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
7ca9b7c035 xen: Drop redundant bdrv_close() from pci_piix3_xen_ide_unplug()
drive_del() closes just fine.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
cedccf1381 xen_disk: Plug memory leak on error path
The Error object was leaked after failed bdrv_new(). While there,
streamline control flow a bit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
dbb651c46c qemu-io: Clean up openfile() after commit 2e40134
Commit 6db9560 split off the growable case so it can use
bdrv_file_open() instead of bdrv_open() then.  Growable BDSes become
anonymous.  Weird.

Commit 2e40134 folded bdrv_file_open() back into bdrv_open() with new
flag BDRV_O_PROTOCOL.  We still have two bdrv_open() calls, and
growable BDSes remain anonymous.

Circle back to before commit 6db9560: just one call, not anonymous.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Xiaodong Gong
0d4cc3e715 Fix improper usage of cpu_to_be32 in vpc
cpu_to_be32() is wrong since vhd_type is an enum constant
(just a regular CPU-endian integer).

Signed-off-by: Xiaodong Gong <gordongong0350@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Luiz Capitulino
c7c2ff0c7e block: extend BLOCK_IO_ERROR event with nospace indicator
Management software, such as RHEV's vdsm, want to be able to allocate
disk space on demand. The basic use case is to start a VM with a small
disk and then the disk is enlarged when QEMU hits a ENOSPC condition.

To this end, the management software has to be notified when QEMU
encounters ENOSPC. The solution implemented by this commit is simple:
it extends the BLOCK_IO_ERROR with a 'nospace' key, which is true
when QEMU is stopped due to ENOSPC.

Note that support for querying this event is already present in
query-block by means of the 'io-status' key. Also, the new 'nospace'
BLOCK_IO_ERROR field shares the same semantics with 'io-status',
which basically means that werror= has to be set to either
'stop' or 'enospc' to enable 'nospace'.

Finally, this commit also updates the 'io-status' key doc in the
schema with a list of supported device models.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Dr. David Alan Gilbert
e6043e92c2 xhci PCIe endpoint migration compatibility fix
Add back the PCIe config capabilities on XHCI cards in non-PCIe slots,
but only for machine types before 2.1.

This fixes a migration incompatibility in the XHCI PCI devices
caused by:
   058fdcf52c - xhci: add endpoint cap on express bus only

Note that in fixing it for compatibility with older QEMUs, it breaks
compatibility with existing QEMU 2.1's on older machine types.

The status before this patch was (if it used an XHCI adapter):
   machine type | source qemu
     any           pre-2.1     - FAIL
     any           2.1...      - PASS

With this patch:
   machine type | source qemu
     any           pre-2.1    - PASS
     pre-2.1       2.1...     - FAIL
     2.1           2.1...     - PASS

A test to trigger it is to add '-device nec-usb-xhci,id=xhci,addr=0x12'
to the command line.

Cc: qemu-stable@nongnu.org
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-10 07:20:53 +02:00
Peter Maydell
10601bef56 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
apb: implement PCI bus error interrupt map registers

# gpg: Signature made Tue 09 Sep 2014 06:09:27 BST using RSA key ID AE0F321F
# gpg: Can't check signature: public key not found

* remotes/mcayland/tags/qemu-sparc-signed:
  apb: implement PCI bus error interrupt map registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-09 15:08:05 +01:00
Hu Tao
fc7a5800ad exec: add parameter errp to gethugepagesize
Add parameter errp to gethugepagesize thus callers can handle errors.

If user adds a memory-backend-file object using object_add command,
specifying a non-existing directory for property mem-path, qemu will
core dump with message:

  /nonexistingdir: No such file or directory
  Bad ram offset fffffffffffff000
  Aborted (core dumped)

This patch fixes the problem. With this patch, qemu reports an error
message like:

  qemu-system-x86_64: -object memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
  failed to get page size of file /nonexistingdir: No such file or directory

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
557529dd60 exec: report error when memory < hpagesize
Report an error when memory < hpagesize in file_ram_alloc() so callers
can handle the error.

If user adds a memory-backend-file object using object_add command,
specifying a size that is less than huge page size, qemu will core dump
with message:

  Bad ram offset fffffffffffff000
  Aborted (core dumped)

This patch fixes the problem. With this patch, qemu reports error
message like:

  qemu-system-x86_64: -object memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M: memory
  size 0x100000 must be equal to or larger than huge page size 0x200000

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
d42e2de7bc hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
When using monitor command object_add to add a memory backend whose
size is way too big to allocate memory for it, qemu just exits. In
the case we'd better give an error message and keep guest running.

The problem can be reproduced as follows:

1. run qemu
2. (monitor)object_add memory-backend-ram,size=100000G,id=ram0

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
33e0eb5297 memory: add parameter errp to memory_region_init_rom_device
Add parameter errp to memory_region_init_rom_device and update all call
sites to propagate the error.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Propagate the error out of realize. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
49946538d2 memory: add parameter errp to memory_region_init_ram
Add parameter errp to memory_region_init_ram and update all call sites
to pass in &error_abort.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:43 +02:00
Hu Tao
ef701d7b6f exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
we can handle errors.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Assert ptr != NULL in memory_region_init_ram_ptr. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:25 +02:00
Fam Zheng
c261d774fb rules.mak: Fix DSO build by pulling in archive symbols
This fixes an issue with module build system. block/iscsi.so is
currently broken:

    $ ~/build/last/qemu-img
    Failed to open module: /home/fam/build/master/block-iscsi.so:
    undefined symbol: qmp_query_uuid
    qemu-img: Not enough arguments
    Try 'qemu-img --help' for more information

To fix this, we should (at least) let qemu-img link qmp_query_uuid from
libqemustub.a. (There are a few other symbols missing, as well.)

This patch changes the linking rules to:

1) Build ".mo" with "ld -r -o $@ $^" for each ".so", and later build .so
   with it.

2) Always build all the .mo before linking the executables. This is
   achieved by adding those .mo files to the executables' "-y"
   variables.

3) When linking an executable, those .mo files in its "-y" variables are
   filtered out, and replaced by one or more -Wl,-u,$symbol flags. This
   is done in the added macro "process-archive-undefs".

   These "-Wl,-u,$symbol" flags will force ld to pull in the function
   definition from the archives when linking.

   Note that the .mo objects, that are actually meant to be linked in
   the executables, are already expanded in unnest-vars, before the
   linking command. So we are safe to simply filter out .mo for the
   purpose of pulling undefined symbols.

   process-archive-undefs works as this: For each ".mo", find all the
   undefined symbols in it, filter ones that are defined in the
   archives. For each of these symbols, generate a "-Wl,-u,$symbol" in
   the link command, and put them before archive names in the command
   line.

Suggested-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Fam Zheng
2ceee4b052 util: Don't link host-utils.o if it's empty
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Fam Zheng
f6e0830298 util: Move general qemu_getauxval to util/getauxval.c
So that we won't have an empty getauxval.o which is disliked by ranlib.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Fam Zheng
ddbc41de38 trace: Only link generated-tracers.o with "simple" backend
In any other cases the object file is effectively empty, which is
disliked by ranlib and nm on Mac OS X.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Paolo Bonzini
a85e130e01 kvm: do not abort if KVM_RUN fails
Just go to the internal error runstate.  This lets you use the "x",
"dump-guest-memory" or "info register" commands.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Mark Cave-Ayland
de739df8e0 apb: implement PCI bus error interrupt map registers
Both OpenBSD and FreeBSD SPARC64 attempt to read the interrupt map from the
hardware and will fail if the correct ino isn't present.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-09-09 06:07:12 +01:00
Peter Maydell
1bc0e40581 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Mon 08 Sep 2014 11:49:31 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (24 commits)
  ide: Add resize callback to ide/core
  IDE: Fill the IDENTIFY request consistently
  vmdk: fix buf leak in vmdk_parse_extents()
  vmdk: fix vmdk_parse_extents() extent_file leaks
  ide: Add wwn support to IDE-ATAPI drive
  qtest/ide: Uninitialize PC allocator
  libqos: add a simple first-fit memory allocator
  MAINTAINERS: update sheepdog maintainer
  qemu-nbd: fix indentation and coding style
  qemu-nbd: add option to set detect-zeroes mode
  rename parse_enum_option to qapi_enum_parse and make it public
  block/archipelago: Use QEMU atomic builtins
  qemu-img: fix rebase src_cache option documentation
  qemu-img: clarify src_cache option documentation
  libqos: Added EVENT_IDX support
  libqos: Added MSI-X support
  libqos: Added test case for configuration changes in virtio-blk test
  libqos: Added indirect descriptor support to virtio implementation
  libqos: Added basic virtqueue support to virtio implementation
  tests: Add virtio device initialization
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-08 13:14:41 +01:00
Peter Maydell
2d6838e86c Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-09-08

Alexander Graf (11):
      PPC: KVM: Fix g3beige and mac99 when HV is loaded
      PPC: mac99: Move NVRAM to page boundary when necessary
      KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
      PPC: KVM: Use vm check_extension for pv hcall
      PPC: mac99: Fix core99 timer frequency
      PPC: mac_nvram: Remove unused functions
      PPC: mac_nvram: Allow 2 and 4 byte accesses
      PPC: mac_nvram: Split NVRAM into OF and OSX parts
      PPC: Mac: Move tbfreq into local variable
      PPC: Cuda: Use cuda timer to expose tbfreq to guest
      PPC: Fix default config ordering and add eTSEC for ppc64

Alexey Kardashevskiy (7):
      spapr: Move DT memory node rendering to a helper
      spapr: Use DT memory node rendering helper for other nodes
      spapr: Refactor spapr_populate_memory() to allow memoryless nodes
      spapr: Split memory nodes to power-of-two blocks
      spapr: Add a helper for node0_size calculation
      spapr: Fix ibm, associativity for memory nodes
      spapr_pci: Fix config space corruption

Anton Blanchard (2):
      spapr-vlan: Don't touch last entry in buffer list
      hypervisor property clashes with hypervisor node

Benjamin Herrenschmidt (2):
      loader: Add load_image_size() to replace load_image()
      spapr: Locate RTAS and device-tree based on real RMA

Bharat Bhushan (4):
      ppc: debug stub: Get trap instruction opcode from KVM
      ppc: synchronize excp_vectors for injecting exception
      ppc: Add software breakpoint support
      ppc: Add hw breakpoint watchpoint support

Gonglei (1):
      spapr: fix possible memory leak

Greg Kurz (1):
      spapr_pci: map the MSI window in each PHB

Nikunj A Dadhania (3):
      ppc: spapr-rtas - implement os-term rtas call
      spapr: add uuid/host details to device tree
      ppc/spapr: Fix MAX_CPUS to 255

Peter Maydell (1):
      hw/ppc/spapr_hcall.c: Fix typo in function names

Tom Musta (20):
      linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
      linux-user: Split PPC Trampoline Encoding from Register Save
      linux-user: Enable Signal Handlers on PPC64
      linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
      linux-user: Implement do_setcontext for PPC64
      linux-user: Handle PPC64 ELFv2 Function Pointers
      target-ppc: Bug Fix: rlwinm
      target-ppc: Bug Fix: rlwnm
      target-ppc: Bug Fix: rlwimi
      target-ppc: Bug Fix: mullwo
      target-ppc: Bug Fix: mullw
      target-ppc: Bug Fix: mulldo OV Detection
      target-ppc: Bug Fix: srawi
      target-ppc: Bug Fix: srad
      target-ppc: Special Case of rlwimi Should Use Deposit
      target-ppc: Optimize rlwinm MB=0 ME=31
      target-ppc: Optimize rlwnm MB=0 ME=31
      target-ppc: Clean Up mullw
      target-ppc: Clean up mullwo
      target-ppc: Implement mulldo with TCG

# gpg: Signature made Mon 08 Sep 2014 11:51:15 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream: (52 commits)
  hypervisor property clashes with hypervisor node
  PPC: Fix default config ordering and add eTSEC for ppc64
  spapr_pci: map the MSI window in each PHB
  target-ppc: Implement mulldo with TCG
  target-ppc: Clean up mullwo
  target-ppc: Clean Up mullw
  target-ppc: Optimize rlwnm MB=0 ME=31
  target-ppc: Optimize rlwinm MB=0 ME=31
  target-ppc: Special Case of rlwimi Should Use Deposit
  spapr-vlan: Don't touch last entry in buffer list
  spapr_pci: Fix config space corruption
  PPC: Cuda: Use cuda timer to expose tbfreq to guest
  PPC: Mac: Move tbfreq into local variable
  PPC: mac_nvram: Split NVRAM into OF and OSX parts
  PPC: mac_nvram: Allow 2 and 4 byte accesses
  PPC: mac_nvram: Remove unused functions
  PPC: mac99: Fix core99 timer frequency
  PPC: KVM: Use vm check_extension for pv hcall
  KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
  target-ppc: Bug Fix: srad
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-08 12:02:07 +01:00
Anton Blanchard
85423d90c7 hypervisor property clashes with hypervisor node
dtc fails on a recent QEMU snapshot:

ERROR (name_properties): "name" property in /hypervisor#1 is incorrect ("hypervisor" instead of base node name)

Looking at the device tree we have a hypervisor property:

# lsprop hypervisor
hypervisor       "kvm"

But we also have a hypervisor node, with a name that doesn't match:

# lsprop hypervisor#1/
name             "hypervisor"
compatible       "linux,kvm"
linux,phandle    7e5eb5d8 (2120136152)

Commit c08ce91d309c (spapr: add uuid/host details to device tree)
looks to have collided with an earlier patch. Remove the hypervisor
property.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:54 +02:00
Alexander Graf
4a761ffa37 PPC: Fix default config ordering and add eTSEC for ppc64
We messed up the ordering in our default configs for PPC. The top entries
are generic entries, then come sections that indicate that features are only
in because of a special feature (such as PReP).

Fix the ordering again and while at it add eTSEC support to the ppc64 target
so that we can spawn eTSEC adapters with qemu-system-ppc64.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:54 +02:00
Greg Kurz
8c46f7ec85 spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
22ffad31d4 target-ppc: Implement mulldo with TCG
Optimize mulldo by using the muls2_i64 operation rather than a helper.  Eliminate
the obsolete helper code.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
269778769d target-ppc: Clean up mullwo
Simplify the implementation of mullwo.  For 64 bit CPUs, the result is
the concatenation of the upper and lower parts of the muls2_i32 operation,
which may be slightly better than deposit.  For 32 bit CPUs, the lower part
of the muls_i32 operation is moved into the target GPR.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
03039e5ef0 target-ppc: Clean Up mullw
Eliminate the unecessary ext32s TCG operation and make the multiplication
operation explicitly 32 bit.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
57fca134bb target-ppc: Optimize rlwnm MB=0 ME=31
Optimize the special case of rlwnm where MB=0 and ME=31.  This can
be implemented using a ROTL.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
8979c2f602 target-ppc: Optimize rlwinm MB=0 ME=31
Optimize the special case of rlwinm where MB=0 and ME=31.  This can
be implemented as a 32-bit ROTL.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Tom Musta
ab92678d0a target-ppc: Special Case of rlwimi Should Use Deposit
The special case of rlwimi where MB <= ME and SH = 31-ME can be implemented
with a single TCG deposit operation.  This replaces the less general case
of SH = MB = 0 and ME = 31.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Anton Blanchard
439ce1401b spapr-vlan: Don't touch last entry in buffer list
The last 8 bytes of the buffer list is defined to contain the number
of dropped frames. At the moment we use it to store rx entries,
which trips up ethtool -S:

rx_no_buffer: 9223380832981355136

Fix this by skipping the last buffer list entry.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexey Kardashevskiy
3242052248 spapr_pci: Fix config space corruption
When disabling MSI/MSIX via "ibm,change-msi" RTAS call, no check was made
if MSI or MSIX is actually supported and the MSI message was reset
unconditionally. If this happened on a device which does not support MSI
(but does support MSIX, otherwise "ibm,change-msi" would not be called),
this device would have PCIDevice::msi_cap field (MSI capability offset)
set to zero and writing a vector would actually clear PCI status.

This clears MSI message only if MSI or MSIX is present on a device.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
b981289c49 PPC: Cuda: Use cuda timer to expose tbfreq to guest
Mac OS X calibrates a number of frequencies on bootup based on reading
tb values on bootup and comparing them to via cuda timer values.

The only variable we can really steer well (thanks to KVM) is the cuda
frequency. So let's use that one to fake Mac OS X into believing the
bus frequency is tbfreq * 4. That way Mac OS X will automatically
calculate the correct timebase frequency.

With this patch and the patch set I posted earlier I can successfully
run Mac OS X 10.2, 10.3 and 10.4 guests with -M mac99 on TCG and KVM.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
caae6c9611 PPC: Mac: Move tbfreq into local variable
We already expose the real CPU's tb frequency to the guest via fw_cfg. Soon
we will need to also expose it to the MacIO, so let's move it to a variable
that we can leverage every time we need the frequency.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
2d9907a333 PPC: mac_nvram: Split NVRAM into OF and OSX parts
Mac OS X (at least with -M mac99) searches for a valid NVRAM partition
of a special Apple type. If it can't find that partition in the first
half of NVRAM, it will look at the second half.

There are a few implications from this. The first is that we need to
split NVRAM into 2 halves - one for Open Firmware use, the other one for
Mac OS X. Without this split Mac OS X will just loop endlessly over the
second half trying to find a partition.

The other implication is that we should provide a specially crafted Mac
OS X compatible NVRAM partition on the second half that Mac OS X can
happily use as it sees fit.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf
b19eae18c1 PPC: mac_nvram: Allow 2 and 4 byte accesses
The NVRAM in our Core99 machine really supports 2byte and 4byte accesses
just as well as 1byte accesses. In fact, Mac OS X uses those.

Add support for higher register size granularities.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
a8b0503701 PPC: mac_nvram: Remove unused functions
The macio_nvram_read and macio_nvram_write functions are never called,
just remove them.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
d696760b43 PPC: mac99: Fix core99 timer frequency
There is a special timer in the mac99 machine that we recently started
to emulate. Unfortunately we emulated it in the wrong frequency.

This patch adapts the frequency Mac OS X uses to evaluate results from
this timer, making calculations it bases off of it work.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
6fd33a7502 PPC: KVM: Use vm check_extension for pv hcall
To find out whether we support the KVM hypercall interface we need to ask KVM
on the VM level rather than the global KVM level, because Book3S HV KVM does
not support it and we play conservative when both HV and PR are loaded.

So instead, use the VM helper that falls back to global KVM enumeration. That
should cover all cases.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Alexander Graf
7d0a07fa92 KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
We now can call KVM_CHECK_EXTENSION on the kvm fd or on the vm fd, whereas
the vm version is more accurate when it comes to PPC KVM.

Add a helper to make the vm version available that falls back to the non-vm
variant if the vm one is not available yet to stay compatible.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Tom Musta
4bc02e230d target-ppc: Bug Fix: srad
Fix the check for carry in the srad helper to properly construct
the mask -- a "1ULL" must be used (instead of "1") in order to
get the desired result.

Example:

R3 8000000000000000
R4 F3511AD4A2CD4C38
srad 3,3,4

Should *not* set XER[CA] but does without this patch.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Tom Musta
34a0fad102 target-ppc: Bug Fix: srawi
For 64 bit implementations, the special case of a shift by zero
should result in the sign extension of the least significant 32 bits
of the source GPR (not a direct copy of the 64 bit source GPR).

Example:

R3 A6212433228F41DC
srawi 3,3,0
R3 expected : 00000000228F41DC
R3 actual   : A6212433228F41DC (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
9824d01d5d target-ppc: Bug Fix: mulldo OV Detection
Fix the code to properly detect overflow; the 128 bit signed
product must have all zeroes or all ones in the first 65 bits
otherwise OV should be set.

Example:

R3 45F086A5D5887509
R4 0000000000000002
mulldo 3,3,4

Should set XER[OV].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
1fa74845f2 target-ppc: Bug Fix: mullw
For 64-bit implementations, the mullw result is the 64 bit product
of the sign-extended least significant 32 bits of the source
registers.

Fix the code to properly sign extend the source operands and produce
a 64 bit product.

Example:
R3 00000000002F37A0
R4 41C33D242F816715
mullw 3,3,4
R3 expected : 0008C3146AE0F020
R3 actual   : 000000006AE0F020 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
f11ebbf8d4 target-ppc: Bug Fix: mullwo
On 64-bit implementations, the mullwo result is the 64 bit product of
the signed 32 bit operands.  Fix the implementation to properly deposit
the upper 32 bits into the target register.

Example:

R3 0407DED115077586
R4 53778DF3CA992E09
mullwo 3,3,4
R3 expected : FB9D02730D7735B6
R3 actual   : 000000000D7735B6 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
6ea7b35c02 target-ppc: Bug Fix: rlwimi
The rlwimi specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Also fix the special case of MB=31 and ME=0 to copy the entire contents
of the source GPR.

Examples:

R3 FFFFFFFFFFFFFFF0
rlwimi 3,3,29,14,1
R3 expected : 1FFFFFFE3FFFFFFE
R3 actual   : 000000003FFFFFFE (without this patch)

R3 ED7EB4DD824F0853
rlwimi 3,3,10,31,0
R3 expected : 3C214E09024F0853
R3 actual   : 00000000024F0853 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
1c0a150f4b target-ppc: Bug Fix: rlwnm
The rlwnm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Example:

R3 = 0000000000000002
R4 = 7FFFFFFFFFFFFFFF
rlwnm 3,3,4,31,16
R3 expected : 0000000100000001
R3 actual   : 0000000000000001 (without this patch)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:50 +02:00
Tom Musta
a7f23d0f8b target-ppc: Bug Fix: rlwinm
The rlwinm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Example:
R3 = F7487D82EC6F75DF
rlwinm 3,3,5,12,4

R3 expected : 8DEEBBFD880EBBFD
R3 actual   : 00000000880EBBFD (without this fix)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Nikunj A Dadhania
9674a35626 ppc/spapr: Fix MAX_CPUS to 255
MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This
MAX_CPUS number was percolated back to "virsh capabilities" with wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
88365d17d5 ppc: Add hw breakpoint watchpoint support
This patch adds hardware breakpoint and hardware watchpoint support
for ppc.

On BOOKE architecture we cannot share debug resources between QEMU
and guest because:
    When QEMU is using debug resources then debug exception must
    be always enabled. To achieve this we set MSR_DE and also set
    MSRP_DEP so guest cannot change MSR_DE.

    When emulating debug resource for guest we want guest
    to control MSR_DE (enable/disable debug interrupt on need).

    So above mentioned two configuration cannot be supported
    at the same time. So the result is that we cannot share
    debug resources between QEMU and Guest on BOOKE architecture.

In the current design QEMU gets priority over guest,
this means that if QEMU is using debug resources then guest
cannot use them and if guest is using debug resource then
qemu can overwrite them.

When QEMU is not able to handle debug exception then we inject program
exception to guest. Yes program exception NOT debug exception and the
reason is:
 1) QEMU and guest not sharing debug resources
 2) For software breakpoint QEMU uses a ehpriv-1 instruction;

 So there cannot be any reason that we are in qemu with exit reason
 KVM_EXIT_DEBUG  for guest set debug exception, only possibility is
 guest executed ehpriv-1 privilege instruction and that's why we are
 injecting program exception.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
8a0548f94e ppc: Add software breakpoint support
This patch allow insert/remove software breakpoint.

When QEMU is not able to handle debug exception then we inject
program exception to guest because for software breakpoint QEMU
uses a ehpriv-1 instruction;
So there cannot be any reason that we are in qemu with exit reason
KVM_EXIT_DEBUG  for guest set debug exception, only possibility is
guest executed ehpriv-1 privilege instruction and that's why we are
injecting program exception.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
[agraf: make deflect comment booke/book3s agnostic]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
c371c2e3e0 ppc: synchronize excp_vectors for injecting exception
This patch synchronizes env->excp_vectors[] with env->iovr[].
This is required for using the existing interrupt injection mechanism
for kvm.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan
3c902d4469 ppc: debug stub: Get trap instruction opcode from KVM
Get trap instruction opcode from KVM and this opcode will
be used for setting software breakpoint in following patch

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Benjamin Herrenschmidt
b7d1f77ada spapr: Locate RTAS and device-tree based on real RMA
We currently calculate the final RTAS and FDT location based on
the early estimate of the RMA size, cropped to 256M on KVM since
we only know the real RMA size at reset time which happens much
later in the boot process.

This means the FDT and RTAS end up right below 256M while they
could be much higher, using precious RMA space and limiting
what the OS bootloader can put there which has proved to be
a problem with some OSes (such as when using very large initrd's)

Fortunately, we do the actual copy of the device-tree into guest
memory much later, during reset, late enough to be able to do it
using the final RMA value, we just need to move the calculation
to the right place.

However, RTAS is still loaded too early, so we change the code to
load the tiny blob into qemu memory early on, and then copy it into
guest memory at reset time. It's small enough that the memory usage
doesn't matter.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Benjamin Herrenschmidt
ea87616d6c loader: Add load_image_size() to replace load_image()
A subsequent patch to ppc/spapr needs to load the RTAS blob into
qemu memory rather than target memory (so it can later be copied
into the right spot at machine reset time).

I would use load_image() but it is marked deprecated because it
doesn't take a buffer size as argument, so let's add load_image_size()
that does.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
c3b4f589d8 spapr: Fix ibm, associativity for memory nodes
We want the associtivity lists of memory and CPU nodes to match but
memory nodes have incorrect domain#3 which is zero for CPU so they won't
match.

This clears domain#3 in the list to match CPUs associtivity lists.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
b082d65a30 spapr: Add a helper for node0_size calculation
In multiple places there is a node0_size variable calculation
which assumes that NUMA node #0 and memory node #0 are the same
things which they are not. Since we are going to change it and
do not want to change it in multiple places, let's make a helper.

This adds a spapr_node0_size() helper and makes use of it.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
6010818c30 spapr: Split memory nodes to power-of-two blocks
Linux kernel expects nodes to have power-of-two size and
does WARN_ON if this is not the case:
[    0.041456] WARNING: at drivers/base/memory.c:115
which is:

===
	/* Validate blk_sz is a power of 2 and not less than section size */
	if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) {
        	WARN_ON(1);
	        block_sz = MIN_MEMORY_BLOCK_SIZE;
	}
===

This splits memory nodes into set of smaller blocks with
a size which is a power of two. This makes sure the start
address of every node is aligned to the node size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: squash windows compile fix in]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
7db8a127e3 spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however
actual hardware may have them so it makes sense to have a way
to emulate them in QEMU. This prepares SPAPR for that.

This moves 2 calls of spapr_populate_memory_node() into
the existing loop over numa nodes so first several nodes may
have no memory and this still will work.

If there is no numa configuration, the code assumes there is just
a single node at 0 and it has all the guest memory.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy
81014ac2b8 spapr: Use DT memory node rendering helper for other nodes
This finishes refactoring by using the spapr_populate_memory_node helper
for all nodes and removing leftovers from spapr_populate_memory().

This is not a part of the previous patch because the patches look
nicer apart.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexey Kardashevskiy
26a8c353bf spapr: Move DT memory node rendering to a helper
This moves recurring bits of code related to memory@xxx nodes
creation to a helper.

This makes use of the new helper for node@0.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Gonglei
a21a7a7012 spapr: fix possible memory leak
get_boot_devices_list() will malloc memory, spapr_finalize_fdt
doesn't free it.

Signed-off-by: Chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexander Graf
261265cc91 PPC: mac99: Move NVRAM to page boundary when necessary
When running KVM we have to adhere to host page boundaries for memory slots.
Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash
area.

So if our host is configured with 64k page size, we can't use the mac99 target
with KVM. This is a real shame, as this limitation is not really an issue - we
can easily map NVRAM somewhere else and at least Linux and Mac OS X use it
at their new location.

So in that emergency case when it's about failing to run at all and moving NVRAM
to a place it shouldn't be at, choose the latter.

This patch enables -M mac99 with KVM on 64k page size hosts.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Nikunj A Dadhania
ef9514431d spapr: add uuid/host details to device tree
Useful for identifying the guest/host uniquely within the
guest. Adding following properties to the guest root node.

vm,uuid - uuid of the guest
host-model - Host model number
host-serial - Host machine serial number
hypervisor type - Tells its "kvm"

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Peter Maydell
7d0cd464a7 hw/ppc/spapr_hcall.c: Fix typo in function names
Fix a typo in the names of a couple of functions
(s/resouce/resource/).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Tom Musta
145855801a linux-user: Handle PPC64 ELFv2 Function Pointers
Function pointers in the 64-bit ELFv2 PowerPC ABI are actual (internal)
entry point addresses.  However, when invoking a function via a function
pointer, GPR 12 must also be set to this address so that the TOC may be
handled properly.

Add this support to the invocation of a signal handler.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
19774ec5c4 linux-user: Implement do_setcontext for PPC64
Eliminate the stub for the do_setcontext() function for TARGET_PPC64.  The
implementation re-uses the existing TARGET_PPC32 code with the only change
being the computation of the address of the register save area.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
8d6ab333eb linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
Properly dereference 64-bit PPC ELF V1 ABIT function pointers to signal handlers.
On this platform, function pointers are pointers to structures and the first 64
bits of such a structure contains the function's entry point.  The second 64 bits
contains the TOC pointer, which must be placed into GPR 2.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
61e75fecef linux-user: Enable Signal Handlers on PPC64
Enable the 64-bit PowerPC signal handling code that was previously
disabled via #ifdefs.  Specifically:

  - Move the target_mcontext (register save area) structure and
    append it to the 64-bit target_sigcontext structure.  This
    provides the space on the stack for saving and restoring
    context.
  - Define the target_rt_sigframe for 64-bit.
  - Adjust the setup_frame and setup_rt_frame routines to properly
    select the target_mcontext area and trampoline within the stack
    frame; tthis is different for 32-bit and 64-bit implementations.
  - Adjust the do_setcontext stub for 64-bit so that it compiles
    without warnings.

The 64-bit signal handling code is still not functional after this
change; but the 32-bit code is.  Subsequent changes will address
specific issues with the 64-bit code.

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: fix build on 32bit hosts, ppc64abi32]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
7678108b13 linux-user: Split PPC Trampoline Encoding from Register Save
Split the encoding of the PowerPC sigreturn trampoline from the saving of
register state onto the signal handler stack.  This will make it easier
in subsequent patches to deal with variations in the stack frame layouts between
32 and 64 bit PowerPC.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Tom Musta
fbdc200ac2 linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
The code that sets the stack frame back pointer is incorrect for
the setup_rt_frame() code; qemu will abort (SIGSEGV) in some
environments.  The setup_frame code  was fixed in commit
beb526b121 but the setup_rt_frame
code was not.

Make the setup_rt_frame code consistent with the setup_frame
code.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:46 +02:00
Nikunj A Dadhania
2e14072f9e ppc: spapr-rtas - implement os-term rtas call
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.

Linux kernel calls ibm,os-term when extended property of os-term is set.
This makes sure that a return to the linux kernel is gauranteed.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[agraf: reduce RTAS_TOKEN_MAX]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
Alexander Graf
277c7a4d71 PPC: KVM: Fix g3beige and mac99 when HV is loaded
On PPC we have 2 different styles of KVM: PR and HV. HV can only virtualize
sPAPR guests while PR can virtualize everything that's reasonably close to
the host hardware platform.

As long as only one kernel module (PR or HV) is loaded, the "default" kvm type
is the module that's loaded. So if your hardware only supports PR mode you can
easily spawn a Mac VM.

However, if both HV and PR are loaded we default to HV mode. And in that case
the Mac machines have to explicitly ask for PR mode to get a working VM.

Fix this up by explicitly having the Mac machines ask for PR style KVM. This
fixes bootup of Mac VMs on systems where bot HV and PR kvm modules are loaded
for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
John Snow
01ce352e62 ide: Add resize callback to ide/core
Currently, if the block device backing the IDE drive is resized,
the information about the device as cached inside of the IDEState
structure is not updated, thus when a guest OS re-queries the drive,
it is unable to see the expanded size.

This patch adds a resize callback that updates the IDENTIFY data
buffer in order to correct this.

Lastly, a Linux guest as-is cannot resize a libata drive while in-use,
but it can see the expanded size as part of a bus rescan event.
This patch also allows guests such as Linux to see the new drive size
after a soft reboot event, without having to exit the QEMU process.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
4bf6637d35 IDE: Fill the IDENTIFY request consistently
IDE-HD, IDE-ATAPI and IDE-CFATA all fill the
identify buffer in slightly different ways,
this is a relatively minor patch to make them
uniform, to emphasize that:

(1) We build the s->identify_data cache first, then
(2) We copy it to s->io_buffer to fulfill the request.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
Stefan Hajnoczi
b6b1d31f09 vmdk: fix buf leak in vmdk_parse_extents()
vmdk_open_sparse() does not take ownership of buf so the caller always
needs to free it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2014-09-08 11:12:44 +01:00
Stefan Hajnoczi
ff74f33c31 vmdk: fix vmdk_parse_extents() extent_file leaks
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
c5fe97e359 ide: Add wwn support to IDE-ATAPI drive
Although it is possible to specify the wwn
property for cdrom devices on the command line,
the underlying driver fails to relay this information
to the guest operating system via IDENTIFY.

This is a simple patch to correct that.

See ATA8-ACS, Table 22 parts 5, 6, and 9.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
0142f88bff qtest/ide: Uninitialize PC allocator
Use the new call to pc_alloc_uninit
as a test for the new pathways.

The leak checking / assert pathways are
not enabled in this patch, leaving this
as an option to future test writers.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
John Snow
ec2f160538 libqos: add a simple first-fit memory allocator
Implement a simple first-fit memory allocator that
attempts to keep track of leased blocks of memory
in order to be able to re-use blocks.

Additionally, allow the user to specify when
initializing the device that upon cleanup,
we would like to assert that there are no
blocks in use. This may be useful for identifying
problems in qtests that use more complicated
set-up and tear-down routines.

This functionality is used in my upcoming ahci-test v2
patch set, but I didn't see fit to enable it for any
existing tests, which will continue to operate the
same as they have prior.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
MORITA Kazutaka
53b33231f7 MAINTAINERS: update sheepdog maintainer
Hitoshi takes over sheepdog maintenance from me.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Peter Lieven
713cc671f1 qemu-nbd: fix indentation and coding style
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Peter Lieven
b3838a4088 qemu-nbd: add option to set detect-zeroes mode
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Peter Lieven
9e7dac7c6c rename parse_enum_option to qapi_enum_parse and make it public
relaxing the license to LGPLv2+ is intentional.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Chrysostomos Nanakos
072f9ac44a block/archipelago: Use QEMU atomic builtins
Replace __sync builtins with ones provided by QEMU
for atomic operations.

Special thanks goes to Paolo Bonzini for his refactoring
suggestion in order to use the already existing atomic builtins
interface.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Stefan Hajnoczi
3ba6796d08 qemu-img: fix rebase src_cache option documentation
The src_cache option (-T) specifies the cache mode for backing files.
It applies both the image's old backing file as well as the new backing
file:

  ret = bdrv_open(&bs_old_backing, backing_name, NULL, NULL, src_flags,
                  old_backing_drv, &local_err);
  if (ret) {
      ...
  }
  if (out_baseimg[0]) {
      bs_new_backing = bdrv_new("new_backing", &error_abort);
      ret = bdrv_open(&bs_new_backing, out_baseimg, NULL, NULL, src_flags,
                      new_backing_drv, &local_err);
      if (ret) {
          ...
      }
  }

The documentation only mentions the new backing file but it really
applies to both.

Suggested-by: Jeff Nelson <jenelson@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-09-08 11:12:43 +01:00
Stefan Hajnoczi
bb87fdf871 qemu-img: clarify src_cache option documentation
The source cache option takes the same values as the cache option.  The
documentation reads a little strange because it starts with "In contrast
the src_cache option ...".  The fact that this is comparing with the
previous documented option (the 'cache' option) is implicit.  Readers
may be confused, especially if they jump to src_cache without reading
cache documentation first.

Suggested-by: Jeff Nelson <jenelson@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
1053587c3f libqos: Added EVENT_IDX support
Added avail_event and NO_NOTIFY check before notifying.
Added used_event setting.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
5836811398 libqos: Added MSI-X support
Added MSI-X support for qtest PCI.
Added MSI-X support for virtio-pci.
Added MSI-X test case in virtio-blk-test.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
e11199554c libqos: Added test case for configuration changes in virtio-blk test
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
f294b029aa libqos: Added indirect descriptor support to virtio implementation
Add functions necessary for working with indirect descriptors.
Add test using new functions.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
bf3c63d201 libqos: Added basic virtqueue support to virtio implementation
Add status changing and feature negotiation.
Add basic virtqueue support for adding and sending virtqueue requests.
Add ISR checking.

[Squashed request endianness fix by Greg Kurz <gkurz@linux.vnet.ibm.com>
--Stefan]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
46e0cf7629 tests: Add virtio device initialization
Add functions to read and write virtio header fields.
Add status bit setting in virtio-blk-device.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Marc Marí
311e666aea tests: Functions bus_foreach and device_find from libqos virtio API
Virtio header has been changed to compile and work with a real device.
Functions bus_foreach and device_find have been implemented for PCI.
Virtio-blk test case now opens a fake device.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Laszlo Ersek
4c0cfc72b3 pflash_cfi01: write flash contents to bdrv on incoming migration
A drive that backs a pflash device is special:
- it is very small,
- its entire contents are kept in a RAMBlock at all times, covering the
  guest-phys address range that provides the guest's view of the emulated
  flash chip.

The pflash device model keeps the drive (the host-side file) and the
guest-visible flash contents in sync. When migrating the guest, the
guest-visible flash contents (the RAMBlock) is migrated by default, but on
the target host, the drive (the host-side file) remains in full sync with
the RAMBlock only if:
- the source and target hosts share the storage underlying the pflash
  drive,
- or the migration requests full or incremental block migration too, which
  then covers all drives.

Due to the special nature of pflash drives, the following scenario makes
sense as well:
- no full nor incremental block migration, covering all drives, alongside
  the base migration (justified eg. by shared storage for "normal" (big)
  drives),
- non-shared storage for pflash drives.

In this case, currently only those portions of the flash drive are updated
on the target disk that the guest reprograms while running on the target
host.

In order to restore accord, dump the entire flash contents to the bdrv in
a post_load() callback.

- The read-only check follows the other call-sites of pflash_update();
- both "pfl->ro" and pflash_update() reflect / consider the case when
  "pfl->bs" is NULL;
- the total size of the flash device is calculated as in
  pflash_cfi01_realize().

When using shared storage, or requesting full or incremental block
migration along with the normal migration, the patch should incur a
harmless rewrite from the target side.

It is assumed that, on the target host, RAM is loaded ahead of the call to
pflash_post_load().

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:43 +01:00
Laszlo Ersek
afeb25f926 pflash_cfi01: fixup stale DPRINTF() calls
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:42 +01:00
Liu Yuan
6bb4515849 block: kill tail whitespace in block.c
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:42 +01:00
Peter Maydell
f102f22455 Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-peter' into staging
QOM CPUState and X86CPU

* Include exception state in CPU VMState
* Fix -cpu *,migratable=foo
* Error out on unknown -cpu *,+foo,-bar

# gpg: Signature made Fri 05 Sep 2014 15:38:14 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-cpu-for-peter:
  target-i386: Reject invalid CPU feature names on the command-line
  target-i386: Support migratable=no properly
  exec: Save CPUState::exception_index field

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-05 16:03:56 +01:00
Eduardo Habkost
c00c94abbd target-i386: Reject invalid CPU feature names on the command-line
Instead of simply printing a warning, report an error when invalid CPU
options are provided on the CPU model string.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-05 16:37:07 +02:00
Eduardo Habkost
4d1b279b06 target-i386: Support migratable=no properly
When the "migratable" property was implemented, the behavior was tested
by changing the default on the code, but actually using the option on
the command-line (e.g. "-cpu host,migratable=false") doesn't work as
expected. This is a regression for a common use case of "-cpu host",
which is to enable features that are supported by the host CPU + kernel
before feature-specific code is added to QEMU.

Fix this by initializing the feature words for "-cpu host" on
x86_cpu_parse_featurestr(), right after parsing the CPU options.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-05 16:37:06 +02:00
Pavel Dovgaluk
6c3bff0ed8 exec: Save CPUState::exception_index field
This patch adds a subsection with exception_index field to the VMState for
correct saving the CPU state.
Without this patch, simulator could miss the pending exception in the saved
virtual machine state.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-05 16:32:48 +02:00
Benjamin Herrenschmidt
77bfcf28f1 console: Remove unused QEMU_BIG_ENDIAN_FLAG
If we need to, we should use the pixman formats instead but for
now this is unused except in commented out code so take it out
to avoid further confusion about surface endianness.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 15:38:04 +02:00
Peter Maydell
20dbe65049 Merge remote-tracking branch 'remotes/kraxel/tags/pull-chardev-20140905-1' into staging
pty: Fix byte loss bug when connecting to pty

# gpg: Signature made Fri 05 Sep 2014 12:57:32 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-chardev-20140905-1:
  pty: Fix byte loss bug when connecting to pty

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-05 14:29:42 +01:00
Gerd Hoffmann
43c7d8bd44 console: add qemu_pixman_linebuf_copy
Helper function for copying data from linebuf to framebuffer using
pixman, possibly converting in case src and dst formats differ.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
4c38762fb5 console: add dpy_gfx_update_dirty
Calls dpy_gfx_update for all dirty scanlines. Works for
DisplaySurfaces backed by guest memory (i.e. the ones created
using qemu_create_displaysurface_guestmem).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
a77549b3ff console: add qemu_create_displaysurface_guestmem
This patch adds a qemu_create_displaysurface_guestmem helper function.
Works simliar to qemu_create_displaysurface_from, but accepts a
guest address instead of a host pointer and it handles
cpu_physical_memory_{map,unmap} for you.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
30f1e661b6 console: stop using PixelFormat
With this patch the qemu console core stops using PixelFormat and pixman
format codes side-by-side, pixman format code is the primary way to
specify the DisplaySurface format:

 * DisplaySurface stops carrying a PixelFormat field.
 * qemu_create_displaysurface_from() expects a pixman format now.

Functions to convert PixelFormat to pixman_format_code_t (and back)
exist for those who still use PixelFormat.   As PixelFormat allows
easy access to masks and shifts it will probably continue to exist.

[ xenfb added by Benjamin Herrenschmidt ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
56bd9ea1a3 console: reimplement qemu_default_pixelformat
Use the new qemu_pixelformat_from_pixman and qemu_default_pixman_format
functions to reimplement qemu_default_pixelformat
(qemu_different_endianness_pixelformat too).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
1527a25ec9 console: add qemu_default_pixman_format
Function returning the default pixman format for a given depth.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Gerd Hoffmann
a93a3af9ec console: add qemu_pixelformat_from_pixman
Function to convert pixman format codes to qemu PixelFormat.

[ Benjamin Herrenschmidt: fix BGRA+RGBA shifts ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:11 +02:00
Sebastian Tanase
cf7330c759 pty: Fix byte loss bug when connecting to pty
When trying to print data to the pty, we first check if it is connected.
If not, we try to reconnect, but we drop the pending data even if we
have successfully reconnected; this makes us lose the first byte of the very
first transmission.
This small fix addresses the issue by checking once more if the pty is connected
after having tried to reconnect.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-05 13:27:10 +02:00
Peter Maydell
5fd7fc8db9 Merge remote-tracking branch 'remotes/kraxel/tags/pull-cve-2014-3615-20140905-1' into staging
CVE-2014-3615: fix sanity checks in vbe (bochs dispi) and spice.

# gpg: Signature made Fri 05 Sep 2014 12:18:04 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-cve-2014-3615-20140905-1:
  spice: make sure we don't overflow ssd->buf
  vbe: rework sanity checks
  vbe: make bochs dispi interface return the correct memory size with qxl

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-05 12:26:33 +01:00
Gerd Hoffmann
ab9509ccea spice: make sure we don't overflow ssd->buf
Related spice-only bug.  We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card.  It's also used with qxl in vga mode.

When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer.  In theory the guest can write,
indirectly via spice-server.  The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.

Fix that by switching to dynamic allocation for the buffer.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-09-05 12:19:50 +02:00
Peter Maydell
fd884c0765 Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-peter' into staging
QOM infrastructure fixes and device conversions

* Cleanups for recursive device unrealization

# gpg: Signature made Thu 04 Sep 2014 18:17:35 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-peter:
  qdev: Add cleanup logic in device_set_realized() to avoid resource leak
  qdev: Use NULL instead of local_err for qbus_child unrealize
  qdev: Use error_abort instead of using local_err
  memory: Remove object_property_add_child_array()
  qom: Add automatic arrayification to object_property_add()
  machine: Clean up -machine handling
  qom: Make object_child_foreach() safe for objects removal

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 19:41:15 +01:00
Peter Maydell
bbb6a1e872 Merge remote-tracking branch 'remotes/kvaneesh/for-upstream' into staging
* remotes/kvaneesh/for-upstream:
  hw/9pfs: Don't return type from host in readdir on local 9p filesystem
  hw/9pfs: Use little-endian format for xattr values

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 18:34:28 +01:00
Gonglei
1d45a705fc qdev: Add cleanup logic in device_set_realized() to avoid resource leak
At present, this function doesn't have partial cleanup implemented,
which will cause resource leaks in some scenarios.

Example:

1. Assume that "dc->realize(dev, &local_err)" executes successful
   and local_err == NULL;
2. device hotplug in hotplug_handler_plug() executes but fails
   (it is prone to occur). Then local_err != NULL;
3. error_propagate(errp, local_err) and return. But the resources
   which have been allocated in dc->realize() will be leaked.
Simple backtrace:
  dc->realize()
   |->device_realize
            |->pci_qdev_init()
                |->do_pci_register_device()
                |->etc.

Add fuller cleanup logic which assures that function can
goto appropriate error label as local_err population is
detected at each relevant point.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 19:15:54 +02:00
Gonglei
cd4520adca qdev: Use NULL instead of local_err for qbus_child unrealize
Forcefully unrealize all children regardless of errors in earlier
iterations (if any). We should keep going with cleanup operation
rather than report an error immediately. Therefore store the first
child unrealization failure and propagate it at the end. We also
forcefully unregister vmsd and unrealize actual object, too.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 19:15:06 +02:00
Peter Maydell
8cf8c92e77 Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Net patches

# gpg: Signature made Thu 04 Sep 2014 17:32:44 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  virtio-net: purge outstanding packets when starting vhost
  net: complete all queued packets on VM stop
  net: invoke callback when purging queue
  virtio: don't call device on !vm_running
  virtio-net: don't run bh on vm stopped
  net: Forbid dealing with packets when VM is not running

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 17:39:07 +01:00
Michael S. Tsirkin
086abc1ccd virtio-net: purge outstanding packets when starting vhost
whenever we start vhost, virtio could have outstanding packets
queued, when they complete later we'll modify the ring
while vhost is processing it.

To prevent this, purge outstanding packets on vhost start.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
ca77d85e1d net: complete all queued packets on VM stop
This completes all packets, ensuring that callbacks
will not run when VM is stopped.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
07d8084624 net: invoke callback when purging queue
devices rely on packet callbacks eventually running,
but we violate this rule whenever we purge the queue.
To fix, invoke callbacks on all packets on purge.
Set length to 0, this way callers can detect that
this happened and re-queue if necessary.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
269bd822e7 virtio: don't call device on !vm_running
On vm stop, virtio changes vm_running state
too soon, so callbacks can get envoked with
vm_running = false;

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Michael S. Tsirkin
e8bcf84200 virtio-net: don't run bh on vm stopped
commit 783e770693
    virtio-net: stop/start bh when appropriate

is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 17:19:09 +01:00
Bastian Blank
840a1bf283 hw/9pfs: Don't return type from host in readdir on local 9p filesystem
When using mapped mode in 9pfs, readdir implementation
should not return file type in d_type from the host
readdir, instead, it should use the type stored in
the extended attributes.  Since d_type is optional
and reading ext attrs for every readdir is expensive,
it should be sufficient to just set d_type to DT_UNKNOWN,
so guest will know to look it up separately.

This is a -stable material.

Signed-off-by: Bastian Blank <waldi@debian.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-09-04 10:51:13 -05:00
Gonglei
d578029e71 qdev: Use error_abort instead of using local_err
This error can not happen normally. If it happens, it indicates
something very wrong, we should abort QEMU. Moreover, the
user can only refer to /machine/peripheral or /objects, not
/machine/unattached.

While at it, remove superfluous check about local_err.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Peter Crosthwaite
843ef73a69 memory: Remove object_property_add_child_array()
Obsoleted by automatic object_property_add() arrayification.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Peter Crosthwaite
339659041f qom: Add automatic arrayification to object_property_add()
If "[*]" is given as the last part of a QOM property name, treat that
as an array property. The added property is given the first available
name, replacing the * with a decimal number counting from 0.

First add with name "foo[*]" will be "foo[0]". Second "foo[1]" and so
on.

Callers may inspect the ObjectProperty * return value to see what
number the added property was given.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Andreas Färber
d2659e27e1 machine: Clean up -machine handling
Since commit c4090f8, -object options are no longer handled through
object_set_property(), so clean up -object leftovers by renaming the
function and dropping special-casing of qom-type and id properties.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
Alexey Kardashevskiy
8af734ca31 qom: Make object_child_foreach() safe for objects removal
Current object_child_foreach() uses QTAILQ_FOREACH() to walk
through children and that makes children removal from the callback
impossible.

This makes object_child_foreach() use QTAILQ_FOREACH_SAFE().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-09-04 16:14:47 +02:00
zhanghailiang
e1d64c084b net: Forbid dealing with packets when VM is not running
For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.

If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.

To avoid this, we forbid receiving packets in generic net code when
VM is not running.

Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-04 14:31:54 +01:00
Peter Maydell
01eb313907 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-09-03' into staging
trivial patches for 2014-09-03

# gpg: Signature made Wed 03 Sep 2014 06:53:42 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-09-03:
  slirp: Honour vlan/stack in hostfwd_remove commands
  hmp: fix MemdevList memory leak
  qom/object.c, hmp.c: fix string_output_get_string() memory leak
  query-memdev: fix potential memory leaks
  MAINTAINERS: Add VMWare devices maintainer
  device_tree.c: dump all err mesages with error_report
  device_tree.c: redirect load_device_tree err message to stderr
  scripts: Remove scripts/qtest
  Fix debug print warning
  curl: The macro that you have to uncomment to get debugging is DEBUG_CURL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 13:33:53 +01:00
Peter Maydell
b27e37d4ce Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features

A bunch of bugfixes - these will make sense for 2.1.1

Initial Intel IOMMU support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 03 Sep 2014 14:41:23 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
  vhost-scsi: init backend features earlier
  vhost_net: init acked_features to backend_features
  vhost_net: start/stop guest notifiers properly

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 12:20:41 +01:00
Peter Maydell
4771b02512 Revert "vhost_net: start/stop guest notifiers properly"
This reverts commit aad4dce934.

I accidentally merged the wrong version of a pull request
which had a buggy version of this patch. Reverting the
buggy version means we can then cleanly merge in the correct
pull with the corrected change.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-04 12:19:37 +01:00
Gerd Hoffmann
c1b886c45d vbe: rework sanity checks
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers.  Call that
unconditionally on every register write.  That way we should catch
everything, even changing one register affecting the valid range of
another register.

Some of the holes have been added by commit
e9c6149f6a.  Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.

Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.

Security impact:

(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source  ->  host memory leak.  Memory isn't leaked to
the guest but to the vnc client though.

(2) Qemu will segfault in case the memory range happens to include
unmapped areas  ->  Guest can DoS itself.

The guest can not modify host memory, so I don't think this can be used
by the guest to escape.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-09-04 08:23:14 +02:00
Gerd Hoffmann
54a85d4624 vbe: make bochs dispi interface return the correct memory size with qxl
VgaState->vram_size is the size of the pci bar.  In case of qxl not the
whole pci bar can be used as vga framebuffer.  Add a new variable
vbe_size to handle that case.  By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.

This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-09-04 08:22:48 +02:00
zhanghailiang
07b81ed937 acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.

The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.

If we use cluster destination model, the problem will be solved.

Note:
This flag was introduced later than ACPI v1.0 specification while QEMU
generates v1.0 tables only, but...

linux kernel ignores this flag, so patch has no influence on it.

Tested with Win[XPsp3|Srv2003EE|Srv2008DC|Srv2008R2|Srv2012R2], there
isn't BSODs and guests boot just fine. In cases guest doesn't support
cpu-hotplug, cpu becomes visible after reboot and in case the guest
supports cpu-hotplug, it works as expected with this patch.

Cc: qemu-stable@nongnu.org
Signed-off-by: huangzhichao <huangzhichao@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
2014-09-03 16:41:05 +03:00
Michael S. Tsirkin
3a1655fc53 vhost-scsi: init backend features earlier
As vhost core can use backend_features during init, clear it earlier to
avoid using uninitialized memory.
This use would be harmless since vhost scsi ignores the result
anyway, but initializing earlier will help prevent valgrind errors,
and make scsi and net behave similarly.

Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-03 16:41:05 +03:00
Jason Wang
b49ae9138d vhost_net: init acked_features to backend_features
commit 2e6d46d77e (vhost: add
vhost_get_features and vhost_ack_features) removes the step that
initializes the acked_features to backend_features.

As this field is now uninitialized, vhost initialization will sometimes
fail.

To fix, initialize acked_features on each ack.

Tested-by: Andrey Korolyov <andrey@xdel.ru>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-03 16:41:05 +03:00
Jason Wang
cd7d1d26b0 vhost_net: start/stop guest notifiers properly
commit a9f98bb5eb "vhost: multiqueue
support" changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.

To fix this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.

MST: fix up error handling.

Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <william@gandi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-03 16:40:44 +03:00
Aneesh Kumar K.V
f8ad4a89e9 hw/9pfs: Use little-endian format for xattr values
With security_model=mapped-xattr, we encode the uid,gid and other file
attributes as extended attributes of the file. We save them under
user.virtfs.* namespace.

Use little-endian encoding for on-disk values. This enables us to export
the same directory from both little-endian and big-endian hosts.

NOTE: This will break big-endian host that have virtFS exports
using security model mapped-xattr. They will have to use external tools
to convert the xattr to little-endian format.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-09-02 16:02:33 -05:00
Peter Maydell
70381662aa slirp: Honour vlan/stack in hostfwd_remove commands
The hostfwd_add and hostfwd_remove monitor commands allow the user
to optionally specify a vlan/stack tuple. hostfwd_add honours this,
but hostfwd_remove does not (it looks up the tuple but then ignores
the SlirpState it has looked up and always uses the first stack
in the list anyway). Correct this to honour what the user requested.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Chen Fan
ecaf54a052 hmp: fix MemdevList memory leak
the memdev_list in hmp_info_memdev() is never freed.
so we use existent method qapi_free_MemdevList() to free it.
and also we can use qapi_free_MemdevList() to replace list loops
to clean up the memdev list in error path.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Chen Fan
976620ac40 qom/object.c, hmp.c: fix string_output_get_string() memory leak
string_output_get_string() uses g_string_free(str, false) to
transfer the 'str' pointer to callers and never free it.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Chen Fan
b0e90181e4 query-memdev: fix potential memory leaks
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Dmitry Fleytman
622fb504c4 MAINTAINERS: Add VMWare devices maintainer
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Li Liu
508e221f2c device_tree.c: dump all err mesages with error_report
Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Li Liu
db013f81b2 device_tree.c: redirect load_device_tree err message to stderr
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Fam Zheng
7d2ff422ca scripts: Remove scripts/qtest
This is a dummy file with no user, drop it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Gonglei
c5539cb426 Fix debug print warning
Steps:

1.enable qemu debug print, using simply scprit as below:
 grep "//#define DEBUG" * -rl | xargs sed -i "s/\/\/#define DEBUG/#define DEBUG/g"
2. make -j
3. get some warning:
hw/i2c/pm_smbus.c: In function 'smb_ioport_writeb':
hw/i2c/pm_smbus.c:142: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/i2c/pm_smbus.c:142: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/i2c/pm_smbus.c: In function 'smb_ioport_readb':
hw/i2c/pm_smbus.c:209: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/intc/i8259.c: In function 'pic_ioport_read':
hw/intc/i8259.c:373: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/input/pckbd.c: In function 'kbd_write_command':
hw/input/pckbd.c:232: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
hw/input/pckbd.c: In function 'kbd_write_data':
hw/input/pckbd.c:333: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_writeb':
hw/isa/apm.c:44: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/isa/apm.c:44: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_readb':
hw/isa/apm.c:67: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/timer/mc146818rtc.c: In function 'cmos_ioport_write':
hw/timer/mc146818rtc.c:394: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/i386/pc.c: In function 'port92_write':
hw/i386/pc.c:479: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'

Fix them.

Cc: qemu-trivial@nongnu.org
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Richard W.M. Jones
41c2346716 curl: The macro that you have to uncomment to get debugging is DEBUG_CURL.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-02 22:38:16 +04:00
Peter Maydell
f2426947de Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features

A bunch of bugfixes - these will make sense for 2.1.1

Initial Intel IOMMU support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 02 Sep 2014 16:05:04 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  vhost_net: start/stop guest notifiers properly
  pci: avoid losing config updates to MSI/MSIX cap regs
  virtio-net: don't run bh on vm stopped
  ioh3420: remove unused ioh3420_init() declaration
  vhost_net: cleanup start/stop condition
  intel-iommu: add IOTLB using hash table
  intel-iommu: add context-cache to cache context-entry
  intel-iommu: add supports for queued invalidation interface
  intel-iommu: fix coding style issues around in q35.c and machine.c
  intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
  intel-iommu: add DMAR table to ACPI tables
  intel-iommu: introduce Intel IOMMU (VT-d) emulation
  iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-02 16:07:31 +01:00
Jason Wang
aad4dce934 vhost_net: start/stop guest notifiers properly
commit a9f98bb5eb vhost: multiqueue
support changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.

To adapt this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is used to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.

Cc: qemu-stable@nongnu.org
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:33:37 +03:00
Knut Omang
d7efb7e08e pci: avoid losing config updates to MSI/MSIX cap regs
Since
commit 95d6580024
    msi: Invoke msi/msix_write_config from PCI core
msix config writes are lost, the value written is always 0.

Fix pci_default_write_config to avoid this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:28:26 +03:00
Michael S. Tsirkin
0187c7989a virtio-net: don't run bh on vm stopped
commit 783e770693
    virtio-net: stop/start bh when appropriate

is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:28:26 +03:00
Gonglei
fc8342f758 ioh3420: remove unused ioh3420_init() declaration
commit 0f9b1771cc
    ioh3420: Remove obsoleted, unused ioh3420_init function
removed the implementation of ioh3420_init

Drop the declaration from the header file as well.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-09-02 17:28:26 +03:00
Michael S. Tsirkin
2d2507ef23 vhost_net: cleanup start/stop condition
Checking vhost device internal state in vhost_net looks like
a layering violation since vhost_net does not
set this flag: it is set and tested by vhost.c.
There seems to be no reason to check this:
caller in virtio net uses its own flag,
vhost_started, to ensure vhost is started/stopped
as appropriate.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
2014-09-02 17:28:25 +03:00
Peter Maydell
30eaca3acd Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140902-1' into staging
sanity check for qxl, minor spice display channel tweak.

# gpg: Signature made Tue 02 Sep 2014 09:53:39 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140902-1:
  spice: use console index as display id
  qxl-render: add more sanity checks

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-02 10:26:10 +01:00
Xin Tong
88e89a57f9 implementing victim TLB for QEMU system emulated TLB
QEMU system mode page table walks are expensive. Taken by running QEMU
qemu-system-x86_64 system mode on Intel PIN , a TLB miss and walking a
4-level page tables in guest Linux OS takes ~450 X86 instructions on
average.

QEMU system mode TLB is implemented using a directly-mapped hashtable.
This structure suffers from conflict misses. Increasing the
associativity of the TLB may not be the solution to conflict misses as
all the ways may have to be walked in serial.

A victim TLB is a TLB used to hold translations evicted from the
primary TLB upon replacement. The victim TLB lies between the main TLB
and its refill path. Victim TLB is of greater associativity (fully
associative in this patch). It takes longer to lookup the victim TLB,
but its likely better than a full page table walk. The memory
translation path is changed as follows :

Before Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. TLB refill.
5. Do the memory access.
6. Return to code cache.

After Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. Victim TLB lookup.
5. If victim TLB misses, TLB refill
6. Do the memory access.
7. Return to code cache

The advantage is that victim TLB can offer more associativity to a
directly mapped TLB and thus potentially fewer page table walks while
still keeping the time taken to flush within reasonable limits.
However, placing a victim TLB before the refill path increase TLB
refill path as the victim TLB is consulted before the TLB refill. The
performance results demonstrate that the pros outweigh the cons.

some performance results taken on SPECINT2006 train
datasets and kernel boot and qemu configure script on an
Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz Linux machine are shown in the
Google Doc link below.

https://docs.google.com/spreadsheets/d/1eiItzekZwNQOal_h-5iJmC4tMDi051m9qidi5_nwvH4/edit?usp=sharing

In summary, victim TLB improves the performance of qemu-system-x86_64 by
11% on average on SPECINT2006, kernelboot and qemu configscript and with
highest improvement of in 26% in 456.hmmer. And victim TLB does not result
in any performance degradation in any of the measured benchmarks. Furthermore,
the implemented victim TLB is architecture independent and is expected to
benefit other architectures in QEMU as well.

Although there are measurement fluctuations, the performance
improvement is very significant and by no means in the range of
noises.

Signed-off-by: Xin Tong <trent.tong@gmail.com>
Message-id: 1407202523-23553-1-git-send-email-trent.tong@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 17:43:06 +01:00
Bastian Koppelmann
44ea34309e target-tricore: Add instructions of SR opcode format
Add instructions of SR opcode format.
Add micro-op generator functions for saturate.
Add helper return from exception (rfe).

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-16-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
5a7634a28c target-tricore: Add instructions of SLR, SSRO and SRO opcode format
Add instructions of SLR, SSRO and SRO opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-15-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
5de93515f9 target-tricore: Add instructions of SC opcode format
Add instructions of SC opcode format.
Add helper for begin interrupt service routine.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-14-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
a47b50db60 target-tricore: Add instructions of SBR opcode format
Add instructions of SBR opcode format.
Add gen_loop micro-op generator function.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-13-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
70b0226250 target-tricore: Add instructions of SBC and SBRN opcode format
Add instructions of SBC and SBRN opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-12-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
9a31922b08 target-tricore: Add instructions of SB opcode format
Add instructions of SB opcode format.
Add helper call/ret.
Add micro-op generator functions for branches.
Add makro to generate helper functions.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-11-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
d279821074 target-tricore: Add instructions of SRRS and SLRO opcode format
Add instructions of SSRS and SLRO opcode format.
Add micro-op generator functions for offset loads.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-10-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
46aa848ffb target-tricore: Add instructions of SSR opcode format
Add instructions of SSR opcode format.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-9-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
2692802a37 target-tricore: Add instructions of SRR opcode format
Add instructions of SRR opcode format.
Add helper for add/sub_ssov.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-8-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:21 +01:00
Bastian Koppelmann
0707ec1bea target-tricore: Add instructions of SRC opcode format
Add instructions of SRC opcode format.
Add micro-op generator functions for add, conditional add/sub and shi/shai.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-7-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
7c87d074d2 target-tricore: Add masks and opcodes for decoding
Add masks and opcodes for decoding TriCore instructions.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1409572800-4116-6-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
0aaeb118b3 target-tricore: Add initialization for translation and activate target
Add tcg and cpu model initialization.
Add gen_intermediate_code function.
Activate target in configure and add softmmu config.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-5-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
2d30267e8e target-tricore: Add softmmu support
Add basic softmmu support for TriCore

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-4-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
e2d0501103 target-tricore: Add board for systemmode
Add basic board to allow systemmode emulation

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-3-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Bastian Koppelmann
48e06fe0ed target-tricore: Add target stubs and qom-cpu
Add TriCore target stubs, and QOM cpu, and Maintainer

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-id: 1409572800-4116-2-git-send-email-kbastian@mail.uni-paderborn.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 14:49:20 +01:00
Peter Maydell
5cd1475d28 Merge remote-tracking branch 'remotes/borntraeger/tags/kvm-s390-20140901' into staging
s390x/kvm: Several updates/fixes/features

1. s390x/kvm: avoid synchronize_rcu's in kernel
----------------------------------------------
The first patches change s390x/kvm code to issue VCPU specific ioctls
from the VCPU thread. This will avoid unnecessary synchronize_rcu in
the kernel, which caused a noticably slowdown with many guest CPUs.
It speeds up all start/restart/reset operations involving cpus
drastically.

2. s390-ccw.img: block size and DASD format support
---------------------------------------------------
The second part changes the s390-ccw bios to IPL (boot)  more disk
formats than before. Furthermore a small fix is made to the console
output of the bios.

3. s390: Support for Hotplug of Standby Memory
----------------------------------------------
The third part adds support in s390 for a pool of standby memory,
which can be set online/offline by the guest (ie, via chmem).
The standby pool of memory is allocated as the difference between
the initial memory setting and the maxmem setting.
As part of this work, additional results are provided for the
Read SCP Information SCLP, and new implentation is added for the
Read Storage Element Information, Attach Storage Element,
Assign Storage and Unassign Storage SCLPs, which enables the s390
guest to manipulate the standby memory pool.

This patchset is based on work originally done by Jeng-Fang (Nick)
Wang.

Sample qemu command snippet:

qemu -machine s390-ccw-virtio  -m 1024M,maxmem=2048M,slots=32 -enable-kvm

This will allocate 1024M of active memory, and another 1024M
of standby memory.  Example output from s390-tools lsmem:
=============================================================================
0x0000000000000000-0x000000000fffffff        256  online   no         0-127
0x0000000010000000-0x000000001fffffff        256  online   yes        128-255
0x0000000020000000-0x000000003fffffff        512  online   no         256-511
0x0000000040000000-0x000000007fffffff       1024  offline  -          512-1023

Memory device size  : 2 MB
Memory block size   : 256 MB
Total online memory : 1024 MB
Total offline memory: 1024 MB

The guest can dynamically enable part or all of the standby pool
via the s390-tools chmem, for example:

chmem -e 512M

And can attempt to dynamically disable:

chmem -d 512M

4. s390x/gdb: various fixes
---------------------------
* Patch 1 fixes a bug where the cc was changed accidentally.
* Patch 2 adds the gdb feature XML files for s390x
* Patch 3 Define acr and fpr registers as coprocessor registers. This allows us
   to reuse the feature XML files.
* Patch 4 whitespace fixes

# gpg: Signature made Mon 01 Sep 2014 12:53:39 BST using RSA key ID B5A61C7C
# gpg: Can't check signature: public key not found

* remotes/borntraeger/tags/kvm-s390-20140901:
  s390x/gdb: coding style fixes
  s390x/gdb: generate target.xml and handle fp/ac as coprocessors
  s390x/gdb: add the feature xml files for s390x
  s390x/gdb: don't touch the cc if tcg is not enabled
  sclp-s390: Add memory hotplug SCLPs
  s390-virtio: Apply same memory boundaries as virtio-ccw
  virtio-ccw: Include standby memory when calculating storage increment
  sclp-s390: Add device to manage s390 memory hotplug
  pc-bios/s390-ccw.img binary update
  pc-bios/s390-ccw: Do proper console setup
  pc-bios/s390-ccw: IPL from DASD with format variations
  pc-bios/s390-ccw Really big EAV ECKD DASD handling
  pc-bios/s390-ccw Improve ECKD informational message
  pc-bios/s390-ccw: handle more ECKD DASD block sizes
  pc-bios/s390-ccw: support all virtio block size
  s390x/kvm: execute the first cpu reset on the vcpu thread
  s390x/kvm: execute "system reset" cpu resets on the vcpu thread
  s390x/kvm: execute sigp orders on the target vcpu thread
  s390x/kvm: run guest triggered resets on the target vcpu thread

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 13:57:46 +01:00
Gerd Hoffmann
cd56cc6b07 spice: use console index as display id
... instead of maintaining our own numbering.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-09-01 10:19:03 +02:00
Gerd Hoffmann
503b3b33fe qxl-render: add more sanity checks
Damn, the dirty rectangle values are signed integers.  So the checks
added by commit 788fbf042f are not good
enough, we also have to make sure they are not negative.

[ Note: There must be something broken in spice-server so we get
  negative values in the first place.  Bug opened:
  https://bugzilla.redhat.com/show_bug.cgi?id=1135372 ]

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2014-09-01 10:19:03 +02:00
David Hildenbrand
218829db23 s390x/gdb: coding style fixes
This patch cleanes up two coding style issues (missing whitespaces).

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
David Hildenbrand
73d510c9d3 s390x/gdb: generate target.xml and handle fp/ac as coprocessors
This patch reduces the core registers to the psw and the general purpose
registers. The fpc and ac registers are handled as coprocessors registers by gdb.
This allows to reuse the feature xml files taken from gdb without further
modification and is what other architectures do.

The target.xml is now generated and provided to the gdb client. Therefore, the
client doesn't have to guess which registers are available at which logical
register number.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
David Hildenbrand
6117afac34 s390x/gdb: add the feature xml files for s390x
This patch adds the relevant s390x feature xml files taken from gdb.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
David Hildenbrand
97fa52f097 s390x/gdb: don't touch the cc if tcg is not enabled
When reading/writing the psw mask, the condition code may only be touched if
running on tcg.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:45:19 +02:00
Matthew Rosato
1def6656b6 sclp-s390: Add memory hotplug SCLPs
Add memory information to read SCP info and add handlers for
Read Storage Element Information, Attach Storage Element,
Assign Storage and Unassign Storage.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Matthew Rosato
e7f1314f97 s390-virtio: Apply same memory boundaries as virtio-ccw
Although s390-virtio won't support memory hotplug, it should
enforce the same memory boundaries so that it can use shared codepaths
(like read_SCP_info).

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Matthew Rosato
b6fe01248e virtio-ccw: Include standby memory when calculating storage increment
When determining the memory increment size, use the maxmem size if
it was specified.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Matthew Rosato
0844df77fd sclp-s390: Add device to manage s390 memory hotplug
Add sclpMemoryHotplugDev to contain associated data structures, etc.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:25:32 +02:00
Eugene (jno) Dvurechenski
f360221988 pc-bios/s390-ccw.img binary update
Rebuild of s390-ccw.img containing these patches:

  pc-bios/s390-ccw: Do proper console setup
  pc-bios/s390-ccw: support all virtio block size
  pc-bios/s390-ccw: handle more ECKD DASD block sizes
  pc-bios/s390-ccw Improve ECKD informational message
  pc-bios/s390-ccw Really big EAV ECKD DASD handling
  pc-bios/s390-ccw: IPL from DASD with format variations

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Christian Borntraeger
1aa7f4c6aa pc-bios/s390-ccw: Do proper console setup
The final newline/return must happen before we reset the sclp via
diag 308.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
14f56a2e35 pc-bios/s390-ccw: IPL from DASD with format variations
There are two known cases of DASD format where signatures are
incomplete or absent:

1. result of <dasdfmt -d ldl -L ...> (ECKD_LDL_UNLABELED)
2. CDL with zero keys in IPL1 and IPL2 records

Now the code attempts to
1. find zIPL and use SCSI layout
2. find IPL1 and use CDL layout
3. find CMS1 and use LDL layout
3. find LNX1 and use LDL layout
4. find zIPL and use unlabeled LDL layout
5. find zIPL and use CDL layout
6. die
in this sequence.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
f04db28b86 pc-bios/s390-ccw Really big EAV ECKD DASD handling
For EAV ECKD DASD, the cylinder count will have the magic value
0xfffeU. Therefore, use the block number to test for valid eckd
addresses instead.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
b0885f7599 pc-bios/s390-ccw Improve ECKD informational message
Add block size display to ECKD scheme report.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
00a47e7e71 pc-bios/s390-ccw: handle more ECKD DASD block sizes
Using dasdfmt(8) to format a DASD allows to choose a block size.
There are four supported values: 512, 1024, 2048, and 4096 bytes
per block. Each block size leads to selection of new count of
sectors per track. The head count remains always the same: 15.

This empiric knowledge is used to detect ECKD DASD to IPL from.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Eugene (jno) Dvurechenski
92cb05574b pc-bios/s390-ccw: support all virtio block size
The block size value may be given "as is" OR as a base value and
a shift count (exponent). So, we have to use calculation to get
the proper number in the code.

The main expression reads as
        (blk_cfg.blk_size << blk_cfg.physical_block_exp)

E.g., various combinations between blk_size=1/physical_block_exp=12
and blk_size=4096/physical_block_exp=0 are valid for 4K blocks.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
159855f098 s390x/kvm: execute the first cpu reset on the vcpu thread
As all full cpu resets currently call into the kernel to do initial cpu reset,
let's run this reset (triggered by cpu_s390x_init()) on the proper vcpu thread.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
1fad8b3be3 s390x/kvm: execute "system reset" cpu resets on the vcpu thread
Let's execute resets triggered by qemu system resets on the target vcpu thread.
This will avoid synchronize_rcu's in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
6e6ad8db11 s390x/kvm: execute sigp orders on the target vcpu thread
All sigp orders that can result in ioctls on the target vcpu should be executed
on the associated vcpu thread.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
David Hildenbrand
85ca3371f6 s390x/kvm: run guest triggered resets on the target vcpu thread
Currently, load_normal_reset() and modified_clear_reset() as triggered
by a guest vcpu will initiate cpu resets on the current vcpu thread for
all cpus. The reset should happen on the individual vcpu thread
instead, so let's use run_on_cpu() for this.

This avoids calls to synchronize_rcu() in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-09-01 09:23:02 +02:00
Peter Maydell
988f463614 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 29 Aug 2014 17:25:58 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (35 commits)
  quorum: Fix leak of opts in quorum_open
  blkverify: Fix leak of opts in blkverify_open
  nfs: Fix leak of opts in nfs_file_open
  curl: Don't deref NULL pointer in call to aio_poll.
  curl: Allow a cookie or cookies to be sent with http/https requests.
  virtio-blk: allow drive_del with dataplane
  block: acquire AioContext in do_drive_del()
  linux-aio: avoid deadlock in nested aio_poll() calls
  qemu-iotests: add multiwrite test cases
  block: fix overlapping multiwrite requests
  nbd: Follow the BDS' AIO context
  block: Add AIO context notifiers
  nbd: Drop nbd_can_read()
  sheepdog: fix a core dump while do auto-reconnecting
  aio-win32: add support for sockets
  qemu-coroutine-io: fix for Win32
  AioContext: introduce aio_prepare
  aio-win32: add aio_set_dispatching optimization
  test-aio: test timers on Windows too
  AioContext: export and use aio_dispatch
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 18:40:04 +01:00
Fam Zheng
8df3abfcee quorum: Fix leak of opts in quorum_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 17:10:18 +01:00
Fam Zheng
3158593126 blkverify: Fix leak of opts in blkverify_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 17:10:18 +01:00
Fam Zheng
810f4f86b7 nfs: Fix leak of opts in nfs_file_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 17:10:18 +01:00
Richard W.M. Jones
a2f468e48f curl: Don't deref NULL pointer in call to aio_poll.
In commit 63f0f45f2e the following
mechanical change was made:

         if (!state) {
-            qemu_aio_wait();
+            aio_poll(state->s->aio_context, true);
         }

The new code now checks if state is NULL and then dereferences it
('state->s') which is obviously incorrect.

This commit replaces state->s->aio_context with
bdrv_get_aio_context(bs), fixing this problem.  The two other hunks
are concerned with getting the BlockDriverState pointer bs to where it
is needed.

The original bug causes a segfault when using libguestfs to access a
VMware vCenter Server and doing any kind of complex read-heavy
operations.  With this commit the segfault goes away.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:19:01 +01:00
Richard W.M. Jones
a94f83d94f curl: Allow a cookie or cookies to be sent with http/https requests.
In order to access VMware ESX efficiently, we need to send a session
cookie.  This patch is very simple and just allows you to send that
session cookie.  It punts on the question of how you get the session
cookie in the first place, but in practice you can just run a `curl'
command against the server and extract the cookie that way.

To use it, add file.cookie to the curl URL.  For example:

$ qemu-img info 'json: {
    "file.driver":"https",
    "file.url":"https://vcenter/folder/Windows%202003/Windows%202003-flat.vmdk?dcPath=Datacenter&dsName=datastore1",
    "file.sslverify":"off",
    "file.cookie":"vmware_soap_session=\"52a01262-bf93-ccce-d379-8dabb3e55560\""}'
image: [...]
file format: raw
virtual size: 8.0G (8589934592 bytes)
disk size: unavailable

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:11:14 +01:00
Stefan Hajnoczi
3255d1c21f virtio-blk: allow drive_del with dataplane
Now that drive_del acquires the AioContext we can safely allow deleting
the drive.  As with non-dataplane mode, all I/Os submitted by the guest
after drive_del will return EIO.

This patch makes hot unplug work with virtio-blk dataplane.  Previously
drive_del reported an error because the device was busy.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:01:48 +01:00
Stefan Hajnoczi
8ad4202bf6 block: acquire AioContext in do_drive_del()
Make drive_del safe for dataplane where another thread may be running
the BlockDriverState's AioContext.

Note the assumption that AioContext's lifetime exceeds DriveInfo and
BlockDriverState.  We release AioContext after DriveInfo and
BlockDriverState are potentially freed.

This is clearly safe with the global AioContext but also with -object
iothread and implicit iothreads created by -device
virtio-blk-pci,x-data-plane=on (their lifetime is tied to DeviceState,
not BlockDriverState).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 16:01:10 +01:00
Stefan Hajnoczi
2cdff7f620 linux-aio: avoid deadlock in nested aio_poll() calls
If two Linux AIO request completions are fetched in the same
io_getevents() call, QEMU will deadlock if request A's callback waits
for request B to complete using an aio_poll() loop.  This was reported
to happen with the mirror blockjob.

This patch moves completion processing into a BH and makes it resumable.
Nested event loops can resume completion processing so that request B
will complete and the deadlock will not occur.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Marcin Gibuła <m.gibula@beyond.pl>
Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Marcin Gibuła <m.gibula@beyond.pl>
2014-08-29 15:59:17 +01:00
Peter Maydell
8b3030114a Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140829' into staging
target-arm queue:
 * support PMCCNTR in ARMv8
 * various GIC fixes and cleanups
 * Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
 * Fix regression that disabled VFP for ARMv5 CPUs
 * Update to upstream VIXL 1.5

# gpg: Signature made Fri 29 Aug 2014 15:34:47 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140829:
  target-arm: Implement pmccfiltr_write function
  target-arm: Remove old code and replace with new functions
  target-arm: Implement pmccntr_sync function
  target-arm: Add arm_ccnt_enabled function
  target-arm: Implement PMCCNTR_EL0 and related registers
  arm: Implement PMCCNTR 32b read-modify-write
  target-arm: Make the ARM PMCCNTR register 64-bit
  hw/intc/arm_gic: honor target mask in gic_update()
  aarch64: raise max_cpus to 8
  arm_gic: Use GIC_NR_SGIS constant
  arm_gic: Do not force PPIs to edge-triggered mode
  arm_gic: GICD_ICFGR: Write model only for pre v1 GICs
  arm_gic: Fix read of GICD_ICFGR
  target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
  target-arm: Fix regression that disabled VFP for ARMv5 CPUs
  disas/libvixl: Update to upstream VIXL 1.5

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:48:15 +01:00
Alistair Francis
0614601cec target-arm: Implement pmccfiltr_write function
This is the function that is called when writing to the
PMCCFILTR_EL0 register

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 73da3da6404855b17d5ae82975a32ff3a4dcae3d.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:30 +01:00
Alistair Francis
942a155b20 target-arm: Remove old code and replace with new functions
Remove the old PMCCNTR code and replace it with calls to the new
pmccntr_sync() and arm_ccnt_enabled() functions.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 693a6e437d915c2195fd3dc7303f384ca538b7bf.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:30 +01:00
Alistair Francis
ec7b4ce4c7 target-arm: Implement pmccntr_sync function
This is used to synchronise the PMCCNTR counter and swap its
state between enabled and disabled if required. It must always
be called twice, both before and after any logic that could
change the state of the PMCCNTR counter.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 62811d4c0f7b1384f7aab62ea2fcfda3dcb0db50.1409025949.git.peter.crosthwaite@xilinx.com
[PMM: fixed minor typos in pmccntr_sync doc comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Alistair Francis
87124fdea4 target-arm: Add arm_ccnt_enabled function
Include a helper function to determine if the CCNT counter
is enabled.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: e1a64f17a756e06c8bda8238ad4826d705049f7a.1409025949.git.peter.crosthwaite@xilinx.com
[ PC changes
  * Remove EL based checks
]
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Alistair Francis
8521466b39 target-arm: Implement PMCCNTR_EL0 and related registers
This patch adds support for the ARMv8 version of the PMCCNTR and
related registers. It also starts to implement the PMCCFILTR_EL0
register.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: b5d1094764a5416363ee95216799b394ecd011e8.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Peter Crosthwaite
421c7ebd93 arm: Implement PMCCNTR 32b read-modify-write
The register is now 64bit, however a 32 bit write to the register
should leave the higher bits unchanged. The open coded write handler
does not implement this, so we need to read-modify-write accordingly.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alistair Francis <alistair23@gmail.com>
Message-id: ec350573424bb2adc1701c3b9278d26598e2f2d1.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Alistair Francis
c92c06872a target-arm: Make the ARM PMCCNTR register 64-bit
This makes the PMCCNTR register 64-bit to allow for the
64-bit ARMv8 version.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 6c5bac5fd0ea54963b1fc0e7f9464909f2e19a73.1409025949.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Sergey Fedorov
b52b81e44f hw/intc/arm_gic: honor target mask in gic_update()
Take IRQ target mask into account when determining the highest priority
pending interrupt.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1407947471-26981-1-git-send-email-serge.fdrv@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Joel Schopp
d3579f362f aarch64: raise max_cpus to 8
I'm running on a system with 8 cpus and it would be nice to have qemu
support all of them.  The attached patch does that and has been tested.

That said, I'm not sure if 8 is enough or if we want to bump this even higher
now before systems with many more cpus come along. 255 anyone?

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Message-id: 20140819213304.19537.2834.stgit@joelaarch64.amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Adam Lackorzynski
93b5f6f1a6 arm_gic: Use GIC_NR_SGIS constant
Use constant rather than a plain number.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-5-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:29 +01:00
Adam Lackorzynski
de7a900f0c arm_gic: Do not force PPIs to edge-triggered mode
Only SGIs must be WI, done by forcing them to their default
(edge-triggered).

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-4-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Adam Lackorzynski
24b790df43 arm_gic: GICD_ICFGR: Write model only for pre v1 GICs
Setting the model is only available in pre-v1 GIC models.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-3-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Adam Lackorzynski
71a62046ae arm_gic: Fix read of GICD_ICFGR
The GICD_ICFGR register covers 4 interrupts per byte.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Message-id: 1408372255-12358-2-git-send-email-adam@os.inf.tu-dresden.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Peter Maydell
c379621451 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register values
We implement the crypto extensions but were incorrectly reporting
ID register values for the Cortex-A57 which did not advertise
crypto. Use the correct values as described in the TRM.
With this fix Linux correctly detects presence of the crypto
features and advertises them in /proc/cpuinfo.

Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408718660-7295-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Peter Maydell
ed1f13d607 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
Commit 2c7ffc414 added support for honouring the CPACR coprocessor
access control register bits which may disable access to VFP
and Neon instructions. However it failed to account for the
fact that the CPACR is only present starting from the ARMv6
architecture version, so it accidentally disabled VFP completely
for ARMv5 CPUs like the ARM926. Linux would detect this as
"no VFP present" and probably fall back to its own emulation,
but other guest OSes might crash or misbehave.

This fixes bug LP:1359930.

Reported-by: Jakub Jermar <jakub@jermar.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1408714940-7192-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 15:00:28 +01:00
Peter Maydell
508280f566 disas/libvixl: Update to upstream VIXL 1.5
Update our copy of libvixl to upstream's 1.5 release.
This includes the upstream versions of the fixes we
were carrying locally (commit ffebe899).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1407162987-4659-1-git-send-email-peter.maydell@linaro.org
2014-08-29 15:00:27 +01:00
Stefan Hajnoczi
12ade76090 qemu-iotests: add multiwrite test cases
This test case covers the basic bdrv_aio_multiwrite() scenarios:
1. Single request
2. Sequential requests (AABB)
3. Superset overlapping requests (AABBAA)
4. Subset overlapping requests (BBAABB)
5. Head overlapping requests (AABB)
6. Tail overlapping requests (BBAA)
7. Disjoint requests (AA BB)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-29 14:10:15 +01:00
Stefan Hajnoczi
391827eb10 block: fix overlapping multiwrite requests
When request A is a strict superset of request B:

  AAAAAAAA
    BBBB

multiwrite_merge() merges them as follows:

  AABBBB

The tail of request A should have been included:

  AABBBBAA

This patch fixes data loss but this code path is probably rare.  Since
guests cannot assume ordering between in-flight requests, few
applications submit overlapping write requests.

Reported-by: Slava Pestov <sviatoslav.pestov@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-29 14:09:43 +01:00
Peter Maydell
d9aa688557 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140829-1' into staging
usb: bugfix collection.
usb: add cleanup functions for host adapters,
     in preparation for hotplug support.
usb: add simple qtests for uhci,ohci,xhci.

# gpg: Signature made Fri 29 Aug 2014 12:56:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140829-1:
  tests: add xHCI qtest
  tests: add UHCI qtest
  tests: add OHCI qtest
  usb: add usb host adapters exit trace
  usb-xhci: add exit function
  usb-ehci: add ehci-pci device exit function
  usb-ehci: add ehci unrealize funciton
  usb-ehci: add vmstate properity for EHCIState
  usb-uhci: clean up uhci resource when pci-uhci exit
  usb-ohci: add exit function
  usb-ohci: Fix memory leak for ohci timer
  usb: add usb_bus_release function
  Revert "xhci: Fix number of streams allocated when using streams"
  xhci: use (1u << i)
  Fix OHCI ISO TD state never being written back.
  xhci: fix debug print compiling error
  usb: Fix bootindex for portnr > 9

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 13:08:04 +01:00
Gonglei
25e89ec5d2 tests: add xHCI qtest
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:53:47 +02:00
Gonglei
44ced58e3a tests: add UHCI qtest
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:53:47 +02:00
Gonglei
28edfce0f3 tests: add OHCI qtest
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:53:47 +02:00
Gonglei
d733f74c33 usb: add usb host adapters exit trace
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
53c30545fb usb-xhci: add exit function
clean up xhci resource when xhci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
96e14926c6 usb-ehci: add ehci-pci device exit function
clean up ehci resource when ehci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
4e130cf6a8 usb-ehci: add ehci unrealize funciton
cleanup ehci controller resource, both pci and sysbus
if they're necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
05a36991c5 usb-ehci: add vmstate properity for EHCIState
since hotunplug the ehci host adapter, we should
delete vm_change_state_handler also, so the
VMChangeStateEntry should be saved in EHCIState.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:14 +02:00
Gonglei
3a3464b000 usb-uhci: clean up uhci resource when pci-uhci exit
clean up uhci resource when uhci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:13 +02:00
Gonglei
07832c38d3 usb-ohci: add exit function
clean up ohci resource when ohci pci device exit.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:52:13 +02:00
Gonglei
80be63df5a usb-ohci: Fix memory leak for ohci timer
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:44 +02:00
Gonglei
e5a9bece9b usb: add usb_bus_release function
add global variables releasing logic when the usb buses
were removed or hot-unpluged.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:44 +02:00
Gerd Hoffmann
f90e160b50 Revert "xhci: Fix number of streams allocated when using streams"
This reverts commit d063c3112c.

"2 << x" is the same as "2 ^ (x + 1)", so the old code is correct.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:44 +02:00
Gerd Hoffmann
3d80365b55 xhci: use (1u << i)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-29 12:51:43 +02:00
Jack Un
cae7f29c47 Fix OHCI ISO TD state never being written back.
There appears to be typo in OHCI with isochronous transfers
resulting in isoch. transfer descriptor state never being written back.
The'put_words' function is in a OR statement hence it is never called.

Signed-off-by: Jack Un <jack.un@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:43 +02:00
Gonglei
8c244210d8 xhci: fix debug print compiling error
after commit 003e15a180
the DPRINTF will broke compiling, adjust its location.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:43 +02:00
Markus Armbruster
830cd54fca usb: Fix bootindex for portnr > 9
We identify devices by their Open Firmware device paths.  The encoding
of the host controller and hub port numbers is incorrect:
usb_get_fw_dev_path() formats them in decimal, while SeaBIOS uses
hexadecimal.  When some port number > 9, SeaBIOS will miss the
bootindex (lucky case), or apply it to another device (unlucky case).

The relevant spec[*] agrees with SeaBIOS (and OVMF, for that matter).
Change %d to %x.

Bug can bite only with host controllers or hubs sporting more than ten
ports.  I'm not aware of any.

[*] Open Firmware Recommended Practice: Universal Serial Bus,
Version 1, Section 3.2.1 Device Node Address Representation
http://www.openfirmware.org/1275/bindings/usb/usb-1_0.ps

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Note: xhci can be configured with up to 15 ports (default is 4 ports).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-29 12:51:43 +02:00
Max Reitz
f21492817b nbd: Follow the BDS' AIO context
Keep the NBD server always in the same AIO context as the exported BDS
by calling bdrv_add_aio_context_notifier() and implementing the required
callbacks.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:48:45 +01:00
Max Reitz
33384421b3 block: Add AIO context notifiers
If a long-running operation on a BDS wants to always remain in the same
AIO context, it somehow needs to keep track of the BDS changing its
context. This adds a function for registering callbacks on a BDS which
are called whenever the BDS is attached or detached from an AIO context.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:48:45 +01:00
Max Reitz
958c717df9 nbd: Drop nbd_can_read()
There is no variant of aio_set_fd_handler() like qemu_set_fd_handler2(),
so we cannot give a can_read() callback function. Instead, unregister
the nbd_read() function whenever we cannot read and re-register it as
soon as we can read again.

All this is hidden behind the functions nbd_set_handlers() (which
registers all handlers for the AIO context and file descriptor belonging
to the given client), nbd_unset_handlers() (which unregisters them) and
nbd_update_can_read() (which checks whether NBD can read for the given
client and acts accordingly).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:48:45 +01:00
Liu Yuan
a780dea045 sheepdog: fix a core dump while do auto-reconnecting
We should reinit local_err as NULL inside the while loop or g_free() will report
corrupption and abort the QEMU when sheepdog driver tries reconnecting.

This was broken in commit 356b4ca.

qemu-system-x86_64: failed to get the header, Resource temporarily unavailable
qemu-system-x86_64: Failed to connect to socket: Connection refused
qemu-system-x86_64: (null)
[xcb] Unknown sequence number while awaiting reply
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
qemu-system-x86_64: ../../src/xcb_io.c:298: poll_for_response: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Aborted (core dumped)

Cc: qemu-devel@nongnu.org
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
b493317d34 aio-win32: add support for sockets
Uses the same select/WSAEventSelect scheme as main-loop.c.
WSAEventSelect() is edge-triggered, so it cannot be used
directly, but it is still used as a way to exit from a
blocking g_poll().

Before g_poll() is called, we poll sockets with a non-blocking
select() to achieve the level-triggered semantics we require:
if a socket is ready, the g_poll() is made non-blocking too.

Based on a patch from Or Goshen.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
79d9b6566b qemu-coroutine-io: fix for Win32
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
a3462c6561 AioContext: introduce aio_prepare
This will be used to implement socket polling on Windows.
On Windows, select() and g_poll() are completely different;
sockets are polled with select() before calling g_poll,
and the g_poll must be nonblocking if select() says a
socket is ready.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
0a9dd1664a aio-win32: add aio_set_dispatching optimization
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
363285d4b3 test-aio: test timers on Windows too
Use EventNotifier instead of a pipe, which makes it trivial to test
timers on Windows.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
e4c7e2d12d AioContext: export and use aio_dispatch
So far, aio_poll's scheme was dispatch/poll/dispatch, where
the first dispatch phase was used only in the GSource case in
order to avoid a blocking poll.  Earlier patches changed it to
dispatch/prepare/poll/dispatch, where prepare is aio_compute_timeout.

By making aio_dispatch public, we can remove the first dispatch
phase altogether, so that both aio_poll and the GSource use the same
prepare/poll/dispatch scheme.

This patch breaks the invariant that aio_poll(..., true) will not block
the first time it returns false.  This used to be fundamental for
qemu_aio_flush's implementation as "while (qemu_aio_wait()) {}" but
no code in QEMU relies on this invariant anymore.  The return value
of aio_poll() is now comparable with that of g_main_context_iteration.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
3672fa5083 AioContext: run bottom halves after polling
Make the dispatching phase the same before blocking and afterwards.
The next patch will make aio_dispatch public and use it directly
for the GSource case, instead of aio_poll.  aio_poll can then be
simplified heavily.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
a398dea34c aio-win32: Factor out duplicate code into aio_dispatch_handlers
Later, the call to aio_dispatch will move int the GSource wrapper, while the
standalone case will still be call the component functions aio_bh_poll,
aio_dispatch_handlers and timerlistgroup_run_timers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
d397ec99bb aio-win32: Evaluate timers after handles
This is similar to what aio_poll does in the stand-alone case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Paolo Bonzini
845ca10dd0 AioContext: take bottom halves into account when computing aio_poll timeout
Right now, QEMU invokes aio_bh_poll before the "poll" phase
of aio_poll.  It is simpler to do it afterwards and skip the
"poll" phase altogether when the OS-dependent parts of AioContext
are invoked from GSource.  This way, AioContext behaves more
similarly when used as a GSource vs. when used as stand-alone.

As a start, take bottom halves into account when computing the
poll timeout.  If a bottom half is ready, do a non-blocking
poll.  As a side effect, this makes idle bottom halves work
with aio_poll; an improvement, but not really an important
one since they are deprecated.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Stefan Hajnoczi
3cbbe9fd1f blockdev: fix drive-mirror 'granularity' error message
Name the 'granularity' parameter and give its expected value range.
Previously the device name was mistakenly reported as the parameter
name.

Note that the error class is unchanged from ERROR_CLASS_GENERIC_ERROR.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-29 10:46:58 +01:00
Fam Zheng
0b9caf9b31 coroutine: Drop co_sleep_ns
block_job_sleep_ns is the only user. Since we are moving towards
AioContext aware code, it's better to use the explicit version and drop
the old one.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Liu Yuan
a9db86b223 block/quorum: add simple read pattern support
This patch adds single read pattern to quorum driver and quorum vote is default
pattern.

For now we do a quorum vote on all the reads, it is designed for unreliable
underlying storage such as non-redundant NFS to make sure data integrity at the
cost of the read performance.

For some use cases as following:

        VM
  --------------
  |            |
  v            v
  A            B

Both A and B has hardware raid storage to justify the data integrity on its own.
So it would help performance if we do a single read instead of on all the nodes.
Further, if we run VM on either of the storage node, we can make a local read
request for better performance.

This patch generalize the above 2 nodes case in the N nodes. That is,

vm -> write to all the N nodes, read just one of them. If single read fails, we
try to read next node in FIFO order specified by the startup command.

The 2 nodes case is very similar to DRBD[1] though lack of auto-sync
functionality in the single device/node failure for now. But compared with DRBD
we still have some advantages over it:

- Suppose we have 20 VMs running on one(assume A) of 2 nodes' DRBD backed
storage. And if A crashes, we need to restart all the VMs on node B. But for
practice case, we can't because B might not have enough resources to setup 20 VMs
at once. So if we run our 20 VMs with quorum driver, and scatter the replicated
images over the data center, we can very likely restart 20 VMs without any
resource problem.

After all, I think we can build a more powerful replicated image functionality
on quorum and block jobs(block mirror) to meet various High Availibility needs.

E.g, Enable single read pattern on 2 children,

-drive driver=quorum,children.0.file.filename=0.qcow2,\
children.1.file.filename=1.qcow2,read-pattern=fifo,vote-threshold=1

[1] http://en.wikipedia.org/wiki/Distributed_Replicated_Block_Device

[Dropped \n from an error_setg() error message
--Stefan]

Cc: Benoit Canet <benoit@irqsave.net>
Cc: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Liu Yuan
62c6031f96 qapi: add read-pattern enum for quorum
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Hitoshi Mitake
38890b246d sheepdog: improve error handling for a case of failed lock
Recently, sheepdog revived its VDI locking functionality. This patch
updates sheepdog driver of QEMU for this feature. It changes an error
code for a case of failed locking. -EBUSY is a suitable one.

Reported-by: Valerio Pachera <sirio81@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Hitoshi Mitake
1dbfafed7f sheepdog: adopting protocol update for VDI locking
The update is required for supporting iSCSI multipath. It doesn't
affect behavior of QEMU driver but adding a new field to vdi request
struct is required.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Stefan Hajnoczi
40ed35a3c4 qemu-img: always goto out in img_snapshot() error paths
The out label has the qemu_progress_end() and other cleanup calls.
Always goto out in error paths so the cleanup happens.  These error
paths now return 1 instead of -1.

Note that bdrv_unref(NULL) is safe.  We just need to initialize bs to
NULL at the top of the function.

We can now remove the obsolete bs_old_backing = NULL and bs_new_backing
= NULL for safe mode.  Originally it was necessary in commit 3e85c6fd
("qemu-img rebase") but became useless in commit c2abcce ("qemu-img:
avoid calling exit(1) to release resources properly") because the
variables are already initialized during declaration.

Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-08-29 10:46:57 +01:00
Stefan Hajnoczi
cbda016d94 qemu-img: fix img_compare() flags error path
If img_compare() fails to parse the cache flags the goto out3 code path
will call qemu_progress_end().  Make sure we actually call
qemu_progress_init() first.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-08-29 10:46:57 +01:00
Stefan Hajnoczi
a3981eb978 qemu-img: fix img_commit() error return value
The img_commit() return value is a process exit code.  Use 1 for failure
instead of -1.  The other failure paths in this function already use 1.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-08-29 10:46:57 +01:00
Daniel Henrique Barboza
212aefaa53 block.curl: adding 'timeout' option
The curl hardcoded timeout (5 seconds) sometimes is not long
enough depending on the remote server configuration and network
traffic. The user should be able to set how much long he is
willing to wait for the connection.

Adding a new option to set this timeout gives the user this
flexibility. The previous default timeout of 5 seconds will be
used if this option is not present.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Benoit Canet <benoit.canet@nodalink.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Markus Armbruster
28fa7133b8 ide: Fix bootindex for bus_id > 9
We identify devices by their Open Firmware device paths.  The encoding
of bus numbers is incorrect: idebus_get_fw_dev_path() formats them in
decimal, while SeaBIOS uses hexadecimal.  With bus number > 9, SeaBIOS
will miss the bootindex (lucky case), or apply it to another device
(unlucky case).

Bug can't bite right now: ich9-ahci has six ports, and the sysbus-ahci
created by Calxeda Highbank has just one.

Fix it anyway, by changing %d to %x.

I couldn't find an Open Firmware spec covering this.  For what it's
worth, OVMF agrees with SeaBIOS.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Le Tan
b5a280c008 intel-iommu: add IOTLB using hash table
Add IOTLB to cache information about the translation of input-addresses. IOTLB
use a GHashTable as cache. The key of the hash table is the logical-OR of gfn
and source id after left-shifting.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
d92fa2dc6e intel-iommu: add context-cache to cache context-entry
Add context-cache to cache context-entry encountered on a page-walk. Each
VTDAddressSpace has a member of VTDContextCacheEntry which represents an entry
in the context-cache. Since devices with different bus_num and devfn have their
respective VTDAddressSpace, this will be a good way to reference the cached
entries.
Each VTDContextCacheEntry will have a context_cache_gen and the cached entry
is valid only when context_cache_gen equals IntelIOMMUState.context_cache_gen.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
ed7b8fbcfb intel-iommu: add supports for queued invalidation interface
Add supports for queued invalidation interface, an expended invalidation
interface with extended capabilities.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
ac40aa1540 intel-iommu: fix coding style issues around in q35.c and machine.c
Fix coding style issues around in hw/pci-host/q35.c and hw/core/machine.c.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
a52a7fdfa7 intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
Add Intel IOMMU emulation to q35 chipset and expose it to the guest.
1. Add a machine option. Users can use "-machine iommu=on|off" in the command
line to enable/disable Intel IOMMU. The default is off.
2. Accroding to the machine option, q35 will initialize the Intel IOMMU and
use pci_setup_iommu() to setup q35_host_dma_iommu() as the IOMMU function for
the pci bus.
3. q35_host_dma_iommu() will return different address space according to the
bus_num and devfn of the device.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
d4eb911935 intel-iommu: add DMAR table to ACPI tables
Expose Intel IOMMU to the BIOS. If object of TYPE_INTEL_IOMMU_DEVICE exists,
add DMAR table to ACPI RSDT table. For now the DMAR table indicates that there
is only one hardware unit without INTR_REMAP capability on the platform.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
1da12ec4c8 intel-iommu: introduce Intel IOMMU (VT-d) emulation
Add support for emulating Intel IOMMU according to the VT-d specification for
the q35 chipset machine. Implement the logics for DMAR (DMA remapping) without
PASID support. The emulation supports register-based invalidation and primary
fault logging.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan
8d7b8cb9c2 iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps
Add a bool variable is_write as a parameter to the translate function of
MemoryRegionIOMMUOps to indicate the operation of the access. It can be
used for correct fault reporting from within the callback.
Change the interface of related functions.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Peter Maydell
a6aebb38ba Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI patches include bug fixes from Fam and Peter, improved error
reporting from Fam and a fix for DPRINTF bitrot.  Memory patches try
again to initialize name from the QOM name.

# gpg: Signature made Thu 28 Aug 2014 15:10:31 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  memory: Lazy init name from QOM name as needed
  xen: hvm: Abstract away memory region name ref
  xen-hvm: Constify string
  virtio-scsi: Report error if num_queues is 0 or too large
  scsi-generic: remove superfluous DPRINTF avoid to break compiling
  block/iscsi: fix memory corruption on iscsi resize
  scsi-bus: Convert DeviceClass init to realize
  block: Pass errp in blkconf_geometry

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 17:08:13 +01:00
Peter Maydell
38a01e55d2 Merge remote-tracking branch 'remotes/kvm/tags/for-upstream' into staging
Mostly bugfixes + Alexey's interface-based implementation
of the NMI monitor command.

# gpg: Signature made Thu 28 Aug 2014 15:07:22 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/kvm/tags/for-upstream:
  mc146818rtc: reinitialize irq_reinject_on_ack_count on reset
  target-i386: Add "tsc_adjust" CPU feature name
  target-i386: Add "mpx" CPU feature name
  vl: process -object after other backend options
  checkpatch.pl: adjust typedef definition to QEMU coding style
  x86: Clear MTRRs on vCPU reset
  x86: kvm: Add MTRR support for kvm_get|put_msrs()
  x86: Use common variable range MTRR counts
  target-i386: Don't forbid NX bit on PAE PDEs and PTEs
  spapr: Add support for new NMI interface
  s390x: Migrate to new NMI interface
  s390x: Convert QEMUMachine to MachineClass
  cpus: Define callback for QEMU "nmi" command
  kvm: run cpu state synchronization on target vcpu thread

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 16:07:23 +01:00
Peter Crosthwaite
d1dd32af6f memory: Lazy init name from QOM name as needed
To support name retrieval of MemoryRegions that were created
dynamically (that is, not via memory_region_init and friends). We
cache the name in MemoryRegion's state as
object_get_canonical_path_component mallocs the returned value
so it's not suitable for direct return to callers. Memory already
frees the name field, so this will be garbage collected along with
the MR object.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-28 16:09:44 +02:00
Peter Crosthwaite
3e1f50867b xen: hvm: Abstract away memory region name ref
The mr->name field is removed. This slipped through compile testing.
Fix.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-28 16:09:44 +02:00
Peter Crosthwaite
dc6c4fe837 xen-hvm: Constify string
It's constant, and sourced from existing const strings. Avoid dodgy
casts by converting to const.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-28 16:09:44 +02:00
Peter Maydell
795c050e37 Merge remote-tracking branch 'remotes/stefanha/tags/fix-buildbot-12082014-pull-request' into staging
Pull request

# gpg: Signature made Thu 28 Aug 2014 13:43:00 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/fix-buildbot-12082014-pull-request:
  Revert "qemu-img: sort block formats in help message"
  block: sort formats alphabetically in bdrv_iterate_format()
  mirror: fix uninitialized variable delay_ns warnings
  trace: avoid Python 2.5 all() in tracetool
  libqtest: launch QEMU with QEMU_AUDIO_DRV=none
  qapi.py: avoid Python 2.5+ any() function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 14:51:12 +01:00
Stefan Hajnoczi
00c6d403a3 Revert "qemu-img: sort block formats in help message"
This reverts commit 1a443c1b8b and the
later commit 395071a763.

GSequence was introduced in glib 2.14.  RHEL 5 fails to compile since it
uses glib 2.12.3.

Now that bdrv_iterate_format() invokes the iteration callback in sorted
order these commits are unnecessary.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
ada4240103 block: sort formats alphabetically in bdrv_iterate_format()
Format names are best consumed in alphabetical order.  This makes
human-readable output easy to produce.

bdrv_iterate_format() already has an array of format strings.  Sort them
before invoking the iteration callback.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
6d0de8eb21 mirror: fix uninitialized variable delay_ns warnings
The gcc 4.1.2 compiler warns that delay_ns may be uninitialized in
mirror_iteration().

There are two break statements in the do ... while loop that skip over
the delay_ns assignment.  These are probably the cause of the warning.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
73735f7218 trace: avoid Python 2.5 all() in tracetool
Red Hat Enterprise Linux 5 ships Python 2.4.3.  The all() function was
added in Python 2.5 so we cannot use it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
6b02921605 libqtest: launch QEMU with QEMU_AUDIO_DRV=none
No test case actually uses the audio backend.  Disable audio to prevent
warnings on hosts with no sound hardware present:

  GTESTER check-qtest-aarch64
  sdl: SDL_OpenAudio failed
  sdl: Reason: No available audio device
  sdl: SDL_OpenAudio failed
  sdl: Reason: No available audio device
  audio: Failed to create voice `lm4549.out'

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2014-08-28 13:42:25 +01:00
Stefan Hajnoczi
7ac9a9d6e1 qapi.py: avoid Python 2.5+ any() function
There is one instance of any() in qapi.py that breaks builds on older
distros that ship Python 2.4 (like RHEL5):

  GEN   qmp-commands.h
Traceback (most recent call last):
  File "build/scripts/qapi-commands.py", line 445, in ?
    exprs = parse_schema(input_file)
  File "build/scripts/qapi.py", line 329, in parse_schema
    schema = QAPISchema(open(input_file, "r"))
  File "build/scripts/qapi.py", line 110, in __init__
    if any(include_path == elem[1]
NameError: global name 'any' is not defined

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
2014-08-28 13:42:25 +01:00
Paolo Bonzini
172dbc52b3 mc146818rtc: reinitialize irq_reinject_on_ack_count on reset
This field was forgotten, and it makes the state after reset
non-deterministic.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-27 17:54:52 +02:00
Peter Maydell
0265361a72 Merge remote-tracking branch 'remotes/mcayland/qemu-openbios' into staging
* remotes/mcayland/qemu-openbios:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-26 14:18:40 +01:00
Eduardo Habkost
7b458bfd12 target-i386: Add "tsc_adjust" CPU feature name
tsc_adjust migration support is already implemented (commit
f28558d3d3), so we can add it to the list
of known feature names.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 14:52:43 +02:00
Eduardo Habkost
5bd8ff07e6 target-i386: Add "mpx" CPU feature name
Migration support for MPX is already implemented (commit
79e9ebebbf), so we can add it to the list
of known feature names.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 14:52:38 +02:00
Mark Cave-Ayland
d2a68f3a30 Update OpenBIOS images
Update OpenBIOS images to SVN r1316 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-26 13:52:15 +01:00
Paolo Bonzini
7b71758d79 vl: process -object after other backend options
QOM backends can refer to chardevs, but not vice versa.  So
process -chardev and -fsdev options before -object

This fixes the rng-egd backend to virtio-rng.

Reported-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:44:39 +02:00
Paolo Bonzini
a6859deb69 checkpatch.pl: adjust typedef definition to QEMU coding style
Most QEMU typedefs are camelcase, starting with one uppercase letter
and containing at least one lowercase letter.  There are a few
all-uppercase types, add the most common too.

This fixes recognition of types in lines such as

    static __attribute__((unused)) inline void tcg_out8(TCGContext *s, uint8_t v)

(Example provided by Peter Maydell).

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:44:28 +02:00
Fam Zheng
c9f6552803 virtio-scsi: Report error if num_queues is 0 or too large
No cmd vq surprises guest (Linux panics in virtscsi_probe), too many
queues abort qemu (in the following virtio_add_queue).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Gonglei
f93d2c15d6 scsi-generic: remove superfluous DPRINTF avoid to break compiling
variables lun and tag had been eliminated, break compiling
when enable debug switch. Meanwhile traces provide the same
information with this DPRINTF, so remove it.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Peter Lieven
9db693f764 block/iscsi: fix memory corruption on iscsi resize
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Fam Zheng
a818a4b69d scsi-bus: Convert DeviceClass init to realize
Replace "init/destroy" with "realize/unrealize" in SCSIDeviceClass,
which has errp as a parameter. So all the implementations now use
error_setg instead of error_report for reporting error.

Also in scsi_bus_legacy_handle_cmdline, report the error when
initializing the if=scsi devices, before returning it, because in the
callee, error_report is changed to error_setg. And the callers don't
have the right locations (e.g. "-drive if=scsi").

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Fam Zheng
5ff5efb46c block: Pass errp in blkconf_geometry
This allows us to pass error information to caller.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Peter Maydell
c47c61be8d Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140825.0' into staging
VFIO: Enable primary NVIDIA quirk regardless of VGA support

# gpg: Signature made Mon 25 Aug 2014 20:29:37 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140825.0:
  vfio: Enable NVIDIA 88000 region quirk regardless of VGA

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-26 10:42:06 +01:00
Alex Williamson
fe08275db9 vfio: Enable NVIDIA 88000 region quirk regardless of VGA
If we make use of OVMF for the BIOS then we can use GPUs without VGA
space access, but we still need this quirk.  Disassociate it from the
x-vga option and enable it on all NVIDIA VGA display class devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-25 12:10:15 -06:00
Peter Maydell
a44a12b78a Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features

A bunch of bugfixes - these will make sense for 2.1.1

ACPI support for TPM and partial ARI support for PCIE.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 24 Aug 2014 23:16:35 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pcie: fix trailing whitespace
  ioh3420: Enable ARI forwarding
  ioh3420: Remove obsoleted, unused ioh3420_init function
  pcie: Rename the pcie_cap_ari_* functions to pcie_cap_arifwd_*
  pcie: Fix incorrect write to the ari capability next function field
  ssdt-tpm: add generated hex file to git
  Add ACPI tables for TPM
  pc: reserve more memory for ACPI for new machine types
  pcihp: fix possible array out of bounds
  pci_bridge: manually destroy memory regions within PCIBridgeWindows
  hostmem: set MPOL_MF_MOVE

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-25 18:49:25 +01:00
Alex Williamson
9db2efd95e x86: Clear MTRRs on vCPU reset
The SDM specifies (June 2014 Vol3 11.11.5):

    On a hardware reset, the P6 and more recent processors clear the
    valid flags in variable-range MTRRs and clear the E flag in the
    IA32_MTRR_DEF_TYPE MSR to disable all MTRRs. All other bits in the
    MTRRs are undefined.

We currently do none of that, so whatever MTRR settings you had prior
to reset is what you have after reset.  Usually this doesn't matter
because KVM often ignores the guest mappings and uses write-back
anyway.  However, if you have an assigned device and an IOMMU that
allows NoSnoop for that device, KVM defers to the guest memory
mappings which are now stale after reset.  The result is that OVMF
rebooting on such a configuration takes a full minute to LZMA
decompress the firmware volume, a process that is nearly instant on
the initial boot.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
Alex Williamson
d1ae67f626 x86: kvm: Add MTRR support for kvm_get|put_msrs()
The MTRR state in KVM currently runs completely independent of the
QEMU state in CPUX86State.mtrr_*.  This means that on migration, the
target loses MTRR state from the source.  Generally that's ok though
because KVM ignores it and maps everything as write-back anyway.  The
exception to this rule is when we have an assigned device and an IOMMU
that doesn't promote NoSnoop transactions from that device to be cache
coherent.  In that case KVM trusts the guest mapping of memory as
configured in the MTRR.

This patch updates kvm_get|put_msrs() so that we retrieve the actual
vCPU MTRR settings and therefore keep CPUX86State synchronized for
migration.  kvm_put_msrs() is also used on vCPU reset and therefore
allows future modificaitons of MTRR state at reset to be realized.

Note that the entries array used by both functions was already
slightly undersized for holding every possible MSR, so this patch
increases it beyond the 28 new entries necessary for MTRR state.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
Alex Williamson
d8b5c67b05 x86: Use common variable range MTRR counts
We currently define the number of variable range MTRR registers as 8
in the CPUX86State structure and vmstate, but use MSR_MTRRcap_VCNT
(also 8) to report to guests the number available.  Change this to
use MSR_MTRRcap_VCNT consistently.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
William Grant
1844e68eca target-i386: Don't forbid NX bit on PAE PDEs and PTEs
Commit e8f6d00c30 ("target-i386: raise
page fault for reserved physical address bits") added a check that the
NX bit is not set on PAE PDPEs, but it also added it to rsvd_mask for
the rest of the function. This caused any PDEs or PTEs with NX set to be
erroneously rejected, making PAE guests with NX support unusable.

Signed-off-by: William Grant <wgrant@ubuntu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 18:53:42 +02:00
Peter Maydell
3dd359c2d3 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-24' into staging
trivial patches for 2014-08-24

# gpg: Signature made Sun 24 Aug 2014 14:28:49 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-08-24:
  vmxnet3: Pad short frames to minimum size (60 bytes)
  libdecnumber: Fix warnings from smatch (missing static, boolean operations)
  linux-user: fix file descriptor leaks
  po: Fix Makefile rules for in-tree builds without configuration
  slirp/misc: Use the GLib memory allocation APIs
  configure: no need to mkdir QMP
  dma: axidma: Variablise repeated s->streams[i] sub-expr
  microblaze: ml605: Get rid of ddr_base variable
  tests/bios-tables-test: check the value returned by fopen()
  tcg: dump op count into qemu log
  util/path: Use the GLib memory allocation routines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-25 17:34:30 +01:00
Alexey Kardashevskiy
3431648272 spapr: Add support for new NMI interface
This implements an NMI interface POWERPC SPAPR machine.
This enables an "nmi" HMP/QMP command supported on SPAPR.

This calls POWERPC_EXCP_RESET (vector 0x100) in the guest to deliver NMI
to every CPU. The expected result is XMON (in-kernel debugger) invocation.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy
3dd7852f19 s390x: Migrate to new NMI interface
This implements an NMI interface for s390 and s390-ccw machines.

This removes #ifdef s390 branch in qmp_inject_nmi so new s390's
nmi_monitor_handler() callback is going to be used for NMI.

Since nmi_monitor_handler()-calling code is platform independent,
CPUState::cpu_index is used instead of S390CPU::env.cpu_num.
There should not be any change in behaviour as both @cpu_index and
@cpu_num are global CPU numbers.

Note that s390_cpu_restart() already takes care of the specified cpu,
so we don't need to schedule via async_run_on_cpu().

Since the only error s390_cpu_restart() can return is ENOSYS, convert
it to QERR_UNSUPPORTED.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy
d07aa7c7bb s390x: Convert QEMUMachine to MachineClass
This converts s390-virtio and s390-ccw-virtio machines to QOM MachineClass.
This brings ability to add interfaces to the machine classes. The first
interface for addition will be NMI.

The patch is mechanical so no change in behavior is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy
9cb805fd26 cpus: Define callback for QEMU "nmi" command
This introduces an NMI (Non Maskable Interrupt) interface with
a single nmi_monitor_handler() method. A machine or a device can
implement it. This searches for an QOM object with this interface
and if it is implemented, calls it. The callback implements an action
required to cause debug crash dump on in-kernel debugger invocation.
The callback returns Error**.

This adds a nmi_monitor_handle() helper which walks through
all objects to find the interface. The interface method is called
for all found instances.

This adds support for it in qmp_inject_nmi(). Since no architecture
supports it at the moment, there is no change in behaviour.

This changes inject-nmi command description for HMP and QMP.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Michael S. Tsirkin
187de915e8 pcie: fix trailing whitespace
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:07 +02:00
Knut Omang
a74b870270 ioh3420: Enable ARI forwarding
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Knut Omang
0f9b1771cc ioh3420: Remove obsoleted, unused ioh3420_init function
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Knut Omang
821be9dbb2 pcie: Rename the pcie_cap_ari_* functions to pcie_cap_arifwd_*
Rename helper functions to make a clearer distinction between
the PCIe capability/control register feature ARI forwarding and a
device that supports the ARI feature via an ARI extended PCIe capability.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Knut Omang
ec70b46bab pcie: Fix incorrect write to the ari capability next function field
PCI_ARI_CAP_NFN, a macro for reading next function was used instead of
the intended write.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Michael S. Tsirkin
cec391d752 ssdt-tpm: add generated hex file to git
Needed for systems without IASL.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Stefan Berger
711b20b479 Add ACPI tables for TPM
Add an SSDT ACPI table for the TPM device.
Add a TCPA table for BIOS logging area when a TPM is being used.

The latter follows this spec here:

http://www.trustedcomputinggroup.org/files/static_page_files/DCD4188E-1A4B-B294-D050A155FB6F7385/TCG_ACPIGeneralSpecification_PublicReview.pdf

This patch has Michael Tsirkin's patches folded in.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Michael S. Tsirkin
927766c7d3 pc: reserve more memory for ACPI for new machine types
commit 868270f23d
    acpi-build: tweak acpi migration limits
broke kernel loading with -kernel/-initrd: it doubled
the size of ACPI tables but did not reserve
enough memory.

As a result, issues on boot and halt are observed.

Fix this up by doubling reserved memory for new machine types.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Gonglei
fa365d7cd1 pcihp: fix possible array out of bounds
Prevent out-of-bounds array access on
acpi_pcihp_pci_status.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2014-08-25 00:16:06 +02:00
Paolo Bonzini
9f6b2f1c64 pci_bridge: manually destroy memory regions within PCIBridgeWindows
The regions are destroyed and recreated on configuration space accesses.
We need to destroy them before the containing PCIBridgeWindows object
is freed.

Reported-by: Gonglei <arei.gonglei@huawei.com>
Reported-by: Knut Omang <knut.omang@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-25 00:16:06 +02:00
Ben Draper
40a87c6c9b vmxnet3: Pad short frames to minimum size (60 bytes)
When running VMware ESXi under qemu-kvm the guest discards frames
that are too short. Short ARP Requests will be dropped, this prevents
guests on the same bridge as VMware ESXi from communicating. This patch
simply adds the padding on the network device itself.

Signed-off-by: Ben Draper <ben@xrsa.net>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 17:11:08 +04:00
Stefan Weil
d072cdf3ba libdecnumber: Fix warnings from smatch (missing static, boolean operations)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:21:06 +04:00
zhanghailiang
680dfde919 linux-user: fix file descriptor leaks
Handle variable "fd_orig" going out of scope leaks the handle.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:18:28 +04:00
Stefan Weil
bcc55f327c po: Fix Makefile rules for in-tree builds without configuration
Adding 'update' to the phony targets fixes this error:

$ LANG=C make -C po update
make: Entering directory `/qemu/po'
  LINK  update
/qemu/po/de_DE.po: file not recognized: File format not recognized
collect2: error: ld returned 1 exit status
make: *** [update] Error 1
make: Leaving directory `/qemu/po'

Some other phony targets (build, install) were also added, and the
existing .PHONY statement was moved to a more prominent position at
the beginning of the Makefile.

The patch also fixes a 2nd bug. The default target should be 'all',
but instead 'modules' (from rules.mak) was the default. Fix this by
adding 'all' as a target before any include statement.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:42 +04:00
zhanghailiang
2fd5d86409 slirp/misc: Use the GLib memory allocation APIs
Here we don't check the return value of malloc() which may fail.
Use the g_new() instead, which will abort the program when
there is not enough memory.

Also, use g_strdup instead of strdup and remove the unnecessary
strdup function.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Liming Wang
58ab9e6807 configure: no need to mkdir QMP
commit 7537fe04 QMP: QMP/ -> docs/qmp/

Above commit has moved last QMP files to docs/qmp and it's not necessary
to create QMP directory. So remove it from configure.

Signed-off-by: Liming Wang <liming.wang@canonical.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Peter Crosthwaite
6a07a695b0 dma: axidma: Variablise repeated s->streams[i] sub-expr
This have 6 inline usages. Make it a bit more readable by using a local
variable.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Peter Crosthwaite
f55f885267 microblaze: ml605: Get rid of ddr_base variable
It's a constant based on a macro. Just use the macro in place.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
zhanghailiang
c39a28a43d tests/bios-tables-test: check the value returned by fopen()
The function fopen() may fail, so check its return value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Liu <john.liuli@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
zhanghailiang
d70724cec8 tcg: dump op count into qemu log
fopen() may fail and it does not check its return vaule here,
it is better to dump op count to the normal log file.

Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
zhanghailiang
bc5008a832 util/path: Use the GLib memory allocation routines
In this file, we don't check the return value of malloc/strdup/realloc which may fail.
Instead of using these routines, we use the GLib memory APIs g_malloc/g_strdup/g_realloc.
They will exit on allocation failure, so there is no need to test for failure,
which would be fine for setup.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-24 13:16:32 +04:00
Peter Maydell
33886ebeec Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 22 Aug 2014 14:47:53 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (29 commits)
  qemu-img: Allow cache mode specification for amend
  qemu-img: Allow source cache mode specification
  vmdk: Use bdrv_nb_sectors() where sectors, not bytes are wanted
  blkdebug: Delete BH in bdrv_aio_cancel
  qemu-iotests: add test case 101 for short file I/O
  raw-posix: fix O_DIRECT short reads
  block/iscsi: fix memory corruption on iscsi resize
  block/vvfat.c: remove debugging code to reinit stderr if NULL
  iotests: Add test for image filename construction
  quorum: Implement bdrv_refresh_filename()
  nbd: Implement bdrv_refresh_filename()
  blkverify: Implement bdrv_refresh_filename()
  blkdebug: Implement bdrv_refresh_filename()
  block: Add bdrv_refresh_filename()
  virtio-blk: fix reference a pointer which might be freed
  virtio-blk: allow block_resize with dataplane
  block: acquire AioContext in qmp_block_resize()
  qemu-iotests: Fix 028 reference output for qed
  test-coroutine: test cost introduced by coroutine
  iotests: Add test for qcow2's cache options
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-22 16:12:51 +01:00
Peter Maydell
43fe62757b Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream: (22 commits)
  linux-user: check return value of malloc()
  linux-user: writev Partial Writes
  linux-user: Support target-to-host translation of mlockall argument
  linux-user: clock_nanosleep errno Handling on PPC
  linux-user: Minimum Sig Handler Stack Size for PPC64 ELF V2
  linux-user: Move get_ppc64_abi
  linux-user: Detect fault in sched_rr_get_interval
  linux-user: Handle NULL sched_param argument to sched_*
  linux-user: Detect Negative Message Sizes in msgsnd System Call
  linux-user: Conditionally Pass Attribute Pointer to mq_open()
  linux-user: Make ipc syscall's third argument an abi_long
  linux-user: Properly Handle semun Structure In Cross-Endian Situations
  linux-user: Dereference Pointer Argument to ipc/semctl Sys Call
  linux-user: PPC64 semid_ds Doesnt Include _unused1 and _unused2
  linux-user: add setns and unshare
  linux-user: support ioprio_{get, set} syscalls
  linux-user: support timerfd_{create, gettime, settime} syscalls
  linux-user: fix readlink handling with magic exe symlink
  linux-user: Fix conversion of sigevent argument to timer_create
  linux-user: Fix syscall instruction usermode emulation on X86_64
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-22 14:39:53 +01:00
Max Reitz
bd39e6ed0b qemu-img: Allow cache mode specification for amend
qemu-img amend may extensively modify the target image, depending on the
options to be amended (e.g. conversion to qcow2 compat level 0.10 from
1.1 for an image with many unallocated zero clusters). Therefore it
makes sense to allow the user to specify the cache mode to be used.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 14:54:48 +02:00
Max Reitz
40055951a7 qemu-img: Allow source cache mode specification
Many qemu-img subcommands only read the source file(s) once. For these
use cases, a full write-back cache is unnecessary and mainly clutters
host cache memory. Though this is generally no concern as cache memory
is freely available and can be scaled by the host OS, it may become a
concern with thin provisioning.

For these cases, it makes sense to allow users to freely specify the
source cache mode (e.g. use no cache at all).

This commit adds a new switch (-T) for the qemu-img subcommands check,
compare, convert and rebase to specify the cache to be used for source
images (the backing file in case of rebase).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 14:54:44 +02:00
zhanghailiang
29e03fcb62 linux-user: check return value of malloc()
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
29560a6cb7 linux-user: writev Partial Writes
Although not technically not required by POSIX, the writev system call will
typically write out its buffers individually.  That is, if the first buffer
is written successfully, but the second buffer pointer is invalid, then
the first chuck will be written and its size is returned.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
6f6a40328b linux-user: Support target-to-host translation of mlockall argument
The argument to the mlockall system call is not necessarily the same on
all platforms and thus may require translation prior to passing to the
host.

For example, PowerPC 64 bit platforms define values for MCL_CURRENT
(0x2000) and MCL_FUTURE (0x4000) which are different from Intel platforms
(0x1 and 0x2, respectively)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
8fbe8fdfbc linux-user: clock_nanosleep errno Handling on PPC
The clock_nanosleep syscall is unusual in that it returns positive
numbers in error handling situations, versus returning -1 and setting
errno, or returning a negative errno value.  On POWER, the kernel will
set the SO bit of CR0 to indicate failure in a syscall.  QEMU has
generic handling to do this for syscalls with standard return values.

Add special case code for clock_nanosleep to handle CR0 properly.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
0903c8be9e linux-user: Minimum Sig Handler Stack Size for PPC64 ELF V2
The ELF V2 ABI for PPC64 defines MINSIGSTKSZ as 4096 bytes whereas it was
2048 previously.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
67d6d829cd linux-user: Move get_ppc64_abi
The get_ppc64_abi is used to determine the ELF ABI (i.e. V1 or V2). This
routine is currently implemented in the linux-user/elfload.c file but
is useful in other scenarios.  Move the routine to a more generally
available location (linux-user/ppc/target_cpu.h).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
d4290c40a4 linux-user: Detect fault in sched_rr_get_interval
Properly detect a fault when attempting to store into an invalid
struct timespec pointer.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
a1d5c5b25d linux-user: Handle NULL sched_param argument to sched_*
The sched_getparam, sched_setparam and sched_setscheduler system
calls take a pointer argument to a sched_param structure.  When
this pointer is null, errno should be set to EINVAL.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
edcc5f9dc3 linux-user: Detect Negative Message Sizes in msgsnd System Call
The msgsnd system call takes an argument that describes the message
size (msgsz) and is of type size_t.  The system call should set
errno to EINVAL in the event that a negative message size is passed.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:35 +03:00
Tom Musta
b6ce1f6b90 linux-user: Conditionally Pass Attribute Pointer to mq_open()
The mq_open system call takes an optional struct mq_attr pointer
argument in the fourth position.  This pointer is used when O_CREAT
is specified in the flags (second) argument.  It may be NULL, in
which case the queue is created with implementation defined attributes.

Change the code to properly handle the case when NULL is passed in the
arg4 position.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
37ed09560c linux-user: Make ipc syscall's third argument an abi_long
For those target ABIs that use the ipc system call (e.g. POWER),
the third argument is used in the shmat path as a pointer.  It
therefore must be declared as an abi_long (versus int) so that
the address bits are not lost in truncation.  In fact, all arguments
to do_ipc should be declared as abit_long.

In fact, it makes more sense for all of the arguments to be declaried
as abi_long (except call).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
5464baecf5 linux-user: Properly Handle semun Structure In Cross-Endian Situations
The semun union used in the semctl system call contains both an int (val) and
pointers.  In cross-endian situations on 64 bit targets, the value passed to
semctl is an 8 byte (abi_long) value and thus does not have the 4-byte val
field in the correct location.  In order to rectify this, the other half
of the union must be accessed.  This is achieved in code by performing
a byte swap on the entire 8 byte union, followed by a 4-byte swap of the
first half.

Also, eliminate an extraneous (dead) line of code that sets target_su.val in
the IPC_SET/IPC_GET case.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
5d2fa8ebb4 linux-user: Dereference Pointer Argument to ipc/semctl Sys Call
When the ipc system call is used to wrap a semctl system call,
the ptr argument to ipc needs to be dereferenced prior to passing
it to the semctl handler.  This is because the fourth argument to
semctl is a union and not a pointer to a union.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Tom Musta
035273440b linux-user: PPC64 semid_ds Doesnt Include _unused1 and _unused2
The 64 bit PowerPC platforms eliminate the _unused1 and _unused2
elements of the semid_ds structure from <sys/sem.h>.  So eliminate
these from the target_semid_ds structure.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Riku Voipio
9af5c906d1 linux-user: add setns and unshare
Add support for the setns and unshare syscalls, trivially passed through to
the host. Based on patches by Paul Burton, added configure check.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Paul Burton
ab31cda327 linux-user: support ioprio_{get, set} syscalls
Add support for the ioprio_get & ioprio_set syscalls, allowing their
use by target programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:34 +03:00
Riku Voipio
518343413f linux-user: support timerfd_{create, gettime, settime} syscalls
Adds support for the timerfd_create, timerfd_gettime & timerfd_settime
syscalls, allowing use of timerfds by target programs.

v2: By Riku - added configure check for timerfd and ifdefs
for benefit of old distributions like RHEL5.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Mike Frysinger
f17f4989fa linux-user: fix readlink handling with magic exe symlink
The current code always returns the length of the path when it should
be returning the number of bytes it wrote to the output string.

Further, readlink is not supposed to append a NUL byte, but the current
snprintf logic will always do just that.

Even further, if you pass in a length of 0, you're suppoesd to get back
an error (EINVAL), but the current logic just returns 0.

Further still, if there was an error reading the symlink, we should not
go ahead and try to read the target buffer as it is garbage.

Simple test for the first two issues:
$ cat test.c
int main() {
    char buf[50];
    size_t len;
    for (len = 0; len < 10; ++len) {
        memset(buf, '!', sizeof(buf));
        ssize_t ret = readlink("/proc/self/exe", buf, len);
        buf[20] = '\0';
        printf("readlink(/proc/self/exe, {%s}, %zu) = %zi\n", buf, len, ret);
    }
    return 0;
}

Now compare the output of the native:
$ gcc test.c -o /tmp/x
$ /tmp/x
$ strace /tmp/x

With what qemu does:
$ armv7a-cros-linux-gnueabi-gcc test.c -o /tmp/x -static
$ qemu-arm /tmp/x
$ qemu-arm -strace /tmp/x

Signed-off-by: Mike Frysinger <vapier@chromium.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Peter Maydell
c065976f2b linux-user: Fix conversion of sigevent argument to timer_create
There were a number of bugs in the conversion of the sigevent
argument to timer_create from target to host format:
 * signal number not converted from target to host
 * thread ID not copied across
 * sigev_value not copied across
 * we never unlocked the struct when we were done

Between them, these problems meant that SIGEV_THREAD_ID
timers (and the glibc-implemented SIGEV_THREAD timers which
depend on them) didn't work.

Fix these problems and clean up the code a little by pulling
the struct conversion out into its own function, in line with
how we convert various other structs. This allows the test
program in bug LP:1042388 to run.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Jincheng Miao
47575997be linux-user: Fix syscall instruction usermode emulation on X86_64
Currently syscall instruction is buggy on user mode X86_64,
the EIP is updated after do_syscall(), that is too late for
clone(). Because clone() will create a thread at the env->EIP
(the address of syscall insn), and then child thread enters
do_syscall() again, that is not expected. Sometimes it is tragic.

User mode syscall insn emulation is not used MSR, so the
action should be same to INT 0x80. INT 0x80 will update EIP in
do_interrupt(), ditto for syscall() for consistency.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Riku Voipio
0b2effd744 linux-user: redirect openat calls
While Mikhail fixed /proc/self/maps, it was noticed openat calls are
not redirected currently. Some archs don't have open at all, so
openat needs to be redirected.

Fix this by consolidating open/openat code to do_openat - open
is implemented using openat(AT_FDCWD, ... ), which according
to open(2) man page is identical.

Since all targets now have openat, remove the ifdef around sys_openat
and openat: case in do_syscall.

Cc: Mikhail Ilin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Mikhail Ilyin
d67f4aaae8 linux-user: /proc/self/maps content
Build /proc/self/maps doing a match against guest memory translation table.
Output only that map records which are valid for guest memory layout.

Signed-off-by: Mikhail Ilyin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-08-22 15:06:33 +03:00
Markus Armbruster
0a156f7c75 vmdk: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Instead of bdrv_getlength().

Commit 57322b7 did this all over block, but one more bdrv_getlength()
has crept in since.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:10:12 +02:00
Fam Zheng
cbf95a0b11 blkdebug: Delete BH in bdrv_aio_cancel
Otherwise error_callback_bh will access the already released acb.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:07:00 +02:00
Stefan Hajnoczi
8d9eb33ca0 qemu-iotests: add test case 101 for short file I/O
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:01:12 +02:00
Stefan Hajnoczi
61ed73cff4 raw-posix: fix O_DIRECT short reads
The following O_DIRECT read from a <512 byte file fails:

  $ truncate -s 320 test.img
  $ qemu-io -n -c 'read -P 0 0 512' test.img
  qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument

Note that qemu-io completes successfully without the -n (O_DIRECT)
option.

This patch fixes qemu-iotests ./check -nocache -vmdk 059.

Cc: qemu-stable@nongnu.org
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 11:00:56 +02:00
Peter Lieven
d832fb4d66 block/iscsi: fix memory corruption on iscsi resize
bs->total_sectors is not yet updated at this point. resulting
in memory corruption if the volume has grown and data is written
to the newly availble areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22 10:55:22 +02:00
Peter Maydell
fd3cced366 Merge remote-tracking branch 'remotes/otubo/seccomp' into staging
* remotes/otubo/seccomp:
  seccomp: add semctl() to the syscall whitelist

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-21 12:48:44 +01:00
Michael Tokarev
13b552c2f4 block/vvfat.c: remove debugging code to reinit stderr if NULL
Just log to stderr unconditionally, like other similar code does.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-21 10:36:29 +02:00
Paul Moore
b22876cc2f seccomp: add semctl() to the syscall whitelist
QEMU needs to call semctl() for correct operation.  This particular
problem was identified on shutdown with the following commandline:

 # qemu -sandbox on -monitor stdio \
   -device intel-hda -device hda-duplex -vnc :0

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
2014-08-21 10:29:16 +02:00
Michael S. Tsirkin
288d332202 hostmem: set MPOL_MF_MOVE
When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with mlock=on
realtime feature.

The code comment actually says: "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-20 21:15:56 +02:00
David Hildenbrand
c8e2085d8e kvm: run cpu state synchronization on target vcpu thread
As already done for kvm_cpu_synchronize_state(), let's trigger
kvm_arch_put_registers() via run_on_cpu() for kvm_cpu_synchronize_post_reset()
and kvm_cpu_synchronize_post_init().

This way, we make sure that the register synchronizing ioctls are
called from the proper vcpu thread; this avoids calls to
synchronize_rcu() in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-20 15:21:00 +02:00
Max Reitz
911864c6e5 iotests: Add test for image filename construction
Testing a real in-use protocol such as NBD is hard; testing blkdebug and
blkverify in its stead is easier and tests basically the same
functionality.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:33:42 +02:00
Max Reitz
fafcfe228d quorum: Implement bdrv_refresh_filename()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
2019d68b3b nbd: Implement bdrv_refresh_filename()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
74b36b2eb8 blkverify: Implement bdrv_refresh_filename()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
2c31b04c94 blkdebug: Implement bdrv_refresh_filename()
Because blkdebug cannot simply create a configuration file, simply
refuse to reconstruct a plain filename and only generate an options
QDict from the rules instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
Max Reitz
91af701412 block: Add bdrv_refresh_filename()
Some block devices may not have a filename in their BDS; and for some,
there may not even be a normal filename at all. To work around this, add
a function which tries to construct a valid filename for the
BDS.filename field.

If a filename exists or a block driver is able to reconstruct a valid
filename (which is placed in BDS.exact_filename), this can directly be
used.

If no filename can be constructed, we can still construct an options
QDict which is then converted to a JSON object and prefixed with the
"json:" pseudo protocol prefix. The QDict is placed in
BDS.full_open_options.

For most block drivers, this process can be done automatically; those
that need special handling may define a .bdrv_refresh_filename() method
to fill BDS.exact_filename and BDS.full_open_options themselves.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 14:31:56 +02:00
zhanghailiang
1bdb176ac5 virtio-blk: fix reference a pointer which might be freed
In function virtio_blk_handle_request, it may freed memory pointed by req,
So do not access member of req after calling this function.

Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:57:05 +02:00
Stefan Hajnoczi
466560b9fc virtio-blk: allow block_resize with dataplane
Now that block_resize acquires the AioContext we can safely allow
resizing the disk.

Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:53:52 +02:00
Stefan Hajnoczi
927e0e769f block: acquire AioContext in qmp_block_resize()
Make block_resize safe for dataplane where another thread may be running
the BlockDriverState's AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:53:42 +02:00
Kevin Wolf
6ffb4cb6fd qemu-iotests: Fix 028 reference output for qed
We need to filter out driver-specific options in the "Formatting..."
string printed by qemu when creating the backup image.

Reported-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Peter Wu <peter@lekensteyn.nl>
2014-08-20 11:51:28 +02:00
Ming Lei
61ff8cfbec test-coroutine: test cost introduced by coroutine
This test runs dummy function with coroutine by using
two enter and one yield since which is a common usage.

So we can see the cost introduced by corouting for running
one function, for example:

	Run operation 20000000 iterations 4.841071 s, 4131K operations/s
	242ns per coroutine

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
a1cb48a3bf iotests: Add test for qcow2's cache options
Add a test which tests various combinations of qcow2's cache options
(some of which are valid, some of which are not).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
6c1c8d5d3e qcow2: Add runtime options for cache sizes
Add options for specifying the size of the metadata caches. This can
either be done directly for each cache (if only one is given, the other
will be derived according to a default ratio) or combined for both.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
02004bd4ba qcow2: Use g_try_new0() for cache array
With a variable cache size, the number given to qcow2_cache_create() may
be huge. Therefore, use g_try_new0().

While at it, use g_new0() instead of g_malloc0() for allocating the
Qcow2Cache object.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz
440ba08aea qcow2: Constant cache size in bytes
Specifying the metadata cache sizes in clusters results in less clusters
(and much less bytes) covered for small cluster sizes and vice versa.
Using a constant byte size reduces this difference, and makes it
possible to manually specify the cache size in an easily comprehensible
unit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Maria Kustova
18a7d0c56e runner: Kill a program under test by time-out
If a program under test get frozen, the test should finish and report about its
failure.
In such cases the runner waits for 10 minutes until the program ends its
execution. After this time-out the program will be terminated and the test will
be marked as failed.

For current limitation of test image size to 10 MB as a maximum an execution of
each command takes about several seconds in general, so 10 minutes is enough to
discriminate freeze, but not drastically increase an overall test duration.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Maria Kustova
9d256ca616 runner: Add an argument for test duration
After the specified duration the runner stops executing new tests, but it
doesn't interrupt running ones.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
d4df3dbc02 block: Drop some superfluous casts from void *
They clutter the code.  Unfortunately, I can't figure out how to make
Coccinelle drop all of them, so I have to settle for common special
cases:

    @@
    type T;
    T *pt;
    void *pv;
    @@
    - pt = (T *)pv;
    + pt = pv;
    @@
    type T;
    @@
    - (T *)
      (\(g_malloc\|g_malloc0\|g_realloc\|g_new\|g_new0\|g_renew\|
	 g_try_malloc\|g_try_malloc0\|g_try_realloc\|
	 g_try_new\|g_try_new0\|g_try_renew\)(...))

Topped off with minor manual style cleanups.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
08193dd52b qemu-io-cmds: g_renew() can't fail, bury dead error handling
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
02c4f26b15 block: Use g_new() & friends to avoid multiplying sizes
g_new(T, n) is safer than g_malloc(sizeof(*v) * n) for two reasons.
One, it catches multiplication overflowing size_t.  Two, it returns
T * rather than void *, which lets the compiler catch more type
errors.

Perhaps a conversion to g_malloc_n() would be neater in places, but
that's merely four years old, and we can't use such newfangled stuff.

This commit only touches allocations with size arguments of the form
sizeof(T), plus two that use 4 instead of sizeof(uint32_t).  We can
make the others safe by converting to g_malloc_n() when it becomes
available to us in a couple of years.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
5839e53bbc block: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

Patch created with Coccinelle, with two manual changes on top:

* Add const to bdrv_iterate_format() to keep the types straight

* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
  inexplicably misses

Coccinelle semantic patch:

    @@
    type T;
    @@
    -g_malloc(sizeof(T))
    +g_new(T, 1)
    @@
    type T;
    @@
    -g_try_malloc(sizeof(T))
    +g_try_new(T, 1)
    @@
    type T;
    @@
    -g_malloc0(sizeof(T))
    +g_new0(T, 1)
    @@
    type T;
    @@
    -g_try_malloc0(sizeof(T))
    +g_try_new0(T, 1)
    @@
    type T;
    expression n;
    @@
    -g_malloc(sizeof(T) * (n))
    +g_new(T, n)
    @@
    type T;
    expression n;
    @@
    -g_try_malloc(sizeof(T) * (n))
    +g_try_new(T, n)
    @@
    type T;
    expression n;
    @@
    -g_malloc0(sizeof(T) * (n))
    +g_new0(T, n)
    @@
    type T;
    expression n;
    @@
    -g_try_malloc0(sizeof(T) * (n))
    +g_try_new0(T, n)
    @@
    type T;
    expression p, n;
    @@
    -g_realloc(p, sizeof(T) * (n))
    +g_renew(T, p, n)
    @@
    type T;
    expression p, n;
    @@
    -g_try_realloc(p, sizeof(T) * (n))
    +g_try_renew(T, p, n)

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Peter Maydell
2656eb7c59 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140819' into staging
target-arm:
 * fix preferred return address for A64 BRK insn
 * implement AArch64 single-stepping
 * support loading gzip compressed AArch64 kernels
 * use correct PSCI function IDs in the DT when KVM uses PSCI 0.2
 * minor cleanups

# gpg: Signature made Tue 19 Aug 2014 19:04:09 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140819:
  arm: stellaris: Remove misleading address_space_mem var
  arm: armv7m: Rename address_space_mem -> system_memory
  aarch64: Allow -kernel option to take a gzip-compressed kernel.
  loader: Add load_image_gzipped function.
  arm: cortex-a9: Fix cache-line size and associativity
  arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
  target-arm: Rename QEMU PSCI v0.1 definitions
  target-arm: Implement MDSCR_EL1 as having state
  target-arm: Implement ARMv8 single-stepping for AArch32 code
  target-arm: Implement ARMv8 single-step handling for A64 code
  target-arm: A64: Avoid duplicate exit_tb(0) in non-linked goto_tb
  target-arm: Set PSTATE.SS correctly on exception return from AArch64
  target-arm: Correctly handle PSTATE.SS when taking exception to AArch32
  target-arm: Don't allow AArch32 to access RES0 CPSR bits
  target-arm: Adjust debug ID registers per-CPU
  target-arm: Provide both 32 and 64 bit versions of debug registers
  target-arm: Allow STATE_BOTH reginfo descriptions for more than cp14
  target-arm: Collect up the debug cp register definitions
  target-arm: Fix return address for A64 BRK instructions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-20 09:55:42 +01:00
Peter Maydell
302fa28378 Revert "memory: Use canonical path component as the name"
This reverts commit b0225c2c0d
(which breaks building with Xen enabled and also leaks memory).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 20:05:46 +01:00
Peter Crosthwaite
14a906f755 arm: stellaris: Remove misleading address_space_mem var
It's a MemoryRegion and not an AddressSpace. But since it's single use,
just inline the get_system_memory() call to the only usage to remove it.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: d6914047e10b956514cfaa5f391ef56c7d851b34.1408347860.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Peter Crosthwaite
6e9322dea3 arm: armv7m: Rename address_space_mem -> system_memory
This argument is a MemoryRegion and not an AddressSpace.

"Address space" means something quite different to "memory region"
in QEMU parlance so rename the variable to reduce confusion.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: f666cf7f2318d9b461b1e320a45bf0d82da9b7dd.1408347860.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Richard W.M. Jones
6f5d3cbe88 aarch64: Allow -kernel option to take a gzip-compressed kernel.
On aarch64 it is the bootloader's job to uncompress the kernel.  UEFI
and u-boot bootloaders do this automatically when the kernel is
gzip-compressed.

However the qemu -kernel option does not do this.  The following
command does not work:

  qemu-system-aarch64 [...] -kernel /boot/vmlinuz

because it tries to execute the gzip-compressed data.

This commit lets gzip-compressed kernels be uncompressed
transparently.

Currently this is only done when emulating aarch64.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1407831259-2115-3-git-send-email-rjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Richard W.M. Jones
235e74afcb loader: Add load_image_gzipped function.
As the name suggests this lets you load a ROM/disk image that is
gzipped.  It is uncompressed before storing it in guest memory.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1407831259-2115-2-git-send-email-rjones@redhat.com
[PMM: removed stray space before ')']
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Peter Crosthwaite
f7838b5290 arm: cortex-a9: Fix cache-line size and associativity
For A9, The cache associativity is 4 and the lines size is 32B.
Self identify in CCSIDR accordingly. Cache size remains at 16k.

QEMU doesn't emulate caches, but we should still report the correct
cache-line size to the guest. Some guests (like u-boot) complain if
the cache-line size mismatches a requested flush or invalidate
operation.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1de6bd40155a1d2f2e93e24b1b1d1d677a432641.1408346233.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:40 +01:00
Christoffer Dall
863714ba6c arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.

This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT.  Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.

Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts.  After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.

Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only).  Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels.  Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:25 +01:00
Christoffer Dall
a65c9c17ce target-arm: Rename QEMU PSCI v0.1 definitions
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>.  To build using these defines in non-KVM code,
QEMU defines these IDs locally and check their correctness against the
KVM headers when those are available.

However, the naming scheme used for QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux so to avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prependend and clearly identify the PSCI version as v0.1 in the
defines.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 19:02:03 +01:00
Peter Maydell
0e5e8935bb target-arm: Implement MDSCR_EL1 as having state
Now that all the new code to support single-stepping is in
place, wire up the guest-visible MDSCR_EL1, so the guest
can enable single-stepping.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
50225ad0c1 target-arm: Implement ARMv8 single-stepping for AArch32 code
ARMv8 single-stepping requires the exception level that controls
the single-stepping to be in AArch64 execution state, but the
code being stepped may be in AArch64 or AArch32. Implement the
necessary support code for single-stepping AArch32 code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
7ea47fe7be target-arm: Implement ARMv8 single-step handling for A64 code
Implement ARMv8 software single-step handling for A64 code:
correctly update the single-step state machine and generate
debug exceptions when stepping A64 code.

This patch has no behavioural change since MDSCR_EL1.SS can't
be set by the guest yet.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
cc9c1ed14e target-arm: A64: Avoid duplicate exit_tb(0) in non-linked goto_tb
If gen_goto_tb() decides not to link the two TBs, then the
fallback path generates unnecessary code:
 * if singlestep is enabled then we generate unreachable code
   after the gen_exception_internal(EXCP_DEBUG)
 * if singlestep is disabled then we will generate exit_tb(0)
   twice, once in gen_goto_tb() and once coming out of the
   main loop with is_jmp set to DISAS_JUMP

Correct these deficiencies by only emitting exit_tb() in the
non-singlestep case, in which case we can use DISAS_TB_JUMP
to suppress the main-loop exit_tb().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
3a2982038a target-arm: Set PSTATE.SS correctly on exception return from AArch64
Set the PSTATE.SS bit correctly on exception returns from AArch64,
as required by the debug single-step functionality.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
662cefb775 target-arm: Correctly handle PSTATE.SS when taking exception to AArch32
When an exception is taken to AArch32, we must clear the PSTATE.SS
bit for the exception handler, and must also ensure that the SS bit
is not set in the value saved to SPSR_<mode>. Achieve both of these
aims by clearing the bit in uncached_cpsr before saving it to the SPSR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
4051e12c5d target-arm: Don't allow AArch32 to access RES0 CPSR bits
The CPSR has a new-in-v8 execution state bit (IL), and
also some state which has effects in AArch32 but appears
only in the SPSR format (SS) but is RES0 in the CPSR.

Add the IL bit to CPSR_EXEC, and enforce that guest direct
reads and writes to CPSR can't read or write the RES0
bits, so the guest can't get at the SS bit which we store
in uncached_cpsr. This includes not permitting exception
returns to copy reserved bits from an SPSR into CPSR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
48eb3ae64b target-arm: Adjust debug ID registers per-CPU
Allow each CPU type to specify the value for the debug ID
registers, by putting them in the ARMCPU struct, and use
the resulting information to only expose the correct number
of watchpoint and breakpoint registers for the CPU.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
10aae1049f target-arm: Provide both 32 and 64 bit versions of debug registers
Bring the 32 bit and 64 bit views of the debug registers into
line by providing the same set of registers in both cases.
(This still isn't a complete set, but it is consistent.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
58a1d8ceab target-arm: Allow STATE_BOTH reginfo descriptions for more than cp14
Currently the STATE_BOTH shorthand for allowing a single reginfo struct
to define handling for both AArch32 and AArch64 views of a register
only permits this where the AArch32 view is in cp15. It turns out that
the debug registers in cp14 also have neatly lined up encodings;
allow these also to share reginfo structs by permitting a STATE_BOTH
reginfo to specify the .cp field (and continue to default to 15 if
it is not specified).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
503006983a target-arm: Collect up the debug cp register definitions
At the moment we have a mixed set of mostly dummy register
definitions for various debug related registers which have
been added piecemeal in order to get Linux kernels to boot.
In preparation for actually implementing debug support,
bring them all together into one place.

This commit doesn't change behaviour: we still expose
exactly the same registers and behaviour to the guest
in all configurations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-08-19 19:02:03 +01:00
Peter Maydell
229a138d74 target-arm: Fix return address for A64 BRK instructions
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.

(We do get this correct for the A32/T32 BKPT insns.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
2014-08-19 18:56:24 +01:00
Peter Maydell
0e4a773705 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.

Memory changes for QOMification and automatic tracking of MR lifetime.

# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  mtree: remove write-only field
  memory: Use canonical path component as the name
  memory: Use memory_region_name for name access
  memory: constify memory_region_name
  exec: Abstract away ref to memory region names
  loader: Abstract away ref to memory region names
  tpm_tis: remove instance_finalize callback
  memory: remove memory_region_destroy
  memory: convert memory_region_destroy to object_unparent
  ioport: split deletion and destruction
  nic: do not destroy memory regions in cleanup functions
  vga: do not dynamically allocate chain4_alias
  sysbus: remove unused function sysbus_del_io
  qom: object: move unparenting to the child property's release callback
  qom: object: delete properties before calling instance_finalize
  virtio-scsi: implement parse_cdb
  scsi-block, scsi-generic: implement parse_cdb
  scsi-block: extract scsi_block_is_passthrough
  scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
  scsi-bus: prepare scsi_req_new for introduction of parse_cdb

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 13:00:57 +01:00
Peter Maydell
8e6e2c2ae7 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  monitor: fix use after free
  dump.c: Fix memory leak issue in cleanup processing for dump_init()
  monitor: Remove hardcoded watchdog event names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 10:30:36 +01:00
Michael S. Tsirkin
b3dd1b8c29 monitor: fix use after free
The function monitor_fdset_dup_fd_find_remove() references member of
'mon_fdset' which - when remove flag is set - may be freed in function
monitor_fdset_cleanup().
remove is set by monitor_fdset_dup_fd_remove which in practice
does not need the returned value, so make it void,
and return -1 from monitor_fdset_dup_fd_find_remove.

Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-08-18 14:39:10 -04:00
Chen Gang
2928207ac1 dump.c: Fix memory leak issue in cleanup processing for dump_init()
In dump_init(), when failure occurs, need notice about 'fd' and memory
mapping. So call dump_cleanup() for it (need let all initializations at
front).

Also simplify dump_cleanup(): remove redundant 'ret' and redundant 'fd'
checking.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-08-18 14:39:10 -04:00
Hani Benhabiles
4bb08af34e monitor: Remove hardcoded watchdog event names
Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-08-18 14:39:10 -04:00
Peter Maydell
073fd73e56 Merge remote-tracking branch 'remotes/amit/for-2.2' into staging
* remotes/amit/for-2.2:
  virtio-serial: search for duplicate port names before adding new ports
  virtio-serial: create a linked list of all active devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-18 18:24:38 +01:00
Amit Shah
d0a0bfe672 virtio-serial: search for duplicate port names before adding new ports
Before adding new ports to VirtIOSerial devices, check if there's a
conflict in the 'name' parameter.  This ensures two virtserialports with
identical names are not initialized.

Reported-by: <mazhang@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-08-18 22:42:49 +05:30
Amit Shah
a1857ad1ac virtio-serial: create a linked list of all active devices
To ensure two virtserialports don't get added to the system with the
same 'name' parameter, we need to access all the ports on all the
devices added, and compare the names.

We currently don't have a list of all VirtIOSerial devices added to the
system.  This commit adds a simple linked list in which devices are put
when they're initialized, and removed when they go away.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-08-18 22:42:37 +05:30
Peter Maydell
08ab59770d Merge remote-tracking branch 'remotes/mcayland/qemu-sparc' into staging
* remotes/mcayland/qemu-sparc:
  target-sparc64: implement Short Floating-Point Store Instructions
  apb: add IOMMU flush register implementation
  sun4u: switch second PCI-ebus bridge BAR over to PCI IO space

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-18 12:55:02 +01:00
Peter Maydell
da398fcc25 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 15 Aug 2014 18:04:23 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (55 commits)
  qcow2: fix new_blocks double-free in alloc_refcount_block()
  image-fuzzer: Reduce number of generator functions in __init__
  image-fuzzer: Add generators of L1/L2 tables
  image-fuzzer: Add fuzzing functions for L1/L2 table entries
  docs: Expand the list of supported image elements with L1/L2 tables
  image-fuzzer: Public API for image-fuzzer/runner/runner.py
  image-fuzzer: Generator of fuzzed qcow2 images
  image-fuzzer: Fuzzing functions for qcow2 images
  image-fuzzer: Tool for fuzz tests execution
  docs: Specification for the image fuzzer
  ide: only constrain read/write requests to drive size, not other types
  virtio-blk: Correct bug in support for flexible descriptor layout
  libqos: Change free function called in malloc
  libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
  libqtest: add QTEST_LOG for debugging qtest testcases
  ide: Fix segfault when flushing a device that doesn't exist
  qemu-options: add missing -drive discard option to cmdline help
  parallels: 2TB+ parallels images support
  parallels: split check for parallels format in parallels_open
  parallels: replace tabs with spaces in block/parallels.c
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-18 11:59:27 +01:00
Paolo Bonzini
f54bb15f9d mtree: remove write-only field
ml->printed is never set to true.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
b0225c2c0d memory: Use canonical path component as the name
Rather than having the name as separate state. This prepares support
for creating a MemoryRegion dynamically (i.e. without
memory_region_init() and friends) and the MemoryRegion still getting
a usable name.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
3fb18b4da7 memory: Use memory_region_name for name access
Despite being local to memory.c, use the helper function. This prepares
support for fully QOMifiying the name field of MR (which will remove
this state from MR completely).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
5d546d4b65 memory: constify memory_region_name
It doesn't change the MR and some prospective call sites will have
const MRs at hand.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
83234bf2fa exec: Abstract away ref to memory region names
Use the function provided rather than spying on the struct.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
401cf7fdc4 loader: Abstract away ref to memory region names
Use the function provided rather than spying on the struct.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
c54779f962 tpm_tis: remove instance_finalize callback
It is never used, since ISA device are not hot-unpluggable.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
469b046ead memory: remove memory_region_destroy
The function is empty after the previous patch, so remove it.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
d8d9581460 memory: convert memory_region_destroy to object_unparent
Explicitly call object_unparent in the few places where we
will re-create the memory region.  If the memory region is
simply being destroyed as part of device teardown, let QOM
handle it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:20 +02:00
Paolo Bonzini
e3fb0ade83 ioport: split deletion and destruction
Of the two functions portio_list_del and portio_list_destroy,
the latter is just freeing a memory area.  However, portio_list_del
is the logical equivalent of memory_region_del_subregion so
destruction of memory regions does not belong there.

Actually, neither of these APIs are in use; portio is mostly used by
ISA devices or VGAs, and neither of these is currently hot-unpluggable.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
eed7930950 nic: do not destroy memory regions in cleanup functions
The memory regions should be destroyed in the unrealize function;
since these NICs are not even qdev-ified, they cannot be unplugged
and they do not have to do anything to destroy their memory regions.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
ad37168cbd vga: do not dynamically allocate chain4_alias
Instead, add a boolean variable to indicate the presence of the region.
This avoids a repeated malloc/free (later we can also avoid the
add_child/unparent by changing the offset/size of the alias).

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
1dd79a237e sysbus: remove unused function sysbus_del_io
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
bffc687d66 qom: object: move unparenting to the child property's release callback
This ensures that the unparent callback is called automatically
when the parent object is finalized.

Note that there's no need to keep a reference neither in
object_unparent nor in object_finalize_child_property.  The
reference held by the child property itself will do.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Paolo Bonzini
76a6e1cc7c qom: object: delete properties before calling instance_finalize
This ensures that the children's unparent callback will still
have a usable parent.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-17 23:25:24 +02:00
Artyom Tarasenko
2a5fade753 target-sparc64: implement Short Floating-Point Store Instructions
Implement Short Floating-Point Store Instructions as described
in the chapter 13.5.2 of UltraSPARC-IIi User's Manual.

Particularly this instructions are used by NetBSD 4.0.1+ /sparc64

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-17 13:24:27 +01:00
Mark Cave-Ayland
b87b0644bc apb: add IOMMU flush register implementation
The IOMMU flush register is a write-only register used to remove entries from the
hardware TLB. Allow guest writes to this register as a no-op, and return a value
of 0 for reads.

This fixes IOMMU DMA operations under NetBSD SPARC64.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-17 13:13:01 +01:00
Mark Cave-Ayland
a1cf8be550 sun4u: switch second PCI-ebus bridge BAR over to PCI IO space
The ebus is the sun4u equivalent of the old ISA bus which is already mapped at
the beginning of PCI IO space within QEMU. NetBSD attempts to find the physical
addresses of devices connected to the ebus by parsing the BARs of the PCI-ebus
bridge and using the base address found by matching both the address space
type and range for a particular ebus address.

Since the second PCI-ebus bridge BAR is already aliased onto IO space, switch
the BAR over to match and reduce the size to 0x1000 which is enough to cover
all the legacy ioport devices whilst leaving the remaining IO space for other
PCI devices. This allows NetBSD SPARC64 to correctly detect and access devices
on the ebus.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-08-17 13:12:52 +01:00
Peter Maydell
142f4ac5d5 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-15' into staging
trivial patches for 2014-08-15

# gpg: Signature made Fri 15 Aug 2014 16:13:03 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-08-15:
  ivshmem: check the value returned by fstat()
  l2cap: fix access to freed memory
  intc: i8259: Convert Array allocation to g_new0
  ppc: convert g_new(qemu_irq usages to g_new0
  ssi: xilinx_spi: Initialise CS GPIOs as NULL
  vl: free err
  qemu-options.hx: fix typo about l2tpv3
  vmxnet3: don't use 'Yoda conditions'
  vl: don't use 'Yoda conditions'
  spice: don't use 'Yoda conditions'
  don't use 'Yoda conditions'
  isa-bus: don't use 'Yoda conditions'
  audio: don't use 'Yoda conditions'
  usb: don't use 'Yoda conditions'
  CODING_STYLE: Section about conditional statement
  pci-host: update uncorresponding description
  pci-host: update obsolete reference about piix_pci.c
  qemu-options.hx: fix a typo of chardev
  memory: Update obsolete comment about AddrRange field type
  apic: Fix reported DFR content

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 18:44:48 +01:00
Stefan Hajnoczi
39ba3bf69c qcow2: fix new_blocks double-free in alloc_refcount_block()
Commit de82815db1 ("qcow2: Handle failure
for potentially large allocations") introduced a double-free of
new_blocks in the alloc_refcount_block() error path.

The qemu-iotests qcow2 026 test case was failing because qemu-io
segfaulted.

Make sure new_blocks is NULL after we free it the first time.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:26 +01:00
Maria Kustova
94c83a24c1 image-fuzzer: Reduce number of generator functions in __init__
Some issues can be found only when a fuzzed image has a partial structure,
e.g. has L1/L2 tables but no refcount ones. Generation of an entirely
defined image limits these cases. Now the Image constructor creates only
a header and a backing file name (if any), other image elements are generated
in the 'create_image' API.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
38eb193b8b image-fuzzer: Add generators of L1/L2 tables
Entries in L1/L2 entries are based on a portion of random guest clusters.
L2 entries contain offsets to host image clusters filled with random data.
Clusters for L1/L2 tables and guest data are selected randomly.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
eeadd92487 image-fuzzer: Add fuzzing functions for L1/L2 table entries
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
489cb4d7f9 docs: Expand the list of supported image elements with L1/L2 tables
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
071e649194 image-fuzzer: Public API for image-fuzzer/runner/runner.py
__init__.py provides the public API required by the test runner

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
e123232331 image-fuzzer: Generator of fuzzed qcow2 images
The layout submodule of the qcow2 package creates a random valid image,
randomly selects some amount of its fields, fuzzes them and write the fuzzed
image to the file. Fuzzing process can be controlled by an external
configuration.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
6d5e9372f6 image-fuzzer: Fuzzing functions for qcow2 images
The fuzz submodule of the qcow2 image generator contains fuzzing functions for
image fields.
Each fuzzing function contains a list of constraints and a call of a helper
function that randomly selects a fuzzed value satisfied to one of constraints.
For now constraints include only known as invalid or potentially dangerous
values. But after investigation of code coverage by fuzz tests they will be
expanded by heuristic values based on inner checks and flows of a program
under test.

Now fuzzing of a header, header extensions and a backing file name is
supported.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
ad724dd728 image-fuzzer: Tool for fuzz tests execution
The purpose of the test runner is to prepare the test environment (e.g. create
a work directory, a test image, etc), execute a program under test with
parameters, indicate a test failure if the program was killed during the test
execution and collect core dumps, logs and other test artifacts.

The test runner doesn't depend on an image format, so it can be used with any
external image generator.

[Fixed path to qcow2 format module "qcow2" instead of "../qcow2" since
runner.py is no longer in a sub-directory.
--Stefan]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Maria Kustova
d6dc10aad8 docs: Specification for the image fuzzer
'Overall fuzzer requirements' chapter contains the current product vision and
features done and to be done. This chapter is still in progress.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Michael Tokarev
d66168ed68 ide: only constrain read/write requests to drive size, not other types
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size.  However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1.  So check the range only for reads and writes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
a83ceea8ff virtio-blk: Correct bug in support for flexible descriptor layout
Without this correction, only a three descriptor layout is accepted, and
requests with just two descriptors are not completed and no error message is
displayed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
220c1a2fad libqos: Change free function called in malloc
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
f75ffc5857 libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Marc Marí
ae74f18782 libqtest: add QTEST_LOG for debugging qtest testcases
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Kevin Wolf
f7f3ff1da0 ide: Fix segfault when flushing a device that doesn't exist
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Peter Lieven
2f7133b2e5 qemu-options: add missing -drive discard option to cmdline help
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
d25d598020 parallels: 2TB+ parallels images support
Parallels has released in the recent updates of Parallels Server 5/6
new addition to his image format. Images with signature WithouFreSpacExt
have offsets in the catalog coded not as offsets in sectors (multiple
of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512)

In this case all 64 bits of header->nb_sectors are used for image size.

This patch implements support of this for qemu-img and also adds specific
check for an incorrect image. Images with block size greater than
INT_MAX/513 are not supported. The biggest available Parallels image
cluster size in the field is 1 Mb. Thus this limit will not hurt
anyone.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
418a7adb77 parallels: split check for parallels format in parallels_open
and rework error path a bit. There is no difference at the moment, but
the code will be definitely shorter when additional processing will
be required for WithouFreSpacExt

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
f08e2f8465 parallels: replace tabs with spaces in block/parallels.c
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Denis V. Lunev
8c27d54fa0 parallels: extend parallels format header with actual data values
Parallels image format has several additional fields inside:
- nb_sectors is actually 64 bit wide. Upper 32bits are not used for
  images with signature "WithoutFreeSpace" and must be explicitly
  zeroed according to Parallels. They will be used for images with
  signature "WithouFreSpacExt"
- inuse is magic which means that the image is currently opened for
  read/write or was not closed correctly, the magic is 0x746f6e59
- data_off is the location of the first data block. It can be zero
  and in this case data starts just beyond the header aligned to
  512 bytes. Though this field does not matter for read-only driver

This patch adds these values to struct parallels_header and adds
proper handling of nb_sectors for currently supported WithoutFreeSpace
images.

WithouFreSpacExt will be covered in next patches.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Jeff Cody <jcody@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Cornelia Huck
2f5f70fa5f dataplane: stop trying on notifier error
If we fail to set up guest or host notifiers, there's no use trying again
every time the guest kicks, so disable dataplane in that case.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Cornelia Huck
f9907ebc4c dataplane: fail notifier setting gracefully
The dataplane code is currently doing a hard exit if it fails to set
up either guest or host notifiers. In practice, this may mean that a
guest suddenly dies after a dataplane device failed to come up (e.g.,
when a file descriptor limit is hit for tne nth device).

Let's just try to unwind the setup instead and return.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Cornelia Huck
267e1a204c dataplane: print why starting failed
Setting up guest or host notifiers may fail, but the user will have
no idea why: Let's print the error returned by the callback.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Gonglei
16b38080e3 channel-posix: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

Using the qemu_set_nonblock() wrapper.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Wangxin <wangxinxin.wang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Gonglei
4ff12bdb1d qemu-char: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

Using the qemu_set_nonblock() wrapper.

Signed-off-by: Wangxin <wangxinxin.wang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
271dddd133 cmd646: synchronise UDMA interrupt status with DMA interrupt status
Make sure that both registers are synchronised when being accessed through
PCI configuration space.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
1d113ef874 cmd646: allow MRDMODE interrupt status bits clearing from PCI config space
Make sure that we also update the normal DMA interrupt status bits at the
same time, and alter the IRQ if being cleared accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
dab91a1e13 cmd646: switch cmd646_update_irq() to accept PCIDevice instead of PCIIDEState
This is in preparation for adding configuration space accessors which accept
PCIDevice as a parameter.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
5bbc0a703d cmd646: synchronise DMA interrupt status with UDMA interrupt status
Make sure that the standard DMA interrupt status bits reflect any changes made
to the UDMA interrupt status bits. The CMD646U2 datasheet claims that these
bits are equivalent, and they must be synchronised for guests that manipulate
both registers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
58f16a7b47 cmd646: add constants for CNTRL register access
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
e42de189e8 qtest/ide: Fix small memory leak
For libqos debugging purposes, it's nice to
be able to assert that tests and associated libraries
have no memory leaks. To that end, free up the
trivial cmdline leak.

The remaining leaks caused by pc_alloc_init are fixed
instead by my first-fit pc_alloc implementation already
on the qemu-devel mailing list.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
6ce7100e7f libqos: allow qpci_iomap to return BAR mapping size
This patch allows qpci_iomap to return the size of the
BAR mapping that it created, to allow driver applications
(e.g, ahci-test) to make determinations about the suitability
or the mapping size, or in the specific case of AHCI, how
many ports are supported by the HBA.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
7f2a5ae6c1 libqos: Fixes a small memory leak.
Allow users the chance to clean up the QPCIBusPC structure
by adding a small cleanup routine. Helps clear up small
memory leaks during setup/teardown, to allow for cleaner
debug output messages.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
a7afc6b8c1 libqtest: Correct small memory leak.
Fixes a small memory leak inside of libqtest.
After we produce a test path and glib copies the string
for itself, we should clean up our temporary copy.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
John Snow
f3cdcbaee1 libqos: Correct memory leak
Fix a small memory leak inside of libqos, in the pc_alloc_init routine.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
John Snow
86298845e1 qtest: Adding qtest_memset and qmemset.
Currently, libqtest allows for memread and memwrite, but
does not offer a simple way to zero out regions of memory.
This patch adds a simple function to do so.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
John Snow
552b48f44d q35: Enable the ioapic device to be seen by qtest.
Currently, the ioapic device can not be found in a qtest environment
when requesting "irq_interrupt_in ioapic" via the qtest socket.

By mirroring how the ioapic is added in i44ofx (hw/i440/pc_piix.c),
as a child of "q35," the device is able to be seen by qtest.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
088415202b ahci: construct PIO Setup FIS for PIO commands
PIO commands should put a PIO Setup FIS in the receive area when data
transfer ends.  Currently QEMU does not do this and only places the
D2H FIS at the end of the operation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
c7e73adb48 ide: make all commands go through cmd_done
AHCI has code to fill in the D2H FIS trigger the IRQ all over the place.
Centralize this in a single cmd_done callback by generalizing the existing
async_cmd_done callback.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
08ee9e3368 ide: stop PIO transfer on errors
This will provide a hook for sending the result of the command via the
FIS receive area.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
1f88f77348 ahci: remove duplicate PORT_IRQ_* constants
These are defined twice, just use one set consistently.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
fd648f10af ide: move retry constants out of BM_STATUS_* namespace
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
7e2648df86 ide: move BM_STATUS bits to pci.[ch]
They are not used by AHCI, and should not be even available there.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
0e7ce54cf5 ide: fold add_status callback into set_inactive
It is now called only after the set_inactive callback.  Put the two together.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
0def37baf9 ide: remove wrong setting of BM_STATUS_INT
Similar to the case removed in commit 69c38b8 (ide/core: Remove explicit
setting of BM_STATUS_INT, 2011-05-19), the only remaining use of
add_status(..., BM_STATUS_INT) is for short PRDs.  The flag should
not be raised in this case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
4855b57639 ide: wrap start_dma callback
Make it optional and prepare for the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
446351236b ide: simplify start_transfer callbacks
Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
c039cb1e5a ide: simplify async_cmd_done callbacks
Drop the unused return value.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
829b933b70 ide: simplify set_inactive callbacks
Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
1374bec063 ide: simplify reset callbacks
Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
69f72a2221 ide: stash aiocb for flushes
This ensures that operations are completed after a reset

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
14a92e5fe1 ide-test: add test for werror=stop
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00
Paolo Bonzini
7c282cb4c5 libqtest: add QTEST_LOG for debugging qtest testcases
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:11 +01:00
Paolo Bonzini
9e52c53b8c blkdebug: report errors on flush too
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:11 +01:00
Peter Maydell
f2c85a2f36 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
post-2.1 bugfixes

A bunch of fixes that missed 2.1 by a small margin.
If we do 2.1.1, some of these would be good candidates,
added Cc qemu-stable as appropriate.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 14 Aug 2014 17:07:25 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pc: Get rid of pci-info leftovers
  e1000: use symbolic constants to init phy ctrl & status registers
  e1000: correctly handle phy_ctrl reserved & self-clearing bits
  ivshmem: fix building when debug mode is enabled
  acpi: align RSDP
  numa: show hex number in error message for consistency and prefix them with 0x
  pc-dimm: fix up error message
  pc-dimm: validate node property
  hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
  hw/audio/intel-hda: Fix MSI capability address
  pc: Create 2.2 machine type
  pci: Use bus master address space for delivering MSI/MSI-X messages

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 17:43:51 +01:00
Peter Maydell
5c6b3c50cc Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Tracing pull request

* remotes/stefanha/tags/tracing-pull-request:
  virtio-rng: add some trace events
  trace: add some tcg tracing support
  trace: teach lttng backend to use format strings
  trace: [tcg] Include TCG-tracing header on all targets
  trace: [tcg] Include event definitions in "trace.h"
  trace: [tcg] Generate TCG tracing routines
  trace: [tcg] Include TCG-tracing helpers
  trace: [tcg] Define TCG tracing helper routine wrappers
  trace: [tcg] Define TCG tracing helper routines
  trace: [tcg] Declare TCG tracing helper routines
  trace: [tcg] Add 'tcg' event property
  trace: [tcg] Argument type transformation machinery
  trace: [tcg] Argument type transformation rules
  trace: [tcg] Add documentation
  trace: install simpletrace SystemTap tapset
  simpletrace: add simpletrace.py --no-header option
  trace: add tracetool simpletrace_stap format
  trace: extract stap_escape() function for reuse

Conflicts:
	Makefile.objs
2014-08-15 16:37:17 +01:00
zhanghailiang
5edbdbcdf8 ivshmem: check the value returned by fstat()
The function fstat() may fail, so check its return value.

Acked-by: Levente Kurusa <lkurusa@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 19:12:58 +04:00
zhanghailiang
2c145d7a73 l2cap: fix access to freed memory
Pointer 'ch' will be used in function 'l2cap_channel_open_req_msg' after
it was previously freed in 'l2cap_channel_open'.
Assigned it to NULL after it is freed.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 19:12:48 +04:00
Peter Crosthwaite
8945c7f754 intc: i8259: Convert Array allocation to g_new0
To be more array friendly and to indicate the IRQs are initially
disconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:55 +04:00
Peter Crosthwaite
aa2ac1dac3 ppc: convert g_new(qemu_irq usages to g_new0
To indicate the IRQs are initially disconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:50 +04:00
Peter Crosthwaite
c75f3c041a ssi: xilinx_spi: Initialise CS GPIOs as NULL
To properly indicate they are unconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:40 +04:00
Hu Tao
3a9cbfe009 vl: free err
err is not freed after use, thus causing memory leak. This patch fixes
it.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Cc: qemu-trivial@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
3952651a75 qemu-options.hx: fix typo about l2tpv3
two duplicate destport description.

s/destport/srcport/, s/destination/source/

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
f7472ca405 vmxnet3: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
28de2f883c vl: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
fe8e8327f1 spice: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
8108fd3e26 don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:07 +04:00
Gonglei
337a3e5c7d isa-bus: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
2ab5bf67b7 audio: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
d0657b2aab usb: don't use 'Yoda conditions'
imitate nearby code about using '!value' or 'value == NULL'

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
2bb0020cf9 CODING_STYLE: Section about conditional statement
Yoda conditions lack readability, and QEMU has a
strict compiler configuration for checking a common
mistake like "if (dev = NULL)". Make it a written rule.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
30dc600bbf pci-host: update uncorresponding description
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Gonglei
ef9f7b587d pci-host: update obsolete reference about piix_pci.c
piix_pci.c has been renamed into piix.c at commit
c0907c9e64

update the obsolete reference.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Liming Wang
38a24c8b74 qemu-options.hx: fix a typo of chardev
Change host to port.

Signed-off-by: Liming Wang <liming.wang@canonical.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Fam Zheng
c9cdaa3ab9 memory: Update obsolete comment about AddrRange field type
We are not 64 bit any more since

08dafab4 memory: use 128-bit integers for sizes and intermediates

but the comment is forgotten to be updated.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Jan Kiszka
d6c140a771 apic: Fix reported DFR content
IA-32 SDM, Figure 10-14: Bits 27:0 are reserved as 1.

Fixes Jailhouse hypervisor start with in-kernel irqchips off.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Peter Maydell
f2fb1da941 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches

# gpg: Signature made Fri 15 Aug 2014 14:07:42 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (59 commits)
  block: Catch !bs->drv in bdrv_check()
  iotests: Add test for image header overlap
  qcow2: Catch !*host_offset for data allocation
  qcow2: Return useful error code in refcount_init()
  mirror: Handle failure for potentially large allocations
  vpc: Handle failure for potentially large allocations
  vmdk: Handle failure for potentially large allocations
  vhdx: Handle failure for potentially large allocations
  vdi: Handle failure for potentially large allocations
  rbd: Handle failure for potentially large allocations
  raw-win32: Handle failure for potentially large allocations
  raw-posix: Handle failure for potentially large allocations
  qed: Handle failure for potentially large allocations
  qcow2: Handle failure for potentially large allocations
  qcow1: Handle failure for potentially large allocations
  parallels: Handle failure for potentially large allocations
  nfs: Handle failure for potentially large allocations
  iscsi: Handle failure for potentially large allocations
  dmg: Handle failure for potentially large allocations
  curl: Handle failure for potentially large allocations
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 14:49:50 +01:00
Max Reitz
908bcd540f block: Catch !bs->drv in bdrv_check()
qemu-img check calls bdrv_check() twice if the first run repaired some
inconsistencies. If the first run however again triggered corruption
prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:16 +02:00
Max Reitz
a42f8a3d05 iotests: Add test for image header overlap
Add a test for an image with an unallocated image header; instead of an
assertion, this should result in the image being marked corrupt.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:16 +02:00
Max Reitz
ff52aab2df qcow2: Catch !*host_offset for data allocation
qcow2_alloc_cluster_offset() uses host_offset == 0 as "no preferred
offset" for the (data) cluster range to be allocated. However, this
offset is actually valid and may be allocated on images with a corrupted
refcount table or first refcount block.

In this case, the corruption prevention should normally catch that
write anyway (because it would overwrite the image header). But since 0
is a special value here, the function assumes that nothing has been
allocated at all which it asserts against.

Because this condition is not qemu's fault but rather that of a broken
image, it shouldn't throw an assertion but rather mark the image corrupt
and show an appropriate message, which this patch does by calling the
corruption check earlier than it would be called normally (before the
assertion).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:16 +02:00
Max Reitz
8fcffa9853 qcow2: Return useful error code in refcount_init()
If bdrv_pread() returns an error, it is very unlikely that it was
ENOMEM. In this case, the return value should be passed along; as
bdrv_pread() will always either return the number of bytes read or a
negative value (the error code), the condition for checking whether
bdrv_pread() failed can be simplified (and clarified) as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
7504edf477 mirror: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the mirror block job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
5fb09cd586 vpc: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vpc block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
d6e5993197 vmdk: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vmdk block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
a67e128a4f vhdx: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vhdx block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
17cce73578 vdi: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vdi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:16 +02:00
Kevin Wolf
0f7a02379b rbd: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the rbd block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:16 +02:00
Kevin Wolf
4b6af3d58a raw-win32: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the raw-win32 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:16 +02:00
Kevin Wolf
50d4a858e6 raw-posix: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the raw-posix block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
4f4896db5f qed: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qed block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
de82815db1 qcow2: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow2 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
0df93305f2 qcow1: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow1 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
f7b593d937 parallels: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the parallels block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
2347dd7b68 nfs: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the nfs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
4d5a3f888c iscsi: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the iscsi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
b546a94474 dmg: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the dmg block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
8dc7a7725b curl: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the curl block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
4ae7a52e43 cloop: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the cloop block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
7bf665ee35 bochs: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the bochs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf
857d4f46c3 block: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses bounce buffer allocations in block.c. While at it,
convert bdrv_commit() from plain g_malloc() to qemu_try_blockalign().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf
7d2a35cc92 block: Introduce qemu_try_blockalign()
This function returns NULL instead of aborting when an allocation fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Jeff Cody
23d20b5b4f block: iotest - update 084 to test static VDI image creation
This updates the VDI corruption test to also test static VDI image
creation, as well as the default dynamic image creation.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody
fef6070eff block: vpc - use block layer ops in vpc_create, instead of posix calls
Use the block layer to create, and write to, the image file in the VPC
.bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody
dddc7750d6 block: use the standard 'ret' instead of 'result'
Most QEMU code uses 'ret' for function return values. The VDI driver
uses a mix of 'result' and 'ret'.  This cleans that up, switching over
to the standard 'ret' usage.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody
70747862f1 block: vdi - use block layer ops in vdi_create, instead of posix calls
Use the block layer to create, and write to, the image file in the
VDI .bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Also some minor cleanup for error handling.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody
9a4d5ca607 block: allow bdrv_unref() to be passed NULL pointers
If bdrv_unref() is passed a NULL BDS pointer, it is safe to
exit with no operation.  This will allow cleanup code to blindly
call bdrv_unref() on a BDS that has been initialized to NULL.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Paolo Bonzini
58803ce74f test-coroutine: add baseline test that times the cost of function calls
This can be used to compute the cost of coroutine operations.  In the
end the cost of the function call is a few clock cycles, so it's pretty
cheap for now, but it may become more relevant as the coroutine code
is optimized.

For example, here are the results on my machine:

   Function call 100000000 iterations: 0.173884 s
   Yield 100000000 iterations: 8.445064 s
   Lifecycle 1000000 iterations: 0.098445 s
   Nesting 10000 iterations of 1000 depth each: 7.406431 s

One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
one coroutine allocation takes (roughly, since some of the allocations
in the nesting test do hit the pool) 739 nanoseconds:

   (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
   (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
   (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody
4f75b52a07 block: VHDX endian fixes
This patch contains several changes for endian conversion fixes for
VHDX, particularly for big-endian machines (multibyte values in VHDX are
all on disk in LE format).

Tests were done with existing qemu-iotests on an IBM POWER7 (8406-71Y).
This includes sample images created by Hyper-V, both with dirty logs and
without.

In addition, VHDX image files created (and written to) on a BE machine
were tested on a LE machine, and vice-versa.

Reported-by: Markus Armburster <armbru@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody
349592e0b9 block: vhdx - add error check
This add an error check for an invalid descriptor entry signature,
when flushing the log descriptor entries.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
3c80ca158c thread-pool: avoid deadlock in nested aio_poll() calls
The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:

  If element A's callback waits for element B using aio_poll() it will
  deadlock since pool->completion_bh is not marked scheduled when the
  nested aio_poll() runs.

Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing.  This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
c2e50e3d11 thread-pool: avoid per-thread-pool EventNotifier
EventNotifier is implemented using an eventfd or pipe.  It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.

Switch from EventNotifier to QEMUBH in thread-pool.c.  Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
2a87151fb2 block: bump coroutine pool size for drives
When a BlockDriverState is associated with a storage controller
DeviceState we expect guest I/O.  Use this opportunity to bump the
coroutine pool size by 64.

This patch ensures that the coroutine pool size scales with the number
of drives attached to the guest.  It should increase coroutine pool
usage (which makes qemu_coroutine_create() fast) without hogging too
much memory when fewer drives are attached.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:14 +02:00
Stefan Hajnoczi
ac2662a913 coroutine: make pool size dynamic
Allow coroutine users to adjust the pool size.  For example, if the
guest has multiple emulated disk drives we should keep around more
coroutines.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
746ebfa77a qemu-iotests: add support for Archipelago protocol
Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
b1de5f439d QMP: Add support for Archipelago
Introduce new enum BlockdevOptionsArchipelago.

@volume:              #Name of the Archipelago volume image

@mport:               #'mport' is the port number on which mapperd is
                      listening. This is optional and if not specified,
                      QEMU will make Archipelago to use the default port.

@vport:               #'vport' is the port number on which vlmcd is
                      listening. This is optional and if not specified,
                      QEMU will make Archipelago to use the default port.

@segment:             #optional The name of the shared memory segment
                      Archipelago stack is using. This is optional
                      and if not specified, QEMU will make Archipelago
                      use the default value, 'archipelago'.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
76d3d83a37 block/archipelago: Add support for creating images
qemu-img archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>]
 [:segment=<segment_name>]] [size]

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
70537a8506 block/archipelago: Implement bdrv_parse_filename()
VM Image on Archipelago volume can also be specified like this:

file=archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>][:
segment=<segment_name>]]

Examples:

file=archipelago:my_vm_volume
file=archipelago:my_vm_volume/mport=123
file=archipelago:my_vm_volume/mport=123:vport=1234
file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos
c9a12e751b block: Support Archipelago as a QEMU block backend
VM Image on Archipelago volume is specified like this:

file.driver=archipelago,file.volume=<volumename>[,file.mport=<mapperd_port>[,
file.vport=<vlmcd_port>][,file.segment=<segment_name>]]

'archipelago' is the protocol.

'mport' is the port number on which mapperd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'vport' is the port number on which vlmcd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'segment' is the name of the shared memory segment Archipelago stack is using.
This is optional and if not specified, QEMU will make Archipelago to use the
default value, 'archipelago'.

Examples:

file.driver=archipelago,file.volume=my_vm_volume
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234,file.segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Chunyan Liu
000c4dfff4 qemu-img info: show nocow info
Add nocow info in 'qemu-img info' output to show whether the file
currently has NOCOW flag set or not.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Fam Zheng
c6ac36e145 vmdk: Optimize cluster allocation
This drops the unnecessary bdrv_truncate() from, and also improves,
cluster allocation code path.

Before, when we need a new cluster, get_cluster_offset truncates the
image to bdrv_getlength() + cluster_size, and returns the offset of
added area, i.e. the image length before truncating.

This is not efficient, so it's now rewritten as:

  - Save the extent file length when opening.

  - When allocating cluster, use the saved length as cluster offset.

  - Don't truncate image, because we'll anyway write data there: just
    write any data at the EOF position, in descending priority:

    * New user data (cluster allocation happens in a write request).

    * Filling data in the beginning and/or ending of the new cluster, if
      not covered by user data: either backing file content (COW), or
      zero for standalone images.

One major benifit of this change is, on host mounted NFS images, even
over a fast network, ftruncate is slow (see the example below). This
change significantly speeds up cluster allocation. Comparing by
converting a cirros image (296M) to VMDK on an NFS mount point, over
1Gbe LAN:

    $ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk

    Before:
        real    0m21.796s
        user    0m0.130s
        sys     0m0.483s

    After:
        real    0m2.017s
        user    0m0.047s
        sys     0m0.190s

We also get rid of unchecked bdrv_getlength() and bdrv_truncate(), and
get a little more documentation in function comments.

Tested that this passes qemu-iotests for all VMDK subformats.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Fam Zheng
a8d8a1a06c qemu-iotests: Add data pattern in version3 VMDK sample image in 059
It's possible that we diverge from the specification with our
implementation.  Having a reference image in the test cases may detect
such problems when we introduce a bug that can read what it creates, but
can't handle a real VMDK.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Stefan Hajnoczi
ef523587da qdev-monitor: include QOM properties in -device FOO, help output
Update -device FOO,help to include QOM properties in addition to qdev
properties.  Devices are gradually adding more QOM properties that are
not reflected as qdev properties.

It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.

This patch reuses the device-list-properties QMP machinery to avoid code
duplication.

Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
2014-08-15 15:07:13 +02:00
Stefan Hajnoczi
4115dd6527 qmp: hide "hotplugged" device property from device-list-properties
The "hotplugged" device property was not reported before commit
f4eb32b590 ("qmp: show QOM properties in
device-list-properties").  Fix this difference.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:13 +02:00
Stefan Hajnoczi
ef558696b5 docs/multiple-iothreads.txt: add documentation on IOThread programming
This document explains how IOThreads and the main loop are related,
especially how to write code that can run in an IOThread.  Currently
only virtio-blk-data-plane uses these techniques.  The next obvious
target is virtio-scsi; there has also been work on virtio-net.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:13 +02:00
Gonglei (Arei)
8cced12143 xen_disk: fix possible null-ptr dereference
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Hu Tao
8efc936336 configure: explicitly state version requirements to devel packages
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Maria Kustova
8e436ec1f3 docs: Make the recommendation for the backing file name position a requirement
The current version of the qcow2 specification recommends to save the backing
file name in the end of the first cluster. It follows that the backing file
name can be saved somewhere in the image, but the first cluster, which
contradicts the current QEMU implementation.

The patch makes the backing file name required to be placed after the header
extensions in the first image cluster.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
52bf1e722d block: Avoid bdrv_get_geometry() where errors should be detected
bdrv_get_geometry() hides errors.  Use bdrv_nb_sectors() or
bdrv_getlength() instead where that's obviously inappropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
d739f1c410 qemu-img: Make img_convert() get image size just once per image
Chiefly so I don't have to do the error checking in quadruplicate in
the next commit.  Moreover, replacing the frequently updated
bs_sectors by an array assigned just once makes the code easier to
understand.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
75d3d21f9e block: Drop superfluous aligning of bdrv_getlength()'s value
It returns a multiple of the sector size.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
57322b7811 block: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Instead of bdrv_getlength().

Aside: a few of these callers don't handle errors.  I didn't
investigate whether they should.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
43716fa805 block: Use bdrv_nb_sectors() in img_convert()
Instead of bdrv_getlength().  Replace variable output_length by
output_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
30a7f2fc91 block: Use bdrv_nb_sectors() in bdrv_co_get_block_status()
Instead of bdrv_getlength().

Replace variables length, length2 by total_sectors, nb_sectors2.
Bonus: use total_sectors instead of the slightly unclean
bs->total_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
4049082c4b block: Use bdrv_nb_sectors() in bdrv_aligned_preadv()
Instead of bdrv_getlength().  Eliminate variable len.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
d32f7c101b block: Use bdrv_nb_sectors() in bdrv_make_zero()
Instead of bdrv_getlength().

Variable target_size is initially in bytes, then changes meaning to
sectors.  Ugh.  Replace by target_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster
65a9bb25d6 block: New bdrv_nb_sectors()
A call to retrieve the image size converts between bytes and sectors
several times:

* BlockDriver method bdrv_getlength() returns bytes.

* refresh_total_sectors() converts to sectors, rounding up, and stores
  in total_sectors.

* bdrv_getlength() converts total_sectors back to bytes (now rounded
  up to a multiple of the sector size).

* Callers wanting sectors rather bytes convert it right back.
  Example: bdrv_get_geometry().

bdrv_nb_sectors() provides a way to omit the last two conversions.
It's exactly bdrv_getlength() with the conversion to bytes omitted.
It's functionally like bdrv_get_geometry() without its odd error
handling.

Reimplement bdrv_getlength() and bdrv_get_geometry() on top of
bdrv_nb_sectors().

The next patches will convert some users of bdrv_getlength() to
bdrv_nb_sectors().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:12 +02:00
Peter Maydell
f083201667 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-09' into staging
trivial patches for 2014-08-09

# gpg: Signature made Fri 08 Aug 2014 21:36:44 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-08-09:
  build-sys: Move qapi-{types, visit, event}.o into util-obj-y
  po: Add Chinese translation
  qemu-img: Check getchar() return value in read_password() for WIN32
  hw/timer: Move extern declaration from .c to .h file
  virtio: Move extern declaration to header file
  Show length mismatch error is hex
  target-i386/cpu.c: Fix two error output indentation
  l2tpv3 (configure): it is linux-specific
  hw/timer/imx_*: fix TIMER_MAX clash with system symbol

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-15 13:41:55 +01:00
Markus Armbruster
260cb1c409 pc: Get rid of pci-info leftovers
pc_fw_cfg_guest_info() never does anything, because has_pci_info is
always false.

Introduced in commit f8c457b "pc: pass PCI hole ranges to Guests",
disabled in commit 9604f70 "pc: disable pci-info for 1.6", and hasn't
been enabled since.  Obviously a dead end.  Get of it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Gabriel L. Somlo
9616c29045 e1000: use symbolic constants to init phy ctrl & status registers
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Gabriel L. Somlo
1195fed9e6 e1000: correctly handle phy_ctrl reserved & self-clearing bits
Make phyreg_writeops responsible for actually writing their
respective phy registers, rather than rely on set_mdic() to
do it on their behalf.

The only current instance of phyreg_writeops is set_phy_ctrl();
modify it to write the register on its own, while also correctly
handling reserved and self-clearing bits.

have_autoneg() does not need to check for MII_CR_RESTART_AUTO_NEG,
since the only time the flag comes into play is during set_phy_ctrl(),
and, following this patch, never actually gets written to the phy
control register.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Levente Kurusa
7f9efb6b80 ivshmem: fix building when debug mode is enabled
ivsmem_offset was removed, however this debug statement was not updated.
Modify the statement to fit the new mechanic.

Signed-off-by: Levente Kurusa <lkurusa@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:25 +02:00
Michael S. Tsirkin
d67aadccfa acpi: align RSDP
RSDP should be aligned at a 16-byte boundary.
This would by chance at the moment, fix up acpi build
to make it robust.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-08-14 13:22:16 +02:00
Hu Tao
c68233aee8 numa: show hex number in error message for consistency and prefix them with 0x
The error messages before and after patch are:

before:
qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)

after:
qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:07 +02:00
Michael S. Tsirkin
988eba0f68 pc-dimm: fix up error message
- int should be printed using %d
- print actual wrong value for property

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:22:00 +02:00
Hu Tao
cfe0ffd027 pc-dimm: validate node property
If user specifies a node number that exceeds the available numa nodes in
emulated system for pc-dimm device, the device will report an invalid _PXM
to OSPM. Fix this by checking the node property value.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:59 +02:00
Hu Tao
41d2f71376 hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:49 +02:00
Jan Kiszka
d209c7440a hw/audio/intel-hda: Fix MSI capability address
According to ICH9 spec, the MSI capability is located at 0x60. This is
important for guest drivers that do not parse the capability chain and
use absolute addresses instead.

CC: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:49 +02:00
Jan Kiszka
f9f218730c pc: Create 2.2 machine type
Yet identical to 2.1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:49 +02:00
Jan Kiszka
cc943c36fa pci: Use bus master address space for delivering MSI/MSI-X messages
The spec says (and real HW confirms this) that, if the bus master bit
is 0, the device will not generate any PCI accesses. MSI and MSI-X
messages fall among these, so we should use the corresponding address
space to deliver them. This will prevent delivery if bus master support
is disabled.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-14 13:20:33 +02:00
Amit Shah
4ac4458076 virtio-rng: add some trace events
Add some trace events to virtio-rng for easier debugging

Signed-off-by: Amit Shah <amit.shah@redhat.com>

Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:29:55 +01:00
Alex Bennée
6db8b53866 trace: add some tcg tracing support
This adds a couple of tcg specific trace-events which are useful for
tracing execution though tcg generated blocks. It's been tested with
lttng user space tracing but is generic enough for all systems. The tcg
events are:

  * translate_block - when a subject block is translated
  * exec_tb - when a translated block is entered
  * exec_tb_exit - when we exit the translated code
  * exec_tb_nocache - special case translations

Of course we can only trace the entrance to the first block of a chain
as each block will jump directly to the next when it can. See the -d
nochain patch to allow more complete tracing at the expense of
performance.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Alex Bennée
41ef7b00ab trace: teach lttng backend to use format strings
This makes the UST backend pay attention to the format string arguments
that are defined when defining payload data. With this you can now
ensure integers are reported in hex mode if you want.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
a7e30d84ce trace: [tcg] Include TCG-tracing header on all targets
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
85d8bf2f36 trace: [tcg] Include event definitions in "trace.h"
Otherwise the user has to explicitly include an auto-generated header.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
465830fbd9 trace: [tcg] Generate TCG tracing routines
Generate header "trace/generated-tcg-tracers.h" with the necessary routines for
tracing events in guest code:

* trace_${event}_tcg

  Convenience wrapper that calls the translation-time tracer
  'trace_${event}_trans', and calls 'gen_helper_trace_${event}_exec to
  generate the TCG code to later trace the event at execution time.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
76b53aa324 trace: [tcg] Include TCG-tracing helpers
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
f4654226d4 trace: [tcg] Define TCG tracing helper routine wrappers
Generates header "trace/generated-helpers-wrappers.h" with definitions for TCG
helper wrappers.

These wrappers ('gen_helper_trace_${event}_exec_wrapper') transform mixed native
and TCG argument types to TCG types and call the actual TCG helpers
('gen_helper_trace_${event}_exec_proxy').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
341ea69185 trace: [tcg] Define TCG tracing helper routines
Generates file "trace/generated-helpers.c" with TCG helper definitions to trace
events in guest code at execution time.

The helpers ('helper_trace_${event}_exec_proxy') cast the TCG-compatible native
argument types to their original types (as defined in "trace-events") and call
the tracing routine ('trace_${event}_exec').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
707c8a98e4 trace: [tcg] Declare TCG tracing helper routines
Generates file "trace/generated-helpers.h" with TCG helper declarations to trace
events in guest code at execution time ('trace_${event}_exec_proxy').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:12 +01:00
Lluís Vilanova
b2b36c22bd trace: [tcg] Add 'tcg' event property
Transforms event:

  tcg name(...) "...", "..."

into two internal events:

  tcg-trans name_trans(...) "..."
  tcg-exec name_exec(...) "..."

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Lluís Vilanova
b55835ac10 trace: [tcg] Argument type transformation machinery
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Lluís Vilanova
e6d6c4bebf trace: [tcg] Argument type transformation rules
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Lluís Vilanova
0bb403b0ae trace: [tcg] Add documentation
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
e0b2fd0efb trace: install simpletrace SystemTap tapset
The simpletrace SystemTap tapset outputs simpletrace binary traces for
SystemTap probes.  This is useful because SystemTap has no default way
to format or store traces.  The simpletrace SystemTap tapset provides an
easy way to store traces.

The simpletrace.py tool or custom Python scripts using the
simpletrace.py API can analyze SystemTap these traces:

  $ ./configure --enable-trace-backends=dtrace ...
  $ make && make install
  $ stap -e 'probe qemu.system.x86_64.simpletrace.* {}' \
         -c qemu-system-x86_64 >/tmp/trace.out
  $ scripts/simpletrace.py --no-header trace-events /tmp/trace.out
  g_malloc 4.531 pid=15519 size=0xb ptr=0x7f8639c10470
  g_malloc 3.264 pid=15519 size=0x300 ptr=0x7f8639c10490
  g_free 5.155 pid=15519 ptr=0x7f8639c0f7b0

Note that, unlike qemu-system-x86_64.stp and
qemu-system-x86_64.stp-installed, only one file is needed since the
simpletrace SystemTap tapset does not reference the QEMU binary by path.
Therefore it doesn't matter whether the QEMU binary is installed or not.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
15327c3df0 simpletrace: add simpletrace.py --no-header option
It can be useful to read simpletrace files that have no header.  For
example, a ring buffer may not have a header record but can still be
processed if the user is sure the file format version is compatible.

  $ scripts/simpletrace.py --no-header trace-events trace-file

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
3f8b112d6b trace: add tracetool simpletrace_stap format
This new tracetool "format" generates a SystemTap .stp file that outputs
simpletrace binary trace data.

In contrast to simpletrace or ftrace, SystemTap does not define its own
trace format.  All output from SystemTap is generated by .stp files.
This patch lets us generate a .stp file that outputs in the simpletrace
binary format.

This makes it possible to reuse simpletrace.py to analyze traces
recorded using SystemTap.  The simpletrace binary format is especially
useful for long-running traces like flight-recorder mode where string
formatting can be expensive.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Stefan Hajnoczi
a76ccf3c1c trace: extract stap_escape() function for reuse
SystemTap reserved words sometimes conflict with QEMU variable names.
We escape them to prevent conflicts.

Move escaping into its own function so the next patch can reuse it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-12 14:26:11 +01:00
Fam Zheng
169a24aea4 build-sys: Move qapi-{types, visit, event}.o into util-obj-y
These three objects are repeated in multiple times in Makefiles. Let's
just add them to libqemuutil.a, and don't list explicitly elsewhere.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:09:17 +04:00
Fam Zheng
90bda0823a po: Add Chinese translation
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Dongsheng Song <songdongsheng@live.cn>
Reviewed-by: Wei Huang <wehuang@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:41 +04:00
Chen Gang
fdcf6e65bc qemu-img: Check getchar() return value in read_password() for WIN32
getchar() is a standard c library function which may return with failure
(e.g. -1), so like another platforms, also need check it under WIN32.

And make the related code match current qemu code styles, too.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Stefan Weil
f13bef9592 hw/timer: Move extern declaration from .c to .h file
This fixes a warning from smatch (static code analyser).

Fix also the comment with the renamed source file name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>

 hw/timer/tusb6010.c |    3 ---
 include/hw/usb.h    |    7 ++++++-
 2 files changed, 6 insertions(+), 4 deletions(-)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Stefan Weil
0f03fb6094 virtio: Move extern declaration to header file
This fixes a warning from smatch (static code analyser).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Alex Bligh
a3f1f040d2 Show length mismatch error is hex
When live migrate fails due to a section length mismatch we currently
see an error message like:

Length mismatch: 0000:00:03.0/virtio-net-pci.rom: 10000 in != 20000

The section lengths are in fact in hex, so this should read

Length mismatch: 0000:00:03.0/virtio-net-pci.rom: 0x10000 in != 0x20000

Correct the error string to reflect this.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
chenfan
5bb4c35dca target-i386/cpu.c: Fix two error output indentation
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Michael Tokarev
bff6cb7296 l2tpv3 (configure): it is linux-specific
Some non-linux systems, for example a system with
FreeBSD kernel and glibc, may declare struct mmsghdr
(in glibc) but may not have linux-specific header
file linux/ip.h.  The actual implementation in qemu
includes this linux-specific header file unconditionally,
so compilation fails if it is not present.  Include
this header in the configure test too.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Michael Tokarev
203d65a470 hw/timer/imx_*: fix TIMER_MAX clash with system symbol
The symbol TIMER_MAX used in imx_epit.c and imx_gpt.c
clashes with system symbol with the same name.  Because
all qemu source files includes qemu-common.h which, in
turn, includes limits.h, which is not unusual to define
it.  Rename local symbol to have a reasonable prefix.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-09 00:06:32 +04:00
Peter Maydell
2d591ce2ae Merge remote-tracking branch 'remotes/mdroth/qga-pull-2014-08-08' into staging
* remotes/mdroth/qga-pull-2014-08-08:
  qga: Disable unsupported commands by default
  qga: Add guest-get-fsinfo command
  qga: Add guest-fsfreeze-freeze-list command

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-08 14:16:05 +01:00
Tomoki Sekiyama
1281c08a46 qga: Disable unsupported commands by default
Currently management softwares cannot know whether a qemu-ga command is
supported or not on the running platform until they actually execute it.
This patch disables unsupported commands at launch time of qemu-ga, so that
management softwares can check whether they are supported from 'enabled'
property of the result from 'guest-info' command.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-07 17:15:53 -05:00
Tomoki Sekiyama
46d4c5723e qga: Add guest-get-fsinfo command
Add command to get mounted filesystems information in the guest.
The returned value contains a list of mountpoint paths and
corresponding disks info such as disk bus type, drive address,
and the disk controllers' PCI addresses, so that management layer
such as libvirt can resolve the disk backends.

For example, when `lsblk' result is:

    NAME           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sdb              8:16   0    1G  0 disk
    `-sdb1           8:17   0 1024M  0 part
      `-vg0-lv0    253:1    0  1.4G  0 lvm  /mnt/test
    sdc              8:32   0    1G  0 disk
    `-sdc1           8:33   0  512M  0 part
      `-vg0-lv0    253:1    0  1.4G  0 lvm  /mnt/test
    vda            252:0    0   25G  0 disk
    `-vda1         252:1    0   25G  0 part /

where sdb is a SCSI disk with PCI controller 0000:00:0a.0 and ID=1,
      sdc is an IDE disk with PCI controller 0000:00:01.1, and
      vda is a virtio-blk disk with PCI device 0000:00:06.0,

guest-get-fsinfo command will return the following result:

    {"return":
     [{"name":"dm-1",
       "mountpoint":"/mnt/test",
       "disk":[
        {"bus-type":"scsi","bus":0,"unit":1,"target":0,
         "pci-controller":{"bus":0,"slot":10,"domain":0,"function":0}},
        {"bus-type":"ide","bus":0,"unit":0,"target":0,
         "pci-controller":{"bus":0,"slot":1,"domain":0,"function":1}}],
       "type":"xfs"},
      {"name":"vda1", "mountpoint":"/",
       "disk":[
        {"bus-type":"virtio","bus":0,"unit":0,"target":0,
         "pci-controller":{"bus":0,"slot":6,"domain":0,"function":0}}],
       "type":"ext4"}]}

In Linux guest, the disk information is resolved from sysfs. So far,
it only supports virtio-blk, virtio-scsi, IDE, SATA, SCSI disks on x86
hosts, and "disk" parameter may be empty for unsupported disk types.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>

*updated schema to report 2.2 as initial supported version

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-07 17:15:14 -05:00
Tomoki Sekiyama
e99bce2021 qga: Add guest-fsfreeze-freeze-list command
If an array of mount point paths is specified as 'mountpoints' argument
of guest-fsfreeze-freeze-list, qemu-ga will only freeze the file systems
mounted on specified paths in Linux guests. Otherwise, it works as the
same way as guest-fsfreeze-freeze.
This would be useful when the host wants to create partial disk snapshots.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

*updated schema to report 2.2 as initial supported version

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-08-07 17:13:10 -05:00
Peter Maydell
2ee55b8351 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
KVM changes include a MIPS patch and the testdev backend used by the
ARM kvm-unit-tests.  icount include the first part of reverse execution
and Sebastian Tanase's patches to slow down -icount execution to the
desired speed of the target.

v1->v2: fix dump_drift_info to print nothing outside icount mode,
        and to compile on 32-bit architectures

# gpg: Signature made Thu 07 Aug 2014 14:09:58 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  target-mips: Ignore unassigned accesses with KVM
  monitor: Add drift info to 'info jit'
  cpu-exec: Print to console if the guest is late
  cpu-exec: Add sleeping algorithm
  icount: Add align option to icount
  icount: Add QemuOpts for icount
  icount: Fix virtual clock start value on ARM
  timer: add cpu_icount_to_ns function.
  migration: migrate icount fields.
  icount: put icount variables into TimerState.
  backends: Introduce chr-testdev

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-07 14:54:47 +01:00
James Hogan
eddedd546a target-mips: Ignore unassigned accesses with KVM
MIPS registers an unassigned access handler which raises a guest bus
error exception. However this causes QEMU to crash when KVM is enabled
as it isn't called from the main execution loop so longjmp() gets called
without a corresponding setjmp().

Until the KVM API can be updated to trigger a guest exception in
response to an MMIO exit, prevent the bus error exception being raised
from mips_cpu_unassigned_access() if KVM is enabled.

The check is at run time since the do_unassigned_access callback is
initialised before it is known whether KVM will be enabled.

The problem can be triggered with Malta emulation by making the guest
write to the reset region at physical address 0x1bf00000, since it is
marked read-only which is treated as unassigned for writes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-07 15:09:48 +02:00
Sebastian Tanase
27498bef35 monitor: Add drift info to 'info jit'
Show in 'info jit' the current delay between the host clock
and the guest clock. In addition, print the maximum advance
and delay of the guest compared to the host.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-07 15:09:48 +02:00
Peter Maydell
9d8bb35574 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140805.0' into staging
VFIO patches: Fix MSI-X vector expansion, remove MSI/X message caching

# gpg: Signature made Tue 05 Aug 2014 20:25:57 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140805.0:
  vfio: Don't cache MSIMessage
  vfio: Fix MSI-X vector expansion

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-07 11:30:38 +01:00
Sebastian Tanase
7f7bc144ed cpu-exec: Print to console if the guest is late
If the align option is enabled, we print to the user whenever
the guest clock is behind the host clock in order for he/she
to have a hint about the actual performance. The maximum
print interval is 2s and we limit the number of messages to 100.
If desired, this can be changed in cpu-exec.c

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
c2aa5f8199 cpu-exec: Add sleeping algorithm
The goal is to sleep qemu whenever the guest clock
is in advance compared to the host clock (we use
the monotonic clocks). The amount of time to sleep
is calculated in the execution loop in cpu_exec.

At first, we tried to approximate at each for loop the real time elapsed
while searching for a TB (generating or retrieving from cache) and
executing it. We would then approximate the virtual time corresponding
to the number of virtual instructions executed. The difference between
these 2 values would allow us to know if the guest is in advance or delayed.
However, the function used for measuring the real time
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
We had an added overhead of 13% of the total run time.

Therefore, we modified the algorithm and only take into account the
difference between the 2 clocks at the begining of the cpu_exec function.
During the for loop we try to reduce the advance of the guest only by
computing the virtual time elapsed and sleeping if necessary. The overhead
is thus reduced to 3%. Even though this method still has a noticeable
overhead, it no longer is a bottleneck in trying to achieve a better
guest frequency for which the guest clock is faster than the host one.

As for the the alignement of the 2 clocks, with the first algorithm
the guest clock was oscillating between -1 and 1ms compared to the host clock.
Using the second algorithm we notice that the guest is 5ms behind the host, which
is still acceptable for our use case.

The tests where conducted using fio and stress. The host machine in an i5 CPU at
3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb
built with buildroot.

Currently, on our test machine, the lowest icount we can achieve that is suitable for
aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are
slower than the cpu tests (using stress).

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
a8bfac3708 icount: Add align option to icount
The align option is used for activating the align algorithm
in order to synchronise the host clock and the guest clock.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
1ad9580bd7 icount: Add QemuOpts for icount
Make icount parameter use QemuOpts style options in order
to easily add other suboptions.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Sebastian Tanase
7146839505 icount: Fix virtual clock start value on ARM
When using the icount option on ARM, the virtual
clock starts counting at realtime clock but it
should start at 0.

The reason why the virtual clock starts at realtime clock
is because the first time we call qemu_clock_warp (which
calls icount_warp_rt) in tcg_exec_all, qemu_icount_bias
(which is part of the virtual time computation mechanism)
will increment by realtime - vm_clock_warp_start, with
vm_clock_warp_start being 0 (see icount_warp_rt in cpus.c).

By changing the value of vm_clock_warp_start from 0 to -1,
the first time we call qemu_clock_warp which calls
icount_warp_rt, we will return immediatly because
icount_warp_rt first checks if vm_clock_warp_start is -1
and if it's the case it returns. Therefore, qemu_icount_bias
will first be incremented by the value of a virtual timer
deadline when the virtual cpu goes from active to inactive.

The virtual time will start at 0 and increment based
on the instruction counter when the vcpu is active or
the qemu_icount_bias value when inactive.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
KONRAD Frederic
3f03131390 timer: add cpu_icount_to_ns function.
This adds cpu_icount_to_ns function which is needed for reverse execution.

It returns the time for a specific instruction.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
KONRAD Frederic
d09eae3726 migration: migrate icount fields.
This fixes a bug where qemu_icount and qemu_icount_bias are not migrated.
It adds a subsection "timer/icount" to vmstate_timers so icount is migrated only
when needed.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
KONRAD Frederic
c96778bb84 icount: put icount variables into TimerState.
This puts qemu_icount and qemu_icount_bias into TimerState structure to allow
them to be migrated.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:07 +02:00
Paolo Bonzini
5692399f0a backends: Introduce chr-testdev
From: Paolo Bonzini <pbonzini@redhat.com>

chr-testdev enables a virtio serial channel to be used for guest
initiated qemu exits. hw/misc/debugexit already enables guest
initiated qemu exits, but only for PC targets. chr-testdev supports
any virtio-capable target. kvm-unit-tests/arm is already making use
of this backend.

Currently there is a single command implemented, "q".  It takes a
(prefix) argument for the exit code, thus an exit is implemented by
writing, e.g. "1q", to the virtio-serial port.

It can be used as:
   $QEMU ... \
     -device virtio-serial-device \
     -device virtserialport,chardev=ctd -chardev testdev,id=ctd

or, use:
   $QEMU ... \
     -device virtio-serial-device \
     -device virtconsole,chardev=ctd -chardev testdev,id=ctd

to bind it to virtio-serial port0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-06 17:53:05 +02:00
Alex Williamson
9b3af4c0e4 vfio: Don't cache MSIMessage
Commit 40509f7f added a test to avoid updating KVM MSI routes when the
MSIMessage is unchanged and f4d45d47 switched to relying on this
rather than doing our own comparison.  Our cached msg is effectively
unused now.  Remove it.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-05 13:05:57 -06:00
Alex Williamson
c048be5cc9 vfio: Fix MSI-X vector expansion
When new MSI-X vectors are enabled we need to disable MSI-X and
re-enable it with the correct number of vectors.  That means we need
to reprogram the eventfd triggers for each vector.  Prior to f4d45d47
vector->use tracked whether a vector was masked or unmasked and we
could always pick the KVM path when available for unmasked vectors.
Now vfio doesn't track mask state itself and vector->use and virq
remains configured even for masked vectors.  Therefore we need to ask
the MSI-X code whether a vector is masked in order to select the
correct signaling path.  As noted in the comment, MSI relies on
hardware to handle masking.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # QEMU 2.1
2014-08-05 13:05:52 -06:00
Peter Maydell
69f87f7130 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140804' into staging
target-arm queue:
 * Set PC correctly when loading AArch64 ELF files
 * sdhci: Fix ADMA dma_memory_read access
 * some more foundational work for EL2/EL3 support
 * fix bugs which reveal themselves if the TARGET_PAGE_SIZE
   is not set to 1K

# gpg: Signature made Mon 04 Aug 2014 14:51:34 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140804:
  target-arm: A64: fix TLB flush instructions
  target-arm: don't hardcode mask values in arm_cpu_handle_mmu_fault
  target-arm: Fix bit test in sp_el0_access
  target-arm: Add FAR_EL2 and 3
  target-arm: Add ESR_EL2 and 3
  target-arm: Make far_el1 an array
  target-arm: A64: Respect SPSEL when taking exceptions
  target-arm: A64: Respect SPSEL in ERET SP restore
  target-arm: A64: Break out aarch64_save/restore_sp
  sd: sdhci: Fix ADMA dma_memory_read access
  hw/arm/virt: formatting: memory map
  hw/arm/boot: Set PC correctly when loading AArch64 ELF files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 15:01:38 +01:00
Alex Bennée
dbb1fb277c target-arm: A64: fix TLB flush instructions
According to the ARM ARM we weren't correctly flushing the TLB entries
where bits 63:56 didn't match bit 55 of the virtual address. This
exposed a problem when we switched QEMU's internal TARGET_PAGE_BITS to
12 for aarch64.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1406733627-24255-3-git-send-email-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:56 +01:00
Alex Bennée
dcd82c118c target-arm: don't hardcode mask values in arm_cpu_handle_mmu_fault
Otherwise we break quickly when we change TARGET_PAGE_SIZE.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1406733627-24255-2-git-send-email-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Stefan Weil
cdcf14057d target-arm: Fix bit test in sp_el0_access
Static code analyzers complain about a dubious & operation used for a
boolean value. The code does not test the PSTATE_SP bit as it should.

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1406359601-25583-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Edgar E. Iglesias
63b60551a7 target-arm: Add FAR_EL2 and 3
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1402994746-8328-7-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Edgar E. Iglesias
f2c30f42f5 target-arm: Add ESR_EL2 and 3
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1402994746-8328-6-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:55 +01:00
Edgar E. Iglesias
2f0180c51b target-arm: Make far_el1 an array
No functional change.
Prepares for future additions of the EL2 and 3 versions of this reg.

Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1402994746-8328-5-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Edgar E. Iglesias
f151b123a3 target-arm: A64: Respect SPSEL when taking exceptions
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1402994746-8328-4-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Edgar E. Iglesias
98ea5615ab target-arm: A64: Respect SPSEL in ERET SP restore
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1402994746-8328-3-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Edgar E. Iglesias
9208b9617f target-arm: A64: Break out aarch64_save/restore_sp
Break out code to save/restore AArch64 SP into functions.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1402994746-8328-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Peter Crosthwaite
9db11cef8c sd: sdhci: Fix ADMA dma_memory_read access
This dma_memory_read was giving too big a size when begin was non-zero.
This could cause segfaults in some circumstances. Fix.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:54 +01:00
Andrew Jones
fab4693239 hw/arm/virt: formatting: memory map
Add some spacing and zeros to make it easier to read and
modify the map. This patch has no functional changes. The
review looks ugly, but it's actually pretty easy to confirm
all the addresses are as they should be - thanks to the new
formatting ;-)

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:53 +01:00
Peter Maydell
a9047ec3f6 hw/arm/boot: Set PC correctly when loading AArch64 ELF files
The code in do_cpu_reset() correctly handled AArch64 CPUs
when running Linux kernels, but was missing code in the
branch of the if() that deals with loading ELF files.
Correctly jump to the ELF entry point on reset rather than
leaving the reset PC at zero.

Reported-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Christopher Covington <cov@codeaurora.org>
Cc: qemu-stable@nongnu.org
2014-08-04 14:41:53 +01:00
Peter Maydell
cc11a0623a Merge remote-tracking branch 'remotes/amit-migration/for-2.2' into staging
* remotes/amit-migration/for-2.2:
  checker: ignore fields marked unused
  vmstate static checker: whitelist additions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 14:41:19 +01:00
Peter Maydell
924c09db51 Merge remote-tracking branch 'remotes/amit-virtio-rng/for-2.2' into staging
* remotes/amit-virtio-rng/for-2.2:
  virtio-rng: replace error_set calls with error_setg
  virtio-rng: Move error-checking forward to prevent memory leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 13:07:02 +01:00
Peter Maydell
7b13ff3f15 Merge remote-tracking branch 'remotes/sstabellini/xen-20140801' into staging
* remotes/sstabellini/xen-20140801:
  qemu: support xen hvm direct kernel boot
  tap-bsd: implement a FreeBSD only version of tap_open
  xen: fix usage of ENODATA

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-04 11:17:24 +01:00
Amit Shah
32ce1b4817 checker: ignore fields marked unused
While comparing qemu-1.0 json output with qemu-2.1, a few fields got
marked unused.  These need to be skipped over, and not flagged as
mismatches.

For handling unused fields, the exact number of bytes need to be skipped
over as the size of the unused field.

Currently, only the term "unused" is matched.  When more field names
turn up, this will have to be updated based on the whitelist matching
method to match more such terms.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-08-04 15:02:37 +05:30
John Snow
c617dd3b7e virtio-rng: replace error_set calls with error_setg
Under recommendation from Luiz Capitulino, we are changing
the error_set calls to error_setg while we are fixing up
the error handling pathways of virtio-rng.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-08-04 14:50:11 +05:30
John Snow
1efd6e072c virtio-rng: Move error-checking forward to prevent memory leak
This patch pushes the error-checking forward and the virtio
initialization backward in the device realization function
in order to prevent memory leaks for hot plug scenarios.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-08-04 14:49:53 +05:30
Peter Maydell
c79805802b Open 2.2 development tree
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-01 18:30:08 +01:00
Chunyan Liu
b33a5bbfba qemu: support xen hvm direct kernel boot
qemu side patch to support xen HVM direct kernel boot:
if -kernel exists, calls xen_load_linux(), which will read kernel/initrd
and add a linuxboot.bin or multiboot.bin option rom. The
linuxboot.bin/multiboot.bin will load kernel/initrd and jump to execute
kernel directly. It's working when xen uses seabios.

During this work, found the 'kvmvapic' is in option_rom list, it should
not be there in xen case. Set s->vapic_control = 0 in xen_apic_realize()
to handle that.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-01 15:58:12 +00:00
Roger Pau Monne
8677de2b4d tap-bsd: implement a FreeBSD only version of tap_open
The current behaviour of tap_open for BSD systems differ greatly from
it's Linux counterpart. Since FreeBSD supports interface renaming and
tap device cloning by opening /dev/tap, implement a FreeBSD specific
version of tap_open that behaves like it's Linux counterpart.

This is specially important for toolstacks that use Qemu (like Xen
libxl), in order to have a unified behaviour across suported
platforms.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-01 15:57:48 +00:00
Roger Pau Monne
74bc41511a xen: fix usage of ENODATA
ENODATA doesn't exist on FreeBSD, so ENODATA errors returned by the
hypervisor are translated to ENOENT.

Also, the error code is returned in errno if the call returns -1, so
compare the error code with the value in errno instead of the value
returned by the function.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Anthony Perard <anthony.perard@citrix.com>
2014-08-01 15:57:28 +00:00
Peter Maydell
541bbb07eb Update version for v2.1.0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-01 13:31:29 +01:00
Peter Maydell
d24e780427 Update version for v2.1.0-rc5 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 18:23:34 +01:00
Andrew Jones
1373e140f0 hw/arm/virt: fix pl031 addr typo
pl031's base address should be 0x9010000, not 0x90010000, otherwise
it sits in ram when configuring a guest with greater than 1G.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 17:40:42 +01:00
Paolo Bonzini
33cbb2c546 virtio-scsi: implement parse_cdb
Enable passthrough of vendor-specific commands.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:38 +02:00
Paolo Bonzini
3e7e180ab3 scsi-block, scsi-generic: implement parse_cdb
The callback lets the bus provide the direction and transfer count
for passthrough commands, enabling passthrough of vendor-specific
commands.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:33 +02:00
Paolo Bonzini
592c3b289f scsi-block: extract scsi_block_is_passthrough
This will be used for both scsi_block_new_request and the scsi-block
implementation of parse_cdb.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:29 +02:00
Paolo Bonzini
ff34c32ccc scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
These callbacks will let devices do their own request parsing, or
defer it to the bus.  If the bus does not provide an implementation,
in turn, fall back to the default parsing routine.

Swap the first two arguments to scsi_req_parse, and rename it to
scsi_req_parse_cdb, for consistency.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:25 +02:00
Paolo Bonzini
769998a1db scsi-bus: prepare scsi_req_new for introduction of parse_cdb
The per-SCSIDevice parse_cdb callback must not be called if the
request will go through special SCSIReqOps, so detect the special
cases early enough.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 17:36:09 +02:00
Peter Maydell
8a2ca741ab Update version for v2.1.0-rc4 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 13:45:10 +01:00
Paolo Bonzini
1d80eb7a68 po: update Italian translation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 13:23:33 +01:00
Aurelien Jarno
41892faf89 po: Update French translation
Add new translations for recently added messages.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 13:23:18 +01:00
Peter Maydell
04d1d6613f Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc migration fixes

Last minute fixes for migration.
It seems that if we don't fix it now, fixing
it in the next version will be even more painful ...

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 29 Jul 2014 11:45:18 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  piix: set legacy table size for 1.7
  acpi-build: tweak acpi migration limits
  pc: future-proof migration-compatibility of ACPI tables
  acpi-build: minor code cleanup
  pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled
  bios-tables-test: fix ASL normalization false positive
  pc: hack for migration compatibility from QEMU 2.0
  acpi-dsdt: procedurally generate _PRT

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-29 12:04:02 +01:00
Michael S. Tsirkin
f47337cb91 piix: set legacy table size for 1.7
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Michael S. Tsirkin
868270f23d acpi-build: tweak acpi migration limits
- Tweak error message for legacy machine type:
  Basically if table size exceeds the limits we set all
  bets are off for migration: e.g. it can start failing even
  within given qemu minor version simply because of a bugfix.
- Increase table size to 128k.
- Make sure we notice it long before we start getting close to the
  128k limit: warn at 64k.
- Don't fail if we exceed the limit: most people don't care about
  migration, even less people care about cross version miration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-29 12:26:12 +02:00
Paolo Bonzini
18045fb9f4 pc: future-proof migration-compatibility of ACPI tables
This patch avoids that similar changes break QEMU again in the future.
QEMU will now hard-code 64k as the maximum ACPI table size, which
(despite being an order of magnitude smaller than 640k) should be enough
for everyone.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Michael S. Tsirkin
093a35e5fc acpi-build: minor code cleanup
Fix up and add  comments to clarify code, plus a trivial
code change for clarity.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Igor Mammedov
133a2da488 pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled
Fixes migration regression from QEMU-1.7 to a newer QEMUs.
SSDT table size in QEMU-1.7 doesn't change regardless of
a number of PCI bridge devices present at startup.

However in QEMU-2.0 since addition of hotplug on PCI bridges,
each PCI bridge adds ~1875 bytes to SSDT table, including
pc-i440fx-1.7 machine type where PCI bridge hotplug disabled
via compat property.
It breaks migration from "QEMU-1.7" to "QEMU-2.[01] -M pc-i440fx-1.7"
since RAMBlock size of ACPI tables on target becomes larger
then on source and migration fails with:

"Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000"

error.

Fix this by generating AML only for PCI0 bus if
hotplug on PCI bridges is disabled and preserves PCI brigde
description in AML as it was done in QEMU-1.7 for pc-i440fx-1.7.

It will help to maintain size of SSDT static regardless of
number of PCI bridges on startup for pc-i440fx-1.7 machine type.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-29 12:26:12 +02:00
Paolo Bonzini
cb348985ab bios-tables-test: fix ASL normalization false positive
My version of IASL (from RHEL7) puts two newlines between the head comment
and the DefinitionBlock property.  Kill all newlines after the comment,
so that normalize_asl works properly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-07-29 12:26:12 +02:00
Stefan Weil
41a1a9c42c po: Update German translation
Line numbers changed, and some translations were missing after commit
3d914488ae.

Update also "Show Tabs" to a more common translation, and remove some
old unused lines at the end.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-07-28 23:37:17 +02:00
Dongxue Zhang
62eb3b9a34 target-mips/translate.c: Free TCG in OPC_DINSV
Free t0 and t1 in opcode OPC_DINSV.

Signed-off-by: Dongxue Zhang <elta.era@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-07-28 23:37:15 +02:00
Paolo Bonzini
07fb61760c pc: hack for migration compatibility from QEMU 2.0
Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes on how horribly we were breaking things in
2.0 already.

The ACPI table size is rounded to the next 4k, which one would think
gives some headroom.  In practice this is not the case, because the user
can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
8 to the MADT) and so some "-smp" values will break the 4k boundary and
fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.

This patch concerns itself with fixing migration from QEMU 2.0.  It
computes the payload size of QEMU 2.0 and always uses that one.
The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size
should always be enough; non-AML tables can change depending on the
configuration (especially MADT, SRAT, HPET) but they remain the same
between QEMU 2.0 and 2.1, so we only compute our padding based on the
sizes of the SSDT and DSDT.

Migration from QEMU 1.7 should work for guests that have a number of CPUs
other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
broken from QEMU 1.7 to QEMU 2.0 in the same way, though.

Even with this patch, QEMU 1.7 and 2.0 have two different ideas of
"-M pc-i440fx-2.0" when there are PCI bridges.  Igor sent a patch to
adopt the QEMU 1.7 definition.  I think distributions should apply
it if they move directly from QEMU 1.7 to 2.1+ without ever packaging
version 2.0.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-28 23:02:39 +02:00
Paolo Bonzini
acd727e7cb acpi-dsdt: procedurally generate _PRT
This replaces the _PRT constant with a method that computes it.

The problem is that the DSDT+SSDT have grown from 2.0 to 2.1,
enough to cross the 8k barrier (we align the ACPI tables to 4k
before putting them in fw_cfg).  This causes problems with
migration and the pc-i440fx-2.0 machine type.

The solution to the problem is to hardcode 64k as the limit,
but this doesn't solve the bug with pc-i440fx-2.0.  The fix will be
for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the
ACPI tables.  First, however, we must make the actual AML
equal or smaller; to do this, rewrite _PRT in a way that saves
over 1k of bytecode.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-28 23:02:39 +02:00
Peter Maydell
f45c56e016 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-07-26' into staging
trivial patches for 2014-07-26

# gpg: Signature made Sat 26 Jul 2014 08:16:55 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-07-26:
  qemu-options: fix another allows-to for -net l2tpv3

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-28 11:05:14 +01:00
Michael Tokarev
2f47b403bd qemu-options: fix another allows-to for -net l2tpv3
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-26 11:16:44 +04:00
Peter Maydell
c60a57ff49 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Here is the serial fix for 2.1.

# gpg: Signature made Fri 25 Jul 2014 13:36:23 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  qemu-char: ignore flow control if a PTY's slave is not connected

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-25 16:58:41 +01:00
Paolo Bonzini
62c339c527 qemu-char: ignore flow control if a PTY's slave is not connected
After commit f702e62 (serial: change retry logic to avoid concurrency,
2014-07-11), guest boot hangs if the backend is an unconnected PTY.

The reason is that PTYs do not support G_IO_HUP, and serial_xmit is
never called.  To fix this, simply invoke serial_xmit immediately
(via g_idle_source_new) when this happens.

Tested-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-25 14:36:07 +02:00
Peter Maydell
7f0b2ff724 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140725-1' into staging
vnc: fix two vnc update issues.

# gpg: Signature made Fri 25 Jul 2014 08:44:23 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140725-1:
  vnc update fix
  fix full frame updates for VNC clients

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-25 10:32:13 +01:00
Gerd Hoffmann
6365828003 vnc update fix
We need to remember has_updates for each vnc client.  Otherwise it might
happen that vnc_update_client(has_dirty=1) takes the first exit due to
output buffers not being flushed yet and subsequent calls with
has_dirty=0 take the second exit, wrongly assuming there is nothing to
do because the work defered in the first call is ignored.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-07-25 09:43:31 +02:00
Stephan Kulow
07535a8902 fix full frame updates for VNC clients
If the client asks for !incremental frame updates, it has lost its content
so dirty doesn't matter - it has to see the full frame, so setting force_update

Signed-off-by: Stephan Kulow <coolo@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-07-25 09:42:56 +02:00
Peter Maydell
3b25748663 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  docs: document missing VSERPORT_CHANGE event
  docs: document missing POWERDOWN event
  docs: document missing SPICE_MIGRATE_COMPLETED event
  docs: split SPICE_* event docs
  docs: grammar fixes to qmp-events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-24 15:23:43 +01:00
Eric Blake
032baddea3 docs: document missing VSERPORT_CHANGE event
The VSERPORT_CHANGE event was added in e2ae6159.  The patch for
this event was prepared at a time when this file was gone, even
though it got applied immediately after dfab4892 restored this
file.  Duplicate the documentation into this file, so that
anyone using this file instead of qapi will not miss out on this
new event.

* docs/qmp/qmp-events.txt (VSERPORT_CHANGE): Add.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 10:00:33 -04:00
Eric Blake
db52658b38 docs: document missing POWERDOWN event
The POWERDOWN event was first documented in 0aab9ec3.  But since
dfab4892 later restored this file to the state prior to qmp events,
and we never documented it in the past, anyone using this file
instead of qapi will miss out on this event.  Tweak the existing
wording of SHUTDOWN to match 84321831, and make the difference
between the two events apparent.

* docs/qmp/qmp-events.txt (POWERDOWN): Add.
(SHUTDOWN): Tweak.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 10:00:33 -04:00
Eric Blake
5e255004f5 docs: document missing SPICE_MIGRATE_COMPLETED event
The SPICE_MIGRATE_COMPLETED event was first documented in
7cfadb6b.  But since dfab4892 later restored this file to the
state prior to qmp events, and we never documented it in the
past, anyone using this file instead of qapi will miss out on
this event.

* docs/qmp/qmp-events.txt (SPICE_MIGRATE_COMPLETED): Add.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 10:00:15 -04:00
Eric Blake
f8ecd94501 docs: split SPICE_* event docs
For consistency with the rest of this file, every event should be
listed in isolation.  Compare how commit 7cfadb6b split
SPICE_CONNECTED and SPICE_DISCONNECTED into separate qmp events.

* docs/qmp/qmp-events.txt (SPICE_CONNECTED, SPICE_DISCONNECTED):
Split.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 09:59:20 -04:00
Eric Blake
1454ac68af docs: grammar fixes to qmp-events
When converting to qmp events, commits 7cfadb6b and a6330785
fixed some grammar as part of moving text between files.  But
since dfab4892 later restored this file to the state prior to
qmp events, we have to do it again.

* docs/qmp/qmp-events.txt (RESET, SPICE_INITIALIZED): Tweak.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-07-24 09:58:51 -04:00
Peter Maydell
a537d373b9 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140723-1' into staging
usb: mtp: tag root property as experimental

# gpg: Signature made Wed 23 Jul 2014 07:56:21 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140723-1:
  usb: mtp: tag root property as experimental

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-24 12:49:54 +01:00
Gerd Hoffmann
cf679caf91 usb: mtp: tag root property as experimental
Reason: we don't want commit to that interface yet.  Possibly
the implementation will be switched over to use fsdev.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-23 08:55:40 +02:00
Peter Maydell
f368c33d5a Update version for v2.1.0-rc3 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 18:17:03 +01:00
Peter Maydell
ef493d5c29 hw/misc/imx_ccm.c: Add missing VMState list terminator
The VMStateDescription for the imx_ccm device was missing its
terminator. Found by static search of the codebase using
a regex based on one suggested by Ian Jackson:
  pcregrep -rMi '(?s)VMStateField(?:(?!END_OF_LIST).)*?;' $(git grep -l 'VMStateField\[\]')

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
2014-07-22 17:53:36 +01:00
Laszlo Ersek
3afca1d6d4 vmstate_xhci_event: fix unterminated field list
"vmstate_xhci_event" was introduced in commit 37352df3 ("xhci: add live
migration support"), and first released in v1.6.0. The field list in this
VMSD is not terminated with the VMSTATE_END_OF_LIST() macro.

During normal use (ie. migration), the issue is practically invisible,
because the "vmstate_xhci_event" object (with the unterminated field list)
is only ever referenced -- via "vmstate_xhci_intr" -- if xhci_er_full()
returns true, for the "ev_buffer" test. Since that field_exists() check
(apparently) almost always returns false, we almost never traverse
"vmstate_xhci_event" during migration, which hides the bug.

However, Amit's vmstate checker forces recursion into this VMSD as well,
and the lack of VMSTATE_END_OF_LIST() breaks the field list terminator
check (field->name != NULL) in dump_vmstate_vmsd(). The result is
undefined behavior, which in my case translates to infinite recursion
(because the loop happens to overflow into "vmstate_xhci_intr", which then
links back to "vmstate_xhci_event").

Add the missing terminator.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 17:34:24 +01:00
Peter Maydell
3a18d44983 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-07-22

Only a single bug fix to make -mem-path only affect RAM regions.

# gpg: Signature made Tue 22 Jul 2014 16:38:04 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream:
  ppc: fix -mem-path failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 16:40:34 +01:00
Hu Tao
e206ad4833 ppc: fix -mem-path failure
commit e938ba0c tried to enable -mem-path for ppc but breaked some ppc
boards.

The problems are:

1. it fails when allocating memory for rom, sram whose sizes are less
   than huge page size:

   ./ppc-softmmu/qemu-system-ppc  -m 512 -mem-path /hugepages/ \
   -kernel /home/hutao/Downloads/vmlinux-ppc -initrd \
   /home/hutao/Downloads/initrd-ppc.gz
   qemu-system-ppc: /mnt/data/projects/qemu/exec.c:1184: qemu_ram_set_idstr: Assertion `new_block' failed.

2. if there is a numa node backed by memory backend object, qemu fails
   with message:

   ./ppc-softmmu/qemu-system-ppc  -m 512 \
   -object memory-backend-file,size=512M,mem-path=/hugepages,id=f0 \
   -numa node,nodeid=0,memdev=f0 \
   -kernel /home/hutao/Downloads/vmlinux-ppc \
   -initrd /home/hutao/Downloads/initrd-ppc.gz
   qemu-system-ppc: memory backend f0 is used multiple times. Each -numa option must use a different memdev value.

This patch does following:

1. replaces memory_region_allocate_system_memory() with
   memory_region_init_ram() for rom, sram. Then only system memory
   is backed by hugepages when specifying mem-path.

2. for memory banks, allocates all ram with
   one memory_region_allocate_system_memory(), and use
   memory_region_init_alias() to initialize memory banks.

Tested machines: default(g3beige), mac99, taihu, bamboo, ref405ep.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-22 17:37:25 +02:00
Peter Maydell
b64c670f1d Merge remote-tracking branch 'remotes/amit-virtio-rng/for-2.1' into staging
* remotes/amit-virtio-rng/for-2.1:
  virtio-rng: Add human-readable error message for negative max-bytes parameter

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 13:16:04 +01:00
John Snow
713e8a1022 virtio-rng: Add human-readable error message for negative max-bytes parameter
If a negative integer is used for the max_bytes parameter, QEMU currently
calls abort() and leaves behind a core dump. This patch replaces the
abort with a simple error message to make the reason for the termination
clearer. This also ensures device-hotplug with invalid input doesn't
cause qemu to quit.

There is an underlying insufficiency in the parameter parsing code of QEMU
that renders it unable to reject negative values for unsigned properties,
thus the error message "a non-negative integer below 2^63" is the most
user-friendly and correct message we can give until the underlying
insufficiency is corrected.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-07-22 17:18:55 +05:30
Amit Shah
bb9c3636d9 vmstate static checker: whitelist additions
Comparing json outputs from qemu-1.0 with qemu-2.1 turned up a few
description name changes; whitelist them here.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-07-22 17:06:54 +05:30
Peter Maydell
25af8e6b61 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
One of the two pending migration fix, and a small KVM patch.

# gpg: Signature made Tue 22 Jul 2014 11:49:30 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found

* remotes/bonzini/tags/for-upstream:
  kvm-all: Use 'tmpcpu' instead of 'cpu' in sub-looping to avoid 'cpu' be NULL
  exec: fix migration with devices that use address_space_rw

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-22 12:03:45 +01:00
Chen Gang
dc54e25253 kvm-all: Use 'tmpcpu' instead of 'cpu' in sub-looping to avoid 'cpu' be NULL
If kvm_arch_remove_sw_breakpoint() in CPU_FOREACH() always be fail, it
will let 'cpu' NULL. And the next kvm_arch_remove_sw_breakpoint() in
QTAILQ_FOREACH_SAFE() will get NULL parameter for 'cpu'.

And kvm_arch_remove_sw_breakpoint() can assumes 'cpu' must never be NULL,
so need define additional temporary variable for 'cpu' to avoid the case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-22 12:38:17 +02:00
Paolo Bonzini
6886867e98 exec: fix migration with devices that use address_space_rw
Devices that use address_space_rw to write large areas to memory
(as opposed to address_space_map/unmap) were broken with respect
to migration since fe680d0 (exec: Limit translation limiting in
address_space_translate to xen, 2014-05-07).  Such devices include
IDE CD-ROMs.

The reason is that invalidate_and_set_dirty (called by address_space_rw
but not address_space_map/unmap) was only setting the dirty bit for
the first page in the translation.

To fix this, introduce cpu_physical_memory_set_dirty_range_nocode that
is the same as cpu_physical_memory_set_dirty_range except it does not
muck with the DIRTY_MEMORY_CODE bitmap.  This function can be used if
the caller invalidates translations with tb_invalidate_phys_page_range.

There is another difference between cpu_physical_memory_set_dirty_range
and cpu_physical_memory_set_dirty_flag; the former includes a call
to xen_modified_memory.  This is handled separately in
invalidate_and_set_dirty, and is not needed in other callers of
cpu_physical_memory_set_dirty_range_nocode, so leave it alone.

Just one nit: now that invalidate_and_set_dirty takes care of handling
multiple pages, there is no need for address_space_unmap to wrap it
in a loop.  In fact that loop would now be O(n^2).

Reported-by: Dave Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-22 10:38:50 +02:00
Peter Maydell
35858955e6 Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-2.1' into staging
QOM and device refactorings

* Machine: Property name fixups for 2.1 ABI

# gpg: Signature made Mon 21 Jul 2014 18:00:23 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-2.1:
  machine: Replace underscores in machine's property names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-21 18:06:12 +01:00
Marcel Apfelbaum
b0ddb8bf6b machine: Replace underscores in machine's property names
Replaced '_' with '-' to comply with QOM guidelines.
Made the conversion from command line to QMP in vl.c.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-07-21 18:58:36 +02:00
Peter Maydell
147fc41973 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-07-18' into staging
trivial patches for 2014-07-18

# gpg: Signature made Fri 18 Jul 2014 15:04:43 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-07-18:
  tests: Add missing 'static' attributes (fix warnings from smatch)
  migration: Add missing 'static' attribute
  qga: Add missing 'static' attribute
  hw/usb: Add missing 'static' attribute
  doc: slirp supports ICMP echo if enabled in Linux
  qemu-img: Remove redundancy "ret = -1"
  Fix new typos in comments (found by codespell)
  slirp: Give error message if hostfwd_add/remove for unrecognized vlan/stack

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 16:59:29 +01:00
Peter Maydell
50a2c45da9 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Andreas's fixes to --enable-modules, two 2.1 regression fixes, and a
new qtest.  Michael sent a pull request of his own, so I dropped
the vhost changes.

# gpg: Signature made Fri 18 Jul 2014 14:30:34 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found

* remotes/bonzini/tags/for-upstream:
  Revert "kvmclock: Ensure time in migration never goes backward"
  Revert "kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation"
  module: Don't complain when a module is absent
  module: Simplify module_load()
  qtest: new test for wdt_ib700
  target-i386: Allow execute from user mode when SMEP is enabled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 14:46:53 +01:00
Stefan Weil
748bfb4eee tests: Add missing 'static' attributes (fix warnings from smatch)
Smatch also complains about 0 used for pointers, so replace those by
NULL in test-visitor-serialization.c, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
7a46d042e0 migration: Add missing 'static' attribute
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
13a439ec40 qga: Add missing 'static' attribute
This fixes a warning from the static code analysis (smatch).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
b9b45b4a88 hw/usb: Add missing 'static' attribute
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Gernot Hillier
37cbfcce14 doc: slirp supports ICMP echo if enabled in Linux
Since QEMU 0.15, slirp (user mode networking) supports ping to the
Internet, see e6d43cfb1f

Signed-off-by: Gernot Hillier <gernot.hillier@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Chen Gang
b847ae2d60 qemu-img: Remove redundancy "ret = -1"
In this case, 'ret' is already '-1', so need not do it again.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Stefan Weil
a9dd38db68 Fix new typos in comments (found by codespell)
arbitary -> arbitrary
basicly -> basically

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:36 +04:00
Peter Maydell
b739ef05db slirp: Give error message if hostfwd_add/remove for unrecognized vlan/stack
If the user specified a (vlan ID, slirp stack name) tuple in a monitor
hostfwd_add/remove command and we can't find it, give the user an
error message rather than silently doing nothing.

This brings this error case in slirp_lookup() into line with the
other two.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:36 +04:00
Paolo Bonzini
fa666c10f2 Revert "kvmclock: Ensure time in migration never goes backward"
This reverts commit a096b3a673.

This patch caused a hang that was fixed by commit 9b17868 (kvmclock:
Ensure proper env->tsc value for kvmclock_current_nsec calculation,
2014-06-03), and we just had to revert that commit.  Drop this one
too.

Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 15:28:03 +02:00
Paolo Bonzini
108e4c3871 Revert "kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation"
This reverts commit 9b1786829a.

This patch fixed a hang introduced by commit a096b3a (kvmclock: Ensure
time in migration never goes backward, 2014-05-16), but it causes
a regression in migration whose cause is not quite clear.

Because of this, I'm choosing to revert both patches.  This trades a
2.1 regression for a bug that's been there forever.

Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 15:15:14 +02:00
Andreas Färber
bb2eb1892d module: Don't complain when a module is absent
The current implementation depends on a configure-time generated list of
block modules. When any of them is absent, module_load() emits a warning.

This is suboptimal because extracting code to modules was mainly done to
allow separate packaging of modules with intrusive dependencies. Absence
of optional packages then leads to absence of modules and an error
message, which users may recognize as new and report as error.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 14:57:35 +02:00
Andreas Färber
f9e13f8fd8 module: Simplify module_load()
The file path is not used for error reporting, so we can free it
directly after use.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 14:57:35 +02:00
Paolo Bonzini
f52b768782 qtest: new test for wdt_ib700
Since the "pause" watchdog action had a regression and it went
unnoticed for a while, let's add a test for it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 14:57:35 +02:00
Peter Maydell
e0097ea371 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 18 Jul 2014 13:39:43 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  qemu-iotests: fix 028 failure due to disk image path
  raw-posix: Fail gracefully if no working alignment is found
  block: Add Error argument to bdrv_refresh_limits()
  qcow2: Fix error path for unknown incompatible features

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 13:47:22 +01:00
Stefan Hajnoczi
8283c5c316 qemu-iotests: fix 028 failure due to disk image path
The disk image path is echoed by QEMU's readline when the "drive_backup
disk ${TEST_IMG}.copy" HMP command is issued.  Unfortunately it is very
hard to filter out the path due to readline's character-by-character
output (with terminal escape sequences).  Just redirect this command to
/dev/null for now.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2014-07-18 13:27:11 +01:00
Kevin Wolf
df26a35025 raw-posix: Fail gracefully if no working alignment is found
If qemu couldn't find out what O_DIRECT alignment to use with a given
file, it would run into assert(bdrv_opt_mem_align(bs) != 0); in block.c
and confuse users. This adds a more descriptive error message for such
cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf
3baca89139 block: Add Error argument to bdrv_refresh_limits()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf
12ac6d3db7 qcow2: Fix error path for unknown incompatible features
qcow2's report_unsupported_feature() had two bugs: A 32 bit truncation
would prevent feature table entries for bits 32-63 from being used, and
it could assign errp multiple times if there was more than one unknown
feature, resulting in an error_set() assertion failure.

Fix the truncation, make sure to set the error exactly once and add a
qemu-iotests case for it.

This fixes https://bugs.launchpad.net/qemu/+bug/1342704/

Reported-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:12:15 +01:00
Peter Maydell
4d121a5498 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,vhost,test fixes

Minor bugfixes all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 18 Jul 2014 00:43:04 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  vhost-user: minor cleanups
  qtest: Adapt vhost-user-test to latest vhost-user changes
  vhost-user: Fix VHOST_SET_MEM_TABLE processing
  qtest: fix vhost-user-test compilation with old GLib
  fix typo: apci -> acpi
  pc_piix: Reuse pc_compat_1_2() for pc-0.1[0123]
  pc: fix qemu exiting with error when -m X < 128 with old machines types

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-18 09:35:51 +01:00
Michael S. Tsirkin
cd98639f67 vhost-user: minor cleanups
assert to verify cast does not discard information
minor style fixup.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-18 02:22:24 +03:00
Nikolay Nikolaev
d6970e3b00 qtest: Adapt vhost-user-test to latest vhost-user changes
A new field mmap_offset was added in the vhost-user message, we need to reflect
this change in the test too.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-18 02:14:15 +03:00
Nikolay Nikolaev
f69a28051f vhost-user: Fix VHOST_SET_MEM_TABLE processing
qemu_get_ram_fd doesn't accept a guest physical address. ram_addr_t are
opaque values that are assigned in qemu_ram_alloc.

Find the ram_addr_t corresponding to the userspace_addr using qemu_ram_addr_from_host,
and then call qemu_get_ram_fd on it.

Thanks to Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 02:14:15 +03:00
Igor Mammedov
5734d031aa pc: fix qemu exiting with error when -m X < 128 with old machine types
If machine doesn't support memory hotplug then starting QEMU
with initial memory less than default will make QEMU exit with
following error message:

$QEMU -m 16  -M isapc
qemu-system-i386: "-memory 'slots|maxmem'" is not supported by: isapc

Set maxram_size to initial memory value before parsing
'maxmem' option allows to keep maxmem in sync with initial
memory size if no maxmem option was specified.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: Bruce Rogers <brogers@suse.com>
Reviewed-By: Bruce Rogers <brogers@suse.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 17:45:45 +01:00
KONRAD Frederic
af52fe862f cadence_uart: check for serial backend before using it.
This checks that s->chr is not NULL before using it.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 16:36:17 +01:00
Peter Maydell
231f6927c8 Merge remote-tracking branch 'remotes/amit-migration/for-2.1' into staging
* remotes/amit-migration/for-2.1:
  vmstate static checker: detect section renames

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 12:17:28 +01:00
Peter Maydell
104369c8c7 Merge remote-tracking branch 'remotes/amit/for-2.1' into staging
* remotes/amit/for-2.1:
  virtio-serial-bus: keep port 0 reserved for virtconsole even on unplug

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-17 11:18:51 +01:00
Amit Shah
57d84cf353 virtio-serial-bus: keep port 0 reserved for virtconsole even on unplug
We keep port 0 reserved for compat with older guests, where only
virtio-console was expected.  Even if a system is started without a
virtio-console port, port #0 is kept aside.  However, after a
virtconsole port is unplugged, port id 0 became available, and the next
hotplug of a virtserialport caused failure due to it not being a console
port.

Steps to reproduce:

$ ./x86_64-softmmu/qemu-system-x86_64 -m 512 -cpu host -enable-kvm -device virtio-serial-pci -monitor stdio  -vnc :1
QEMU 2.0.91 monitor - type 'help' for more information
(qemu) device_add virtconsole,id=p1
(qemu) device_del p1
(qemu) device_add virtserialport,id=p1
Port number 0 on virtio-serial devices reserved for virtconsole devices for backward compatibility.
Device 'virtserialport' could not be initialized
(qemu) quit

Reported-by: dengmin <mdeng@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-07-16 14:32:40 +05:30
Amit Shah
79fe16c048 vmstate static checker: detect section renames
Commit 292b1634 changed the section name of "ICH9 LPC" to "ICH9-LPC",
and that causes the static checker to flag this:

Section "ICH9 LPC" does not exist in dest

This patch introduces a function that checks for section renames and
also a dictionary that maps those renames.

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>

---
This is a small patch to a script; doesn't break qemu and helps with the
static checker, so it's a very low-risk patch for 2.1.
2014-07-16 14:29:34 +05:30
Peter Maydell
5a73480450 Update version for v2.1.0-rc2 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 18:55:37 +01:00
Peter Maydell
82172b7519 tests/Makefile: Only run vhost-user-test on Linux
vhost-user-test uses the linux/vhost.h header, so it must only be
enabled if CONFIG_LINUX is defined. (Previously it was enabled
for CONFIG_POSIX, which broke 'make check' on MacOSX.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 18:36:10 +01:00
Ricky Zhou
b4bda1ae57 target-i386: Allow execute from user mode when SMEP is enabled.
Previously, execute would be disabled for all pages with SMEP enabled,
regardless of what mode the access took place in.

Signed-off-by: Ricky Zhou <ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-15 18:43:14 +02:00
Peter Maydell
cbb46f5f49 Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream:
  linux-user: use TARGET_SA_ONSTACK in get_sigframe
  alloca one extra byte sockets
  linux-user: handle AF_PACKET sockaddrs in target_to_host_sockaddr
  qemu-user: Impl. setsockopt(SO_BINDTODEVICE)
  SIOCGIFINDEX: fix typo

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 16:49:28 +01:00
Peter Maydell
146ae00192 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-07-15

Some more bug fixes during the RC phase:

  - Fix huge page mapping regressions
  - Fix Book3S thread number enumeration
  - Fix Book3S VFIO permission issue

# gpg: Signature made Tue 15 Jul 2014 15:13:54 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream:
  sPAPR/IOMMU: Fix TCE entry permission
  spapr: Enable use of huge pages
  spapr: Move RMA memory region registration code
  ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory
  target-ppc: Fix number of threads per core limit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 15:51:12 +01:00
Gavin Shan
27e27782f7 sPAPR/IOMMU: Fix TCE entry permission
The permission of TCE entry should exclude physical base address.
Otherwise, unmapping TCE entry can be interpreted to mapping TCE
entry wrongly for VFIO devices.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy
f92f5da108 spapr: Enable use of huge pages
0b183fc87 "memory: move mem_path handling to
memory_region_allocate_system_memory" disabled -mempath use for all
machines that do not use memory_region_allocate_system_memory() to
register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
support was disabled for it.

This replaces memory_region_init_ram()+vmstate_register_ram_global() with
memory_region_allocate_system_memory() to get huge pages back.

This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as
the previous patch moved RMA memory region allocation after RAM allocation
and therefore this change does not have immediate effect but simplifies
the code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy
658fa66b81 spapr: Move RMA memory region registration code
PPC970 does not support VRMA (virtual RMA) so real memory required
for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl.
Later this memory is used as a part of the guest RAM area.
The RMA allocating code also registers a memory region for this piece
of RAM.

We are going to simplify memory regions layout: RMA memory region
will be a subregion in the RAM memory region, both starting from zero.
This way we will not have to take care of start address alignment for
the piece of RAM next to the RMA.

This moves memory region business closer to the RAM memory region
creation/allocation code.

As this is a mechanical patch, no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on non-kvm systems]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Shreyas B. Prabhu
e938ba0c35 ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory
Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:58 +02:00
Alexey Kardashevskiy
063cac5326 target-ppc: Fix number of threads per core limit
The number of threads per core is different for POWER6/7/8 CPUs.
Guest systems do not expect to see more threads per core than
a specific CPU supports so we need to limit this number.
This limit is implemented by ppc_get_compat_smt_threads().

However it has a problem as it checks for PCR (Processor Compatibility
Register) mask, 2.05 means 2 threads per core, 2.06 - 4 threads.
For POWER8 one would expect PCR_COMPAT_2_07 bit set and
ppc_get_compat_smt_threads() checking for it to return 8 threads
per core. But the latest PowerISA spec now is 2.07 and there is
no 2.07 compatibility mode defined, QEMU does not define it either
(will be in PowerISA 2.08).

Instead of relying on a PCR mask, this uses kvmppc_smt_threads()
which returns the maximum supported threads number for KVM or
1 for TCG.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:58 +02:00
Riku Voipio
b545f63fa9 linux-user: use TARGET_SA_ONSTACK in get_sigframe
As reported by Laurent, which should use TARGET_SA_ONSTACK
on arm, microblaze and openrisc targets like we do on all
others. Practical matter is minimal as for almost all archs
SA_ONSTACK is 0x08000000:

http://lxr.free-electrons.com/ident?i=SA_ONSTACK

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 17:08:41 +03:00
Peter Maydell
2c65ebe646 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Tue 15 Jul 2014 14:49:01 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  virtio-blk: dataplane: notify guest as a batch
  virtio-blk: data-plane: fix save/set .complete_request in start
  linux-aio: Fix laio resource leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 15:06:17 +01:00
Ming Lei
5b2ffbe4d9 virtio-blk: dataplane: notify guest as a batch
Now requests are submitted as a batch, so it is natural
to notify guest as a batch too.

This may suppress interrupt notification to VM a lot:

        - in my test, decreased by ~13K/sec

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Ming Lei
e926d9b8c5 virtio-blk: data-plane: fix save/set .complete_request in start
The callback has to be saved and reset in virtio_blk_data_plane_start(),
otherwise dataplane's requests will be completed in qemu aio context.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Gonglei
a1abf40d6b linux-aio: Fix laio resource leak
when hotplug virtio-scsi disks using laio, the aio_nr will
increase in laio_init() by io_setup(), we can see the number by
  # cat /proc/sys/fs/aio-nr
  128
if the aio_nr attach the maxnum, which found from
  # cat /proc/sys/fs/aio-max-nr
  65536
the hotplug process will fail because of aio context leak.

Fix it by io_destroy in laio_cleanup().

Reported-by: daifulai <daifulai@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Joakim Tjernlund
2dd08dfd9a alloca one extra byte sockets
target_to_host_sockaddr() may increase the lenth with 1 byte
for AF_UNIX sockets so allocate 1 extra byte.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:28:36 +03:00
Joakim Tjernlund
33a29b51c9 linux-user: handle AF_PACKET sockaddrs in target_to_host_sockaddr
Implement conversion of the AF_PACKET sockaddr subtype
in target_to_host_sockaddr.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:28:25 +03:00
Joakim Tjernlund
451aaf688c qemu-user: Impl. setsockopt(SO_BINDTODEVICE)
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:28:20 +03:00
Joakim Tjernlund
27a07827c4 SIOCGIFINDEX: fix typo
Wrong type was used in ioctl definition.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-07-15 16:26:31 +03:00
Andreas Färber
0e16297461 libqos: Fix PC PCI endianness glitches
The libqos implementation of io_read{b,w,l} and io_write{b,w,l} hooks
was relying on qtest_mem{read,write}() respectively. With d81d410 (usb:
improve ehci/uhci test) this resulted in assertion failures on ppc hosts:

 ERROR:tests/usb-hcd-ehci-test.c:78:ehci_port_test: assertion failed: ((value & mask) == (expect & mask))

 ERROR:tests/usb-hcd-ehci-test.c:128:pci_uhci_port_2: assertion failed: (pcibus != NULL)

 ERROR:tests/usb-hcd-ehci-test.c:150:pci_ehci_port_2: assertion failed: (pcibus != NULL)

qtest_read{b,w,l,q}() and qtest_write{b,w,l,q}() had been introduced
as endian-safe replacement for qtest_mem{read,write}() in I2C in
872536b (qtest: Add MMIO support). Use them for PCI as well.

Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: c4efe1c qtest: add libqos including PCI support
Fixes: d81d410 usb: improve ehci/uhci test
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-15 14:18:15 +01:00
Peter Maydell
0a9934eef1 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Misc 2.1 fixes regarding character/serial devices and SCSI.

# gpg: Signature made Mon 14 Jul 2014 16:26:08 BST using RSA key ID 9B4D86F2
# gpg: Can't check signature: public key not found

* remotes/bonzini/tags/for-upstream:
  serial-pci: remove memory regions from BAR before destroying them
  virtio-scsi: fix with -M pc-i440fx-2.0
  serial: change retry logic to avoid concurrency
  qemu-char: fix deadlock with "-monitor pty"
  scsi: Report error when lun number is in use

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-14 17:01:45 +01:00
Paolo Bonzini
7497bce6c2 serial-pci: remove memory regions from BAR before destroying them
Otherwise, hot-unplug of pci-serial-2x trips the assertion
in memory_region_destroy:

    (qemu) device_del gg
    (qemu) qemu-system-x86_64: /work/armbru/tmp/qemu/memory.c:1021: memory_region_destroy: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
    Aborted (core dumped)

Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:14:15 +02:00
Paolo Bonzini
1f4e6a069b virtio-scsi: fix with -M pc-i440fx-2.0
Right now starting a machine with virtio-scsi and a <= 2.0 machine type
fails with:

    qemu-system-x86_64: -device virtio-scsi-pci: Property .any_layout not found

This is because the any_layout bit was actually never set after
virtio-scsi was changed to support arbitrary layout for virtio buffers.

(This was just a cleanup and a preparation for virtio 1.0; no guest
actually checks the bit, but the new request parsing algorithms are
tested even with old guest).

Reported-by: David Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:14:15 +02:00
Kirill Batuzov
f702e62a19 serial: change retry logic to avoid concurrency
Whenever serial_xmit fails to transmit a byte it adds a watch that would
call it again when the "line" becomes ready. This results in a retry
chain:
  serial_xmit -> add_watch -> serial_xmit
Each chain is able to transmit one character, and for every character
passed to serial by the guest driver a new chain is spawned.

The problem lays with the fact that a new chain is spawned even when
there is one already waiting on the watch. So there can be several retry
chains waiting concurrently on one "line". Every chain tries to transmit
current character, so character order is not messed up. But also every
chain increases retry counter (tsr_retry). If there are enough
concurrent chains this counter will hit MAX_XMIT_RETRY value and
the character will be dropped.

To reproduce this bug you need to feed serial output to some program
consuming it slowly enough. A python script from bug #1335444
description is an example of such program.

This commit changes retry logic in the following way to avoid
concurrency: instead of spawning a new chain for each character being
transmitted spawn only one and make it transmit characters until FIFO is
empty.

The change consists of two parts:
 - add a do {} while () loop in serial_xmit (diff is a bit erratic
   for this part, diff -w will show actual change),
 - do not call serial_xmit from serial_ioport_write if there is one
   waiting on the watch already.

This should fix another issue causing bug #1335444.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:14:14 +02:00
Paolo Bonzini
7b3621f47a qemu-char: fix deadlock with "-monitor pty"
qemu_chr_be_generic_open cannot be called with the write lock taken,
because it calls client code that may call qemu_chr_fe_write.  This
actually happens for the monitor:

    0x00007ffff27dbf79 in __GI_raise (sig=sig@entry=6)
    0x00007ffff27df388 in __GI_abort ()
    0x00005555555ef489 in error_exit (err=<optimized out>, msg=msg@entry=0x5555559796d0 <__func__.5959> "qemu_mutex_lock")
    0x00005555558f9080 in qemu_mutex_lock (mutex=mutex@entry=0x555556248a30)
    0x0000555555713936 in qemu_chr_fe_write (s=0x555556248a30, buf=buf@entry=0x5555563d8870 "QEMU 2.0.90 monitor - type 'help' for more information\r\n", len=56)
    0x00005555556217fd in monitor_flush_locked (mon=mon@entry=0x555556251fd0)
    0x0000555555621a12 in monitor_flush_locked (mon=0x555556251fd0)
    monitor_puts (mon=mon@entry=0x555556251fd0, str=0x55555634bfa7 "", str@entry=0x55555634bf70 "QEMU 2.0.90 monitor - type 'help' for more information\n")
    0x0000555555624359 in monitor_vprintf (mon=0x555556251fd0, fmt=<optimized out>, ap=<optimized out>)
    0x0000555555624414 in monitor_printf (mon=<optimized out>, fmt=fmt@entry=0x5555559105a0 "QEMU %s monitor - type 'help' for more information\n")
    0x0000555555629806 in monitor_event (opaque=0x555556251fd0, event=<optimized out>)
    0x000055555571343c in qemu_chr_be_generic_open (s=0x555556248a30)

To avoid this, defer the call to an idle callback, which will be
called as soon as the main loop is re-entered.  In order to simplify
the cleanup and do it in one place only, change pty_chr_close to
call pty_chr_state.

To reproduce, run with "-monitor pty", then try to read from the
slave /dev/pts/FOO that it creates.

Fixes: 9005b2a758
Reported-by: Li Liang <liangx.z.li@intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 16:13:58 +02:00
Peter Maydell
7a6d04e73f Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.1.0-rc2 (v2)

# gpg: Signature made Mon 14 Jul 2014 11:04:12 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (22 commits)
  ide: Treat read/write beyond end as invalid
  virtio-blk: Treat read/write beyond end as invalid
  virtio-blk: Bypass error action and I/O accounting on invalid r/w
  virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
  dma-helpers: Fix too long qiov
  qtest: fix vhost-user-test compilation with old GLib
  tests: Fix unterminated string output visitor enum human string
  AioContext: do not rely on aio_poll(ctx, true) result to end a loop
  virtio-blk: embed VirtQueueElement in VirtIOBlockReq
  virtio-blk: avoid g_slice_new0() for VirtIOBlockReq and VirtQueueElement
  dataplane: do not free VirtQueueElement in vring_push()
  virtio-blk: avoid dataplane VirtIOBlockReq early free
  block: Assert qiov length matches request length
  qed: Make qiov match request size until backing file EOF
  qcow2: Make qiov match request size until backing file EOF
  block: Make qiov match the request size until EOF
  AioContext: speed up aio_notify
  test-aio: fix GSource-based timer test
  block: drop aio functions that operate on the main AioContext
  block: prefer aio_poll to qemu_aio_wait
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-14 13:09:29 +01:00
Peter Maydell
c15a34eda0 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140714' into staging
A s390x/kvm bugfix for missing floating point register synchronization.

# gpg: Signature made Mon 14 Jul 2014 08:21:54 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140714:
  s390x/kvm: synchronize guest floating point registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-14 11:04:11 +01:00
Markus Armbruster
58ac321135 ide: Treat read/write beyond end as invalid
The block layer fails such reads and writes just fine.  However, they
then get treated like valid operations that fail: the error action
gets executed.  Unwanted; reporting the error to the guest is the only
sensible action.

Reject them before passing them to the block layer.  This bypasses the
error action and I/O accounting.  Not quite correct for DMA, because
DMA can fail after some success, and when that happens, the part that
succeeded isn't counted.  Tolerable, because I/O accounting is an
inconsistent mess anyway.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Markus Armbruster
3c2daac0b9 virtio-blk: Treat read/write beyond end as invalid
The block layer fails such reads and writes just fine.  However, they
then get treated like valid operations that fail: the error action
gets executed.  Unwanted; reporting the error to the guest is the only
sensible action.

Reject them before passing them to the block layer.  This bypasses the
error action and I/O accounting.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Markus Armbruster
42e38c1fd0 virtio-blk: Bypass error action and I/O accounting on invalid r/w
When a device model's I/O operation fails, we execute the error
action.  This lets layers above QEMU implement thin provisioning, or
attempt to correct errors before they reach the guest.  But when the
I/O operation fails because it's invalid, reporting the error to the
guest is the only sensible action.

If the guest's read or write asks for an invalid sector range, fail
the request right away, without considering the error action.  No
change with error action BDRV_ACTION_REPORT.

Furthermore, bypass I/O accounting, because we want to track only I/O
that actually reaches the block layer.

The next commit will extend "invalid sector range" to cover attempts
to read/write beyond the end of the medium.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Markus Armbruster
d0e14376ee virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Kevin Wolf
58f423fbd5 dma-helpers: Fix too long qiov
If the size of the scatter/gather list isn't a multiple of 512, the
number of sectors for the block layer request is rounded down, resulting
in a qiov that doesn't match the request length. Truncate the qiov to the
new length of the request.

This fixes the IDE qtest case /x86_64/ide/bmdma/short_prdt.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-14 12:03:21 +02:00
Nikolay Nikolaev
80504dcaa1 qtest: fix vhost-user-test compilation with old GLib
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:21 +02:00
Andreas Färber
b8864245b1 tests: Fix unterminated string output visitor enum human string
The buffer was being allocated of size string length plus two.
Around the string two quotes were being added, but no terminating NUL.
It was then compared using g_assert_cmpstr(), resulting in fairly random
assertion failures:

 ERROR:tests/test-string-output-visitor.c:213:test_visitor_out_enum: assertion failed (str == str_human): ("\"value1\"" == "\"value1\"\001EEEEEEEEEEEEEE\0171")

There is no g_assert_cmpnstr() counterpart, so use g_strdup_printf()
for safely assembling the string in the first place.

Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Fixes: b4900c0 tests: add human format test for string output visitor
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Paolo Bonzini
acfb23ad3d AioContext: do not rely on aio_poll(ctx, true) result to end a loop
Currently, whenever aio_poll(ctx, true) has completed all pending
work it returns true *and* the next call to aio_poll(ctx, true)
will not block.

This invariant has its roots in qemu_aio_flush()'s implementation
as "while (qemu_aio_wait()) {}".  However, qemu_aio_flush() does
not exist anymore and bdrv_drain_all() is implemented differently;
and this invariant is complicated to maintain and subtly different
from the return value of GMainLoop's g_main_context_iteration.

All calls to aio_poll(ctx, true) except one are guarded by a
while() loop checking for a request to be incomplete, or a
BlockDriverState to be idle.  The one remaining call (in
iothread.c) uses this to delay the aio_context_release/acquire
pair until the AioContext is quiescent, however:

- we can do the same just by using non-blocking aio_poll,
  similar to how vl.c invokes main_loop_wait

- it is buggy, because it does not ensure that the AioContext
  is released between an aio_notify and the next time the
  iothread goes to sleep.  This leads to hangs when stopping
  the dataplane thread.

In the end, these semantics are a bad match for the current
users of AioContext.  So modify that one exception in iothread.c,
which also fixes the hangs, as well as the testcase so that
it use the same idiom as the actual QEMU code.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
f897bf751f virtio-blk: embed VirtQueueElement in VirtIOBlockReq
The memory allocation between hw/block/virtio-blk.c,
hw/block/dataplane/virtio-blk.c, and hw/virtio/dataplane/vring.c is
messy.  Structs are allocated in different files than they are freed in.
This is risky and makes memory leaks easier.

Embed VirtQueueElement in VirtIOBlockReq to reduce the amount of memory
allocation we need to juggle.  This also makes vring.c and virtio.c
slightly more similar.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
869d66af53 virtio-blk: avoid g_slice_new0() for VirtIOBlockReq and VirtQueueElement
In commit de6c8042ec ("virtio-blk: Avoid
zeroing every request structure") we avoided the 40 KB memset when
allocating VirtIOBlockReq.

The memset was reintroduced in commit
671ec3f056 ("virtio-blk: Convert
VirtIOBlockReq.elem to pointer").

It must be fixed again to avoid a performance regression.

Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
abd764250f dataplane: do not free VirtQueueElement in vring_push()
VirtQueueElement is allocated in vring_pop() so it seems to make sense
that vring_push() should free it.  Alas, virtio-blk frees
VirtQueueElement itself in virtio_blk_free_request().

This patch solves a double-free assertion in glib's g_slice_free().

Rename vring_free_element() to vring_unmap_element() since it no longer
frees the VirtQueueElement.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Stefan Hajnoczi
0a21ea3289 virtio-blk: avoid dataplane VirtIOBlockReq early free
VirtIOBlockReq is freed later by virtio_blk_free_request() in
hw/block/virtio-blk.c.  Remove this extraneous g_slice_free().

This patch fixes the following segfault:

  0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
  99          bdrv_acct_done(req->dev->bs, &req->acct);
  (gdb) print req
  $1 = (VirtIOBlockReq *) 0x5555565ff5e0
  (gdb) print req->dev
  $2 = (VirtIOBlock *) 0x0
  (gdb) bt
  #0  0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
  #1  0x0000555555840ebe in bdrv_co_em_bh (opaque=0x5555566152d0) at block.c:4675
  #2  0x000055555583de77 in aio_bh_poll (ctx=ctx@entry=0x5555563a8150) at async.c:81
  #3  0x000055555584b7a7 in aio_poll (ctx=0x5555563a8150, blocking=blocking@entry=true) at aio-posix.c:188
  #4  0x00005555556e520e in iothread_run (opaque=0x5555563a7fd8) at iothread.c:41
  #5  0x00007ffff42ba124 in start_thread () from /usr/lib/libpthread.so.0
  #6  0x00007ffff16d14bd in clone () from /usr/lib/libc.so.6

Reported-by: Max Reitz <mreitz@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
8eb029c26e block: Assert qiov length matches request length
At least raw-posix relies on this because it can allocate bounce buffers
based on the request length, but access it using all of the qiov entries
later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
f06ee3d4aa qed: Make qiov match request size until backing file EOF
If a QED image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
44deba5a52 qcow2: Make qiov match request size until backing file EOF
If a qcow2 image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf
33f461e0c5 block: Make qiov match the request size until EOF
If a read request goes across EOF, the block driver sees a shortened
request that stops at EOF (the rest is memsetted in block.c), however
the original qiov was used for this request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Fam Zheng
2039511b8f scsi: Report error when lun number is in use
In the case that the lun number is taken by another scsi device, don't
release the existing device siliently, but report an error to user.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-14 11:54:57 +02:00
Jason J. Herne
85ad6230b3 s390x/kvm: synchronize guest floating point registers
Add code to kvm_arch_get_registers and kvm_arch_put_registers to
save/restore floating point registers. This missing sync was
unnoticed until migration of userspace that uses fprs.

Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[Update patch to latest upstream]
Cc: qemu-stable@nongnu.org
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-14 09:15:38 +02:00
Nikolay Nikolaev
0e3cd8334a qtest: fix vhost-user-test compilation with old GLib
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-14 00:42:54 +03:00
Hu Tao
75902802c2 fix typo: apci -> acpi
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: rebase
2014-07-11 21:31:55 +03:00
Eduardo Habkost
faab459797 pc_piix: Reuse pc_compat_1_2() for pc-0.1[0123]
pc-0.13 and older were missing some compat code that was present on
newer machine-types:

* x86_cpu_compat_disable_kvm_features(FEAT_1_ECX, CPUID_EXT_X2APIC);
  (pc-i440fx-1.7 and older)
  (added by commit ef02ef5f45)
* x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE);
  (pc-i440fx-1.4 and older)
  (added by commit 4458c23672
* x86_cpu_compat_set_features("Westmere", FEAT_1_ECX, 0, CPUID_EXT_PCLMULQDQ);
  (pc-i440fx-1.4 and older)
  (added by commit 56383703c0)

Instead of duplicating the code from the previous pc_compat_*()
functions, we can now reuse pc_compat_1_2() and fix those issues.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-11 21:13:00 +03:00
Igor Mammedov
4ec6ee5ace pc: fix qemu exiting with error when -m X < 128 with old machines types
If machine doesn't support memory hotplug then staring QEMU
with initial memory less than default will make QEMU exit with
following error message:

$QEMU -m 16  -M isapc
qemu-system-i386: "-memory 'slots|maxmem'" is not supported by: isapc

Set maxram_size to initial memory value before parsing
'maxmem' option allows to keep maxmem in sync with initial
memory size if no maxmem option was specified.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: Bruce Rogers <brogers@suse.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-11 21:05:14 +03:00
Peter Maydell
ab6d3749c4 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20140711-1' into staging
vga: some cirrus fixes.

# gpg: Signature made Fri 11 Jul 2014 10:38:32 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20140711-1:
  cirrus: Fix host CPU blits
  cirrus: Fix build of debug code
  cirrus_vga: adding sanity check for vram size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 17:50:38 +01:00
Peter Maydell
aee230d707 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140711-1' into staging
mtp: linux guest detection fix

# gpg: Signature made Fri 11 Jul 2014 11:32:20 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140711-1:
  mtp: linux guest detection fix.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 16:01:38 +01:00
Peter Maydell
42ca32f776 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140711-1' into staging
spice: auth fixes

# gpg: Signature made Fri 11 Jul 2014 10:17:15 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140711-1:
  spice: auth fixes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 14:50:18 +01:00
Peter Maydell
22df3452dc Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20140711-1' into staging
ui/gtk: Restore keyboard focus after Page change

# gpg: Signature made Fri 11 Jul 2014 09:46:21 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20140711-1:
  ui/gtk: Restore keyboard focus after Page change

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-11 13:48:07 +01:00
Gerd Hoffmann
13d54125a3 mtp: linux guest detection fix.
Attach a name to the MTP interface (android phones have this too).

With this patch recent linux guests such as fedora 20 happily detect and
use the device.  It shows up in nautilus file manager automatically, and
simple-mtpfs can mount it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 12:31:41 +02:00
John Snow
e72b59fa93 ui/gtk: Restore keyboard focus after Page change
(Resending for correct email addresses via MAINTAINERS ...)

In the GTK UI, after changing focus to the qemu monitor Notebook Page,
when restoring focus to the virtual machine page, the keyboard focus is lost
to a hidden GTK widget. Focus can only be restored to the virtual machine by
pressing "tab" or any of the four directional arrow keys.

Clicking in the window or grabbing/ungrabbing input does not restore keyboard
focus to the child widget.

This patch adjusts the Notebook page switching callback to automatically
steal keyboard focus on the Page switch event, so that keyboard input
does not appear to break or disappear after tabbing to the QEMU monitor.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:44:00 +02:00
Benjamin Herrenschmidt
d16136d22a cirrus: Fix host CPU blits
Commit b2eb849d4b
"CVE-2007-1320 - Cirrus LGD-54XX "bitblt" heap overflow" broke
cpu to video blits.

When the ROP function is called from cirrus_bitblt_cputovideo_next(),
we pass 0 for the pitch but only operate on one line at a time. The
added test was tripping because after the initial substraction, the
pitch becomes negative. Make the test only trip when the height is
larger than one (ie. the pitch is actually used).

This fixes HW cursor support in Windows NT4.0 (which otherwise was
a white rectangle) and general display of icons in that OS when using
8bpp mode.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:17:02 +02:00
Benjamin Herrenschmidt
e8ee4b68be cirrus: Fix build of debug code
Use PRIu64 to print uint64_t

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:17:01 +02:00
Gonglei
f61d82c2df cirrus_vga: adding sanity check for vram size
when configure a invalid vram size for cirrus card, such as less
2 MB, which will crash qemu. Follow the real hardware, the cirrus
card has 4 MB video memory. Also for backward compatibility, accept
8 MB and 16 MB vram size.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:17:01 +02:00
Gerd Hoffmann
b1ea7b79e1 spice: auth fixes
Set auth to sasl when sasl is enabled, this makes "info spice" correctly
display sasl auth.  Also throw an error in case someone tries to set
a spice password via monitor without auth mode being "spice".

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-11 10:12:47 +02:00
Peter Maydell
74aeb37de0 Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  qtest: fix vhost-user-test compilation with old GLib
  mc146818rtc: register the clock reset notifier on the right clock
  oslib-posix: Fix new compiler error with -Wclobbered
  target-i386: Add "kvmclock-stable-bit" feature bit name
  Enforce stack protector usage
  watchdog: fix deadlock with -watchdog-action pause
  mips_malta: Catch kernels linked at wrong address
  mips_malta: Remove incorrect KVM T&E references
  mips/kvm: Disable FPU on reset with KVM
  mips/kvm: Init EBase to correct KSEG0

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-10 17:37:16 +01:00
Nikolay Nikolaev
0a58991a5f qtest: fix vhost-user-test compilation with old GLib
Mising G_TIME_SPAN_SECOND definition breaks the RHEL6 compilation as GLib
version before 2.26 does not have it. In such case just define it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Paolo Bonzini
13c0cbaec5 mc146818rtc: register the clock reset notifier on the right clock
Commit 884f17c (aio / timers: Convert rtc_clock to be a QEMUClockType,
2013-08-21) erroneously changed an occurrence of rtc_clock to
QEMU_CLOCK_REALTIME, which broke the RTC reset notifier in
mc146818rtc.  Fix this.

I redid the patch myself since the original reporter did not sign
off on his.

Cc: qemu-stable@nongnu.org
Reported-by: Lb peace <peaceustc@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Stefan Weil
b7bf8f5657 oslib-posix: Fix new compiler error with -Wclobbered
Newer versions of gcc report a warning (or an error with -Werror) when
compiler option -Wclobbered (or -Wextra) is active:

util/oslib-posix.c:372:12: error:
 variable ‘hpagesize’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]

The rewritten code fixes this warning: variable 'hpagesize' is now set and
used in a block without any call of sigsetjmp or similar functions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Eduardo Habkost
8248c36a5d target-i386: Add "kvmclock-stable-bit" feature bit name
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is enabled by default and supported
by KVM. But not having a name defined makes QEMU treat it as an unknown
and unmigratable feature flag (as any unknown feature may possibly
require state to be migrated), and disable it by default on "-cpu host".

As a side-effect, the new name also makes the flag configurable,
allowing the user to disable it (which may be useful for testing or for
compatibility with old kernels).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:33 +02:00
Miroslav Rezanina
3b463a3fa8 Enforce stack protector usage
If --enable-stack-protector is used is used, configure script try to use
--fstack-protector-strong. In case it's not supported, --fstack-protector-all
is enabled. If both protectors are not supported, configure does not use
any protector at all without any notification.

This patch reports error when user requests stack protector to be used and
both protector modes are not supported. Behavior is not changed in case
user do not use any of --enable-stack-protector/--disable-stack-protector.

Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
[Fix non-POSIX operator in test. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-10 17:06:29 +02:00
Andreas Färber
9e99c5fd70 tests: Fix unterminated string output visitor enum human string
The buffer was being allocated of size string length plus two.
Around the string two quotes were being added, but no terminating NUL.
It was then compared using g_assert_cmpstr(), resulting in fairly random
assertion failures:

 ERROR:tests/test-string-output-visitor.c:213:test_visitor_out_enum: assertion failed (str == str_human): ("\"value1\"" == "\"value1\"\001EEEEEEEEEEEEEE\0171")

There is no g_assert_cmpnstr() counterpart, so use g_strdup_printf()
for safely assembling the string in the first place.

Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Fixes: b4900c0 tests: add human format test for string output visitor
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-10 11:53:14 +01:00
Paolo Bonzini
30e5210a70 watchdog: fix deadlock with -watchdog-action pause
qemu_clock_enable says:

/* Disabling the clock will wait for related timerlists to stop
 * executing qemu_run_timers.  Thus, this functions should not
 * be used from the callback of a timer that is based on @clock.
 * Doing so would cause a deadlock.
 */

and it indeed does: vm_stop uses qemu_clock_enable on QEMU_CLOCK_VIRTUAL
and watchdogs are based on QEMU_CLOCK_VIRTUAL, and we get a deadlock.

Use qemu_system_vmstop_request_prepare()/qemu_system_vmstop_request()
instead; yet another alternative could be a BH.

I checked other occurrences of vm_stop and they should not have this
problem.  RUN_STATE_IO_ERROR could in principle (it depends on the
code in the drivers) but it has been fixed by commit 2bd3bce, "block:
asynchronously stop the VM on I/O errors", 2014-06-05.

Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:08 +02:00
James Hogan
f7f152458e mips_malta: Catch kernels linked at wrong address
Add error reporting if the wrong type of kernel is provided for the
current mode of acceleration.

Currently a KVM kernel linked at 0x40000000 can't be used with TCG, and
a normal kernel linked at 0x80000000 can't be used with KVM.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:08 +02:00
James Hogan
fbdb1d9555 mips_malta: Remove incorrect KVM T&E references
Fix the error message and code comments relating to KVM not supporting
booting from the flash mapping when no kernel is provided. The issue is
a general MIPS KVM issue and isn't specific to the Trap & Emulate
version of MIPS KVM.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:08 +02:00
James Hogan
0e928b12c9 mips/kvm: Disable FPU on reset with KVM
KVM doesn't yet support the MIPS FPU, or writing to the guest's Config1
register which contains the FPU implemented bit. Clear QEMU's version of
that bit on reset and display a warning that the FPU has been disabled.

The previous incorrect Config1 CP0 register value wasn't being passed to
KVM yet, however we should ensure it is set correctly now to reduce the
risk of breaking migration/loadvm to a future version of QEMU/Linux that
does support it.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:17:04 +02:00
Paolo Bonzini
0ceb849bd3 AioContext: speed up aio_notify
In many cases, the call to event_notifier_set in aio_notify is unnecessary.
In particular, if we are executing aio_dispatch, or if aio_poll is not
blocking, we know that we will soon get to the next loop iteration (if
necessary); the thread that hosts the AioContext's event loop does not
need any nudging.

The patch includes a Promela formal model that shows that this really
works and does not need any further complication such as generation
counts.  It needs a memory barrier though.

The generation counts are not needed because any change to
ctx->dispatching after the memory barrier is okay for aio_notify.
If it changes from zero to one, it is the right thing to skip
event_notifier_set.  If it changes from one to zero, the
event_notifier_set is unnecessary but harmless.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Paolo Bonzini
ef508f427b test-aio: fix GSource-based timer test
The current test depends too much on the implementation of the AioContext
GSource.  Just iterate on the main loop until the callback has been invoked
the right number of times.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Paolo Bonzini
87f68d3182 block: drop aio functions that operate on the main AioContext
The main AioContext should be accessed explicitly via qemu_get_aio_context().
Most of the time, using it is not the right thing to do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Paolo Bonzini
b47ec2c456 block: prefer aio_poll to qemu_aio_wait
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Kevin Wolf
01fb2705bd block: Fix bdrv_is_allocated() return value
bdrv_is_allocated() should return either 0 or 1 in successful cases.
We're lucky that currently, the callers that rely on this (e.g. because
they check for ret == 1) don't seem to break badly. They just might skip
some optimisation or in the case of qemu-io 'map' print separate lines
where a single line would suffice. In theory, a wrong allocation status
could lead to image corruption with certain operations, so let's fix
this quickly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-09 15:50:11 +02:00
Kevin Wolf
d40593dd90 block/backup: Fix hang for unaligned image size
When doing a block backup of an image with an unaligned size (with
respect to the BACKUP_CLUSTER_SIZE), qemu would check the allocation
status of sectors after the end of the image. bdrv_is_allocated()
returns a result that is valid for 0 sectors in this case, so the backup
job ran into an endless loop.

Stop looping when seeing a result valid for 0 sectors, we're at EOF then.

The test case looks somewhat unrelated at first sight because I
originally tried to reproduce a different suspected bug that turned out
to not exist. Still a good test case and it accidentally found this one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-09 15:50:11 +02:00
Peter Maydell
675879f6f3 Update version for v2.1.0-rc1 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 16:53:59 +01:00
Peter Maydell
b653282ecc hw/ppc/spapr_hcall.c: Add ULL suffix to 64 bit constant
Add ULL suffix to 64 bit constant to prevent compiler warnings
on some 32 bit platforms.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 16:03:19 +01:00
Peter Maydell
d614cb68da Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140708' into staging
Bugfixes for s390x: set subsystem id in the lowcore when booting from the
s390-ccw bios, and set the channel-program address after I/O completion,
when applicable.

# gpg: Signature made Tue 08 Jul 2014 14:18:20 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140708:
  s390x/css: reflect cpa in scsw
  pc-bios/s390-ccw: update binary
  pc-bios/s390-ccw: store proper subsystem information word

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 15:10:42 +01:00
Cornelia Huck
2ed982b6a9 s390x/css: reflect cpa in scsw
We neglected to update the the channel-program-address field of the scsw
after completion of the start or the halt function: Fortunately, Linux
didn't miss it so far. Let's update it for the cases where the cpa is
expected to be valid; in some cases, the cpa is 'unpredictable', so we
leave it untouched.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-08 15:08:03 +02:00
Cornelia Huck
32a02d070b pc-bios/s390-ccw: update binary
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-08 15:08:03 +02:00
Christian Borntraeger
f2879a5c9e pc-bios/s390-ccw: store proper subsystem information word
POP chapter 17 requires to store a subsystem information word at 184
during IPL. Furthermore bytes 188-191 should be zero. The bootmap might
contain data blocks that are written to the first page. We have to
write these values after we processed the bootmap and before the final
IPL.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-08 15:08:03 +02:00
Peter Maydell
67d01fb806 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140708' into staging
target-arm queue:
 * fix handling of KVM reset for 32-bit ARM CPUs
 * implement NOR flash alias for vexpress-a9
 * make sure libvixl gets its own utils.h rather than somebody else's

# gpg: Signature made Tue 08 Jul 2014 13:12:05 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140708:
  target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs
  hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9
  disas/libvixl: prepend the include path of libvixl header files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 14:01:58 +01:00
Peter Maydell
75c9a1a047 target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs
Implement kvm_arm_vcpu_init() as a simple call to arm_arm_vcpu_init()
(which uses the KVM_ARM_VCPU_INIT vcpu ioctl to tell the kernel
to re-initialize the vCPU), rather than via the complicated code
which saves a copy of the register state on first init and then
writes it back to the kernel. This is much simpler and brings the
32-bit KVM code into line with the 64-bit code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403802973-20841-1-git-send-email-peter.maydell@linaro.org
2014-07-08 13:05:11 +01:00
Peter Maydell
6ec1588e09 hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9
Make the vexpress-a9 board alias the first NOR flash region at
address zero, like vexpress-a15. This makes "-bios" actually usable
on this board.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1404310070-3561-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
2014-07-08 13:05:10 +01:00
Stefano Stabellini
834fb1b269 disas/libvixl: prepend the include path of libvixl header files
Currently the Makefile of disas/libvixl appends
-I$(SRC_PATH)/disas/libvixl to QEMU_CFLAGS. As a consequence C++ files
that #include "utils.h", such as disas/libvixl/a64/instructions-a64.cc,
are going to look for utils.h on all the other include paths first.

When building QEMU as part of the Xen make system, another unrelated
utils.h file is going to be chosen for inclusion, causing a build
failure:

In file included from disas/libvixl/a64/instructions-a64.cc:27:0:
/qemu/disas/libvixl/a64/instructions-a64.h:88:64: error:
'rawbits_to_float' was not declared in this scope
 const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);

Fix the problem by prepending (rather than appending) the libvixl
include path to QEMU_CFLAGS.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 12:45:57 +01:00
Peter Maydell
eaa4980185 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-07-08

A few bug fixes for 2.1:

  - Fix e500* TLB emulation with qemu-system-ppc
  - Update SLOF to current upstream (good number of bugfixes)
  - Make POWER7 / POWER8 PVR match more agnostic (needed in 2.1 for cmdline compat)
  - Fix u-boot.e500 install (how did that happen?)
  - Fix H_CAS on LE hosts
  - ppc64le-linux-user fixes

# gpg: Signature made Tue 08 Jul 2014 11:18:58 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream:
  PPC: e500: Actually install u-boot.e500
  target-ppc: Remove POWER7+ and POWER8E families
  target-ppc: Add pvr_match() callback
  pseries: Update SLOF firmware image to qemu-slof-20140630
  PPC: Fix booke206 TLB with phys addrs > 32bit
  target-ppc: Fix gdbstub for ppc64le-linux-user
  target-ppc: Change default cpu for ppc64le-linux-user
  target-ppc: KVMPPC_H_CAS fix cpu-version endianess

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 11:38:12 +01:00
Cole Robinson
0c6ab8c988 PPC: e500: Actually install u-boot.e500
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:37 +02:00
Alexey Kardashevskiy
b60c60070c target-ppc: Remove POWER7+ and POWER8E families
POWER8E is architecturally equal to POWER8 and POWER7+ is equal to
POWER7. Also no user space tool makes any difference for CPU node name
in the device tree (such as PowerPC,POWER7@0 vs. PowerPC,POWER7+@0).
So there is no point in emulating POWER7+ and POWER8E apart from POWER7
and POWER8. Also, the previos patch implemented multiple PVR mask support
per CPU class so POWER7 class now covers both POWER7 and POWER7+ CPUs,
same is valid for POWER8/8E.

This removes POWER7+ and POWER8E classes. This replaces references
to POWER7P/POWER8E families with POWER7/POWER8 families.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexey Kardashevskiy
03ae4133ab target-ppc: Add pvr_match() callback
So far it was enough to have a base PVR value and mask per CPU
family such as POWER7 or POWER8. However there CPUs which are
completely architecturally compatible but have different PVRs such
as POWER7/POWER7+ and POWER8/POWER8E. For these CPUs, top 16 bits
are CPU family and low 16 bits are the version. The families have
PVR base values different enough so defining a mask which
would cover both (or potentially more) CPUs within the family is
not possible.

This adds a pvr_match() callback to PowerPCCPUClass. The default
handler simply compares PVR defined in the class.

This implements ppc_pvr_match_power7/ppc_pvr_match_power8 callbacks
for POWER7/8 families. These check for POWER7/POWER7+ and POWER8/POWER8E.

This changes ppc_cpu_compare_class_pvr_mask() not to check masks but
use the pvr_match() callback.

Since all server CPUs use the same mask, this defines one mask
value - CPU_POWERPC_POWER_SERVER_MASK - which is used everywhere now.
This removes other mask definitions.

This removes pvr_mask from PowerPCCPUClass as it is not used anymore.
This removes pvr initialization for POWER7/8 families as it is not used
to find the class, the pvr_match() callback is used instead.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexey Kardashevskiy
d6c23f8a1b pseries: Update SLOF firmware image to qemu-slof-20140630
The changelog is:
  > Quieten the grub warning
  > Add boot menu support
  > boot from disk having chrp-boot file
  > fat16: fix read and remove debug messages
  > dhcparch define missing in compilation
  > pci-scan: reserve memory for pci-bridge without devices
  > pci-bridge: Fix ranges when no device beyond the bridge
  > Set dhcp arch in board-qemu config file
  > xhci: fix controller stop
  > dhcp: support client architecture code 93
  > virtio-blk: support variable block size
  > usb: use common pci dma alloc/mapping routines
  > Remove unused SLOF code
  > pci-bridge: generic bridge needs to support pci dma functions
  > pci: extract dma functions as separate file
  > e1000: fix usage of multiple nics

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexander Graf
da89a1cf92 PPC: Fix booke206 TLB with phys addrs > 32bit
We were truncating physical addresses to 32bit when using qemu-system-ppc
with a booke206 TLB implementation. This patch fixes that and makes the full
address space available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Richard Henderson
be5c9ddabc target-ppc: Fix gdbstub for ppc64le-linux-user
The bswap that's needed for system mode isn't required for
user mode, and in fact breaks debugging.

Signed-off-by: Richard Henderson <rth@twiddle.net>
[agraf: fix apple gdbstub implementation]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Richard Henderson
a74029f6cb target-ppc: Change default cpu for ppc64le-linux-user
The default, 970fx, doesn't support MSR_LE.  So even though we set LE in
ppc_cpu_reset, it gets cleared again in hreg_store_msr.  Error out if a
user-selected cpu model doesn't support LE.

Signed-off-by: Richard Henderson <rth@twiddle.net>
[agraf: switch to POWER7 as default for BE and LE]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Laurent Dufour
4bce526ec4 target-ppc: KVMPPC_H_CAS fix cpu-version endianess
During KVMPPC_H_CAS processing, the cpu-version updated value is stored
without taking care of the current endianess. As a consequence, the guest
may not switch to the right CPU model, leading to unexpected results.

If needed, the value is now converted.

Fixes: 6d9412ea81 ("target-ppc: Implement "compat" CPU option")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Peter Maydell
128f0e6614 Merge remote-tracking branch 'remotes/afaerber/tags/prep-for-2.1' into staging
PowerPC Reference Platform (PReP)

* Update OpenHack'Ware firmware to replace QEMU-side workarounds

# gpg: Signature made Mon 07 Jul 2014 15:49:42 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/prep-for-2.1:
  prep: Update ppc_rom.bin
  prep: Remove CPU reset entry point hack related to OpenHack'Ware
  prep: Remove PCI memory hack related to OpenHack'Ware

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 19:06:55 +01:00
Peter Maydell
c6ea9b73b1 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,vhost,virtio fixes, test

Bugfixes all over the place.

There's a  non bugfix here: re-enabling the vhost-user test,
though the patch just brings back functionality that
I disabled earlier to fix mingw build failures.
This is now sorted, and keeping the unit test enabled
seems important since the feature relies on an external
server to work, so isn't easy to test.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 06 Jul 2014 11:01:35 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  qemu-char: add chr_add_watch support in mux chardev
  virtio-pci: fix MSI memory region use after free
  qdev: Fix crash when using non-device class name on -global
  qdev: Don't abort() in case globals can't be set
  hw/virtio: enable common virtio feature for mmio device
  acpi: fix typo in memory hotplug MMIO region name
  pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()
  Handle G_IO_HUP in tcp_chr_read for tcp chardev
  virtio: move common virtio properties to bus class device
  pc-dimm: error out if memory hotplug is not enabled
  numa: check for busy memory backend
  qtest: enable vhost-user-test

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 16:30:14 +01:00
Andreas Färber
ee0f2601b9 prep: Update ppc_rom.bin
This replaces QEMU-side workarounds for PCI BARs and CPU reset.

Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Hervé Poussineau
56de2e5269 prep: Remove CPU reset entry point hack related to OpenHack'Ware
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Hervé Poussineau
97db046678 prep: Remove PCI memory hack related to OpenHack'Ware
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Peter Maydell
9540d1f8d9 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Mon 07 Jul 2014 13:27:20 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  qmp: show QOM properties in device-list-properties
  dataplane: submit I/O as a batch
  linux-aio: implement io plug, unplug and flush io queue
  block: block: introduce APIs for submitting IO as a batch
  ahci: map memory via device's address space instead of address_space_memory
  raw-posix: Fix raw_getlength() to always return -errno on error
  qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin
  ahci.c: mask unused flags when reading size PRDT DBC
  MAINTAINERS: add Stefan Hajnoczi to IDE maintainers
  mirror: Fix qiov size for short requests
  Fix nocow typos in manpage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 15:02:36 +01:00
Peter Maydell
f811d4743b Merge remote-tracking branch 'remotes/sstabellini/xen_arm_20140707' into staging
* remotes/sstabellini/xen_arm_20140707:
  xen: build on ARM
  xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 13:43:03 +01:00
Stefano Stabellini
643f593224 xen: build on ARM
Collection of fixes to build QEMU with Xen support on ARM:
- use xenstore_read_fe_uint64 to retrieve the page-ref (xenfb);
- use xen_pfn_t instead of unsigned long in xenfb;
- unsigned long/xenpfn_t in xen_remove_from_physmap;
- in xen-mapcache.c use HOST_LONG_BITS to check for QEMU's address space
size.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 10:37:40 +00:00
Stefano Stabellini
4aba9eb138 xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-07 10:37:40 +00:00
Stefan Hajnoczi
f4eb32b590 qmp: show QOM properties in device-list-properties
Devices can use a mix of qdev and QOM properties.  Currently only the
qdev properties are displayed by device-list-properties.

This patch extends the property enumeration algorithm to also display
QOM properties (excluding the implicit "type", "realized",
"hotpluggable", and "parent_bus" properties).

When a qdev property exists, use the qdev type name to preserve
backwards compatibility.  QOM type names can be different for bool (qdev
on/off) and str (used by qdev pointers).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:10:05 +02:00
Ming Lei
dd67c1d7e7 dataplane: submit I/O as a batch
Before commit 580b6b2aa2(dataplane: use the QEMU block
layer for I/O), dataplane for virtio-blk submits block
I/O as a batch.

This commit 580b6b2aa2 replaces the custom linux AIO
implementation(including submit I/O as a batch) with QEMU
block layer, but this commit causes ~40% throughput regression
on virtio-blk performance, and removing submitting I/O
as a batch is one of the causes.

This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to support submitting I/O
at batch for Qemu block layer, and in my test, the change
can improve throughput by ~30% with 'aio=native'.

Following my fio test script:

	[global]
	direct=1
	size=4G
	bsrange=4k-4k
	timeout=40
	numjobs=4
	ioengine=libaio
	iodepth=64
	filename=/dev/vdc
	group_reporting=1

	[f]
	rw=randread

Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores):
	- qemu master: 65K IOPS
	- qemu master with these patches: 92K IOPS
	- 2.0.0 release(dataplane using custom linux aio): 104K IOPS

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Ming Lei
1b3abdcccf linux-aio: implement io plug, unplug and flush io queue
This patch implements .bdrv_io_plug, .bdrv_io_unplug and
.bdrv_flush_io_queue callbacks for linux-aio Block Drivers,
so that submitting I/O as a batch can be supported on linux-aio.

[Unprocessed requests are completed with -EIO instead of a bogus ret
value.
--Stefan]

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Ming Lei
448ad91db4 block: block: introduce APIs for submitting IO as a batch
This patch introduces three APIs so that following
patches can support queuing I/O requests and submitting them
as a batch for improving I/O performance.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Le Tan
5a18e67dfd ahci: map memory via device's address space instead of address_space_memory
In map_page() in hw/ide/ahci.c, replace cpu_physical_memory_map() and
cpu_physical_memory_unmap() with dma_memory_map() and dma_memory_unmap(),
because ahci devices should not access memory directly but via their address
space. Add an AddressSpace parameter to map_page(). In order to call
map_page(), we should pass the AHCIState.as as the AddressSpace argument.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 10:22:43 +02:00
Markus Armbruster
aa729704f4 raw-posix: Fix raw_getlength() to always return -errno on error
We got a merry mix of -1 and -errno here.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:41:29 +02:00
Benoît Canet
a42a1facb7 qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin
This avoid breaking tests on RHEL6 where gnutls is too old for quorum to be
built by default.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Reza Jelveh
d02f8adc6d ahci.c: mask unused flags when reading size PRDT DBC
The data byte count(DBC) read from the description information is defined for
bits 21:00. Bits 30:22 are reserved and bit 31 is the Interrupt on Completion
(I) flag.

Completion interrupts are triggered after every transaction instead of on
I-flag in QEMU. tbl_entry_size is a signed integer and improperly reading the
DBC leads to a negative offset that causes sglist allocation to fail.

Signed-off-by: Reza Jelveh <reza.jelveh@tuhh.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Stefan Hajnoczi
37253e1ec8 MAINTAINERS: add Stefan Hajnoczi to IDE maintainers
Make Stefan officially co-maintain hw/ide/ with Kevin.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
2014-07-07 09:15:29 +02:00
Kevin Wolf
5a0f6fd5c8 mirror: Fix qiov size for short requests
When mirroring an image of a size that is not a multiple of the
mirror job granularity, the last request would have the right nb_sectors
argument, but a qiov that is rounded up to the next multiple of the
granularity. Don't do this.

This fixes a segfault that is caused by raw-posix being confused by this
and allocating a buffer with request length, but operating on it with
qiov length.

[s/Driver/Drive/ in qemu-iotests 041 as suggested by Eric
--Stefan]

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Chunyan Liu
bc3a7f90ff Fix nocow typos in manpage
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Kirill Batuzov
3f0838ab85 qemu-char: add chr_add_watch support in mux chardev
Forward chr_add_watch call from mux chardev to underlying
implementation.

This should fix bug #1335444

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Paolo Bonzini
8b81bb3b06 virtio-pci: fix MSI memory region use after free
After memory region QOMification QEMU is stricter in detecting
wrong usage of the memory region API.  Here it detected a
memory_region_destroy done before the corresponding
memory_region_del_subregion; the memory_region_destroy is
done by msix_uninit_exclusive_bar, the memory_region_del_subregion
is done by the PCI core's pci_unregister_io_regions before
pc->exit is called.

The problem was introduced by
commit 06a1307379
    virtio-pci: add device_unplugged callback
As noted in that commit log, virtio device kick callbacks need to be
stopped before generic virtio is cleaned up. This is because these are
notifications from pci proxy to the generic virtio device so they need
to be stopped in the unplug call before the virtio device is unrealized.
However interrupts are notifications from the virtio device to
the pci proxy so they need to stay around while the device
is realized.

The memory API misuse caused an assertion when hot-unplugging virtio
devices.  Using the API correctly fixes the assertion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Eduardo Habkost
dd98b71f48 qdev: Fix crash when using non-device class name on -global
This fixes the following crash:

    $ qemu-system-x86_64 -global container.xxx=y
    hw/core/qdev-properties-system.c:399:qdev_add_one_global: Object 0x7f7eff234100 is not an instance of type device
    Aborted (core dumped)

New behavior will be to just warn, just like when non-existing clas
names are used:

    $ qemu-system-x86_64 -global container.xxx=y
    qemu-system-x86_64: Warning: "-global container.xxx=y" not used

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Don Slutz <dslutz@verizon.com>
2014-07-06 09:13:54 +03:00
Eduardo Habkost
319627006a qdev: Don't abort() in case globals can't be set
It would be much better if we didn't terminate QEMU inside
device_post_init(), but at least exiting cleanly is better than aborting
and dumping core.

Before this patch:

    $ qemu-system-x86_64 -global cpu.xxx=y
    qemu-system-x86_64: Property '.xxx' not found
    Aborted (core dumped)

After this patch:

    $ qemu-system-x86_64 -global cpu.xxx=y
    qemu-system-x86_64: Property '.xxx' not found

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2014-07-06 09:13:54 +03:00
Ming Lei
b7c9285b8d hw/virtio: enable common virtio feature for mmio device
Both 'indirect_desc' and 'event_idx' are bus independent features,
and they should be enabled for mmio devices too.

On arm64 quad core VM(qemu-kvm), the patch can increase block I/O
performance a lot with latest linux tree:
        - without the patch: 14K IOPS
        - with the patch: 34K IOPS

fio script:
        [global]
        direct=1
        bsrange=4k-4k
        timeout=10
        numjobs=4
        ioengine=libaio
        iodepth=64

        filename=/dev/vdc
        group_reporting=1

        [f1]
        rw=randread

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Igor Mammedov
22dc50d758 acpi: fix typo in memory hotplug MMIO region name
Reported-by: Sergey Fionov <fionov@gmail.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-06 09:13:54 +03:00
Le Tan
efc8188e93 pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()
In function do_pci_register_device() in file hw/pci/pci.c, move the assignment
of pci_dev->devfn to the position before the call to
pci_device_iommu_address_space(pci_dev) which will use the value of
pci_dev->devfn.

Fixes: 9eda7d373e
    pci: Introduce helper to retrieve a PCI device's DMA address space

Cc: qemu-stable@nongnu.org
Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Kirill Batuzov
812c1057f6 Handle G_IO_HUP in tcp_chr_read for tcp chardev
Since commit cdaa86a54b
("Add G_IO_HUP handler for socket chardev")
GLib limitation results in a bug on Windows host. Steps to reproduce:

Start qemu: qemu-system-i386 -qmp tcp:127.0.0.1:4444:server:nowait
Connect with telnet: telnet 127.0.0.1 4444
Try sending some data from telnet.
Expected result: answers from QEMU.
Observed result: no answers (actually tcp_chr_read is not called at all).

Due to GLib limitations it is not possible to create several watches on one
channel on Windows hosts. See bug #338943 in GNOME bugzilla for details:
https://bugzilla.gnome.org/show_bug.cgi?id=338943

This reimplements commit cdaa86a54b
("Add G_IO_HUP handler for socket chardev") using a single watch:

Handle G_IO_HUP in tcp_chr_read instead. It is already watched by a
corresponding watch.  Remove the second watch with its handler.

Cc: Antonios Motakis <a.motakis@virtualopensystems.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Nikita Belov <zodiac@ispras.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Ming Lei
85d1277e66 virtio: move common virtio properties to bus class device
The two common virtio features can be defined per bus, so move all
into bus class device to make code more clean.

As discussed with cornelia, s390-virtio-blk doesn't support
the two features at all, so keep s390-virtio as it.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> #for s390 ccw
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: rebase and resolve conflicts
2014-07-06 09:13:54 +03:00
Igor Mammedov
9b79a76cdb pc-dimm: error out if memory hotplug is not enabled
fixes QEMU abort in case it's started without memory
hotplug enabled.

as result of fix it will print following messages:
"
-device pc-dimm,id=d1,memdev=m1: memory hotplug is not enabled, enable it on startup
-device pc-dimm,id=d1,memdev=m1: Device 'pc-dimm' could not be initialized
"

Also fixup assert condition to detect hotplug address
space overflow.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by:  Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:54 +03:00
Hu Tao
0462faee67 numa: check for busy memory backend
Specifying the same memory backend twice leads to an assert:

./x86_64-softmmu/qemu-system-x86_64 -m 512M -enable-kvm -object
memory-backend-ram,size=256M,id=ram0 -numa node,nodeid=0,memdev=ram0
-numa node,nodeid=1,memdev=ram0
qemu-system-x86_64: /scm/qemu/memory.c:1506:
memory_region_add_subregion_common: Assertion `!subregion->container'
failed.
Aborted (core dumped)

Detect and exit with an error message instead.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-06 09:13:53 +03:00
Nikolay Nikolaev
e06cbc376e qtest: enable vhost-user-test
Use qtest-obj-y to get the right library order. CONFIG_POSIX ensures
mingw compilation won't break.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: whitespace tweak
2014-07-06 09:13:53 +03:00
James Hogan
0a2672b7ea mips/kvm: Init EBase to correct KSEG0
The EBase CP0 register is initialised to 0x80000000, however with KVM
the guest's KSEG0 is at 0x40000000. The incorrect value doesn't get
passed to KVM yet as KVM doesn't implement the EBase register, however
we should set it correctly now so as not to break migration/loadvm to a
future version of QEMU that does support EBase.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-05 11:53:07 +02:00
Eduardo Otubo
9d9de254c2 MAINTAINERS: seccomp: change email contact for Eduardo Otubo
Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-03 12:36:15 +01:00
Peter Maydell
92259b7f43 Update version for v2.1.0-rc0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 18:48:01 +01:00
Gonglei
015a33bd05 net: add mmsghdr struct check for L2TPV3
The mmsghdr struct is only introduced in Linux 2.6.32; add a
configure check for it and disable L2TPV3 on hosts which are
too old to provide it, rather than simply failing to compile.

Reported-by: chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1404219488-11196-1-git-send-email-arei.gonglei@huawei.com
[PMM: cleaned up commit message and corrected kernel version number]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 17:42:23 +01:00
Peter Maydell
596742db33 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140701-1' into staging
usb bugfixes.

# gpg: Signature made Tue 01 Jul 2014 14:51:19 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140701-1:
  ccid-card-emulated: use EventNotifier
  usb: initialize libusb_device to avoid crash
  usb: Fix usb-bt-dongle initialization.
  input: fix jumpy mouse cursor with USB mouse emulation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 16:16:19 +01:00
Peter Maydell
f9119a2572 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140701-1' into staging
vnc: two bugfixes (by Peter Lieven).

# gpg: Signature made Tue 01 Jul 2014 12:32:19 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140701-1:
  ui/vnc: fix potential memory corruption issues
  ui/vnc: limit client_cut_text msg payload size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 15:12:05 +01:00
Paolo Bonzini
c1129f6bff ccid-card-emulated: use EventNotifier
Shut up Coverity's complaint about unchecked fcntl return values,
and especially make the code simpler and more efficient.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 15:49:51 +02:00
Peter Maydell
1aa85f46b3 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Tracing pull request

# gpg: Signature made Tue 01 Jul 2014 09:56:27 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  trace: add qemu_system_powerdown_request and qemu_system_shutdown_request trace events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 14:21:50 +01:00
Peter Maydell
8593efa4fb Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Tue 01 Jul 2014 09:47:15 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (23 commits)
  block: add backing-file option to block-stream
  block: extend block-commit to accept a string for the backing file
  block: add helper function to determine if a BDS is in a chain
  block: add QAPI command to allow live backing file change
  qapi: Change back sector-count to sectors-count in quorum QAPI events.
  block/cow: Avoid use of uninitialized cow_bs in error path
  block: simplify bdrv_find_base() and bdrv_find_overlay()
  block: make 'top' argument to block-commit optional
  iotests: Add more tests to quick group
  iotests: Add qemu tests to quick group
  iotests: Simplify qemu-iotests-quick.sh
  qemu-img create: add 'nocow' option
  virtio-blk: remove need for explicit x-data-plane=on option
  qdev: drop iothread property type
  virtio-blk: replace x-iothread with iothread link property
  virtio-blk: move qdev properties into virtio-blk.c
  virtio: fix virtio-blk child refcount in transports
  virtio-blk: drop virtio_blk_set_conf()
  virtio-blk: use aliases instead of duplicate qdev properties
  qdev: add qdev_alias_all_properties()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 13:13:04 +01:00
Peter Lieven
bea60dd767 ui/vnc: fix potential memory corruption issues
this patch makes the VNC server work correctly if the
server surface and the guest surface have different sizes.

Basically the server surface is adjusted to not exceed VNC_MAX_WIDTH
x VNC_MAX_HEIGHT and additionally the width is rounded up to multiple of
VNC_DIRTY_PIXELS_PER_BIT.

If we have a resolution whose width is not dividable by VNC_DIRTY_PIXELS_PER_BIT
we now get a small black bar on the right of the screen.

If the surface is too big to fit the limits only the upper left area is shown.

On top of that this fixes 2 memory corruption issues:

The first was actually discovered during playing
around with a Windows 7 vServer. During resolution
change in Windows 7 it happens sometimes that Windows
changes to an intermediate resolution where
server_stride % cmp_bytes != 0 (in vnc_refresh_server_surface).
This happens only if width % VNC_DIRTY_PIXELS_PER_BIT != 0.

The second is a theoretical issue, but is maybe exploitable
by the guest. If for some reason the guest surface size is bigger
than VNC_MAX_WIDTH x VNC_MAX_HEIGHT we end up in severe corruption since
this limit is nowhere enforced.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:40 +02:00
Peter Lieven
f9a70e7939 ui/vnc: limit client_cut_text msg payload size
currently a malicious client could define a payload
size of 2^32 - 1 bytes and send up to that size of
data to the vnc server. The server would allocated
that amount of memory which could easily create an
out of memory condition.

This patch limits the payload size to 1MB max.

Please note that client_cut_text messages are currently
silently ignored.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:40 +02:00
Jincheng Miao
3ce2144538 usb: initialize libusb_device to avoid crash
If libusb_get_device_list() fails, the uninitialized local variable
libusb_device would be passed to libusb_free_device_list(), that
will cause a crash, like:
(gdb) bt
 #0  0x00007fbbb4bafc10 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #1  0x00007fbbb233e653 in libusb_unref_device (dev=0x6275682d627375)
     at core.c:902
 #2  0x00007fbbb233e739 in libusb_free_device_list (list=0x7fbbb6e8436e,
     unref_devices=<optimized out>) at core.c:653
 #3  0x00007fbbb6cd80a4 in usb_host_auto_check (unused=unused@entry=0x0)
     at hw/usb/host-libusb.c:1446
 #4  0x00007fbbb6cd8525 in usb_host_initfn (udev=0x7fbbbd3c5670)
     at hw/usb/host-libusb.c:912
 #5  0x00007fbbb6cc123b in usb_device_init (dev=0x7fbbbd3c5670)
     at hw/usb/bus.c:106
 ...

So initialize libusb_device at the begin time.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:37 +02:00
Hani Benhabiles
c340a284f3 usb: Fix usb-bt-dongle initialization.
Due to an incomplete initialization, adding a usb-bt-dongle device through HMP
or QMP will cause a segmentation fault.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:37 +02:00
Christian Burger
35e83d10f2 input: fix jumpy mouse cursor with USB mouse emulation
Guest mouse pointer was jumpy, when moving host mouse in the vertical direction (see bug #1327800).

Signed-off-by: Christian Burger <christian@krikkel.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-07-01 13:26:37 +02:00
Peter Maydell
c26f3a0a6d Merge remote-tracking branch 'remotes/bonzini/memory' into staging
* remotes/bonzini/memory:
  qdev: correctly send DEVICE_DELETED for recursively-deleted devices
  memory: do not give a name to the internal exec.c regions
  memory: MemoryRegion: Add size property
  memory: MemoryRegion: Add may-overlap and priority props
  memory: MemoryRegion: Add container and addr props
  memory: MemoryRegion: replace owner field with QOM parent
  memory: MemoryRegion: QOMify
  memory: MemoryRegion: use /machine as default owner
  libqtest: escape strings in QMP commands, fix leak
  qom: object: Ignore refs/unrefs of NULL
  qom: object: remove parent pointer when unparenting
  mc146818rtc: add "rtc-time" link to "/machine/rtc"
  qom: allow creating an alias of a child<> property
  qom: add a generic mechanism to resolve paths
  qom: add object_property_add_alias()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 11:55:49 +01:00
Peter Maydell
b3959efdbb Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-2.1' into staging
QOM and device refactorings

* QOM unparenting cleanup
* IRQ conversion to QOM

# gpg: Signature made Tue 01 Jul 2014 04:03:23 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-devices-for-2.1:
  irq: Slim conversion of qemu_irq to QOM
  irq: Allocate IRQs individually
  hw: Fix qemu_allocate_irqs() leaks
  sdhci: Fix misuse of qemu_free_irqs()
  qom: Remove parent pointer when unparenting

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 11:00:53 +01:00
Peter Maydell
d94a658712 Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  configure: Fix -lm test, so that tools can be compiled on hosts that require -lm
  virtio-scsi: scsi events must be converted to target endianness
  virtio-scsi: virtio_scsi_push_event() lacks VirtIOSCSIReq parsing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-01 10:28:52 +01:00
Yang Zhiyong
bc78cff975 trace: add qemu_system_powerdown_request and qemu_system_shutdown_request trace events
We have the experience that the guest doesn't stop successfully
though it was instructed to shut down.

The root cause may be not in QEMU mostly.  However, QEMU is often
suspected at the beginning just because the issue occurred in
virtualization environment.

Therefore, we need to affirm that QEMU received the shutdown
request and raised ACPI irq from "virsh shutdown" command,
virt-manger or stopping QEMU process to the VM .
So that we can affirm the problems was belonged to the Guset OS
rather than the QEMU itself.

When we stop guests by "virsh shutdown" command or virt-manger,
or stopping QEMU process, qemu_system_powerdown_request() or
qemu_system_shutdown_request() is called. Then the below functions
in main_loop_should_exit() of Vl.c are called roughly in the
following order.

	if (qemu_powerdown_requested())
		qemu_system_powerdown()
			monitor_protocol_event(QEVENT_POWERDOWN, NULL)

	OR

	if(qemu_shutdown_requested()}
		monitor_protocol_event(QEVENT_SHUTDOWN, NULL);

The tracepoint of monitor_protocol_event() already exists, but no
tracepoints are defined for qemu_system_powerdown_request() and
qemu_system_shutdown_request(). So this patch adds two tracepoints for
the two functions. We believe that it will become much easier to
isolate the problem mentioned above by these tracepoints.

Signed-off-by: Yang Zhiyong <yangzy.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:56:13 +02:00
Jeff Cody
13d8cc515d block: add backing-file option to block-stream
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block job.

For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.

In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.

With this extension to the block-stream api, the user is able to change
the backing file of the active layer as part of the block-stream
operation.

This allows the change to be 'safe', in the sense that if the attempt
to write the active image metadata fails, then the block-stream
operation returns failure, without disrupting the guest.

If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody
54e2690090 block: extend block-commit to accept a string for the backing file
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block commit.

For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.

In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.

With this extension to the block-commit api, the user is able to change
the backing file of the overlay image as part of the block-commit
operation.

This allows the change to be 'safe', in the sense that if the attempt
to write the overlay image metadata fails, then the block-commit
operation returns failure, without disrupting the guest.

If the commit top is the active layer, then specifying the backing
file string will be treated as an error (there is no overlay image
to modify in that case).

If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody
5a6684d2b9 block: add helper function to determine if a BDS is in a chain
This is a small helper function, to determine if 'base' is in the
chain of BlockDriverState 'top'.  It returns true if it is in the chain,
and false otherwise.

If either argument is NULL, it will also return false.

Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody
fa40e65622 block: add QAPI command to allow live backing file change
This allows a user to make a live change to the backing file recorded in
an open image.

The image file to modify can be specified 2 ways:

1) image filename
2) image node-name

Note: this does not cause the backing file itself to be reopened; it
merely changes the backing filename in the image file structure, and
in internal BDS structures.

It is the responsibility of the user to pass a filename string that
can be resolved when the image chain is reopened, and the filename
string is not validated.

A good analogy for this command is that it is a live version of
'qemu-img rebase -u', with respect to changing the backing file string.

[Jeff is offline so I respun this patch in his absence.  Dropped image
filename since using node-name is preferred and this is a new command.
No need to introduce the limitations of finding images by filename.
--Stefan]

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:46:38 +02:00
Alexey Kardashevskiy
f80ea9862f configure: Fix -lm test, so that tools can be compiled on hosts that require -lm
The existing test whether "-lm" needs to be included or not is
insufficient as it reports false negative on Fedora20/ppc64.
This happens because sin(0.0) is a constant value which compiler
can safely throw away and therefore there is no need to add "-lm".
As the result, qemu-nbd/qemu-io/qemu-img tools cannot compile.

This adds a global variable and uses it in the test to prevent
from optimization.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[Use Peter's improvement on the test to fool LTO, and remove the
 now useless -lm addition in Makefile.target. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:36:28 +02:00
Paolo Bonzini
352e8da743 qdev: correctly send DEVICE_DELETED for recursively-deleted devices
When a device is unparented (i.e. made completely hidden from management)
we want to send a DEVICE_DELETED event only if the device actually was
realized.  This avoids raising DEVICE_DELETED events when device_add
fails.

However, this does not work right for recursively-deleted
devices: the whole tree is _first_ unrealized, _then_ unparented.
Then device_unparent sees realized==false and fails to trigger
the event.  The solution is simply to move have_realized into
the DeviceState struct.  If device_add fails, we never set the
new field to true and DEVICE_DELETED is not sent.

Fixes qemu-iotests testcase 067 (broken by commit 5942a19, though that
commit in turn fixed a possible segfault in the same test).

Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:42 +02:00
Paolo Bonzini
1f6245e5ab memory: do not give a name to the internal exec.c regions
There is no need to have them visible under /machine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
52aef7bba7 memory: MemoryRegion: Add size property
To allow devices to dynamically resize the device. The motivation is
to allow devices with variable size to init their memory_region
without size early and then correctly populate size at realize() time.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
d33382da9a memory: MemoryRegion: Add may-overlap and priority props
QOM propertyify the .may-overlap and .priority fields. The setters
will re-add the memory as a subregion if needed (i.e. the values change
when the memory region is already contained).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
409ddd0139 memory: MemoryRegion: Add container and addr props
Expose the already existing .parent and .addr fields as QOM properties.
.parent (i.e. the field describing the memory region that contains this
one in Memory hierachy) is renamed "container". This is to avoid
confusion with the QOM parent.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters.  Do not unref parent on releasing the property. Clean
 up error propagation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
22a893e4f5 memory: MemoryRegion: replace owner field with QOM parent
The two are now the same.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
b4fefef9d5 memory: MemoryRegion: QOMify
QOMify memory regions as an Object. The former init() and destroy()
routines become instance_init() and instance_finalize() resp.

memory_region_init() is re-implemented to be:
object_initialize() + set fields

memory_region_destroy() is re-implemented to call unparent().

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Add newly-created MR as child, unparent on destruction. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
b5c2c3d0c8 memory: MemoryRegion: use /machine as default owner
This will be added (after QOMification) as the QOM parent.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
563890c7c7 libqtest: escape strings in QMP commands, fix leak
libqtest is using g_strdup_printf to format QMP commands, but
this does not work if the argument strings need to be escaped.
Instead, use the fancy %-formatting functionality of QObject.
The only change required in tests is that strings have to be
formatted as %s, not '%s' or \"%s\".  Luckily this usage of
parameterized QMP commands is not that frequent.

The leak is in socket_sendf.  Since we are extracting the send
loop to a new function, fix it now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
8ffad850ef qom: object: Ignore refs/unrefs of NULL
Just do nothing if passed NULL for a ref or unref. This avoids
call sites that manage a combination of NULL or non-NULL pointers
having to add iffery around every ref and unref.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
c28322d10c qom: object: remove parent pointer when unparenting
Certain parts of the QOM framework test this pointer to determine if
an object is parented. Nuke it when the object is unparented to allow
for reuse of an object after unparenting.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Marcelo Tosatti
654a36d857 mc146818rtc: add "rtc-time" link to "/machine/rtc"
Add a link to rtc under /machine providing a stable
location for management apps to query the value of the
time.  The link should be added by any object that sends
RTC_TIME_CHANGE events.

{"execute":"qom-get","arguments":{"path":"/machine","property":"rtc-time"} }

Suggested by Paolo Bonzini and Andreas Faerber.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:30 +02:00
Paolo Bonzini
d190698e6f qom: allow creating an alias of a child<> property
Child properties must be unique.  Fix this problem by
turning their aliases into links.

The resolve function that forwards to the target property
does not have any knowledge of the target property's type,
so it works fine.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:17:48 +02:00
Paolo Bonzini
64607d0881 qom: add a generic mechanism to resolve paths
It may be desirable to have custom link<> properties that do more
than just store an object.  Even the addition of a "check"
function is not enough if setting the link has side effects
or if a non-standard reference counting is preferrable.

Avoid the assumption that the opaque field of a link<> is a
LinkProperty struct, by adding a generic "resolve" callback
to ObjectProperty.  This fixes aliases of link properties.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:17:48 +02:00
Benoît Canet
4e855baabf qapi: Change back sector-count to sectors-count in quorum QAPI events.
fe069d9d had aligned code and documentation while dropping the s from the
actual JSON output. Fix that.

This also fix test/qemu-iotest/081 since the missing s was causing a permutation.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Peter Maydell
6764579f89 block/cow: Avoid use of uninitialized cow_bs in error path
Commit 25814e8987 introduced an error-exit code path which does
a "goto exit" before the cow_bs variable is initialized, meaning
we would call bdrv_unref() on an uninitialized variable and
likely segfault. Fix this by moving the NULL-initialization
to the top of the function and making the exit code path handle
the case where it is NULL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Jeff Cody
4caf0fcd45 block: simplify bdrv_find_base() and bdrv_find_overlay()
This simplifies the function bdrv_find_overlay().  With this change,
bdrv_find_base() is just a subset of usage of bdrv_find_overlay(),
so this also takes advantage of that.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Jeff Cody
7676e2c597 block: make 'top' argument to block-commit optional
Now that active layer block-commit is supported, the 'top' argument
no longer needs to be mandatory.

Change it to optional, with the default being the active layer in the
device chain.

[kwolf: Rebased and resolved conflict in tests/qemu-iotests/040]

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Max Reitz
c891e3bbc5 iotests: Add more tests to quick group
While at it, add some more tests to the quick group (those that run with
-nocache in under three seconds on my HDD).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Max Reitz
ee9dd1fc90 iotests: Add qemu tests to quick group
Now that qemu-iotests-quick.sh supports tests using the qemu binary, we
are free to add such tests to the quick group.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Max Reitz
214a081a0d iotests: Simplify qemu-iotests-quick.sh
As of the "iotests: Allow out-of-tree run" series, the qemu-iotests may
(and should) be run directly in the build tree and will then guess the
binary paths themselves. Therefore, qemu-iotests-quick.sh does not need
to (and should not) enter the source path anymore; also, it does not
need to specify the binaries because "check" will guess them
automatically.

As a side-effect, tests using qemu may now be added to the quick group.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:33 +02:00
Chunyan Liu
4ab1559085 qemu-img create: add 'nocow' option
Add 'nocow' option so that users could have a chance to set NOCOW flag to
newly created files. It's useful on btrfs file system to enhance performance.

Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this bad
performance is to turn off COW attributes on VM files. Generally, there are
two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then
all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.

This patch tries the second way, according to the option, it could add NOCOW
per file.

For most block drivers, since the create file step is in raw-posix.c, so we
can do setting NOCOW flag ioctl in raw-posix.c only.

But there are some exceptions, like block/vpc.c and block/vdi.c, they are
creating file by calling qemu_open directly. For them, do the same setting
NOCOW flag ioctl work in them separately.

[Fixed up 082.out due to the new 'nocow' creation option
--Stefan]

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:12 +02:00
Cédric Le Goater
424baff549 virtio-scsi: scsi events must be converted to target endianness
Virtio SCSI Events need to be byteswapped before being pushed
when host and guest have a different endianness. Not doing so
breaks hotplug of virtio scsi disks, with the following error
message being printed in the guest console:

virtio_scsi: Unsupport virtio scsi event 1000000

This issue got uncovered while testing disk hotplug with a PowerKVM
ppc64le guest. I have checked that this issue also affects a x86_64
guest run on a ppc64 host.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 09:40:38 +02:00
Greg Kurz
dfecbb95e3 virtio-scsi: virtio_scsi_push_event() lacks VirtIOSCSIReq parsing
Hotplug of a virtio scsi disk is currently broken: no disk appears in the
guest (verified with a fedora 20 host running a fedora 20 guest with KVM).
Bisect leeds to Paolo's patches to support any_layout, especially this
commit:

commit 36b15c79aa
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Tue Jun 10 16:21:18 2014 +0200

    virtio-scsi: start preparing for any_layout

It modifies virtio_scsi_pop_req() so that it is up to the callers to parse
the virtio scsi request. It seems that virtio_scsi_push_event() was not
modified accordingly...

This patch adds a call to virtio_scsi_parse_req(). It also drops some
sanity checks that are already performed by virtio_scsi_parse_req().

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 09:40:38 +02:00
Stefan Hajnoczi
ef7c7ff6d4 qom: add object_property_add_alias()
Sometimes an object needs to present a property which is actually on
another object, or it needs to provide an alias name for an existing
property.

Examples:
  a.foo -> b.foo
  a.old_name -> a.new_name

The new object_property_add_alias() API allows objects to alias a
property on the same object or another object.  The source and target
names can be different.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
8698e110f8 virtio-blk: remove need for explicit x-data-plane=on option
The x-data-plane=on|off option is no longer useful because the
iothread=<iothread> option conveys the same information plus which
IOThread to use.

Do not delete x-data-plane=on|off yet as a convenience to people using
this legacy experimental option.  We will drop it in QEMU 2.2.

Instead, turn on data-plane when either x-data-plane=on or
iothread=<iothread> are used.  The following command-line uses
data-plane:

  qemu -device virtio-blk-pci,iothread=foo,drive=drive0

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
1351d1ec89 qdev: drop iothread property type
The iothread property type is no longer used and can be removed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
467b3f33e9 virtio-blk: replace x-iothread with iothread link property
Up until now -device virtio-blk-pci,x-iothread=<id> was used to assign
an IOThread.  This was a temporary solution while we cleaned up QOM link
properties.

This patch switches over to a QOM link property since it is now possible
to restrict the setter to unrealized instances and automatically unref
the IOThread when the virtio-blk-pci device is freed.

Since the "iothread" property is a QOM property and not a qdev property,
we must alias it explicitly for virtio-blk-pci, as well as CCW and
s390-virtio.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
32a877e405 virtio-blk: move qdev properties into virtio-blk.c
There is no need to make DEFINE_VIRTIO_BLK_PROPERTIES() public.  Inline
it into virtio-blk.c so it cannot be used by mistake from other source
files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
c5d49db446 virtio: fix virtio-blk child refcount in transports
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.

The upshot of this is that we always have a refcount >= 1.  Upon hot
unplug the virtio-blk child is not finalized!

Drop our reference after the child property has been added to the
parent.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
f7fedda84a virtio-blk: drop virtio_blk_set_conf()
This function is no longer used since parent objects now use child
aliases to set the VirtIOBlkConf directly.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
caffdac363 virtio-blk: use aliases instead of duplicate qdev properties
virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the
qdev properties of their VirtIOBlock child.  This approach does not work
well with string or pointer properties since we must be careful about
leaking or double-freeing them.

Use the QOM alias property to forward property accesses to the
VirtIOBlock child.  This way no duplication is necessary.

Remember to stop calling virtio_blk_set_conf() so that we don't clobber
the values already set on the VirtIOBlock instance.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
67cc7e0aac qdev: add qdev_alias_all_properties()
The qdev_alias_all_properties() function creates QOM alias properties
for each qdev property on a DeviceState.  This is useful for parent
objects that wish to forward property accesses to their children.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
ee512c6f21 virtio-blk: move x-data-plane qdev property to virtio-blk.h
Move the x-data-plane property.  Originally it was outside since not
every transport may wish to support dataplane.  But that makes little
sense when we have a dedicated CONFIG_VIRTIO_BLK_DATA_PLANE ifdef
already.

This move makes it easier to switch to property aliases in the next
patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Cornelia Huck
a9968c77d5 dataplane: bail out on unsupported transport
If the virtio transport does not support notifiers (like s390-virtio),
we can't use dataplane. Bail out early and let the user know what is
wrong.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 09:15:02 +02:00
Stefan Hajnoczi
dc80ca6cd6 virtio-blk: avoid qdev property definition duplication
It becomes unwiedly to duplicate all virtio-blk qdev property
definitions due to an #ifdef.  The C preprocessor syntax makes it a
little hard to resolve this cleanly but we can extract the #ifdef and
call a macro it defines later.

Avoiding duplication is important since it will only get worse when we
move the x-data-plane qdev property here too.  We'd have a combinatorial
explosion since x-data-plane has its own #ifdef.

Suggested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-07-01 09:15:02 +02:00
Andreas Färber
615c489570 irq: Slim conversion of qemu_irq to QOM
As a prequel to any big Pin refactoring plans, do an in-place conversion
of qemu_irq to an Object, so that we can reference it in link<> properties.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[ PC Changes:
 * Removed array-alloctor ref counting logic (limit changes just to
 * single IRQ allocator)
 * Removed WIP marking from subject line
]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-07-01 04:12:48 +02:00
Peter Crosthwaite
f173d57a4c irq: Allocate IRQs individually
Allocate each IRQ individually on array allocations. This prepares for
QOMification of IRQs, where pointers to individual IRQs may be taken
and handed around for usage as QOM Links. The g_renew() scheme used here
is too fragile and would break all existing links should an IRQ list
be extended.

We now have to pass the IRQ count to qemu_free_irqs(). We have so few
call sites however, so this change is reasonably trivial.

Cc: agarcia@igalia.com
Cc: mst@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-07-01 04:02:53 +02:00
Andreas Färber
f3c7d0389f hw: Fix qemu_allocate_irqs() leaks
Replace qemu_allocate_irqs(foo, bar, 1)[0]
with qemu_allocate_irq(foo, bar, 0).

This avoids leaking the dereferenced qemu_irq *.

Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PC Changes:
 * Applied change to instance in sh4/sh7750.c
]
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Kirill Batuzov <batuzovk@ispras.ru>
[AF: Fix IRQ index in sh4/sh7750.c]
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-30 21:13:30 +02:00
Andreas Färber
127a4e1a51 sdhci: Fix misuse of qemu_free_irqs()
It does a g_free() on the pointer, so don't pass a local &foo reference.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-30 21:13:30 +02:00
Peter Crosthwaite
d15ae221ea qom: Remove parent pointer when unparenting
Certain parts of the QOM framework test this pointer to determine if
an object is parented. Nuke it when the object is unparented to allow
for reuse of an object after unparenting.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-30 21:13:30 +02:00
Peter Maydell
53a259da56 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140630.0' into staging
VFIO patches: MSI-X masking performance fix, Endian fixes, fix runstate on device error

# gpg: Signature made Mon 30 Jun 2014 18:13:40 BST using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140630.0:
  vfio: use correct runstate
  vfio: Make BARs native endian
  vfio-pci: Fix MSI-X masking performance
  vfio-pci: Fix MSI/X debug code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 18:31:07 +01:00
Paolo Bonzini
ba29776fd8 vfio: use correct runstate
io-error is for block device errors; it should always be preceded
by a BLOCK_IO_ERROR event.  I think vfio wants to use
RUN_STATE_INTERNAL_ERROR instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:56:08 -06:00
Alexey Kardashevskiy
c40708176a vfio: Make BARs native endian
Slow BAR access path is used when VFIO fails to mmap() BAR.
Since this is just a transport between the guest and a device, there is
no need to do endianness swapping.

This changes BARs to use native endianness. Since non-ROM BARs were
doing byte swapping, we need to remove it so does the patch.
As the result, this eliminates cancelling byte swaps and there is
no change in behavior for non-ROM BARs.

ROM BARs were declared little endian too but byte swapping was not
implemented for them so they never actually worked on big endian systems
as there was no cancelling byte swap. This fixes endiannes for ROM BARs
by declaring them native endian and only fixing access sizes as it is
done for non-ROM BARs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:52:58 -06:00
Alex Williamson
f4d45d4782 vfio-pci: Fix MSI-X masking performance
There are still old guests out there that over-exercise MSI-X masking.
The current code completely sets-up and tears-down an MSI-X vector on
the "use" and "release" callbacks.  While this is functional, it can
slow an old guest to a crawl.  We can easily skip the KVM parts of
this so that we keep the MSI route and irqfd setup.  We do however
need to switch VFIO to trigger a different eventfd while masked.
Actually, we have the option of continuing to use -1 to disable the
trigger, but by using another EventNotifier we can allow the MSI-X
core to emulate pending bits and re-fire the vector once unmasked.
MSI code gets updated as well to use the same setup and teardown
structures and functions.

Prior to this change, an igbvf assigned to a RHEL5 guest gets about
20Mbps and 50 transactions/s with netperf (remote or VF->PF).  With
this change, we get line rate and 3k transactions/s remote or 2Gbps
and 6k+ transactions/s to the PF.  No significant change is expected
for newer guests with more well behaved MSI-X support.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:50:33 -06:00
Alex Williamson
9035f8c09b vfio-pci: Fix MSI/X debug code
Use the correct MSI message function for debug info.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-06-30 09:50:33 -06:00
Peter Maydell
8954000b9e Merge remote-tracking branch 'remotes/bonzini/nbd-next' into staging
* remotes/bonzini/nbd-next:
  nbd: Handle NBD_OPT_LIST option.
  nbd: Handle fixed new-style clients.
  nbd: Shutdown socket before closing.
  nbd: Don't validate from and len in NBD_CMD_DISC.
  nbd: Don't export a block device with no medium.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 16:13:32 +01:00
Peter Maydell
ec9fe956d5 Merge remote-tracking branch 'remotes/bonzini/small-fixes' into staging
* remotes/bonzini/small-fixes:
  tests/test-qmp-event: fix for GLib < 2.31
  serial: poll the serial console with G_IO_HUP

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 15:56:00 +01:00
Peter Maydell
a4b31047c8 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-cocoa-20140630' into staging
cocoa.next:
 * Honour -show-cursor option
 * Fix handling of absolute positioning devices
 * Cope with first surface being same as initial window size

# gpg: Signature made Mon 30 Jun 2014 13:48:46 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-cocoa-20140630:
  ui/cocoa: Honour -show-cursor command line option
  ui/cocoa: Fix handling of absolute positioning devices
  ui/cocoa: Add utility method to check if point is within window
  ui/cocoa: Cope with first surface being same as initial window size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 15:42:35 +01:00
Peter Maydell
a156dd9a22 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140630' into staging
target-arm:
 * provide PL031 RTC in virt board
 * fix missing pxa2xx and strongarm vmstate
 * convert cadence_ttc to instance_init
 * fix libvixl format strings and README

# gpg: Signature made Mon 30 Jun 2014 13:44:33 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140630:
  disas/libvixl: Fix wrong format strings
  disas/libvixl: Update README for version base
  timer: cadence_ttc: Convert to instance_init
  hw/arm/pxa2xx_gpio: Correct and register vmstate
  hw/arm/pxa2xx_gpio: Fix handling of GPSR/GPCR reads
  hw/arm/strongarm: Wire up missing GPIO and PPC vmstate
  hw/arm/strongarm: Fix handling of GPSR/GPCR reads
  hw/arm/virt: Provide PL031 RTC

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-30 15:16:26 +01:00
Paolo Bonzini
af35e5e1fb tests/test-qmp-event: fix for GLib < 2.31
On old GLib, the test needs a g_thread_init call.

Reported-by: Wenchao Xia <wenchaoqemu@gmail.com>
Tested-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 15:06:11 +02:00
Roger Pau Monne
e02bc6de30 serial: poll the serial console with G_IO_HUP
On FreeBSD polling a master pty while the other end is not connected
with G_IO_OUT only results in an endless wait. This is different from
the Linux behaviour, that returns immediately. In order to demonstrate
this, I have the following example code:

http://xenbits.xen.org/people/royger/test_poll.c

When executed on Linux:

$ ./test_poll
In callback

On FreeBSD instead, the callback never gets called:

$ ./test_poll

So, in order to workaround this, poll the source with G_IO_HUP (which
makes the code behave the same way on both Linux and FreeBSD).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: xen-devel@lists.xenproject.org
[Add hw/char/cadence_uart.c too. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 15:04:34 +02:00
Hani Benhabiles
32d7d2e068 nbd: Handle NBD_OPT_LIST option.
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 12:50:17 +02:00
Hani Benhabiles
f5076b5a75 nbd: Handle fixed new-style clients.
When this flag is set, the server tells the client that it can send another
option if the server received a request with an option that it doesn't
understand instead of directly closing the connection.

Also add link to the most up-to-date documentation.

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 12:50:17 +02:00
Hani Benhabiles
27e5eae457 nbd: Shutdown socket before closing.
This forces finishing data sending to client before closing the socket like in
exports listing or replying with NBD_REP_ERR_UNSUP cases.

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 12:50:12 +02:00
Stefan Weil
ffebe89975 disas/libvixl: Fix wrong format strings
When the compiler is told to check the arguments of AppendToOutput,
it reports several errors of this kind:

error: format ‘%d’ expects argument of type ‘int’,
 but argument 3 has type ‘int64_t {aka long int}’ [-Werror=format]

Fix those bugs by using the correct format strings with PRId64, PRIx64.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1403113751-19799-1-git-send-email-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 22:04:28 +01:00
Richard Henderson
1ce8be7e0d disas/libvixl: Update README for version base
Signed-off-by: Richard Henderson <rth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 22:02:52 +01:00
Peter Maydell
13aefd303c ui/cocoa: Honour -show-cursor command line option
Honour the -show-cursor command line option (which forces the mouse pointer
to always be displayed even when input is grabbed) in the Cocoa UI backend.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-5-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Peter Maydell
f61c387ea6 ui/cocoa: Fix handling of absolute positioning devices
Fix handling of absolute positioning devices, which were basically
unusable for two separate reasons:
 (1) as soon as you pressed the left mouse button we would call
     CGAssociateMouseAndMouseCursorPosition(FALSE), which means that
     the absolute coordinates of the mouse events are never updated
 (2) we didn't account for MacOSX coordinate origin being bottom left
     rather than top right, and so all the Y values sent to the guest
     were inverted

We fix (1) by aligning our behaviour with the SDL UI backend for
absolute devices:
 * when the mouse moves into the window we do a grab (which means
   hiding the host cursor and sending special keys to the guest)
 * when the mouse moves out of the window we un-grab
and fix (2) by doing the correct transformation in the call to
qemu_input_queue_abs().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-4-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Peter Maydell
5dd45bee58 ui/cocoa: Add utility method to check if point is within window
Add a utility method to check whether a point is within the current window
bounds, and use it in the various places in the mouse handling code that
were opencoding the check.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-3-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Peter Maydell
381600dad9 ui/cocoa: Cope with first surface being same as initial window size
Do the recalculation of the content dimensions in switchSurface if the
current cdx is zero as well as if the new surface is a different size to
the current window. This catches the case where the first surface registered
happens to be 640x480 (our current window size), and fixes a bug where we
would always display a black screen until the first surface of a different
size was registered.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403516125-14568-2-git-send-email-peter.maydell@linaro.org
2014-06-29 22:00:33 +01:00
Alistair Francis
b841642daa timer: cadence_ttc: Convert to instance_init
SysBusDevice::init is deprecated. Convert to instance_init
as prescribed by QOM conventions.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1223f14833159b9ea5c57734dd2ffa88d4b15a83.1403583596.git.alistair.francis@xilinx.com
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 18:38:40 +01:00
Peter Maydell
166fa99996 hw/arm/pxa2xx_gpio: Correct and register vmstate
The pxa2xx-gpio device has a VMStateDescription, but it was accidentally
never actually registered, and it wasn't quite correct. Remove the
'lines' field (this is a device property, not mutable state), add the
missing 'prev_level' field, and set dc->vmsd so it actually gets used.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:40 +01:00
Peter Maydell
ab7a0f0b6d hw/arm/pxa2xx_gpio: Fix handling of GPSR/GPCR reads
The PXA2xx GPIO GPSR and GPCR registers are write-only, with reads being
undefined behaviour. Instead of having GPCR return 31337 and GPSR return
the value last written, make both log the guest error and return 0.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:40 +01:00
Peter Maydell
ed657d7117 hw/arm/strongarm: Wire up missing GPIO and PPC vmstate
The VMStateDescription structs for the GPIO and PPC devices were
accidentally never wired up. Add missing state fields and register
them via dc->vmsd.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:39 +01:00
Peter Maydell
92335a0d40 hw/arm/strongarm: Fix handling of GPSR/GPCR reads
The StrongARM GPIO GPSR and GPCR registers are write-only, with reads being
undefined behaviour. Instead of having GPCR return 31337 and GPSR return
the value last written, make both log the guest error and return 0.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2014-06-29 18:38:39 +01:00
Peter Maydell
6e411af935 hw/arm/virt: Provide PL031 RTC
UEFI mandates that the platform must include an RTC, so provide
one in 'virt', using the PL031. This is also useful for directly
booting Linux kernels which would otherwise have to run ntpdate.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2014-06-29 18:38:39 +01:00
Peter Maydell
9328cfd2fe Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,vhost,virtio fixes, enhancements

virtio bi-endian support
new command to resync RTC
misc bugfixes and cleanups

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 29 Jun 2014 17:41:13 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (37 commits)
  tests: add human format test for string output visitor
  vhost-net: disable when cross-endian
  target-ppc: enable virtio endian ambivalent support
  virtio-9p: use virtio wrappers to access headers
  virtio-serial-bus: use virtio wrappers to access headers
  virtio-scsi: use virtio wrappers to access headers
  virtio-blk: use virtio wrappers to access headers
  virtio-balloon: use virtio wrappers to access page frame numbers
  virtio-net: use virtio wrappers to access headers
  virtio: allow byte swapping for vring
  virtio: memory accessors for endian-ambivalent targets
  virtio: add endian-ambivalent support to VirtIODevice
  cpu: introduce CPUClass::virtio_is_big_endian()
  exec: introduce target_words_bigendian() helper
  virtio: add subsections to the migration stream
  virtio-rng: implement per-device migration calls
  virtio-balloon: implement per-device migration calls
  virtio-serial: implement per-device migration calls
  virtio-blk: implement per-device migration calls
  virtio-net: implement per-device migration calls
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 18:09:51 +01:00
Hu Tao
b4900c0e8a tests: add human format test for string output visitor
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Greg Kurz
371df9f5e0 vhost-net: disable when cross-endian
As of today, vhost assumes guest and host have the same endianness.
This is definitely not compatible with modern PPC64 and ARM that
can change endianness at runtime. Let's disable vhost-net and print
an error message when we detect such a case:

qemu-system-ppc64: vhost-net does not support cross-endian
qemu-system-ppc64: unable to start vhost net: 38: falling back on userspace virtio

This way users can continue to run VMs without changing their setup and
have a chance to know that performance will be impacted.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Greg Kurz
7826c2b2a4 target-ppc: enable virtio endian ambivalent support
The device endianness is the cpu endianness at device reset time.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Greg Kurz
d64ccb91ad virtio-9p: use virtio wrappers to access headers
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Rusty Russell
e0ab7fac65 virtio-serial-bus: use virtio wrappers to access headers
We also fix max_nr_ports at reset time as the device endianness may have
changed.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  fix max_nr_ports at reset time,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:43 +03:00
Rusty Russell
8c085dbe6d virtio-scsi: use virtio wrappers to access headers
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  converted new tswap locations to virtio_tswap,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
783d189725 virtio-blk: use virtio wrappers to access headers
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
8609d2a87a virtio-balloon: use virtio wrappers to access page frame numbers
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
1399c60d70 virtio-net: use virtio wrappers to access headers
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
  converted new tswap locations to virtio_tswap,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Rusty Russell
cee3ca0028 virtio: allow byte swapping for vring
Quoting original text from Rusty: "This is based on a simpler patch by Anthony
Liguouri".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ add VirtIODevice * argument to most helpers,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
0f5d1d2a49 virtio: memory accessors for endian-ambivalent targets
This is the virtio-access.h header file taken from Rusty's "endian-ambivalent
targets using legacy virtio" patch. It introduces helpers that should be used
when accessing vring data or by drivers for data that contains headers.
The virtio config space is also target endian, but the current code already
handles that with the virtio_is_big_endian() helper. There is no obvious
benefit at using the virtio accessors in this case.

Now we have two distinct paths: a fast inline one for fixed endian targets,
and a slow out-of-line one for targets that define the new TARGET_IS_BIENDIAN
macro.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ relicensed virtio-access.h to GPLv2+ on Rusty's request,
  pass &address_space_memory to physical memory accessors,
  per-device endianness,
  virtio tswap16 and tswap64 helpers,
  faspath for fixed endian targets,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Cc: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
616a655219 virtio: add endian-ambivalent support to VirtIODevice
Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.

We migrate this property in a subsection, after the device descriptor. This
means the load code must not rely on it until it is restored. As a consequence,
the vring sanity checks had to be moved after the call to vmstate_load_state().
We enforce paranoia by poisoning the property at the begining of virtio_load().

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
bf7663c4bd cpu: introduce CPUClass::virtio_is_big_endian()
If we want to support targets that can change endianness (modern PPC and
ARM for the moment), we need to add a per-CPU class method to be called
from the virtio code. The virtio_ prefix in the name is a hint for people
to avoid misusage (aka. anywhere but from the virtio code).

The default behaviour is to return the compile-time default target
endianness.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
98ed8ecfc9 exec: introduce target_words_bigendian() helper
We currently have a virtio_is_big_endian() helper that provides the target
endianness to the virtio code. As of today, the helper returns a fixed
compile-time value. Of course, this will have to change if we want to
support target endianness changes at run-time.

Let's move the TARGET_WORDS_BIGENDIAN bits out to a new helper and have
virtio_is_big_endian() implemented on top of it.

This patch doesn't change any functionality.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
6b321a3df5 virtio: add subsections to the migration stream
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio, using the VMState code.

Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:

Unknown savevm section type 5
load of migration failed

Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:42 +03:00
Greg Kurz
3902d49e13 virtio-rng: implement per-device migration calls
While we are here, we also check virtio_load() return value.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
9ea2511c85 virtio-balloon: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
13c6855ab0 virtio-serial: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
b2b295a74a virtio-blk: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
037dab2fe8 virtio-net: implement per-device migration calls
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Greg Kurz
1b5fc0dea4 virtio: introduce device specific migration calls
In order to migrate virtio subsections, they should be streamed after
the device itself. We need the device specific code to be called from
the common migration code to achieve this. This patch introduces load
and save methods for this purpose.

Suggested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Alexander Graf
e38e943a1f virtio-serial: don't migrate the config space
The device configuration is set at realize time and never changes. It
should not be migrated as it is done today. For the sake of compatibility,
let's just skip them at load time.

Signed-off-by: Alexander Graf <agraf@suse.de>
[ added missing casts to uint16_t *,
  added From, SoB and commit message,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Cédric Le Goater
032a74a1c0 virtio-net: byteswap virtio-net header
TCP connectivity fails when the guest has a different endianness.
The packets are silently dropped on the host by the tap backend
when they are read from user space because the endianness of the
virtio-net header is in the wrong order. These lines may appear
in the guest console:

[  454.709327] skbuff: bad partial csum: csum=8704/4096 len=74
[  455.702554] skbuff: bad partial csum: csum=8704/4096 len=74

The issue that got first spotted with a ppc64le PowerKVM guest,
but it also exists for the less common case of a x86_64 guest run
by a big-endian ppc64 TCG hypervisor.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Michael S. Tsirkin
a628fc8dae vhost-user: typo fixups
Fix typo in field name.
Strip two consequitive empty lines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 19:39:41 +03:00
Damjan Marion
3fd74b8407 vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message
Old code was affected by memory gaps which resulted in buffer pointers
pointing to address outside of the mapped regions.

Here we are introducing following changes:
 - new function qemu_get_ram_block_host_ptr() returns host pointer
   to the ram block, it is needed to calculate offset of specific
   region in the host memory
 - new field mmap_offset is added to the VhostUserMemoryRegion. It
   contains offset where specific region starts in the mapped memory.
   As there is stil no wider adoption of vhost-user agreement was made
   that we will not bump version number due to this change
 - other fileds in VhostUserMemoryRegion struct are not changed, as
   they are all needed for usermode app implementation
 - region data is not taken from ram_list.blocks anymore, instead we
   use region data which is alredy calculated for use in vhost-net
 - Now multiple regions can have same FD and user applicaton can call
   mmap() multiple times with the same FD but with different offset
   (user needs to take care for offset page alignment)

Signed-off-by: Damjan Marion <damarion@cisco.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Damjan Marion <damarion@cisco.com>
2014-06-29 19:39:40 +03:00
Eduardo Habkost
12d6e4640c numa: Reject configuration if not all node IDs are present
We don't support sparse NUMA node IDs yet, so this changes QEMU to
reject configs where not all nodes are present.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-29 18:59:42 +03:00
Eduardo Habkost
1945b9d8b0 numa: Reject duplicate node IDs
The same nodeid shouldn't appear multiple times in the command-line.

In addition to detecting command-line mistakes, this will fix a bug
where nb_numa_nodes may become larger than MAX_NODES (and cause
out-of-bounds access on the numa_info array).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-29 18:59:42 +03:00
Eduardo Habkost
1af878e049 numa: Keep track of NUMA nodes present on the command-line
Based on "enable sparse node numbering" patch from Nishanth Aravamudan,
but without the code to actually support sparse node IDs. This just adds
the code to keep track of present/non-present nodes on the command-line,
without changing any behavior.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
[Rename max_numa_node to max_numa_nodeid -Eduardo]
[Initialize max_numa_nodeid to 0 -Eduardo]
[Use MAX() macro when setting max_numa_nodeid -Eduardo]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-29 18:59:41 +03:00
Dr. David Alan Gilbert
2f5732e964 Allow mismatched virtio config-len
Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.

Unfortunately, there are cases where this isn't true, the one
we found it on was the wce addition in virtio-blk.

Allow mismatched config-lengths:
   *) If the version on the wire is shorter then fine
   *) If the version on the wire is longer, load what we have space
      for and skip the rest.

(This is mst@redhat.com's rework of what I originally posted)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:41 +03:00
Don Slutz
5f8632d3c3 pc: make isapc and pc-0.10 to pc-0.13 have 1.7.0 memory layout
QEMU 2.0 changed memory layout for isapc and pc-0.10 to pc-0.13.
This prevents migration from QEMU 1.7.0 for these
machine types when -m 3.5G is specified.

Paolo Bonzini asked that:

    smbios_legacy_mode = true;
    has_reserved_memory = false;
    option_rom_has_mr = true;
    rom_file_has_mr = false;

also be done.

Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: https://bugs.launchpad.net/qemu/+bug/1334307
Tested-by: "Slutz, Donald Christopher" <dslutz@verizon.com>
2014-06-29 18:59:41 +03:00
Damjan Marion
46e797c4d3 vhost-user: fix wrong ids in documentation
Signed-off-by: Damjan Marion <damarion@cisco.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:41 +03:00
Marcelo Tosatti
f2ae8abf1f mc146818rtc: add rtc-reset-reinjection QMP command
It is necessary to reset RTC interrupt reinjection backlog if
guest time is synchronized via a different mechanism, such as
QGA's guest-set-time command.

Failing to do so causes both corrections to be applied (summed),
resulting in an incorrect guest time.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:35 +03:00
Eduardo Habkost
fa118d1f8b pc: Fix "prog_if" typo on PC_COMPAT_2_0
The property name is "prog_if", not "prof_if".

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:07 +03:00
Eduardo Habkost
b8f5cfd682 pc: Move q35 compat props to PC_COMPAT_*
For each compat property on PC_Q35_COMPAT_*, there are only two
possibilities:

 * If the device is never instantiated when using a machine other than
   pc-q35, then the compat property can be safely added to
   PC_COMPAT_*;
 * If the device can be instantiated when using a machine other than
   pc-q35, that means the other machines also need the compat property
   to be set.

That means we don't need separate PC_Q35_COMPAT_* macros at all, today.

The hpet.hpet-intcap case is interesting: piix and q35 do have something
that emulates different defaults, but the machine-specific default is
applied _after_ compat_props are applied, by simply checking if the
property is zero (which is the real default on the hpet code).

The hpet.hpet-intcap=0x4 compat property can (should?) be applied to
piix too, because 0x4 was the default on both piix and q35 before the
hpet-intcap property was introduced.

Now, if one day we change the default HPET intcap on one of the PC
machine-types again, we may want to introduce PC_{Q35,I440FX}_COMPAT
macros. But while we don't need that, we can keep the code simple.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:06 +03:00
Michael S. Tsirkin
9851d0fe35 numa: fix comment
s/if given for/is given for/;

Reported-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-29 18:59:06 +03:00
Michael S. Tsirkin
29923e94e7 openrisc: fix comment
Fix English in comment:

s/the each/each/

s/  \*\// \*\//

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-06-29 18:59:06 +03:00
Michael S. Tsirkin
d75e2f6889 numa: fix comment
Fix up English in comments:
s/the each/each/

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2014-06-29 18:59:06 +03:00
Peter Maydell
4f9c5be919 Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream:
  linux-user: support the SIOCGIFINDEX ioctl
  linux-user: support the KDSIGACCEPT ioctl
  linux-user: allow NULL tv argument for settimeofday
  linux-user: respect timezone for settimeofday
  linux-user: fix struct target_epoll_event layout for MIPS
  linux-user: support strace of epoll_create1
  linux-user: allow NULL arguments to mount
  linux-user: support SO_PASSSEC setsockopt option
  linux-user: support SO_{SND, RCV}BUFFORCE setsockopt options
  linux-user: support SO_ACCEPTCONN getsockopt option
  linux-user: translate the result of getsockopt SO_TYPE
  linux-user: added fake open() for /proc/self/cmdline
  Add support for MAP_NORESERVE mmap flag.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 16:44:13 +01:00
Peter Maydell
4daebe014e Merge remote-tracking branch 'remotes/xtensa/tags/20140629-xtensa' into staging
Xtensa fixes and improvements queue 2014-06-29:
- fix FLASH mapping to boot region for KC705;
- clean up boot parameters passing;
- add uImage, DTB and initrd support.

# gpg: Signature made Sat 28 Jun 2014 23:40:32 BST using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"

* remotes/xtensa/tags/20140629-xtensa:
  hw/xtensa/xtfpga: implement initrd loading
  hw/xtensa/xtfpga: implement DTB loading
  hw/xtensa/xtfpga: implement uImage loading
  hw/xtensa/xtfpga: add memory info to bootparam
  hw/xtensa/xtfpga: refactor bootparameters filling
  hw/xtensa/xtfpga: use symbolic constants for bootparam tags
  hw/xtensa/xtfpga: retrieve parameters from machine_opts
  hw/xtensa: replace fprintfs with error_report
  hw/xtensa: remove extraneous xtensa_ prefix from file names
  hw/xtensa/xtfpga: fix FLASH mapping to boot region for KC705

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 16:17:50 +01:00
Peter Maydell
2d40fa6987 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.1.0-rc0

# gpg: Signature made Fri 27 Jun 2014 19:50:32 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (47 commits)
  iotests: Fix 083 for out-of-tree builds
  iotests: Drop Python version from 065's Shebang
  iotests: Use $PYTHON for Python scripts
  iotests: Source common.env
  configure: Enable out-of-tree iotests
  iotests: Allow out-of-tree run
  block.c: Don't return success for bdrv_append_temp_snapshot() failure
  qemu-iotests: Add TestRepairQuorum to 041 to test drive-mirror node-name mode.
  block: Add replaces argument to drive-mirror
  blockjob: Fix recent BLOCK_JOB_ERROR regression
  blockjob: Fix recent BLOCK_JOB_READY regression
  virtio-blk: Rename complete_request_early to complete_request_vring
  virtio-blk: Unify {non-,}dataplane's request handlings
  virtio-blk: Schedule BH in the right context
  virtio-blk: Export request handling functions to dataplane
  virtio-blk: Make request completion function virtual
  block: acquire AioContext in qmp_query_blockstats()
  block: make bdrv_query_stats() static
  virtio-blk: Fix and clean up the in_sg and out_sg check
  virtio-blk: Fill in VirtIOBlockReq.out in dataplane code
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 15:24:54 +01:00
Peter Maydell
ac8076ac86 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  docs/qmp: Fix documentation of BLOCK_JOB_READY to match code
  char: report frontend open/closed state in 'query-chardev'
  virtio-serial: report frontend connection state via monitor
  qmp: add qmp-events.txt back
  qapi event: clean up in callers
  qapi script: clean up in scripts
  qapi: ignore generated event files
  qapi: move event defines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 13:39:04 +01:00
Peter Maydell
76fbbec931 Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Net patches

# gpg: Signature made Fri 27 Jun 2014 14:10:57 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  hw/net/eepro100: Implement read-only bits in MDI registers
  net: move queue number into NICPeers
  net: L2TPv3 transport
  qemu-bridge-helper: Fix fd leak in main()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 12:45:54 +01:00
Paul Burton
f63eb01ac7 linux-user: support the SIOCGIFINDEX ioctl
Add a definition of the SIOCGIFINDEX ioctl, allowing its use by target
programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
ca56f5b596 linux-user: support the KDSIGACCEPT ioctl
Add a definition of the KDSIGACCEPT ioctl & allow its use by target
programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
b67d80311a linux-user: allow NULL tv argument for settimeofday
The tv argument to the settimeofday syscall is allowed to be NULL, if
the program only wishes to provide the timezone. QEMU previously
returned -EFAULT when tv was NULL. Instead, execute the syscall &
provide NULL to the kernel as the target program expected.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
ef4467e911 linux-user: respect timezone for settimeofday
The settimeofday syscall accepts a tz argument indicating the desired
timezone to the kernel. QEMU previously ignored any argument provided
by the target program & always passed NULL to the kernel. Instead,
translate the argument & pass along the data userland provided.

Although this argument is described by the settimeofday man page as
obsolete, it is used by systemd as of version 213.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
fd76783243 linux-user: fix struct target_epoll_event layout for MIPS
MIPS requires the pad field to 64b-align the data field just as ARM
does.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
0fa82d39c8 linux-user: support strace of epoll_create1
Add the epoll_create1 syscall to strace.list in order to display that
syscall when it occurs, rather than a message about the syscall being
unknown despite QEMU already implementing support for it.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
356d771b30 linux-user: allow NULL arguments to mount
Calls to the mount syscall can legitimately provide NULL as the value
for the source of filesystemtype arguments, which QEMU would previously
reject & return -EFAULT to the target program. An example of this is
remounting an already mounted filesystem with different properties.

Instead of rejecting such syscalls with -EFAULT, pass NULL along to the
kernel as the target program expects.

Additionally this patch fixes a potential memory leak when DEBUG_REMAP
is enabled and lock_user_string fails on the target or filesystemtype
arguments but a prior argument was non-NULL and already locked.

Since the patch already touched most lines of the TARGET_NR_mount case,
it fixes the indentation & coding style for good measure.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
82d0fe6b7a linux-user: support SO_PASSSEC setsockopt option
Translate the SO_PASSSEC option to setsockopt to the host value &
perform the syscall as expected, allowing use of the option by target
programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:59 +03:00
Paul Burton
d79b6cc435 linux-user: support SO_{SND, RCV}BUFFORCE setsockopt options
Translate the SO_SNDBUFFORCE & SO_RCVBUFFORCE options to setsockopt to
the host values & perform the syscall as expected, allowing use of those
options by target programs.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Paul Burton
aec1ca411e linux-user: support SO_ACCEPTCONN getsockopt option
Translate the SO_ACCEPTCONN option to the host value & execute the
syscall as expected.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Paul Burton
8289d11281 linux-user: translate the result of getsockopt SO_TYPE
QEMU previously passed the result of the host syscall directly to the
target program. This is a problem if the host & target have different
representations of socket types, as is the case when running a MIPS
target program on an x86 host. Introduce a host_to_target_sock_type
helper function mirroring the existing target_to_host_sock_type, and
call it to translate the value provided by getsockopt when called for
the SO_TYPE option.

Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Wim Vander Schelden
76b9424550 linux-user: added fake open() for /proc/self/cmdline
Signed-off-by: Wim Vander Schelden <wim@fixnum.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Christophe Lyon
e8efd8e71f Add support for MAP_NORESERVE mmap flag.
mmap_flags_tbl contains a list of mmap flags, and how to map them to
the target. This patch adds MAP_NORESERVE, which was missing to the
list.

Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-29 14:19:58 +03:00
Peter Maydell
2d80e0ab4b Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-06-27

Changes include:

  - instruction emulation fixes
  - linux-user fixes
  - mac99: layout fixes
  - pseries: Initial VFIO support
  - pseries: support for UUID
  - pseries: support for -boot m

# gpg: Signature made Fri 27 Jun 2014 12:51:01 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream: (32 commits)
  PPC: e500: Only create dt entries for existing serial ports
  spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
  vmstate: Add preallocation for migrating arrays (VMS_ALLOC flag)
  xics: Implement xics_ics_free()
  spapr: Remove @next_irq
  spapr: Move interrupt allocator to xics
  xics: Disable flags reset on xics reset
  xics: Add xics_find_source()
  xics: Add flags for interrupts
  spapr: Add RTAS sysparm SPLPAR Characteristics
  spapr: Add RTAS sysparm UUID
  spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE
  spapr: Add rtas_st_buffer utility function
  spapr: Define a 2.1 pseries machine
  spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)
  target-ppc: Add support for POWER8 pvr 0x4D0000
  uninorth: Fix PCI hole size
  mac99: Add motherboard devices before PCI cards
  target-ppc: Remove unused gen_qemu_ld8s()
  target-ppc: Remove unused IMM and d extract helpers
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 11:59:00 +01:00
Peter Maydell
de6793e8c2 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140627' into staging
A series of patches to the s390-ccw bios:
- code cleanup
- improved error reporting
- most important, support to ipl (boot) from ECKD DASD (CDL, LDL or CMS
  formatted)

# gpg: Signature made Fri 27 Jun 2014 12:03:30 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140627:
  pc-bios/s390-ccw: update binary
  pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD
  pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
  pc-bios/s390-ccw: factor out ipl code
  pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
  pc-bios/s390-ccw: Unify error handling
  pc-bios/s390-ccw: add some utility code
  pc-bios/s390-ccw: handle different sector sizes
  pc-bios/s390-ccw: cleanup and enhance bootmap defintions
  pc-bios/s390-ccw: make checkpatch happy

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-29 11:43:31 +01:00
Peter Maydell
1045fc0439 tcg/ppc: Fix support for 64-bit PPC MacOSX hosts
Add back in the support for 64-bit PPC MacOSX hosts that was
broken in the recent merge of the 32-bit and 64-bit TCG backends.

Reported-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Andreas Färber <andreas.faerber@web.de>
2014-06-29 11:38:50 +01:00
Max Filippov
f55b32e749 hw/xtensa/xtfpga: implement initrd loading
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
996dfe98ed hw/xtensa/xtfpga: implement DTB loading
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
364d480242 hw/xtensa/xtfpga: implement uImage loading
Provide a simple bootloader code at the reset address that jumps to the
loaded image entry point when it's not equal to the reset address. This
is needed because the old method of setting pc doesn't work due to cpu
reset done after the machine setup.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
b6edea8b68 hw/xtensa/xtfpga: add memory info to bootparam
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
a9a28591fb hw/xtensa/xtfpga: refactor bootparameters filling
Separate filling first/last tag and size calculation from the kernel
command line setup.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
62dbaede80 hw/xtensa/xtfpga: use symbolic constants for bootparam tags
Import bootparam tag names from linux/arch/xtensa/include/asm/bootparam.h
No functional changes.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
37b259d034 hw/xtensa/xtfpga: retrieve parameters from machine_opts
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:42 +04:00
Max Filippov
8488ab021b hw/xtensa: replace fprintfs with error_report
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:41 +04:00
Max Filippov
b707ab757e hw/xtensa: remove extraneous xtensa_ prefix from file names
While at it rename lx60 (named after the first board of the family) to
more generic xtfpga (the family name).

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:41 +04:00
Max Filippov
37ed7c4b24 hw/xtensa/xtfpga: fix FLASH mapping to boot region for KC705
On KC705 bootloader area is located at FLASH offset 0x06000000, not 0 as
on older xtfpga boards.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2014-06-29 02:32:41 +04:00
Max Reitz
f5264553c3 iotests: Fix 083 for out-of-tree builds
iotest 083 filters out debug messages from nbd, which are prefixed (and
recognized) by __FILE__. However, the current filter (/^nbd\.c…/) is
valid for in-tree builds only, as out-of-tree builds will have a path
before that filename (e.g. "/tmp/qemu/nbd.c"). Fix this by adding .*
before "nbd\.c".

While working on this, also fix the regexes: '.' should be escaped and a
single backslash is not enough for escaping when enclosed by double
quotes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:01 +02:00
Max Reitz
f99b4b5d7e iotests: Drop Python version from 065's Shebang
Test 065 specified python2 to be used in its Shebang; this might not
work on systems without a python2 symlink and furthermore it is now
counter-productive, as the check script compares the Shebang to
"#!/usr/bin/env python" and only uses the Python interpreter selected by
configure on an exact match.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:01 +02:00
Max Reitz
ea81ca9de1 iotests: Use $PYTHON for Python scripts
Instead of invoking Python scripts directly via ./, use $PYTHON to
obtain the correct Python interpreter command.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Max Reitz
7fed1a49ff iotests: Source common.env
Source common.env in the iotests' check script.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Max Reitz
76c7560ae7 configure: Enable out-of-tree iotests
In order to allow out-of-tree iotests, create a symlink for the check
script in the build tree.

While doing so, also write configured options relevant to the iotests to
common.env in the build tree; currently, this is the command to invoke
Python 2.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Max Reitz
e8f8624d3b iotests: Allow out-of-tree run
As out-of-tree builds are preferred for qemu, running the qemu-iotests
in that out-of-tree build should be supported as well. To do so, a
symbolic link has to be created pointing to the check script in the
source directory. That script will check whether it has been run through
a symlink, and if so, will assume it is run in the build tree. All
output and temporary operations performed by iotests are then redirected
here and, unless specified otherwise by the user, QEMU_PROG etc. will be
set to paths appropriate for the build tree.

Also, drop making every test case executable if it is not yet, as this
would modify the source tree which is not desired for out-of-tree runs
and should be fixed in the repository anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Chen Gang
6b8aeca574 block.c: Don't return success for bdrv_append_temp_snapshot() failure
When failure occurs, 'ret' need be set, or may return 0 to indicate
success. Previously, an error was set in errp, but 0 was returned
anyway. So let bdrv_append_temp_snapshot() return an error code and
use that for the bdrv_open() return value.

Also, error_propagate() need be called only one time within a function.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Benoît Canet
d88964aeda qemu-iotests: Add TestRepairQuorum to 041 to test drive-mirror node-name mode.
The to-replace-node-name is designed to allow repairing a broken Quorum file.
This patch introduces a new class TestRepairQuorum testing that the feature
works.
Some further work will be done on QEMU to improve the robustness of the tests.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Benoît Canet
09158f00e0 block: Add replaces argument to drive-mirror
drive-mirror will bdrv_swap the new BDS named node-name with the one
pointed by replaces when the mirroring is finished.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Markus Armbruster
823c686356 blockjob: Fix recent BLOCK_JOB_ERROR regression
Commit 5a2d2cb screwed up the the value of members device and action,
breaking tests/qemu-iotests/041.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Markus Armbruster
518848a214 blockjob: Fix recent BLOCK_JOB_READY regression
Commit bcada37 dropped the (up to now undocumented) members type, len,
offset, speed, breaking tests/qemu-iotests/040 and 041.

Restore and document them.  This fixes 040, and partially fixes 041.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Markus Armbruster
a22d8e47f7 docs/qmp: Fix documentation of BLOCK_JOB_READY to match code
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 13:40:41 -04:00
Fam Zheng
d64c60a75f virtio-blk: Rename complete_request_early to complete_request_vring
The old name is misleading in its new usage, so rename it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:42 +02:00
Fam Zheng
b002254dbd virtio-blk: Unify {non-,}dataplane's request handlings
This drops request handling code from dataplane, and uses code from
hw/block/virtio-blk.c.

It starts to use multiwrite as non-dataplane does.

Dataplane sets VirtIOBlock.complete_request to vring version, and calls
into non-dataplane's process handling. In complete_request_early,
qiov.size is added to vring push length, because it's also called in rw
completion now.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:39 +02:00
Fam Zheng
4407c1c56a virtio-blk: Schedule BH in the right context
The BH must be called in the AioContext of bs. Currently it is only the
main loop, but with coming changes, it could also be a dataplane
IOThread.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:37 +02:00
Fam Zheng
fee65db771 virtio-blk: Export request handling functions to dataplane
So that dataplane can use virtio_blk_handle_request and
virtio_submit_multiwrite.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:35 +02:00
Fam Zheng
bf4bd461b4 virtio-blk: Make request completion function virtual
virtio_blk_req_complete will call VirtIOBlock.complete_request() to push
data and notify guest. No functional change.

Later, this will allow dataplane to provide it's own (vring_) version.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:32 +02:00
Stefan Hajnoczi
13344f3a17 block: acquire AioContext in qmp_query_blockstats()
Make query-blockstats safe for dataplane by acquiring the
BlockDriverState's AioContext.  This ensures that the dataplane IOThread
and the main loop's monitor code do not race.

Note the assumption that acquiring the drive's BDS AioContext also
protects ->file and ->backing_hd.  This assumption is made by other
aio_context_acquire() callers too.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:29 +02:00
Stefan Hajnoczi
ac46821f2c block: make bdrv_query_stats() static
This function is only called from block/qapi.c.  There is no need to
keep it public.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:19:57 +02:00
Fam Zheng
ee17e84830 virtio-blk: Fix and clean up the in_sg and out_sg check
out_sg is checked by iov_to_buf below, so it can be dropped.

Add assert and iov_discard_back around in_sg, as the in_sg is handled in
dataplane code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:31 +02:00
Fam Zheng
ab2e3cd2dc virtio-blk: Fill in VirtIOBlockReq.out in dataplane code
VirtIOBlockReq is allocated in process_request, and freed in command
functions.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:28 +02:00
Fam Zheng
827805a249 virtio-blk: Convert VirtIOBlockReq.out to structrue
The virtio code currently assumes that the outhdr is in its own iovec.
This is not guaranteed by the spec, so we should relax this assumption.

Convert the VirtIOBlockReq.out field to structrue so that we can use
iov_to_buf and then discard the header from the beginning of iovec.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:25 +02:00
Fam Zheng
eddb102e86 virtio-blk: Use VirtIOBlockReq.in to drop VirtIOBlockReq.inhdr
In current virtio spec, inhdr is a single byte, and is unlikely to
change for both functionality and compatibility considerations.
Non-dataplane uses .in, and we are on the way to converge them. So
let's unify it to get cleaner code.

Remove .inhdr and use .in.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:23 +02:00
Fam Zheng
04af2d70c5 virtio-blk: Replace VirtIOBlockRequest with VirtIOBlockReq
Field "inhdr" is added temporarily for a more mechanical change, and
will be dropped in the next commit.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:20 +02:00
Fam Zheng
98e2d49241 virtio-blk: Drop VirtIOBlockRequest.read
Since it's set but not used.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:18 +02:00
Fam Zheng
0bcb34472d virtio-blk: Drop bounce buffer from dataplane code
The block layer will handle the unaligned request.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:16 +02:00
Fam Zheng
671ec3f056 virtio-blk: Convert VirtIOBlockReq.elem to pointer
This will make converging with dataplane code easier.

Add virtio_blk_free_request to handle the freeing of request internal
fields.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:18:13 +02:00
Fam Zheng
09f6458770 virtio-blk: Move VirtIOBlockReq to header
For later reusing by dataplane code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:17:59 +02:00
Hani Benhabiles
8c5d1abbb7 nbd: Don't validate from and len in NBD_CMD_DISC.
These values aren't used in this case.

Currently, the from field in the request sent by the nbd kernel module leading
to a false error message when ending the connection with the client.

$ qemu-nbd some.img -v
// After nbd-client -d /dev/nbd0
nbd.c:nbd_trip():L1031: From: 18446744073709551104, Len: 0, Size: 20971520,
Offset: 0
nbd.c:nbd_trip():L1032: requested operation past EOF--bad client?
nbd.c:nbd_receive_request():L638: read failed

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-27 16:06:48 +02:00
Hani Benhabiles
60fe4fac22 nbd: Don't export a block device with no medium.
The device is exported with erroneous values and can't be read.

Before the patch:
$ sudo nbd-client localhost -p 10809 /dev/nbd0 -name floppy0
Negotiation: ..size = 17592186044415MB
bs=1024, sz=18446744073709547520 bytes

$ sudo mount /dev/nbd0 /mnt/tmp/
mount: block device /dev/nbd0 is write-protected, mounting read-only
mount: /dev/nbd0: can't read superblock

After the patch:
(qemu) nbd_server_add ide0-hd0
(qemu) nbd_server_add floppy0
Device 'floppy0' has no medium

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-27 16:06:48 +02:00
Laszlo Ersek
32a97ea171 char: report frontend open/closed state in 'query-chardev'
In addition to the on-line reporting added in the previous patch, allow
libvirt to query frontend state independently of events.

Libvirt's path to identify the guest agent channel it cares about differs
between the event added in the previous patch and the QMP response field
added here. The event identifies the frontend device, by "id". The
'query-chardev' QMP command identifies the backend device (again by "id").
The association is under libvirt's control.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1080376

Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:34:00 -04:00
Laszlo Ersek
e2ae6159de virtio-serial: report frontend connection state via monitor
Libvirt wants to know about the guest-side connection state of some
virtio-serial ports (in particular the one(s) assigned to guest agent(s)).
Report such states with a new monitor event.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1080376
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:33:27 -04:00
Luiz Capitulino
dfab489214 qmp: add qmp-events.txt back
The conversion of events to the QAPI, resulted in the removal of the
docs/qmp/qmp-events.txt file. This was done to avoid having duplicated
information between qmp-events.txt and qapi-event.json.

However, qmp-events.txt contains examples and we're still not sure
how to proper install QAPI docs in the host. To avoid harming users,
it's better to re-add qmp-events.txt for now and deal with the
duplication later.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-27 09:27:56 -04:00
Wenchao Xia
2f44a08b3e qapi event: clean up in callers
This patch improves docs and address small issues in event
callers.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:56 -04:00
Wenchao Xia
d6f9c82c62 qapi script: clean up in scripts
This patch improve docs and uses c_type(argentry, is_param=True)
in script.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:56 -04:00
Wenchao Xia
1dbbe04525 qapi: ignore generated event files
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:55 -04:00
Wenchao Xia
82d72d9d23 qapi: move event defines
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-27 09:27:55 -04:00
Richard Henderson
d4cba13bdf tcg/ppc: Fix failure in tcg_out_mem_long
With rt != r0 on loads, we use rt for scratch.  If we need an index
register different from base, we can't use rt, but r0 is usable.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1403843160-30332-1-git-send-email-rth@twiddle.net
Tested-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-27 13:23:41 +01:00
Benoît Canet
4c828dc61a block: Add node-name argument to drive-mirror
This new argument can be used to specify the node-name of the new mirrored BDS.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 14:18:18 +02:00
Benoît Canet
cf29a570a7 quorum: Add the rewrite-corrupted parameter to quorum
On read operations when this parameter is set and some replicas are corrupted
while quorum can be reached quorum will proceed to rewrite the correct version
of the data to fix the corrupted replicas.

This will shine with SSD where the FTL will remap the same block at another
place on rewrite.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 14:18:17 +02:00
Alexander Graf
79c0ff2cae PPC: e500: Only create dt entries for existing serial ports
When the user specifies -nodefaults he can tell us that he doesn't want any
serial ports spawned by default. While we do honor that wish, we still create
device tree entries for those non-existent devices.

Make device tree generation depend on whether the device is actually available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
9a321e9234 spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.

This makes use of new XICS ability to reuse interrupts.

This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.

This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.

This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.

This fixed traces to be more informative.

This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
f32935ea22 vmstate: Add preallocation for migrating arrays (VMS_ALLOC flag)
There are few helpers already to support array migration. However they all
require the destination side to preallocate arrays before migration which
is not always possible due to unknown array size as it might be some
sort of dynamic state. One of the examples is an array of MSIX-enabled
devices in SPAPR PHB - this array may vary from 0 to 65536 entries and
its size depends on guest's ability to enable MSIX or do PCI hotplug.

This adds new VMSTATE_VARRAY_STRUCT_ALLOC macro which is pretty similar to
VMSTATE_STRUCT_VARRAY_POINTER_INT32 but it can alloc memory for migratign
array on the destination side.

This defines VMS_ALLOC flag for a field.

This changes vmstate_base_addr() to do the allocation when receiving
migration.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
51bba713fe xics: Implement xics_ics_free()
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
ba0e5bf8de spapr: Remove @next_irq
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
bee763dbfb spapr: Move interrupt allocator to xics
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.

This moves an allocator from SPAPR to XICS.

This switches IRQ users to use new API.

This uses LSI/MSI flags to know if interrupt is allocated.

The interrupt release function will be posted as a separate patch.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
a7e519a8cf xics: Disable flags reset on xics reset
Since islsi[] array has been merged into the ICSState struct,
we must not reset flags as they tell if the interrupt is in use.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
641c349352 xics: Add xics_find_source()
PAPR allows having multiple interrupt sources such as PHB.

This adds a source lookup function and makes use of it.

Since at the moment QEMU only supports a single source,
no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy
4af88944d0 xics: Add flags for interrupts
The existing interrupt allocation scheme in SPAPR assumes that
interrupts are allocated at the start time, continously and the config
will not change. However, there are cases when this is not going to work
such as:

1. migration - we will have to have an ability to choose interrupt
numbers for devices in the command line and this will create gaps in
interrupt space.

2. PCI hotplug - interrupts from unplugged device need to be returned
back to interrupt pool, otherwise we will quickly run out of interrupts.

This replaces a separate lslsi[] array with a byte in the ICSIRQState
struct and defines "LSI" and "MSI" flags. Neither of these flags set
signals that the descriptor is not allocated and not in use.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff
3b50d8974b spapr: Add RTAS sysparm SPLPAR Characteristics
Add support for the SPLPAR Characteristics parameter to the emulated
RTAS call ibm,get-system-parameter.

The support provides just enough information to allow "cat
/proc/powerpc/lparcfg" to succeed without generating a kernel error
message.

Without this patch the above command will produce the following kernel
message: arch/powerpc/platforms/pseries/lparcfg.c \
parse_system_parameter_string Error calling get-system-parameter \
(0xfffffffd)

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff
b907d7b0fd spapr: Add RTAS sysparm UUID
Add support for the UUID parameter to the emulated RTAS call
ibm,get-system-parameter.

Return the guest's UUID as the value for the RTAS UUID system
parameter, or null (a zero length result) if it is not set.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff
3052d95190 spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE
This allows the ibm,get-system-parameter RTAS call to succeed for the
DIAGNOSTICS_RUN_MODE system parameter.

The problem can be seen with "ppc64_cpu --run-mode" from the
powerpc-utils package which fails before this patch with "Machine does
not support diagnostic run mode".

This is corrected by using the rtas_st_buffer() function to write to
the buffer.

The RTAS constants are also moved out into a header file, some new
constants added and the surrounding code slightly simplified.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[agraf: remove some commentary]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Sam bobroff
ce3fa1eca2 spapr: Add rtas_st_buffer utility function
Add a function to write lengh + data into a buffer as required for the
emulation of the RTAS ibm,get-system-parameter call.

If the destination is smaller than the source, the write is truncated
and success is returned. This matches the behaviour of pHyp.

This will be used in following patches.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy
6026db4501 spapr: Define a 2.1 pseries machine
This adds a v2.1 machine to support backward compatibility
for newer macines in the case if they ever be implemented.

This adds a "pseries-2.1" machine as a child of the "pseries"
machine and only changes visible machine name.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy
6ca1502e36 spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)
Every single sPAPR QOM object has small first "s".
Most (not all yet) QOM objects have "State" suffix.

This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU
code style and removes redundant empty line.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy
f6c3ebcc3b target-ppc: Add support for POWER8 pvr 0x4D0000
At the moment QEMU knows about one version of POWER8 CPU with
PVR 0x4B.0000. This CPU class is defined as "POWER8". The linux
kernel names it as "POWER8E" which is different from the name QEMU uses.

Now we get another version of POWER8 which is architecturally equivalent
to POWER8E but has different PVR 0x4D.0000 so QEMU fails to find
a PPC CPU class on these machines. The linux kernel names these CPUs as
"POWER8".

This renames the existing "POWER8" to "POWER8E" to be more precise and
stay in sync with the linux kernel.

This adds a new "POWER8" family which calls POWER8E class init function
and defines own PVR mask (used to match a CPU class) and desc (used to
create dynamic version-less CPU class).

This does not change CPU class fw_name attribute as the host POWER8
firmware keeps using "PowerPC,POWER8" on both POWER8 and POWER8E.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
BALATON Zoltan
1be88255a7 uninorth: Fix PCI hole size
Fix PCI hole size to match that what is found on real hardware.
(OpenBIOS already uses the correct length.)

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
BALATON Zoltan
a0bb2a5fa0 mac99: Add motherboard devices before PCI cards
Change the order of creating devices for New World Mac emulation so
that devices on the motherboard are added first and PCI cards (VGA and
NIC) come later. As a side effect, this also causes OpenBIOS to map
the motherboard devices into the MMIO space to the same addresses as
on real hardware and allow clients that hardcode these addresses (e.g.
MorphOS) to find and use them until OpenBIOS is tought to map devices
to specific addresses. (On real hardware the graphics and network
cards are really on separate buses but we don't model that yet.) This
brings the memory map closer to what is found on PowerMac3,1.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
Peter Maydell
c99b6f879a target-ppc: Remove unused gen_qemu_ld8s()
The gen_qemu_ld8s() function is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Peter Maydell
b247812e4a target-ppc: Remove unused IMM and d extract helpers
Remove the definition of the IMM and d extract helpers; these seem to have
been added as part of the initial PPC support in 2003 but never actually
used.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
591812634c vfio: Enable for SPAPR
This turns the sPAPR support on and enables VFIO container use
in the kernel.

This extends vfio_connect_container to support VFIO_SPAPR_TCE_IOMMU type
in the host kernel.

This registers a memory listener which sPAPR IOMMU will notify when
executing H_PUT_TCE/etc DMA calls. The listener then will notify the host
kernel about DMA map/unmap operation via VFIO_IOMMU_MAP_DMA/
VFIO_IOMMU_UNMAP_DMA ioctls.

This executes VFIO_IOMMU_ENABLE ioctl to make sure that the IOMMU is free
of mappings and can be exclusively given to the user. At the moment SPAPR
is the only platform requiring this call to be implemented.

Note that the host kernel function implementing VFIO_IOMMU_DISABLE
is called automatically when container's fd is closed so there is
no need to call it explicitly from QEMU. We may need to call
VFIO_IOMMU_DISABLE explicitly in the future for some sort of dynamic
reconfiguration (PCI hotplug or dynamic IOMMU group management).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
9fc34ada7e spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).

Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.

Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB

where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun  5 12:49 iommu_group -> ../../../kernel/iommu_groups/4

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
6d8be4c343 vfio: Add vfio_container_ioctl()
While most operations with VFIO IOMMU driver are generic and used inside
vfio.c, there are still some operations which only specific VFIO IOMMU
drivers implement. The first example of it will be reading a DMA window
start from the host.

This adds a helper which passes an ioctl request to the container's fd.

The helper will check if @req is known. For this, stub is added. This return
-1 on any requests for now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
9bb62a0702 spapr_iommu: Make in-kernel TCE table optional
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handle H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.

Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is requited.

So until the host kernel gets VFIO support for H_PUT_TCE, we better not
to register VFIO's TCE in the host.

This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.

This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.

This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic
is the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.

This is a preparation patch so no change in behaviour is expected

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
3a3b8502e6 spapr: Fix RTAS token numbers
At the moment spapr_rtas_register() allocates a new token number for every
new RTAS callback so numbers are not fixed and depend on the number of
supported RTAS handlers and the exact order of spapr_rtas_register() calls.
These tokens are copied into the device tree and remain the same during
the guest lifetime.

When we start another guest to receive a migration, it calls
spapr_rtas_register() as well. If the number of RTAS handlers or their
order is different in QEMU on source and destination sides, the "/rtas"
node in the device tree will differ. Since migration overwrites the device
tree (as it overwrites the entire RAM), the actual RTAS config on
the destination side gets broken.

This defines global contant values for every RTAS token which QEMU
is using today.

This changes spapr_rtas_register() to accept a token number instead of
allocating one. This changes all users of spapr_rtas_register().

This changes XICS-KVM not to cache tokens registered with KVM as they
constant now.

This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
a base. TOKEN_MAX is moved and renamed too and its value is changed
to the last token + 1. Boundary checks for token values are adjusted.

This reserves token numbers for "os-term" handlers and PCI hotplug
which we are working on.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Alexander Graf
b3cad3abf6 PPC: Add support for Apple gdb in gdbstub
The Apple gdbstub protocol is different from the normal gdbstub protocol
used on PowerPC. Add support for the different variant, so that we can use
Apple's gdb to debug guest code.

Keep in mind that the switch is a compile time option. We can't detect
during runtime whether a gdb connecting to us is an upstream gdb or an
Apple gdb.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Sorav Bansal
294d129289 target-ppc: fixed translation of mcrxr instruction
Fixed bug in gen_mcrxr() in target-ppc/translate.c:
The XER[SO], XER[OV], and XER[CA] flags are stored in the least
significant bit (bit 0) of their respective registers. They need
to be shifted left (by their respective offsets) to generate the final
XER value. The old translation code for the 'mcrxr' instruction
was assuming that  the flags are stored in bit 2, and was shifting them
right (incorrectly)

Signed-off-by: Sorav Bansal <sbansal@cse.iitd.ernet.in>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Avik Sil
cc84c0f357 spapr: Add "qemu, boot-menu" property to /chosen
This is required to enable boot menu display during booting

Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
a60438ddd6 linux-user: Support HWCAP2 in PowerPC
Set bits in the AT_HWCAP2 entry of the AUXV.  Specifically, detect and set bits
for bctar, ISEL and ISA 2.07.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
0e019746d7 linux-user: Identify Addition Hardware Capabilities for PowerPC
Add VSX, DFP and ISA 2.06 to the bits identified in the AT_HWCAP
entry of the AUXV.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
b2f1355020 target-ppc: Add DFP to Emulated Instructions Flag
Decimal Floating Point is emulated, so add it the mask.  This will
fix the erroneous message:

  Warning: Disabling some instructions which are not emulated by TCG (0x0, 0x4)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Tom Musta
623e250abd linux-user: Correct AUXV Cache Line Sizes for PowerPC
Set the AT_ICACHEBSIZE and AT_DCACHEBSIZE entries of the AUXV to match the
CPU model's cache line sizes.  This fixes memory clobbering problems on more
recent Book 3s implementations; memset(p, 0, N) will use the dcbz instruction
when N is sufficiently large and many of the newer server CPUs have cache lines
sizes of 128 bytes.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:21 +02:00
Peter Maydell
5e80dd223d hw/net/eepro100: Implement read-only bits in MDI registers
Although we defined an eepro100_mdi_mask[] array indicating which bits
in the registers are read-only, we weren't actually doing anything with
it. Make the MDI register-write code use it rather than manually making
register 1 read-only and leaving the rest as reads-as-written. (The
special-case handling of register 0 remains as before since its mask is
all-zeros and the special casing happens before we apply the masking.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402159924-13853-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 12:23:45 +02:00
Jens Freimann
77416f4075 pc-bios/s390-ccw: update binary
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:53 +02:00
Eugene (jno) Dvurechenski
564e52b96f pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD
Add code that allows us to start from two further ECKD DASD disk
layouts: LDL (Linux disk layout) and CMS (cms-formatted disk).

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:52 +02:00
Eugene (jno) Dvurechenski
e0aff4aa3f pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
Add code that allows us to start from ECKD DASD using the z/OS
compatible disk layout (CDL), which is the most common format for ECKD
DASD.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:52 +02:00
Eugene (jno) Dvurechenski
a00b33d9e2 pc-bios/s390-ccw: factor out ipl code
Move the scsi-disk specific ipl code from zipl_load() into a new
function ipl_scsi(). This makes it easier to add ipl routines for other
disk types.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:11:49 +02:00
Eugene (jno) Dvurechenski
058cc1f311 pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
Factor out helper function for dumping a hex value into a buffer.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:28 +02:00
Eugene (jno) Dvurechenski
60612d5cbb pc-bios/s390-ccw: Unify error handling
Convert to IPL_assert and friends

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:28 +02:00
Eugene (jno) Dvurechenski
a94b485e17 pc-bios/s390-ccw: add some utility code
IPL_assert(term,message) is introduced to handle error conditions.
ebcdic_to_ascii() to convert chars (mostly to print VOLSERs).
read_block() provision for unified block-number handling.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:28 +02:00
Eugene (jno) Dvurechenski
91a03f9b69 pc-bios/s390-ccw: handle different sector sizes
Use the virtio device's configuration to figure out the disk geometry
and use a sector size based upon the layout.

[CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g]
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 12:10:03 +02:00
Eugene (jno) Dvurechenski
26f2bbd6b1 pc-bios/s390-ccw: cleanup and enhance bootmap defintions
Add declarations to describe structure of different dasd IPL sources
(eckd and fba). Move the structure definitions to a new header bootmap.h.
While we are at it, change structs to typedefs.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 11:58:47 +02:00
Eugene (jno) Dvurechenski
abd696e4f7 pc-bios/s390-ccw: make checkpatch happy
Remove tabs, tweak whitespace and comments.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-27 11:57:25 +02:00
Jeff Cody
d1fde4ad3c block: add qemu-iotest for resize base during live commit
If 'base' is smaller than the overlay image being committed into it,
then the base image will be grown in commit_run via bdrv_truncate().

This tests to make sure that this works, and the bdrv_truncate() is
not blocked when it shouldn't be.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 11:37:54 +02:00
Jeff Cody
9c75e168bc block: check for RESIZE blocker in the QMP command, not bdrv_truncate()
If we check for the RESIZE blocker in bdrv_truncate(), that means a
commit will fail if the overlay layer is larger than the base, due to
the backing blocker.

This is a regression in behavior from 2.0; currently, commit will try to
grow the size of the base image to match the overlay size, if the
overlay size is larger.

By moving this into the QMP command qmp_block_resize(), it allows
usage of bdrv_truncate() within block jobs.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 11:37:35 +02:00
Jiri Pirko
575a1c0e42 net: move queue number into NICPeers
It indicates the number of elements in ncs field and makes sense to have
int inside NICPeers. Also in parse_netdev we do not need to access
container and work with NICPeers only.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 11:19:31 +02:00
Anton Ivanov
3fb69aa1d1 net: L2TPv3 transport
This transport allows to connect a QEMU nic to a static Ethernet
over L2TPv3 tunnel. The transport supports all options present
in the Linux kernel implementation. It allows QEMU to connect
to any Linux host running kernel 3.3+, most routers and network
devices as well as other QEMU instances.

[Fixed up net_client_init1() switch statement to support -netdev
--Stefan]

Signed-off-by: Anton Ivanov <antivano@cisco.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 10:39:10 +02:00
Gonglei
eb3f45c5af qemu-bridge-helper: Fix fd leak in main()
initialize fd and ctlfd, and close them at the end

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-27 10:39:10 +02:00
Michal Privoznik
a760715095 qemu_opts_append: Play nicely with QemuOptsList's head
When running a libvirt test suite I've noticed the qemu-img is
crashing occasionally. Tracing the problem down led me to the
following valgrind output:

qemu.git $ valgrind -q ./qemu-img create -f qed -obacking_file=/dev/null,backing_fmt=raw qed
==14881== Invalid write of size 8
==14881==    at 0x1D263F: qemu_opts_create (qemu-option.c:692)
==14881==    by 0x130782: bdrv_img_create (block.c:5531)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==  Address 0x11fedd38 is 24 bytes inside a block of size 232 free'd
==14881==    at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14881==    by 0x592D35E: g_realloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==14881==    by 0x1D38D8: qemu_opts_append (qemu-option.c:1129)
==14881==    by 0x13075E: bdrv_img_create (block.c:5528)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==
Formatting 'qed', fmt=qed size=0 backing_file='/dev/null' backing_fmt='raw' cluster_size=65536
==14881== Invalid write of size 8
==14881==    at 0x1D28BE: qemu_opts_del (qemu-option.c:750)
==14881==    by 0x130BF3: bdrv_img_create (block.c:5638)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==  Address 0x11fedd38 is 24 bytes inside a block of size 232 free'd
==14881==    at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14881==    by 0x592D35E: g_realloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==14881==    by 0x1D38D8: qemu_opts_append (qemu-option.c:1129)
==14881==    by 0x13075E: bdrv_img_create (block.c:5528)
==14881==    by 0x118DE0: img_create (qemu-img.c:462)
==14881==    by 0x11E7E4: main (qemu-img.c:2830)
==14881==

The problem is apparently in the qemu_opts_append(). Well, if it
gets called twice or more. On the first call, when @dst is NULL
some initialization is done during which @dst->head list gets
initialized. The list is initialized in a way, so that the list
tail points at the list head. However, the next time
qemu_opts_append() is called for new options to be added,
g_realloc() may move @dst to a new address making the old list tail
point at an invalid address. If that's the case, we must update the
list pointers.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 15:53:52 +02:00
Peter Maydell
ff4873cb8c coroutine-win32.c: Add noinline attribute to work around gcc bug
A gcc codegen bug in x86_64-w64-mingw32-gcc (GCC) 4.6.3 means that
non-debug builds of QEMU for Windows tend to assert when using
coroutines. Work around this by marking qemu_coroutine_switch
as noinline.

If we allow gcc to inline qemu_coroutine_switch into
coroutine_trampoline, then it hoists the code to get the
address of the TLS variable "current" out of the while() loop.
This is an invalid transformation because the SwitchToFiber()
call may be called when running thread A but return in thread B,
and so we might be in a different thread context each time
round the loop. This can happen quite often.  Typically.
a coroutine is started when a VCPU thread does bdrv_aio_readv:

     VCPU thread

     main VCPU thread coroutine      I/O coroutine
        bdrv_aio_readv ----->
                                     start I/O operation
                                       thread_pool_submit_co
                       <------------ yields
        back to emulation

Then I/O finishes and the thread-pool.c event notifier triggers in
the I/O thread.  event_notifier_ready calls thread_pool_co_cb, and
the I/O coroutine now restarts *in another thread*:

     iothread

     main iothread coroutine         I/O coroutine (formerly in VCPU thread)
        event_notifier_ready
          thread_pool_co_cb ----->   current = I/O coroutine;
                                     call AIO callback

But on Win32, because of the bug, the "current" being set here the
current coroutine of the VCPU thread, not the iothread.

noinline is a good-enough workaround, and quite unlikely to break in
the future.

(Thanks to Paolo Bonzini for assistance in diagnosing the problem
and providing the detailed example/ascii art quoted above.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403535303-14939-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-06-26 14:08:14 +01:00
Peter Maydell
8589744aaf Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-2.1' into staging
X86CPU

* Filter out MONITOR for KVM
* Fix filtering for TCG
* -cpu foo,check and -cpu foo,enforce support for TCG
* -cpu host migration support (-cpu host,migratable=no to disable)
* Add invtsc feature support
* New model: Broadwell

# gpg: Signature made Wed 25 Jun 2014 22:55:04 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/qom-cpu-for-2.1:
  target-i386: Broadwell CPU model
  target-i386: Fix indentation of CPU model definitions
  target-i386: Support "invariant tsc" flag
  target-i386: block migration and savevm if invariant tsc is exposed
  savevm: check vmsd for migratability status
  target-i386: Set migratable=yes by default on "host" CPU mooel
  target-i386: Add "migratable" property to "host" CPU model
  target-i386: Support check/enforce flags in TCG mode, too
  target-i386: Loop-based feature word filtering in TCG mode
  target-i386: Loop-based copying and setting/unsetting of feature words
  target-i386: Define TCG_*_FEATURES earlier in cpu.c
  target-i386: Filter KVM and 0xC0000001 features on TCG
  target-i386: Filter FEAT_7_0_EBX TCG features too
  target-i386: Make TCG feature filtering more readable
  target-i386: Isolate KVM-specific code on CPU feature filtering logic
  target-i386: Pass FeatureWord argument to report_unavailable_features()
  target-i386: Merge feature filtering/checking functions
  target-i386: Simplify reporting of unavailable features
  target-i386: kvm: Don't enable MONITOR by default on any CPU model

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-26 13:33:11 +01:00
Paolo Bonzini
f3db17b951 qemu-char: initialize chr_write_lock
Otherwise, Windows fails with a deadlock.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1403679897-11480-1-git-send-email-pbonzini@redhat.com
Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-26 13:13:54 +01:00
Kevin Wolf
20cca275c6 block: Remove a special case for protocols
The only semantic change is that bs->open_flags gets BDRV_O_PROTOCOL set
now. This isn't useful, but it doesn't hurt either. The code that was
previously skipped by 'goto done' is automatically disabled because
protocol drivers don't support backing files (and if they did, this
would probably be a fix) and can't have snapshot_flags set.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
8ee79e707a block: Catch backing files assigned to non-COW drivers
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).

This is obvious nonsense and should be rejected.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
76c591b013 block: Remove second bdrv_open() recursion
This recursion was introduced in commit 505d7583 in order to allow
nesting image formats. It only ever takes effect when the user
explicitly specifies a driver name and that driver isn't suitable for
the protocol level.

We can check this earlier in bdrv_open() and if the explicitly
requested driver is a format driver, clear BDRV_O_PROTOCOL so that
another bs->file layer is opened.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
b348f3311c block: Inline bdrv_file_open()
It doesn't do much any more, we can move the code to bdrv_open() now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
f4788adcb4 block: Use common driver selection code for bdrv_open_file()
This moves the bdrv_open_file() call a bit down so that it can use the
bdrv_open() code that selects the right block driver.

The code between the old and the new call site is either common code
(the error message for an unknown driver has been unified now) or
doesn't run with cleared BDRV_O_PROTOCOL (added an if block in one
place, whereas the right path was already asserted in another place)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-06-26 13:51:01 +02:00
Kevin Wolf
17b005f1d4 block: Always pass driver name through options QDict
The "driver" entry in the options QDict is now only missing if we're
opening an image with format probing.

We also catch cases now where both the drv argument and a "driver"
option is specified, e.g. by specifying -drive format=qcow2,driver=raw

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
5e5c4f63f4 block: Move json: parsing to bdrv_fill_options()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
462f5bcf69 block: Move bdrv_fill_options() call to bdrv_open()
bs->options now contains the modified version of the options.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Kevin Wolf
f54120ff1a block: Create bdrv_fill_options()
The idea of bdrv_fill_options() is to convert every parameter for
opening images, in particular the filename and flags, to entries in the
options QDict.

This patch starts with moving the filename parsing and driver probing
part from bdrv_file_open() to the new function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Peter Lieven
f42ca3cad1 block/nfs: add knob to set readahead
upcoming libnfs will feature internal readahead support.
Add a knob to pass the optional readahead value as a URL
parameter.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:01 +02:00
Peter Lieven
7c24384b3b block/nfs: fix url parameter checking
this patch fixes the incorrect usage of strncmp and
adds simple error checking by means of parse_uint_full
instead of atoi for the supplied URL parameters.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:01 +02:00
Fam Zheng
3b9f27d2b3 qemu-iotests: Test 0-length image for mirror
All behavior and invariant should hold for images with 0 length, so
add a class to repeat all the tests in TestSingleDrive.

Hide two unapplicable test methods that would fail with 0 image length
because it's also used as cluster size.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:00 +02:00
Fam Zheng
8b9a30ca5b qemu-iotests: Test BLOCK_JOB_READY event for 0Kb image active commit
There should be a BLOCK_JOB_READY event with active commit, regardless
of image length. Let's test the 0 length image case, and make sure it
goes through the ready->complete process.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:00 +02:00
Fam Zheng
9e48b02540 mirror: Go through ready -> complete process for 0 len image
When mirroring or active committing a zero length image, BLOCK_JOB_READY
is not reported now, instead the job completes because we short circuit
the mirror job loop.

This is inconsistent with non-zero length images, and only confuses
management software.

Let's do the same thing when seeing a 0-length image: report ready
immediately; wait for block-job-cancel or block-job-complete; clear the
cancel flag as existing non-zero image synced case (cancelled after
ready); then jump to the exit.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:50:57 +02:00
Igor Mammedov
0931304788 qemu-char: fix warning 'res' may be used uninitialized
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1403683241-20678-1-git-send-email-imammedo@redhat.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-26 12:34:41 +01:00
Fam Zheng
dc71ce45de blockjob: Add block_job_yield()
This will unset busy flag and put coroutine to sleep, can be used to
wait for QMP complete/cancel.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 12:12:22 +02:00
Eduardo Habkost
ece0135407 target-i386: Broadwell CPU model
This adds a new CPU model named "Broadwell". It has all the features
from Haswell, plus PREFETCHW, RDSEED, ADX, SMAP.

PREFETCHW was already supported as "3dnowprefetch".

RDSEED, ADX was added on Linux v3.15-rc1.

SMAP was added on Linux v3.15-rc2.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Wang, Yong Y <yong.y.wang@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dugger, Donald D <donald.d.dugger@intel.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
b3fb3a200b target-i386: Fix indentation of CPU model definitions
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Marcelo Tosatti
303752a906 target-i386: Support "invariant tsc" flag
Expose "Invariant TSC" flag, if KVM is enabled. From Intel documentation:

17.13.1 Invariant TSC The time stamp counter in newer processors may
support an enhancement, referred to as invariant TSC. Processor’s
support for invariant TSC is indicated by CPUID.80000007H:EDX[8].
The invariant TSC will run at a constant rate in all ACPI P-, C-.
and T-states. This is the architectural behavior moving forward. On
processors with invariant TSC support, the OS may use the TSC for wall
clock timer services (instead of ACPI or HPET timers). TSC reads are
much more efficient and do not incur the overhead associated with a ring
transition or access to a platform resource.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[ehabkost: redo feature filtering to use .tcg_features]
[ehabkost: add CPUID_APM_INVTSC macro, add it to .unmigratable_flags]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Marcelo Tosatti
68bfd0ad4a target-i386: block migration and savevm if invariant tsc is exposed
Invariant TSC documentation mentions that "invariant TSC will run at a
constant rate in all ACPI P-, C-. and T-states".

This is not the case if migration to a host with different TSC frequency
is allowed, or if savevm is performed. So block migration/savevm.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[AF+mtosatti: Updated error message]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Marcelo Tosatti
7d854c471a savevm: check vmsd for migratability status
Check vmsd for unmigratable field, allowing migratibility status
to be modified after vmstate_register.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
120eee7d1f target-i386: Set migratable=yes by default on "host" CPU mooel
Having only migratable flags reported by default on the "host" CPU model
is safer for the following reasons:

 * Existing users may expect "-cpu host" to be migration-safe, if they
   take care of always using compatible host CPUs, host kernels, and
   QEMU versions.
 * Users who don't care aboug migration and want to enable all features
   supported by the host kernel can simply change their setup to use
   migratable=no.

Without this change, people using "-cpu host" will stop being able to
migrate, because now "invtsc" is getting enabled by default.

We are not setting migratable=yes by default on all X86CPU subclasses,
because users should be able to get non-migratable features enabled if
they ask for them explicitly.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
84f1b92f97 target-i386: Add "migratable" property to "host" CPU model
This flag will allow the user to choose between two modes:
 * All flags that can be enabled on the host, even if unmigratable
   (migratable=no);
 * All flags that can be enabled on the host, are known to QEMU
   and migratable (migratable=yes).

The default is still migratable=false, to keep current behavior, but
this will be changed to migratable=true by another patch.

My plan was to support the "migratable" flag on all CPU classes, but
have the default to "false" on all CPU models except "host". However,
DeviceClass has no mechanism to allow a child class to have a different
property default from the parent class yet, so by now only the "host"
CPU model will support the "migratable" flag.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
fefb41bf34 target-i386: Support check/enforce flags in TCG mode, too
If enforce/check is specified in TCG mode, QEMU will ensure all CPU
features are supported by TCG, so no CPU feature is silently disabled.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
[AF: Be explicit about TCG vs. !KVM]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
37ce3522cb target-i386: Loop-based feature word filtering in TCG mode
Instead of manually filtering each feature word, add a tcg_features
field to FeatureWordInfo, and use that field to filter all feature words
in TCG mode.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
e1c224b4eb target-i386: Loop-based copying and setting/unsetting of feature words
Now that we have the feature word arrays, we don't need to manually copy
each array item, we can simply iterate through each feature word.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:57 +02:00
Eduardo Habkost
621626ce7d target-i386: Define TCG_*_FEATURES earlier in cpu.c
Those macros will be used in the feature_word_info array data, so need
to be defined earlier.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:56 +02:00
Eduardo Habkost
84a6c6cd40 target-i386: Filter KVM and 0xC0000001 features on TCG
TCG doesn't support any of the feature flags on FEAT_KVM and
FEAT_C000_0001_EDX feature words, so clear all bits on those feature
words.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:56 +02:00
Eduardo Habkost
d0a70f46fa target-i386: Filter FEAT_7_0_EBX TCG features too
The TCG_7_0_EBX_FEATURES macro was defined but never used (it even had a
typo that was never noticed). Make the existing TCG feature filtering
code use it.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 23:54:29 +02:00
Eduardo Habkost
a42d9938a1 target-i386: Make TCG feature filtering more readable
Instead of an #ifdef in the middle of the code, just set
TCG_EXT2_FEATURES to a different value depending on TARGET_X86_64.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 18:04:15 +02:00
Eduardo Habkost
27418adf32 target-i386: Isolate KVM-specific code on CPU feature filtering logic
This will allow us to re-use the feature filtering logic (and the
check/enforce flag logic) for TCG.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 18:04:15 +02:00
Eduardo Habkost
8459e3961e target-i386: Pass FeatureWord argument to report_unavailable_features()
This will help us simplify the code that calls
report_unavailable_features() later.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 18:04:15 +02:00
Eduardo Habkost
51f63aed32 target-i386: Merge feature filtering/checking functions
Merge filter_features_for_kvm() and kvm_check_features_against_host().

Both functions made exactly the same calculations, the only difference
was that filter_features_for_kvm() changed the bits on cpu->features[],
and kvm_check_features_against_host() did error reporting.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 18:04:15 +02:00
Eduardo Habkost
857aee337c target-i386: Simplify reporting of unavailable features
Instead of checking and calling unavailable_host_feature() once for each
bit, simply call the function (now renamed to
report_unavailable_features()) once for each feature word.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
[AF: Drop unused return value]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 18:04:15 +02:00
Eduardo Habkost
136a7e9a85 target-i386: kvm: Don't enable MONITOR by default on any CPU model
KVM never supported the MONITOR flag so it doesn't make sense to have it
enabled by default when KVM is enabled.

The rationale here is similar to the cases where it makes sense to have
a feature enabled by default on all CPU models when on KVM mode (e.g.
x2apic). In this case we are having a feature disabled by default for
the same reasons.

In this case we don't need machine-type compat code because it is
currently impossible to run a KVM VM with the MONITOR flag set.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-25 18:04:15 +02:00
Peter Maydell
2b5b7ae917 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-06-24' into staging
trivial patches for 2014-06-24

# gpg: Signature made Tue 24 Jun 2014 17:07:31 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-06-24:
  Add support for the arm breakpoint syscall
  Increase maximum number of session of the internal TFTP server.
  target-s390x: Remove unused ld_code6() function
  hw/moxie/moxiesim.c: Remove unused moxie_intc_create()
  target-unicore: Remove unused functions
  build-sys: introduce install-prog macro to install&strip binaries and use it
  tcg: mark tcg_out* and tcg_patch* with attribute 'unused'
  rng-random: NULL check not needed before g_free()
  block.c: Remove useless 'buf' variable
  vscclient: Add required headers to fix build on FreeBSD
  target-ppc: Fix compiler warning
  configure: Enable TPM by default, add --disable-tpm
  Fix new typos (found by codespell)
  virtio-serial: remove useless set_config function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-24 17:14:57 +01:00
Hunter Laux
d535508793 Add support for the arm breakpoint syscall
OABI arm used a software interrupt(0xef9f0001) for breakpoints.
Since 2005 gdb has used the break instruction(0xe7f001f0) for EABI.
Apparently Steel Bank Common Lisp still uses the swi instruction.

This is the kernel implementation:
http://lxr.free-electrons.com/source/arch/arm/kernel/traps.c#L598

Signed-off-by: Hunter Laux <hunterlaux@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Bernhard Übelacker
5f22b054f2 Increase maximum number of session of the internal TFTP server.
Grub fails to boot from internal TFTP server when loading more than
3 initrd files.

Grub first opens a session to the TFTP server for every initrd file and
retrieves only the file size for all.
Then it wants to download the content using the old sessions which are
already expired.

Increasing the maximum number of session of the internal TFTP
server avoids this issue.

The error message reads as following:
error: timeout reading
`/boot/ISO.ROOT/BOOTMGR'.

Press any key to continue...

Signed-off-by: Bernhard Übelacker <bernhardu@vr-web.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Peter Maydell
55818ad812 target-s390x: Remove unused ld_code6() function
The ld_code6() function is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Peter Maydell
64e64ef609 hw/moxie/moxiesim.c: Remove unused moxie_intc_create()
The function moxie_intc_create() is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Peter Maydell
25b93db358 target-unicore: Remove unused functions
The functions gen_st64, gen_ld64, gen_mulxy, ucf64_itod and
ucf64_dtoi are all unused; remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Michael Tokarev
0d65942611 build-sys: introduce install-prog macro to install&strip binaries and use it
Use common rule (macro) to install and strip binaries, and use
it in all places where we install binaries, instead of fixing
bugs like 1319493 in every place.
(This fixes https://bugs.launchpad.net/bugs/1319493)

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Peter Maydell
4196dca63b tcg: mark tcg_out* and tcg_patch* with attribute 'unused'
The tcg_out* and tcg_patch* functions are utility routines that may or
may not be used by a particular backend; mark them with the 'unused'
attribute to suppress spurious warnings if they aren't used.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Eduardo Habkost
feced894fb rng-random: NULL check not needed before g_free()
g_free() is NULL-safe.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Chen Gang
5db97df274 block.c: Remove useless 'buf' variable
'buf' is not used actually, so remove it and related snprintf() statement.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Ed Maste
18b1afa874 vscclient: Add required headers to fix build on FreeBSD
Signed-off-by: Ed Maste <emaste@freebsd.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Stefan Weil
0211b5cf2d target-ppc: Fix compiler warning
gcc reports a warning which is usually wrong:

target-ppc/dfp_helper.c: In function ‘dfp_get_digit’:
target-ppc/dfp_helper.c:417:1: warning:
 control reaches end of non-void function [-Wreturn-type]

The compiler shows the warning if assert is not marked with the noreturn
attribute or if the code is compiled with -DNDEBUG.

Using g_assert_not_reached better documents the intention and does not
have these problems.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Cole Robinson
e91c793cb5 configure: Enable TPM by default, add --disable-tpm
I don't see why tpm is disabled by default: it doesn't have any
external dependencies, or change default behavior. Leaving it disabled
is just going to cause it to bit rot.

Enable it by default, and add a --disable-tpm option.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Stefan Weil
5d831be272 Fix new typos (found by codespell)
* accomodate -> accommodate
* aquiring -> acquiring
* beacuse -> because
* loosing -> losing
* prefering -> preferring
* threshhold -> threshold

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Paolo Bonzini
10358b6a1c virtio-serial: remove useless set_config function
Its only contents are a dead memcpy.  Since it is optional,
drop the function altogether.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Peter Maydell
513d80edc1 Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20140623' into staging
migration/next for 20140623

# gpg: Signature made Mon 23 Jun 2014 18:18:57 BST using RSA key ID 5872D723
# gpg: Can't check signature: public key not found

* remotes/juanquintela/tags/migration/20140623: (22 commits)
  vmstate: Refactor & increase tests for primitive types
  vmstate: Return error in case of error
  migration: Remove unneeded minimum_version_id_old
  tests: vmstate static checker: add size mismatch inside substructure
  tests: vmstate static checker: add substructure for usb-kbd for hid section
  tests: vmstate static checker: remove Subsections
  tests: vmstate static checker: remove a subsection
  tests: vmstate static checker: remove Description inside Fields
  tests: vmstate static checker: remove Description
  tests: vmstate static checker: remove Fields
  tests: vmstate static checker: change description name
  tests: vmstate static checker: remove last field in a struct
  tests: vmstate static checker: remove a field
  tests: vmstate static checker: remove a section
  tests: vmstate static checker: minimum_version_id check
  tests: vmstate static checker: version mismatch inside a Description
  tests: vmstate static checker: add version error in main section
  tests: vmstate static checker: incompat machine types
  tests: vmstate static checker: add dump1 and dump2 files
  vmstate-static-checker: script to validate vmstate changes
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-24 15:33:42 +01:00
Peter Maydell
089a39486f Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp: (43 commits)
  monitor: protect event emission
  monitor: protect outbuf and mux_out with mutex
  qemu-char: make writes thread-safe
  qemu-char: move pty_chr_update_read_handler around
  qemu-char: do not call chr_write directly
  qemu-char: introduce qemu_chr_alloc
  qapi event: clean up
  qapi event: convert QUORUM events
  qapi event: convert GUEST_PANICKED
  qapi event: convert BALLOON_CHANGE
  qmp: convert ACPI_DEVICE_OST event
  qapi event: convert SPICE events
  qapi event: convert VNC events
  qapi event: convert NIC_RX_FILTER_CHANGED
  qapi event: convert other BLOCK_JOB events
  qapi event: convert BLOCK_IMAGE_CORRUPTED
  qapi event: convert BLOCK_IO_ERROR and BLOCK_JOB_ERROR
  qapi event: convert DEVICE_TRAY_MOVED
  qapi event: convert DEVICE_DELETED
  qapi event: convert WATCHDOG
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-24 13:06:13 +01:00
Peter Maydell
27acb9dd24 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,vhost,net fixes, enhancements

Don's patches to limit below-4g ram for pc
Marcel's pcie hotplug rewrite
Gabriel's changes to e1000 auto-negotiation
qemu char bugfixes by Stefan
misc bugfixes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 23 Jun 2014 16:25:19 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (23 commits)
  xen-hvm: Handle machine opt max-ram-below-4g
  pc & q35: Add new machine opt max-ram-below-4g
  xen-hvm: Fix xen_hvm_init() to adjust pc memory layout
  pcie: coding style tweak
  hw/pcie: better hotplug/hotunplug support
  hw/pcie: implement power controller functionality
  hw/pcie: correct debug message
  q35: Use PC_Q35_COMPAT_1_4 on pc-q35-1.4 compat_props
  virtio-pci: Report an error when msix vectors init fails
  qemu-char: avoid leaking unused fds in tcp_get_msgfds()
  qemu-char: fix qemu_chr_fe_get_msgfd()
  qapi/string-output-visitor: fix human output
  e1000: factor out checking for auto-negotiation availability
  e1000: move e1000_autoneg_timer() to after set_ics()
  e1000: signal guest on successful link auto-negotiation
  e1000: improve auto-negotiation reporting via mii-tool
  e1000: emulate auto-negotiation during external link status change
  qtest: fix vhost-user-test unbalanced mutex locks
  qtest: fix qtest for vhost-user
  libqemustub: add more stubs for qemu-char
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-24 11:14:47 +01:00
Peter Maydell
7ba48975d3 Merge remote-tracking branch 'remotes/rth/tcg-ppc-merge-1' into staging
* remotes/rth/tcg-ppc-merge-1: (25 commits)
  tcg-ppc: Use the return address as a base pointer
  tcg-ppc: Merge cache-utils into the backend
  qemu/osdep: Remove the need for qemu_init_auxval
  tcg-ppc: Rename the tcg/ppc64 backend
  tcg-ppc: Remove the backend
  tcg-ppc64: Merge ppc32 shifts
  tcg-ppc64: Support mulsh_i32
  tcg-ppc64: Merge ppc32 register usage
  tcg-ppc64: Merge ppc32 qemu_ld/st
  tcg-ppc64: Merge ppc32 brcond2, setcond2, muluh
  tcg-ppc64: Begin merging ppc32 with ppc64
  tcg-ppc64: Fix sub2 implementation
  tcg-ppc64: Merge 32-bit ABIs into the prologue / frame code
  tcg-ppc64: Adjust tcg_out_call for ELFv2
  tcg-ppc64: Support the ppc64 elfv2 ABI
  tcg-ppc64: Use the correct test in tcg_out_call
  tcg-ppc64: Better parameterize the stack frame
  tcg-ppc64: Fix TCG_TARGET_CALL_STACK_OFFSET
  tcg-ppc64: Move call macros out of tcg-target.h
  tcg-ppc64: Make TCG_AREG0 and TCG_REG_CALL_STACK enum constants
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-23 18:26:58 +01:00
Juan Quintela
4ea7df4e5c vmstate: Refactor & increase tests for primitive types
This commit refactor the simple tests to test all integer types. We
move to hex because it is easier to read values of different types.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2014-06-23 19:14:52 +02:00
Juan Quintela
13cde50889 vmstate: Return error in case of error
If there is an error while loading a field, we should stop reading and
not continue with the rest of fields.  And we should also set an error
in qemu_file.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2014-06-23 19:14:52 +02:00
Juan Quintela
25feab2fc2 migration: Remove unneeded minimum_version_id_old
Once there, make checkpatch happy.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
38ef86b5a6 tests: vmstate static checker: add size mismatch inside substructure
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
af3713f6b9 tests: vmstate static checker: add substructure for usb-kbd for hid section
This shows how the script deals with substructures added to vmstate
descriptions that don't change the on-wire format.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
c7173a9c18 tests: vmstate static checker: remove Subsections
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
aa2a12bb82 tests: vmstate static checker: remove a subsection
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
b5968f0ab3 tests: vmstate static checker: remove Description inside Fields
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
ff29b8573f tests: vmstate static checker: remove Description
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
083bac3484 tests: vmstate static checker: remove Fields
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
1d681c712a tests: vmstate static checker: change description name
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
fd52ffb9bf tests: vmstate static checker: remove last field in a struct
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
55e8e0e19c tests: vmstate static checker: remove a field
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
ab99bdbe33 tests: vmstate static checker: remove a section
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
7daa3d76df tests: vmstate static checker: minimum_version_id check
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
4efa6e1d64 tests: vmstate static checker: version mismatch inside a Description
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:52 +02:00
Amit Shah
a81d3fad87 tests: vmstate static checker: add version error in main section
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:51 +02:00
Amit Shah
bc178dc563 tests: vmstate static checker: incompat machine types
This commit modifies the dump2 data to flag incompatibilities in the
machine types being compared.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:51 +02:00
Amit Shah
a10413e4fc tests: vmstate static checker: add dump1 and dump2 files
These are stripped-down JSON data as obtained from the -dump-vmstate
option.  The two files are identical in this commit, and will be
modified in the later commits to show what the script does with the
data.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:51 +02:00
Amit Shah
426d1d016a vmstate-static-checker: script to validate vmstate changes
This script compares the vmstate dumps in JSON format as output by QEMU
with the -dump-vmstate option.

It flags various errors, like version mismatch, sections going away,
size mismatches, etc.

This script is tolerant of a few changes that do not change the on-wire
format, like embedding a few fields within substructs.

The script takes -s/--src and -d/--dest parameters, to which filenames
are given as arguments.

Example:

(in a qemu 2.0 tree):
./x86_64-softmmu/qemu-system-x86_64 -dump-vmstate qemu-2.0.json

(in a qemu 2.2 tree:)
./x86_64-softmmu/qemu-system-x86_64 -dump-vmstate -M pc-i440fx-2.0 \
   qemu-2.2-m2.0.json

./scripts/vmstate-static-checker.py -s qemu-2.0.json -d qemu-2.2-m2.0.json

The script also takes a --reverse parameter to switch the src and dest
jsons.  This is just a shorthand for reversing the src and dest.

The --help parameter shows usage information.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:51 +02:00
Amit Shah
abfd9ce341 migration: dump vmstate info as a json file for static analysis
This commit adds a new command, '-dump-vmstate', that takes a filename
as an argument.  When executed, QEMU will dump the vmstate information
for the machine type it's invoked with to the file, and quit.

The JSON-format output can then be used to compare the vmstate info for
different QEMU versions, specifically to test whether live migration
would break due to changes in the vmstate data.

A Python script that compares the output of such JSON dumps is included
in the following commit.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:50 +02:00
Michael R. Hines
e325b49a32 rdma: bug fixes
1. Fix small memory leak in parsing inet address from command line in data_init()
2. Fix ibv_post_send() return value check and pass error code back up correctly.
3. Fix rdma_destroy_qp() segfault after failure to connect to destination.

Reported-by: frank.yangjie@gmail.com
Reported-by: dgilbert@redhat.com
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:09:50 +02:00
Peter Maydell
9f862687ea Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140623-2' into staging
A couple of s390-ccw bios bugfixes: Fix booting for some bootmaps and get
the devices to a sane state before running the guest.

# gpg: Signature made Mon 23 Jun 2014 13:22:32 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140623-2:
  pc-bios/s390-ccw: update s390-ccw.img binary
  pc-bios/s390-ccw: fix for fragmented SCSI bootmap
  pc-bios/s390-ccw: do a subsystem reset before running the guest
  pc-bios/s390-ccw: virtio_load_direct() can't load max number of sectors

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-23 17:47:28 +01:00
Paolo Bonzini
d622cb5879 monitor: protect event emission
Event emission must be protected by a mutex because of access to
the shared rate-limiting state, and to guard against concurrent
monitor "hot-plug" by means of human-monitor-command.

Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Paolo Bonzini
6cff3e8594 monitor: protect outbuf and mux_out with mutex
This lets the block layer emit QMP events from outside the I/O thread.

Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Paolo Bonzini
9005b2a758 qemu-char: make writes thread-safe
This will let threads other than the I/O thread raise QMP events.

GIOChannel is thread-safe, and send and receive state is usually
well-separated.  The only driver that requires some care is the
pty driver, where some of the state is shared by the read and write
sides.  That state is protected with the chr_write_lock too.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Paolo Bonzini
1bb7fe725c qemu-char: move pty_chr_update_read_handler around
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Paolo Bonzini
6975b713e6 qemu-char: do not call chr_write directly
Make the mux always go through qemu_chr_fe_write, so that we'll get
the mutex for the underlying chardev.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Paolo Bonzini
db39fcf1f6 qemu-char: introduce qemu_chr_alloc
The next patch will modify this function to initialize state that is
common to all backends.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
751751732c qapi event: clean up
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
fe069d9d59 qapi event: convert QUORUM events
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
3a44969037 qapi event: convert GUEST_PANICKED
'monitor.h' is still included in target-s390x/kvm.c, since I have
no good way to verify whether other code need it on my x86 host.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
aef9d3115f qapi event: convert BALLOON_CHANGE
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Igor Mammedov
5f41fbba90 qmp: convert ACPI_DEVICE_OST event
... using new QAPI event infrastructure

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
7cfadb6b52 qapi event: convert SPICE events
SPICE_INITIALIZED, SPICE_CONNECTED, SPICE_DISCONNECTED and
SPICE_MIGRATE_COMPLETED are converted in one patch, since they
use some common functions. inet_strfamily() is removed since no
callers exist anymore.

Note that there is no existing doc for SPICE_MIGRATE_COMPLETED
in docs/qmp/qmp-events.txt before this patch.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
fb6ba0d525 qapi event: convert VNC events
Since VNC_CONNECTED, VNC_DISCONNECTED, VNC_INITIALIZED share some
common functions, convert them in one patch.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
0615027903 qapi event: convert NIC_RX_FILTER_CHANGED
Param name is declared as optional, since in code it is an optional
one.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
bcada37b19 qapi event: convert other BLOCK_JOB events
Since BLOCK_JOB_COMPLETED, BLOCK_JOB_CANCELLED, BLOCK_JOB_READY are
related, convert them in one patch. The block_job_event_* functions
are used to keep encapsulation of BlockJob structure.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia
c120f0fa14 qapi event: convert BLOCK_IMAGE_CORRUPTED
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
5a2d2cbd88 qapi event: convert BLOCK_IO_ERROR and BLOCK_JOB_ERROR
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
a5ee7bd454 qapi event: convert DEVICE_TRAY_MOVED
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
24b699fb2b qapi event: convert DEVICE_DELETED
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
99eaf09c73 qapi event: convert WATCHDOG
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
e010ad8f1e qapi event: convert RTC_CHANGE
This patch also eliminates build time warning caused by no caller
of monitor_qapi_event_throttle().

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
7a906f7ffb qapi event: convert WAKEUP
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
2ea4100f44 qapi event: convert SUSPEND_DISK
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
1d11a95a3e qapi event: convert SUSPEND
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia
591c48fbc5 qapi event: convert RESUME
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:24 -04:00
Don Slutz
c4f5cdc53f xen-hvm: Handle machine opt max-ram-below-4g
This is the xen part of "pc & q35: Add new machine opt max-ram-below-4g"

Note: this machine option cannot be used to increase the amount
of ram below 4G.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 18:02:55 +03:00
Don Slutz
c87b152072 pc & q35: Add new machine opt max-ram-below-4g
This is a pc & q35 only machine opt.

If you add enough PCI devices then all mmio for them will not fit
below 4G which may not be the layout the user wanted. This allows
you to increase the below 4G address space that PCI devices can use
(aka decrease ram below 4G) and therefore in more cases not have any
mmio that is above 4G.

For example using "-machine pc,max-ram-below-4g=2G" on the command
line will limit the amount of ram that is below 4G to 2G.

Note: this machine option cannot be used to increase the amount
of ram below 4G.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: fix 32 bit
2014-06-23 18:02:41 +03:00
Wenchao Xia
a4e15de9a2 qapi event: convert STOP
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
a6330785f0 qapi event: convert RESET
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
0aab9ec33e qapi event: convert POWERDOWN
There is no existing comments for POWERDOWN in doc/qmp/qmp-events.txt,
so no change on it like other conversion patch.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
8432183137 qapi event: convert SHUTDOWN
This patch also eliminates build time warning caused by
QAPI_EVENT_MAX = 0.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
f668470f40 qapi: add new schema file qapi-event.json
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
43a14cfc0f monitor: add an implemention of qapi event emit method
The monitor is now hooked on the new event mechanism, so that later
patches can convert event callers one by one. Most code are copied from
old monitor_protocol_* functions with some modification.

Note that two build time warnings will be raised after this patch. One is
caused by no caller of monitor_qapi_event_throttle(), the other one is
caused by QAPI_EVENT_MAX = 0. They will be fixed automatically after
full event conversion later.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
a589569f2f qapi: adjust existing defines
In order to let event defines use existing types later, instead of
redefine new ones, some old type defines for spice and vnc are changed,
and BlockErrorAction is moved from block.h to qapi schema. Note that
BlockErrorAction is not merged with BlockdevOnError.

At this point, VncInfo is not made a child of VncBasicInfo, because
VncBasicInfo has mandatory fields where VncInfo makes them optional.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
f6dadb0242 test: add test cases for qapi event
These cases will verify whether the expected qdict is built.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
21cd70dfc1 qapi script: add event support
qapi-event.py will parse the schema and generate qapi-event.c, then
the API in qapi-event.c can be used to handle events in qemu code.
All API have prefix "qapi_event".

The script mainly includes two parts: generate API for each event
define, generate an enum type for all defined events.

Since in some cases the real emit behavior may change, for example,
qemu-img would not send a event, a callback layer is used to
control the behavior. As a result, the stubs at compile time
can be saved, the binding of block layer code and monitor code
will become looser.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
f882126024 qapi: add event helper functions
This file holds some functions that do not need to be generated.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Max Reitz
be13d46d6d qapi: Add includes from qapi/ as dependencies
qapi-schema.json has been split into three smaller JSON files in qapi/.
Add them as dependencies for the code generation in the Makefile, so
changes to them will result in a rebuilt of all QAPI-dependent code.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Wenchao Xia
506f40ff7d os-posix: include sys/time.h
Since gettimeofday() is used in this header file as a macro define,
include the function's define header file, to avoid compile warning
when other file include os-posix.h.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:24 -04:00
Paolo Bonzini
d593233438 json-lexer: fix escaped backslash in single-quoted string
This made the lexer wait for a closing *double* quote.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:24 -04:00
Amos Kong
05dfb26cd2 qapi: Suppress unwanted space between type and identifier
We always generate a space between type and identifier in parameter
and variable declarations, even when idiomatic C style doesn't have
a space there.  Suppress it.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:24 -04:00
Amos Kong
0d14eeb233 qapi: add const prefix to 'char *' insider c_type()
It's ugly to add const prefix for parameter type by an if statement
outside c_type(). This patch adds a parameter to do it.

Signed-off-by: Amos Kong <akong@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:24 -04:00
Amos Kong
638ca8ad98 qapi: fix coding style in parameters list
A space after * when declaring a pointer type is redundant.

Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:24 -04:00
Luiz Capitulino
37f6be977a audio: fmopl: drop INLINE macro
This commit expands all uses of the INLINE macro and drop it.

The reason for this is to avoid clashes with external libraries with
bad name conventions and also because renaming keywords is not a good
practice.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:24 -04:00
Luiz Capitulino
a49db98d1f fpu: softfloat: drop INLINE macro
This commit expands all uses of the INLINE macro and drop it.

The reason for this is to avoid clashes with external libraries with
bad name conventions and also because renaming keywords is not a good
practice.

PS: I'm fine with this change to be licensed under softfloat-2a or
softfloat-2b.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:00:12 -04:00
Don Slutz
3c2a96699e xen-hvm: Fix xen_hvm_init() to adjust pc memory layout
This is just below_4g_mem_size and above_4g_mem_size which is used later in QEMU.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:50:04 +03:00
Michael S. Tsirkin
20de98aff5 pcie: coding style tweak
- whitespace fix
- unnecessary != 0 in a condition

Cc: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:49:49 +03:00
Marcel Apfelbaum
554f802da3 hw/pcie: better hotplug/hotunplug support
The current code is broken: it does surprise removal which crashes guests.

Reimplemented the steps:
 - Hotplug triggers both 'present detect change' and
   'attention button pressed'.

 - Hotunplug starts by triggering 'attention button pressed',
   then waits for the OS to power off the device and only
   then detaches it.

Fixes CVE-2014-3471.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:48:42 +03:00
Marcel Apfelbaum
f23b6bdc3c hw/pcie: implement power controller functionality
It is needed by hot-unplug in order to get an indication
from the OS when the device can be physically detached.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:48:42 +03:00
Marcel Apfelbaum
e4bcd27c86 hw/pcie: correct debug message
Trivial issue, discovered while debugging.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:48:42 +03:00
Eduardo Habkost
48cb7f3c15 q35: Use PC_Q35_COMPAT_1_4 on pc-q35-1.4 compat_props
pc-q35-1.4 was incorrectly using PC_COMPAT_1_4 instead of
PC_Q35_COMPAT_1_4.

The only side-effect was that the hpet compat property (inherited from
PC_Q35_COMPAT_1_7) was missing.

Without this patch, pc-q35-1.4 inicorrectly initializes hpet-intcap to
0xff0104 (behavior introduced in QEMU 2.0, by commit
7a10ef51c2).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-06-23 17:48:42 +03:00
Fam Zheng
c7ff54825b virtio-pci: Report an error when msix vectors init fails
Currently vectors silently cleared to 0 if the initialization is failed,
but user should at least have one way to notice this.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Stefan Hajnoczi
d2fc39b420 qemu-char: avoid leaking unused fds in tcp_get_msgfds()
Commit c76bf6bb8f ("Add chardev API
qemu_chr_fe_get_msgfds") extended the get_msgfds API from one to
multiple file descriptors.  It forgot to close unused file descriptors
before freeing the file descriptor array.

This patch prevents a file descriptor leak if the tcp_get_msgfds()
callers requests fewer file descriptors than are available.

Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Stefan Hajnoczi
4f85861441 qemu-char: fix qemu_chr_fe_get_msgfd()
Commit c76bf6bb8f ("Add chardev API
qemu_chr_fe_get_msgfds") broke qemu_chr_fe_get_msgfd() because it
changed the return value.

Callers expect -1 if no fd is available.  The commit changed the return
value to 0 (which is a valid file descriptor number) so callers always
detected a file descriptor even if none was available.

This patch fixes qemu-iotests 045:

  $ cd tests/qemu-iotests && ./check 045
  [...]
  +FAIL: test_add_fd_invalid_fd (__main__.TestFdSets)
  +----------------------------------------------------------------------
  +Traceback (most recent call last):
  +  File "./045", line 123, in test_add_fd_invalid_fd
  +    self.assert_qmp(result, 'error/class', 'GenericError')
  +  File "/home/stefanha/qemu/tests/qemu-iotests/iotests.py", line 232, in assert_qmp
  +    result = self.dictpath(d, path)
  +  File "/home/stefanha/qemu/tests/qemu-iotests/iotests.py", line 211, in dictpath
  +    self.fail('failed path traversal for "%s" in "%s"' % (path, str(d)))
  +AssertionError: failed path traversal for "error/class" in "{u'return': {u'fdset-id': 2, u'fd': 0}}"

Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Hu Tao
684531ad1f qapi/string-output-visitor: fix human output
"0x1-0x10" looks better than "0x1-10"

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-23 17:38:00 +03:00
Gabriel L. Somlo
d7a4155265 e1000: factor out checking for auto-negotiation availability
Also fix minor indentation issues in the surrounding code.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Gabriel L. Somlo
d52aec9545 e1000: move e1000_autoneg_timer() to after set_ics()
Enable calling set_ics() from within e1000_autoneg_timer() without
the need for a forward declaration.

This patch contains no functional changes.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Gabriel L. Somlo
39bb8ee737 e1000: signal guest on successful link auto-negotiation
Generate a link status change interrupt once link auto-netotiation
is successfully completed. This does not affect Linux and Windows
(XP and 7 tested) in any way, but is needed by the stock OS X driver
(AppleIntel8254XEthernet.kext), which would otherwise fail to notice
the link status change event.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Gabriel L. Somlo
6883b59140 e1000: improve auto-negotiation reporting via mii-tool
Using mii-tool (on F20-live), the following output is produced:

  SIOCGMIIREG on ens3 failed: Input/output error
  ens3: no autonegotiation, 1000baseT-FD flow-control, link ok

The first line (SIOCGMIIREG error) is due to mii-tool's inability
to read the PHY auto-negotiation expansion register.
On the second line, "no autonegotiation" is wrong, and caused by
the absence of a flag in the link partner ability register which
would indicate that our link partner has acked us. This flag is
listed as "reserved" in the Intel e1000 manual, but mii-tool uses
it as LPA_LPACK from /usr/include/linux/mii.h.

This patch adds read access to PHY_AUTONEG_EXP and defines the
link partner ack flag, allowing mii-tool to generate output as
normally expected:

  ens3: negotiated 1000baseT-FD flow-control, link ok

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Gabriel L. Somlo
6a2acedb19 e1000: emulate auto-negotiation during external link status change
This patch emulates auto-negotiation when the network link status
is modified externally (i.e. via "set_link <id> off/on").

Also, a couple of cleanup items:
  - unset PHY status reg. AUTONEG_COMPLETE during link_down()
  - set PHY status reg. AUTONEG_COMPLETE during autoneg_timer() only
    if we actually brought the link up.
  - group all checks for "can we, and should we autonegotiate?"
    together for more clarity.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:38:00 +03:00
Nikolay Nikolaev
f61badf32f qtest: fix vhost-user-test unbalanced mutex locks
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Nikolay Nikolaev
bd95939fc8 qtest: fix qtest for vhost-user
Fix compile for older glib, provide conditionally compiled versions of the
used glib APIs.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Nikolay Nikolaev
1dc75c6d74 libqemustub: add more stubs for qemu-char
Additional stubs:
 - chr_baum_init
 - qemu_chr_open_spice_vmc
 - qemu_chr_open_spice_port

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Igor Mammedov
8f4e5ac3e2 qapi/hmp: use 'backend' instead of 'device' with memory backend
fixup documentation comments and HMP message/help text

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Michael S. Tsirkin
8617343faa vhost: fix resource leak in error handling
vhost_verify_ring_mappings leaks mappings on error.
Fix this up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Michael S. Tsirkin
7145872ed3 vhost: block migration if backend does not log memory
vhost user does not support LOG_ALL feature bit.
Generally, we should not try to set this bit without
checking that backend can support it first.

Detect and block migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Richard Henderson
a84ac4cbbb tcg-ppc: Use the return address as a base pointer
This can significantly reduce code size for generation of (some)
64-bit constants.  With the side effect that we know for a fact
that exit_tb can use the register to good effect.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:33 -07:00
Richard Henderson
224f9fd419 tcg-ppc: Merge cache-utils into the backend
As a "utility", it only supported ppc, and in a way that other
tcg backends provided directly in tcg-target.h.  Removing this
disparity is easier now that the two ppc backends are merged.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:30 -07:00
Richard Henderson
2b45c3f500 qemu/osdep: Remove the need for qemu_init_auxval
Instead of getting backup auxv data from the env pointer given to main,
read it from /proc/self/auxv.  We can do this at any time, so we're not
tied to any ordering wrt a call to qemu_init_auxval from main.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:27 -07:00
Richard Henderson
40d964b563 tcg-ppc: Rename the tcg/ppc64 backend
The other tcg backends that support 32- and 64-bit modes
use the 32-bit name for the port.  Follow suit.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:23 -07:00
Richard Henderson
b38daef9d4 tcg-ppc: Remove the backend
Vectoring the 32-bit build to the ppc64 directory.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:18 -07:00
Richard Henderson
a757e1eef0 tcg-ppc64: Merge ppc32 shifts
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:15 -07:00
Richard Henderson
8fa391a011 tcg-ppc64: Support mulsh_i32
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:12 -07:00
Richard Henderson
dfca177874 tcg-ppc64: Merge ppc32 register usage
Good enough to run some instructions before things go awry.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:09 -07:00
Richard Henderson
7f25c469c7 tcg-ppc64: Merge ppc32 qemu_ld/st
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:06 -07:00
Richard Henderson
abcf61c48e tcg-ppc64: Merge ppc32 brcond2, setcond2, muluh
Now passes tcg_add_target_add_op_defs assertions, but
not complete enough to function.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:32:03 -07:00
Richard Henderson
796f1a689d tcg-ppc64: Begin merging ppc32 with ppc64
Just enough to compile, assuming you edit config-host.mak manually.
It will still abort at runtime, due to missing brcond2, setcond2, mulu2.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:58 -07:00
Richard Henderson
b31284cecf tcg-ppc64: Fix sub2 implementation
All sorts of confusion on argument ordering.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:56 -07:00
Richard Henderson
ffcfbecec3 tcg-ppc64: Merge 32-bit ABIs into the prologue / frame code
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:52 -07:00
Ulrich Weigand
77e58d0d60 tcg-ppc64: Adjust tcg_out_call for ELFv2
The new ELFv2 ABI, used by default on powerpc64le-linux hosts,
introduced some changes that are incompatible with code currently
generated by the ppc64 TGC target.  In particular, we no longer
use function descriptors.

This patch adds support for the ELFv2 ABI in the ppc64 TGC
function call and function prologue sequences.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:46 -07:00
Richard Henderson
a2a98f807b tcg-ppc64: Support the ppc64 elfv2 ABI
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:43 -07:00
Richard Henderson
eaf7d1cfe0 tcg-ppc64: Use the correct test in tcg_out_call
The correct test uses the _CALL_AIX macro, not a host-specific macro.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:38 -07:00
Richard Henderson
802ca56e1d tcg-ppc64: Better parameterize the stack frame
In preparation for supporting other ABIs.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:34 -07:00
Richard Henderson
5456788db7 tcg-ppc64: Fix TCG_TARGET_CALL_STACK_OFFSET
The calling convention reserves space for the 8 register parameters on
the stack, so using only 6*8=48 as the offset was wrong.  We never saw
this bug because we don't have any helpers with more than 5 parameters.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:29 -07:00
Richard Henderson
a921fddcc1 tcg-ppc64: Move call macros out of tcg-target.h
These values are private to tcg.c; we don't need to expose
this nonsense to the translators.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:26 -07:00
Richard Henderson
3bf4a1ed61 tcg-ppc64: Make TCG_AREG0 and TCG_REG_CALL_STACK enum constants
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:22 -07:00
Richard Henderson
4c3831a088 tcg-ppc64: Use tcg_out_{ld,st,cmp} internally
Rather than using tcg_out32 and opcodes directly.  This allows us
to remove LD_ADDR and CMP_L macros.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:17 -07:00
Richard Henderson
de7761a39d tcg-ppc64: Relax register restrictions in tcg_out_mem_long
In order to be able to use tcg_out_ld/st sensibly with scratch
registers, assert only when we'd incorrectly clobber a scratch.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:14 -07:00
Richard Henderson
d604f1a90d tcg-ppc64: Move functions around
Code movement only.  This will allow us to make use of the
other tcg_out_* functions in tidying their implementations.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:10 -07:00
Richard Henderson
de3d636d83 tcg-ppc64: Avoid some hard-codings of TCG_TYPE_I64
Using more appropriate _PTR or _REG where possible.

Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:31:07 -07:00
Richard Henderson
9171478c95 tcg-ppc: Use uintptr_t in ppc_tb_set_jmp_target
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-23 07:29:30 -07:00
Jens Freimann
4ff51e6637 pc-bios/s390-ccw: update s390-ccw.img binary
Update s390-ccw.img to match with latest fixes

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-23 14:19:45 +02:00
Eugene (jno) Dvurechenski
c77cd87cf5 pc-bios/s390-ccw: fix for fragmented SCSI bootmap
We need to interpret the last entry of the bootmap with zero
block count as "continuation pointer".
The "last entry" is being detected by pre-filling of the scratch
space with known values and respective look-ahead.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-23 14:03:31 +02:00
Christian Borntraeger
9629823290 pc-bios/s390-ccw: do a subsystem reset before running the guest
The loader BIOS has already activated several devices. Let's do a
subsystem reset before jumping into the guest. As there is no direct
way of doing so, we use diagnose 308 to bring the system in a
defined state. This is similar to what kdump on s390 uses. We have
to define a small trampoline function that restores the low bytes
to whatever the bootmap has written there.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-23 14:03:31 +02:00
David Hildenbrand
554f80896d pc-bios/s390-ccw: virtio_load_direct() can't load max number of sectors
The number of sectors to read is given by the last 16 bit of rec_list2.
1 is added in order to get to the real number of sectors to read (0x0000
-> read 1 block). For now, the maximum number (0xffff) led to 0 sectors
being read.

This fixes a bug where a large initrd (62MB) could not be ipled anymore.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-23 14:03:31 +02:00
Peter Maydell
d9c1647d89 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Mon 23 Jun 2014 09:53:49 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  block: asynchronously stop the VM on I/O errors
  vl: allow other threads to do qemu_system_vmstop_request
  sheepdog: fix NULL dereference in sd_create
  QemuOpts: check NULL opts in qemu_opt_get functions
  block: m25p80: Support read only bdrvs.
  block: m25p80: sync_page(): Deindent function body.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-23 12:55:22 +01:00
Peter Maydell
910f66fcda Merge remote-tracking branch 'remotes/mcayland/qemu-sparc' into staging
* remotes/mcayland/qemu-sparc:
  apb: Fix out-of-bounds array write access

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-23 12:40:39 +01:00
Peter Maydell
337b172bb9 Merge remote-tracking branch 'remotes/mcayland/qemu-openbios' into staging
* remotes/mcayland/qemu-openbios:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-23 11:35:27 +01:00
Michael S. Tsirkin
3840f84290 console: move chardev declarations to sysemu/char.h
move generic chardev APIs to sysemu/char.h, to make them available to
callers which can not depend on the whole of ui/console.h.
This fixes a build error on systems without pixman-devel:

./configure --disable-tools --disable-docs --target-list=arm-linux-user
...
pixman            none
...
make
...
In file included from
/data/home/nchip/linaro/qemu/include/ui/console.h:4:0,
                 from /data/home/nchip/linaro/qemu/stubs/vc-init.c:2:
/data/home/nchip/linaro/qemu/include/ui/qemu-pixman.h:14:20: fatal
error: pixman.h: No such file or directory
 #include <pixman.h>
                    ^
compilation terminated.

Reported-by: Riku Voipio <riku.voipio@iki.fi>
Tested-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1403508500-32691-1-git-send-email-mst@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-23 10:56:20 +01:00
Paolo Bonzini
2bd3bce8ef block: asynchronously stop the VM on I/O errors
With virtio-blk dataplane, I/O errors might occur while QEMU is
not in the main I/O thread.  However, it's invalid to call vm_stop
when we're neither in a VCPU thread nor in the main I/O thread,
even if we were to take the iothread mutex around it.

To avoid this problem, we can raise a request to the main I/O thread,
similar to what QEMU does when vm_stop is called from a CPU thread.
We know that bdrv_error_action is called from an AIO callback, and
the moment at which the callback will fire is not well-defined; it
depends on the moment at which the disk or OS finishes the operation,
which can happen at any time.  Note that QEMU is certainly not in a CPU
thread and we do not need to call cpu_stop_current() like vm_stop() does.

However, we need to ensure that any action taken by management will
result in correct detection of the error _and_ a running VM.  In particular:

- the event must be raised after the iostatus has been set, so that
"info block" will return an iostatus that matches the event.

- the VM must be stopped after the iostatus has been set, so that
"info block" will return an iostatus that matches the runstate.

The ordering between the STOP and BLOCK_IO_ERROR events is preserved;
BLOCK_IO_ERROR is documented to come first.

This makes bdrv_error_action() thread safe (assuming QMP events are,
which is attacked by a separate series).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-23 16:36:13 +08:00
Paolo Bonzini
74892d2468 vl: allow other threads to do qemu_system_vmstop_request
There patch protects vmstop_requested with a lock and introduces
qemu_system_vmstop_request_prepare.

Together with the new call to qemu_vmstop_requested in vm_start,
qemu_system_vmstop_request_prepare avoids a race where the VM could remain
stopped even though the iostatus of a block device has already been set
(for example).

qemu_system_vmstop_request_prepare however also lets the caller thread
delay observation of the state change until it has itself communicated
that change to the user.  This delay avoids any possibility of a wrong
reordering of the BLOCK_IO_ERROR event and the subsequent STOP event.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-23 16:36:13 +08:00
Liu Yuan
5d5da114b3 sheepdog: fix NULL dereference in sd_create
Following command

qemu-img create -f qcow2 sheepdog:test 20g

will cause core dump because aio_context is NULL in sd_create. We should
initialize it by qemu_get_aio_context() to avoid NULL dereference.

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-23 16:36:13 +08:00
Chunyan Liu
435db4cf29 QemuOpts: check NULL opts in qemu_opt_get functions
Some places will call bdrv_create_file(filename, NULL, &local_err), where
opts is NULL. Check NULL in qemu_opt_get and qemu_opt_get_*_del functions,
to avoid extra effort of checking opts before calling them every time.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-21 16:40:14 +08:00
Peter Crosthwaite
999e5aa5ce block: m25p80: Support read only bdrvs.
By just never doing write-backs. This is completely invisible to the
guest, as the entire storage area is implemented as device state (at
realize time the entire drive is read in).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-21 16:40:14 +08:00
Peter Crosthwaite
fc1084aad7 block: m25p80: sync_page(): Deindent function body.
sync_page() was conditionalizing it's whole fn body on the bdrv being
non-null. Just return for the function immediately on NULL brdv and
get rid of the big if.

Makes implementation consistent with flash_zynq_area().

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-21 16:40:14 +08:00
Mark Cave-Ayland
871c60a736 Update OpenBIOS images
Update OpenBIOS images to SVN r1306 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-20 23:59:19 +01:00
Stefan Weil
68716da745 apb: Fix out-of-bounds array write access
The array regs is declared with IOMMU_NREGS (3) elements and accessed
using IOMMU_CTRL (0) and IOMMU_BASE (8). In most cases, those values
are right shifted before being used as an index which results in indices
0 and 1. In one case, this right shift was missing for IOMMU_BASE which
results in an out-of-bounds write access with index 8.

The patch adds the missing shift operation also for IOMMU_CTRL where
it is needed only for cosmetic reasons.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-20 23:52:49 +01:00
Sanjay Lal
427e1750a0 gt64xxx_pci: Add VMStateDescription
Add VMStateDescription for GT64120 PCI emulation used by the Malta
platform, to allow it to work with savevm/loadvm and live migration.

The entire register array is saved/restored using VMSTATE_UINT32_ARRAY
(fixed length GT_REGS = 1024).

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
[james.hogan@imgtec.com: Convert to VMState]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-06-20 23:40:16 +02:00
Aurelien Jarno
5ab5c04170 target-mips: copy CP0_Config1 into DisasContext
In order to avoid access to the CPUMIPSState structure in the
translator, keep a copy of CP0_Config1 into DisasContext. The whole
register is read-only so it can be copied as a single value.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-06-20 22:14:13 +02:00
Peter Maydell
d70a319b8d Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  hw/mips: malta: Don't boot from flash with KVM T&E
  MAINTAINERS: Add entry for MIPS KVM
  target-mips: Enable KVM support in build system
  hw/mips: malta: Add KVM support
  hw/mips: In KVM mode, inject IRQ2 (I/O) interrupts via ioctls
  target-mips: Call kvm_mips_reset_vcpu() from mips_cpu_reset()
  target-mips: kvm: Add main KVM support for MIPS
  kvm: Allow arch to set sigmask length
  target-mips: get_physical_address: Add KVM awareness
  target-mips: get_physical_address: Add defines for segment bases
  hw/mips: Add API to convert KVM guest KSEG0 <-> GPA
  hw/mips/cputimer: Don't start periodic timer in KVM mode
  target-mips: Reset CPU timer consistently
  KVM: Fix GSI number space limit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 19:25:18 +01:00
Peter Maydell
0a99aae5fa Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,virtio,hotplug fixes, enhancements

numa work by Hu Tao and others
memory hotplug by Igor
vhost-user by Nikolay, Antonios and others
guest virtio announcements by Jason
qtest fixes by Sergey
qdev hotplug fixes by Paolo
misc other fixes mostly by myself

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* remotes/mst/tags/for_upstream: (109 commits)
  numa: use RAM_ADDR_FMT with ram_addr_t
  qapi/string-output-visitor: fix bugs
  tests: simplify code
  qapi: fix input visitor bugs
  acpi: rephrase comment
  qmp: add ACPI_DEVICE_OST event handling
  qmp: add query-acpi-ospm-status command
  acpi: implement ospm_status() method for PIIX4/ICH9_LPC devices
  acpi: introduce TYPE_ACPI_DEVICE_IF interface
  qmp: add query-memory-devices command
  numa: handle mmaped memory allocation failure correctly
  pc: acpi: do not hardcode preprocessor
  qmp: clean out whitespace
  qdev: recursively unrealize devices when unrealizing bus
  qdev: reorganize error reporting in bus_set_realized
  qapi: fix build on glib < 2.28
  qapi: make string output visitor parse int list
  qapi: make string input visitor parse int list
  tests: fix memory leak in test of string input visitor
  hmp: add info memdev
  ...

Conflicts:
	include/hw/i386/pc.h
[PMM: fixed minor conflict in pc.h]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 18:01:24 +01:00
Peter Maydell
53001c1483 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140619' into staging
target-arm:
 * Support PSCI 0.2 when using KVM
 * fix AIRCR reset value for v7M CPUs
 * report correct size information for pflash_cfi01
 * minor coverity fixes
 * avoid warnings on Windows builds due to #define clash
 * implement TTBCR PD0/PD1 bits

# gpg: Signature made Thu 19 Jun 2014 18:35:06 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140619:
  armv7m_nvic: fix AIRCR implementation
  Use PSCI v0.2 compatible string when KVM or TCG provides it
  target-arm: Introduce per-CPU field for PSCI version
  target-arm: Implement kvm_arch_reset_vcpu() for KVM ARM64
  target-arm: Enable KVM_ARM_VCPU_PSCI_0_2 feature when possible
  target-arm: Common kvm_arm_vcpu_init() for KVM ARM and KVM ARM64
  kvm: Handle exit reason KVM_EXIT_SYSTEM_EVENT
  hw/block/pflash_cfi01: Report correct size info for parallel configs
  hw/arm/vexpress: Forbid specifying flash contents in two ways at once
  target-arm/translate-a64.c: Fix dead ?: in handle_simd_shift_fpint_conv()
  target-arm/translate-a64.c: Remove dead ?: in disas_simd_3same_int()
  target-arm: Add ULL suffix to calculation of page size
  hw/arm/spitz: Avoid clash with Windows header symbol MOD_SHIFT
  target-arm: implement PD0/PD1 bits for TTBCR

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 17:41:09 +01:00
Peter Maydell
9d3c512021 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20140619-1' into staging
vnc: cleanups and fixes

# gpg: Signature made Thu 19 Jun 2014 12:02:09 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vnc-20140619-1:
  vnc: fix screen updates
  vnc: Drop superfluous conditionals around g_strdup()
  vnc: Drop superfluous conditionals around g_free()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 16:57:28 +01:00
Gerd Hoffmann
e8e23b7dcf spice: fix 32bit build
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1403244764-8622-1-git-send-email-kraxel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 16:22:07 +01:00
Peter Maydell
a096922bef Merge remote-tracking branch 'remotes/rth/tcg-next' into staging
* remotes/rth/tcg-next:
  tcg/optimize: Don't special case TCG_OPF_CALL_CLOBBER

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 15:44:31 +01:00
James Hogan
3c5d0be553 hw/mips: malta: Don't boot from flash with KVM T&E
In KVM trap & emulate (T&E) mode the flash reset region at 0xbfc00000
isn't executable, which is why the minimal kernel bootloader is loaded
and executed from the last 1MB of DRAM instead.

Therefore if no kernel is provided on the command line and KVM is
enabled, exit with an error since booting from flash will fail.

Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-20 13:51:29 +02:00
Oran Avraham
b6fb3a89e3 armv7m_nvic: fix AIRCR implementation
The returned reset value was wrong (off by one zero nibble), and
qemu didn't log unimplemented writes to the PRIGROUP field.

Signed-off-by: Oran Avraham <oranav@gmail.com>
Message-id: 1403010447-4627-1-git-send-email-oranav@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:33:05 +01:00
Pranavkumar Sawargaonkar
06955739a2 Use PSCI v0.2 compatible string when KVM or TCG provides it
If we have PSCI v0.2 emulation available for KVM ARM/ARM64 or TCG then
we need to provide PSCI v0.2 compatible string via generated DTB.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Rob Herring <rob.herring@linaro.org>
Message-id: 1402901605-24551-9-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:33:05 +01:00
Pranavkumar Sawargaonkar
dd032e3487 target-arm: Introduce per-CPU field for PSCI version
We require to know the PSCI version available to given CPU at
potentially many places. Currently, we need to know PSCI version
when generating DTB for virt machine.

This patch introduce per-CPU 32bit field representing the PSCI
version available to the CPU. The encoding of this 32bit field
is same as described in PSCI v0.2 spec.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402901605-24551-8-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:33:04 +01:00
Pranavkumar Sawargaonkar
73542cf690 target-arm: Implement kvm_arch_reset_vcpu() for KVM ARM64
To implement kvm_arch_reset_vcpu(), we simply re-init the VCPU
using kvm_arm_vcpu_init() so that all registers of VCPU are set
to their reset values by in-kernel KVM code.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402901605-24551-7-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:33:03 +01:00
Pranavkumar Sawargaonkar
7cd62e5384 target-arm: Enable KVM_ARM_VCPU_PSCI_0_2 feature when possible
Latest linux kernel supports in-kernel emulation of PSCI v0.2 but
to enable it we need to select KVM_ARM_VCPU_PSCI_0_2 feature using
KVM_ARM_VCPU_INIT ioctl.

Also, we can use KVM_ARM_VCPU_PSCI_0_2 feature for VCPU only when
linux kernel has KVM_CAP_ARM_PSCI_0_2 capability.

This patch updates kvm_arch_init_vcpu() to enable KVM_ARM_VCPU_PSCI_0_2
feature for VCPU when KVM ARM/ARM64 has KVM_CAP_ARM_PSCI_0_2 capability.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402901605-24551-6-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:33:03 +01:00
Pranavkumar Sawargaonkar
228d5e048b target-arm: Common kvm_arm_vcpu_init() for KVM ARM and KVM ARM64
Introduce a common kvm_arm_vcpu_init() for doing KVM_ARM_VCPU_INIT
ioctl in KVM ARM and KVM ARM64. This also helps us factor-out few
common code lines from kvm_arch_init_vcpu() for KVM ARM/ARM64.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402901605-24551-5-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:33:02 +01:00
Pranavkumar Sawargaonkar
99040447ce kvm: Handle exit reason KVM_EXIT_SYSTEM_EVENT
In-kernel PSCI v0.2 emulation of KVM ARM/ARM64 forwards SYSTEM_OFF
and SYSTEM_RESET function calls to QEMU using KVM_EXIT_SYSTEM_EVENT
exit reason.

This patch updates kvm_cpu_exec() to handle KVM_SYSTEM_EVENT_SHUTDOWN
and KVM_SYSTEM_EVENT_RESET system-level events from QEMU-side.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402901605-24551-4-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:06:25 +01:00
Peter Maydell
a0289b8af3 hw/block/pflash_cfi01: Report correct size info for parallel configs
If the flash device is configured with a device-width which is
not equal to the bank-width, indicating that it is actually several
narrow flash devices in parallel, the CFI table should report the
number of blocks and the size of a single device, not of the whole
combined setup. This stops Linux from complaining:
"NOR chip too large to fit in mapping. Attempting to cope..."

As usual, we retain the old broken but backwards compatible behaviour
when the device-width is not specified.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402409025-25694-1-git-send-email-peter.maydell@linaro.org
2014-06-19 18:06:25 +01:00
Peter Maydell
476e75ab9d hw/arm/vexpress: Forbid specifying flash contents in two ways at once
Detect attempts by the user to specify the contents of the first flash
device via both -bios and -drive if=pflash... simultaneously and
print a helpful error message.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402419834-25982-1-git-send-email-peter.maydell@linaro.org
2014-06-19 18:06:25 +01:00
Peter Maydell
4063452eca target-arm/translate-a64.c: Fix dead ?: in handle_simd_shift_fpint_conv()
In handle_simd_shift_fpint_conv(), the combination of is_double == true,
is_scalar == false and is_q == false is an unallocated encoding; the
'both parts false' case of the nested ?: expression for calculating
maxpass is therefore unreachable and can be removed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1402171881-14343-4-git-send-email-peter.maydell@linaro.org
2014-06-19 18:06:25 +01:00
Peter Maydell
220ad4ca84 target-arm/translate-a64.c: Remove dead ?: in disas_simd_3same_int()
In disas_simd_3same_int(), none of the instructions permit is_q
to be false with size == 3 (this would be a vector operation with
a one-element vector, and the instruction set encodes those as
scalar operations). Replace the always-true ?: check with an
assert.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1402171881-14343-3-git-send-email-peter.maydell@linaro.org
2014-06-19 18:06:24 +01:00
Peter Maydell
5661ae6be2 target-arm: Add ULL suffix to calculation of page size
The maximum block size for AArch64 address translation is 2GB. This means
that we need a ULL suffix on our shift to avoid shifting into the sign
bit of a signed 32 bit integer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1402171881-14343-2-git-send-email-peter.maydell@linaro.org
2014-06-19 18:06:24 +01:00
Peter Maydell
0062609f70 hw/arm/spitz: Avoid clash with Windows header symbol MOD_SHIFT
The Windows headers provided by MinGW define MOD_SHIFT. Avoid
it by using SPITZ_MOD_* for our constants here.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:06:24 +01:00
Fabian Aggeler
e389be1673 target-arm: implement PD0/PD1 bits for TTBCR
Corrected handling of writes to TTBCR for ARMv8 (previously UNK/SBZP
bits are not RES0) and ARMv7 (new bits PD0/PD1 for CPUs with Security
Extensions).

Bits PD0/PD1 are now respected in get_phys_addr_v6/v5() and
get_level1_table_address.

Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Message-id: 1402409556-18574-1-git-send-email-aggelerf@ethz.ch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:06:24 +01:00
Michael S. Tsirkin
705456c0d7 numa: use RAM_ADDR_FMT with ram_addr_t
commit 4407ab055be995e64633322a78e64dfa376dc534
    vl.c: extend -m option to support options for memory hotplug
prints ram_addr_t with u64 format, this is wrong for
some systems, in particular w32.

print ram_addr_t with RAM_ADDR_FMT to fix build on w32.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Michael S. Tsirkin
56fdfb6106 qapi/string-output-visitor: fix bugs
in human mode, we are creating the string:

16-31 (16-31)

instead of

16-17 (10-1f)

because we forgot to pass 'true' as the human parameter on one of the
two calls to format_string.
Also, this is a worsening of quality; previously we would produce

16 (0x10)

to make it obvious which number was hex.
Fix these issues.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-19 18:44:22 +03:00
Michael S. Tsirkin
6ffa169576 tests: simplify code
Use error_abort instead of open-coded assert.
Cleaner and shorter.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-19 18:44:22 +03:00
Michael S. Tsirkin
c210ee95f2 qapi: fix input visitor bugs
Remove dead code.  Reset errno to 0 before each strtoull call, as the
man page requires.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-19 18:44:22 +03:00
Michael S. Tsirkin
b4acfbcd95 acpi: rephrase comment
"only upto" is not proper English.
Say "up to" and drop "only".

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
02edd407f3 qmp: add ACPI_DEVICE_OST event handling
emits event when ACPI OSPM evaluates _OST method
of ACPI device.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
02419bcb3f qmp: add query-acpi-ospm-status command
... to get ACPI OSPM status reported by ACPI devices
via _OST method.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
43f5041008 acpi: implement ospm_status() method for PIIX4/ICH9_LPC devices
... using TYPE_ACPI_DEVICE_IF interface.
Which provides status reporting of ACPI declared memory devices

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
521b3673ac acpi: introduce TYPE_ACPI_DEVICE_IF interface
... it will be used to abstract generic ACPI bits from
device that implements ACPI interface.

ACPIOSTInfo type is used for passing-through raw _OST
event/status codes reported by guest OS to a management
layer. It lets management tools interpret values
as specified by ACPI spec if it is interested in it.

QEMU doesn't encode these values as enum, since it
doesn't need to handle them and it allows interface
to scale well without any changes in QEMU while guest
OS and management evolves in time.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
6f2e27301d qmp: add query-memory-devices command
... allowing to get state of present memory devices.
Currently implemented only for PCDIMMDevice.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
c3ba309507 numa: handle mmaped memory allocation failure correctly
when memory_region_init_ram_from_file() fails
memory_region_size() will still return size that was
provided at region init time.
Instead use errp to properly detect error condition.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
dc61b29531 pc: acpi: do not hardcode preprocessor
but use one provided by environment, in addition
force C style preprocessing so that 'gcc -E' or
"clang -E" wouldn't ignore .dsl files.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:21 +03:00
Michael S. Tsirkin
c059451cc1 qmp: clean out whitespace
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:21 +03:00
Paolo Bonzini
5942a19040 qdev: recursively unrealize devices when unrealizing bus
When the patch was posted that became 5c21ce7 (qdev: Realize buses
on device realization, 2014-03-12), it included recursive realization
and unrealization of devices when the bus's "realized" property
was toggled.

However, due to the same old worries about recursive realization
and prerequisites not being realized yet, those hunks were dropped when
committing the patch.  Unfortunately, this causes a use-after-free bug
(easily reproduced by a PCI hot-unplug action).

Before the patch, device_unparent behaved as follows:

   for each child bus
     unparent bus ----------------------------.
     | for each child device                  |
     |   unparent device ---------------.     |
     |   | unrealize device             |     |
     |   | call dc->unparent            |     |
     |   '-------------------------------     |
     '----------------------------------------'
   unrealize device

After the patch, it behaves as follows instead:

   unrealize device --------------------.
   | for each child bus                 |
   |   unrealize bus               (A)  |
   '------------------------------------'
   for each child bus
     unparent bus ----------------------.
     | for each child device            |
     |   unrealize device          (B)  |
     |   call dc->unparent              |
     '----------------------------------'

At the step marked (B) the device might use data from the bus that is
not available anymore due to step (A).

To fix this, we need to unrealize devices before step (A).  To sidestep
concerns about recursive realization, only do recursive unrealization
and leave the "value && !bus->realized" case as it is.

The resulting flow is:

   for each child bus
     unrealize bus ---------------------.
     | for each child device            |
     |   unrealize device          (B)  |
     | call bc->unrealize          (A)  |
     '----------------------------------'
   unrealize device
   for each child bus
     unparent bus ----------------------.
     | for each child device            |
     |   unparent device                |
     '----------------------------------'

where everything is "powered down" before it is unassembled.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
2014-06-19 18:44:21 +03:00
Paolo Bonzini
b7b34d055d qdev: reorganize error reporting in bus_set_realized
No semantic change.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
2014-06-19 18:44:21 +03:00
Michael S. Tsirkin
0d156683f6 qapi: fix build on glib < 2.28
The following commits:
    qapi: make string output visitor parse int list
    qapi: make string input visitor parse int list
break with glib < 2.28 since they use the
new g_list_free_full function.

Open-code that to fix build on old systems.

Cc: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:21 +03:00
Hu Tao
69e255635d qapi: make string output visitor parse int list
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: split up patch
2014-06-19 18:44:21 +03:00
Hu Tao
659268ffbf qapi: make string input visitor parse int list
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: split up patch
2014-06-19 18:44:21 +03:00
Hu Tao
cac124d17c tests: fix memory leak in test of string input visitor
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:21 +03:00
Hu Tao
eb1539b234 hmp: add info memdev
This is the hmp counterpart of qmp query-memdev.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: fix build on 32 bit
2014-06-19 18:44:21 +03:00
Hu Tao
76b5d8507d qmp: add query-memdev
Add qmp command query-memdev to query for information
of memory devices

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:21 +03:00
Hu Tao
4cf1b76bf1 hostmem: add properties for NUMA memory policy
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Raise errors on setting properties if !CONFIG_NUMA.  Add BUILD_BUG_ON
 checks. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:21 +03:00
Paolo Bonzini
dbcb898118 hostmem: add property to map memory with MAP_SHARED
A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
a35ba7be4b hostmem: allow preallocation of any memory region
And allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
605d0a945d hostmem: add merge and dump properties
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Michael S. Tsirkin
2925020d33 osdep: add merge and dump flags
will be used by follow up patch

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
52330e1a96 hostmem: add file-based HostMemoryBackend
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: comment tweak
2014-06-19 18:44:20 +03:00
Hu Tao
bd9262d95f hostmem: separate allocation from UserCreatable complete method
This allows the superclass to set various policies on the memory
region that the subclass creates. Drops hostmem-ram's complete method
accordingly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Hu Tao
58f4662c6c backend:hostmem: replace hostmemory with host_memory
..to keep names consistant.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
9521d42b54 pc: pass MachineState to pc_memory_init
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
c4090f8ef6 vl: redo -object parsing
Follow the lines of the HMP implementation, using OptsVisitor
to parse the options.  This gives access to OptsVisitor's
rich parsing of integer lists.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
7f56e740a6 memory: add error propagation to file-based RAM allocation
Right now, -mem-path will fall back to RAM-based allocation in some
cases.  This should never happen with "-object memory-file", prepare
the code by adding correct error propagation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: drop \n at end of error messages
2014-06-19 18:44:20 +03:00
Paolo Bonzini
0b183fc871 memory: move mem_path handling to memory_region_allocate_system_memory
Like the previous patch did in exec.c, split memory_region_init_ram and
memory_region_init_ram_from_file, and push mem_path one step further up.
Other RAM regions than system memory will now be backed by regular RAM.

Also, boards that do not use memory_region_allocate_system_memory will
not support -mem-path anymore.  This can be changed before the patches
are merged by migrating boards to use the function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Paolo Bonzini
7febe36f9a numa: add -numa node,memdev= option
This option provides the infrastructure for binding guest NUMA nodes
to host NUMA nodes.  For example:

 -object memory-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
 -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
 -object memory-ram,size=1024M,policy=interleave,host-nodes=1-3,id=ram-node1 \
 -numa node,nodeid=1,cpus=1,memdev=ram-node1

The option replaces "-numa node,mem=".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

MST: conflict resolution
2014-06-19 18:44:19 +03:00
Hu Tao
1f21772db0 qom: introduce object_property_get_enum and object_property_get_uint16List
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Hu Tao
7f8f9ef1da Introduce signed range.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: split up patch
2014-06-19 18:44:19 +03:00
Wanlong Gao
a99d57bb6a configure: add Linux libnuma detection
Add detection of libnuma (mostly contained in the numactl package)
to the configure script. Can be enabled or disabled on the command
line, default is use if available.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Paolo Bonzini
7bd4f430a3 memory: move RAM_PREALLOC_MASK to exec.c, rename
Prepare for adding more flags.  The "_MASK" suffix is unique, kill it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Paolo Bonzini
38183310be memory: move preallocation code out of exec.c
So that backends can use it.

Since we need the page size for efficiency, move code to compute it
out of translate-all.c and into util/oslib-win32.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Paolo Bonzini
e1c57ab86f memory: reorganize file-based allocation
Split the internal interface in exec.c to a separate function, and
push the check on mem_path up to memory_region_init_ram.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Paolo Bonzini
dfabb8b916 numa: introduce memory_region_allocate_system_memory
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: resolve conflicts
2014-06-19 18:44:19 +03:00
Paolo Bonzini
d116946424 qmp: improve error reporting for -object and object-add
Use QERR_INVALID_PARAMETER_VALUE for consistency.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Luiz Capitulino
4932b8971b man: improve -numa doc
The -numa option documentation in qemu's manpage lacks the command-line
options and some information regarding how it relates to options -m and
-smp. This commit fills in the missing text.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Wanlong Gao
45e30bf3a9 NUMA: expand MAX_NODES from 64 to 128
libnuma choosed 128 for MAX_NODES, so we follow libnuma here.

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Wanlong Gao
0042109a6a NUMA: convert -numa option to use OptsVisitor
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Wanlong Gao
8c85901ed3 NUMA: Add numa_info structure to contain numa nodes info
Add the numa_info structure to contain the numa nodes memory,
VCPUs information and the future added numa nodes host memory
policies.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
[Fix hw/ppc/spapr.c - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Wanlong Gao
2b631ec255 NUMA: check if the total numa memory size is equal to ram_size
If the total number of the assigned numa nodes memory is not
equal to the assigned ram size, it will write the wrong data
to ACPI table, then the guest will ignore the wrong ACPI table
and recognize all memory to one node. It's buggy, we should
check it to ensure that we write the right data to ACPI table.

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

MST: error message reworded
2014-06-19 18:44:18 +03:00
Wanlong Gao
96d0e26c23 NUMA: move numa related code to new file numa.c
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

MST: comment tweaks
2014-06-19 18:44:18 +03:00
Michael S. Tsirkin
edd8db8705 tests: disable vhost test temporarily
This test needs a bit more work: issues have been
found on legacy systems, disable it for now to
avoid false positives for people.
Will re-enable after issues are addressed.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Nikolay Nikolaev
a77e6b14b4 Add qtest for vhost-user
This test creates a 'server' chardev to listen for vhost-user messages.
Once VHOST_USER_SET_MEM_TABLE is received it mmaps each received region,
and read 1k bytes from it. The read data is compared to data from readl.

The test requires hugetlbfs to be already mounted and writable. The mount
point defaults to '/hugetlbfs' and can be specified via the environment
variable QTEST_HUGETLBFS_PATH.

The rom pc-bios/pxe-virtio.rom is used to instantiate a virtio pcicontroller.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: fix up coding style
MST: disable vhost test temporarily

This test needs a bit more work: issues have been
found on legacy systems, disable it for now to
avoid false positives for people.
Will re-enable after issues are addressed.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Nikolay Nikolaev
07a32d6b96 libqemustub: add stubs to be able to use qemu-char.c
chardev depends on lots of external symbols that are not necessarily
needed to be able to use, for example, 'socket chardev'. So add stubs
for these functions:

 - bdrv_commit_all
 - qemu_chr_open_msmouse
 - is_daemonized
 - qemu_add_machine_init_done_notifier
 - monitor_init
 - qemu_notify_event
 - vc_init

and this array:

 - serial_hds

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Nikolay Nikolaev
5fc0e00291 Add vhost-user protocol documentation
This document describes the basic message format used by vhost-user
for communication over a unix domain socket. The protocol is based
on the existing ioctl interface used for the kernel version of vhost.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Nikolay Nikolaev
03ce574442 Add the vhost-user netdev backend to the command line
The supplied chardev id will be inspected for supported options. Only
a socket backend, with a set path (i.e. a Unix socket) and optionally
the server parameter set, will be allowed. Other options (nowait, telnet)
will make the chardev unusable and the netdev will not be initialised.

Additional checks for validity:
  - requires `-numa node,memdev=..`
  - requires `-device virtio-net-*`

The `vhostforce` option is used to force vhost-net when we deal with
non-MSIX guests.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-19 18:44:18 +03:00
Peter Maydell
6baa963f4d Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  virtio-scsi: define dummy handle_output for vhost-scsi vqs
  block/iscsi: drop obsolete pointers from iscsi_co_writev
  block/iscsi: fix init value for iTask->retries
  block/iscsi: bump libiscsi requirement to 1.9.0
  virtio-scsi: add support for the any_layout feature
  virtio-scsi: introduce virtio_scsi_complete_cmd_req
  virtio-scsi: prepare sense data handling for any_layout
  virtio-scsi: add extra argument and return type to qemu_sgl_concat
  virtio-scsi: add target swap for VirtIOSCSICtrlTMFReq fields
  virtio-scsi: start preparing for any_layout
  util: add return value to qemu_iovec_concat_iov
  megasas: use PCI DMA API
  scsi: Print command name in debug
  scsi-disk: fix bug in scsi_block_new_request() introduced by commit 137745c
  scsi-disk.c: Fix compilation with -DDEBUG_SCSI
  block/iscsi: use 16 byte CDBs only when necessary
  block/iscsi: fix potential segfault on early callback
  block/iscsi: handle BUSY condition

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 16:18:04 +01:00
Sean Bruno
9f6f7f1a85 include/qemu/aes.h: Avoid conflicts with FreeBSD AES functions
FreeBSD's libcrypto provides functions with the same names as us;
use #define to rename our versions to avoid conflicts at link time.

Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Signed-off-by: Ed Maste <emaste@freebsd.org>
Message-id: 1402930927-41125-1-git-send-email-sbruno@freebsd.org
[PMM: improved commit message, fixed comment typo]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 16:13:38 +01:00
Stefan Weil
e637aa6647 w32: Fix regression caused by new g_poll implementation
Commit 5a007547df tried to fix a
performance degradation caused by bad handling of small timeouts
in the original implementation of g_poll.

Since that commit, hard disk I/O no longer works.

Instead of rewriting the g_poll implementation, this patch simply copies
the original code (released under LGPL) from latest glib and only modifies
it where needed (see comments in the code). URL of the original code:
https://git.gnome.org/browse/glib/tree/glib/gpoll.c

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1401291744-14314-1-git-send-email-sw@weilnetz.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 14:56:51 +01:00
Nikolay Nikolaev
d314f586b3 Add new vhost-user netdev backend
Add a new QEMU netdev backend that is intended to invoke vhost_net with the
vhost-user backend. It uses an Unix socket chardev to establish a
communication with the 'slave' (client and server mode supported).

At runtime the netdev will handle OPEN/CLOSE events from the chardev. Upon
disconnection it will set link_down accordingly and notify virtio-net; the
virtio-net interface will go down.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:57 +03:00
Nikolay Nikolaev
5f4c01cab1 vhost-net: vhost-user feature bits support
Handle the feature bits negotiation when using vhost-user. Allow
the underlying implementation to have a finer control over all the
bits except the VIRTIO_NET_F_MAC.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:57 +03:00
Nikolay Nikolaev
5f6f6664bf Add vhost-user as a vhost backend.
The initialization takes a chardev backed by a unix domain socket.
It should implement qemu_fe_set_msgfds in order to be able to pass
file descriptors to the remote process.

Each ioctl request of vhost-kernel has a vhost-user message equivalent,
which is sent over the control socket.

The general approach is to copy the data from the supplied argument
pointer to a designated field in the message. If a file descriptor is
to be passed it will be placed in the fds array for inclusion in
the sendmsg control header.

VHOST_SET_MEM_TABLE ignores the supplied vhost_memory structure and scans
the global ram_list for ram blocks with a valid fd field set. This would
be set when the '-object memory-file' option with share=on property is used.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
1a1bfac9ee Add vhost-backend and VhostBackendType
Use vhost_set_backend_type to initialise a proper vhost_ops structure.
In vhost_net_init and vhost_net_start_one call conditionally TAP related
initialisation depending on the vhost backend type.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
24d1eb33eb Add vhost_ops to vhost_dev struct and replace all relevant ioctls
Decouple vhost from the Linux kernel by introducing vhost_ops. The
intention is to provide different backends - a 'kernel' backend based on
the ioctl interface, and an 'user' backend based on a UNIX domain socket
and shared memory interface.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
81647a655f vhost_net_init will use VhostNetOptions to get all its arguments
vhost_dev_init will replace devfd and devpath with a single opaque argument.
This is initialised with a file descriptor. When TAP is used (through
vhost_net), open /dev/vhost-net and pass the fd as an opaque parameter in
VhostNetOptions. The same applies to vhost-scsi - open /dev/vhost-scsi and
pass the fd.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
ed8b4afe5f Refactor virtio-net to use generic get_vhost_net
This decouples virtio-net from the TAP netdev backend and allows support
for other backends to be implemented.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:55 +03:00
Nikolay Nikolaev
212d69f25e vhost_net should call the poll callback only when it is set
The poll callback needs to be called when bringing up or down
the vhost_net instance. As it is not mandatory for an NetClient
to implement it, invoke it only when it is set.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:55 +03:00
Nikolay Nikolaev
2e6d46d77e vhost: add vhost_get_features and vhost_ack_features
Generalize the features get/ack to be used for both vhost-net and vhost-scsi.
In vhost-net add vhost_net_get_feature_bits to select the feature bit set
depending on the NetClient kind.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:55 +03:00
Nikolay Nikolaev
cdaa86a54b Add G_IO_HUP handler for socket chardev
This is used to detect that the remote end has disconnected. Just call
tcp_char_disconnect on receiving this event.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:55 +03:00
Nikolay Nikolaev
c76bf6bb8f Add chardev API qemu_chr_fe_get_msgfds
This extends the existing qemu_chr_fe_get_msgfd by allowing to read a set
of fds. The function for receiving the fds - unix_process_msgfd is extended
to allocate the needed array size.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:55 +03:00
Nikolay Nikolaev
d39aac7aac Add chardev API qemu_chr_fe_set_msgfds
This will set an array of file descriptors to the internal structures.
The next time a message is send the array will be send as ancillary
data. This feature works on the UNIX domain socket backend only.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Nikolay Nikolaev
7b0bfdf52d Add chardev API qemu_chr_fe_read_all
This function will attempt to read data from the chardev trying
to fill the buffer up to the given length.
Add tcp_chr_disconnect to reuse disconnection code where needed.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Nikolay Nikolaev
69e03ae64b Add kvm_eventfds_enabled function
Add a function to check if the eventfd capability is present in KVM in
the host kernel.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-19 16:41:54 +03:00
Jason Wang
f57fcf7063 virtio-net: announce self by guest
It's hard to track all mac addresses and their configurations (e.g
vlan or ipv6) in qemu. Without this information, it's impossible to
build proper garp packet after migration. The only possible solution
to this is let guest (who knows all configurations) to do this.

So, this patch introduces a new readonly config status bit of virtio-net,
VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
presence of its link through config update interrupt.When guest has
done the announcement, it should ack the notification through
VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
Linux guest).

During load, a counter of announcing rounds is set so that after the vm is
running it can trigger rounds of config interrupts to notify the guest to build
and send the correct garps.

Cc: Liuyongan <liuyongan@huawei.com>
Cc: Amos Kong <akong@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Jason Wang
508e1180d3 migration: introduce self_announce_delay()
This patch introduces self_announce_delay() to calculate the delay for
the next announce round. This could be used by other device e.g
virtio-net who wants to do announcing by itself.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Jason Wang
110f463062 migration: export SELF_ANNOUNCE_ROUNDS
Export it for other users.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Igor Mammedov
9d3cae68ac pc: q35: acpi: report error to user on unsupported unplug request
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:53 +03:00
Michael S. Tsirkin
292b1634d0 ich: get rid of spaces in type name
Names with spaces in them are nasty, let's not go there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:53 +03:00
Markus Armbruster
80e0090a44 virtio: Drop superfluous conditionals around g_strdup()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-19 16:41:53 +03:00
Markus Armbruster
9e28840658 virtio: Drop superfluous conditionals around g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-19 16:41:53 +03:00
Michael S. Tsirkin
63122df231 acpi-test: update expected tables
Update ACPI tables test to match new ACPI tables
after adding the memory hotplug feature.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:53 +03:00
Michael S. Tsirkin
2745d9e772 acpi: update generated files
pdate precompiled ACPI hex files for iasl-less hosts
after adding the memory hotplug feature

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:52 +03:00
Igor Mammedov
4635b16770 pc: ACPI BIOS: make GPE.3 handle memory hotplug event on PIIX and Q35 machines
also make handler edge based to avoid losing events, the same as
it has been done for PCI and CPU hotplug handlers.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:52 +03:00
Igor Mammedov
cec65193d4 pc: ACPI BIOS: reserve SRAT entry for hotplug mem hole
Needed for Windows to use hotplugged memory device, otherwise
it complains that server is not configured for memory hotplug.
Tests shows that aftewards it uses dynamically provided
proximity value from _PXM() method if available.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:52 +03:00
Igor Mammedov
bf1e893959 pc: add "hotplug-memory-region-size" property to PC_MACHINE
... it will be used by acpi-build code and by unit tests

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:52 +03:00
Igor Mammedov
bef3492d11 pc: ACPI BIOS: implement memory hotplug interface
- provides static SSDT object for memory hotplug that can handle
  upto 256 hotplugable memory slots
- SSDT template for memory devices and runtime generator
  of them in SSDT table.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
3fbcdc27b1 pc: propagate memory hotplug event to ACPI device
Notify PIIX4_PM/ICH9LPC device about hotplug event,
so that it would send SCI to guest notifying about
newly added memory.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
781bbd6bec pc: add acpi-device link to PCMachineState
the link will used later to access device implementing
ACPI functions instead of adhoc lookup in QOM tree.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
f816a62daa pc: migrate piix4 & ich9 MemHotplugState
Adds an optional subsection that allows to migrate current
state of acpi_memory_hotplug of ACPI PM device.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
1f8621842e acpi:ich9: add memory hotplug handling
Add memory hotplug initialization/handling to ICH9 LPC device
and enable it by default for post 2.0 machine types

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
d6b38b6611 pc: ich9 lpc: make it work with global/compat properties
Propeties of object should be available after its instances_init()
callback is finished and not added in PCIDeviceClass.init which is
roughly corresponds to realize() method.
Moving properties adding into instances_init will fix missing
property error when global/compat property mechanism is used.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:50 +03:00
Igor Mammedov
34774320c3 acpi:piix4: add memory hotplug handling
Add memory hotplug initialization/handling to PIIX4_PM device
and enable it by default for post 2.0 machine types

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: resolve conflict in pc.h
2014-06-19 16:41:50 +03:00
Igor Mammedov
f1adc360b4 acpi:piix4: allow plug/unlug callbacks handle not only PCI devices
... and report error if plugged in device is not supported.
Later these callbacks will be used by memory hotplug.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:50 +03:00
Igor Mammedov
2e1ac493f1 trace: pc: add PC_DIMM slot & address allocation
Add mhp_pc_dimm_assigned_slot & mhp_pc_dimm_assigned_address
events to trace which address and slot where assigned to
plugged in PC_DIMM device on target-i386 machine.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:50 +03:00
Igor Mammedov
dfe292ffc4 trace: add acpi memory hotplug IO region events
Add events for tracing accesses to memory hotplug IO ports.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:50 +03:00
Igor Mammedov
3ef77acab2 acpi: memory hotplug ACPI hardware implementation
- implements QEMU hardware part of memory hotplug protocol
  described at "docs/specs/acpi_mem_hotplug.txt"
- handles only memory add notification event for now

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:49 +03:00
Igor Mammedov
7e629d1d8d acpi: rename cpu_hotplug_defs.h to pc-hotplug.h
to make it more generic, so it could be used for memory hotplug
as well.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:49 +03:00
Igor Mammedov
0cd03d89b9 pc-dimm: add busy slot check and slot auto-allocation
- if slot property is not specified on -device/device_add command,
treat default value as request for assigning PCDIMMDevice to
the first free slot.

- if slot is provided with -device/device_add command, attempt to
use it or fail command if it's already occupied.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:49 +03:00
Igor Mammedov
0b31257116 pc-dimm: add busy address check and address auto-allocation
- if 'addr' property is not specified on -device/device_add command,
treat the default value as request for assigning PCDIMMDevice to
the first free memory region.

- if 'addr' is provided with -device/device_add command, attempt to
use it or fail command if it's already occupied or falls inside
of an existing PCDIMMDevice memory region.

Note:
GCompareFunc(a, b) used by g_slist_insert_sorted() returns 'gint',
however it might be too small to fit difference between
2 addresses. So use 128bit to calculate the difference and normalize
result to -1/0/1 return values.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>

MST: commit log tweaks
2014-06-19 16:41:49 +03:00
Igor Mammedov
95bee274fd pc: add memory hotplug handler to PC_MACHINE
that will perform mapping of PC_DIMM device into guest's RAM address space

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
ca8336f385 pc: exit QEMU if compat machine doesn't support memory hotlpug
... if user attempts to start it with memory hotplug enabled.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
de268e134c pc: add 'etc/reserved-memory-end' fw_cfg interface for SeaBIOS
'etc/reserved-memory-end' will allow QEMU to tell BIOS where PCI
BARs mapping could safely start in high memory.

Allowing BIOS to start mapping 64-bit PCI BARs at address where it
wouldn't conflict with other mappings QEMU might place before it.

That permits QEMU to reserve extra address space before
64-bit PCI hole for memory hotplug.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
a0cc8856e8 pc: exit QEMU if number of slots more than supported 256
... which is imposed by current naming scheme of ACPI memory devices.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
619d11e463 pc: initialize memory hotplug address space
initialize and map hotplug memory address space container
into guest's RAM address space.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Igor Mammedov
7bb5d6ade6 pc-dimm: do not allow setting an in-use memdev
using the same memdev backend more than once will cause
assertion at MemoryRegion mapping time because it's already
mapped. Prevent it by checking that the associated MemoryRegion
is not mapped.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: tweak commit log
2014-06-19 16:41:47 +03:00
Igor Mammedov
eed2bacfd2 memory: add memory_region_is_mapped() API
which allows to check if MemoryRegion is already mapped.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Vasilis Liaskovitis
10b7e74bf2 pc: implement pc-dimm device abstraction
Each hotplug-able memory slot is a PCDIMMDevice.
A hot-add operation for a memory device:
- creates a new PCDIMMDevice and makes hotplug controller to map it into
  guest address space

Hotplug operations are done through normal device_add commands.
For migration case, all hotplugged memory devices on source should be
specified on target's command line using '-device' option with
properties set to the same values as on source.

To simplify review, patch introduces only PCDIMMDevice QOM skeleton that
will be extended by following patches to implement actual memory hotplug
and related functions.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Igor Mammedov
d012ffc189 qdev: expose DeviceState.hotplugged field as a property
so that management could detect via QOM interface if device was
hotplugged

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Igor Mammedov
b745454811 qdev: hotplug for bus-less devices
Add get_hotplug_handler() method to machine, and
make bus-less device use it during hotplug
as a means to discover a hotplug handler controller.
The returned controller is used to perform hotplug
actions.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:46 +03:00
Igor Mammedov
c270fb9eff vl.c: extend -m option to support options for memory hotplug
Add following parameters:
  "slots" - total number of hotplug memory slots
  "maxmem" - maximum possible memory

"slots" and "maxmem" should go in pair and "maxmem" should be greater
than "mem" for memory hotplug to be enabled.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: fix build on 32 bit
2014-06-19 16:41:46 +03:00
Gerd Hoffmann
eb214ff8ef vnc: fix screen updates
Bug was added by 38ee14f4f3.
vnc_jobs_join call is missing in one code path.

Reported-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-19 12:48:07 +02:00
Markus Armbruster
c14e98479b vnc: Drop superfluous conditionals around g_strdup()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-19 12:48:07 +02:00
Markus Armbruster
64641d8764 vnc: Drop superfluous conditionals around g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-19 12:48:07 +02:00
Ming Lei
91d670fbf9 virtio-scsi: define dummy handle_output for vhost-scsi vqs
vhost userspace needn't to handle vq's notification from guest,
so define dummy handle_output callback for all vqs of vhost-scsi.

In some corner cases(such as when handling vq's reset from VM), virtio-pci
still trys to handle pending virtio-scsi events, then object check failure
inside virtio_scsi_handle_event() for vhost-scsi can be triggered.

The issue can be reproduced by 'rmmod virtio-scsi', 'system sleep' or reboot
inside VM.

Cc: qemu-stable@nongnu.org
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-19 10:15:48 +02:00
Richard Henderson
bc8d688ff3 tcg/optimize: Don't special case TCG_OPF_CALL_CLOBBER
With the "old" ldst ops we didn't know the real width of the
result of the load, but with the "new" ldst ops we do.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-18 11:39:02 -07:00
Igor Mammedov
1f07048933 add memdev backend infrastructure
Provides framework for splitting host RAM allocation/
policies into a separate backend that could be used
by devices.

Initially only legacy RAM backend is provided, which
uses memory_region_init_ram() allocator and compatible
with every CLI option that affects memory_region_init_ram().

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-18 21:10:30 +03:00
Igor Mammedov
2d9c2725f7 vl.c: daemonize before guest memory allocation
memory allocated for guest before QEMU is daemonized and then mapped
later in guest's address space after it is daemonized, leads to EPT
violation and QEMU aborts.

To avoid this and similar issues switch to daemonized mode early
before applying/processing other options.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-18 21:10:27 +03:00
Igor Mammedov
a790f4ecc9 object_add: allow completion handler to get canonical path
Add object to /objects before calling user_creatable_complete()
handler, so that object might be able to call
object_get_canonical_path() in its completion handler.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-18 21:10:21 +03:00
Igor Mammedov
04ed3ea892 pc: ACPI BIOS: use enum for defining memory affinity flags
replace magic numbers with enum describing Flags field of
memory affinity in SRAT table.

MemoryAffinityFlags enum will define flags decribed by:
 ACPI spec 5.0, "5.2.16.2 Memory Affinity Structure",
 "Table 5-69 Flags - Memory Affinity Structure"

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-18 21:10:17 +03:00
Igor Mammedov
d5747cace7 pc: create custom generic PC machine type
it will be used for PC specific options/variables

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-18 21:09:55 +03:00
Peter Lieven
8c215a9fbd block/iscsi: drop obsolete pointers from iscsi_co_writev
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 18:42:42 +02:00
Peter Lieven
9d2e256e62 block/iscsi: fix init value for iTask->retries
during rebasing the changed init value for the
retry counter was missed. This resulted in no retries
being performed at all.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 18:42:24 +02:00
Petar Jovanovic
d279279e2b target-mips: implement UserLocal Register
From MIPS documentation (Volume III):

UserLocal Register (CP0 Register 4, Select 2)
Compliance Level: Recommended.

The UserLocal register is a read-write register that is not interpreted by
the hardware and conditionally readable via the RDHWR instruction.

This register only exists if the Config3-ULRI register field is set.

Privileged software may write this register with arbitrary information and
make it accessible to unprivileged software via register 29 (ULR) of the
RDHWR instruction. To do so, bit 29 of the HWREna register must be set to a
1 to enable unprivileged access to the register.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-06-18 18:10:47 +02:00
Aurelien Jarno
739b7a9075 bitops: provide an inline implementation of find_first_bit
find_first_bit has started to be used heavily in TCG code. The current
implementation based on find_next_bit is not optimal and can't be
optimized be the compiler if the bit array has a fixed size, which is
the case most of the time.

This new implementation does not use find_next_bit and is yet small
enough to be inlined.

Cc: Corentin Chary <corentin.chary@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2014-06-18 18:10:47 +02:00
Peter Lieven
e49ab19fca block/iscsi: bump libiscsi requirement to 1.9.0
This patch lifts the minimum supported libiscsi version from 1.4.0 to
1.9.0 since the BUSY patch required that change.

On one this allows us to remove all #ifdefs from the code which
makes the code easier to maintain and read. On the other hand
I would not recommend libiscsi prior to 1.8.0 for production use
because the following important libiscsi fixes for deadlocks and
protocol errors are missing prior to 1.8.0:

dbe9a1e SOCKET queue cmd PDUs directly in waitpdu queue
30df192 DATA-OUT set pdu->cmdsn appropriately
548bd22 ISCSI fix broken send logic in iscsi_scsi_async_command
14bee10 RECONNECT do not increase CmdSN for immediate PDUs
1f4a66a PDU queue out PDUs in order of itt.
562dd46 PDU avoid incrementing itt to 0xffffffff
cd09c0f PDU use serial32 arithmetic for cmdsn, maxcmdsn and expcmdsn.
89e918e SOCKET validate data_size in in_pdu header
91267f5 Limit immediate and unsolicited data to FirstBurstLength

Note that libiscsi 1.9.0 was released on Feb 24th, 2013, about
one month after 1.8.0.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 18:03:33 +02:00
James Hogan
a31896c411 MAINTAINERS: Add entry for MIPS KVM
Add MAINTAINERS entry for MIPS KVM.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:59:46 +02:00
Sanjay Lal
222e7d11e7 target-mips: Enable KVM support in build system
Enable KVM support for MIPS in the build system.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:59:37 +02:00
James Hogan
b03118114d hw/mips: malta: Add KVM support
In KVM mode the bootrom is loaded and executed from the last 1MB of
DRAM.

Based on "[PATCH 12/12] KVM/MIPS: General KVM support and support for
SMP Guests" by Sanjay Lal <sanjayl@kymasys.com>.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:59:19 +02:00
Sanjay Lal
b1bd8b28cc hw/mips: In KVM mode, inject IRQ2 (I/O) interrupts via ioctls
COP0 emulation is in-kernel for KVM, so inject IRQ2 (I/O) interrupts via
ioctls.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:59:12 +02:00
James Hogan
14c03ab975 target-mips: Call kvm_mips_reset_vcpu() from mips_cpu_reset()
When KVM is enabled call kvm_mips_reset_vcpu() from mips_cpu_reset() as
done for other targets since commit 50a2c6e55f (kvm: reset state from
the CPU's reset method).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:59:05 +02:00
Sanjay Lal
e2132e0bba target-mips: kvm: Add main KVM support for MIPS
Implement the main KVM arch API for MIPS.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:52 +02:00
James Hogan
aed6efb90c kvm: Allow arch to set sigmask length
MIPS/Linux is unusual in having 128 signals rather than just 64 like
most other architectures. This means its sigmask is 16 bytes instead of
8, so allow arches to override the sigmask->len value passed to the
KVM_SET_SIGNAL_MASK ioctl in kvm_set_signal_mask() by calling
kvm_set_sigmask_len() from kvm_arch_init(). Otherwise default to 8
bytes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:43 +02:00
James Hogan
4ef37e6907 target-mips: get_physical_address: Add KVM awareness
MIPS KVM trap & emulate mode (which is currently the only supported
mode) has to add an extra kseg0/kseg1 at 0x40000000 and an extra
kseg2/kseg3 at 0x60000000. Take this into account in
get_physical_address() so that debug memory access works.

This is done by translating the address to a standard kseg0 or kseg2
address before doing the normal address translation. The real virtual
address is still used for TLB lookups.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:37 +02:00
James Hogan
22010ce7f2 target-mips: get_physical_address: Add defines for segment bases
Add preprocessor definitions for 32bit segment bases for use in
get_physical_address(). These will also be taken advantage of in the
next patch which adds KVM awareness.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:30 +02:00
Sanjay Lal
253fffe725 hw/mips: Add API to convert KVM guest KSEG0 <-> GPA
Add API for converting physical addresses to KVM guest KSEG0 addresses,
and fix the existing API for converting KSEG0 addresses to physical
addresses to work in the KVM case. Both have the same sized KSEG0, so
it's just a case of fixing the mask.

In KVM trap and emulate mode both the guest kernel and guest userspace
execute in useg:
    Guest User address space:   0x00000000..0x3fffffff
    Guest Kernel Unmapped:      0x40000000..0x5fffffff
    Guest Kernel Mapped:        0x60000000..0x7fffffff

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:10 +02:00
Sanjay Lal
353a243e22 hw/mips/cputimer: Don't start periodic timer in KVM mode
Compare/Count timer interrupts are handled in-kernel for KVM. Therefore
don't bother creating the timer at init time if KVM is enabled. This
will conveniently avoid attempts to set the timeout when
cpu_mips_store_count() is called at reset with KVM enabled, treating the
timer as stopped so that CP0_Count is modified directly.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
[james.hogan@imgtec.com: Update after "target-mips: Reset CPU timer
consistently" which moves timer start to reset time]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:57:52 +02:00
James Hogan
4b69c7e265 target-mips: Reset CPU timer consistently
The MIPS CPU timer (CP0 Count/Compare registers & QEMU timer) is
reset at machine initialisation, including starting the timeout. Both
registers however are placed before mvp in CPUMIPSState so they will
both be zeroed on reset by the memset in mips_cpu_reset() including soon
after init. This doesn't take into account that the timer may be
running, in which case env->CP0_Count will represent the delta against
the VM clock and the timeout will need updating.

At init time (cpu_mips_clock_init()), lets only create the timer.
Setting Count = 1 and starting the timer (cpu_mips_store_count()) can be
done at reset time from cpu_state_reset(), which is after the memset.
There is also no need to set CP0_Compare = 0 as that is already handled
by the memset.

Note that a reset occurs from mips_cpu_realizefn() which is before the
machine init callback has had a chance to set up the CPU interrupts and
the CPU timer, so env->timer will be NULL. This case is handled
explicitly in cpu_mips_store_count(), treating the timer as disabled
(which will also be the right thing to do when KVM support is added).

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:54:30 +02:00
Alexander Graf
00008418aa KVM: Fix GSI number space limit
KVM tells us the number of GSIs it can handle inside the kernel. That value is
basically KVM_MAX_IRQ_ROUTES. However when we try to set the GSI mapping table,
it checks for

    r = -EINVAL;
    if (routing.nr >= KVM_MAX_IRQ_ROUTES)
        goto out;

erroring out even when we're only using all of the GSIs. To make sure we never
hit that limit, let's reduce the number of GSIs we get from KVM by one.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:54:18 +02:00
Peter Maydell
2edaf21b93 Merge remote-tracking branch 'remotes/bonzini/memory' into staging
* remotes/bonzini/memory:
  memory: Don't call memory_region_update_coalesced_range if nothing changed
  memory: MemoryRegion: rename parent to container
  memory: MemoryRegion: factor out memory region re-adder
  memory: MemoryRegion: factor out subregion add functionality
  qtest: fix qtest_clock_warp() for no deadline case
  exec: dummy_section: Pass address space through.
  memory: Simplify mr_add_subregion() if-else
  memory: Don't update all memory region when ioeventfd changed
  unset RAMBlock idstr when unregister MemoryRegion
  exec: introduce qemu_ram_unset_idstr() to unset RAMBlock idstr
  MAINTAINERS: Add myself as Memory API maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-18 15:08:38 +01:00
Fam Zheng
ab5b3db5d7 memory: Don't call memory_region_update_coalesced_range if nothing changed
With huge number of PCI devices in the system (for example, 200
virtio-blk-pci), this unconditional call can slow down emulation of
irrelevant PCI operations drastically, such as a BAR update on a device
that has no coalescing region. So avoid it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 15:32:50 +02:00
Paolo Bonzini
feca4ac18b memory: MemoryRegion: rename parent to container
Avoid confusion with the QOM parent.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 15:32:50 +02:00
Paolo Bonzini
3eff1f46f0 virtio-scsi: add support for the any_layout feature
Store the request and response headers by value, and let
virtio_scsi_parse_req check that there is only one of datain
and dataout.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Paolo Bonzini
a3de269ccc virtio-scsi: introduce virtio_scsi_complete_cmd_req
This is also related to sense handling, and will be used
by anylayout.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Paolo Bonzini
489c7901a6 virtio-scsi: prepare sense data handling for any_layout
Retrieve sense and copy it to guest memory, to prepare for when we will use
qemu_iovec_from_buf.

Swap response and request, since we'll use the tail of VirtIOSCSIReq
for the CDB.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Paolo Bonzini
57fbae6e2c virtio-scsi: add extra argument and return type to qemu_sgl_concat
Will be used for anylayout support.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Paolo Bonzini
b0b4ea17dc virtio-scsi: add target swap for VirtIOSCSICtrlTMFReq fields
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Paolo Bonzini
36b15c79aa virtio-scsi: start preparing for any_layout
- Introduce virtio_scsi_init_req and virtio_scsi_free_req

- rename qemu_sgl_init_external to qemu_sgl_concat

- move virtio_scsi_parse_req from virtio_scsi_pop_req to callers
  and add header length checks to virtio_scsi_parse_req.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Paolo Bonzini
519661ee65 util: add return value to qemu_iovec_concat_iov
This will be necessary later to recognize the case where a
request has both dataout and datain.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Paolo Bonzini
1016b239c5 megasas: use PCI DMA API
MegaSAS emulation is not IOMMU-friendly.  Fix this by switching to
pci_dma_* functions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Alexey Kardashevskiy
b9e77bc718 scsi: Print command name in debug
This makes scsi_command_name() public.

This makes use of scsi_command_name() in debug output for scsi-disk and
spapr-vscsi host bus adapter. Before this, SCSI used to print hex numbers
instead of human-friendly strings.

This adds GET_EVENT_STATUS_NOTIFICATION and READ_DISC_INFORMATION to
the list of SCSI commands supported by scsi_command_name().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Ulrich Obergfell
2fe5a9f73b scsi-disk: fix bug in scsi_block_new_request() introduced by commit 137745c
This patch fixes a bug in scsi_block_new_request() that was introduced
by commit 137745c5c6. If the host cache
is used - i.e. if BDRV_O_NOCACHE is _not_ set - the 'break' statement
needs to be executed to 'fall back' to SG_IO.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Paul Janzen
4525c1337f scsi-disk.c: Fix compilation with -DDEBUG_SCSI
In scsi-disk.c, if you #define DEBUG_SCSI=1, you get:
hw/scsi/scsi-disk.c: In function 'scsi_disk_emulate_command':
hw/scsi/scsi-disk.c:2018: error: 'SCSIRequest' has no member named 'buf'

Change the debugging statement to match the actual value tested.

Signed-off-by: Paul Janzen <pcj@pauljanzen.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Peter Lieven
9281fe9eea block/iscsi: use 16 byte CDBs only when necessary
this patch changes the driver to uses 16 Byte CDBs for
READ/WRITE only if the target requires 64bit lba addressing.

On one hand this saves 6 bytes in each PDU on the other
hand it seems that 10 Byte CDBs seems to be much better
supported and tested as a recent issue I had with a
major storage supplier lined out.

For WRITESAME the logic is a bit more tricky as WRITESAME10
with UNMAP was added really late. Thus a fallback to WRITESAME16
is possible if it supports UNMAP and WRITESAME10 not.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Peter Lieven
fcd470d857 block/iscsi: fix potential segfault on early callback
it might happen in the future that a function directly invokes its callback.
In this case we end up in a segfault because the iTask is gone when the BH
is scheduled.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Peter Lieven
efc6de0d0e block/iscsi: handle BUSY condition
this patch adds handling of BUSY status reponse from an iSCSI target.
Currently, we fail with -EIO in case of SCSI_STATUS_BUSY while the
obvious reaction would be to retry the operation after some time.
The retry time is randomly choosen from a range with exponential
growth increasing with each retry.

This patch includes most of the changes by a an upcoming patch
from Stefan Hajnoczi:

 iscsi: implement .bdrv_detach/attach_aio_context()

because I also need the reference to the aio_context for
the retry timer to work. I included the changes to maintain
better mergeability.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Peter Crosthwaite
67891b8a85 memory: MemoryRegion: factor out memory region re-adder
memory_region_set_address is mostly just a function that deletes and
re-adds a memory region. Factor this generic functionality out into a
re-usable function. This prepares support for further QOMification
of MemoryRegion.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 17:11:13 +02:00
Peter Crosthwaite
0598701a49 memory: MemoryRegion: factor out subregion add functionality
Split off the core looping code that actually adds subregions into
it's own fn. This prepares support for Memory Region qomification
where setting the MR address or parent via QOM will back onto this more
minimal function.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Rename new function. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 17:11:10 +02:00
Peter Maydell
0360fbd076 Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream:
  User mode support for Linux ELF files with no section header
  linux-user: Return correct errno for unsupported netlink socket
  linux-user: Don't overrun guest buffer in sched_getaffinity
  linux-user/uname: Return correct uname string for x86_64
  linux-user: fix gcc-4.9 compiler error on __{get,put]}_user
  signal/ppc/do_setcontext remove __get_user return check
  signal/sparc64_set_context: remove __get_user checks
  signal/ppc/{save,restore}_user_regs remove __put/get error checks
  signal/all/setup_frame remove __put_user checks
  signal/all/do_sigreturn - remove __get_user checks
  signal/all/do_sigaltstack remove __get_user value check
  signal/sparc/restore_fpu_state: remove
  signal/all: remove return value from restore_sigcontext
  signal/all: remove return value from setup_sigcontext
  signal/all: remove return value from copy_siginfo_to_user
  signal/x86/setup_frame: __put_user cleanup
  signal/all: remove __get/__put_user return value reading

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 16:08:06 +01:00
Sergey Fedorov
c9299e2fe7 qtest: fix qtest_clock_warp() for no deadline case
Use dedicated qemu_soonest_timeout() instead of MIN().

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Peter Crosthwaite
a656e22f09 exec: dummy_section: Pass address space through.
Rather than use the global singleton.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Peter Crosthwaite
3fb5bf5730 memory: Simplify mr_add_subregion() if-else
This if else is not needed. The previous call to memory_region_add
(whether _overlap or not) will always set priority and may_overlap
to desired values. And its not possible to get here without having
called memory_region_add_subregion due to the null guard on parent.
So we can just directly call memory_region_add_subregion_common.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Gonglei
4dc5615223 memory: Don't update all memory region when ioeventfd changed
memory mappings don't rely on ioeventfds, there is no need
to destroy and rebuild them when manipulating ioeventfds,
otherwise it scarifies performance.

according to testing result, each ioeventfd deleing needs
about 5ms, within which memory mapping rebuilding needs
about 4ms. With many Nics and vmchannel in a VM doing migrating,
there can be many ioeventfds deleting which increasing
downtime remarkably.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Herongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Hu Tao
b0e56e0b63 unset RAMBlock idstr when unregister MemoryRegion
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Hu Tao
20cfe8810d exec: introduce qemu_ram_unset_idstr() to unset RAMBlock idstr
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Paolo Bonzini
01a9c03c39 MAINTAINERS: Add myself as Memory API maintainer
I'm not including Avi since he has already removed himself from the
KVM entry.  I'm not going to commit my patches without review.

Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-17 16:07:37 +02:00
Craig Heffner
d3606f0744 User mode support for Linux ELF files with no section header
In user mode Linux, Qemu currently refuses to load ELF files that do not
contain section headers (ehdr->e_shentsize == 0). Since section headers are not
required in order to load an ELF file, simply removing the e_shentsize check in
elf_check_ehdr() allows ELF binaries with no section headers to be run properly
in user mode:

Signed-off-by: Craig Heffner <cheffner@tacnetsol.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-17 09:21:41 +03:00
Ed Swierk
480eda2eda linux-user: Return correct errno for unsupported netlink socket
This fixes "Cannot open audit interface - aborting." when the
EAFNOSUPPORT errno differs between the target and host
architectures (e.g. mips target and x86_64 host).

Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-17 09:21:41 +03:00
Peter Maydell
be3bd286bc linux-user: Don't overrun guest buffer in sched_getaffinity
If the guest's "long" type is smaller than the host's, then
our sched_getaffinity wrapper needs to round the buffer size
up to a multiple of the host sizeof(long). This means that when
we copy the data back from the host buffer to the guest's
buffer there might be more than we can fit. Rather than
overflowing the guest's buffer, handle this case by returning
EINVAL or ignoring the unused extra space, as appropriate.

Note that only guests using the syscall interface directly might
run into this bug -- the glibc wrappers around it will always
use a buffer whose size is a multiple of 8 regardless of guest
architecture.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-17 09:21:41 +03:00
Peter Maydell
4d13be8b8b linux-user/uname: Return correct uname string for x86_64
We were returning the incorrect uname string (with a hyphen, not
an underscore) for x86_64. Fix this by removing the x86_64 special
case, since the default "just use UNAME_MACHINE" behaviour suffices.
This leaves cpu_to_uname_machine() special cases for only those
architectures which need to vary the string based on runtime CPU
features.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2014-06-17 09:21:40 +03:00
Riku Voipio
a42267ef58 linux-user: fix gcc-4.9 compiler error on __{get,put]}_user
gcc-4.9 finds unused operand:

linux-user/syscall.c: In function ‘host_to_target_stat64’:
linux-user/qemu.h:301:19: error: right-hand operand of comma expression
has no effect [-Werror=unused-value]
      ((hptr), (x)), 0)

Just removing the rh operand is no good, it will error in later:

linux-user/main.c: In function ‘arm_kernel_cmpxchg64_helper’:
linux-user/qemu.h:330:15: error: void value not ignored as it ought to be
         __ret = __put_user((x), __hptr);    \

Thus, remove setting __ret from __get_user and __put_user, as and
set the right hand operand to (void)0 to make it clear that these
return never nothing.

This commit depends on the signal.c cleanup, to ensure bisectable
version history.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <rth@twiddle.net>
2014-06-17 08:52:08 +03:00
Riku Voipio
9e918dc927 signal/ppc/do_setcontext remove __get_user return check
The last remaining check for return value of __get_user.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Alexander Graf <agraf@suse.de>
2014-06-17 08:52:08 +03:00
Riku Voipio
be3ef5c7fa signal/sparc64_set_context: remove __get_user checks
Remove checks of __get_user and the err variable
used to control flow with it.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:08 +03:00
Riku Voipio
c650c008e3 signal/ppc/{save,restore}_user_regs remove __put/get error checks
As __get_user and __put_user do not return errors, remove the
if checks from around them. This allows making the save/restore
functions void.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Alexander Graf <agraf@suse.de>
2014-06-17 08:52:07 +03:00
Riku Voipio
0188fadb7f signal/all/setup_frame remove __put_user checks
Remove "if(__put_user" checks and their related error paths
for all architecture's setup_frame, setup_rt_frame and similar.

Remove the unlock_user_struct when the only way to end up there is
from failed lock_user_struct.

Remove err variable if there are no users for it in the function
anymore.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
f5f601afce signal/all/do_sigreturn - remove __get_user checks
Remove "if(__get_user" checks and their related error paths
for all architecture's do_sigreturn. Remove the unlock_user_struct
when the only way to end up there is from failed lock_user_struct.

v3: remove unneccesary sigsegv label as suggested by Peter

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
9eeb8306d5 signal/all/do_sigaltstack remove __get_user value check
Access is already checked in the lock_user_struct
call before.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
945473847b signal/sparc/restore_fpu_state: remove
A function never called from anywhere, obviously half-complete.
Remove function and if someone wants to complete this, please
check the old version out of git history.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
016d2e1dfa signal/all: remove return value from restore_sigcontext
make most implementations of restore_sigcontext void and
remove checking it's return value from functions calling
restore_sigcontext.

The exception is the X86 version of the function that is
too different from others to deal in this way, and arm
version, to keep possibility of erroring out from failed
valid_user_regs.

v3: keep arm valid_user_regs for filling in near future.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
41ecc72ba5 signal/all: remove return value from setup_sigcontext
Make all implementations of setup_sigcontext void and
remove checking it's return value from functions calling
setup_sigcontext.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
b0fd8d1868 signal/all: remove return value from copy_siginfo_to_user
Since copy_siginfo_to_user always returns 0, make it void
and remove any checks for return value from calling functions.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
7df2fa3623 signal/x86/setup_frame: __put_user cleanup
Remove the remaining check for __put_user return
value, and all the checks for err variable which
isn't set anywhere anymore.

No we can only end up in give_sigsegv due to failed
lock_user_struct - thus we remove the unlock_user_struct
to avoid unlocking a region never locked.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Riku Voipio
1d8b512bbc signal/all: remove __get/__put_user return value reading
Remove all the simple cases of reading the return value
of __get_user and __put_user.

We set err = 0 in sparc versions of do_sigreturn and
sparc64_set_context to avoid compile error, but else this patch is
just general removal of err |= __get_user ... idiom.

v2: remove err variable from target_rt_restore_ucontext

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-17 08:52:07 +03:00
Peter Maydell
af44da87e9 Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-06-16

This pull request brings a lot of fun things. Among others we have

  - e500: u-boot firmware support
  - sPAPR: magic page enablement
  - sPAPR: add "compat" CPU option to support older guests
  - sPAPR: refactorings in preparation for VFIO
  - POWER8 live migration
  - mac99: expose bus frequency
  - little endian core dump, gdb and disas support
  - new ppc64le-linux-user target
  - DFP emulation
  - bug fixes

# gpg: Signature made Mon 16 Jun 2014 12:28:32 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream: (156 commits)
  spapr_pci: Advertise MSI quota
  PPC: KVM: Make pv hcall endian agnostic
  powerpc: use float64 for frsqrte
  spapr: Add kvm-type property
  spapr: Create SPAPRMachine struct
  linux-user: Tell guest about big host page sizes
  spapr_hcall: Add address-translation-mode-on-interrupt resource in H_SET_MODE
  spapr_hcall: Split h_set_mode()
  target-ppc: Enable DABRX SPR and limit it to <=POWER7
  target-ppc: Enable PPR and VRSAVE SPRs migration
  target-ppc: Add POWER8's Event Based Branch (EBB) control SPRs
  KVM: target-ppc: Enable TM state migration
  target-ppc: Add POWER8's TM SPRs
  target-ppc: Add POWER8's MMCR2/MMCRS SPRs
  target-ppc: Enable FSCR facility check for TAR
  target-ppc: Add POWER8's FSCR SPR
  target-ppc: Add POWER8's TIR SPR
  target-ppc: Refactor class init for POWER7/8
  target-ppc: Switch POWER7/8 classes to use correct PMU SPRs
  target-ppc: Make use of gen_spr_power5p_lpar() for POWER7/8
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-16 18:26:21 +01:00
Paolo Bonzini
f27701510c rules.mak: remove $(sort) from extract-libs
Duplicate removal was added to extract-libs in order to avoid including
the same library multiple times into the linking command line; this could
potentially happen when using "foo.mo-libs" (which adds the library to
all components, causing it to appear N times if the module is composed
of N objects).  However, sorting and removing duplicates causes problems
with static linking, and also with space-separated linker options as
found in some Mac OS X packaging systems.  Furthermore, the "optimization"
is really a non-problem since we do not expect .mo modules to be composed
of many files.

Reported-by: Sean Bruno <sbruno@ignoranthack.me>
Tested-by: Sean Bruno <sbruno@ignoranthack.me>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1402929805-16836-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-16 16:01:50 +01:00
Peter Maydell
84219c5a21 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Mon 16 Jun 2014 12:22:22 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (39 commits)
  QemuOpts: cleanup tmp 'allocated' member from QemuOptsList
  cleanup QEMUOptionParameter
  vpc.c: replace QEMUOptionParameter with QemuOpts
  vmdk.c: replace QEMUOptionParameter with QemuOpts
  vhdx.c: replace QEMUOptionParameter with QemuOpts
  vdi.c: replace QEMUOptionParameter with QemuOpts
  ssh.c: replace QEMUOptionParameter with QemuOpts
  sheepdog.c: replace QEMUOptionParameter with QemuOpts
  rbd.c: replace QEMUOptionParameter with QemuOpts
  raw_bsd.c: replace QEMUOptionParameter with QemuOpts
  raw-win32.c: replace QEMUOptionParameter with QemuOpts
  raw-posix.c: replace QEMUOptionParameter with QemuOpts
  qed.c: replace QEMUOptionParameter with QemuOpts
  qcow2.c: replace QEMUOptionParameter with QemuOpts
  QemuOpts: export qemu_opt_find
  qcow.c: replace QEMUOptionParameter with QemuOpts
  nfs.c: replace QEMUOptionParameter with QemuOpts
  iscsi.c: replace QEMUOptionParameter with QemuOpts
  gluster.c: replace QEMUOptionParameter with QemuOpts
  cow.c: replace QEMUOptionParameter with QemuOpts
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-16 12:27:47 +01:00
Badari Pulavarty
9dbae97723 spapr_pci: Advertise MSI quota
Hotplug of multiple disks fails due to MSI vector quota check.
Number of MSI vectors default to 8 allowing only 4 devices.
This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback
to INTX.

One way to workaround the issue is to increase total MSIs,
so that MSI quota check allows us to hotplug multiple disks.

This sets the quota to the maximum number of interupts XICS has
which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c
to xics.h for wider visibility.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
[aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Alexander Graf
d13fc32ecf PPC: KVM: Make pv hcall endian agnostic
There were a few revisions of the Linux kernel that incorrectly swapped
the hcall instructions when they saw ePAPR compliant hypercalls.

We already have fixups for those in place when running with PR KVM, but
HV KVM and systems that don't implement hypercalls at all are still broken
because they fall back to the QEMU implementation of fallback hypercalls.

So let's make the fallback hypercall instruction path endian agnostic. This
only really works well for 64bit guests, but I don't think there are any 32bit
systems left that don't implement real pv hcall support, so we'll never get
into this code path.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Tristan Gingold
e223bcad6e powerpc: use float64 for frsqrte
Remove the code that reduce the result to float32 as the frsqrte
instruction is defined to return a double-precision estimate of
the reciprocal square root.

Although reducing the fractional part is harmless (as the estimation
must have at least 12 bits of precision according to the old PEM),
reducing the exponent range is not correct.

Signed-off-by: Tristan Gingold <gingold@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Eduardo Habkost
23825581d7 spapr: Add kvm-type property
The kvm-type machine option was left out when MachineState was
introduced, preventing the kvm-type option from being used. Add the
missing property to the sPAPR machine class, so it can be used.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Eduardo Habkost
748abce94f spapr: Create SPAPRMachine struct
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Alexander Graf
a70daba377 linux-user: Tell guest about big host page sizes
We tell the guest its page size via AUX vectors. The guest process then uses
this page size as information on which boundaries it can mmap() things.

However, if the host has a bigger page size granularity than the guest, it can
not fulfill these mmap() requests - which falls apart when MAP_FIXED is passed
to mmap.

So in that case, let the guest know that we're running on a bigger page size
granularity than the target would require.

This fixes running qemu-ppc (TARGET_PAGE_SIZE=4k) on a 64k page size ppc64 host
for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
d5ac4f5433 spapr_hcall: Add address-translation-mode-on-interrupt resource in H_SET_MODE
This adds handling of the RESOURCE_ADDR_TRANS_MODE resource from
the H_SET_MODE, for POWER8 (PowerISA 2.07) only.

This defines AIL flags for LPCR special register.

This changes @excp_prefix according to the mode, takes effect in TCG.

This turns support of a new capability PPC2_ISA207S flag for TCG.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
c4015bbd50 spapr_hcall: Split h_set_mode()
This moves H_SET_MODE_RESOURCE_LE handler to a separate function
as there are other "resources" coming and this is going to become ugly.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
cd9adfdd77 target-ppc: Enable DABRX SPR and limit it to <=POWER7
This adds DABRX SPR.

As DABR(X) are present in POWER CPUs till POWER7 only and POWER8 does not
have them (as it implements more powerful facility instead), this limits
DABR/DABRX registration by POWER7 (inclusive).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
7303f83db6 target-ppc: Enable PPR and VRSAVE SPRs migration
This hooks SPR with their "KVM set_one_reg" counterparts which enables
their migration.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
4ee4a03b38 target-ppc: Add POWER8's Event Based Branch (EBB) control SPRs
POWER8 supports Event-Based Branch Facility (EBB). It is controlled via
set of SPRs access to which should generate an "Facility Unavailable"
interrupt if the facilities are not enabled in FSCR for problem state.

This adds EBB SPRs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
80b3f79b99 KVM: target-ppc: Enable TM state migration
This adds migration support for registers saved before Transactional
Memory (TM) transaction started.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
cdcdda27fc target-ppc: Add POWER8's TM SPRs
This adds TM (Transactional Memory) SPRs.

This adds generic spr_read_prev_upper32()/spr_write_prev_upper32() to
handle upper half SPRs such as TEXASRU which is upper half of TEXASR.
Since this is not the only register like that and their numbers go
consequently, it makes sense to generalize the helpers.

This adds a gen_msr_facility_check() helper which purpose is to generate
the Facility Unavailable exception if the facility is disabled.
It is a copy of gen_fscr_facility_check() but it checks for enabled
facility in MSR rather than FSCR/HFSCR. It still sets the interrupt cause
in FSCR/HFSCR (whichever is passed to the helper).

This adds spr_read_tm/spr_write_tm/spr_read_tm_upper32/spr_write_tm_upper32
which are used for TM SPRs.

This adds TM-relates MSR bits definitions. This enables TM in POWER8 CPU class'
msr_mask.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
70c5340744 target-ppc: Add POWER8's MMCR2/MMCRS SPRs
This adds POWER8 specific PMU MMCR2/MMCRS SPRs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
45ed0be146 target-ppc: Enable FSCR facility check for TAR
This makes user-privileged read/write fail if TAR facility is not enabled
in FSCR.

Since this is the very first check for enabled in FSCR facility,
this also adds gen_fscr_facility_check() for using in spr_write_tar()/
spr_read_tar().

This enables TAR in FSCR for user mode unconditionally.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
7019cb3d88 target-ppc: Add POWER8's FSCR SPR
This adds an FSCR (Facility Status and Control Register) SPR. This defines
names for FSCR bits.

This defines new exception type - POWERPC_EXCP_FU - "facility unavailable" (FU).
This registers an interrupt vector for it at 0xF60 as PowerISA defines.

This adds a TCG helper_fscr_facility_check() helper to raise an exception
if the facility is not enabled. It updates the interrupt cause field
in FSCR. This adds a TCG translation block generation code. The helper
may be used for HFSCR too as it has the same format.

The helper raising FU exceptions is not used by this patch but will be
in the next ones.

This adds gen_update_current_nip() to update NIP in DisasContext.
This helper is not used now and will be called before checking for
a condition for throwing an FU exception.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
d1a721ab81 target-ppc: Add POWER8's TIR SPR
This adds TIR (Thread Identification Register) SPR first defined for server
CPUs in PowerISA 2.07.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
a242881405 target-ppc: Refactor class init for POWER7/8
This extends init_proc_book3s_64 to support POWER7 and POWER8.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
5881c296b9 target-ppc: Switch POWER7/8 classes to use correct PMU SPRs
This replaces gen_spr_7xx() call (which registers 32bit SPRs) with
gen_spr_book3s_pmu() call.

This removes SPR_7XX_PMC5/6 as they are for 32bit and gen_spr_book3s_pmu()
already registers correct PMC5/6 SPRs.

This removes explicit MMCRA registration as gen_spr_book3s_pmu() does it
anyway.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
7fc2db18ce target-ppc: Make use of gen_spr_power5p_lpar() for POWER7/8
This makes use of generic gen_spr_power5p_lpar() which registers LPCR SPR.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
6a1eed3f49 target-ppc: Make use of gen_spr_book3s_altivec() for POWER7/8
This replaces VRSAVE registration and vscr_init() call with
gen_spr_book3s_altivec() which is generic and does the same thing if
insns_flags has PPC_ALTIVEC bit set (which POWER7/8 have set).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:44 +02:00
Alexey Kardashevskiy
5db7d4faa3 target-ppc: Move POWER7/8 CFAR/DSCR/CTRL/PPR/PCR SPR registration to helpers
This moves SCFAR/DSCR/CTRL/PPR/PCR PRs to helpers. Later these helpers
will be called from generalized init_proc_book3s_64().

This switches init_proc_POWER7() to use generalized gen_spr_book3s_common()
which registers CRTL SPR under slightly different names. No change in
behaviour or non-debug output is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
768167abb9 target-ppc: Move POWER8 TCE Address control (TAR) to a helper
This moves TAR SPR to a helper. Later this helper will be
called from generalized init_proc_book3s_64().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
e61716aa9a target-ppc: Move POWER7/8 PIR/PURR/SPURR SPR registration to helpers
This moves PIR/PURR/SPURR SPRs to helpers. Later these helpers will be
called from generalized init_proc_book3s_64().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
83cc6f8c2f target-ppc: Enable PMU SPRs migration
This enabled PMU SPRs migration by hooking hypv privileged versions with
"KVM one reg" IDs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
90618f4f4d target-ppc: Remove check_pow_970FX
After merging 970s into one class, check_pow_970() is used for all of them.
Since POWER5+ is no different in the matter of supported power modes,
let's use the same check_pow() callback for POWER5+ too,

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
7488d481ce target-ppc: Introduce and reuse generalized init_proc_book3s_64()
At the moment every POWER CPU family has its own init_proc_POWERX function.
E500 already has common init function so we try to do the same thing.

This introduces BOOK3S_CPU_TYPE enum with 2 values - 970 and POWER5+.

This introduces generalized init_proc_book3s_64() which accepts a CPU type
as a parameter.

This uses new init function for 970 and POWER5+ CPU classes.

970 and POWER5+ use the same CPU class initialization except 3 things:
1. logical partitioning is controlled by LPCR (POWER5+) and HID4 (970)
SPRs;
2. 970 does not have EAR (External Access Register) SPR and PowerISA 2.03
defines one so keep it only for POWER5+;
3. POWER5+ does not have ALTIVEC so insns_flags does not have PPC_ALTIVEC
flag set and gen_spr_book3s_altivec() won't init ALTIVEC for POWER5+.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
ba88100219 target-ppc: Add HID4 SPR for PPC970
Previously LPCR was registered for the 970 class which was wrong as
it does not have LPCR. Instead, HID4 is used which this patch registers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
c36c97f880 target-ppc: Add PMC7/8 to 970 class
Compared to PowerISA-compliant CPUs, 970 family has most of them plus
PMC7/8 which are only present on 970 but not on POWER5 and later CPUs.

Since we are changing SPRs for Book3s/970 families, let's add them too.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:43 +02:00
Alexey Kardashevskiy
077850b037 target-ppc: Add PMC5/6, SDAR and MMCRA to 970 family
MMCR0, MMCR1, MMCRA, PMC1..6, SIAR, SDAR are defined for 970 and PowerISA
CPUs. Since we are building common infrastructure for SPRs intialization
to share it between 970 and POWER5+/7/..., let's add missing SPRs to
the 970 family. Later rework of CPU class initialization will use those
for all PowerISA CPUs.

This adds new SPRs and enables writing to Uxxxx SPRs from supermode.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexey Kardashevskiy
75b9c321f4 target-ppc: Add "POWER" prefix to MMCRA PMU registers
Since we started adding "POWER" prefix to 64bit PMU SPRs, let's finish
the transition and fix MMCRA and define a supermode version of it.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexey Kardashevskiy
fd51ff6328 target-ppc: Copy and split gen_spr_7xx() for 970
This stops using 7xx common SPRs init function and adds separate set
of helpers for 970.

This does not copy ICTC SPR as neither 970 manual nor PowerISA mention it.

This defines 970/book3s PMU SPRs constants as they differs from the ones
used for 7XX.

This creates 2 helpers for PMU SPRs, one for supermode privileged SPRs and
one for user privileged SPRs as "sup" versions can be shared across
the family while "user" versions will behave different starting POWER8
(which will be addressed later).

This allows writing to Uxxxx SPRs from supermode. spr_write_ureg() is
implemented for this as a copy of already existing spr_read_ureg().

This allows writing to supervisor's SIAR - it used to be disabled
when gen_spr_7xx() was used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexey Kardashevskiy
eb16dd9cc9 target-ppc: Make UCTRL a mirror of CTRL
This changes UCTRL SPR to read from its supermode copy.

This enables reading from UCTRL in user mode.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexey Kardashevskiy
42382f6244 target-ppc: Refactor PPC970
This splits one init_proc_970() into a set of small helpers. Later
init_proc_970() will be generalized and will call different set of helpers
depending on the current CPU class.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexey Kardashevskiy
bbc01ca7f2 target-ppc: Merge 970FX and 970MP into a single 970 class
The differences between classes were:
1. SLB size, was 32 for 970 and 64 for others, should be 64 for all;
2. check_pow() callback, HID0 format is the same so should be the same
0x01C00000 which means "deep nap", "doze" and "nap" bits set;
3. LPCR - 970 does not have it but 970MP had one (by mistake).

This fixes wrong differences and makes one 970 class.

This fixes wrong registration of LPCR which is not present on 970.

This defines HID0 bits and uses them in check_pow_970().

This does not copy MSR_SHV (Hypervisor State, HV) bit from 970FX to
970 class as we do not emulate hypervisor in QEMU anyway.

This does not remove check_pow_970FX now as it is still used by POWER5+
class, this will be addressed later.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexey Kardashevskiy
cb8b8bf840 target-ppc: Rename 7XX/60x/74XX/e600 PMU SPRs
As defined in Linux kernel, PMC*, SIAR, MMCR0/1 have different numbers
for 32 and 64 bit POWERPC. We are going to support 64bit versions too so
let's rename 32bit ones to avoid confusion.

This is a mechanical patch so it does not fix obvious mistake with these
registers in POWER7 yet, this will be fixed later.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Tom Musta
a9e8f4e7df target-ppc: Fix Temporary Variable Leak in bctar
Fix a temporary variable leak detected in the bctar instruction:

   Opcode 13 10 11 (4d910460) leaked temporaries

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:42 +02:00
Alexander Graf
13b6a45565 PPC: e500: Merge 32 and 64 bit SPE emulation
Today we have a lot of conditional code in the SPE emulation depending on
whether we have 64bit GPRs or not.

Unfortunately the assumption that we can just recycle the 64bit GPR
implementation is wrong. Normal SPE implementations maintain the upper 32 bits
on all non-SPE instructions which then only modify the low 32 bits. However
all instructions we model that adhere to the normal SF based switching don't
care whether they operate on 32 or 64 bit registers and just always use the full
64 bits.

So let's remove that dubious SPE optimization and revert everything to the same
code path the 32bit target code was taking. That way we get rid of differences
between the two implementations, but will get a slight performance hit when
emulating SPE instructions.

This fixes SPE emulation with qemu-system-ppc64 for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexander Graf
f7d6914654 PPC: spapr: Expose /hypervisor node in device tree
PR KVM supports an ePAPR compliant hypercall interface in parallel to the
normal sPAPR one. Expose the ePAPR /hypervisor node and properties to the
guest so it can use it.

This enables magic page sharing on PR KVM with -M pseries.

However we had a few nasty bugs in the magic page implementation on vcpus
newer than 970 (p7, p8) that KVM now has workarounds for. It indicates that
it does have these workarounds through the PPC_FIXUP_HCALL capability.

To not expose broken guest kernels to issues on host kernels that don't
have the fixups in place, we don't expose working hypercall instructions
when the fixups are not available so that the guest can never active the
magic page.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexander Graf
87a91de61a KVM: PPC: Expose fixup hcall capability
New kvm versions expose a PPC_FIXUP_HCALL capability. Make it visible to
machine code so we can take decisions based on it.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexander Graf
b061808d39 linux-headers: update linux headers to kvm/next
This updates the kvm headers to commit 820b3fcd in kvm/next.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexander Graf
2872e1929b linux-headers: include psci.h
The kvm headers now have a dependency on psci.h, sync it into our linux
header copy as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexander Graf
ada82b537e PPC: SPE: Fix high-bits bitmask
The SPE emulation code wants to access the highest 32bits of a 64bit register
and uses the andi TCG instruction for that. Unfortunately it masked with the
wrong mask. Fix the mask to actually cover the upper 32 bits.

This fixes simple multiplication tests with SPE guests for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexander Graf
deb6ed13eb PPC: e500: Fix TLB lookup for 32bit CPUs
When we run 32bit guest CPUs (or 32bit guest code on 64bit CPUs) on
qemu-system-ppc64 the TLB lookup will use the full effective address
as pointer.

However, only the first 32bits are valid when MSR.CM = 0. Check for
that condition.

This makes QEMU boot an e500v2 guest with more than 1G of RAM for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Peter Maydell
f2e2bc9ca0 hw/pci-host/ppce500: Fix typo in vmstate definition
Fix a typo in the ppce500_pci vmstate definition which meant that
we were migrating the struct pci_inbound using the vmstate for
pci_outbound. Fortunately the two structures have exactly the same
format at the moment (four uint32_ts) so this was harmless, and
we can correcting the typo without a migration compatibility
break because the vmstate name doesn't go out on the wire.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Tom Musta
4b1daa72d3 target-ppc: Store Quadword Conditional Drops Size Bit
The size and register information are encoded into the reserve_info field
of CPU state in the store conditional translation code.  Specifically, the
size is shifted left by 5 bits (see target-ppc/translate.c gen_conditional_store).

The user-mode store conditional code erroneously extracts the size by ANDing
with a 4 bit mask; this breaks if size >= 16.

Eliminate the mask to make the extraction of size mirror its encoding.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Tom Musta
f46e9a0b99 target-ppc: Confirm That .bss Pages Are Valid
The existing code does a check to ensure that a .bss region is properly
mmap'd.  When additional mmap is required, the (guest) pages are also
validated.  However, this code has a bug: when host page size is larger
than target page size, it is possible for the .bss pages to already be
(host) mapped but the guest .bss pages may not be valid.

The check to mmap additional space is separated from the flagging of the
target (guest) pages, thus ensuring that both aspects are done properly.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Tom Musta
5b274ed74d target-ppc: Support VSX in PPC User Mode
Some modern tool chains use VSX instructions.  Therefore attempt to enable the VSX MSR
bit by default, just like similar bits (FP, VEC, SPE, etc.).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Doug Kwan
9c35126c56 target-ppc: Add a new user mode target for little-endian PPC64.
Signed-off-by: Doug Kwan <dougkwan@google.com>
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Doug Kwan
e22c357b3e target-ppc: Allow little-endian user mode.
This allows running PPC64 little-endian in user mode if target is configured
that way.  In PPC64 LE user mode we set MSR.LE during initialization.

Signed-off-by: Doug Kwan <dougkwan@google.com>
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Doug Kwan
d90b94cd78 target-ppc: Support little-endian PPC64 in user mode.
Look at ELF header to determine ABI version on PPC64.  This is required
for executing the first instruction correctly.  Also print correct machine
name in uname() system call.

Signed-off-by: Doug Kwan <dougkwan@google.com>
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Alex Zuepke
a721d390b3 PPC: e500: Fix MMUCSR0 emulation
A  "mtspr SPRMMUCSR0, reg"  always flushed TLB0,
because it passed the SPR number 0x3f4 to the flush routine.
But we want to flush either TLB0 or TBL1 depending on the GPR value.

Signed-off-by: Alex Zuepke <alexander.zuepke@hs-rm.de>
[agraf: change subject line, fix TCGv size mismatch]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:40 +02:00
Alexey Kardashevskiy
1b8eceee28 spapr_iommu: Introduce bus_offset in sPAPRTCETable
This adds @bus_offset into sPAPRTCETable to tell where TCE table starts
from. It is set to 0 for emulated devices. Dynamic DMA windows will use
other offset.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
650f33adbd spapr_iommu: Introduce page_shift in sPAPRTCETable
At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR
spec allows other page sizes and we are going to implement them, we need
page size to be configrable.

This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT
with it where it is possible.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
523e7b8ab8 spapr_iommu: Get rid of window_size in sPAPRTCETable
This removes window_size as it is basically a copy of nb_table
shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are
going to support windows as big as the entire RAM and this number
will be bigger that 32 capacity, we will have to do something
about @window_size anyway and removal seems to be the right way to go.

This removes dma_window_start/dma_window_size from sPAPRPHBState as
they are no longer used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
e4c35b78bc spapr_iommu: Convert old qdev_init_nofail() to object_property_set_bool
qdev_init_nofail() was replaced by object_property_set_bool("realized")
all over the QEMU so do we.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
e28c16f61f spapr_pci: Allow multiple TCE tables per PHB
At the moment sPAPRPHBState contains a @tcet pointer to the only
TCE table. However sPAPR spec allows having more than one DMA window.

Since the TCE object is already a child of SPAPR PHB object, there is
no need to keep an additional pointer to it in sPAPRPHBState so remove it.

This changes the way sPAPRPHBState::reset performs reset of sPAPRTCETable
objects.

This changes the default DMA window properties calculation.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
cca7fad576 spapr_pci: spapr_iommu: Make DMA window a subregion
Currently the default DMA window is represented by a single MemoryRegion.
However there can be more than just one window so we need
a "root" memory region to be separated from the actual DMA window(s).

This introduces a "root" IOMMU memory region and adds a subregion for
the default DMA 32bit window. Following patches will add other
subregion(s).

This initializes a default DMA window subregion size to the guest RAM
size as this window can be switched into "bypass" mode which implements
direct DMA mapping.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
da6ccee418 spapr_pci: Introduce a finish_realize() callback
The spapr-pci PHB initializes IOMMU for emulated devices only.
The upcoming VFIO support will do it different. However both emulated
and VFIO PHB types share most of the initialization code.
For the type specific things a new finish_realize() callback is
introduced.

This introduces sPAPRPHBClass derived from PCIHostBridgeClass and
adds the callback pointer.

This implements finish_realize() for emulated devices.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: Fix compilation]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
da95324ebe spapr_iommu: Enable multiple TCE requests
Currently only single TCE entry per request is supported (H_PUT_TCE).
However PAPR+ specification allows multiple entry requests such as
H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host
kernel via ioctls, support of these calls can accelerate IOMMU operations.

This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT.

This advertises "multi-tce" capability to the guest if the host kernel
supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
a1d59c0ffa spapr: Enable dynamic change of the supported hypercalls list
At the moment the "ibm,hypertas-functions" list is fixed. However some
calls should be listed there if they are supported by QEMU or the host
kernel.

This enables hyperrtas_prop to grow on stack by adding
a SPAPR_HYPERRTAS_ADD macro. "qemu,hypertas-functions" is converted as well.

The first user of this is going to be a "multi-tce" property.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexander Graf
9397a7c831 macio: Fix timer endianness
The timer registers on our KeyLargo macio emulation are read as byte reversed
from the big endian guest, so we better expose them endian reversed as well.

This fixes initial hickups of booting Mac OS X with -M mac99 for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-16 13:24:38 +02:00
Alexander Graf
3e300fa6ad macio ide: Do remainder access asynchronously
The macio IDE controller has some pretty nasty magic in its implementation to
allow for unaligned sector accesses. We used to handle these accesses
synchronously inside the IO callback handler.

However, the block infrastructure changed below our feet and now it's impossible
to call a synchronous block read/write from the aio callback handler of a
previous block access.

Work around that limitation by making the unaligned handling bits also go
through our asynchronous handler.

This fixes booting Mac OS X for me.

Reported-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Tom Musta
6ab39b1bd3 target-ppc: Fix popcntb Opcode Bug
The popcntb instruction is erroneously encoded with opcode extension (opc1,opc2) = (0x03,0x03).
Bits 21-30 of popcntb are 122 = 0b00011-0b11010 and therefore this should be encoded
as (opc1,opc2) = (0x1A, 0x03).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy
00d4f525ec spapr_iommu: Replace @instance_id with LIOBN for migration
SPAPR IOMMU is a bus-less device and therefore its only ID in
migration stream is an instance id which is not reliable ID
as it depends on the command line parameters order. Since
libvirt may change the order, we need something better than that.

This removes VMSD descriptor from the class definitiion and
registers it with @liobn as an intance ID to let the destination
side find the right device to receive migration data.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy
6db5bb0f54 KVM: PPC: Enable compatibility mode
The host kernel implements a KVM_REG_PPC_ARCH_COMPAT register which
this uses to enable a compatibility mode if any chosen.

This sets the KVM_REG_PPC_ARCH_COMPAT register in KVM. ppc_set_compat()
signals the caller if the mode cannot be enabled by the host kernel.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix TCG compat setting]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy
3794d5482d spapr: Implement processor compatibility in ibm, client-architecture-support
Modern Linux kernels support last POWERPC CPUs so when a kernel boots,
in most cases it can find a matching cpu_spec in the kernel's cpu_specs
list. However if the kernel is quite old, it may be missing a definition
of the actual CPU. To provide an ability for old kernels to work on modern
hardware, a Processor Compatibility Mode has been introduced
by the PowerISA specification.

>From the hardware prospective, it is supported by the Processor
Compatibility Register (PCR) which is defined in PowerISA. The register
enables one of the compatibility modes (2.05/2.06/2.07).
Since PCR is a hypervisor privileged register and cannot be
directly accessed from the guest, the mode selection is done via
ibm,client-architecture-support (CAS) RTAS call using which the guest
specifies what "raw" and "architected" CPU versions it supports.
QEMU works out the best match, changes a "cpu-version" property of
every CPU and notifies the guest about the change by setting these
properties in the buffer passed as a response on a custom H_CAS hypercall.

This implements ibm,client-architecture-support parameters parsing
(now only for PVRs) and cooks the device tree diff with new values for
"cpu-version", "ibm,ppc-interrupt-server#s" and
"ibm,ppc-interrupt-server#s" properties.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy
2a48d99335 spapr: Limit threads per core according to current compatibility mode
This puts a limit to the number of threads per core based on the current
compatibility mode. Although PowerISA specs do not specify the maximum
threads per core number, the linux guest still expects that
PowerISA2.05-compatible CPU supports only 2 threads per core as this
is what POWER6 (2.05 compliant CPU) implements, the same is for
POWER7 (2.06, 4 threads) and POWER8 (2.07, 8 threads).

This calls spapr_fixup_cpu_smt_dt() with the maximum allowed number of
threads which affects ibm,ppc-interrupt-server#s and
ibm,ppc-interrupt-gserver#s properties.

The number of CPU nodesremains unchanged.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy
82677ed2f5 spapr: Rework spapr_fixup_cpu_dt()
In PPC code we usually use the "cs" name for a CPUState* variables
and "cpu" for PowerPCCPU. So let's change spapr_fixup_cpu_dt() to
use same rules as spapr_create_fdt_skel() does.

This adds missing nodes creation if they do not already exist in
the current device tree, this is going to be used from
the client-architecture-support handler.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy
2a6593cb6a spapr: Add ibm, client-architecture-support call
The PAPR+ specification defines a ibm,client-architecture-support (CAS)
RTAS call which purpose is to provide a negotiation mechanism for
the guest and the hypervisor to work out the best compatibility parameters.
During the negotiation process, the guest provides an array of various
options and capabilities which it supports, the hypervisor adjusts
the device tree and (optionally) reboots the guest.

At the moment the Linux guest calls CAS method at early boot so SLOF
gets called. SLOF allocates a memory buffer for the device tree changes
and calls a custom KVMPPC_H_CAS hypercall. QEMU parses the options,
composes a diff for the device tree, copies it to the buffer provided
by SLOF and returns to SLOF. SLOF updates the device tree and returns
control to the guest kernel. Only then the Linux guest parses the device
tree so it is possible to avoid unnecessary reboot in most cases.

The device tree diff is a header with an update format version
(defined as 1 in this patch) followed by a device tree with the properties
which require update.

If QEMU detects that it has to reboot the guest, it silently does so
as the guest expects reboot to happen because this is usual pHyp firmware
behavior.

This defines custom KVMPPC_H_CAS hypercall. The current SLOF already
has support for it.

This implements stub which returns very basic tree (root node,
no properties) to the guest.

As the return buffer does not contain any change, no change in behavior is
expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy
1a68b71419 target-ppc: Define Processor Compatibility Masks
This introduces PCR mask for supported compatibility modes.
This will be used later by the ibm,client-architecture-support call.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy
6d9412ea81 target-ppc: Implement "compat" CPU option
This adds basic support for the "compat" CPU option. By specifying
the compat property, the user can manually switch guest CPU mode from
"raw" to "architected".

This defines feature disable bits which are not used yet as, for example,
PowerISA 2.07 says if 2.06 mode is selected, the TM bit does not matter -
transactional memory (TM) will be disabled because 2.06 does not define
it at all. The same is true for VSX and 2.05 mode. So just setting a mode
must be ok.

This does not change the existing behavior as the actual compatibility
mode support is coming in next patches.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy
833d46685d spapr: Move SMT-related properties out of skeleton fdt
The upcoming support of the "ibm,client-architecture-support"
reconfiguration call will be able to change dynamically the number
of threads per core (SMT mode). From the device tree prospective
this does not change the number of CPU nodes (as it is one node per
a CPU core) but affects content and size of the ibm,ppc-interrupt-server#s
and ibm,ppc-interrupt-gserver#s properties.

This moves ibm,ppc-interrupt-server#s and ibm,ppc-interrupt-gserver#s
out of the device tree skeleton.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy
8dfa3a5e85 target-ppc: Add "compat" CPU option
PowerISA defines a compatibility mode for server POWERPC CPUs which
is supported by the PCR special register which is hypervisor privileged.
To support this mode for guests, SPAPR defines a set of virtual PVRs,
one per PowerISA spec version. When a hypervisor needs a guest to work in
a compatibility mode, it puts a virtual PVR value into @cpu-version
property of a CPU node.

This introduces a "compat" CPU option which defines maximal compatibility
mode enabled. The supported modes are power6/power7/power8.

This does not change the existing behaviour, new property will be used
by next patches.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:37 +02:00
Alexander Graf
af354f19a9 PPC: openpic_kvm: Implement reset
When we trigger a system reset, the in-kernel openpic controller should also
get reset. This happens through a write to the GCR.RESET register which is
the same mechanism a guest would use to manually reset the device.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Paul Janzen
ffd5e9fe02 openpic: Reset IRQ source private members
The openpic emulation code maintains an allowable-CPU's bitmap
("destmask") for each IRQ source which is calculated from the IDR
register value whenever the guest OS writes to it.  However, if the
guest OS relies on the system to set the IDR register to a default
value at reset, and does not write IDR, then destmask does not get
updated, and interrupts do not get propagated to the guest.
Additionally, if an IRQ source is marked as critical, the source's
internal "output" and "nomask" fields are not correctly reset when the
PIC is reset.

Fix both these issues by calling write_IRQreg_idr from within
openpic_reset, instead of simply setting the IDR register to the
specified idr_reset value.

Signed-off-by: Paul Janzen <pcj@pauljanzen.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Paul Janzen
8ebe65f361 openpic: Move definition of openpic_reset
This patch moves the definition of openpic_reset after the various
register read/write functions. No functional change.  It is in
preparation for using the register read/write functions in
openpic_reset.

Signed-off-by: Paul Janzen <pcj@pauljanzen.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Bharata B Rao
1e6ed54ef8 target-ppc: Set the correct endianness in ELF dump header
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Greg Kurz
382d2db62b target-ppc: Introduce callback for interrupt endianness
POWER7, POWER7+ and POWER8 families use the ILE bit of the LPCR
special purpose register to decide the endianness to use when
entering interrupt handlers. When running a Linux guest, this
provides a hint on the endianness used by the kernel. And when
it comes to dumping a guest, the information is needed to write
ELF headers using the kernel endianness.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
[agraf: change subject line]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Bharata B Rao
0c967de9c0 target-ppc: Support dump for little endian ppc64
Fix ppc64 arch specific dump code to support all combinations of little/big
endian hosts/guests. FWIW the current code is broken for altivec registers
when guest and host have a different endianness: these 128-bit registers
are written to guest memory as a two 64-bit entities and we should also swap
them.

Unit testing was done with the following program provided by Tom Musta:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main(int argc, char** argv)
{

__uint128_t v = ((__uint128_t)0x0001020304050607ull << 64) |
0x08090a0b0c0d0e0full;

register void * vptr asm ("r11");
vptr = &v;

for(;;)
asm volatile ("lvx 30,0,11" );
}

When sending SIGABRT to this program and examining the core file, we get:

- ppc64  : 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
- ppc64le: 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00

We expect to find the very same layout in the QEMU dump since they are
real core files. This is what we get:

- ppc64 host, ppc64 guest   : 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
- ppc64 host, ppc64le guest : 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00
- x86_64 host, ppc64 guest  : 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
- x86_64 host, ppc64le guest: 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00

We introduce a NoteFuncArg type to avoid adding extra arguments to all note
functions.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[ rebased on top of current master branch,
  introduced NoteFuncArg,
  use new cpu_to_dump{16,32,64} endian helpers,
  fix altivec support,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Bharata B Rao
acb0ef5801 dump: Make DumpState and endian conversion routines available for arch-specific dump code
Make DumpState and endian conversion routines available for arch-specific dump
code by moving into dump.h. DumpState will be needed by arch-specific dump
code to access target endian information from DumpState->ArchDumpInfo. Also
break the dependency of dump.h from stubs/dump.c by creating a separate
dump-arch.h.

This patch doesn't change any functionality.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
[ rebased on top of current master branch,
  renamed endian helpers to cpu_to_dump{16,32,64},
  pass a DumpState * argument to endian helpers,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
[agraf: fix to apply]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Mark Cave-Ayland
85720d3667 macio: handle non-block ATAPI DMA transfers
Currently the macio DMA routines assume that all DMA requests are for read/write
block transfers. This is not always the case for ATAPI, for example when
requesting a TOC where the response is generated directly in the IDE buffer.

Detect these non-block ATAPI DMA transfers (where no lba is specified in the
command) and copy the results directly into RAM as indicated by the DBDMA
descriptor. This fixes CDROM access under MorphOS.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:36 +02:00
Alexey Kardashevskiy
10582ff832 spapr: Add ibm, chip-id property in device tree
This adds a "ibm,chip-id" property for CPU nodes which should be the same
for all cores in the same CPU socket. The recent guest kernels use this
information to associate threads with sockets.

Refer to the kernel commit 256f2d4b463d3030ebc8d2b54f427543814a2bdc
for more details.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexey Kardashevskiy
98a8b52442 spapr: Add support for time base offset migration
This allows guests to have a different timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.

This adds kvm_access_one_reg() to access a special register which is not
in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch.

The feature must be present in the host kernel.

This bumps vmstate_spapr::version_id and enables new vmstate_ppc_timebase
only for it. Since the vmstate_spapr::minimum_version_id remains
unchanged, migration from older QEMU is supported but without
vmstate_ppc_timebase.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf
3812c71ffa PPC: e500: Move to u-boot as firmware
Almost all platforms QEMU emulates have some sort of firmware they can load
to expose a guest environment that closely resembles the way it would look
like on real hardware.

This patch introduces such a firmware on our e500 platforms. U-boot is the
default firmware for most of these systems and as such our preferred choice.

For backwards compatibility reasons (and speed and simplicity) we skip u-boot
when you use -kernel and don't pass in -bios. For all other combinations like
-kernel and -bios or no -kernel you get u-boot as firmware.

This allows you to modify the boot environment, execute a networked boot through
the e1000 emulation and execute u-boot payloads.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf
4e73c78192 PPC: Add u-boot firmware for e500
This adds a special build of u-boot tailored for the e500 platforms we
emulate. It is based on the current version of upstream u-boot which
contains all the code necessary to drive our QEMU provided machines.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf
903585dec6 PPC: e500: Expose kernel load address in dt
We want to move to a model where firmware loads our kernel. To achieve
this we need to be able to tell firmware where the kernel lies.

Let's copy the mechanism we already use for -M pseries and expose the
kernel load address and size through the device tree.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf
4d09d5291d PPC: Add dcbtls emulation
The dcbtls instruction is able to lock data inside the L1 cache.

Unfortunately we don't emulate any caches, so we have to tell the guest
that its locking attempt failed.

However, by implementing the instruction we at least don't give the
guest a program exception which it definitely does not expect.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf
ea71258da4 PPC: Properly emulate L1CSR0 and L1CSR1
There are 2 L1 cache control registers - one for data (L1CSR0) and
one for instructions (L1CSR1).

Emulate both of them well enough to give the guest the illusion that
it could actually do anything about its caches.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf
d2ea2bf740 PPC: Add L1CFG1 SPR emulation
In addition to the L1 data cache configuration register L1CFG0 there is
also another one for the L1 instruction cache called L1CFG1.

Emulate that one with the same values as the data one.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Alexander Graf
deb05c4c4c PPC: Fix SPR access control of L1CFG0
The L1CFG0 register on e200 and e500 is "User RO" according to the
specifications. So let's make it user readable and world unwritable.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Alexander Graf
45eb56110b PPC: Add definitions for GIVORs
We're missing SPR definitions for GIVORs. Add them to the list of SPRs.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Alexander Graf
f1d9ec8bf7 PPC: Make all e500 CPUs SVR aware
Our pre-e500mc e500 CPU types didn't get instanciated with SVR information,
even though those systems do support the SVR register.

Spawn them with the SVR tag so that they don't get confused when someone tries
to read SPR_SVR.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Alexander Graf
3de3179782 PPC: Fail on leaking temporaries
When QEMU gets compiled with --enable-debug-tcg we can check for temporary
leakage. Implement the necessary target code for this and fail emulation
when we hit a leakage.

This hopefully ensures that we don't get new leaks.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Alexander Graf
c80d1df508 PPC: Fix TCG chunks that don't free their temps
We want to make sure that every instruction cleans up after itself and
clears every temporary it allocated.

While checking whether this is already the case, I came across a few
cases where it isn't. This patch fixes every translation I found that
doesn't free their allocated temporaries.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Bharat Bhushan
3016dca06c PPC: e500: implement PCI INTx routing
This patch adds pci pin to irq_num routing callback.
This callback is called from pci_device_route_intx_to_irq to
find which pci device maps to which irq.
This fix is required for pci-device passthrough using vfio.

Also without this patch we gets below prints

"
  PCI: Bug - unimplemented PCI INTx routing (e500-pcihost)
  qemu-system-ppc64: PCI: Bug - unimplemented PCI INTx routing (e500-pcihost) "

and Legacy interrupt does not work with pci device passthrough.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
[agraf: remove double semicolon]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Bharat Bhushan
d575a6ce0e PPC: e500: some pci related cleanup
- Use PCI_NUM_PINS rather than hardcoding
 - use "pin" wherever possible

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:34 +02:00
Alexander Graf
08215d8fd8 KVM: PPC: Don't secretly add 1T segment feature to CPU
When we select a CPU type that does not support 1TB segments, we should
not expose 1TB just because KVM supports 1TB segments. User configuration
always wins over feature availability.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Tom Musta
c15424531f target-ppc: Refactor AES Instructions
This patch refactors the PowerPC Advanced Encryption Standard (AES) instructions
to use the common AES tables (include/qemu/aes.h).

Specifically:
    - vsbox is recoded to use the AES_sbox table.
    - vcipher, vcipherlast and vncipherlast are all recoded to use the optimized
      AES_t[ed][0-4] tables.
    - vncipher is recoded to use a combination of InvS-Box, InvShiftRows and
      InvMixColumns tables.  It was not possible to use AES_Td[0-4] due to a
      slight difference in how PowerPC implements vncipher.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Tom Musta
59dcd29a6c target-arm: Use Common Tables in AES Instructions
This patch refactors the ARM cryptographic instructions to use the
(newly) added common tables from include/qemu/aes.h.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Tom Musta
04af534d55 target-i386: Use Common ShiftRows and InvShiftRows Tables
This patch eliminates the (now) redundant copy of the Advanced Encryption Standard (AES)
ShiftRows and InvShiftRows tables; the code is updated to use the common tables declared in
include/qemu/aes.h.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Tom Musta
bfd8f5b754 util: Add InvMixColumns
This patch adds the table implementation of the Advanced Encryption Standard (AES)
InvMixColumns transformation.

The patch is intentionally asymmetrical -- the MixColumns table is not added because
there is no known use for it at this time.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Tom Musta
1c1a6d20e0 util: Add AES ShiftRows and InvShiftRows Tables
This patch adds tables that implement the Advanced Encryption Standard (AES) ShiftRows
and InvShiftRows transformations.  These are commonly used in instruction models.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Tom Musta
40c84b54dd util: Add S-Box and InvS-Box Arrays to Common AES Utils
This patch adds tables for the S-Box and InvS-Box transformations commonly used by various
Advanced Encription Standard (AES) instruction models.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Alexey Kardashevskiy
28668b5f31 spapr_pci: fix MSI limit
At the moment XICS does not support interrupts reuse so sPAPR PHB
implements this. sPAPRPHBState holds array of 32 spapr_pci_msi to
describe PCI config address, first MSI and number of MSIs. Once
allocated for a device, QEMU tries reusing this config until the number
of MSIs changes.

Existing SPAPR guests call ibm,change-msi in a loop until the handler
returns the requested number of vectors.

Recently introduced check for the maximum number of MSI/MSIX vectors
supported by a device only works for a device which is new for PHB's
MSI cache. If it is already there, the check is not performed which
leads to new IRQ block allocation. This happens during PCI hotplug
even when the user hot plug the same device which he just hot unplugged.

This moves the check earlier.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
804e654a56 target-ppc: Introduce DFP Shift Significand
Add emulation of the PowerPC Decimal Floating Point Shift Significand
Left Immediate (dscli[q][.]) and DFP Shift Significant Right Immediate
(dscri[q][.]) instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
297666eba0 target-ppc: Introduce DFP Insert Biased Exponent
Add emulation of the PowerPC Decimal Floating Point Insert Biased
Exponent instructions diex[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
e8a4846031 target-ppc: Introduce DFP Extract Biased Exponent
Add emulation of the PowerPC Decimal Floating Point Extract
Biased Exponent instructions dxex[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
013c3ac070 target-ppc: Introduce DFP Encode BCD to DPD
Add emulation of the PowerPC Decimal Floating Point Encode Binary
Coded Decimal to Densely Packed Decimal instructions denbcd[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
7796676fdd target-ppc: Introduce DFP Decode DPD to BCD
Add emulation of the Power PC Decimal Floating Point Decode
Densely Packed Decimal to Binary Coded Decimal instructions
ddedpd[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
bea0dd7912 target-ppc: Introduce DFP Convert to Fixed
Add emulation of the PowerPC Decimal Floating Point Convert to Fixed
instructions dctfix[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:32 +02:00
Tom Musta
f121419355 target-ppc: Introduce DFP Convert to Fixed
Add emulation of the PowerPC Decimal Floating Point Convert to
Fixed instructions dctfix[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
ca603eb4d7 target-ppc: Introduce Round to DFP Short/Long
Add emulation of the PowerPC Round to DFP Short (drsp[.]) and Round to
DFP Long (drdpq[.]) instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
290d9ee537 target-ppc: Introduce DFP Convert to Long/Extended
Add emulation of the PowerPC Convert to DFP Long (dctdp[.]) and
Convert to DFP Extended (dctqpq[.]) instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
97c0d93041 target-ppc: Introduce DFP Round to Integer
Add emulation of the PowerPC Decimal Floating Point (DFP) Round
to FP Integer With Inexact (drintx[q][.]) and DFP Round to FP
Integer Without Inexact (drintn[q][.]) instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
512918aa79 target-ppc: Introduce DFP Reround
Add emulation of the PowerPC Decimal Floating Point Reround instructions
drrnd[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
5826ebe27a target-ppc: Introduce DFP Quantize
Add emulation of the PowerPC Decimal Floating Point Quantize instructions
dquai[q][.] and dqua[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
f6022a7684 target-ppc: Introduce DFP Test Significance
Add emulation of the PowerPC Decimal Floating Point Test Significance
instructions dtstsf[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
f3d2b0bce0 target-ppc: Introduce DFP Test Exponent
Add emulation of the PowerPC Decimal Floating Point Test Exponent
instructions dtstex[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:31 +02:00
Tom Musta
1bf9c0e133 target-ppc: Introduce DFP Test Data Group
Add emulation of the PowerPC Decimal Floating Point Test Data
Group instructions dtstdg[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
e601c1eead target-ppc: Introduce DFP Test Data Class
Add emulation of the PowerPC Decimal Floating Point Test Data Class
instructions dtstdc[q][.].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
5833505be6 target-ppc: Introduce DFP Compares
Add emulation of the PowerPC Decimal Floating Point Compare instructions
dcmpu[q] and dcmpo[q].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
9024ff40ba target-ppc: Introduce DFP Divide
Add emulation of the PowerPC Decimal Floating Point Divide instructions
ddiv[q][.]

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
8de6a1cc67 target-ppc: Introduce DFP Multiply
Add emulation of the PowerPC Decimal Floating Point Multiply instructions
dmul[q][.]

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
2128f8a57e target-ppc: Introduce DFP Subtract
Add emulation of the PowerPC Decimal Floating Point Subtract instructions
dsub[q][.]

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
a9d7ba03b0 target-ppc: Introduce DFP Add
Add emulation of the PowerPC Decimal Floating Point Add instructions dadd[q][.]

Various GCC unused annotations are removed since it is now safe to remove them.

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: move brace in function definition]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
27722744e9 target-ppc: Introduce DFP Post Processor Utilities
Add post-processing utilities to the PowerPC Decimal Floating Point
(DFP) helper code.  Post-processors are small routines that execute
after a preliminary DFP result is computed.  They are used, among other
things, to compute status bits.

This change defines a function type for post processors as well as a
generic routine to run a list (array) of post-processors.

Actual post-processor implementations will be added as needed by specific
DFP helpers in subsequent changes.

Some routines are annotated with the GCC unused attribute in order to
preserve build bisection.  The annotation will be removed in subsequent
patches.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:30 +02:00
Tom Musta
7b0c0d66e5 target-ppc: Introduce DFP Helper Utilities
Add a new file (dfp_helper.c) to the PowerPC implementation for Decimal Floating
Point (DFP) emulation.  This first version of the file declares a structure that
will be used by DFP helpers.  It also implements utilities that will initialize
such a structure for either a long (64 bit) DFP instruction or an extended (128
bit, aka "quad") instruction.

Some utility functions are annotated with the unused attribute in order to preserve
build bisection.

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: Add never reached assert on dfp_prepare_rounding_mode()]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
275e35c6c1 target-ppc: Introduce Decoder Macros for DFP
Add decoder macros for the various Decimal Floating Point
instruction forms.  Illegal instruction masks are used to not only
guard against reserved instruction field use, but also to catch
illegal quad word forms that use odd-numbered floating point registers.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
f0b01f02a4 target-ppc: Introduce Generator Macros for DFP Arithmetic Forms
Add general support for generators of PowerPC Decimal Floating Point helpers.

Some utilities are annotated with GCC attribute unused in order to preserve
build bisection.  These annotations will be removed in later patches.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
a4f27cc82c target-ppc: Define FPR Pointer Type for Helpers
Define a floating pointer register pointer type in the PowerPC
helper header.  The type will be used to pass FPR register operands
to Decimal Floating Point (DFP) helpers.  A pointer is used because
the quad word forms of PowerPC DFP instructions operate on adjacent
pairs of floating point registers and thus can be thought of as
arrays of length 2.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
0a322e7e7c libdecnumber: Fix decNumberSetBCD
Fix a simple bug in the decNumberSetBCD() function.  This function
encodes a decNumber with "n" BCD digits.  The original code erroneously
computed the number of declets from the dn argument, which is the output
decNumber value, and hence may contain garbage.  Instead, the input "n"
value is used.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
79af357225 libdecnumber: Introduce decNumberIntegralToInt64
Introduce a new conversion function to the libdecnumber library.
This function converts a decNumber to a signed 64-bit integer.
In order to support 64-bit integers (which may have up to 19
decimal digits), the existing "powers of 10" array is expanded
from 10 to 19 entries.

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: fix 32bit host compile]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
8e706db21e libdecnumber: Introduce decNumberFrom[U]Int64
Introduce two conversion functions to the libdecnumber library.
These conversions transform 64 bit integers to the internal decNumber
representation.  Both a signed and unsigned version is added.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
e58f8d1ff9 target-ppc: Enable Building of libdecnumber
Enable compilation of the newly added libdecnumber library code.
Object file targets are added to Makefile.target using a newly
introduced flag CONFIG_LIBDECNUMBER.  The flag is added
to the PowerPC targets (ppc[64]-linux-user, ppc[64]-softmmu).

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: add ppcemb and ppc64abi32 config]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:29 +02:00
Tom Musta
4922fd7d52 libdecnumber: Eliminate Unused Variable in decSetSubnormal
Eliminate an unused variable in the decSetSubnormal routine.  The
variable dnexp is declared and eventually set but never used, and
thus may trigger an unused-but-set-variable warning.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Tom Musta
426d9a1a59 libdecnumber: Eliminate redundant declarations
Eliminate redundant declarations of symbols DPD2BIN and BIN2DPD in
various .c source files.  These symbols are already declared in decDPD.h and
thus will trigger 'redundant redeclaration of ?XXX?' warnings, which, of
course, may fail QEMU compilation.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Tom Musta
9b7a14b064 libdecnumber: Change gstdint.h to stdint.h
Replace the inclusion of gstdint.h with the standard stdint.h
header file.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Tom Musta
7275585b8c libdecnumber: Modify dconfig.h to Integrate with QEMU
Modify the dconfig.h header file so that libdecnumber code integrates QEMU
configuration.   Specifically:

  - the WORDS_BIGENDIAN preprocessor macro is used in libdecnumber code to
    determines endianness.  It is derived from the existing QEMU macro
    HOST_WORDS_BIGENDIAN which is defined in config-host.h.

  - the DECPUN macro determines the number of decimal digits (aka declets) per
    unit (byte).  This is 3 for PowerPC DFP.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Tom Musta
0f2d373220 libdecnumber: Prepare libdecnumber for QEMU include structure
Consistent with other libraries in QEMU, the libdecnumber header files were
placed in include/libdecnumber, separate from the C code.  This is different
from the original libdecnumber source, where they were co-located.

Change the libdecnumber source code so that it reflects this split.  Specifically,
modify directives of the form:

    #include "xxx.h"

to look like:

    #include "libdecnumber/xxx.h"

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Tom Musta
f5d7f14646 libdecnumber: Eliminate #include *Symbols.h
The various *Symbols.h files were not copied from the original GCC libdecnumber
library; they are not necessary for use in QEMU.  Remove all instances of

    #include "*Symbols.h"

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Tom Musta
72ac97cdfc libdecnumber: Introduce libdecnumber Code
Add files from the libdecnumber decimal floating point library to QEMU.  The libdecnumber
library was originally part of GCC and contains code that is useful in emulating the PowerPC
decimal floating point (DFP) instructions.  This particular copy of the source comes from
GCC 4.3 and is licensed at GPLv2+.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
BALATON Zoltan
9d1c128341 mac99: Added FW_CFG_PPC_BUSFREQ to match CLOCKFREQ and TBFREQ already there
While there, also moved the hard coded value for CLOCKFREQ to a #define.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:28 +02:00
Alexey Kardashevskiy
569be9f055 target-ppc: Remove PVR check from migration
Currently migration fails if CPU version (PVR register) is different
even a bit. This check is performed at the very end of migration when
device states are sent. This is too late for management software and
we need to provide a way for the user to make sure that migration
will succeed if QEMU is started with appropritate command line parameters.

This removes the PVR check.

This resets PVR to the default value as the existing VMSTATE record
for SPR array sends all 1024 registers unconditionally and overwrites
the destination PVR.

If the user wants some guarantees for migration to succeed, then
a CPU name or "host" CPU with a "compat" option (on its way to upsteam)
should be used and KVM or TCG is expected to fail on unsupported values
at the moment of QEMU start.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Tom Musta
9df5a46632 target-ppc: Eliminate Magic Number MSR Masks
Use MSR mnemonics from cpu.h instead of magic numbers for the CPUPPCState.msr_mask
initialization.

There is one bit in the 401x2 (and subsequent) model that I could not find any
documentation for.  It is open coded at little endian bit position 20:

    pcc->msr_mask = (1ull << 20) |
                    (1ull << MSR_KEY) |
                    (1ull << MSR_POW) |
                    (1ull << MSR_CE) |
                    ...

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Alexey Kardashevskiy
b26696b519 spapr_pci: Fix number of returned vectors in ibm, change-msi
Current guest kernels try allocating as many vectors as the quota is.
For example, in the case of virtio-net (which has just 3 vectors)
the guest requests 4 vectors (that is the quota in the test) and
the existing ibm,change-msi handler returns 4. But before it returns,
it calls msix_set_message() in a loop and corrupts memory behind
the end of msix_table.

This limits the number of vectors returned by ibm,change-msi to
the maximum supported by the actual device.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: qemu-stable@nongnu.org
[agraf: squash in bugfix from aik]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Greg Kurz
fabe9ee113 spapr-pci: remove io ports workaround
In the past, IO space could not be mapped into the memory address space
so we introduced a workaround for that. Nowadays it does not look
necessary so we can remove the workaround and make sPAPR PCI
configuration simplier.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Alexey Kardashevskiy
70d246c335 target-ppc: Remove redundant POWER7 declarations
At the moment there are 3 versions of POWER7 CPUs defined. However
we do not emulate these CPUs diffent and it does not make much
sense to keep them all.

This removes POWER7_v2.0 and POWER7_v2.1 and leaves just one versioned
CPU per family which is POWER7_v2.3 with POWER7 alias.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Alexey Kardashevskiy
fdf8a960e2 target-ppc: Move alias lookup after class lookup
This moves aliases lookup after CPU class lookup. This is to let new generic
CPU to be found first if it is present and only if it is not (TCG case), use
aliases.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Alexey Kardashevskiy
5b79b1cadd target-ppc: Create versionless CPU class per family if KVM
At the moment generic version-less CPUs are supported via hardcoded aliases.
For example, POWER7 is an alias for POWER7_v2.1. So when QEMU is started
with -cpu POWER7, the POWER7_v2.1 class instance is created.

This approach works for TCG and KVMs other than HV KVM. HV KVM cannot emulate
PVR value so the guest always sees the real PVR. HV KVM will not allow setting
PVR other that the host PVR because of that (the kernel patch for it is on
its way). So in most cases it is impossible to run QEMU with -cpu POWER7
unless the host PVR is exactly the same as the one from the alias (which
is now POWER7_v2.3). It was decided that under HV KVM QEMU should use
-cpu host.

Using "host" CPU type creates a problem for management tools such as libvirt
because they want to know in advance if the destination guest can possibly
run on the destination. Since the "host" type is really not a type and will
always work with any KVM, there is no way for libvirt to know if the migration
will success.

This registers additional CPU class derived from the host CPU family.
The name for it is taken from @desc field of the CPU family class.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Thomas Falcon
8a286ce450 target-ppc: gdbstub allow byte swapping for reading/writing registers
This patch allows registers to be properly read from and written to
when using the gdbstub to debug a ppc guest running in little
endian mode.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Thomas Falcon
c46e983106 target-ppc: extract register length calculation in gdbstub
This patch extracts the method to determine a register's size
into a separate function.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:26 +02:00
Alexey Kardashevskiy
4e2ca12785 spapr_nvram: Correct max nvram size
Currently it is UINT16_MAX*16 = 65536*16 = 1048560 which is not
a round number and therefore a bit confusing.

This defines MAX_NVRAM_SIZE precisely as 1MB.

Suggested-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:26 +02:00
Fabien Chouteau
d584348589 Fix typo in eTSEC Ethernet controller
IRQ are lowered when ievent bit is cleared, so irq_pulse makes no sense
here...

Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:26 +02:00
Tom Musta
1c38f84373 monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC LE Mode
The monitor support for disassembling instructions does not honor the MSR[LE]
bit for PowerPC processors.

This change enhances the monitor_disas() routine by supporting a flag bit
for Little Endian mode.  Bit 16 is used since that bit was used in the
analagous guest disassembly routine target_disas().

Also, to be consistent with target_disas(), the disassembler bfd_mach field
can be passed in the flags argument.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:26 +02:00
Tom Musta
e13951f896 target-ppc: Fix target_disas
Inspect only bit 16 for the Little Endian test.  Correct comment preceding
the target_disas() function.  Correct grammar in comment for flags processing.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:26 +02:00
Peter Maydell
0bbac62618 Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20140616' into staging
migration/next for 20140616

# gpg: Signature made Mon 16 Jun 2014 04:10:18 BST using RSA key ID 5872D723
# gpg: Can't check signature: public key not found

* remotes/juanquintela/tags/migration/20140616:
  migration: catch unknown flags in ram_load
  rdma: Fix block during rdma migration
  migration: Increase default max_downtime from 30ms to 300ms
  vmstate: Refactor opening of files
  savevm: Remove all the unneeded version_minimum_id_old (x86)
  savevm: Remove all the unneeded version_minimum_id_old (ppc)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-16 11:06:06 +01:00
Chunyan Liu
98d896d978 QemuOpts: cleanup tmp 'allocated' member from QemuOptsList
Now only qemu_opts_append uses 'allocated' to indicate free memory.
For this function only, we can also let result list's (const char *)
members point to input list's members, only if the input list has
longer lifetime than result list. In current code, that is true.
So, we can remove the 'allocated' member from QemuOptsList definition
to keep code clean.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
c282e1fdf7 cleanup QEMUOptionParameter
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
fec9921f0a vpc.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
5820f1da51 vmdk.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
5366092c7a vhdx.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
004b7f2522 vdi.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
766181fe57 ssh.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
b222237b7b sheepdog.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
bd0cf596fd rbd.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
cd3a4cf631 raw_bsd.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
ddef769993 raw-win32.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
6f482f742d raw-posix.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
7ab74849a5 qed.c: replace QEMUOptionParameter with QemuOpts
One extra change is to define QED_DEFAULT_CLUSTER_SIZE = 65536 instead
of 64 * 1024; because:
according to existing create_options, "cluster size" has default value =
QED_DEFAULT_CLUSTER_SIZE, after switching to create_opts, this has to be
stringized and set to .def_value_str. That is,
  .def_value_str = stringify(QED_DEFAULT_CLUSTER_SIZE),
so the QED_DEFAULT_CLUSTER_SIZE could not be a expression.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
1bd0e2d1c4 qcow2.c: replace QEMUOptionParameter with QemuOpts
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
74c3c19765 QemuOpts: export qemu_opt_find
Export qemu_opt_find for qcow2 driver using it.
After replacing QEMUOptionParameter with QemuOpts, qcow2 driver will
use qemu_opt_find to judge if an option is explicitly set, to replace
the usage of .assigned in QEMUOptionParameter.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu
16d12159e2 qcow.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
98c10b810a nfs.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
a59479e3f3 iscsi.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
90c772de56 gluster.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
25814e8987 cow.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
facdbb0272 vvfat.c: handle cross_driver's create_options and create_opts
vvfat shares create options of qcow driver. To avoid vvfat breaking when
qcow driver changes from QEMUOptionParameter to QemuOpts, let it able
to handle both cases.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
83d0521a1e change block layer to support both QemuOpts and QEMUOptionParamter
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
4782183da3 QemuOpts: check NULL input for qemu_opts_del
To simplify later using of qemu_opts_del, accept NULL input.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
a1097a2614 QemuOpts: add qemu_opts_append to replace append_option_parameters
For later merge .create_opts of drv and proto_drv in qemu-img commands.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
8559e45e51 QemuOpts: add conversion between QEMUOptionParameter to QemuOpts
Add two temp conversion functions between QEMUOptionParameter to QemuOpts,
so that next patch can use it. It will simplify later patch for easier
review. And will be finally removed after all backend drivers switch to
QemuOpts.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
504189a96f QemuOpts: add qemu_opts_print_help to replace print_option_help
print_option_help takes QEMUOptionParameter as parameter, add
qemu_opts_print_help to take QemuOptsList as parameter for later
replace work.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
782730b0bc QemuOpts: add qemu_opt_get_*_del functions for replace work
Add qemu_opt_get_del, qemu_opt_get_bool_del, qemu_opt_get_number_del and
qemu_opt_get_size_del to replace the same handling of QEMUOptionParameter
(get and delete).

Several drivers are coded to parse a known subset of options, then
remove them from the list before handing all remaining options to a
second driver for further option processing.  get_*_del makes it easier
to retrieve a known option (or its default) and remove it from the list
all in one action.

Share common helper function:

For qemu_opt_get_bool/size/number, they and their get_*_del counterpart
could share most of the code except whether or not deleting the opt from
option list, so generate common helper functions.

For qemu_opt_get and qemu_opt_get_del, keep code duplication, since
1. qemu_opt_get_del returns malloc'd memory while qemu_opt_get returns
in-place memory
2. qemu_opt_get_del returns (char *), qemu_opt_get returns (const char *),
and could not change to (char *), since in one case, it will return
desc->def_value_str, which is (const char *).

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
fc345512c5 QemuOpts: move qemu_opt_del ahead for later calling
In later patch, qemu_opt_get_del functions will be added, they will
first get the option value, then call qemu_opt_del to remove the option
from opt list. To prepare for that purpose, move qemu_opt_del ahead first.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
dc8622f2bf QemuOpts: change opt->name|str from (const char *) to (char *)
qemu_opt_del() already assumes that all QemuOpt instances contain
malloc'd name and value; but it had to cast away const because
opts_start_struct() was doing its own thing and using static storage
instead.  By using the correct type and malloced strings everywhere, the
usage of this struct becomes clearer.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
e36af94f86 qapi: output def_value_str when query command line options
Change qapi interfaces to output the newly added def_value_str when querying
command line options.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Leandro Dorileo <l@dorileo.org>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
09722032e1 QemuOpts: add def_value_str to QemuOptDesc
Add def_value_str (default value) to QemuOptDesc, to replace function of the
default value in QEMUOptionParameter.

Improve qemu_opts_get_* functions: if find opt, return opt->str; otherwise,
if desc->def_value_str is set, return desc->def_value_str; otherwise, return
input defval.

Improve qemu_opts_print: if option is set, print opt->str; otherwise, if
desc->def_value_str is set, print it.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu
e67905426b QemuOpts: repurpose qemu_opts_print to replace print_option_parameters
Currently this function is not used anywhere. In later patches, it will
replace print_option_parameters. To avoid print info changes, change
qemu_opts_print from fprintf stderr to printf, and remove last printf.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Chunyan Liu
5e89db7641 QemuOpts: move find_desc_by_name ahead for later calling
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Peter Lieven
a2c0fe2fd2 block/nfs: fix potential segfault on early callback
it will happen in the future that the callback of a libnfs call
directly invokes the callback. In this case we end up in a segfault
because the NFSRPC is gone when we the BH is scheduled.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Markus Armbruster
ae60e8e378 blockdev: Remove unused DriveInfo reference count
It's always one since commit fa510eb dropped the last drive_get_ref().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Markus Armbruster
60e19e06a4 blockdev: Rename drive_init(), drive_uninit() to drive_new(), drive_del()
"Init" and "uninit" suggest the functions don't allocate / free
storage.  But they do.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Kevin Wolf
bcf8315857 blockdev: Move 'serial' option to drive_init()
It is not available with blockdev-add.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Markus Armbruster
f7047c2daf block: Drop superfluous conditionals around g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Stefan Weil
b25c9dff35 configure: Enable dead code (lzo, snappy, quorum)
Those options were not enabled by default, even when the build
environment would have supported them, so the corresponding
code was not compiled in normal test builds like on build bots.

[Building quorum by default "broke" qemu-iotests ./check 081.  It turns
out the 081.out master output was just bitrotted.  Fix this by updating
the error message.
--Stefan]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Peter Lieven
db80facefa migration: catch unknown flags in ram_load
if a saved vm has unknown flags in the memory data qemu
currently simply ignores this flag and continues which
yields in an unpredictable result.

This patch catches all unknown flags and aborts the
loading of the vm. Additionally error reports are thrown
if the migration aborts abnormally.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-16 04:55:27 +02:00
Gonglei
2a93434704 rdma: Fix block during rdma migration
If the networking break or there's something wrong with rdma
device(ib0 with no IP) during rdma migration, the main_loop of
qemu will be blocked in rdma_destroy_id. I add rdma_ack_cm_event
to fix this bug.

Signed-off-by: Mo Yuxiang <Moyuxiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-16 04:55:27 +02:00
Alexey Kardashevskiy
f7cd55a023 migration: Increase default max_downtime from 30ms to 300ms
The existing timeout is 30ms which on 100MB/s (1Gbit) gives us
3MB/s rate maximum. If we put some load on the guest, it is easy to
get page dirtying rate too big so live migration will never complete.
In the case of libvirt that means that the guest will be stopped
anyway after a timeout specified in the "virsh migrate" command and
this normally generates even bigger delay.

This changes max_downtime to 300ms which seems to be more
reasonable value.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-16 04:55:27 +02:00
Juan Quintela
c6f6646c60 vmstate: Refactor opening of files
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
2014-06-16 04:55:27 +02:00
Juan Quintela
d49805aeea savevm: Remove all the unneeded version_minimum_id_old (x86)
After previous Peter patch, they are redundant.  This way we don't
assign them except when needed.  Once there, there were lots of case
where the ".fields" indentation was wrong:

     .fields = (VMStateField []) {
and
     .fields =      (VMStateField []) {

Change all the combinations to:

     .fields = (VMStateField[]){

The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-16 04:55:26 +02:00
Juan Quintela
3aff6c2fea savevm: Remove all the unneeded version_minimum_id_old (ppc)
After previous Peter patch, they are redundant.  This way we don't
assign them except when needed.  Once there, there were lots of case
where the ".fields" indentation was wrong:

     .fields = (VMStateField []) {
and
     .fields =      (VMStateField []) {

Change all the combinations to:

     .fields = (VMStateField[]){

The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2014-06-16 04:55:26 +02:00
Peter Maydell
06a59afac4 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20140613-1' into staging
usb-host: add range checks for usb-host parameters

# gpg: Signature made Fri 13 Jun 2014 12:33:05 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20140613-1:
  usb-host: add range checks for usb-host parameters

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-13 18:18:55 +01:00
Peter Maydell
80008a6a29 Merge remote-tracking branch 'remotes/kraxel/tags/pull-trivial-20140613-1' into staging
inet_listen_opts: add error checking

# gpg: Signature made Fri 13 Jun 2014 12:29:43 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-trivial-20140613-1:
  inet_listen_opts: add error checking

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-13 16:07:04 +01:00
Peter Maydell
592fb17691 Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140613-1' into staging
spice: add mouse cursor support
qxl-render: add sanity check

# gpg: Signature made Fri 13 Jun 2014 12:22:45 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20140613-1:
  qxl-render: add sanity check
  spice: add mouse cursor support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-13 15:15:31 +01:00
Peter Maydell
7d5bef0873 Merge remote-tracking branch 'remotes/kraxel/tags/pull-chardev-20140613-1' into staging
char: fix avail_connections init in qemu_chr_open_eventfd()

# gpg: Signature made Fri 13 Jun 2014 12:16:50 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-chardev-20140613-1:
  char: fix avail_connections init in qemu_chr_open_eventfd()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-13 14:57:56 +01:00
Peter Maydell
d61de7255b Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20140613-1' into staging
audio: Drop superfluous conditionals around g_free()

# gpg: Signature made Fri 13 Jun 2014 12:14:24 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-audio-20140613-1:
  audio: Drop superfluous conditionals around g_free()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-13 12:58:24 +01:00
Gerd Hoffmann
f3cda6e060 usb-host: add range checks for usb-host parameters
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-13 12:34:57 +02:00
Gerd Hoffmann
8bc8912796 inet_listen_opts: add error checking
Don't use atoi() function which doesn't detect errors, switch to
strtol and error out on failures.  Also add a range check while
being at it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2014-06-13 12:34:57 +02:00
Gerd Hoffmann
788fbf042f qxl-render: add sanity check
Verify dirty rectangle is completely within the primary surface,
just ignore it in case it isn't.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-13 12:34:57 +02:00
Gerd Hoffmann
5643fc012c spice: add mouse cursor support
So you'll have a mouse pointer when running non-qxl gfx cards with
mouse pointer support (virtio-gpu, IIRC vmware too).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-13 12:34:57 +02:00
David Marchand
e9d21c436f char: fix avail_connections init in qemu_chr_open_eventfd()
When trying to use a ivshmem server with qemu, ivshmem init code tries to
create a CharDriverState object for each eventfd retrieved from the server.
To create this object, a call to qemu_chr_open_eventfd() is done.
Right after this, before adding a frontend, qemu_chr_fe_claim_no_fail() is
called.
qemu_chr_open_eventfd() does not set avail_connections to 1, so no frontend can
be associated because qemu_chr_fe_claim_no_fail() makes qemu stop right away.

This problem comes from 456d606923
"qemu-char: Call fe_claim / fe_release when not using qdev chr properties".

Fix this, by setting avail_connections to 1 in qemu_chr_open_eventfd().

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-13 12:34:55 +02:00
Markus Armbruster
fb7da626c0 audio: Drop superfluous conditionals around g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-13 12:34:54 +02:00
Peter Maydell
2a2c4830c0 Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20140611-1' into staging
gtk: misc fixes & cleanups.

# gpg: Signature made Wed 11 Jun 2014 13:28:12 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20140611-1:
  gtk: update window size after showing/hiding tabs
  gtk: factor out gtk3 grab into the new gd_grab_devices function
  gtk: cleanup backend dependencies
  gtk: factor out keycode mapping

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-12 09:51:41 +01:00
Peter Maydell
05fedeef83 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  json-parser: drop superfluous assignment for token variable
  readline: Clear screen on form feed.
  monitor: Add delvm and loadvm argument completion
  monitor: Add host_net_remove arguments completion
  readline: Make completion strings always unique
  monitor: Add host_net_add device argument completion
  net: Export valid host network devices list
  monitor: Add migrate_set_capability completion
  monitor: Add watchdog_action argument completion
  monitor: Add ringbuf_write and ringbuf_read argument completion
  dump: simplify get_len_buf_out()
  dump: hoist lzo_init() from get_len_buf_out() to dump_init()
  dump: select header bitness based on ELF class, not ELF architecture
  dump: eliminate DumpState.page_size ("guest's page size")
  dump: eliminate DumpState.page_shift ("guest's page shift")
  dump: simplify write_start_flat_header()
  dump: fill in the flat header signature more pleasingly to the eye

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 21:34:08 +01:00
Peter Maydell
706808585a Merge remote-tracking branch 'remotes/pmaydell/tags/pull-bsd-user-20140611' into staging
bsd-user queue:
 * build fixes
 * improvements to strace

# gpg: Signature made Wed 11 Jun 2014 15:23:40 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-bsd-user-20140611:
  bsd-user: Fix syscall format, add strace support for more syscalls
  bsd-user: Implement strace support for thr_* syscalls
  bsd-user: Implement strace support for extattr_* syscalls
  bsd-user: Implement strace support for __acl_* syscalls
  bsd-user: Implement strace support for print_ioctl syscall
  bsd-user: Implement strace support for print_sysctl syscall
  bsd-user: GPL v2 attribution update and style
  bsd-user: add HOST_VARIANT_DIR for various *BSD dependent code
  exec: replace ffsl with ctzl
  vhost: replace ffsl with ctzl
  xen: replace ffsl with ctzl
  util/qemu-openpty: fix build with musl libc by include termios.h as fallback
  bsd-user/mmap.c: Don't try to override g_malloc/g_free
  util/hbitmap.c: Use ctpopl rather than reimplementing a local equivalent
  bsd-user: refresh freebsd system call numbers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 18:05:21 +01:00
Peter Maydell
c5cb1afc46 Merge remote-tracking branch 'remotes/bonzini/configure' into staging
* remotes/bonzini/configure:
  rules.mak: Rewrite unnest-vars
  configure: unset interfering variables
  configure: duplicate/incorrect order of -lrt
  libcacard: improve documentation
  libcacard: actually use symbols file
  libcacard: replace qemu thread primitives with glib ones
  vscclient: use glib thread primitives not qemu
  glib-compat.h: add new thread API emulation on top of pre-2.31 API

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 15:36:48 +01:00
Gonglei
a491af471b json-parser: drop superfluous assignment for token variable
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
075ccb6cd3 readline: Clear screen on form feed.
Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
b21631f3b5 monitor: Add delvm and loadvm argument completion
Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
ddd6b45ce2 monitor: Add host_net_remove arguments completion
Relies on readline unique completion strings patch to make the added vlan/hub
completion values unique, instead of using something like a hash table.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
e70871d8b5 readline: Make completion strings always unique
There is no need to clutter the user's choices with repeating the same value
multiple times.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
e3bb532cc7 monitor: Add host_net_add device argument completion
Also fix the parameters documentation.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
84007e8181 net: Export valid host network devices list
Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
c68a0409b3 monitor: Add migrate_set_capability completion
Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:29 -04:00
Hani Benhabiles
d0ece345cb monitor: Add watchdog_action argument completion
Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Hani Benhabiles
8e5977797d monitor: Add ringbuf_write and ringbuf_read argument completion
Export chr_is_ringbuf() function. Also remove left-over function prototypes
while at it.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
b87ef3518b dump: simplify get_len_buf_out()
We can (and should) rely on the fact that s->flag_compress is exactly one
of DUMP_DH_COMPRESSED_ZLIB, DUMP_DH_COMPRESSED_LZO, and
DUMP_DH_COMPRESSED_SNAPPY.

This is ensured by the QMP schema and dump_init() in combination.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
c998acb03d dump: hoist lzo_init() from get_len_buf_out() to dump_init()
qmp_dump_guest_memory()
  dump_init()
    lzo_init() <---------+
  create_kdump_vmcore()  |
    write_dump_pages()   |
      get_len_buf_out()  |
        lzo_init() ------+

This patch doesn't change the fact that lzo_init() is called for every
LZO-compressed dump, but it makes get_len_buf_out() more focused (single
responsibility).

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
24aeeace7a dump: select header bitness based on ELF class, not ELF architecture
The specific ELF architecture (d_machine) carries Too Much Information
(TM) for deciding between create_header32() and create_header64(), use
"d_class" instead (ELFCLASS32 vs. ELFCLASS64).

This change adapts write_dump_header() to write_elf_loads(), dump_begin()
etc. that also rely on the ELF class of the target for bitness selection.

Considering the current targets that support dumping, cpu_get_dump_info()
works as follows:
- target-s390x/arch_dump.c: (EM_S390, ELFCLASS64) only
- target-ppc/arch_dump.c (EM_PPC64, ELFCLASS64) only
- target-i386/arch_dump.c: sets (EM_X86_64, ELFCLASS64) vs. (EM_386,
  ELFCLASS32) keying off the same Long Mode Active flag.

Hence no observable change.

Approximately-suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
2f859f80c2 dump: eliminate DumpState.page_size ("guest's page size")
Use TARGET_PAGE_SIZE and ~TARGET_PAGE_MASK instead.

"DumpState.page_size" has type "size_t", whereas TARGET_PAGE_SIZE has type
"int". TARGET_PAGE_MASK is of type "int" and has negative value. The patch
affects the implicit type conversions as follows:

- create_header32() and create_header64(): assigned to "block_size", which
  has type "uint32_t". No change.

- get_next_page(): "block->target_start", "block->target_end" and "addr"
  have type "hwaddr" (uint64_t).

  Before the patch,
  - if "size_t" was "uint64_t", then no additional conversion was done as
    part of the usual arithmetic conversions,
  - If "size_t" was "uint32_t", then it was widened to uint64_t as part of
    the usual arithmetic conversions,
  for the remainder and addition operators.

  After the patch,
  - "~TARGET_PAGE_MASK" expands to  ~~((1 << TARGET_PAGE_BITS) - 1). It
    has type "int" and positive value (only least significant bits set).
    That's converted (widened) to "uint64_t" for the bit-ands. No visible
    change.
  - The same holds for the (addr + TARGET_PAGE_SIZE) addition.

- write_dump_pages():
  - TARGET_PAGE_SIZE passed as argument to a bunch of functions that all
    have prototypes. No change.

  - When incrementing "offset_data" (of type "off_t"): given that we never
    build for ILP32_OFF32 (see "-D_FILE_OFFSET_BITS=64" in configure),
    "off_t" is always "int64_t", and we only need to consider:
    - ILP32_OFFBIG: "size_t" is "uint32_t".
      - before: int64_t += uint32_t. Page size converted to int64_t for
        the addition.
      - after:  int64_t += int32_t. No change.
    - LP64_OFF64: "size_t" is "uint64_t".
      - before: int64_t += uint64_t. Offset converted to uint64_t for the
        addition, then the uint64_t result is converted to int64_t for
        storage.
      - after:  int64_t += int32_t. Same as the ILP32_OFFBIG/after case.
        No visible change.

  - (size_out < s->page_size) comparisons, and (size_out = s->page_size)
    assignment:
    - before: "size_out" is of type "size_t", no implicit conversion for
              either operator.
    - after: TARGET_PAGE_SIZE (of type "int" and positive value) is
             converted to "size_t" (for the relop because the latter is
             one of "uint32_t" and "uint64_t"). No visible change.

- dump_init():
  - DIV_ROUND_UP(DIV_ROUND_UP(s->max_mapnr, CHAR_BIT), s->page_size): The
    innermost "DumpState.max_mapnr" field has type uint64_t, which
    propagates through all implicit conversions at hand:

    #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

    regardless of the page size macro argument's type. In the outer macro
    replacement, the page size is converted from uint32_t and int32_t
    alike to uint64_t.

  - (tmp * s->page_size) multiplication: "tmp" has size "uint64_t"; the
    RHS is converted to that type from uint32_t and int32_t just the same
    if it's not uint64_t to begin with.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
22227f121b dump: eliminate DumpState.page_shift ("guest's page shift")
Just use TARGET_PAGE_BITS.

"DumpState.page_shift" used to have type "uint32_t", while the replacement
TARGET_PAGE_BITS has type "int". Since "DumpState.page_shift" was only
used as bit shift counts in the paddr_to_pfn() and pfn_to_paddr() macros,
this is safe.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
92ba1401e0 dump: simplify write_start_flat_header()
Currently, the function
- defines and populates an auto variable of type MakedumpfileHeader
- allocates and zeroes a buffer of size MAX_SIZE_MDF_HEADER (4096)
- copies the former into the latter (covering an initial portion of the
  latter)

Fill in the MakedumpfileHeader structure in its final place (the alignment
is OK because the structure lives at the address returned by g_malloc0()).

Approximately-suggested-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Laszlo Ersek
ae3f88f60f dump: fill in the flat header signature more pleasingly to the eye
The "mh.signature" array field has size 16, and is zeroed by the preceding
memset(). MAKEDUMPFILE_SIGNATURE expands to a string literal with string
length 12 (size 13). There's no need to measure the length of
MAKEDUMPFILE_SIGNATURE at runtime, nor for the extra zero-filling of
"mh.signature" with strncpy().

Use memcpy() with MIN(sizeof, sizeof) for robustness (which is an integer
constant expression, evaluable at compile time.)

Approximately-suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-11 10:10:28 -04:00
Gerd Hoffmann
fa7a1e5219 gtk: update window size after showing/hiding tabs
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-11 14:26:49 +02:00
Gerd Hoffmann
f50def915e gtk: factor out gtk3 grab into the new gd_grab_devices function
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-11 14:26:49 +02:00
Gerd Hoffmann
0a337ed067 gtk: cleanup backend dependencies
Make configure detect gtk x11 backend and link libX11 then.  Make
gtk backend specific code properly #ifdef'ed on the GTK_WINDOWING_*
backends at runtime).  Our gtk ui code should build and run fine on
any platform now.

This also fixes the linker failute due to the new XkbGetKeyboard call
added by commit 3158a3482b.

Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2014-06-11 14:26:49 +02:00
Gerd Hoffmann
932f2d7e0f gtk: factor out keycode mapping
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-11 14:26:49 +02:00
Sean Bruno
c4af6d4b13 bsd-user: Fix syscall format, add strace support for more syscalls
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-10-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Sean Bruno
f35f961ac9 bsd-user: Implement strace support for thr_* syscalls
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-9-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Sean Bruno
1e501653ab bsd-user: Implement strace support for extattr_* syscalls
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-8-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Sean Bruno
605474815d bsd-user: Implement strace support for __acl_* syscalls
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-7-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Sean Bruno
b85159a3a3 bsd-user: Implement strace support for print_ioctl syscall
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-5-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Sean Bruno
80b346040d bsd-user: Implement strace support for print_sysctl syscall
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-4-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Sean Bruno
88dae46d18 bsd-user: GPL v2 attribution update and style
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-3-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Stacey Son
adfc3e91e2 bsd-user: add HOST_VARIANT_DIR for various *BSD dependent code
This change adds HOST_VARIANT_DIR so the various BSD OS dependent
code can be separated into its own directories rather than
using #ifdef's.

This may also allow an BSD variant OS to host another BSD variant's
executable as a target.

Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1402246651-71099-2-git-send-email-sbruno@freebsd.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Natanael Copa
7224f66ec3 exec: replace ffsl with ctzl
See commit fbeadf50 (bitops: unify bitops_ffsl with the one in
host-utils.h, call it bitops_ctzl) on why ctzl should be used instead
of ffsl.

This is also needed for musl libc which does not implement ffsl.

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Natanael Copa
747eb78baa vhost: replace ffsl with ctzl
Avoid using the GNU extesion ffsl which is not implemented in musl libc.

The atomic_xchg() means we know that vhost_log_chunk_t will never be
larger than the 'long' type, so ctzl() is always sufficient.

See also commit fbeadf50 (bitops: unify bitops_ffsl with the one in
host-utils.h, call it bitops_ctzl) on why ctzl should be used instead
of ffsl.

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Natanael Copa
adf9d70b0d xen: replace ffsl with ctzl
ffsl is a GNU extension and not available in musl libc.

See also commit fbeadf50 (bitops: unify bitops_ffsl with the one in
host-utils.h, call it bitops_ctzl) on why ctzl should be used instead
of ffsl.

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[PMM: rebased to accommodate file rename to xen-hvm.c]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Natanael Copa
6ad3f09bd4 util/qemu-openpty: fix build with musl libc by include termios.h as fallback
Include termios.h as POSIX fallback when not glibc, bsd or solaris.
POSIX says that termios.h should define struct termios and TCAFLUSH.
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/termios.h.html

This fixes the following compile errors with musl libc:

util/qemu-openpty.c: In function 'qemu_openpty_raw':
util/qemu-openpty.c:112:20: error: storage size of 'tty' isn't known
     struct termios tty;
                    ^
...
util/qemu-openpty.c:128:24: error: 'TCSAFLUSH' undeclared (first use in this function)
     tcsetattr(*aslave, TCSAFLUSH, &tty);
                        ^

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Peter Maydell
b7b5233ad7 bsd-user/mmap.c: Don't try to override g_malloc/g_free
Trying to override the implementations of g_malloc and g_free is
a really bad idea -- it means statically linked builds fail to
link (because of the multiple definitions provided by this file
and by glib), and non-statically linked builds segfault as soon
as they try to do anything more complicated than printing the
usage message. Remove these overridden versions and just use
the glib ones.

This is sufficient that bsd-user can run basic x86-64
binaries on OpenBSD again; FreeBSD and NetBSD seem to have
further issues.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sean Bruno <sbruno@freebsd.org>
Reviewed-by: Ed Maste <emaste@freebsd.org>
2014-06-11 00:25:06 +01:00
Peter Maydell
591b320ad0 util/hbitmap.c: Use ctpopl rather than reimplementing a local equivalent
The function popcountl() in hbitmap.c is effectively a reimplementation
of what host-utils.h provides as ctpopl(). Use ctpopl() directly; this fixes
a failure to compile on NetBSD (whose strings.h erroneously exposes a
system popcountl() which clashes with this one).

Reported-by: Martin Husemann <martin@duskware.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Stacey Son
6b24119b7f bsd-user: refresh freebsd system call numbers
Update FreeBSD system call numbers in freebsd/syscall_nr.h.


Signed-off-by: Stacey Son <sson@FreeBSD.org>
Reviewed-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Sean Bruno <sbruno@freebsd.org>
Message-id: 1401220104-7147-2-git-send-email-sbruno@freebsd.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-11 00:25:06 +01:00
Peter Maydell
b780bf8eff Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-06-10' into staging
trivial patches for 2014-06-10

# gpg: Signature made Tue 10 Jun 2014 17:07:19 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514  66A7 BEE5 9D74 A4C3 D7DB

* remotes/mjt/tags/trivial-patches-2014-06-10: (25 commits)
  virtio.c: fix error message
  hw: vmware_vga: don't return cursorx when the driver asks for cursory register
  migration: Plug memory leak in migrate-set-cache-size command
  libcacard: Clean up dead stores before g_free()
  libcacard: Drop superfluous conditionals around g_free()
  cpu/x86: correctly set errors in x86_cpu_parse_featurestr
  smbios: use g_free directly on NULL pointers
  vdi: remove double conversion
  apb: Fix compiler warnings (large constants)
  hw/net/ne2000-isa: Register vmstate struct
  target-microblaze: Delete unused sign_extend() function
  hw/misc/milkymist-softusb: Remove unused softusb_{read, write}_pmem()
  target-i386/translate.c: Remove unused tcg_gen_lshift()
  hw/isa/pc87312: Remove unused function is_parallel_epp()
  hw/intc/openpic: Remove unused function IRQ_testbit()
  hw/dma/xilinx_axidma: Remove unused stream_halted() function
  util/qemu-sockets.c: Avoid unused variable warnings
  hw/sd/sd.c: Drop unused sd_acmd_type[] array
  hw/i386/pc.c: Remove unused parallel_io and parallel_irq variables
  slirp: Remove unused zero_ethaddr[] variable
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-10 17:16:03 +01:00
Michael Tokarev
1a2858995d virtio.c: fix error message
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 20:07:01 +04:00
Nicolas Owens
e2bb4ae746 hw: vmware_vga: don't return cursorx when the driver asks for cursory register
hello qemu-*@nongnu.org, this is my first contribution. apologies if
something is incorrect.

this patch fixes vmware_vga.c so that it actually returns the cursory
register when asked for, instead of cursorx.

Signed-off-by: Nicolas Owens <mischief@offblast.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 20:06:48 +04:00
Chen Gang
4380be0e99 migration: Plug memory leak in migrate-set-cache-size command
We call g_free() after cache_fini() in migration_end(), but we don't
call it after cache_fini() in xbzrle_cache_resize(), leaking the
memory.

cache_init() and cache_fini() are a pair.  Since cache_init()
allocates the cache, let cache_fini() free it.  This plugs the leak.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:54:43 +04:00
Markus Armbruster
fec0da9cbf libcacard: Clean up dead stores before g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Markus Armbruster
ec15993d9d libcacard: Drop superfluous conditionals around g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Paolo Bonzini
6b1dd54b6a cpu/x86: correctly set errors in x86_cpu_parse_featurestr
Because of the "goto out", the contents of local_err are leaked
and lost.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Paolo Bonzini
a10678b08e smbios: use g_free directly on NULL pointers
No need to wrap it with an if.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Paolo Bonzini
6998b6c11b vdi: remove double conversion
This should be a problem when running on big-endian machines.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Stefan Weil
d1180c1e57 apb: Fix compiler warnings (large constants)
Both constants need more than 32 bit.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
89218c218f hw/net/ne2000-isa: Register vmstate struct
The ne2000-isa device defines a VMState struct for migration, but
we forgot to actually register it. Correct this deficiency by
setting dc->vmsd.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
fc4bde9025 target-microblaze: Delete unused sign_extend() function
The sign_extend() function is unused; delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
029ad4bcf3 hw/misc/milkymist-softusb: Remove unused softusb_{read, write}_pmem()
The functions softusb_read_pmem() and softusb_write_pmem() are unused;
remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
e3a17ef6cc target-i386/translate.c: Remove unused tcg_gen_lshift()
The function tcg_gen_lshift() is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
f46b9cc71c hw/isa/pc87312: Remove unused function is_parallel_epp()
The function is_parallel_epp() is unused; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
c1d7572793 hw/intc/openpic: Remove unused function IRQ_testbit()
The IRQ_testbit() function is never used; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
a1fa7992a9 hw/dma/xilinx_axidma: Remove unused stream_halted() function
The stream_halted() function is never used; remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
f9b5426fd5 util/qemu-sockets.c: Avoid unused variable warnings
The 'on' variable is never used, and 'off' is only used
if IPV6_V6ONLY is defined; delete 'on' and move 'off' to
the point where it is used. This avoids warnings from
clang 3.4.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
7179585688 hw/sd/sd.c: Drop unused sd_acmd_type[] array
Drop the sd_acmd_type[] array: it is never used. (The equivalent
sd_cmd_type[] array for normal commands is used to identify
those commands whose argument includes the card address in the
top 16 bits; but for app commands the card address is passed
with the APP_CMD prefix, not with the argument to the app command
itself.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
831f4d27b6 hw/i386/pc.c: Remove unused parallel_io and parallel_irq variables
The variables parallel_io and parallel_irq are unused; delete them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Maydell
49bba868df slirp: Remove unused zero_ethaddr[] variable
The zero_ethaddr[] array is never used; delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Sergey Fedorov
2a802aaf63 qtest: fix hex2nib for capital characters
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Crosthwaite
ef18c2f54e net: cadence_gem: Remove &desc[0] usages
Just use desc instead.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Crosthwaite
3048ed6aac net: cadence_gem: Comment spelling sweep
Fix some typos in comments.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Crosthwaite
fa15286a75 net: cadence_gem: Add Tx descriptor fetch printf
Add a debug printf for TX descriptor fetching. This is helpful to anyone
needing to debug TX ring buffer traversal. It is also now consistent with
the RX code which has a similar printf.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Peter Crosthwaite
6ab57a6b80 net: cadence_gem: Fix Tx descriptor update
The local variable "desc" was being used to read-modify-write the
first descriptor (of a multi-desc packet) upon packet completion.
desc however continues to be used by the code as the current
descriptor. Give this first desc RMW it's own local variable to
avoid trampling.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Fam Zheng
1c33ac5716 rules.mak: Rewrite unnest-vars
The macro unnest-vars is the most important, complicated but hard to
track magic in QEMU's build system.

Rewrite it in a (hopefully) clearer way, with more comments, to make it
easier to understand and maintain.

Remove DSO_CFLAGS and module-objs-m that are not used.

A bonus fix of this version is, per object variables are properly
protected in save-objs and load-objs, before including sub-dir
Makefile.objs, just as nested variables are. So the occasional same
object name from different directory levels won't step on each other's
foot.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 13:59:02 +02:00
Peter Maydell
3334e929ae Merge remote-tracking branch 'remotes/kraxel/tags/pull-console-20140610-1' into staging
console: two little bugfixes.

# gpg: Signature made Tue 10 Jun 2014 12:01:07 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-console-20140610-1:
  console: fix -vga none -sdl crash
  console: kill MAX_CONSOLES, alloc consoles dynamically

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-10 12:06:17 +01:00
Gerd Hoffmann
333cb18ff4 console: fix -vga none -sdl crash
Call get_alloc_displaystate() for proper initialization
instead of allocating with g_new().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-10 12:36:36 +02:00
Gerd Hoffmann
a1d2db08d8 console: kill MAX_CONSOLES, alloc consoles dynamically
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-06-10 12:36:36 +02:00
Cornelia Huck
99519e677b configure: unset interfering variables
The check for big or little endianness relies on grep reporting
match/non-match on the generated binary. If the user specified
--binary-files=without-match in their GREP_OPTIONS, this will fail.

Let's follow what autoconf does and unset GREP_OPTIONS and CLICOLOR_FORCE
at the beginning of the script.

Reported-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 12:34:37 +02:00
Rick Liu
18e588b197 configure: duplicate/incorrect order of -lrt
'-lrt' flag duplication/incorrect order
would cause 'undefined reference to clock_gettime' error during compilation time.

Before fix:
... -o qemu-bridge-helper qemu-bridge-helper.o    -lrt -pthread -lgthread-2.0 -lrt -lglib-2.0
After fix:
... -o qemu-bridge-helper qemu-bridge-helper.o    -pthread -lgthread-2.0 -lrt -lglib-2.0

Reference:
http://hi.baidu.com/sanitywolf/item/7a8b69c1e76dd220a0b50ab1

Signed-off-by: Rick Liu <yrliu.ca@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 12:34:37 +02:00
Peter Maydell
7b0140e49b Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140610' into staging
Several patches for s390:

- bugfixes: A fix for a long-standing bug in the css code as well as
  a fixup for the recent I/O adapter support.
- Exploitation of the userspace cmma enablement/reset interface, if
  it is present.
- Some debuggability improvements by logging unmanageable conditions.
- virtio-ccw finally gets migration support for its structures.
- Some cleanup as to how floating interrupts are injected.

# gpg: Signature made Tue 10 Jun 2014 08:57:56 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found

* remotes/cohuck/tags/s390x-20140610:
  s390x/kvm: inject via flic
  s390x: cleanup interrupt injection
  s390x/kvm: add alternative injection interface
  s390x: consolidate floating interrupts
  s390/virtio-ccw: migration support
  s390x/kvm: Log unmanageable program interruptions
  s390x/kvm: Log unmanageable external interruptions
  s390x/kvm: enable/reset cmma via vm attributes
  s390x/kvm: make flic play well with old kernels
  s390x/css: handle emw correctly for tsch

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-10 10:59:26 +01:00
Cornelia Huck
bbd8bb8e32 s390x/kvm: inject via flic
Try to inject floating interrupts via the flic if it is available.
This allows us to inject the full range of floating interrupts.

Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Cornelia Huck
de13d21614 s390x: cleanup interrupt injection
Remove the need for a cpu to inject a floating interrupt on kvm.

Acked-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Cornelia Huck
66ad0893f0 s390x/kvm: add alternative injection interface
Add kvm_s390_{vcpu,floating}_interrupt, which offer the possibility
to inject interrupts with larger payloads (when a kvm backend becomes
available).

Moreover, kvm_s390_floating_interrupt() does no longer have the bogus
requirement for a vcpu.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Cornelia Huck
79afc36d91 s390x: consolidate floating interrupts
Move the injection code for all floating interrupts to interrupt.c
and add a comment.

Also get rid of the #ifdef CONFIG_KVM for the service interrupt.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Jens Freimann
bcb2b582f3 s390/virtio-ccw: migration support
This patch adds live migration support for virtio-ccw devices.
It's not done with vmstate because virtio itself is not yet ported
to vmstate either.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Thomas Huth
6449a41a4d s390x/kvm: Log unmanageable program interruptions
The kernel only drops to userspace if an endless program interrupt loop
has been detected. Let's print an error message in this case to inform
the user about the crash and stop the affected CPU with a panic event,
just like it is already done for the external interruption loop detection.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Thomas Huth
a2689242b1 s390x/kvm: Log unmanageable external interruptions
Interception code 0x14 only drops to userspace when an unmanageable
external interruption interception occured (e.g. if the External New
PSW does not disable external interruptions). Instead of bailing out
via the default handler, it is better to inform the user with a
proper error message that also includes the bad PSW, and to stop
the affected CPU with a panic event instead.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Dominik Dingel
4cb88c3c37 s390x/kvm: enable/reset cmma via vm attributes
Exploit the new api for userspace-controlled cmma. If supported, enable
cmma during kvm initialization and register a reset handler for cmma,
which is also called directly from the load IPL code.

The reset functionality is needed to reset the cmma state of the guest
pages, e.g. if a system reset is triggered via qemu monitor; otherwise
this could result in data corruption.

A guest triggered reboot may now lead to multiple cmma resets; this is
OK, however, as this is slowpath anyway and the simplest way to achieve
the intended effects.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:27 +02:00
Cornelia Huck
08da527fd0 s390x/kvm: make flic play well with old kernels
If we run with an old kernel that does not support KVM_CAP_IRQ_ROUTING,
we don't have to do anything in the ->register_io_adapter and
->io_adapter_map callbacks and therefore should return 0 instead of
-ENOSYS (just as the non-kvm flic does).

This fixes using adapter interrupts when running under an older kernel,
which broke with "s390x: add I/O adapter registration".

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:26 +02:00
Cornelia Huck
f068d320de s390x/css: handle emw correctly for tsch
We should not try to store the emw portion of the irb if extended
measurements are not applicable. In particular, we should not surprise
the guest by storing a larger irb if it did not enable extended
measurements.

Cc: qemu-stable@nongnu.org
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-06-10 09:50:26 +02:00
Paolo Bonzini
471f7e30a4 libcacard: improve documentation
Using the file-backed smartcard backend is black magic, but it can
be useful if your only smartcard bricks itself if it is accessed
the wrong way too many times.

Complete the documentation to include the art of creating certificates
and using them with QEMU, based on Ray Strode's useful tutorial at
https://blogs.gnome.org/halfline/2013/09/08/another-smartcard-post/
but with ccid-card-emulated or vscclient instead of SPICE.

Cc: Ray Strode <rstrode@redhat.com>
Reviewed-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 07:44:01 +02:00
Michael Tokarev
ae9b65e873 libcacard: actually use symbols file
libtool has an argument for .syms file, which is -export-symbols.
There's no argument `-export-syms', and it looks like at least on
linux, -export-syms is just ignored.  Use the correct argument,
-export-symbols, to actually get the right export list.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Alon Levy <alevy@redhat.com>
Tested-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 07:44:01 +02:00
Michael Tokarev
fd25c0e6dd libcacard: replace qemu thread primitives with glib ones
Replace QemuMutex with GMutex and QemuCond with GCond
(with corresponding function changes), to make libcacard
independent of qemu internal functions.

After this step, none of libcacard internals use any
qemu-provided symbols.  Maybe it's a good idea to
stop including qemu-common.h internally too.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Alon Levy <alevy@redhat.com>
Tested-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 07:44:01 +02:00
Michael Tokarev
2a0c46da96 vscclient: use glib thread primitives not qemu
Use glib-provided thread primitives in vscclient instead of
qemu ones, and do not use qemu sockets in there (open-code
call to WSAStartup() for windows to initialize things).

This way, vscclient becomes more stand-alone, independent on
qemu internals.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Alon Levy <alevy@redhat.com>
Tested-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 07:44:01 +02:00
Michael Tokarev
86946a2d83 glib-compat.h: add new thread API emulation on top of pre-2.31 API
Thread API changed in glib-2.31 significantly.  Before that version,
conditionals and mutexes were only allocated dynamically, using
_new()/_free() interface.  in 2.31 and up, they're allocated statically
as regular variables, and old interface is deprecated.

(Note: glib docs says the new interface is available since version
2.32, but it was actually introduced in version 2.31).

Create the new interface using old primitives, by providing non-opaque
definitions of the base types (GCond and GMutex) using GOnces.

Replace #ifdeffery around GCond and GMutex in trace/simple.c and
coroutine-gthread.c too because it does not work anymore with the new
glib-compat.h.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[Use GOnce to support lazy initialization; introduce CompatGMutex
 and CompatGCond.  - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-10 07:44:01 +02:00
Peter Maydell
7721a30442 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140609-1' into staging
----------------------------------------------------------------
target-arm queue:
 * support -bios option in vexpress boards
 * register the Cortex-A57 impdef system registers
 * fix handling of UXN bit in ARMv8 page tables
 * complete support of crypto insns in A32/T32
 * implement CRC and crypto insns in A64
 * fix bugs in generic timer control register

----------------------------------------------------------------

# gpg: Signature made Mon 09 Jun 2014 16:08:26 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20140609-1:
  target-arm: Delete unused iwmmxt_msadb helper
  target-arm: Fix errors in writes to generic timer control registers
  target-arm: A64: Implement two-register SHA instructions
  target-arm: A64: Implement 3-register SHA instructions
  target-arm: A64: Implement AES instructions
  target-arm: A32/T32: Mask CRC value in calling code, not helper
  target-arm: A64: Implement CRC instructions
  target-arm: VFPv4 implies half-precision extension
  target-arm: Clean up handling of ARMv8 optional feature bits
  target-arm: Remove unnecessary setting of feature bits
  target-arm: arm_any_initfn() should never set ARM_FEATURE_AARCH64
  target-arm: A64: Use PMULL feature bit for PMULL
  target-arm: add support for v8 VMULL.P64 instruction
  target-arm: Allow 3reg_wide undefreq to encode more bad size options
  target-arm: add support for v8 SHA1 and SHA256 instructions
  target-arm: Correct handling of UXN bit in ARMv8 LPAE page tables
  target-arm: Prepare cpreg writefns/readfns for EL3/SecExt
  target-arm/cpu64.c: Actually register Cortex-A57 impdef registers
  vexpress: Add support for the -bios flag to provide firmware

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 17:04:13 +01:00
Peter Maydell
14ac573392 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Tracing pull request

# gpg: Signature made Mon 09 Jun 2014 14:44:18 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  trace: Replace fprintf with error_report and print location
  trace: Multi-backend tracing
  trace: Replace error with warning if event is not defined
  simpletrace: add support for trace record pid field
  trace: add pid field to simpletrace record

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 16:25:34 +01:00
Peter Maydell
3b1a413812 target-arm: Delete unused iwmmxt_msadb helper
The iwmmxt_msadb helper and its corresponding gen function are unused;
delete them. (This function appears to have never been used right back
to the initial implementation of iwMMXt; it is identical to iwmmxt_madduq,
and is presumably an accidental remnant from the initial development.)

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401822125-1822-1-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:12 +01:00
Peter Maydell
d3afacc726 target-arm: Fix errors in writes to generic timer control registers
The code for handling writes to the generic timer control registers
had several bugs:
 * ISTATUS (bit 2) is read-only but we forced it to zero on any write
 * the check for "was IMASK (bit 1) toggled?" incorrectly used '&' where
   it should be '^'
 * the handling of IMASK was inverted: we should set the IRQ if
   ISTATUS is set and IMASK is clear, not if both are set

The combination of these bugs meant that when running a Linux guest
that uses the generic timers we would fairly quickly end up either
forgetting that the timer output should be asserted, or failing to
set the IRQ when the timer was unmasked. The result is that the guest
never gets any more timer interrupts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401803208-1281-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
2014-06-09 16:06:12 +01:00
Peter Maydell
f6fe04d566 target-arm: A64: Implement two-register SHA instructions
Implement the two-register SHA instruction group from the optional
Crypto Extensions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-10-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:12 +01:00
Peter Maydell
be56f04eea target-arm: A64: Implement 3-register SHA instructions
Implement the 3-register SHA instruction group from the optional
Crypto Extensions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-9-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:12 +01:00
Peter Maydell
5acc765c04 target-arm: A64: Implement AES instructions
Implement the AES instructions from the optional Crypto Extensions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-8-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:12 +01:00
Peter Maydell
aa633469ed target-arm: A32/T32: Mask CRC value in calling code, not helper
Bring the 32-bit CRC helper functions into line with the A64 ones,
by masking the high bytes of the value in the calling code rather
than the helper. This is more efficient since we can determine the
mask at translation time.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-7-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:12 +01:00
Peter Maydell
130f2e7dcb target-arm: A64: Implement CRC instructions
Implement the optional A64 CRC instructions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-6-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:12 +01:00
Peter Maydell
da5141fc45 target-arm: VFPv4 implies half-precision extension
VFPv4 implies the presence of the half-precision floating point
extension (which is optional in VFPv3). Add this implied rule
to arm_cpu_realizefn() and remove some no-longer-needed explicit
setting of the bit in initfns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-5-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:11 +01:00
Peter Maydell
25f748e37a target-arm: Clean up handling of ARMv8 optional feature bits
CRC and crypto are both optional v8 extensions, so FEATURE_V8
should not imply them. Instead we should set these bits in the
initfns for the 32-bit and 64-bit "cpu any" and for the Cortex-A57.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-4-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:11 +01:00
Peter Maydell
fb8ad9f2c1 target-arm: Remove unnecessary setting of feature bits
FEATURE_V8 implies both FEATURE_V7MP and FEATURE_ARM_DIV, so
we don't need to set them explicitly in initfns which set the
V8 feature bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-3-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:11 +01:00
Peter Maydell
46d9dfdad6 target-arm: arm_any_initfn() should never set ARM_FEATURE_AARCH64
The arm_any_initfn() is used only for the 32-bit linux-user "cpu any",
so it only gets called in builds where TARGET_AARCH64 is not defined.
Remove the unreachable line which sets ARM_FEATURE_AARCH64.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401458125-27977-2-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:11 +01:00
Peter Maydell
411bdc7837 target-arm: A64: Use PMULL feature bit for PMULL
Now that we have a separate ARM_FEATURE_V8_PMULL bit, use it for
the A64 PMULL, not the AES feature bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 16:06:11 +01:00
Peter Maydell
4e624edaeb target-arm: add support for v8 VMULL.P64 instruction
Add support for the VMULL.P64 polynomial 64x64 to 128 bit multiplication
instruction in the A32/T32 instruction sets; this is part of the v8
Crypto Extensions.

To do this we have to move the neon_pmull_64_{lo,hi} helpers from
helper-a64.c into neon_helper.c so they can be used by the AArch32
translator.

Inspired-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401386724-26529-4-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:11 +01:00
Peter Maydell
526d0096e5 target-arm: Allow 3reg_wide undefreq to encode more bad size options
The current undefreq field in the neon_3reg_wide handling allows us
to encode "UNDEF if size != 0" and "UNDEF if size == 0". This is
no longer sufficient with the advent of 64-bit polynomial VMULL,
which means we want to UNDEF if size == 1. Change the undefreq
encoding to use separate bits for all of "UNDEF if size == 0",
"UNDEF if size == 1" and "UNDEF if size == 2".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401386724-26529-3-git-send-email-peter.maydell@linaro.org
2014-06-09 16:06:11 +01:00
Ard Biesheuvel
f1ecb913d8 target-arm: add support for v8 SHA1 and SHA256 instructions
This adds support for the SHA1 and SHA256 instructions that are available
on some v8 implementations of Aarch32.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401386724-26529-2-git-send-email-peter.maydell@linaro.org
[PMM:
 * rebase
 * fix bad indent
 * add a missing UNDEF check for Q!=1 in the 3-reg SHA1/SHA256 case
 * use g_assert_not_reached()
 * don't re-extract bit 6 for the 2-reg-misc encodings
 * set the ELF HWCAP2 bits for the new features
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 16:06:11 +01:00
Ian Campbell
d615efac7c target-arm: Correct handling of UXN bit in ARMv8 LPAE page tables
In v8 page tables bit 54 in the PTE is UXN in the EL0/EL1 translation regimes
and XN elsewhere. In v7 the bit is always XN. Since we only emulate EL0/EL1 we
can just treat this bit as UXN whenever we are in v8 mode.

Also correctly extract the upper attributes from the PTE entry, the v8 version
tried to avoid extracting the CONTIG bit and ended up with the upper bits being
off-by-one. Instead behave the same as v7 and extract (but ignore) the CONTIG
bit.

This fixes "Bad mode in Synchronous Abort handler detected, code 0x8400000f"
seen when modprobing modules under Linux.

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Claudio Fontana <claudio.fontana@huawei.com>
Cc: Rob Herring <robherring2@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 16:06:11 +01:00
Fabian Aggeler
8d5c773e32 target-arm: Prepare cpreg writefns/readfns for EL3/SecExt
This patch changes some readfns/writefns to use raw_write
and raw_read functions, which use the fieldoffset specified
in ARMCPRegInfo instead of directly accessing the field.
This will simplify patches for EL3 & Security Extensions.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Fabian Aggeler <aggelerf@ethz.ch>
Message-id: 1401962428-14749-1-git-send-email-aggelerf@ethz.ch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 15:43:22 +01:00
Peter Maydell
bf01601764 target-arm/cpu64.c: Actually register Cortex-A57 impdef registers
cpu64.c contains a reginfo list for the impdef registers on
the Cortex-A57; however we forgot to actually call define_arm_cp_regs(),
so it was sitting there doing nothing. Remedy this omission.

Message-id: 1401226259-23121-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 15:43:22 +01:00
Grant Likely
61e9924149 vexpress: Add support for the -bios flag to provide firmware
Right now to run firmware inside the QEMU VExpress model requires
padding out the firmware image to the size of the virtual flash and
passing it in via the -pflash argument. If the firmware image is passed
without padding, then QEMU will fail. Also, when passed as a -pflash
argument, QEMU treats the file as persistent storage and will modify the
file.

The -bios flag provides the semantics that we want for providing a
firmware image. This patch maps the contents of the -bios file into the
address space at the boot flash location.

Tested with the vexpress-a15 model and the Tianocore port.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Roy Franz <roy.franz@linaro.org>
[PMM: folded long line, removed stray \n from error message,
 use correct variable for printing image name, exit(1) rather than 0]
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 15:43:22 +01:00
Peter Maydell
4a331bb33b Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Net patches

# gpg: Signature made Mon 09 Jun 2014 14:41:34 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  e1000: remove broken support for 82573L
  tests: e1000: test additional device IDs
  e1000: allow command-line selection of card model
  vmxnet3: fix msix vectors unuse
  net: xilinx_ethlite: Fix Rx-pong interrupt

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 15:00:21 +01:00
Alexey Kardashevskiy
a35d9be622 trace: Replace fprintf with error_report and print location
This replaces fprintf(stderr) with error_report.

This moves local variables to the beginning of the function to comply
with QEMU's coding style.

Suggested-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:43:40 +02:00
Lluís Vilanova
5b808275f3 trace: Multi-backend tracing
Adds support to compile QEMU with multiple tracing backends at the same time.

For example, you can compile QEMU with:

  $ ./configure --enable-trace-backends=ftrace,dtrace

Where 'ftrace' can be handy for having an in-flight record of events, and 'dtrace' can be later used to extract more information from the system.

This patch allows having both available without recompiling QEMU.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:43:40 +02:00
Alexey Kardashevskiy
82432638eb trace: Replace error with warning if event is not defined
At the moment QEMU exits if trace point is not defined which makes
a developer life harder if he has to switch between branches with
different traces implemented.

This replaces error+exit wit WARNING if the tracepoint does not exist or
not traceable.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:43:40 +02:00
Stefan Hajnoczi
80ff35cd3f simpletrace: add support for trace record pid field
Extract the pid field from the trace record and print it.

Change the trace record tuple from:
  (event_num, timestamp, arg1, ..., arg6)
to:
  (event_num, timestamp, pid, arg1, ..., arg6)

Trace event methods now support 3 prototypes:
1. <event-name>(arg1, arg2, arg3)
2. <event-name>(timestamp, arg1, arg2, arg3)
3. <event-name>(timestamp, pid, arg1, arg2, arg3)

Existing script continue to work without changes, they only know about
prototypes 1 and 2.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:43:40 +02:00
Stefan Hajnoczi
26896cbf35 trace: add pid field to simpletrace record
It is useful to know the QEMU process ID when working with traces from
multiple VMs.  Although the trace filename may contain the pid, tools
that aggregate traces or even trace globally need somewhere to record
the pid.

There is a reserved field in the trace event header struct that we can
use.

It is not necessary to bump the simpletrace file format version number
because it has already been incremented for the QEMU 2.1 release cycle
in commit "trace: [simple] Bump up log version number".

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:43:40 +02:00
Gabriel L. Somlo
7efea76377 e1000: remove broken support for 82573L
Currently, e1000 support is based on the manual for the 8254xx
model series. 82573x models are documented in a separate manual
(see http://www.intel.com/content/dam/www/public/us/en/documents/manuals/pcie-gbe-controllers-open-source-manual.pdf)
and the 82573L device ID no longer works correctly on either Linux
(3.14.*) or Windows 7.

This patch removes stale code claiming to support 82573L, cleaning
up the code base for the remaining 8254xx model series.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:38:58 +02:00
Gabriel L. Somlo
b167383ffb tests: e1000: test additional device IDs
Update e1000-test.c to check all currently supported devices.

Suggested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:38:58 +02:00
Gabriel L. Somlo
8597f2e19e e1000: allow command-line selection of card model
Allow selection of different card models from the qemu
command line, to better accomodate a wider range of guests.

Signed-off-by: Romain Dolbeau <romain@dolbeau.org>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:38:58 +02:00
Jiri Pirko
b44672849a vmxnet3: fix msix vectors unuse
In vmxnet3_cleanup_msix(), there is called msix_vector_unuse() with
VMXNET3_MAX_INTRS. That is not correct since vector of
value VMXNET3_MAX_INTRS was never used. Also all the used vectors
are not un-used. So call vmxnet3_unuse_msix_vectors() instead which
does the correct job.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:38:58 +02:00
Peter Crosthwaite
40e76f736d net: xilinx_ethlite: Fix Rx-pong interrupt
There is no CTRL_I bit in the pong buffer control register. The
CTRL_I bit from the ping buffer masks both ping and pong buffers.
Fix.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-09 15:38:58 +02:00
Peter Maydell
5dfc05cb1d Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request

# gpg: Signature made Fri 06 Jun 2014 17:08:50 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (42 commits)
  qapi: Extract qapi/block.json definitions
  qapi: Extract qapi/block-core.json definitions
  qapi: create two block related json modules
  qapi: Extract qapi/common.json definitions
  sheepdog: reload only header in a case of live snapshot
  sheepdog: fix vdi object update after live snapshot
  rbd: Fix leaks in rbd_start_aio() error path
  qemu-img: Document check exit codes
  block: fix wrong order in live block migration setup
  blockdev: acquire AioContext in block_set_io_throttle
  throttle: add detach/attach test case
  throttle: add throttle_detach/attach_aio_context()
  dataplane: Support VIRTIO_BLK_T_SCSI_CMD
  virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
  virtio-blk: Allow config-wce in dataplane
  block: Move declaration of bdrv_get_aio_context to block.h
  raw-posix: drop raw_get_aio_fd() since it is no longer used
  dataplane: implement async flush
  dataplane: delete IOQueue since it is no longer used
  dataplane: use the QEMU block layer for I/O
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-09 11:54:22 +01:00
Samuel Thibault
959e41473f slirp/arp: do not special-case bogus IP addresses
Do not special-case addresses with zero host part, as we do not
necessarily know how big it is, and the guest can fake them anyway.
Silently avoid having 0.0.0.0 as a destination, however.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[Edgar: Minor change to subject]
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-06-09 01:49:28 +02:00
Peter Maydell
37654d9e6a target-cris/translate.c: Remove _t_gen_mov_TN_env and _t_gen_mov_env_TN
The wrapper functions _t_gen_mov_TN_env and _t_gen_mov_env_TN are only
used via their accompanying non-underscore macros. The check they add
on offset is thus pointless, since the compiler will complain if the
struct field passed to the macro is not part of the struct. Remove the
functions and make the macros directly expand to the appropriate
tcg_gen_{ld,st}_tl calls.

This conveniently avoids a warning due to _t_gen_mov_TN_env() being
unused.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 01:04:44 +02:00
Peter Maydell
08397c4b29 target-cris/translate.c: Remove t_gen_mov_TN_reg and t_gen_mov_reg_TN
Remove the t_gen_mov_TN_reg and t_gen_mov_reg_TN wrappers: the
latter is completely unused, and the former only used in a few
places (which are thus inconsistent with the rest of the decoder
which directly accesses cpu_R[]).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 01:04:44 +02:00
Peter Crosthwaite
a373cdb5ce intc: xilinx_uartlite: Convert SBD::init -> instance_init
SysBusDevice::init is depracated. Convert to Object::init
as prescribed by QOM conventions.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 00:33:03 +02:00
Peter Crosthwaite
aa0f607f61 char: xilinx_uartlite: Convert to realize()
SysBusDevice::init is depracated. Convert to Object::init and
Device::realize as prescribed by QOM conventions.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 00:33:03 +02:00
Peter Crosthwaite
95faaa73df char: xilinx_uartlite: Don't reset from init
This refresh of the device state is intended to be a reset side
effect. Move it to a proper reset handler rather than do it at
init time.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 00:33:03 +02:00
Peter Crosthwaite
e8198f6ea0 net: xilinx_ethlite: Convert to realize()
SysBusDevice::init is depracated. Convert to Object::init and
Device::realize as prescribed by QOM conventions.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 00:33:03 +02:00
Peter Crosthwaite
8c6d96728d net: xilinx_ethlite: Don't reset from init
This zeroing-out of the rxbuf variable (ping pong state) is a reset
side effect. Extract into a proper reset.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 00:33:03 +02:00
Peter Crosthwaite
04bb4d86f1 timer: xilinx_timer: Convert to realize()
SysBusDevice::init is depracated. Convert to Object::init and
Device::realize as prescribed by QOM conventions.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2014-06-09 00:33:02 +02:00
Benoît Canet
2e95fa1719 qapi: Extract qapi/block.json definitions
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 16:25:48 +02:00
Benoît Canet
1ad166b612 qapi: Extract qapi/block-core.json definitions
Signed-off-by: Benoit Canet <benoit@irqsave.net
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 16:25:48 +02:00
Benoît Canet
5db15096d0 qapi: create two block related json modules
qapi/block-core.json contains block definitions unrelated to emulation.

qapi/block.json is a superset of the previous and contains definitions related
to emulation.

The purpose of these extractions is to be able to hook qapi/block-core.json
generated code on qemu-nbd.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 16:25:48 +02:00
Benoît Canet
d34bda716a qapi: Extract qapi/common.json definitions
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 16:25:48 +02:00
Hitoshi Mitake
5d039baba4 sheepdog: reload only header in a case of live snapshot
sheepdog driver doesn't need to read data_vdi_id[] when a live snapshot is
created.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 14:54:43 +02:00
Hitoshi Mitake
b544c1aba8 sheepdog: fix vdi object update after live snapshot
sheepdog driver should decide a write request is COW or not based on inode
object which is active when the write request is issued.

Example of wrong inode update path in the previous driver:
1. drier issues an ordinal write request to an existing object
2. user creates a snapshot of the VDI before the write request is completed
3. the respones for the request is RDONLY, because the VDI is already a snapshot
4. the driver reload an inode object of the new active VDI, then issues a write
   request again
5. the second write request can be completed
6. driver decide the request is COW or not with the below conditional branch:
   	  if (s->inode.data_vdi_id[idx] != s->inode.vdi_id) {
7. the ID of the written object and VID of the new active VDI is different, so
   the driver updates data_vdi_id[idx] and writes inode object
8. the existing object cannot be seen by the new active VDI, it results object
   leaking

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 14:53:55 +02:00
Kevin Wolf
405a27640b rbd: Fix leaks in rbd_start_aio() error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 11:05:04 +02:00
Peter Maydell
50809c8b92 Merge remote-tracking branch 'remotes/mcayland/qemu-sparc' into staging
* remotes/mcayland/qemu-sparc:
  apb: implement IOMMU translation for PCI host bridge
  apb: handle reading/writing of IOMMU control registers
  apb: fix IOMMU register sizes
  apb: Move IOMMU registers into a separate IOMMUState struct
  tcx: move initialisation from realizefn to initfn
  tcx: move initialisation from SysBusDevice class to TCX class realizefn
  cg3: add extra check to prevent CG3 register array overflow
  cg3: move initialisation from realizefn to initfn

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 23:05:07 +01:00
Peter Maydell
4e627aeef8 Merge remote-tracking branch 'remotes/mdroth/qga-pull-2014-06-05' into staging
* remotes/mdroth/qga-pull-2014-06-05:
  qga: Fix handle fd leak in acquire_privilege()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 22:40:44 +01:00
Peter Maydell
26edf8cc08 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,virtio,qdev fixes, tests

new tests for SMBIOS
SMBIOS fixes
pc, pci fixes
qdev patches stayed on list for a month with no review,
as I told people on KVM forum I'm merging stuch patches
if they look fine.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* remotes/mst/tags/for_upstream:
  qdev: Add test of qdev_prop_check_global
  qdev: Display warning about unused -global
  tests: add smbios testing
  tests: rename acpi-test to bios-tables-test
  virtio-balloon: return empty data when no stats are available
  pcie_host: Turn pcie_host_init() into an instance_init
  SMBIOS: Fix type 17 field sizes
  SMBIOS: Update Type 0 struct generator for machines >= 2.1
  SMBIOS: Fix endian-ness when populating multi-byte fields
  serial-pci: Set prog interface field of pci config to 16550 compatible

Conflicts:
	include/hw/i386/pc.h
[PMM: fixed trivial conflict in pc.h]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 21:52:37 +01:00
Peter Maydell
31e25e3e57 Merge remote-tracking branch 'remotes/bonzini/softmmu-smap' into staging
* remotes/bonzini/softmmu-smap: (33 commits)
  target-i386: cleanup x86_cpu_get_phys_page_debug
  target-i386: fix protection bits in the TLB for SMEP
  target-i386: support long addresses for 4MB pages (PSE-36)
  target-i386: raise page fault for reserved bits in large pages
  target-i386: unify reserved bits and NX bit check
  target-i386: simplify pte/vaddr calculation
  target-i386: raise page fault for reserved physical address bits
  target-i386: test reserved PS bit on PML4Es
  target-i386: set correct error code for reserved bit access
  target-i386: introduce support for 1 GB pages
  target-i386: introduce do_check_protect label
  target-i386: tweak handling of PG_NX_MASK
  target-i386: commonize checks for PAE and non-PAE
  target-i386: commonize checks for 4MB and 4KB pages
  target-i386: commonize checks for 2MB and 4KB pages
  target-i386: fix coding standards in x86_cpu_handle_mmu_fault
  target-i386: simplify SMAP handling in MMU_KSMAP_IDX
  target-i386: fix kernel accesses with SMAP and CPL = 3
  target-i386: move check_io helpers to seg_helper.c
  target-i386: rename KSMAP to KNOSMAP
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 21:06:14 +01:00
Mark Cave-Ayland
ae74bbe7c5 apb: implement IOMMU translation for PCI host bridge
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-05 21:00:22 +01:00
Mark Cave-Ayland
f38b161203 apb: handle reading/writing of IOMMU control registers
While the registers are documented as being 64-bit, Linux seems to access
them in two halves as 2 x 32-bit accesses. Make sure that we can correctly
handle this case.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-05 21:00:12 +01:00
Mark Cave-Ayland
fd7fbc8ff7 apb: fix IOMMU register sizes
According to the referenced documentation, the IOMMU has 3 64-bit registers
consisting of a control register, base register and flush register.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-05 21:00:03 +01:00
Mark Cave-Ayland
ea9a6606b1 apb: Move IOMMU registers into a separate IOMMUState struct
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-06-05 20:59:53 +01:00
Mark Cave-Ayland
01b91ac2be tcx: move initialisation from realizefn to initfn
Initialisation cleanup as suggested by Andreas.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Andreas Färber <afaerber@suse.de>
2014-06-05 20:51:57 +01:00
Mark Cave-Ayland
d4ad9dec14 tcx: move initialisation from SysBusDevice class to TCX class realizefn
This is an intermediate step to bring TCX in line with CG3.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Andreas Färber <afaerber@suse.de>
2014-06-05 20:51:45 +01:00
Mark Cave-Ayland
366d4f7e00 cg3: add extra check to prevent CG3 register array overflow
The case statements in the CG3 read and write register routines have a maximum
value of CG3_REG_SIZE, so if a value were written to this offset then it
would overflow the register array.

Currently this cannot be exploited since the MemoryRegion restricts accesses
to the range 0 ... CG3_REG_SIZE - 1, but it seems worth clarifying this for
future review and/or static analysis.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 20:51:30 +01:00
Mark Cave-Ayland
e09c49f40d cg3: move initialisation from realizefn to initfn
Initialisation cleanup as suggested by Andreas.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Andreas Färber <afaerber@suse.de>
2014-06-05 20:51:19 +01:00
Peter Maydell
9d48d3f01c Merge remote-tracking branch 'remotes/rth/tcg-next' into staging
* remotes/rth/tcg-next:
  TCG: Fix tcg_gen_extr_i64_tl for 32bit
  tcg: Remove TCG_TARGET_HAS_new_ldst
  tci: Convert to new ldst opcodes
  tcg-i386: Fix win64 qemu store

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 20:11:50 +01:00
Peter Maydell
9f0355b590 Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  kvm: Fix eax for cpuid leaf 0x40000000
  kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
  kvm: Enable -cpu option to hide KVM
  kvm: Ensure negative return value on kvm_init() error handling path
  target-i386: set CC_OP to CC_OP_EFLAGS in cpu_load_eflags
  target-i386: get CPL from SS.DPL
  target-i386: rework CPL checks during task switch, preparing for next patch
  target-i386: fix segment flags for SMM and VM86 mode
  target-i386: Fix vm86 mode regression introduced in fd460606fd.
  kvm_stat: allow choosing between tracepoints and old stats
  kvmclock: Ensure time in migration never goes backward

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 19:16:28 +01:00
Peter Maydell
d4f005db9b Merge remote-tracking branch 'remotes/kraxel/tags/pull-input-10' into staging
updates for docs/multiseat.txt
input: add support for kbd delays

# gpg: Signature made Wed 04 Jun 2014 08:22:39 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-input-10:
  docs/multiseat.txt: add note about spice
  docs/multiseat.txt: gtk joined the party
  docs/multiseat.txt: use autoseat
  input/vnc: use kbd delays in press_key
  input/curses: add kbd delay between keydown and keyup events
  input: use kbd delays for send_key monitor command
  input: add support for kbd delays

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 18:58:53 +01:00
Don Slutz
711e2f1e9e qdev: Add test of qdev_prop_check_global
This will generate a warning from "make check":

...
GTESTER tests/test-qdev-global-props
Warning: "-global dynamic-prop-type-bad.prop3=103" not used
GTESTER tests/check-qom-interface
...

If the warning is not generated, the test will fail.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-05 19:20:38 +03:00
Don Slutz
9f9260a3be qdev: Display warning about unused -global
This can help a user understand why -global was ignored.

For example: with "-vga cirrus"; "-global vga.vgamem_mb=16" is just
ignored when "-global cirrus-vga.vgamem_mb=16" is not.

This is currently clear when the wrong property is provided:

out/x86_64-softmmu/qemu-system-x86_64 -global cirrus-vga.vram_size_mb=16 -monitor pty -vga cirrus
char device redirected to /dev/pts/20 (label compat_monitor0)
qemu-system-x86_64: Property '.vram_size_mb' not found
Aborted (core dumped)

vs

out/x86_64-softmmu/qemu-system-x86_64 -global vga.vram_size_mb=16 -monitor pty -vga cirrus
char device redirected to /dev/pts/20 (label compat_monitor0)
VNC server running on `::1:5900'
^Cqemu: terminating on signal 2

Signed-off-by: Don Slutz <dslutz@verizon.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-05 19:20:37 +03:00
Paolo Bonzini
16b96f82cd target-i386: cleanup x86_cpu_get_phys_page_debug
Make the code a bit more similar to x86_cpu_handle_mmu_fault.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:35 +02:00
Paolo Bonzini
b09481de91 target-i386: fix protection bits in the TLB for SMEP
User pages must be marked as non-executable when running under SMEP;
otherwise, fetching the page first and then calling it will fail.

With this patch, all SMEP testcases in kvm-unit-tests now pass.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:35 +02:00
Paolo Bonzini
de431a655a target-i386: support long addresses for 4MB pages (PSE-36)
4MB pages can use 40-bit addresses by putting the higher 8 bits in bits
20-13 of the PDE.  Bit 21 is reserved.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:35 +02:00
Paolo Bonzini
eaad03e472 target-i386: raise page fault for reserved bits in large pages
In large pages, bit 12 is for PAT, but bits starting at 13 are reserved.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:35 +02:00
Paolo Bonzini
e2a32ebbfe target-i386: unify reserved bits and NX bit check
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
e7e898a76a target-i386: simplify pte/vaddr calculation
They can moved to after the dirty bit processing, and unified between
CR0.PG=1 and CR0.PG=0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
e8f6d00c30 target-i386: raise page fault for reserved physical address bits
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
b728464ae8 target-i386: test reserved PS bit on PML4Es
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
c1eb2fa3fd target-i386: set correct error code for reserved bit access
The correct error code is 9 (present, reserved), not 8.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
77549a7809 target-i386: introduce support for 1 GB pages
Given the simplifications to the code in the previous patches, this
is now very simple to do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
b052e4509b target-i386: introduce do_check_protect label
This will help adding 1GB page support in the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
870a706735 target-i386: tweak handling of PG_NX_MASK
Remove the tail of the PAE case, so that we can use "goto" in the
next patch to jump to the protection checks.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
7c82256006 target-i386: commonize checks for PAE and non-PAE
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
487cad8853 target-i386: commonize checks for 4MB and 4KB pages
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
00cc3e1d70 target-i386: commonize checks for 2MB and 4KB pages
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
843408b3cf target-i386: fix coding standards in x86_cpu_handle_mmu_fault
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
f57584dc87 target-i386: simplify SMAP handling in MMU_KSMAP_IDX
Do not use this MMU index at all if CR4.SMAP is false, and drop
the SMAP check from x86_cpu_handle_mmu_fault.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
8a201bd47e target-i386: fix kernel accesses with SMAP and CPL = 3
With SMAP, implicit kernel accesses from user mode always behave as
if AC=0.  To do this, kernel mode is not anymore a separate MMU mode.
Instead, KERNEL_IDX is renamed to KSMAP_IDX and the kernel mode accessors
wrap KSMAP_IDX and KNOSMAP_IDX.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
81cf8d8adc target-i386: move check_io helpers to seg_helper.c
Prepare for adding _kernel accessors there in the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
43773ed369 target-i386: rename KSMAP to KNOSMAP
This is the mode where SMAP is overridden, put "NO" in its name.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:34 +02:00
Paolo Bonzini
c773828aa9 softmmu: move all load/store functions to cpu_ldst.h
Unify pieces of cpu-all.h, exec-all.h, softmmu_exec.h and tcg/tcg.h
into a single new header file with all helpers.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:33 +02:00
Paolo Bonzini
f08b617018 softmmu: introduce cpu_ldst.h
This will collect all load and store helpers soon.  For now
it is just a replacement for softmmu_exec.h, which this patch
stops including directly, but we also include it where this will
be necessary in order to simplify the next patch.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:33 +02:00
Paolo Bonzini
1d854765df target-arm: move arm_*_code to a separate file
These will soon require cpu_ldst.h, so move them out of cpu.h.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:33 +02:00
Paolo Bonzini
58ed270df9 softmmu: move softmmu_template.h out of include/
It is only included in cputlb.c now.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:33 +02:00
Paolo Bonzini
0f590e749f softmmu: commonize helper definitions
They do not need to be in op_helper.c.  Because cputlb.c now includes
softmmu_template.h twice for each size, io_readX must be elided the
second time through.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:33 +02:00
Paolo Bonzini
d94f0a8ecb softmmu: move ALIGNED_ONLY to cpu.h
Prepare for moving softmmu_header.h inclusion out of .c files

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:33 +02:00
Paolo Bonzini
93e22326d6 softmmu: make do_unaligned_access a method of CPU
We will reference it from more files in the next patch.  To avoid
ruining the small steps we're making towards multi-target, make
it a method of CPU rather than just a global.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:10:31 +02:00
Paolo Bonzini
ca0aa40816 softmmu: move definition of CPU_MMU_INDEX to inclusion site, drop ACCESS_TYPE
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:05:52 +02:00
Paolo Bonzini
a6c9eac0d5 softmmu: move MMUSUFFIX under SOFTMMU_CODE_ACCESS
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:05:47 +02:00
Paolo Bonzini
859d76120b softmmu: start introducing SOFTMMU_CODE_ACCESS in softmmu_header.h
This preprocessor symbol is already used in softmmu_template.h.  We
will use it to distinguish the two "fake" ACCESS_TYPEs
NB_MMU_MODES and NB_MMU_MODES + 1.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:05:27 +02:00
Paolo Bonzini
0983979b3a hw: use ld_p/st_p instead of ld_raw/st_raw
The ld_raw and st_raw definitions are only needed in code that
must compile for both user-mode and softmmu emulation.  Device
models can use the equivalent ld_p/st_p which are simple
pointer accessors.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:04:17 +02:00
Paolo Bonzini
fddbd80cc9 nseries: clean up coding style
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:03:53 +02:00
Stefan Weil
7e4e88656c cputlb: Fix regression with TCG interpreter (bug 1310324)
Commit 0f842f8a24 replaced GETPC_EXT() which
was derived from GETPC() by GETRA_EXT() without fixing cputlb.c. A later
patch replaced GETRA_EXT() by GETRA() in exec/softmmu_template.h which
is included in cputlb.c.

The TCG interpreter failed because the values returned by GETRA() were no
longer explicitly set to 0. The redefinition of GETRA() introduced here
fixes this.

In addition, GETPC_ADJ which is also used in exec/softmmu_template.h is
set to 0. Both changes reduce the compiled code size for cputlb.c by more
than 100 bytes, so the normal TCG without interpreter also profits from
the reduced code size and slightly faster code.

Cc: qemu-stable@nongnu.org
Reported-by: Giovanni Mascellani <gio@debian.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-05 16:03:37 +02:00
Alexander Graf
e3eb9806c7 TCG: Fix tcg_gen_extr_i64_tl for 32bit
We expose a generic helper "tcg_gen_extr_i64_tl" for 64bit targets, but the
same function for 32bit targets is a misnomer and refers to an invalid function
name.

Fix up the definition to point to the correct internal helper names instead.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-04 14:11:45 -07:00
Richard Henderson
3d1b2ff62c tcg: Remove TCG_TARGET_HAS_new_ldst
Since all backends have been converted, remove the compatibility code.

Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-04 14:10:26 -07:00
Richard Henderson
76782fab1c tci: Convert to new ldst opcodes
Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-04 14:10:16 -07:00
Richard Henderson
0b91966730 tcg-i386: Fix win64 qemu store
The first non-register argument isn't placed at offset 0.

Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-06-04 13:58:39 -07:00
Max Reitz
d6635c4dbb qemu-img: Document check exit codes
The exit code 63 (check not supported by image format) was not even
documented in the comment above the check command in the source code;
add it, as it does indeed seem useful.

Also, document all of check's exit codes in the manpage.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 11:30:32 +02:00
chai wen
1ac362cdbd block: fix wrong order in live block migration setup
The function init_blk_migration is better to be called before
set_dirty_tracking as the reasons below.

If we want to track dirty blocks via dirty_maps on a BlockDriverState
when doing live block-migration, its correspoding 'BlkMigDevState' should be
added to block_mig_state.bmds_list first for subsequent processing.
Otherwise set_dirty_tracking will do nothing on an empty list than allocating
dirty_bitmaps for them. And bdrv_get_dirty_count will access the
bmds->dirty_maps directly, then there would be a segfault triggered.

If the set_dirty_tracking fails, qemu_savevm_state_cancel will handle
the cleanup of init_blk_migration automatically.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 11:22:39 +02:00
Stefan Hajnoczi
b15446fdbf blockdev: acquire AioContext in block_set_io_throttle
The block_set_io_throttle QMP and HMP commands modify I/O throttling
limits for block devices.

Acquire the BlockDriverState's AioContext to protect against race
conditions with an IOThread that is running I/O for this device.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
22524f7262 throttle: add detach/attach test case
Add a test case that checks the timer is really removed/added by the
detach/attach functions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
13af91ebf0 throttle: add throttle_detach/attach_aio_context()
Block I/O throttling uses timers and currently always adds them to the
main loop.  Throttling will break if bdrv_set_aio_context() is used to
move a BlockDriverState to a different AioContext.

This patch adds throttle_detach/attach_aio_context() interfaces so the
throttling timers and uses them to move timers to the new AioContext.
Note that bdrv_set_aio_context() already drains all requests so we're
sure no throttled requests are pending.

The test cases need to be updated since the throttle_init() interface
has changed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-06-04 09:56:12 +02:00
Fam Zheng
c9f87b20b9 dataplane: Support VIRTIO_BLK_T_SCSI_CMD
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Fam Zheng
5a05cbeeaa virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
The common logic to process a scsi request in a VirtQueueElement is
extracted to a function to share with dataplane.

This makes VirtIOBlockReq.scsi unused, so drop it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Fam Zheng
6d7e73d62f virtio-blk: Allow config-wce in dataplane
Dataplane now uses block layer. Protect bdrv_set_enable_write_cache with
aio_context_acquire and aio_context_release, so we can enable config-wce
to allow guest to modify the write cache online.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Fam Zheng
db519cba87 block: Move declaration of bdrv_get_aio_context to block.h
block_int.h is for block layer and block drivers, other code shouldn't
include it. But similar to bdrv_set_aio_context, bdrv_get_aio_context
should also be accessible from outside of block layer.

Move it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
76ef2cf549 raw-posix: drop raw_get_aio_fd() since it is no longer used
virtio-blk data-plane now uses the QEMU block layer for I/O.  We do not
need raw_get_aio_fd() anymore.  It was a layering violation anyway, so
let's get rid of it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
8d242f0ed5 dataplane: implement async flush
Stop using the raw-posix file descriptor for synchronous
qemu_fdatasync().  Use bdrv_aio_flush() instead and drop the
VirtIOBlockDataPlane->fd field.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
0d6ecec252 dataplane: delete IOQueue since it is no longer used
This custom Linux AIO request queue is no longer used by virtio-blk
data-plane.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
580b6b2aa2 dataplane: use the QEMU block layer for I/O
Stop using a custom Linux AIO request queue from ioq.h and instead use
the QEMU block layer for I/O.

This patch adjusts the VirtIOBlockRequest struct with fields needed for
bdrv_aio_readv()/bdrv_aio_writev().  ioq.h used struct iovec and struct
iocb, which we don't need directly anymore.

Modify dataplane start/stop to set the AioContext on the
BlockDriverState.  We also no longer need to get the raw-posix file
descriptor.  This means image formats are now supported with dataplane!

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
c75f3bdf46 vmdk: implement .bdrv_detach/attach_aio_context()
Implement .bdrv_detach/attach_aio_context() interfaces to propagate
detach/attach to BDRVVmdkState->extents[].file.  The block layer takes
care of ->file and ->backing_hd but doesn't know about our extents
BlockDriverStates, which is also part of the graph.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
2af0b20056 ssh: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Use
bdrv_get_aio_context() to register fd handlers in the right AioContext
for this BlockDriverState.

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

For now this doesn't make much difference but will allow ssh to work in
IOThread instances in the future.

Acked-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
84390bed59 sheepdog: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_aio_set_fd_handler() to aio_set_fd_handler() and qemu_aio_wait() to
aio_poll().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the socket fd handler from the old to the new
AioContext.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Acked-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
ea80019149 rbd: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() and qemu_aio_wait() to aio_poll().

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

Cc: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
85ebd381fd block/raw-win32: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext for raw-win32.
Convert the aio-win32 code to support detach/attach and replace
qemu_aio_wait() with aio_poll().

The .bdrv_detach/attach_aio_context() interfaces move the aio-win32
event notifier from the old to the new AioContext.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
99cc598924 block/raw-win32: create one QEMUWin32AIOState per BDRVRawState
Each QEMUWin32AIOState event notifier is associated with an AioContext.
Since BlockDriverState instances can use different AioContexts we cannot
continue to use a global QEMUWin32AIOState.

Let each BDRVRawState have its own QEMUWin32AIOState and free it when
BDRVRawState is closed.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi
abd269b7cf block/linux-aio: fix memory and fd leak
Hot unplugging -drive aio=native,file=test.img,format=raw images leaves
the Linux AIO event notifier and struct qemu_laio_state allocated.
Luckily nothing will use the event notifier after the BlockDriverState
has been closed so the handler function is never called.

It's still worth fixing this resource leak.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
c2f3426c9b block/raw-posix: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext for Linux AIO.
Convert the Linux AIO event notifier to use aio_set_event_notifier().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the event notifier handler from the old to the new
AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
e3625d3d6a quorum: implement .bdrv_detach/attach_aio_context()
Implement .bdrv_detach/attach_aio_context() interfaces to propagate
detach/attach to BDRVQuorumState->bs[] children.  The block layer takes
care of ->file and ->backing_hd but doesn't know about our ->bs[]
BlockDriverStates, which is also part of the graph.

Reviewed-by: Benoît Canet <benoit.canet@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
a8c868c386 qed: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() and qemu_aio_wait() to aio_poll() so we're
using the BlockDriverState's AioContext.

Implement .bdrv_detach/attach_aio_context() interfaces to move the
QED_F_NEED_CHECK timer from the old AioContext to the new one.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
471799d135 nfs: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  The following
functions need to be converted:
 * qemu_bh_new() -> aio_bh_new()
 * qemu_aio_set_fd_handler() -> aio_set_fd_handler()
 * qemu_aio_wait() -> aio_poll()

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the fd handler from the old to the new AioContext.

Cc: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
69447cd8f3 nbd: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_aio_set_fd_handler() calls to aio_set_fd_handler().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the socket fd handler from the old to the new
AioContext.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
80cf625766 iscsi: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext for Linux
AIO.  Convert qemu_aio_set_fd_handler() to aio_set_fd_handler() and
timer_new_ms() to aio_timer_new().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the fd and timer from the old to the new AioContext.

Cc: Peter Lieven <pl@kamp.de>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
6ee50af24b gluster: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Use
aio_bh_new() instead of qemu_bh_new().

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
63f0f45f2e curl: implement .bdrv_detach/attach_aio_context()
The curl block driver uses fd handlers, timers, and BHs.  The fd
handlers and timers are managed on behalf of libcurl, which controls
them using callback functions that the block driver implements.

The simplest way to implement .bdrv_detach/attach_aio_context() is to
clean up libcurl in the old event loop and initialize it again in the
new event loop.  We do not need to keep track of anything since there
are no pending requests when the AioContext is changed.

Also make sure to use aio_set_fd_handler() instead of
qemu_aio_set_fd_handler() and aio_bh_new() instead of qemu_bh_new() so
the current AioContext is passed in.

Cc: Alexander Graf <agraf@suse.de>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
9d94020bc6 blkverify: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() and qemu_aio_wait() to aio_poll() so we
use the BlockDriverState's AioContext.

Implement .bdrv_detach/attach_aio_context() interfaces to propagate
detach/attach to BDRVBlkverifyState->test_file.  The block layer takes
care of ->file and ->backing_hd but doesn't know about our ->test_file
BlockDriverState, which is also part of the graph.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
7e1efdf0a3 blkdebug: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() so we use the BlockDriverState's
AioContext.

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
dcd042282d block: add bdrv_set_aio_context()
Up until now all BlockDriverState instances have used the QEMU main loop
for fd handlers, timers, and BHs.  This is not scalable on SMP guests
and hosts so we need to move to a model with multiple event loops on
different host CPUs.

bdrv_set_aio_context() assigns the AioContext event loop to use for a
particular BlockDriverState.  It first detaches the entire
BlockDriverState graph from the current AioContext and then attaches to
the new AioContext.

This function will be used by virtio-blk data-plane to assign a
BlockDriverState to its IOThread AioContext.  Make
bdrv_aio_set_context() public since data-plane should not include
block_int.h.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
9b536adcbe block: acquire AioContext in bdrv_drain_all()
Modify bdrv_drain_all() to take into account that BlockDriverState
instances may be running in different AioContexts.

This patch changes the implementation of bdrv_drain_all() while
preserving the semantics.  Previously kicking throttled requests and
checking for pending requests were done across all BlockDriverState
instances in sequence.  Now we process each BlockDriverState in turn,
making sure to acquire and release its AioContext.

This prevents race conditions between the thread executing
bdrv_drain_all() and the thread running the AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
ed78cda3de block: acquire AioContext in bdrv_*_all()
bdrv_close_all(), bdrv_commit_all(), bdrv_flush_all(),
bdrv_invalidate_cache_all(), and bdrv_clear_incoming_migration_all() are
called by main loop code and touch all BlockDriverState instances.

Some BlockDriverState instances may be running in another AioContext.
Make sure to acquire the AioContext before closing the BlockDriverState.

This will protect against race conditions once virtio-blk data-plane is
using the BlockDriverState from another AioContext event loop.

Note that this patch does not convert bdrv_drain_all() yet since that
conversion is non-trivial.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
2572b37a47 block: use BlockDriverState AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_aio_wait() to aio_poll() and qemu_bh_new() to aio_bh_new() so the
BlockDriverState AioContext is used.

Note there is still one qemu_aio_wait() left in bdrv_create() but we do
not have a BlockDriverState there and only main loop code invokes this
function.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi
924fe1293c aio: fix qemu_bh_schedule() bh->ctx race condition
qemu_bh_schedule() is supposed to be thread-safe at least the first time
it is called.  Unfortunately this is not quite true:

  bh->scheduled = 1;
  aio_notify(bh->ctx);

Since another thread may run the BH callback once it has been scheduled,
there is a race condition if the callback frees the BH before
aio_notify(bh->ctx) has a chance to run.

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
2014-06-04 09:56:06 +02:00
Jidong Xiao
79b6f2f651 kvm: Fix eax for cpuid leaf 0x40000000
Since Linux kernel 3.5, KVM has documented eax for leaf 0x40000000
to be KVM_CPUID_FEATURES:

57c22e5f35

But qemu still tries to set it to 0. It would be better to make qemu
and kvm consistent. This patch just fixes this issue.

Signed-off-by: Jidong Xiao <jidong.xiao@gmail.com>
[Include kvm_base in the value. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-04 09:12:04 +02:00
Gonglei
374044f08f qga: Fix handle fd leak in acquire_privilege()
token should be closed in all conditions.
So move CloseHandle(token) to "out" branch.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-03 15:07:59 -05:00
Marcelo Tosatti
9b1786829a kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Ensure proper env->tsc value for kvmclock_current_nsec calculation.

Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-03 18:40:48 +02:00
Alex Williamson
f522d2acc5 kvm: Enable -cpu option to hide KVM
The latest Nvidia driver (337.88) specifically checks for KVM as the
hypervisor and reports Code 43 for the driver in a Windows guest when
found.  Removing or changing the KVM signature is sufficient for the
driver to load and work.  This patch adds an option to easily allow
the KVM hypervisor signature to be hidden using '-cpu kvm=off'.  We
continue to expose KVM via the cpuid value by default.  The state of
this option does not supercede or replace -enable-kvm or the accel=kvm
machine option.  This only changes the visibility of KVM to the guest
and paravirtual features specifically tied to the KVM cpuid.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-03 18:40:48 +02:00
Eduardo Habkost
0e1dac6c41 kvm: Ensure negative return value on kvm_init() error handling path
We need to ensure ret < 0 when going through the error path, or QEMU may
try to run the half-initialized VM and crash.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 22:28:55 +02:00
Gabriel L. Somlo
eb386aaccc tests: add smbios testing
Add tests to find and verify the smbios entry point structure,
and to walk and perform checks on the actual smbios tables.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-05-27 23:42:16 +03:00
Gabriel L. Somlo
501f28ca9d tests: rename acpi-test to bios-tables-test
The test harness for acpi (generating a boot disk, starting qemu,
waiting for the BIOS to finish booting before examining guest
memory, etc.) is perfectly suited for testing other bios tables
beside acpi, such as e.g., smbios.

This patch renames acpi-test to bios-tables-test to reflect that,
and in preparation for adding smbios tests.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-05-27 23:42:16 +03:00
Ján Tomko
38dbd48b24 virtio-balloon: return empty data when no stats are available
If the guest hasn't updated the stats yet, instead of returning
an error, return '-1' for the stats and '0' as 'last-update'.

This lets applications ignore this without parsing the error message.

Related libvirt patch and discussion:
https://www.redhat.com/archives/libvir-list/2014-May/msg00460.html

Tested against current upstream libvirt - stat reporting works and
it no longer logs errors when the stats are queried on domain startup.
(Note: libvirt doesn't use the last-update field for anything yet)

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-05-25 12:46:58 +03:00
Paolo Bonzini
28fb26f19f target-i386: set CC_OP to CC_OP_EFLAGS in cpu_load_eflags
There is no reason to keep that out of the function.  The comment refers
to the disassembler's cc_op state rather than the CPUState field.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 18:02:08 +02:00
Paolo Bonzini
7125c937c9 target-i386: get CPL from SS.DPL
CS.RPL is not equal to the CPL in the few instructions between
setting CR0.PE and reloading CS.  We get this right in the common
case, because writes to CR0 do not modify the CPL, but it would
not be enough if an SMI comes exactly during that brief period.
Were this to happen, the RSM instruction would erroneously set
CPL to the low two bits of the real-mode selector; and if they are
not 00, the next instruction fetch cannot access the code segment
and causes a triple fault.

However, SS.DPL *is* always equal to the CPL.  In real processors
(AMD only) there is a weird case of SYSRET setting SS.DPL=SS.RPL
from the STAR register while forcing CPL=3, but we do not emulate
that.

Tested-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 18:02:08 +02:00
Paolo Bonzini
d3b5491897 target-i386: rework CPL checks during task switch, preparing for next patch
During task switch, all of CS.DPL, CS.RPL, SS.DPL must match (in addition
to all the other requirements) and will be the new CPL.  So far this worked
by carefully setting the CS selector and flags before doing the task
switch; but this will not work once we get the CPL from SS.DPL.

Temporarily assume that the CPL comes from CS.RPL during task switch
to a protected-mode task, until the descriptor of SS is loaded.

Tested-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 18:02:08 +02:00
Paolo Bonzini
b98dbc9095 target-i386: fix segment flags for SMM and VM86 mode
With the next patch, these need to be correct or VM86 tasks
have the wrong CPL.  The flags are basically what the Intel VMX
documentation say is mandatory for entry into a VM86 guest.

For consistency, SMM ought to have the same flags except with
CPL=0.

Tested-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 18:02:08 +02:00
Kevin O'Connor
87446327cc target-i386: Fix vm86 mode regression introduced in fd460606fd.
Commit fd460606fd moved setting of eflags above calls to
cpu_x86_load_seg_cache() in seg_helper.c.  Unfortunately, in
do_interrupt_protected() this moved the clearing of VM_MASK above a
test for it.

Fix this regression by storing the value of VM_MASK at the start of
do_interrupt_protected().

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 18:02:08 +02:00
Paolo Bonzini
b763adf1a6 kvm_stat: allow choosing between tracepoints and old stats
The old stats contain information not available in the tracepoints.
By default, keep the old behavior, but allow choosing which set of stats
to present, or even both.

Inspired by a patch from Marcelo Tosatti.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 17:56:37 +02:00
Andreas Färber
7c8b724826 pcie_host: Turn pcie_host_init() into an instance_init
This assures the trivial field initialization is applied for any derived
type - currently only Q35PCIHost.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-05-21 15:47:50 +03:00
Gabriel L. Somlo
0d73394ad9 SMBIOS: Fix type 17 field sizes
Fields for configured_clock_speed and various voltage values
introduced in spec v2.7+ should be "word", i.e. 16 bits.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-05-21 15:47:50 +03:00
Gabriel L. Somlo
84351843eb SMBIOS: Update Type 0 struct generator for machines >= 2.1
Update how type 0 (bios info) structures are generated, as follows:

  - convert bios_characteristics field to uin64_t (instead of
    uint8_t[8]), as described in the current smbios spec (v2.8)

  - enable "virtual machine" bit in bios_characteristics_extension_bits

  - add command line option to enable "uefi supported" bit in
    bios_characteristics_extension_bits

These updates should make this optional structure more useful when
used with edk2/ovmf. Only pc machines >= 2.1 are affected, and only
when a type 0 structure is explicitly specified on the command line.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-05-21 15:47:50 +03:00
Gabriel L. Somlo
fb5be2e833 SMBIOS: Fix endian-ness when populating multi-byte fields
When i386 guests are emulated on big endian hosts, make sure
multi-byte fields are populated safely via cpu_to_le*().

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-05-21 15:47:50 +03:00
BALATON Zoltan
13cc2c3e86 serial-pci: Set prog interface field of pci config to 16550 compatible
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2014-05-21 15:47:50 +03:00
Alexander Graf
a096b3a673 kvmclock: Ensure time in migration never goes backward
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.

To make sure we never go backwards, calculate what the guest would have seen
as time at the point of migration and use that value instead of the kernel
returned one when it's more recent.  This bases the view of the kvmclock
after migration on the same foundation in host as well as guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 12:01:45 +02:00
1106 changed files with 81438 additions and 25278 deletions

5
.gitignore vendored
View File

@@ -11,6 +11,10 @@
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/libcacard/trace/generated-tracers.c
@@ -26,6 +30,7 @@
/qapi-generated
/qapi-types.[ch]
/qapi-visit.[ch]
/qapi-event.[ch]
/qmp-commands.h
/qmp-marshal.c
/qemu-doc.html

3
.gitmodules vendored
View File

@@ -28,3 +28,6 @@
[submodule "dtc"]
path = dtc
url = git://git.qemu-project.org/dtc.git
[submodule "roms/u-boot"]
path = roms/u-boot
url = git://git.qemu-project.org/u-boot.git

View File

@@ -66,16 +66,16 @@ matrix:
compiler: gcc
# All the trace backends (apart from dtrace)
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backend=stderr"
EXTRA_CONFIG="--enable-trace-backends=stderr"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backend=simple"
EXTRA_CONFIG="--enable-trace-backends=simple"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backend=ftrace"
EXTRA_CONFIG="--enable-trace-backends=ftrace"
TEST_CMD=""
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
EXTRA_CONFIG="--enable-trace-backend=ust"
EXTRA_CONFIG="--enable-trace-backends=ust"
compiler: gcc

View File

@@ -91,3 +91,17 @@ Mixed declarations (interleaving statements and declarations within blocks)
are not allowed; declarations should be at the beginning of blocks. In other
words, the code should not generate warnings if using GCC's
-Wdeclaration-after-statement option.
6. Conditional statements
When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:
if (a == 1) {
/* Reads like: "If a equals 1" */
do_something();
}
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

View File

@@ -161,6 +161,12 @@ S: Maintained
F: target-xtensa/
F: hw/xtensa/
TriCore
M: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
S: Maintained
F: target-tricore/
F: hw/tricore/
Guest CPU Cores (KVM):
----------------------
@@ -176,6 +182,11 @@ M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: target-arm/kvm.c
MIPS
M: James Hogan <james.hogan@imgtec.com>
S: Maintained
F: target-mips/kvm.c
PPC
M: Alexander Graf <agraf@suse.de>
S: Maintained
@@ -558,6 +569,7 @@ Devices
-------
IDE
M: Kevin Wolf <kwolf@redhat.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
S: Odd Fixes
F: include/hw/ide.h
F: hw/ide/
@@ -608,7 +620,7 @@ USB
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: hw/usb/*
F: tests/usb-hcd-ehci-test.c
F: tests/usb-*-test.c
VFIO
M: Alex Williamson <alex.williamson@redhat.com>
@@ -672,6 +684,12 @@ S: Maintained
F: hw/*/xilinx_*
F: include/hw/xilinx.h
Vmware
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: hw/net/vmxnet*
F: hw/scsi/vmw_pvscsi*
Subsystems
----------
Audio
@@ -726,6 +744,16 @@ S: Odd Fixes
F: gdbstub*
F: gdb-xml/
Memory API
M: Paolo Bonzini <pbonzini@redhat.com>
S: Supported
F: include/exec/ioport.h
F: ioport.c
F: include/exec/memory.h
F: memory.c
F: include/exec/memory-internal.h
F: exec.c
SPICE
M: Gerd Hoffmann <kraxel@redhat.com>
S: Supported
@@ -838,7 +866,7 @@ S: Odd Fixes
F: scripts/checkpatch.pl
Seccomp
M: Eduardo Otubo <otubo@linux.vnet.ibm.com>
M: Eduardo Otubo <eduardo.otubo@profitbricks.com>
S: Supported
F: qemu-seccomp.c
F: include/sysemu/seccomp.h
@@ -952,7 +980,7 @@ S: Supported
F: block/rbd.c
Sheepdog
M: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
M: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
M: Liu Yuan <namei.unix@gmail.com>
L: sheepdog@lists.wpkg.org
S: Supported
@@ -984,3 +1012,9 @@ SSH
M: Richard W.M. Jones <rjones@redhat.com>
S: Supported
F: block/ssh.c
ARCHIPELAGO
M: Chrysostomos Nanakos <cnanakos@grnet.gr>
M: Chrysostomos Nanakos <chris@include.gr>
S: Maintained
F: block/archipelago.c

View File

@@ -45,19 +45,25 @@ endif
endif
GENERATED_HEADERS = config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += trace/generated-events.h
GENERATED_SOURCES += trace/generated-events.c
GENERATED_HEADERS += trace/generated-tracers.h
ifeq ($(TRACE_BACKEND),dtrace)
ifeq ($(findstring dtrace,$(TRACE_BACKENDS)),dtrace)
GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
ifeq ($(TRACE_BACKEND),ust)
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
endif
@@ -202,7 +208,7 @@ Makefile: $(version-obj-y) $(version-lobj-y)
# Build libraries
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y) qapi-types.o qapi-visit.o
libqemuutil.a: $(util-obj-y)
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
@@ -246,18 +252,27 @@ $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(gen-out-type) -o qga/qapi-generated -p "qga-" -i $<, \
" GEN $@")
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/event.json
qapi-types.c qapi-types.h :\
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b -i $<, \
" GEN $@")
qapi-visit.c qapi-visit.h :\
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b -i $<, \
" GEN $@")
qapi-event.c qapi-event.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \
$(gen-out-type) -o "." -b -i $<, \
" GEN $@")
qmp-commands.h qmp-marshal.c :\
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." -m -i $<, \
" GEN $@")
@@ -335,7 +350,8 @@ multiboot.bin linuxboot.bin kvmvapic.bin \
s390-zipl.rom \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper
palcode-clipper \
u-boot.e500
else
BLOBS=
endif
@@ -376,12 +392,8 @@ install-sysconfig: install-datadir install-confdir
install: all $(if $(BUILD_DOCS),install-doc) install-sysconfig \
install-datadir install-localstatedir
$(INSTALL_DIR) "$(DESTDIR)$(bindir)"
ifneq ($(TOOLS),)
$(INSTALL_PROG) $(TOOLS) "$(DESTDIR)$(bindir)"
ifneq ($(STRIP),)
$(STRIP) $(TOOLS:%="$(DESTDIR)$(bindir)/%")
endif
$(call install-prog,$(TOOLS),$(DESTDIR)$(bindir))
endif
ifneq ($(CONFIG_MODULES),)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_moddir)"
@@ -392,11 +404,7 @@ ifneq ($(CONFIG_MODULES),)
done
endif
ifneq ($(HELPERS-y),)
$(INSTALL_DIR) "$(DESTDIR)$(libexecdir)"
$(INSTALL_PROG) $(HELPERS-y) "$(DESTDIR)$(libexecdir)"
ifneq ($(STRIP),)
$(STRIP) $(HELPERS-y:%="$(DESTDIR)$(libexecdir)/%")
endif
$(call install-prog,$(HELPERS-y),$(DESTDIR)$(libexecdir))
endif
ifneq ($(BLOBS),)
set -e; for x in $(BLOBS); do \

View File

@@ -1,7 +1,7 @@
#######################################################################
# Common libraries for tools and emulators
stub-obj-y = stubs/
util-obj-y = util/ qobject/ qapi/ trace/
util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
@@ -12,7 +12,6 @@ block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qapi-types.o qapi-visit.o
block-obj-y += qemu-io-cmds.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
@@ -88,11 +87,6 @@ common-obj-y += qmp-marshal.o
common-obj-y += qmp.o hmp.o
endif
######################################################################
# some qapi visitors are used by both system and user emulation:
common-obj-y += qapi-visit.o qapi-types.o
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
@@ -106,10 +100,15 @@ common-obj-y += disas/
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
util-obj-y += trace/
target-obj-y += trace/
######################################################################
# guest agent
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/ qapi-types.o qapi-visit.o
qga-obj-y = qga/
qga-vss-dll-obj-y = qga/

View File

@@ -38,7 +38,7 @@ config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
ifdef CONFIG_TRACE_SYSTEMTAP
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp $(QEMU_PROG)-simpletrace.stp
ifdef CONFIG_USER_ONLY
TARGET_TYPE=user
@@ -49,7 +49,7 @@ endif
$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backend=$(TRACE_BACKEND) \
--backends=$(TRACE_BACKENDS) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
@@ -58,12 +58,19 @@ $(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backend=$(TRACE_BACKEND) \
--backends=$(TRACE_BACKENDS) \
--binary=$(realpath .)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
else
stap:
endif
@@ -85,6 +92,12 @@ obj-y += disas.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decContext.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decNumber.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal32.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal64.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
#########################################################
# Linux user emulator target
@@ -102,7 +115,8 @@ endif #CONFIG_LINUX_USER
ifdef CONFIG_BSD_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR)
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/bsd-user/$(HOST_VARIANT_DIR)
obj-y += bsd-user/
obj-y += gdbstub.o user-exec.o
@@ -112,7 +126,7 @@ endif #CONFIG_BSD_USER
#########################################################
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o
obj-y += hw/
obj-$(CONFIG_FDT) += device_tree.o
@@ -145,21 +159,22 @@ endif # CONFIG_SOFTMMU
dummy := $(call unnest-vars,,obj-y)
all-obj-y := $(obj-y)
target-obj-y :=
block-obj-y :=
common-obj-y :=
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
ifndef CONFIG_HAIKU
LIBS+=-lm
endif
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK,$^)
@@ -183,14 +198,12 @@ endif
install: all
ifneq ($(PROGS),)
$(INSTALL_PROG) $(PROGS) "$(DESTDIR)$(bindir)"
ifneq ($(STRIP),)
$(STRIP) $(PROGS:%="$(DESTDIR)$(bindir)/%")
endif
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
endif
ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
$(INSTALL_DATA) $(QEMU_PROG).stp-installed "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG).stp"
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
endif
GENERATED_HEADERS += config-target.h

View File

@@ -1 +1 @@
2.0.50
2.1.50

View File

@@ -100,6 +100,11 @@ void aio_set_event_notifier(AioContext *ctx,
(IOHandler *)io_read, NULL, notifier);
}
bool aio_prepare(AioContext *ctx)
{
return false;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -119,13 +124,22 @@ bool aio_pending(AioContext *ctx)
return false;
}
static bool aio_dispatch(AioContext *ctx)
bool aio_dispatch(AioContext *ctx)
{
AioHandler *node;
bool progress = false;
/*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
@@ -175,28 +189,24 @@ static bool aio_dispatch(AioContext *ctx)
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool was_dispatching;
int ret;
bool progress;
was_dispatching = ctx->dispatching;
progress = false;
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
if (aio_dispatch(ctx)) {
progress = true;
}
if (progress && !blocking) {
return true;
}
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
@@ -220,7 +230,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* wait until next event */
ret = qemu_poll_ns((GPollFD *)ctx->pollfds->data,
ctx->pollfds->len,
blocking ? timerlistgroup_deadline_ns(&ctx->tlg) : 0);
blocking ? aio_compute_timeout(ctx) : 0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
@@ -234,9 +244,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
/* Run dispatch even if there were no readable fds to run timers */
aio_set_dispatching(ctx, true);
if (aio_dispatch(ctx)) {
progress = true;
}
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -22,12 +22,80 @@
struct AioHandler {
EventNotifier *e;
IOHandler *io_read;
IOHandler *io_write;
EventNotifierHandler *io_notify;
GPollFD pfd;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
HANDLE event;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_malloc0(sizeof(AioHandler));
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
if (node->io_read) {
node->pfd.events |= G_IO_IN;
}
if (node->io_write) {
node->pfd.events |= G_IO_OUT;
}
node->e = &ctx->notifier;
/* Update handler with latest information */
node->opaque = opaque;
node->io_read = io_read;
node->io_write = io_write;
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify)
@@ -76,6 +144,43 @@ void aio_set_event_notifier(AioContext *ctx,
aio_notify(ctx);
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
AioHandler *node;
bool have_select_revents = false;
fd_set rfds, wfds;
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
if (node->io_write) {
FD_SET ((SOCKET)node->pfd.fd, &wfds);
}
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
have_select_revents = true;
}
if (FD_ISSET(node->pfd.fd, &wfds)) {
node->pfd.revents |= G_IO_OUT;
have_select_revents = true;
}
}
}
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -84,47 +189,37 @@ bool aio_pending(AioContext *ctx)
if (node->pfd.revents && node->io_notify) {
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
return true;
}
}
return false;
}
bool aio_poll(AioContext *ctx, bool blocking)
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool progress;
int count;
int timeout;
progress = false;
bool progress = false;
/*
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Run timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
/*
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (node->pfd.revents && node->io_notify) {
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
@@ -134,6 +229,28 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
}
if (!node->deleted &&
(node->io_read || node->io_write)) {
node->pfd.revents = 0;
if ((revents & G_IO_IN) && node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if ((revents & G_IO_OUT) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
/* if the next select() will return an event, we have progressed */
if (event == event_notifier_get_handle(&ctx->notifier)) {
WSANETWORKEVENTS ev;
WSAEnumNetworkEvents(node->pfd.fd, event, &ev);
if (ev.lNetworkEvents) {
progress = true;
}
}
}
tmp = node;
node = QLIST_NEXT(node, node);
@@ -145,10 +262,47 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
}
if (progress && !blocking) {
return true;
return progress;
}
bool aio_dispatch(AioContext *ctx)
{
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool was_dispatching, progress, have_select_revents, first;
int count;
int timeout;
if (aio_prepare(ctx)) {
blocking = false;
have_select_revents = true;
}
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
/* fill fd sets */
@@ -160,64 +314,42 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
ctx->walking_handlers--;
first = true;
/* wait until next event */
while (count > 0) {
HANDLE event;
int ret;
timeout = blocking ?
qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg)) : 0;
timeout = blocking
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
aio_set_dispatching(ctx, true);
if (first && aio_bh_poll(ctx)) {
progress = true;
}
first = false;
/* if we have any signaled events, dispatch event */
if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
event = NULL;
if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
event = events[ret - WAIT_OBJECT_0];
} else if (!have_select_revents) {
break;
}
have_select_revents = false;
blocking = false;
/* we have to walk very carefully in case
* qemu_aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
ctx->walking_handlers++;
if (!node->deleted &&
event_notifier_get_handle(node->e) == events[ret - WAIT_OBJECT_0] &&
node->io_notify) {
node->io_notify(node->e);
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
progress |= aio_dispatch_handlers(ctx, event);
/* Try again, but only call each handler once. */
events[ret - WAIT_OBJECT_0] = events[--count];
}
if (blocking) {
/* Run the timers a second time. We do this because otherwise aio_wait
* will not note progress - and will stop a drain early - if we have
* a timer that was not ready to run entering g_poll but is ready
* after g_poll. This will only do anything if a timer has expired.
*/
progress |= timerlistgroup_run_timers(&ctx->tlg);
}
progress |= timerlistgroup_run_timers(&ctx->tlg);
aio_set_dispatching(ctx, was_dispatching);
return progress;
}

View File

@@ -104,6 +104,8 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_XTENSA
#elif defined(TARGET_UNICORE32)
#define QEMU_ARCH QEMU_ARCH_UNICORE32
#elif defined(TARGET_TRICORE)
#define QEMU_ARCH QEMU_ARCH_TRICORE
#endif
const uint32_t arch_type = QEMU_ARCH;
@@ -739,7 +741,6 @@ static void migration_end(void)
XBZRLE_cache_lock();
if (XBZRLE.cache) {
cache_fini(XBZRLE.cache);
g_free(XBZRLE.cache);
g_free(XBZRLE.encoded_buf);
g_free(XBZRLE.current_buf);
XBZRLE.cache = NULL;
@@ -1041,17 +1042,15 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
{
ram_addr_t addr;
int flags, ret = 0;
int error;
static uint64_t seq_iter;
seq_iter++;
if (version_id != 4) {
ret = -EINVAL;
goto done;
}
do {
while (!ret) {
addr = qemu_get_be64(f);
flags = addr & ~TARGET_PAGE_MASK;
@@ -1075,11 +1074,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id))) {
if (block->length != length) {
error_report("Length mismatch: %s: " RAM_ADDR_FMT
" in != " RAM_ADDR_FMT, id, length,
error_report("Length mismatch: %s: 0x" RAM_ADDR_FMT
" in != 0x" RAM_ADDR_FMT, id, length,
block->length);
ret = -EINVAL;
goto done;
}
break;
}
@@ -1089,21 +1087,22 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
error_report("Unknown ramblock \"%s\", cannot "
"accept migration", id);
ret = -EINVAL;
goto done;
}
if (ret) {
break;
}
total_ram_bytes -= length;
}
}
if (flags & RAM_SAVE_FLAG_COMPRESS) {
} else if (flags & RAM_SAVE_FLAG_COMPRESS) {
void *host;
uint8_t ch;
host = host_from_stream_offset(f, addr, flags);
if (!host) {
error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
ret = -EINVAL;
goto done;
break;
}
ch = qemu_get_byte(f);
@@ -1113,33 +1112,39 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
host = host_from_stream_offset(f, addr, flags);
if (!host) {
error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
ret = -EINVAL;
goto done;
break;
}
qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
} else if (flags & RAM_SAVE_FLAG_XBZRLE) {
void *host = host_from_stream_offset(f, addr, flags);
if (!host) {
error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
ret = -EINVAL;
goto done;
break;
}
if (load_xbzrle(f, addr, host) < 0) {
error_report("Failed to decompress XBZRLE page at "
RAM_ADDR_FMT, addr);
ret = -EINVAL;
goto done;
break;
}
} else if (flags & RAM_SAVE_FLAG_HOOK) {
ram_control_load_hook(f, flags);
} else if (flags & RAM_SAVE_FLAG_EOS) {
/* normal exit */
break;
} else {
error_report("Unknown migration flags: %#x", flags);
ret = -EINVAL;
break;
}
error = qemu_file_get_error(f);
if (error) {
ret = error;
goto done;
}
} while (!(flags & RAM_SAVE_FLAG_EOS));
ret = qemu_file_get_error(f);
}
done:
DPRINTF("Completed load of VM with exit code %d seq iteration "
"%" PRIu64 "\n", ret, seq_iter);
return ret;

72
async.c
View File

@@ -26,6 +26,7 @@
#include "block/aio.h"
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
@@ -117,15 +118,21 @@ void qemu_bh_schedule_idle(QEMUBH *bh)
void qemu_bh_schedule(QEMUBH *bh)
{
AioContext *ctx;
if (bh->scheduled)
return;
ctx = bh->ctx;
bh->idle = 0;
/* Make sure that idle & any writes needed by the callback are done
* before the locations are read in the aio_bh_poll.
/* Make sure that:
* 1. idle & any writes needed by the callback are done before the
* locations are read in the aio_bh_poll.
* 2. ctx is loaded before scheduled is set and the callback has a chance
* to execute.
*/
smp_wmb();
smp_mb();
bh->scheduled = 1;
aio_notify(bh->ctx);
aio_notify(ctx);
}
@@ -145,39 +152,48 @@ void qemu_bh_delete(QEMUBH *bh)
bh->deleted = 1;
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
int64_t
aio_compute_timeout(AioContext *ctx)
{
AioContext *ctx = (AioContext *) source;
int64_t deadline;
int timeout = -1;
QEMUBH *bh;
int deadline;
/* We assume there is no timeout already supplied */
*timeout = -1;
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
*timeout = 10;
timeout = 10000000;
} else {
/* non-idle bottom halves will be executed
* immediately */
*timeout = 0;
return true;
return 0;
}
}
}
deadline = qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg));
deadline = timerlistgroup_deadline_ns(&ctx->tlg);
if (deadline == 0) {
*timeout = 0;
return true;
return 0;
} else {
*timeout = qemu_soonest_timeout(*timeout, deadline);
return qemu_soonest_timeout(timeout, deadline);
}
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
/* We assume there is no timeout already supplied */
*timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
if (aio_prepare(ctx)) {
*timeout = 0;
}
return false;
return *timeout == 0;
}
static gboolean
@@ -202,7 +218,7 @@ aio_ctx_dispatch(GSource *source,
AioContext *ctx = (AioContext *) source;
assert(callback == NULL);
aio_poll(ctx, false);
aio_dispatch(ctx);
return true;
}
@@ -241,9 +257,25 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
return ctx->thread_pool;
}
void aio_set_dispatching(AioContext *ctx, bool dispatching)
{
ctx->dispatching = dispatching;
if (!dispatching) {
/* Write ctx->dispatching before reading e.g. bh->scheduled.
* Optimization: this is only needed when we're entering the "unsafe"
* phase where other threads must call event_notifier_set.
*/
smp_mb();
}
}
void aio_notify(AioContext *ctx)
{
event_notifier_set(&ctx->notifier);
/* Write e.g. bh->scheduled before reading ctx->dispatching. */
smp_mb();
if (!ctx->dispatching) {
event_notifier_set(&ctx->notifier);
}
}
static void aio_timerlist_notify(void *opaque)

View File

@@ -815,10 +815,8 @@ static void alsa_fini_out (HWVoiceOut *hw)
ldebug ("alsa_fini\n");
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
static int alsa_init_out (HWVoiceOut *hw, struct audsettings *as)
@@ -978,10 +976,8 @@ static void alsa_fini_in (HWVoiceIn *hw)
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
static int alsa_run_in (HWVoiceIn *hw)

View File

@@ -71,10 +71,7 @@ static void glue (audio_init_nb_voices_, TYPE) (struct audio_driver *drv)
static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
{
if (HWBUF) {
g_free (HWBUF);
}
g_free (HWBUF);
HWBUF = NULL;
}
@@ -92,9 +89,7 @@ static int glue (audio_pcm_hw_alloc_resources_, TYPE) (HW *hw)
static void glue (audio_pcm_sw_free_resources_, TYPE) (SW *sw)
{
if (sw->buf) {
g_free (sw->buf);
}
g_free (sw->buf);
if (sw->rate) {
st_rate_stop (sw->rate);
@@ -172,10 +167,8 @@ static int glue (audio_pcm_sw_init_, TYPE) (
static void glue (audio_pcm_sw_fini_, TYPE) (SW *sw)
{
glue (audio_pcm_sw_free_resources_, TYPE) (sw);
if (sw->name) {
g_free (sw->name);
sw->name = NULL;
}
g_free (sw->name);
sw->name = NULL;
}
static void glue (audio_pcm_hw_add_sw_, TYPE) (HW *hw, SW *sw)

View File

@@ -736,10 +736,8 @@ static void oss_fini_in (HWVoiceIn *hw)
oss_anal_close (&oss->fd);
if (oss->pcm_buf) {
g_free (oss->pcm_buf);
oss->pcm_buf = NULL;
}
g_free(oss->pcm_buf);
oss->pcm_buf = NULL;
}
static int oss_run_in (HWVoiceIn *hw)

View File

@@ -1,8 +1,11 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o
common-obj-y += msmouse.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o
common-obj-$(CONFIG_LINUX) += hostmem-file.o

View File

@@ -574,7 +574,7 @@ CharDriverState *chr_baum_init(void)
int tty;
baum = g_malloc0(sizeof(BaumDriverState));
baum->chr = chr = g_malloc0(sizeof(CharDriverState));
baum->chr = chr = qemu_chr_alloc();
chr->opaque = baum;
chr->chr_write = baum_write;
@@ -629,7 +629,7 @@ fail_handle:
static void register_types(void)
{
register_char_driver_qapi("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
}
type_init(register_types);

134
backends/hostmem-file.c Normal file
View File

@@ -0,0 +1,134 @@
/*
* QEMU Host Memory Backend for hugetlbfs
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"
#include "qom/object_interfaces.h"
/* hostmem-file.c */
/**
* @TYPE_MEMORY_BACKEND_FILE:
* name of backend that uses mmap on a file descriptor
*/
#define TYPE_MEMORY_BACKEND_FILE "memory-backend-file"
#define MEMORY_BACKEND_FILE(obj) \
OBJECT_CHECK(HostMemoryBackendFile, (obj), TYPE_MEMORY_BACKEND_FILE)
typedef struct HostMemoryBackendFile HostMemoryBackendFile;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
bool share;
char *mem_path;
};
static void
file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
if (!fb->mem_path) {
error_setg(errp, "mem_path property not set");
return;
}
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!memory_region_size(&backend->mr)) {
backend->force_prealloc = mem_prealloc;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
object_get_canonical_path(OBJECT(backend)),
backend->size, fb->share,
fb->mem_path, errp);
}
#endif
}
static void
file_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
}
static char *get_mem_path(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return g_strdup(fb->mem_path);
}
static void set_mem_path(Object *o, const char *str, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
if (fb->mem_path) {
g_free(fb->mem_path);
}
fb->mem_path = g_strdup(str);
}
static bool file_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return fb->share;
}
static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
fb->share = value;
}
static void
file_backend_instance_init(Object *o)
{
object_property_add_bool(o, "share",
file_memory_backend_get_share,
file_memory_backend_set_share, NULL);
object_property_add_str(o, "mem-path", get_mem_path,
set_mem_path, NULL);
}
static const TypeInfo file_backend_info = {
.name = TYPE_MEMORY_BACKEND_FILE,
.parent = TYPE_MEMORY_BACKEND,
.class_init = file_backend_class_init,
.instance_init = file_backend_instance_init,
.instance_size = sizeof(HostMemoryBackendFile),
};
static void register_types(void)
{
type_register_static(&file_backend_info);
}
type_init(register_types);

53
backends/hostmem-ram.c Normal file
View File

@@ -0,0 +1,53 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qom/object_interfaces.h"
#define TYPE_MEMORY_BACKEND_RAM "memory-backend-ram"
static void
ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
char *path;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
g_free(path);
}
static void
ram_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = ram_backend_memory_alloc;
}
static const TypeInfo ram_backend_info = {
.name = TYPE_MEMORY_BACKEND_RAM,
.parent = TYPE_MEMORY_BACKEND,
.class_init = ram_backend_class_init,
};
static void register_types(void)
{
type_register_static(&ram_backend_info);
}
type_init(register_types);

365
backends/hostmem.c Normal file
View File

@@ -0,0 +1,365 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qapi/qmp/qerror.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
#ifdef CONFIG_NUMA
#include <numaif.h>
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_DEFAULT != MPOL_DEFAULT);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_PREFERRED != MPOL_PREFERRED);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_BIND != MPOL_BIND);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
#endif
static void
host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint64_t value = backend->size;
visit_type_size(v, &value, name, errp);
}
static void
host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
Error *local_err = NULL;
uint64_t value;
if (memory_region_size(&backend->mr)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, &value, name, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu64 "'", object_get_typename(obj), name, value);
goto out;
}
backend->size = value;
out:
error_propagate(errp, local_err);
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *host_nodes = NULL;
uint16List **node = &host_nodes;
unsigned long value;
value = find_first_bit(backend->host_nodes, MAX_NODES);
if (value == MAX_NODES) {
return;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
do {
value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
if (value == MAX_NODES) {
break;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
} while (true);
visit_type_uint16List(v, &host_nodes, name, errp);
}
static void
host_memory_backend_set_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
#ifdef CONFIG_NUMA
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *l = NULL;
visit_type_uint16List(v, &l, name, errp);
while (l) {
bitmap_set(backend->host_nodes, l->value, 1);
l = l->next;
}
#else
error_setg(errp, "NUMA node binding are not supported by this QEMU");
#endif
}
static void
host_memory_backend_get_policy(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
int policy = backend->policy;
visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
}
static void
host_memory_backend_set_policy(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
int policy;
visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
backend->policy = policy;
#ifndef CONFIG_NUMA
if (policy != HOST_MEM_POLICY_DEFAULT) {
error_setg(errp, "NUMA policies are not supported by this QEMU");
}
#endif
}
static bool host_memory_backend_get_merge(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->merge;
}
static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->merge = value;
return;
}
if (value != backend->merge) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
backend->merge = value;
}
}
static bool host_memory_backend_get_dump(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->dump;
}
static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->dump = value;
return;
}
if (value != backend->dump) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
backend->dump = value;
}
}
static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->prealloc || backend->force_prealloc;
}
static void host_memory_backend_set_prealloc(Object *obj, bool value,
Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (backend->force_prealloc) {
if (value) {
error_setg(errp,
"remove -mem-prealloc to use the prealloc property");
return;
}
}
if (!memory_region_size(&backend->mr)) {
backend->prealloc = value;
return;
}
if (value && !backend->prealloc) {
int fd = memory_region_get_fd(&backend->mr);
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
os_mem_prealloc(fd, ptr, sz);
backend->prealloc = true;
}
}
static void host_memory_backend_init(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
backend->merge = qemu_opt_get_bool(qemu_get_machine_opts(),
"mem-merge", true);
backend->dump = qemu_opt_get_bool(qemu_get_machine_opts(),
"dump-guest-core", true);
backend->prealloc = mem_prealloc;
object_property_add_bool(obj, "merge",
host_memory_backend_get_merge,
host_memory_backend_set_merge, NULL);
object_property_add_bool(obj, "dump",
host_memory_backend_get_dump,
host_memory_backend_set_dump, NULL);
object_property_add_bool(obj, "prealloc",
host_memory_backend_get_prealloc,
host_memory_backend_set_prealloc, NULL);
object_property_add(obj, "size", "int",
host_memory_backend_get_size,
host_memory_backend_set_size, NULL, NULL, NULL);
object_property_add(obj, "host-nodes", "int",
host_memory_backend_get_host_nodes,
host_memory_backend_set_host_nodes, NULL, NULL, NULL);
object_property_add(obj, "policy", "str",
host_memory_backend_get_policy,
host_memory_backend_set_policy, NULL, NULL, NULL);
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
}
static void
host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(uc);
HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
Error *local_err = NULL;
void *ptr;
uint64_t sz;
if (bc->alloc) {
bc->alloc(backend, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
ptr = memory_region_get_ram_ptr(&backend->mr);
sz = memory_region_size(&backend->mr);
if (backend->merge) {
qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
}
if (!backend->dump) {
qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
}
#ifdef CONFIG_NUMA
unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
/* lastbit == MAX_NODES means maxnode = 0 */
unsigned long maxnode = (lastbit + 1) % (MAX_NODES + 1);
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch hugepage case. */
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */
if (maxnode && backend->policy == MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be empty for policy default,"
" or you should explicitly specify a policy other"
" than default");
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_lookup[backend->policy]);
return;
}
/* We can have up to MAX_NODES nodes, but we need to pass maxnode+1
* as argument to mbind() due to an old Linux bug (feature?) which
* cuts off the last specified node. This means backend->host_nodes
* must have MAX_NODES+1 bits available.
*/
assert(sizeof(backend->host_nodes) >=
BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
assert(maxnode <= MAX_NODES);
if (mbind(ptr, sz, backend->policy,
maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
error_setg_errno(errp, errno,
"cannot bind memory to host NUMA nodes");
return;
}
#endif
/* Preallocate memory after the NUMA policy has been instantiated.
* This is necessary to guarantee memory is allocated with
* specified NUMA policy in place.
*/
if (backend->prealloc) {
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
}
}
}
static void
host_memory_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = host_memory_backend_memory_complete;
}
static const TypeInfo host_memory_backend_info = {
.name = TYPE_MEMORY_BACKEND,
.parent = TYPE_OBJECT,
.abstract = true,
.class_size = sizeof(HostMemoryBackendClass),
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)
{
type_register_static(&host_memory_backend_info);
}
type_init(register_types);

View File

@@ -67,7 +67,7 @@ CharDriverState *qemu_chr_open_msmouse(void)
{
CharDriverState *chr;
chr = g_malloc0(sizeof(CharDriverState));
chr = qemu_chr_alloc();
chr->chr_write = msmouse_chr_write;
chr->chr_close = msmouse_chr_close;
chr->explicit_be_open = true;
@@ -79,7 +79,7 @@ CharDriverState *qemu_chr_open_msmouse(void)
static void register_types(void)
{
register_char_driver_qapi("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
}
type_init(register_types);

View File

@@ -169,6 +169,7 @@ static void rng_egd_set_chardev(Object *obj, const char *value, Error **errp)
if (b->opened) {
error_set(errp, QERR_PERMISSION_DENIED);
} else {
g_free(s->chr_name);
s->chr_name = g_strdup(value);
}
}

View File

@@ -106,10 +106,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
return;
}
if (s->filename) {
g_free(s->filename);
}
g_free(s->filename);
s->filename = g_strdup(filename);
}

131
backends/testdev.c Normal file
View File

@@ -0,0 +1,131 @@
/*
* QEMU Char Device for testsuite control
*
* Copyright (c) 2014 Red Hat, Inc.
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
uint8_t c;
int arg;
#define EAT(c) do { \
if (!len--) { \
return 0; \
} \
c = *cur++; \
} while (0)
EAT(c);
while (isspace(c)) {
EAT(c);
}
arg = 0;
while (isdigit(c)) {
arg = arg * 10 + c - '0';
EAT(c);
}
while (isspace(c)) {
EAT(c);
}
switch (c) {
case 'q':
exit((arg << 1) | 1);
break;
default:
break;
}
return cur - testdev->in_buf;
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
/* Complete our buffer as much as possible */
tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
testdev->in_buf_used += tocopy;
buf += tocopy;
len -= tocopy;
/* Interpret it as much as possible */
while (testdev->in_buf_used > 0 &&
(eaten = testdev_eat_packet(testdev)) > 0) {
memmove(testdev->in_buf, testdev->in_buf + eaten,
testdev->in_buf_used - eaten);
testdev->in_buf_used -= eaten;
}
}
return orig_len;
}
static void testdev_close(struct CharDriverState *chr)
{
TestdevCharState *testdev = chr->opaque;
g_free(testdev);
}
CharDriverState *chr_testdev_init(void)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_malloc0(sizeof(TestdevCharState));
testdev->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
}
type_init(register_types);

View File

@@ -81,19 +81,6 @@ static int qemu_balloon_status(BalloonInfo *info)
return 1;
}
void qemu_balloon_changed(int64_t actual)
{
QObject *data;
data = qobject_from_jsonf("{ 'actual': %" PRId64 " }",
actual);
monitor_protocol_event(QEVENT_BALLOON_CHANGE, data);
qobject_decref(data);
}
BalloonInfo *qmp_query_balloon(Error **errp)
{
BalloonInfo *info;

View File

@@ -186,7 +186,7 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
if ((sector << BDRV_SECTOR_BITS) < bdrv_getlength(bmds->bs)) {
if (sector < bdrv_nb_sectors(bmds->bs)) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
@@ -223,8 +223,7 @@ static void alloc_aio_bitmap(BlkMigDevState *bmds)
BlockDriverState *bs = bmds->bs;
int64_t bitmap_size;
bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size = bdrv_nb_sectors(bs) + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
@@ -284,7 +283,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
nr_sectors = total_sectors - cur_sector;
}
blk = g_malloc(sizeof(BlkMigBlock));
blk = g_new(BlkMigBlock, 1);
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = cur_sector;
@@ -350,12 +349,12 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
int64_t sectors;
if (!bdrv_is_read_only(bs)) {
sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
sectors = bdrv_nb_sectors(bs);
if (sectors <= 0) {
return;
}
bmds = g_malloc0(sizeof(BlkMigDevState));
bmds = g_new0(BlkMigDevState, 1);
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
@@ -466,7 +465,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
blk = g_malloc(sizeof(BlkMigBlock));
blk = g_new(BlkMigBlock, 1);
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = sector;
@@ -629,6 +628,7 @@ static int block_save_setup(QEMUFile *f, void *opaque)
block_mig_state.submitted, block_mig_state.transferred);
qemu_mutex_lock_iothread();
init_blk_migration(f);
/* start track dirty blocks */
ret = set_dirty_tracking();
@@ -638,8 +638,6 @@ static int block_save_setup(QEMUFile *f, void *opaque)
return ret;
}
init_blk_migration(f);
qemu_mutex_unlock_iothread();
ret = flush_blks(f);
@@ -800,7 +798,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
if (bs != bs_prev) {
bs_prev = bs;
total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
total_sectors = bdrv_nb_sectors(bs);
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);
@@ -862,7 +860,7 @@ static bool block_is_active(void *opaque)
return block_mig_state.blk_enable == 1;
}
SaveVMHandlers savevm_block_handlers = {
static SaveVMHandlers savevm_block_handlers = {
.set_params = block_set_params,
.save_live_setup = block_save_setup,
.save_live_iterate = block_save_iterate,

1220
block.c

File diff suppressed because it is too large Load Diff

View File

@@ -10,15 +10,15 @@ block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
ifeq ($(CONFIG_POSIX),y)
block-obj-y += nbd.o nbd-client.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
endif
block-obj-y += accounting.o
common-obj-y += stream.o
common-obj-y += commit.o
@@ -35,5 +35,6 @@ gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio

54
block/accounting.c Normal file
View File

@@ -0,0 +1,54 @@
/*
* QEMU System Emulator block accounting
*
* Copyright (c) 2011 Christoph Hellwig
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "block/accounting.h"
#include "block/block_int.h"
void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = get_clock();
cookie->type = type;
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] += get_clock() - cookie->start_time_ns;
}
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors)
{
if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}

1099
block/archipelago.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -307,7 +307,7 @@ static void coroutine_fn backup_run(void *opaque)
BACKUP_SECTORS_PER_CLUSTER - i, &n);
i += n;
if (alloced == 1) {
if (alloced == 1 || n == 0) {
break;
}
}
@@ -325,7 +325,7 @@ static void coroutine_fn backup_run(void *opaque)
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
backup_error_action(job, error_is_read, -ret);
if (action == BDRV_ACTION_REPORT) {
if (action == BLOCK_ERROR_ACTION_REPORT) {
break;
} else {
start--;

View File

@@ -26,6 +26,10 @@
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
typedef struct BDRVBlkdebugState {
int state;
@@ -449,6 +453,10 @@ static void error_callback_bh(void *opaque)
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
if (acb->bh) {
qemu_bh_delete(acb->bh);
acb->bh = NULL;
}
qemu_aio_release(acb);
}
@@ -471,7 +479,7 @@ static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
acb = qemu_aio_get(&blkdebug_aiocb_info, bs, cb, opaque);
acb->ret = -error;
bh = qemu_bh_new(error_callback_bh, acb);
bh = aio_bh_new(bdrv_get_aio_context(bs), error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
@@ -522,6 +530,25 @@ static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockDriverAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file, cb, opaque);
}
static void blkdebug_close(BlockDriverState *bs)
{
@@ -687,6 +714,98 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file);
}
static void blkdebug_refresh_filename(BlockDriverState *bs)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
QDict *opts;
QList *inject_error_list = NULL, *set_state_list = NULL;
QList *suspend_list = NULL;
int event;
if (!bs->file->full_open_options) {
/* The config file cannot be recreated, so creating a plain filename
* is impossible */
return;
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));
for (event = 0; event < BLKDBG_EVENT_MAX; event++) {
QLIST_FOREACH(rule, &s->rules[event], next) {
if (rule->action == ACTION_INJECT_ERROR) {
QDict *inject_error = qdict_new();
qdict_put_obj(inject_error, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(inject_error, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(inject_error, "errno", QOBJECT(qint_from_int(
rule->options.inject.error)));
qdict_put_obj(inject_error, "sector", QOBJECT(qint_from_int(
rule->options.inject.sector)));
qdict_put_obj(inject_error, "once", QOBJECT(qbool_from_int(
rule->options.inject.once)));
qdict_put_obj(inject_error, "immediately",
QOBJECT(qbool_from_int(
rule->options.inject.immediately)));
if (!inject_error_list) {
inject_error_list = qlist_new();
}
qlist_append_obj(inject_error_list, QOBJECT(inject_error));
} else if (rule->action == ACTION_SET_STATE) {
QDict *set_state = qdict_new();
qdict_put_obj(set_state, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(set_state, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(set_state, "new_state", QOBJECT(qint_from_int(
rule->options.set_state.new_state)));
if (!set_state_list) {
set_state_list = qlist_new();
}
qlist_append_obj(set_state_list, QOBJECT(set_state));
} else if (rule->action == ACTION_SUSPEND) {
QDict *suspend = qdict_new();
qdict_put_obj(suspend, "event", QOBJECT(qstring_from_str(
BlkdebugEvent_lookup[rule->event])));
qdict_put_obj(suspend, "state",
QOBJECT(qint_from_int(rule->state)));
qdict_put_obj(suspend, "tag", QOBJECT(qstring_from_str(
rule->options.suspend.tag)));
if (!suspend_list) {
suspend_list = qlist_new();
}
qlist_append_obj(suspend_list, QOBJECT(suspend));
}
}
}
if (inject_error_list) {
qdict_put_obj(opts, "inject-error", QOBJECT(inject_error_list));
}
if (set_state_list) {
qdict_put_obj(opts, "set-state", QOBJECT(set_state_list));
}
if (suspend_list) {
qdict_put_obj(opts, "suspend", QOBJECT(suspend_list));
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
@@ -696,9 +815,11 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,

View File

@@ -10,6 +10,8 @@
#include <stdarg.h>
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
typedef struct {
BlockDriverState *test_file;
@@ -39,12 +41,13 @@ struct BlkverifyAIOCB {
static void blkverify_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkverifyAIOCB *acb = (BlkverifyAIOCB *)blockacb;
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
bool finished = false;
/* Wait until request completes, invokes its callback, and frees itself */
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
aio_poll(aio_context, true);
}
}
@@ -155,6 +158,7 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
}
@@ -228,7 +232,8 @@ static void blkverify_aio_cb(void *opaque, int ret)
acb->verify(acb);
}
acb->bh = qemu_bh_new(blkverify_aio_bh, acb);
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
blkverify_aio_bh, acb);
qemu_bh_schedule(acb->bh);
break;
}
@@ -302,21 +307,67 @@ static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
return bdrv_recurse_is_first_non_filter(s->test_file, candidate);
}
/* Propagate AioContext changes to ->test_file */
static void blkverify_detach_aio_context(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_detach_aio_context(s->test_file);
}
static void blkverify_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_attach_aio_context(s->test_file, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
/* bs->file has already been refreshed */
bdrv_refresh_filename(s->test_file);
if (bs->file->full_open_options && s->test_file->full_open_options) {
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
QINCREF(s->test_file->full_open_options);
qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
bs->full_open_options = opts;
}
if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->exact_filename, s->test_file->exact_filename);
}
}
static BlockDriver bdrv_blkverify = {
.format_name = "blkverify",
.protocol_name = "blkverify",
.instance_size = sizeof(BDRVBlkverifyState),
.format_name = "blkverify",
.protocol_name = "blkverify",
.instance_size = sizeof(BDRVBlkverifyState),
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.is_filter = true,
.bdrv_attach_aio_context = blkverify_attach_aio_context,
.bdrv_detach_aio_context = blkverify_detach_aio_context,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
};

View File

@@ -131,7 +131,11 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EFBIG;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
error_setg(errp, "Could not allocate memory for catalog");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);

View File

@@ -116,7 +116,12 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
"try increasing block size");
return -EINVAL;
}
s->offsets = g_malloc(offsets_size);
s->offsets = g_try_malloc(offsets_size);
if (s->offsets == NULL) {
error_setg(errp, "Could not allocate offsets table");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
@@ -158,8 +163,20 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_block = g_malloc(max_compressed_block_size + 1);
s->uncompressed_block = g_malloc(s->block_size);
s->compressed_block = g_try_malloc(max_compressed_block_size + 1);
if (s->compressed_block == NULL) {
error_setg(errp, "Could not allocate compressed_block");
ret = -ENOMEM;
goto fail;
}
s->uncompressed_block = g_try_malloc(s->block_size);
if (s->uncompressed_block == NULL) {
error_setg(errp, "Could not allocate uncompressed_block");
ret = -ENOMEM;
goto fail;
}
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;

View File

@@ -37,6 +37,7 @@ typedef struct CommitBlockJob {
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockDriverState *bs,
@@ -141,7 +142,7 @@ wait:
if (!block_job_is_cancelled(&s->common) && sector_num == end) {
/* success */
ret = bdrv_drop_intermediate(active, top, base);
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
exit_free_buf:
@@ -158,7 +159,7 @@ exit_restore_reopen:
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
}
@@ -182,7 +183,7 @@ static const BlockJobDriver commit_job_driver = {
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
void *opaque, const char *backing_file_str, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
@@ -244,6 +245,8 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(commit_run);

View File

@@ -324,39 +324,32 @@ static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int cow_create(const char *filename, QemuOpts *opts, Error **errp)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
const char *image_filename = NULL;
char *image_filename = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *cow_bs;
BlockDriverState *cow_bs = NULL;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_sectors = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
image_filename = options->value.s;
}
options++;
}
image_sectors = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
image_filename = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
ret = bdrv_create_file(filename, options, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
goto exit;
}
cow_bs = NULL;
ret = bdrv_open(&cow_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
goto exit;
}
memset(&cow_header, 0, sizeof(cow_header));
@@ -389,22 +382,29 @@ static int cow_create(const char *filename, QEMUOptionParameter *options,
}
exit:
bdrv_unref(cow_bs);
g_free(image_filename);
if (cow_bs) {
bdrv_unref(cow_bs);
}
return ret;
}
static QEMUOptionParameter cow_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{ NULL }
static QemuOptsList cow_create_opts = {
.name = "cow-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(cow_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_cow = {
@@ -416,12 +416,13 @@ static BlockDriver bdrv_cow = {
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_get_block_status = cow_co_get_block_status,
.create_options = cow_create_options,
.create_opts = &cow_create_opts,
};
static void bdrv_cow_init(void)

View File

@@ -26,7 +26,7 @@
#include "qapi/qmp/qbool.h"
#include <curl/curl.h>
// #define DEBUG
// #define DEBUG_CURL
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
@@ -63,6 +63,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_NUM_ACB 8
#define SECTOR_SIZE 512
#define READ_AHEAD_DEFAULT (256 * 1024)
#define CURL_TIMEOUT_DEFAULT 5
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
@@ -71,6 +72,8 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
struct BDRVCURLState;
@@ -109,7 +112,10 @@ typedef struct BDRVCURLState {
char *url;
size_t readahead_size;
bool sslverify;
int timeout;
char *cookie;
bool accept_range;
AioContext *aio_context;
} BDRVCURLState;
static void curl_clean_state(CURLState *s);
@@ -134,25 +140,29 @@ static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
#endif
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *s, void *sp)
void *userp, void *sp)
{
BDRVCURLState *s;
CURLState *state = NULL;
curl_easy_getinfo(curl, CURLINFO_PRIVATE, (char **)&state);
state->sock_fd = fd;
s = state->s;
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
switch (action) {
case CURL_POLL_IN:
qemu_aio_set_fd_handler(fd, curl_multi_read, NULL, state);
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
NULL, state);
break;
case CURL_POLL_OUT:
qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, state);
aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
break;
case CURL_POLL_INOUT:
qemu_aio_set_fd_handler(fd, curl_multi_read, curl_multi_do, state);
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
curl_multi_do, state);
break;
case CURL_POLL_REMOVE:
qemu_aio_set_fd_handler(fd, NULL, NULL, NULL);
aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
break;
}
@@ -347,7 +357,7 @@ static void curl_multi_timeout_do(void *arg)
#endif
}
static CURLState *curl_init_state(BDRVCURLState *s)
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
@@ -365,7 +375,7 @@ static CURLState *curl_init_state(BDRVCURLState *s)
break;
}
if (!state) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(bs), true);
}
} while(!state);
@@ -377,7 +387,10 @@ static CURLState *curl_init_state(BDRVCURLState *s)
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
(long) s->sslverify);
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, 5);
if (s->cookie) {
curl_easy_setopt(state->curl, CURLOPT_COOKIE, s->cookie);
}
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, s->timeout);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION,
(void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
@@ -422,6 +435,49 @@ static void curl_parse_filename(const char *filename, QDict *options,
qdict_put(options, CURL_BLOCK_OPT_URL, qstring_from_str(filename));
}
static void curl_detach_aio_context(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (s->states[i].in_use) {
curl_clean_state(&s->states[i]);
}
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
if (s->multi) {
curl_multi_cleanup(s->multi);
s->multi = NULL;
}
timer_del(&s->timer);
}
static void curl_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVCURLState *s = bs->opaque;
aio_timer_init(new_context, &s->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
curl_multi_timeout_do, s);
assert(!s->multi);
s->multi = curl_multi_init();
s->aio_context = new_context;
curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
#ifdef NEED_CURL_TIMER_CALLBACK
curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_timer_cb);
#endif
}
static QemuOptsList runtime_opts = {
.name = "curl",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
@@ -441,6 +497,16 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Verify SSL certificate"
},
{
.name = CURL_BLOCK_OPT_TIMEOUT,
.type = QEMU_OPT_NUMBER,
.help = "Curl timeout"
},
{
.name = CURL_BLOCK_OPT_COOKIE,
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{ /* end of list */ }
},
};
@@ -453,6 +519,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
QemuOpts *opts;
Error *local_err = NULL;
const char *file;
const char *cookie;
double d;
static int inited = 0;
@@ -477,8 +544,14 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
CURL_TIMEOUT_DEFAULT);
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
error_setg(errp, "curl block driver requires an 'url' option");
@@ -491,8 +564,9 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
DPRINTF("CURL: Opening %s\n", file);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(s);
state = curl_init_state(bs, s);
if (!state)
goto out_noclean;
@@ -523,19 +597,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
curl_easy_cleanup(state->curl);
state->curl = NULL;
aio_timer_init(bdrv_get_aio_context(bs), &s->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
curl_multi_timeout_do, s);
// Now we know the file exists and its size, so let's
// initialize the multi interface!
s->multi = curl_multi_init();
curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
#ifdef NEED_CURL_TIMER_CALLBACK
curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_timer_cb);
#endif
curl_attach_aio_context(bs, bdrv_get_aio_context(bs));
qemu_opts_del(opts);
return 0;
@@ -545,6 +607,7 @@ out:
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
return -EINVAL;
@@ -588,7 +651,7 @@ static void curl_readv_bh_cb(void *p)
}
// No cache found, so let's start a new request
state = curl_init_state(s);
state = curl_init_state(acb->common.bs, s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
@@ -599,12 +662,17 @@ static void curl_readv_bh_cb(void *p)
acb->end = (acb->nb_sectors * SECTOR_SIZE);
state->buf_off = 0;
if (state->orig_buf)
g_free(state->orig_buf);
g_free(state->orig_buf);
state->buf_start = start;
state->buf_len = acb->end + s->readahead_size;
end = MIN(start + state->buf_len, s->len) - 1;
state->orig_buf = g_malloc(state->buf_len);
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_release(acb);
return;
}
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
@@ -630,7 +698,7 @@ static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
acb->bh = qemu_bh_new(curl_readv_bh_cb, acb);
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
@@ -638,26 +706,11 @@ static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
static void curl_close(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
DPRINTF("CURL: Close\n");
for (i=0; i<CURL_NUM_STATES; i++) {
if (s->states[i].in_use)
curl_clean_state(&s->states[i]);
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
if (s->states[i].orig_buf) {
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
}
if (s->multi)
curl_multi_cleanup(s->multi);
timer_del(&s->timer);
curl_detach_aio_context(bs);
g_free(s->cookie);
g_free(s->url);
}
@@ -668,68 +721,83 @@ static int64_t curl_getlength(BlockDriverState *bs)
}
static BlockDriver bdrv_http = {
.format_name = "http",
.protocol_name = "http",
.format_name = "http",
.protocol_name = "http",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
};
static BlockDriver bdrv_https = {
.format_name = "https",
.protocol_name = "https",
.format_name = "https",
.protocol_name = "https",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
};
static BlockDriver bdrv_ftp = {
.format_name = "ftp",
.protocol_name = "ftp",
.format_name = "ftp",
.protocol_name = "ftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
};
static BlockDriver bdrv_ftps = {
.format_name = "ftps",
.protocol_name = "ftps",
.format_name = "ftps",
.protocol_name = "ftps",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
};
static BlockDriver bdrv_tftp = {
.format_name = "tftp",
.protocol_name = "tftp",
.format_name = "tftp",
.protocol_name = "tftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
};
static void curl_block_init(void)

View File

@@ -284,8 +284,15 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_chunk = g_malloc(max_compressed_size + 1);
s->uncompressed_chunk = g_malloc(512 * max_sectors_per_chunk);
s->compressed_chunk = qemu_try_blockalign(bs->file,
max_compressed_size + 1);
s->uncompressed_chunk = qemu_try_blockalign(bs->file,
512 * max_sectors_per_chunk);
if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
ret = -ENOMEM;
goto fail;
}
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
@@ -302,8 +309,8 @@ fail:
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
return ret;
}
@@ -426,8 +433,8 @@ static void dmg_close(BlockDriverState *bs)
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
inflateEnd(&s->zstream);
}

View File

@@ -16,6 +16,7 @@ typedef struct GlusterAIOCB {
int ret;
QEMUBH *bh;
Coroutine *coroutine;
AioContext *aio_context;
} GlusterAIOCB;
typedef struct BDRVGlusterState {
@@ -249,7 +250,7 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
acb->ret = -EIO; /* Partial read/write - fail it */
}
acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
acb->bh = aio_bh_new(acb->aio_context, qemu_gluster_complete_aio, acb);
qemu_bh_schedule(acb->bh);
}
@@ -290,7 +291,7 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
BDRVGlusterState *s = bs->opaque;
int open_flags = 0;
int ret = 0;
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
GlusterConf *gconf = g_new0(GlusterConf, 1);
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
@@ -350,12 +351,12 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
assert(state != NULL);
assert(state->bs != NULL);
state->opaque = g_malloc0(sizeof(BDRVGlusterReopenState));
state->opaque = g_new0(BDRVGlusterReopenState, 1);
reop_s = state->opaque;
qemu_gluster_parse_flags(state->flags, &open_flags);
gconf = g_malloc0(sizeof(GlusterConf));
gconf = g_new0(GlusterConf, 1);
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
@@ -436,6 +437,7 @@ static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
@@ -476,14 +478,15 @@ static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
#endif
static int qemu_gluster_create(const char *filename,
QEMUOptionParameter *options, Error **errp)
QemuOpts *opts, Error **errp)
{
struct glfs *glfs;
struct glfs_fd *fd;
int ret = 0;
int prealloc = 0;
int64_t total_size = 0;
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
char *tmp = NULL;
GlusterConf *gconf = g_new0(GlusterConf, 1);
glfs = qemu_gluster_init(gconf, filename, errp);
if (!glfs) {
@@ -491,24 +494,21 @@ static int qemu_gluster_create(const char *filename,
goto out;
}
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / BDRV_SECTOR_SIZE;
} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
if (!options->value.s || !strcmp(options->value.s, "off")) {
prealloc = 0;
} else if (!strcmp(options->value.s, "full") &&
gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API",
options->value.s);
ret = -EINVAL;
goto out;
}
}
options++;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
prealloc = 0;
} else if (!strcmp(tmp, "full") &&
gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API",
tmp);
ret = -EINVAL;
goto out;
}
fd = glfs_creat(glfs, gconf->image,
@@ -516,9 +516,8 @@ static int qemu_gluster_create(const char *filename,
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE)) {
if (prealloc && qemu_gluster_zerofill(fd, 0,
total_size * BDRV_SECTOR_SIZE)) {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
ret = -errno;
}
} else {
@@ -530,6 +529,7 @@ static int qemu_gluster_create(const char *filename,
}
}
out:
g_free(tmp);
qemu_gluster_gconf_free(gconf);
if (glfs) {
glfs_fini(glfs);
@@ -549,6 +549,7 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
if (write) {
ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
@@ -605,6 +606,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
if (ret < 0) {
@@ -633,6 +635,7 @@ static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
@@ -693,18 +696,22 @@ static int qemu_gluster_has_zero_init(BlockDriverState *bs)
return 0;
}
static QEMUOptionParameter qemu_gluster_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{ NULL }
static QemuOptsList qemu_gluster_create_opts = {
.name = "qemu-gluster-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_gluster = {
@@ -731,7 +738,7 @@ static BlockDriver bdrv_gluster = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_options = qemu_gluster_create_options,
.create_opts = &qemu_gluster_create_opts,
};
static BlockDriver bdrv_gluster_tcp = {
@@ -758,7 +765,7 @@ static BlockDriver bdrv_gluster_tcp = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_options = qemu_gluster_create_options,
.create_opts = &qemu_gluster_create_opts,
};
static BlockDriver bdrv_gluster_unix = {
@@ -785,7 +792,7 @@ static BlockDriver bdrv_gluster_unix = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_options = qemu_gluster_create_options,
.create_opts = &qemu_gluster_create_opts,
};
static BlockDriver bdrv_gluster_rdma = {
@@ -812,7 +819,7 @@ static BlockDriver bdrv_gluster_rdma = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_options = qemu_gluster_create_options,
.create_opts = &qemu_gluster_create_opts,
};
static void bdrv_gluster_init(void)

View File

@@ -26,6 +26,7 @@
#include "config-host.h"
#include <poll.h>
#include <math.h>
#include <arpa/inet.h>
#include "qemu-common.h"
#include "qemu/config-file.h"
@@ -49,6 +50,7 @@
typedef struct IscsiLun {
struct iscsi_context *iscsi;
AioContext *aio_context;
int lun;
enum scsi_inquiry_peripheral_device_type type;
int block_size;
@@ -63,6 +65,7 @@ typedef struct IscsiLun {
unsigned char *zeroblock;
unsigned long *allocationmap;
int cluster_sectors;
bool use_16_for_rw;
} IscsiLun;
typedef struct IscsiTask {
@@ -73,6 +76,8 @@ typedef struct IscsiTask {
struct scsi_task *task;
Coroutine *co;
QEMUBH *bh;
IscsiLun *iscsilun;
QEMUTimer retry_timer;
} IscsiTask;
typedef struct IscsiAIOCB {
@@ -84,7 +89,6 @@ typedef struct IscsiAIOCB {
uint8_t *buf;
int status;
int canceled;
int retries;
int64_t sector_num;
int nb_sectors;
#ifdef __linux__
@@ -94,9 +98,10 @@ typedef struct IscsiAIOCB {
#define NOP_INTERVAL 5000
#define MAX_NOP_FAILURES 3
#define ISCSI_CMD_RETRIES 5
#define ISCSI_CMD_RETRIES ARRAY_SIZE(iscsi_retry_times)
static const unsigned iscsi_retry_times[] = {8, 32, 128, 512, 2048};
/* this threshhold is a trade-off knob to choose between
/* this threshold is a trade-off knob to choose between
* the potential additional overhead of an extra GET_LBA_STATUS request
* vs. unnecessarily reading a lot of zero sectors over the wire.
* If a read request is greater or equal than ISCSI_CHECKALLOC_THRES
@@ -133,17 +138,32 @@ iscsi_schedule_bh(IscsiAIOCB *acb)
if (acb->bh) {
return;
}
acb->bh = qemu_bh_new(iscsi_bh_cb, acb);
acb->bh = aio_bh_new(acb->iscsilun->aio_context, iscsi_bh_cb, acb);
qemu_bh_schedule(acb->bh);
}
static void iscsi_co_generic_bh_cb(void *opaque)
{
struct IscsiTask *iTask = opaque;
iTask->complete = 1;
qemu_bh_delete(iTask->bh);
qemu_coroutine_enter(iTask->co, NULL);
}
static void iscsi_retry_timer_expired(void *opaque)
{
struct IscsiTask *iTask = opaque;
iTask->complete = 1;
if (iTask->co) {
qemu_coroutine_enter(iTask->co, NULL);
}
}
static inline unsigned exp_random(double mean)
{
return -mean * log((double)rand() / RAND_MAX);
}
static void
iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
void *command_data, void *opaque)
@@ -151,26 +171,44 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
struct IscsiTask *iTask = opaque;
struct scsi_task *task = command_data;
iTask->complete = 1;
iTask->status = status;
iTask->do_retry = 0;
iTask->task = task;
if (iTask->retries-- > 0 && status == SCSI_STATUS_CHECK_CONDITION
&& task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
error_report("iSCSI CheckCondition: %s", iscsi_get_error(iscsi));
iTask->do_retry = 1;
goto out;
}
if (status != SCSI_STATUS_GOOD) {
if (iTask->retries++ < ISCSI_CMD_RETRIES) {
if (status == SCSI_STATUS_CHECK_CONDITION
&& task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
error_report("iSCSI CheckCondition: %s",
iscsi_get_error(iscsi));
iTask->do_retry = 1;
goto out;
}
if (status == SCSI_STATUS_BUSY) {
unsigned retry_time =
exp_random(iscsi_retry_times[iTask->retries - 1]);
error_report("iSCSI Busy (retry #%u in %u ms): %s",
iTask->retries, retry_time,
iscsi_get_error(iscsi));
aio_timer_init(iTask->iscsilun->aio_context,
&iTask->retry_timer, QEMU_CLOCK_REALTIME,
SCALE_MS, iscsi_retry_timer_expired, iTask);
timer_mod(&iTask->retry_timer,
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + retry_time);
iTask->do_retry = 1;
return;
}
}
error_report("iSCSI Failure: %s", iscsi_get_error(iscsi));
}
out:
if (iTask->co) {
iTask->bh = qemu_bh_new(iscsi_co_generic_bh_cb, iTask);
iTask->bh = aio_bh_new(iTask->iscsilun->aio_context,
iscsi_co_generic_bh_cb, iTask);
qemu_bh_schedule(iTask->bh);
} else {
iTask->complete = 1;
}
}
@@ -178,7 +216,7 @@ static void iscsi_co_init_iscsitask(IscsiLun *iscsilun, struct IscsiTask *iTask)
{
*iTask = (struct IscsiTask) {
.co = qemu_coroutine_self(),
.retries = ISCSI_CMD_RETRIES,
.iscsilun = iscsilun,
};
}
@@ -209,7 +247,7 @@ iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
iscsi_abort_task_cb, acb);
while (acb->status == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(iscsilun->aio_context, true);
}
}
@@ -232,10 +270,11 @@ iscsi_set_events(IscsiLun *iscsilun)
ev = POLLIN;
ev |= iscsi_which_events(iscsi);
if (ev != iscsilun->events) {
qemu_aio_set_fd_handler(iscsi_get_fd(iscsi),
iscsi_process_read,
(ev & POLLOUT) ? iscsi_process_write : NULL,
iscsilun);
aio_set_fd_handler(iscsilun->aio_context,
iscsi_get_fd(iscsi),
iscsi_process_read,
(ev & POLLOUT) ? iscsi_process_write : NULL,
iscsilun);
}
@@ -320,8 +359,6 @@ static int coroutine_fn iscsi_co_writev(BlockDriverState *bs,
struct IscsiTask iTask;
uint64_t lba;
uint32_t num_sectors;
uint8_t *data = NULL;
uint8_t *buf = NULL;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
@@ -329,31 +366,24 @@ static int coroutine_fn iscsi_co_writev(BlockDriverState *bs,
lba = sector_qemu2lun(sector_num, iscsilun);
num_sectors = sector_qemu2lun(nb_sectors, iscsilun);
#if !defined(LIBISCSI_FEATURE_IOVECTOR)
/* if the iovec only contains one buffer we can pass it directly */
if (iov->niov == 1) {
data = iov->iov[0].iov_base;
} else {
size_t size = MIN(nb_sectors * BDRV_SECTOR_SIZE, iov->size);
buf = g_malloc(size);
qemu_iovec_to_buf(iov, 0, buf, size);
data = buf;
}
#endif
iscsi_co_init_iscsitask(iscsilun, &iTask);
retry:
iTask.task = iscsi_write16_task(iscsilun->iscsi, iscsilun->lun, lba,
data, num_sectors * iscsilun->block_size,
iscsilun->block_size, 0, 0, 0, 0, 0,
iscsi_co_generic_cb, &iTask);
if (iscsilun->use_16_for_rw) {
iTask.task = iscsi_write16_task(iscsilun->iscsi, iscsilun->lun, lba,
NULL, num_sectors * iscsilun->block_size,
iscsilun->block_size, 0, 0, 0, 0, 0,
iscsi_co_generic_cb, &iTask);
} else {
iTask.task = iscsi_write10_task(iscsilun->iscsi, iscsilun->lun, lba,
NULL, num_sectors * iscsilun->block_size,
iscsilun->block_size, 0, 0, 0, 0, 0,
iscsi_co_generic_cb, &iTask);
}
if (iTask.task == NULL) {
g_free(buf);
return -ENOMEM;
}
#if defined(LIBISCSI_FEATURE_IOVECTOR)
scsi_task_set_iov_out(iTask.task, (struct scsi_iovec *) iov->iov,
iov->niov);
#endif
while (!iTask.complete) {
iscsi_set_events(iscsilun);
qemu_coroutine_yield();
@@ -369,8 +399,6 @@ retry:
goto retry;
}
g_free(buf);
if (iTask.status != SCSI_STATUS_GOOD) {
return -EIO;
}
@@ -381,7 +409,6 @@ retry:
}
#if defined(LIBISCSI_FEATURE_IOVECTOR)
static bool iscsi_allocationmap_is_allocated(IscsiLun *iscsilun,
int64_t sector_num, int nb_sectors)
{
@@ -491,9 +518,6 @@ out:
return ret;
}
#endif /* LIBISCSI_FEATURE_IOVECTOR */
static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *iov)
@@ -502,15 +526,11 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
struct IscsiTask iTask;
uint64_t lba;
uint32_t num_sectors;
#if !defined(LIBISCSI_FEATURE_IOVECTOR)
int i;
#endif
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
}
#if defined(LIBISCSI_FEATURE_IOVECTOR)
if (iscsilun->lbprz && nb_sectors >= ISCSI_CHECKALLOC_THRES &&
!iscsi_allocationmap_is_allocated(iscsilun, sector_num, nb_sectors)) {
int64_t ret;
@@ -524,42 +544,28 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
return 0;
}
}
#endif
lba = sector_qemu2lun(sector_num, iscsilun);
num_sectors = sector_qemu2lun(nb_sectors, iscsilun);
iscsi_co_init_iscsitask(iscsilun, &iTask);
retry:
switch (iscsilun->type) {
case TYPE_DISK:
if (iscsilun->use_16_for_rw) {
iTask.task = iscsi_read16_task(iscsilun->iscsi, iscsilun->lun, lba,
num_sectors * iscsilun->block_size,
iscsilun->block_size, 0, 0, 0, 0, 0,
iscsi_co_generic_cb, &iTask);
break;
default:
} else {
iTask.task = iscsi_read10_task(iscsilun->iscsi, iscsilun->lun, lba,
num_sectors * iscsilun->block_size,
iscsilun->block_size,
#if !defined(CONFIG_LIBISCSI_1_4) /* API change from 1.4.0 to 1.5.0 */
0, 0, 0, 0, 0,
#endif
iscsi_co_generic_cb, &iTask);
break;
}
if (iTask.task == NULL) {
return -ENOMEM;
}
#if defined(LIBISCSI_FEATURE_IOVECTOR)
scsi_task_set_iov_in(iTask.task, (struct scsi_iovec *) iov->iov, iov->niov);
#else
for (i = 0; i < iov->niov; i++) {
scsi_task_add_data_in_buffer(iTask.task,
iov->iov[i].iov_len,
iov->iov[i].iov_base);
}
#endif
while (!iTask.complete) {
iscsi_set_events(iscsilun);
@@ -714,18 +720,9 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
data.data = acb->ioh->dxferp;
data.size = acb->ioh->dxfer_len;
} else {
#if defined(LIBISCSI_FEATURE_IOVECTOR)
scsi_task_set_iov_out(acb->task,
(struct scsi_iovec *) acb->ioh->dxferp,
acb->ioh->iovec_count);
#else
struct iovec *iov = (struct iovec *)acb->ioh->dxferp;
acb->buf = g_malloc(acb->ioh->dxfer_len);
data.data = acb->buf;
data.size = iov_to_buf(iov, acb->ioh->iovec_count, 0,
acb->buf, acb->ioh->dxfer_len);
#endif
}
}
@@ -745,20 +742,9 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
acb->ioh->dxfer_len,
acb->ioh->dxferp);
} else {
#if defined(LIBISCSI_FEATURE_IOVECTOR)
scsi_task_set_iov_in(acb->task,
(struct scsi_iovec *) acb->ioh->dxferp,
acb->ioh->iovec_count);
#else
int i;
for (i = 0; i < acb->ioh->iovec_count; i++) {
struct iovec *iov = (struct iovec *)acb->ioh->dxferp;
scsi_task_add_data_in_buffer(acb->task,
iov[i].iov_len,
iov[i].iov_base);
}
#endif
}
}
@@ -767,7 +753,6 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
return &acb->common;
}
static void ioctl_cb(void *opaque, int status)
{
int *p_status = opaque;
@@ -791,7 +776,7 @@ static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
iscsi_aio_ioctl(bs, req, buf, ioctl_cb, &status);
while (status == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(iscsilun->aio_context, true);
}
return 0;
@@ -872,8 +857,6 @@ retry:
return 0;
}
#if defined(SCSI_SENSE_ASCQ_CAPACITY_DATA_HAS_CHANGED)
static int
coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
@@ -882,19 +865,27 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
struct IscsiTask iTask;
uint64_t lba;
uint32_t nb_blocks;
bool use_16_for_ws = iscsilun->use_16_for_rw;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
}
if ((flags & BDRV_REQ_MAY_UNMAP) && !iscsilun->lbp.lbpws) {
/* WRITE SAME with UNMAP is not supported by the target,
* fall back and try WRITE SAME without UNMAP */
flags &= ~BDRV_REQ_MAY_UNMAP;
if (flags & BDRV_REQ_MAY_UNMAP) {
if (!use_16_for_ws && !iscsilun->lbp.lbpws10) {
/* WRITESAME10 with UNMAP is unsupported try WRITESAME16 */
use_16_for_ws = true;
}
if (use_16_for_ws && !iscsilun->lbp.lbpws) {
/* WRITESAME16 with UNMAP is not supported by the target,
* fall back and try WRITESAME10/16 without UNMAP */
flags &= ~BDRV_REQ_MAY_UNMAP;
use_16_for_ws = iscsilun->use_16_for_rw;
}
}
if (!(flags & BDRV_REQ_MAY_UNMAP) && !iscsilun->has_write_same) {
/* WRITE SAME without UNMAP is not supported by the target */
/* WRITESAME without UNMAP is not supported by the target */
return -ENOTSUP;
}
@@ -902,15 +893,26 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
if (iscsilun->zeroblock == NULL) {
iscsilun->zeroblock = g_malloc0(iscsilun->block_size);
iscsilun->zeroblock = g_try_malloc0(iscsilun->block_size);
if (iscsilun->zeroblock == NULL) {
return -ENOMEM;
}
}
iscsi_co_init_iscsitask(iscsilun, &iTask);
retry:
if (iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, lba,
iscsilun->zeroblock, iscsilun->block_size,
nb_blocks, 0, !!(flags & BDRV_REQ_MAY_UNMAP),
0, 0, iscsi_co_generic_cb, &iTask) == NULL) {
if (use_16_for_ws) {
iTask.task = iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, lba,
iscsilun->zeroblock, iscsilun->block_size,
nb_blocks, 0, !!(flags & BDRV_REQ_MAY_UNMAP),
0, 0, iscsi_co_generic_cb, &iTask);
} else {
iTask.task = iscsi_writesame10_task(iscsilun->iscsi, iscsilun->lun, lba,
iscsilun->zeroblock, iscsilun->block_size,
nb_blocks, 0, !!(flags & BDRV_REQ_MAY_UNMAP),
0, 0, iscsi_co_generic_cb, &iTask);
}
if (iTask.task == NULL) {
return -ENOMEM;
}
@@ -952,8 +954,6 @@ retry:
return 0;
}
#endif /* SCSI_SENSE_ASCQ_CAPACITY_DATA_HAS_CHANGED */
static void parse_chap(struct iscsi_context *iscsi, const char *target,
Error **errp)
{
@@ -1063,7 +1063,6 @@ static char *parse_initiator_name(const char *target)
return iscsi_name;
}
#if defined(LIBISCSI_FEATURE_NOP_COUNTER)
static void iscsi_nop_timed_event(void *opaque)
{
IscsiLun *iscsilun = opaque;
@@ -1081,7 +1080,6 @@ static void iscsi_nop_timed_event(void *opaque)
timer_mod(iscsilun->nop_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
iscsi_set_events(iscsilun);
}
#endif
static void iscsi_readcapacity_sync(IscsiLun *iscsilun, Error **errp)
{
@@ -1108,6 +1106,7 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, Error **errp)
iscsilun->num_blocks = rc16->returned_lba + 1;
iscsilun->lbpme = rc16->lbpme;
iscsilun->lbprz = rc16->lbprz;
iscsilun->use_16_for_rw = (rc16->returned_lba > 0xffffffff);
}
}
break;
@@ -1195,6 +1194,38 @@ fail_with_err:
return NULL;
}
static void iscsi_detach_aio_context(BlockDriverState *bs)
{
IscsiLun *iscsilun = bs->opaque;
aio_set_fd_handler(iscsilun->aio_context,
iscsi_get_fd(iscsilun->iscsi),
NULL, NULL, NULL);
iscsilun->events = 0;
if (iscsilun->nop_timer) {
timer_del(iscsilun->nop_timer);
timer_free(iscsilun->nop_timer);
iscsilun->nop_timer = NULL;
}
}
static void iscsi_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
IscsiLun *iscsilun = bs->opaque;
iscsilun->aio_context = new_context;
iscsi_set_events(iscsilun);
/* Set up a timer for sending out iSCSI NOPs */
iscsilun->nop_timer = aio_timer_new(iscsilun->aio_context,
QEMU_CLOCK_REALTIME, SCALE_MS,
iscsi_nop_timed_event, iscsilun);
timer_mod(iscsilun->nop_timer,
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
}
/*
* We support iscsi url's on the form
* iscsi://[<username>%<password>@]<host>[:<port>]/<targetname>/<lun>
@@ -1301,6 +1332,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
}
iscsilun->iscsi = iscsi;
iscsilun->aio_context = bdrv_get_aio_context(bs);
iscsilun->lun = iscsi_url->lun;
iscsilun->has_write_same = true;
@@ -1374,11 +1406,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
scsi_free_scsi_task(task);
task = NULL;
#if defined(LIBISCSI_FEATURE_NOP_COUNTER)
/* Set up a timer for sending out iSCSI NOPs */
iscsilun->nop_timer = timer_new_ms(QEMU_CLOCK_REALTIME, iscsi_nop_timed_event, iscsilun);
timer_mod(iscsilun->nop_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
#endif
iscsi_attach_aio_context(bs, iscsilun->aio_context);
/* Guess the internal cluster (page) size of the iscsi target by the means
* of opt_unmap_gran. Transfer the unmap granularity only if it has a
@@ -1387,20 +1415,16 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
iscsilun->bl.opt_unmap_gran * iscsilun->block_size <= 16 * 1024 * 1024) {
iscsilun->cluster_sectors = (iscsilun->bl.opt_unmap_gran *
iscsilun->block_size) >> BDRV_SECTOR_BITS;
#if defined(LIBISCSI_FEATURE_IOVECTOR)
if (iscsilun->lbprz && !(bs->open_flags & BDRV_O_NOCACHE)) {
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(bs->total_sectors,
iscsilun->cluster_sectors));
}
#endif
}
out:
qemu_opts_del(opts);
if (initiator_name != NULL) {
g_free(initiator_name);
}
g_free(initiator_name);
if (iscsi_url != NULL) {
iscsi_destroy_url(iscsi_url);
}
@@ -1422,18 +1446,14 @@ static void iscsi_close(BlockDriverState *bs)
IscsiLun *iscsilun = bs->opaque;
struct iscsi_context *iscsi = iscsilun->iscsi;
if (iscsilun->nop_timer) {
timer_del(iscsilun->nop_timer);
timer_free(iscsilun->nop_timer);
}
qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL);
iscsi_detach_aio_context(bs);
iscsi_destroy_context(iscsi);
g_free(iscsilun->zeroblock);
g_free(iscsilun->allocationmap);
memset(iscsilun, 0, sizeof(IscsiLun));
}
static int iscsi_refresh_limits(BlockDriverState *bs)
static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
@@ -1458,7 +1478,6 @@ static int iscsi_refresh_limits(BlockDriverState *bs)
}
bs->bl.opt_transfer_length = sector_lun2qemu(iscsilun->bl.opt_xfer_len,
iscsilun);
return 0;
}
/* Since iscsi_open() ignores bdrv_flags, there is nothing to do here in
@@ -1493,15 +1512,15 @@ static int iscsi_truncate(BlockDriverState *bs, int64_t offset)
if (iscsilun->allocationmap != NULL) {
g_free(iscsilun->allocationmap);
iscsilun->allocationmap =
bitmap_new(DIV_ROUND_UP(bs->total_sectors,
bitmap_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
iscsilun),
iscsilun->cluster_sectors));
}
return 0;
}
static int iscsi_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
@@ -1512,14 +1531,9 @@ static int iscsi_create(const char *filename, QEMUOptionParameter *options,
bs = bdrv_new("", &error_abort);
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, "size")) {
total_size = options->value.n / BDRV_SECTOR_SIZE;
}
options++;
}
bs->opaque = g_malloc0(sizeof(struct IscsiLun));
total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
bs->opaque = g_new0(struct IscsiLun, 1);
iscsilun = bs->opaque;
bs_options = qdict_new();
@@ -1530,10 +1544,7 @@ static int iscsi_create(const char *filename, QEMUOptionParameter *options,
if (ret != 0) {
goto out;
}
if (iscsilun->nop_timer) {
timer_del(iscsilun->nop_timer);
timer_free(iscsilun->nop_timer);
}
iscsi_detach_aio_context(bs);
if (iscsilun->type != TYPE_DISK) {
ret = -ENODEV;
goto out;
@@ -1563,13 +1574,17 @@ static int iscsi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static QEMUOptionParameter iscsi_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
static QemuOptsList iscsi_create_opts = {
.name = "iscsi-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(iscsi_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_iscsi = {
@@ -1581,7 +1596,7 @@ static BlockDriver bdrv_iscsi = {
.bdrv_file_open = iscsi_open,
.bdrv_close = iscsi_close,
.bdrv_create = iscsi_create,
.create_options = iscsi_create_options,
.create_opts = &iscsi_create_opts,
.bdrv_reopen_prepare = iscsi_reopen_prepare,
.bdrv_getlength = iscsi_getlength,
@@ -1589,13 +1604,9 @@ static BlockDriver bdrv_iscsi = {
.bdrv_truncate = iscsi_truncate,
.bdrv_refresh_limits = iscsi_refresh_limits,
#if defined(LIBISCSI_FEATURE_IOVECTOR)
.bdrv_co_get_block_status = iscsi_co_get_block_status,
#endif
.bdrv_co_discard = iscsi_co_discard,
#if defined(SCSI_SENSE_ASCQ_CAPACITY_DATA_HAS_CHANGED)
.bdrv_co_write_zeroes = iscsi_co_write_zeroes,
#endif
.bdrv_co_readv = iscsi_co_readv,
.bdrv_co_writev = iscsi_co_writev,
.bdrv_co_flush_to_disk = iscsi_co_flush,
@@ -1604,6 +1615,9 @@ static BlockDriver bdrv_iscsi = {
.bdrv_ioctl = iscsi_ioctl,
.bdrv_aio_ioctl = iscsi_aio_ioctl,
#endif
.bdrv_detach_aio_context = iscsi_detach_aio_context,
.bdrv_attach_aio_context = iscsi_attach_aio_context,
};
static QemuOptsList qemu_iscsi_opts = {

View File

@@ -25,6 +25,8 @@
*/
#define MAX_EVENTS 128
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockDriverAIOCB common;
struct qemu_laio_state *ctx;
@@ -36,9 +38,25 @@ struct qemu_laiocb {
QLIST_ENTRY(qemu_laiocb) node;
};
typedef struct {
struct iocb *iocbs[MAX_QUEUED_IO];
int plugged;
unsigned int size;
unsigned int idx;
} LaioQueue;
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
};
static inline ssize_t io_event_ret(struct io_event *ev)
@@ -74,27 +92,58 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
qemu_aio_release(laiocb);
}
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
}
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
}
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
while (event_notifier_test_and_clear(&s->e)) {
struct io_event events[MAX_EVENTS];
struct timespec ts = { 0 };
int nevents, i;
do {
nevents = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS, events, &ts);
} while (nevents == -EINTR);
for (i = 0; i < nevents; i++) {
struct iocb *iocb = events[i].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&events[i]);
qemu_laio_process_completion(s, laiocb);
}
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
}
}
@@ -135,6 +184,79 @@ static const AIOCBInfo laio_aiocb_info = {
.cancel = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
{
io_q->size = MAX_QUEUED_IO;
io_q->idx = 0;
io_q->plugged = 0;
}
static int ioq_submit(struct qemu_laio_state *s)
{
int ret, i = 0;
int len = s->io_q.idx;
do {
ret = io_submit(s->ctx, len, s->io_q.iocbs);
} while (i++ < 3 && ret == -EAGAIN);
/* empty io queue */
s->io_q.idx = 0;
if (ret < 0) {
i = 0;
} else {
i = ret;
}
for (; i < len; i++) {
struct qemu_laiocb *laiocb =
container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
laiocb->ret = (ret < 0) ? ret : -EIO;
qemu_laio_process_completion(s, laiocb);
}
return ret;
}
static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
{
unsigned int idx = s->io_q.idx;
s->io_q.iocbs[idx++] = iocb;
s->io_q.idx = idx;
/* submit immediately if queue is full */
if (idx == s->io_q.size) {
ioq_submit(s);
}
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
}
int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
struct qemu_laio_state *s = aio_ctx;
int ret = 0;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return 0;
}
if (s->io_q.idx > 0) {
ret = ioq_submit(s);
}
return ret;
}
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type)
@@ -168,8 +290,13 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
if (io_submit(s->ctx, 1, &iocbs) < 0)
goto out_free_aiocb;
if (!s->io_q.plugged) {
if (io_submit(s->ctx, 1, &iocbs) < 0) {
goto out_free_aiocb;
}
} else {
ioq_enqueue(s, iocbs);
}
return &laiocb->common;
out_free_aiocb:
@@ -177,6 +304,22 @@ out_free_aiocb:
return NULL;
}
void laio_detach_aio_context(void *s_, AioContext *old_context)
{
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
}
void *laio_init(void)
{
struct qemu_laio_state *s;
@@ -190,7 +333,7 @@ void *laio_init(void)
goto out_close_efd;
}
qemu_aio_set_event_notifier(&s->e, qemu_laio_completion_cb);
ioq_init(&s->io_q);
return s;
@@ -200,3 +343,16 @@ out_free_state:
g_free(s);
return NULL;
}
void laio_cleanup(void *s_)
{
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {
fprintf(stderr, "%s: destroy AIO context %p failed\n",
__func__, &s->ctx);
}
g_free(s);
}

View File

@@ -32,6 +32,12 @@ typedef struct MirrorBlockJob {
RateLimit limit;
BlockDriverState *target;
BlockDriverState *base;
/* The name of the graph node to replace */
char *replaces;
/* The BDS to replace */
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
bool is_none_mode;
BlockdevOnError on_source_error, on_target_error;
bool synced;
@@ -118,7 +124,7 @@ static void mirror_write_complete(void *opaque, int ret)
bdrv_set_dirty(source, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, false, -ret);
if (action == BDRV_ACTION_REPORT && s->ret >= 0) {
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
}
@@ -135,7 +141,7 @@ static void mirror_read_complete(void *opaque, int ret)
bdrv_set_dirty(source, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, true, -ret);
if (action == BDRV_ACTION_REPORT && s->ret >= 0) {
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
@@ -151,7 +157,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns;
uint64_t delay_ns = 0;
MirrorOp *op;
s->sector_num = hbitmap_iter_next(&s->hbi);
@@ -241,8 +247,6 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
} else {
delay_ns = 0;
}
} while (delay_ns == 0 && next_sector < end);
@@ -259,9 +263,11 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_sector = sector_num;
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
qemu_iovec_add(&op->qiov, buf, s->granularity);
qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
/* Advance the HBitmapIter in parallel, so that we do not examine
* the same sector twice.
@@ -324,9 +330,18 @@ static void coroutine_fn mirror_run(void *opaque)
}
s->common.len = bdrv_getlength(bs);
if (s->common.len <= 0) {
if (s->common.len < 0) {
ret = s->common.len;
goto immediate_exit;
} else if (s->common.len == 0) {
/* Report BLOCK_JOB_READY and wait for complete. */
block_job_event_ready(&s->common);
s->synced = true;
while (!block_job_is_cancelled(&s->common) && !s->should_complete) {
block_job_yield(&s->common);
}
s->common.cancelled = false;
goto immediate_exit;
}
length = DIV_ROUND_UP(s->common.len, s->granularity);
@@ -350,7 +365,12 @@ static void coroutine_fn mirror_run(void *opaque)
}
end = s->common.len >> BDRV_SECTOR_BITS;
s->buf = qemu_blockalign(bs, s->buf_size);
s->buf = qemu_try_blockalign(bs, s->buf_size);
if (s->buf == NULL) {
ret = -ENOMEM;
goto immediate_exit;
}
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
mirror_free_init(s);
@@ -415,7 +435,8 @@ static void coroutine_fn mirror_run(void *opaque)
trace_mirror_before_flush(s);
ret = bdrv_flush(s->target);
if (ret < 0) {
if (mirror_error_action(s, false, -ret) == BDRV_ACTION_REPORT) {
if (mirror_error_action(s, false, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
goto immediate_exit;
}
} else {
@@ -426,7 +447,7 @@ static void coroutine_fn mirror_run(void *opaque)
*/
s->common.offset = end * BDRV_SECTOR_SIZE;
if (!s->synced) {
block_job_ready(&s->common);
block_job_event_ready(&s->common);
s->synced = true;
}
@@ -490,10 +511,14 @@ immediate_exit:
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
bdrv_iostatus_disable(s->target);
if (s->should_complete && ret == 0) {
if (bdrv_get_flags(s->target) != bdrv_get_flags(s->common.bs)) {
bdrv_reopen(s->target, bdrv_get_flags(s->common.bs), NULL);
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
bdrv_swap(s->target, s->common.bs);
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
@@ -502,6 +527,12 @@ immediate_exit:
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, ret);
}
@@ -540,6 +571,20 @@ static void mirror_complete(BlockJob *job, Error **errp)
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->replaces) {
s->to_replace = check_to_replace_node(s->replaces, &local_err);
if (!s->to_replace) {
error_propagate(errp, local_err);
return;
}
error_setg(&s->replace_blocker,
"block device is in use by block-job-complete");
bdrv_op_block_all(s->to_replace, s->replace_blocker);
bdrv_ref(s->to_replace);
}
s->should_complete = true;
block_job_resume(job);
}
@@ -562,14 +607,15 @@ static const BlockJobDriver commit_active_job_driver = {
};
static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, int64_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
const char *replaces,
int64_t speed, int64_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
{
MirrorBlockJob *s;
@@ -600,6 +646,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
return;
}
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
s->target = target;
@@ -621,6 +668,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
}
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, int64_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
@@ -632,7 +680,8 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
mirror_start_job(bs, target, speed, granularity, buf_size,
mirror_start_job(bs, target, replaces,
speed, granularity, buf_size,
on_source_error, on_target_error, cb, opaque, errp,
&mirror_job_driver, is_none_mode, base);
}
@@ -680,7 +729,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
}
bdrv_ref(base);
mirror_start_job(bs, base, speed, 0, 0,
mirror_start_job(bs, base, NULL, speed, 0, 0,
on_error, on_error, cb, opaque, &local_err,
&commit_active_job_driver, false, base);
if (local_err) {

View File

@@ -49,7 +49,7 @@ static void nbd_teardown_connection(NbdClientSession *client)
shutdown(client->sock, 2);
nbd_recv_coroutines_enter_all(client);
qemu_aio_set_fd_handler(client->sock, NULL, NULL, NULL);
nbd_client_session_detach_aio_context(client);
closesocket(client->sock);
client->sock = -1;
}
@@ -103,11 +103,14 @@ static int nbd_co_send_request(NbdClientSession *s,
struct nbd_request *request,
QEMUIOVector *qiov, int offset)
{
AioContext *aio_context;
int rc, ret;
qemu_co_mutex_lock(&s->send_mutex);
s->send_coroutine = qemu_coroutine_self();
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write, s);
aio_context = bdrv_get_aio_context(s->bs);
aio_set_fd_handler(aio_context, s->sock,
nbd_reply_ready, nbd_restart_write, s);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
@@ -126,7 +129,7 @@ static int nbd_co_send_request(NbdClientSession *s,
} else {
rc = nbd_send_request(s->sock, request);
}
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL, s);
aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, s);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
@@ -335,6 +338,19 @@ int nbd_client_session_co_discard(NbdClientSession *client, int64_t sector_num,
}
void nbd_client_session_detach_aio_context(NbdClientSession *client)
{
aio_set_fd_handler(bdrv_get_aio_context(client->bs), client->sock,
NULL, NULL, NULL);
}
void nbd_client_session_attach_aio_context(NbdClientSession *client,
AioContext *new_context)
{
aio_set_fd_handler(new_context, client->sock,
nbd_reply_ready, NULL, client);
}
void nbd_client_session_close(NbdClientSession *client)
{
struct nbd_request request = {
@@ -381,7 +397,7 @@ int nbd_client_session_init(NbdClientSession *client, BlockDriverState *bs,
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL, client);
nbd_client_session_attach_aio_context(client, bdrv_get_aio_context(bs));
logout("Established connection with NBD server\n");
return 0;

View File

@@ -47,4 +47,8 @@ int nbd_client_session_co_writev(NbdClientSession *client, int64_t sector_num,
int nbd_client_session_co_readv(NbdClientSession *client, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
void nbd_client_session_detach_aio_context(NbdClientSession *client);
void nbd_client_session_attach_aio_context(NbdClientSession *client,
AioContext *new_context);
#endif /* NBD_CLIENT_H */

View File

@@ -31,8 +31,10 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include <sys/types.h>
#include <unistd.h>
@@ -323,46 +325,101 @@ static int64_t nbd_getlength(BlockDriverState *bs)
return s->client.size;
}
static void nbd_detach_aio_context(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
nbd_client_session_detach_aio_context(&s->client);
}
static void nbd_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVNBDState *s = bs->opaque;
nbd_client_session_attach_aio_context(&s->client, new_context);
}
static void nbd_refresh_filename(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
QDict *opts = qdict_new();
const char *path = qemu_opt_get(s->socket_opts, "path");
const char *host = qemu_opt_get(s->socket_opts, "host");
const char *port = qemu_opt_get(s->socket_opts, "port");
const char *export = qemu_opt_get(s->socket_opts, "export");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
if (path) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix:%s", path);
qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(path)));
} else if (export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd:%s:%s/%s", host, port, export);
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
} else {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd:%s:%s", host, port);
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_nbd = {
.format_name = "nbd",
.protocol_name = "nbd",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
.format_name = "nbd",
.protocol_name = "nbd",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static BlockDriver bdrv_nbd_tcp = {
.format_name = "nbd",
.protocol_name = "nbd+tcp",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
.format_name = "nbd",
.protocol_name = "nbd+tcp",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static BlockDriver bdrv_nbd_unix = {
.format_name = "nbd",
.protocol_name = "nbd+unix",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
.format_name = "nbd",
.protocol_name = "nbd+unix",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
};
static void bdrv_nbd_init(void)

View File

@@ -40,6 +40,7 @@ typedef struct NFSClient {
struct nfsfh *fh;
int events;
bool has_zero_init;
AioContext *aio_context;
} NFSClient;
typedef struct NFSRPC {
@@ -49,6 +50,7 @@ typedef struct NFSRPC {
struct stat *st;
Coroutine *co;
QEMUBH *bh;
NFSClient *client;
} NFSRPC;
static void nfs_process_read(void *arg);
@@ -58,10 +60,11 @@ static void nfs_set_events(NFSClient *client)
{
int ev = nfs_which_events(client->context);
if (ev != client->events) {
qemu_aio_set_fd_handler(nfs_get_fd(client->context),
(ev & POLLIN) ? nfs_process_read : NULL,
(ev & POLLOUT) ? nfs_process_write : NULL,
client);
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
(ev & POLLIN) ? nfs_process_read : NULL,
(ev & POLLOUT) ? nfs_process_write : NULL,
client);
}
client->events = ev;
@@ -84,13 +87,15 @@ static void nfs_process_write(void *arg)
static void nfs_co_init_task(NFSClient *client, NFSRPC *task)
{
*task = (NFSRPC) {
.co = qemu_coroutine_self(),
.co = qemu_coroutine_self(),
.client = client,
};
}
static void nfs_co_generic_bh_cb(void *opaque)
{
NFSRPC *task = opaque;
task->complete = 1;
qemu_bh_delete(task->bh);
qemu_coroutine_enter(task->co, NULL);
}
@@ -100,7 +105,6 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
void *private_data)
{
NFSRPC *task = private_data;
task->complete = 1;
task->ret = ret;
if (task->ret > 0 && task->iov) {
if (task->ret <= task->iov->size) {
@@ -116,8 +120,11 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
error_report("NFS Error: %s", nfs_get_error(nfs));
}
if (task->co) {
task->bh = qemu_bh_new(nfs_co_generic_bh_cb, task);
task->bh = aio_bh_new(task->client->aio_context,
nfs_co_generic_bh_cb, task);
qemu_bh_schedule(task->bh);
} else {
task->complete = 1;
}
}
@@ -165,7 +172,11 @@ static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
nfs_co_init_task(client, &task);
buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE);
buf = g_try_malloc(nb_sectors * BDRV_SECTOR_SIZE);
if (nb_sectors && buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
if (nfs_pwrite_async(client->context, client->fh,
@@ -224,13 +235,34 @@ static QemuOptsList runtime_opts = {
},
};
static void nfs_detach_aio_context(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
NULL, NULL, NULL);
client->events = 0;
}
static void nfs_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
NFSClient *client = bs->opaque;
client->aio_context = new_context;
nfs_set_events(client);
}
static void nfs_client_close(NFSClient *client)
{
if (client->context) {
if (client->fh) {
nfs_close(client->context, client->fh);
}
qemu_aio_set_fd_handler(nfs_get_fd(client->context), NULL, NULL, NULL);
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
NULL, NULL, NULL);
nfs_destroy_context(client->context);
}
memset(client, 0, sizeof(NFSClient));
@@ -276,17 +308,27 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
qp = query_params_parse(uri->query);
for (i = 0; i < qp->n; i++) {
unsigned long long val;
if (!qp->p[i].value) {
error_setg(errp, "Value for NFS parameter expected: %s",
qp->p[i].name);
goto fail;
}
if (!strncmp(qp->p[i].name, "uid", 3)) {
nfs_set_uid(client->context, atoi(qp->p[i].value));
} else if (!strncmp(qp->p[i].name, "gid", 3)) {
nfs_set_gid(client->context, atoi(qp->p[i].value));
} else if (!strncmp(qp->p[i].name, "tcp-syncnt", 10)) {
nfs_set_tcp_syncnt(client->context, atoi(qp->p[i].value));
if (parse_uint_full(qp->p[i].value, &val, 0)) {
error_setg(errp, "Illegal value for NFS parameter: %s",
qp->p[i].name);
goto fail;
}
if (!strcmp(qp->p[i].name, "uid")) {
nfs_set_uid(client->context, val);
} else if (!strcmp(qp->p[i].name, "gid")) {
nfs_set_gid(client->context, val);
} else if (!strcmp(qp->p[i].name, "tcp-syncnt")) {
nfs_set_tcp_syncnt(client->context, val);
#ifdef LIBNFS_FEATURE_READAHEAD
} else if (!strcmp(qp->p[i].name, "readahead")) {
nfs_set_readahead(client->context, val);
#endif
} else {
error_setg(errp, "Unknown NFS parameter name: %s",
qp->p[i].name);
@@ -345,36 +387,39 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
QemuOpts *opts;
Error *local_err = NULL;
client->aio_context = bdrv_get_aio_context(bs);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
ret = -EINVAL;
goto out;
}
ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp);
if (ret < 0) {
return ret;
goto out;
}
bs->total_sectors = ret;
return 0;
ret = 0;
out:
qemu_opts_del(opts);
return ret;
}
static int nfs_file_create(const char *url, QEMUOptionParameter *options,
Error **errp)
static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
NFSClient *client = g_malloc0(sizeof(NFSClient));
NFSClient *client = g_new0(NFSClient, 1);
client->aio_context = qemu_get_aio_context();
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, "size")) {
total_size = options->value.n;
}
options++;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
ret = nfs_client_open(client, url, O_CREAT, errp);
if (ret < 0) {
@@ -407,7 +452,7 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
while (!task.complete) {
nfs_set_events(client);
qemu_aio_wait();
aio_poll(client->aio_context, true);
}
return (task.ret < 0 ? task.ret : st.st_blocks * st.st_blksize);
@@ -420,22 +465,25 @@ static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
}
static BlockDriver bdrv_nfs = {
.format_name = "nfs",
.protocol_name = "nfs",
.format_name = "nfs",
.protocol_name = "nfs",
.instance_size = sizeof(NFSClient),
.bdrv_needs_filename = true,
.bdrv_has_zero_init = nfs_has_zero_init,
.bdrv_get_allocated_file_size = nfs_get_allocated_file_size,
.bdrv_truncate = nfs_file_truncate,
.instance_size = sizeof(NFSClient),
.bdrv_needs_filename = true,
.bdrv_has_zero_init = nfs_has_zero_init,
.bdrv_get_allocated_file_size = nfs_get_allocated_file_size,
.bdrv_truncate = nfs_file_truncate,
.bdrv_file_open = nfs_file_open,
.bdrv_close = nfs_file_close,
.bdrv_create = nfs_file_create,
.bdrv_file_open = nfs_file_open,
.bdrv_close = nfs_file_close,
.bdrv_create = nfs_file_create,
.bdrv_co_readv = nfs_co_readv,
.bdrv_co_writev = nfs_co_writev,
.bdrv_co_flush_to_disk = nfs_co_flush,
.bdrv_co_readv = nfs_co_readv,
.bdrv_co_writev = nfs_co_writev,
.bdrv_co_flush_to_disk = nfs_co_flush,
.bdrv_detach_aio_context = nfs_detach_aio_context,
.bdrv_attach_aio_context = nfs_attach_aio_context,
};
static void nfs_block_init(void)

View File

@@ -30,6 +30,7 @@
/**************************************************************/
#define HEADER_MAGIC "WithoutFreeSpace"
#define HEADER_MAGIC2 "WithouFreSpacExt"
#define HEADER_VERSION 2
#define HEADER_SIZE 64
@@ -41,8 +42,10 @@ struct parallels_header {
uint32_t cylinders;
uint32_t tracks;
uint32_t catalog_entries;
uint32_t nb_sectors;
char padding[24];
uint64_t nb_sectors;
uint32_t inuse;
uint32_t data_off;
char padding[12];
} QEMU_PACKED;
typedef struct BDRVParallelsState {
@@ -52,6 +55,8 @@ typedef struct BDRVParallelsState {
unsigned int catalog_size;
unsigned int tracks;
unsigned int off_multiplier;
} BDRVParallelsState;
static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -59,11 +64,12 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam
const struct parallels_header *ph = (const void *)buf;
if (buf_size < HEADER_SIZE)
return 0;
return 0;
if (!memcmp(ph->magic, HEADER_MAGIC, 16) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
if ((!memcmp(ph->magic, HEADER_MAGIC, 16) ||
!memcmp(ph->magic, HEADER_MAGIC2, 16)) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
return 0;
}
@@ -83,14 +89,19 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (memcmp(ph.magic, HEADER_MAGIC, 16) ||
(le32_to_cpu(ph.version) != HEADER_VERSION)) {
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
goto fail;
}
bs->total_sectors = le64_to_cpu(ph.nb_sectors);
bs->total_sectors = le32_to_cpu(ph.nb_sectors);
if (le32_to_cpu(ph.version) != HEADER_VERSION) {
goto fail_format;
}
if (!memcmp(ph.magic, HEADER_MAGIC, 16)) {
s->off_multiplier = 1;
bs->total_sectors = 0xffffffff & bs->total_sectors;
} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) {
s->off_multiplier = le32_to_cpu(ph.tracks);
} else {
goto fail_format;
}
s->tracks = le32_to_cpu(ph.tracks);
if (s->tracks == 0) {
@@ -98,6 +109,11 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EINVAL;
goto fail;
}
if (s->tracks > INT32_MAX/513) {
error_setg(errp, "Invalid image: Too big cluster");
ret = -EFBIG;
goto fail;
}
s->catalog_size = le32_to_cpu(ph.catalog_entries);
if (s->catalog_size > INT_MAX / 4) {
@@ -105,7 +121,11 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EFBIG;
goto fail;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);
if (ret < 0) {
@@ -113,11 +133,14 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
}
for (i = 0; i < s->catalog_size; i++)
le32_to_cpus(&s->catalog_bitmap[i]);
le32_to_cpus(&s->catalog_bitmap[i]);
qemu_co_mutex_init(&s->lock);
return 0;
fail_format:
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
fail:
g_free(s->catalog_bitmap);
return ret;
@@ -133,8 +156,9 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
/* not allocated */
if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0))
return -1;
return (uint64_t)(s->catalog_bitmap[index] + offset) * 512;
return -1;
return
((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) * 512;
}
static int parallels_read(BlockDriverState *bs, int64_t sector_num,

View File

@@ -28,6 +28,13 @@
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/qmp/types.h"
#ifdef __linux__
#include <linux/fs.h>
#include <sys/ioctl.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
{
@@ -165,19 +172,28 @@ void bdrv_query_image_info(BlockDriverState *bs,
ImageInfo **p_info,
Error **errp)
{
uint64_t total_sectors;
int64_t size;
const char *backing_filename;
char backing_filename2[1024];
BlockDriverInfo bdi;
int ret;
Error *err = NULL;
ImageInfo *info = g_new0(ImageInfo, 1);
ImageInfo *info;
#ifdef __linux__
int fd, attr;
#endif
bdrv_get_geometry(bs, &total_sectors);
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Can't get size of device '%s'",
bdrv_get_device_name(bs));
return;
}
info = g_new0(ImageInfo, 1);
info->filename = g_strdup(bs->filename);
info->format = g_strdup(bdrv_get_format_name(bs));
info->virtual_size = total_sectors * 512;
info->virtual_size = size;
info->actual_size = bdrv_get_allocated_file_size(bs);
info->has_actual_size = info->actual_size >= 0;
if (bdrv_is_encrypted(bs)) {
@@ -195,6 +211,18 @@ void bdrv_query_image_info(BlockDriverState *bs,
info->format_specific = bdrv_get_specific_info(bs);
info->has_format_specific = info->format_specific != NULL;
#ifdef __linux__
/* get NOCOW info */
fd = qemu_open(bs->filename, O_RDONLY | O_NONBLOCK);
if (fd >= 0) {
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0 && (attr & FS_NOCOW_FL)) {
info->has_nocow = true;
info->nocow = true;
}
qemu_close(fd);
}
#endif
backing_filename = bs->backing_file;
if (backing_filename[0] != '\0') {
info->backing_filename = g_strdup(backing_filename);
@@ -293,7 +321,7 @@ void bdrv_query_info(BlockDriverState *bs,
qapi_free_BlockInfo(info);
}
BlockStats *bdrv_query_stats(const BlockDriverState *bs)
static BlockStats *bdrv_query_stats(const BlockDriverState *bs)
{
BlockStats *s;
@@ -305,15 +333,16 @@ BlockStats *bdrv_query_stats(const BlockDriverState *bs)
}
s->stats = g_malloc0(sizeof(*s->stats));
s->stats->rd_bytes = bs->nr_bytes[BDRV_ACCT_READ];
s->stats->wr_bytes = bs->nr_bytes[BDRV_ACCT_WRITE];
s->stats->rd_operations = bs->nr_ops[BDRV_ACCT_READ];
s->stats->wr_operations = bs->nr_ops[BDRV_ACCT_WRITE];
s->stats->wr_highest_offset = bs->wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->nr_ops[BDRV_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->total_time_ns[BDRV_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->total_time_ns[BDRV_ACCT_READ];
s->stats->flush_total_time_ns = bs->total_time_ns[BDRV_ACCT_FLUSH];
s->stats->rd_bytes = bs->stats.nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = bs->stats.nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = bs->stats.nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = bs->stats.nr_ops[BLOCK_ACCT_WRITE];
s->stats->wr_highest_offset =
bs->stats.wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->stats.nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_FLUSH];
if (bs->file) {
s->has_parent = true;
@@ -360,7 +389,11 @@ BlockStatsList *qmp_query_blockstats(Error **errp)
while ((bs = bdrv_next(bs))) {
BlockStatsList *info = g_malloc0(sizeof(*info));
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
info->value = bdrv_query_stats(bs);
aio_context_release(ctx);
*p_next = info;
p_next = &info->next;
@@ -621,4 +654,8 @@ void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,
func_fprintf(f, "Format specific information:\n");
bdrv_image_info_specific_dump(func_fprintf, f, info->format_specific);
}
if (info->has_nocow && info->nocow) {
func_fprintf(f, "NOCOW flag: set\n");
}
}

View File

@@ -182,7 +182,12 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
}
s->l1_table_offset = header.l1_table_offset;
s->l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
s->l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate memory for L1 table");
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
@@ -193,8 +198,16 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
for(i = 0;i < s->l1_size; i++) {
be64_to_cpus(&s->l1_table[i]);
}
/* alloc L2 cache */
s->l2_cache = g_malloc(s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
/* alloc L2 cache (max. 64k * 16 * 8 = 8 MB) */
s->l2_cache =
qemu_try_blockalign(bs->file,
s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
if (s->l2_cache == NULL) {
error_setg(errp, "Could not allocate L2 table cache");
ret = -ENOMEM;
goto fail;
}
s->cluster_cache = g_malloc(s->cluster_size);
s->cluster_data = g_malloc(s->cluster_size);
s->cluster_cache_offset = -1;
@@ -226,7 +239,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
fail:
g_free(s->l1_table);
g_free(s->l2_cache);
qemu_vfree(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
return ret;
@@ -517,7 +530,10 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
void *orig_buf;
if (qiov->niov > 1) {
buf = orig_buf = qemu_blockalign(bs, qiov->size);
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
} else {
orig_buf = NULL;
buf = (uint8_t *)qiov->iov->iov_base;
@@ -619,7 +635,10 @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
s->cluster_cache_offset = -1; /* disable compressed cache */
if (qiov->niov > 1) {
buf = orig_buf = qemu_blockalign(bs, qiov->size);
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
} else {
orig_buf = NULL;
@@ -685,7 +704,7 @@ static void qcow_close(BlockDriverState *bs)
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
g_free(s->l2_cache);
qemu_vfree(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
@@ -693,35 +712,30 @@ static void qcow_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
static int qcow_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
{
int header_size, backing_filename_len, l1_size, shift, i;
QCowHeader header;
uint8_t *tmp;
int64_t total_size = 0;
const char *backing_file = NULL;
char *backing_file = NULL;
int flags = 0;
Error *local_err = NULL;
int ret;
BlockDriverState *qcow_bs;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_ENCRYPT)) {
flags |= options->value.n ? BLOCK_FLAG_ENCRYPT : 0;
}
options++;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
flags |= BLOCK_FLAG_ENCRYPT;
}
ret = bdrv_create_file(filename, options, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
goto cleanup;
}
qcow_bs = NULL;
@@ -729,7 +743,7 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
goto cleanup;
}
ret = bdrv_truncate(qcow_bs, 0);
@@ -740,7 +754,7 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options,
memset(&header, 0, sizeof(header));
header.magic = cpu_to_be32(QCOW_MAGIC);
header.version = cpu_to_be32(QCOW_VERSION);
header.size = cpu_to_be64(total_size * 512);
header.size = cpu_to_be64(total_size);
header_size = sizeof(header);
backing_filename_len = 0;
if (backing_file) {
@@ -762,7 +776,7 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options,
}
header_size = (header_size + 7) & ~7;
shift = header.cluster_bits + header.l2_bits;
l1_size = ((total_size * 512) + (1LL << shift) - 1) >> shift;
l1_size = (total_size + (1LL << shift) - 1) >> shift;
header.l1_table_offset = cpu_to_be64(header_size);
if (flags & BLOCK_FLAG_ENCRYPT) {
@@ -800,6 +814,8 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options,
ret = 0;
exit:
bdrv_unref(qcow_bs);
cleanup:
g_free(backing_file);
return ret;
}
@@ -912,24 +928,28 @@ static int qcow_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static QEMUOptionParameter qcow_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_ENCRYPT,
.type = OPT_FLAG,
.help = "Encrypt the image"
},
{ NULL }
static QemuOptsList qcow_create_opts = {
.name = "qcow-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qcow_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_ENCRYPT,
.type = QEMU_OPT_BOOL,
.help = "Encrypt the image",
.def_value_str = "off"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_qcow = {
@@ -938,9 +958,10 @@ static BlockDriver bdrv_qcow = {
.bdrv_probe = qcow_probe,
.bdrv_open = qcow_open,
.bdrv_close = qcow_close,
.bdrv_reopen_prepare = qcow_reopen_prepare,
.bdrv_create = qcow_create,
.bdrv_reopen_prepare = qcow_reopen_prepare,
.bdrv_create = qcow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_co_readv = qcow_co_readv,
.bdrv_co_writev = qcow_co_writev,
@@ -951,7 +972,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_write_compressed = qcow_write_compressed,
.bdrv_get_info = qcow_get_info,
.create_options = qcow_create_options,
.create_opts = &qcow_create_opts,
};
static void bdrv_qcow_init(void)

View File

@@ -48,15 +48,31 @@ Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
Qcow2Cache *c;
int i;
c = g_malloc0(sizeof(*c));
c = g_new0(Qcow2Cache, 1);
c->size = num_tables;
c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
c->entries = g_try_new0(Qcow2CachedTable, num_tables);
if (!c->entries) {
goto fail;
}
for (i = 0; i < c->size; i++) {
c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
c->entries[i].table = qemu_try_blockalign(bs->file, s->cluster_size);
if (c->entries[i].table == NULL) {
goto fail;
}
}
return c;
fail:
if (c->entries) {
for (i = 0; i < c->size; i++) {
qemu_vfree(c->entries[i].table);
}
}
g_free(c->entries);
g_free(c);
return NULL;
}
int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)

View File

@@ -72,14 +72,20 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
#endif
new_l1_size2 = sizeof(uint64_t) * new_l1_size;
new_l1_table = g_malloc0(align_offset(new_l1_size2, 512));
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_size2, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
/* write new table (align to cluster) */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
if (new_l1_table_offset < 0) {
g_free(new_l1_table);
qemu_vfree(new_l1_table);
return new_l1_table_offset;
}
@@ -113,7 +119,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
if (ret < 0) {
goto fail;
}
g_free(s->l1_table);
qemu_vfree(s->l1_table);
old_l1_table_offset = s->l1_table_offset;
s->l1_table_offset = new_l1_table_offset;
s->l1_table = new_l1_table;
@@ -123,7 +129,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
QCOW2_DISCARD_OTHER);
return 0;
fail:
g_free(new_l1_table);
qemu_vfree(new_l1_table);
qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
QCOW2_DISCARD_OTHER);
return ret;
@@ -372,7 +378,10 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
iov.iov_len = n * BDRV_SECTOR_SIZE;
iov.iov_base = qemu_blockalign(bs, iov.iov_len);
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
}
qemu_iovec_init_external(&qiov, &iov, 1);
@@ -702,7 +711,11 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
trace_qcow2_cluster_link_l2(qemu_coroutine_self(), m->nb_clusters);
assert(m->nb_clusters > 0);
old_cluster = g_malloc(m->nb_clusters * sizeof(uint64_t));
old_cluster = g_try_new(uint64_t, m->nb_clusters);
if (old_cluster == NULL) {
ret = -ENOMEM;
goto err;
}
/* copy content of unmodified sectors */
ret = perform_cow(bs, m, &m->cow_start);
@@ -1106,6 +1119,17 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
return 0;
}
/* !*host_offset would overwrite the image header and is reserved for "no
* host offset preferred". If 0 was a valid host offset, it'd trigger the
* following overlap check; do that now to avoid having an invalid value in
* *host_offset. */
if (!alloc_cluster_offset) {
ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
nb_clusters * s->cluster_size);
assert(ret < 0);
goto fail;
}
/*
* Save info needed for meta data update.
*
@@ -1562,7 +1586,10 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
if (!is_active_l1) {
/* inactive L2 tables require a buffer to be stored in when loading
* them from disk */
l2_table = qemu_blockalign(bs, s->cluster_size);
l2_table = qemu_try_blockalign(bs->file, s->cluster_size);
if (l2_table == NULL) {
return -ENOMEM;
}
}
for (i = 0; i < l1_size; i++) {
@@ -1740,7 +1767,11 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs)
nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
expanded_clusters = g_malloc0((nb_clusters + 7) / 8);
expanded_clusters = g_try_malloc0((nb_clusters + 7) / 8);
if (expanded_clusters == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
&expanded_clusters, &nb_clusters);

View File

@@ -27,6 +27,7 @@
#include "block/qcow2.h"
#include "qemu/range.h"
#include "qapi/qmp/types.h"
#include "qapi-event.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -45,19 +46,25 @@ int qcow2_refcount_init(BlockDriverState *bs)
assert(s->refcount_table_size <= INT_MAX / sizeof(uint64_t));
refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
s->refcount_table = g_malloc(refcount_table_size2);
s->refcount_table = g_try_malloc(refcount_table_size2);
if (s->refcount_table_size > 0) {
if (s->refcount_table == NULL) {
ret = -ENOMEM;
goto fail;
}
BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
ret = bdrv_pread(bs->file, s->refcount_table_offset,
s->refcount_table, refcount_table_size2);
if (ret != refcount_table_size2)
if (ret < 0) {
goto fail;
}
for(i = 0; i < s->refcount_table_size; i++)
be64_to_cpus(&s->refcount_table[i]);
}
return 0;
fail:
return -ENOMEM;
return ret;
}
void qcow2_refcount_close(BlockDriverState *bs)
@@ -343,8 +350,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
uint64_t meta_offset = (blocks_used * refcount_block_clusters) *
s->cluster_size;
uint64_t table_offset = meta_offset + blocks_clusters * s->cluster_size;
uint16_t *new_blocks = g_malloc0(blocks_clusters * s->cluster_size);
uint64_t *new_table = g_malloc0(table_size * sizeof(uint64_t));
uint64_t *new_table = g_try_new0(uint64_t, table_size);
uint16_t *new_blocks = g_try_malloc0(blocks_clusters * s->cluster_size);
assert(table_size > 0 && blocks_clusters > 0);
if (new_table == NULL || new_blocks == NULL) {
ret = -ENOMEM;
goto fail_table;
}
/* Fill the new refcount table */
memcpy(new_table, s->refcount_table,
@@ -368,6 +381,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
ret = bdrv_pwrite_sync(bs->file, meta_offset, new_blocks,
blocks_clusters * s->cluster_size);
g_free(new_blocks);
new_blocks = NULL;
if (ret < 0) {
goto fail_table;
}
@@ -423,6 +437,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
return -EAGAIN;
fail_table:
g_free(new_blocks);
g_free(new_table);
fail_block:
if (*refcount_block != NULL) {
@@ -846,7 +861,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
int64_t l1_table_offset, int l1_size, int addend)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2, l1_allocated;
uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2;
bool l1_allocated = false;
int64_t old_offset, old_l2_offset;
int i, j, l1_modified = 0, nb_csectors, refcount;
int ret;
@@ -861,8 +877,12 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
* l1_table_offset when it is the current s->l1_table_offset! Be careful
* when changing this! */
if (l1_table_offset != s->l1_table_offset) {
l1_table = g_malloc0(align_offset(l1_size2, 512));
l1_allocated = 1;
l1_table = g_try_malloc0(align_offset(l1_size2, 512));
if (l1_size2 && l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
l1_allocated = true;
ret = bdrv_pread(bs->file, l1_table_offset, l1_table, l1_size2);
if (ret < 0) {
@@ -874,7 +894,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
} else {
assert(l1_size == s->l1_size);
l1_table = s->l1_table;
l1_allocated = 0;
l1_allocated = false;
}
for(i = 0; i < l1_size; i++) {
@@ -1196,7 +1216,11 @@ static int check_refcounts_l1(BlockDriverState *bs,
if (l1_size2 == 0) {
l1_table = NULL;
} else {
l1_table = g_malloc(l1_size2);
l1_table = g_try_malloc(l1_size2);
if (l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
if (bdrv_pread(bs->file, l1_table_offset,
l1_table, l1_size2) != l1_size2)
goto fail;
@@ -1500,7 +1524,11 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
return -EFBIG;
}
refcount_table = g_malloc0(nb_clusters * sizeof(uint16_t));
refcount_table = g_try_new0(uint16_t, nb_clusters);
if (nb_clusters && refcount_table == NULL) {
res->check_errors++;
return -ENOMEM;
}
res->bfi.total_clusters =
size_to_clusters(s, bs->total_sectors * BDRV_SECTOR_SIZE);
@@ -1577,8 +1605,8 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
/* increase refcount_table size if necessary */
int old_nb_clusters = nb_clusters;
nb_clusters = (new_offset >> s->cluster_bits) + 1;
refcount_table = g_realloc(refcount_table,
nb_clusters * sizeof(uint16_t));
refcount_table = g_renew(uint16_t, refcount_table,
nb_clusters);
memset(&refcount_table[old_nb_clusters], 0, (nb_clusters
- old_nb_clusters) * sizeof(uint16_t));
}
@@ -1752,9 +1780,13 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
uint64_t l1_ofs = s->snapshots[i].l1_table_offset;
uint32_t l1_sz = s->snapshots[i].l1_size;
uint64_t l1_sz2 = l1_sz * sizeof(uint64_t);
uint64_t *l1 = g_malloc(l1_sz2);
uint64_t *l1 = g_try_malloc(l1_sz2);
int ret;
if (l1_sz2 && l1 == NULL) {
return -ENOMEM;
}
ret = bdrv_pread(bs->file, l1_ofs, l1, l1_sz2);
if (ret < 0) {
g_free(l1);
@@ -1807,7 +1839,6 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
} else if (ret > 0) {
int metadata_ol_bitnr = ffs(ret) - 1;
char *message;
QObject *data;
assert(metadata_ol_bitnr < QCOW2_OL_MAX_BITNR);
@@ -1816,12 +1847,14 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
metadata_ol_names[metadata_ol_bitnr]);
message = g_strdup_printf("Prevented %s overwrite",
metadata_ol_names[metadata_ol_bitnr]);
data = qobject_from_jsonf("{ 'device': %s, 'msg': %s, 'offset': %"
PRId64 ", 'size': %" PRId64 " }", bs->device_name, message,
offset, size);
monitor_protocol_event(QEVENT_BLOCK_IMAGE_CORRUPTED, data);
qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs),
message,
true,
offset,
true,
size,
&error_abort);
g_free(message);
qobject_decref(data);
qcow2_mark_corrupt(bs);
bs->drv = NULL; /* make BDS unusable */

View File

@@ -58,7 +58,7 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset = s->snapshots_offset;
s->snapshots = g_malloc0(s->nb_snapshots * sizeof(QCowSnapshot));
s->snapshots = g_new0(QCowSnapshot, s->nb_snapshots);
for(i = 0; i < s->nb_snapshots; i++) {
/* Read statically sized part of the snapshot header */
@@ -381,7 +381,12 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
sn->l1_table_offset = l1_table_offset;
sn->l1_size = s->l1_size;
l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_size && l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
for(i = 0; i < s->l1_size; i++) {
l1_table[i] = cpu_to_be64(s->l1_table[i]);
}
@@ -412,7 +417,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
}
/* Append the new snapshot to the snapshot list */
new_snapshot_list = g_malloc((s->nb_snapshots + 1) * sizeof(QCowSnapshot));
new_snapshot_list = g_new(QCowSnapshot, s->nb_snapshots + 1);
if (s->snapshots) {
memcpy(new_snapshot_list, s->snapshots,
s->nb_snapshots * sizeof(QCowSnapshot));
@@ -499,7 +504,11 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
* Decrease the refcount referenced by the old one only when the L1
* table is overwritten.
*/
sn_l1_table = g_malloc0(cur_l1_bytes);
sn_l1_table = g_try_malloc0(cur_l1_bytes);
if (cur_l1_bytes && sn_l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, sn->l1_table_offset, sn_l1_table, sn_l1_bytes);
if (ret < 0) {
@@ -652,7 +661,7 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
return s->nb_snapshots;
}
sn_tab = g_malloc0(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
sn_tab = g_new0(QEMUSnapshotInfo, s->nb_snapshots);
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
sn = s->snapshots + i;
@@ -698,17 +707,21 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
return -EFBIG;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_bytes, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);
if (ret < 0) {
error_setg(errp, "Failed to read l1 table for snapshot");
g_free(new_l1_table);
qemu_vfree(new_l1_table);
return ret;
}
/* Switch the L1 table */
g_free(s->l1_table);
qemu_vfree(s->l1_table);
s->l1_size = sn->l1_size;
s->l1_table_offset = sn->l1_table_offset;

View File

@@ -30,7 +30,9 @@
#include "qemu/error-report.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qbool.h"
#include "qapi/util.h"
#include "trace.h"
#include "qemu/option_int.h"
/*
Differences with QCOW:
@@ -209,20 +211,31 @@ static void GCC_FMT_ATTR(3, 4) report_unsupported(BlockDriverState *bs,
static void report_unsupported_feature(BlockDriverState *bs,
Error **errp, Qcow2Feature *table, uint64_t mask)
{
char *features = g_strdup("");
char *old;
while (table && table->name[0] != '\0') {
if (table->type == QCOW2_FEAT_TYPE_INCOMPATIBLE) {
if (mask & (1 << table->bit)) {
report_unsupported(bs, errp, "%.46s", table->name);
mask &= ~(1 << table->bit);
if (mask & (1ULL << table->bit)) {
old = features;
features = g_strdup_printf("%s%s%.46s", old, *old ? ", " : "",
table->name);
g_free(old);
mask &= ~(1ULL << table->bit);
}
}
table++;
}
if (mask) {
report_unsupported(bs, errp, "Unknown incompatible feature: %" PRIx64,
mask);
old = features;
features = g_strdup_printf("%s%sUnknown incompatible feature: %" PRIx64,
old, *old ? ", " : "", mask);
g_free(old);
}
report_unsupported(bs, errp, "%s", features);
g_free(features);
}
/*
@@ -430,6 +443,22 @@ static QemuOptsList qcow2_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Check for unintended writes into an inactive L2 table",
},
{
.name = QCOW2_OPT_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum combined metadata (L2 tables and refcount blocks) "
"cache size",
},
{
.name = QCOW2_OPT_L2_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum L2 table cache size",
},
{
.name = QCOW2_OPT_REFCOUNT_CACHE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Maximum refcount block cache size",
},
{ /* end of list */ }
},
};
@@ -445,6 +474,61 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = {
[QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2,
};
static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size,
uint64_t *refcount_cache_size, Error **errp)
{
uint64_t combined_cache_size;
bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE);
combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
*refcount_cache_size = qemu_opt_get_size(opts,
QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
if (combined_cache_size_set) {
if (l2_cache_size_set && refcount_cache_size_set) {
error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"the same time");
return;
} else if (*l2_cache_size > combined_cache_size) {
error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
return;
} else if (*refcount_cache_size > combined_cache_size) {
error_setg(errp, QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
return;
}
if (l2_cache_size_set) {
*refcount_cache_size = combined_cache_size - *l2_cache_size;
} else if (refcount_cache_size_set) {
*l2_cache_size = combined_cache_size - *refcount_cache_size;
} else {
*refcount_cache_size = combined_cache_size
/ (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
*l2_cache_size = combined_cache_size - *refcount_cache_size;
}
} else {
if (!l2_cache_size_set && !refcount_cache_size_set) {
*l2_cache_size = DEFAULT_L2_CACHE_BYTE_SIZE;
*refcount_cache_size = *l2_cache_size
/ DEFAULT_L2_REFCOUNT_SIZE_RATIO;
} else if (!l2_cache_size_set) {
*l2_cache_size = *refcount_cache_size
* DEFAULT_L2_REFCOUNT_SIZE_RATIO;
} else if (!refcount_cache_size_set) {
*refcount_cache_size = *l2_cache_size
/ DEFAULT_L2_REFCOUNT_SIZE_RATIO;
}
}
}
static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -458,6 +542,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
uint64_t l1_vm_state_index;
const char *opt_overlap_check;
int overlap_check_template = 0;
uint64_t l2_cache_size, refcount_cache_size;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
@@ -676,8 +761,13 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
if (s->l1_size > 0) {
s->l1_table = g_malloc0(
s->l1_table = qemu_try_blockalign(bs->file,
align_offset(s->l1_size * sizeof(uint64_t), 512));
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate L1 table");
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
if (ret < 0) {
@@ -689,14 +779,61 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
}
/* get L2 table/refcount block cache size from command line options */
opts = qemu_opts_create(&qcow2_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
read_cache_sizes(opts, &l2_cache_size, &refcount_cache_size, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
l2_cache_size /= s->cluster_size;
if (l2_cache_size < MIN_L2_CACHE_SIZE) {
l2_cache_size = MIN_L2_CACHE_SIZE;
}
if (l2_cache_size > INT_MAX) {
error_setg(errp, "L2 cache size too big");
ret = -EINVAL;
goto fail;
}
refcount_cache_size /= s->cluster_size;
if (refcount_cache_size < MIN_REFCOUNT_CACHE_SIZE) {
refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE;
}
if (refcount_cache_size > INT_MAX) {
error_setg(errp, "Refcount cache size too big");
ret = -EINVAL;
goto fail;
}
/* alloc L2 table/refcount block cache */
s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE);
s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE);
s->l2_table_cache = qcow2_cache_create(bs, l2_cache_size);
s->refcount_block_cache = qcow2_cache_create(bs, refcount_cache_size);
if (s->l2_table_cache == NULL || s->refcount_block_cache == NULL) {
error_setg(errp, "Could not allocate metadata caches");
ret = -ENOMEM;
goto fail;
}
s->cluster_cache = g_malloc(s->cluster_size);
/* one more sector for decompressed data alignment */
s->cluster_data = qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+ 512);
s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size + 512);
if (s->cluster_data == NULL) {
error_setg(errp, "Could not allocate temporary cluster buffer");
ret = -ENOMEM;
goto fail;
}
s->cluster_cache_offset = -1;
s->flags = flags;
@@ -770,14 +907,6 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
/* Enable lazy_refcounts according to image and command line options */
opts = qemu_opts_create(&qcow2_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
(s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
@@ -840,7 +969,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
cleanup_unknown_header_ext(bs);
qcow2_free_snapshots(bs);
qcow2_refcount_close(bs);
g_free(s->l1_table);
qemu_vfree(s->l1_table);
/* else pre-write overlap checks in cache_destroy may crash */
s->l1_table = NULL;
if (s->l2_table_cache) {
@@ -854,13 +983,11 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
}
static int qcow2_refresh_limits(BlockDriverState *bs)
static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQcowState *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->cluster_sectors;
return 0;
}
static int qcow2_set_key(BlockDriverState *bs, const char *key)
@@ -1019,11 +1146,20 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
n1 = qcow2_backing_read1(bs->backing_hd, &hd_qiov,
sector_num, cur_nr_sectors);
if (n1 > 0) {
QEMUIOVector local_qiov;
qemu_iovec_init(&local_qiov, hd_qiov.niov);
qemu_iovec_concat(&local_qiov, &hd_qiov, 0,
n1 * BDRV_SECTOR_SIZE);
BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
qemu_co_mutex_unlock(&s->lock);
ret = bdrv_co_readv(bs->backing_hd, sector_num,
n1, &hd_qiov);
n1, &local_qiov);
qemu_co_mutex_lock(&s->lock);
qemu_iovec_destroy(&local_qiov);
if (ret < 0) {
goto fail;
}
@@ -1063,7 +1199,12 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
*/
if (!cluster_data) {
cluster_data =
qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size);
if (cluster_data == NULL) {
ret = -ENOMEM;
goto fail;
}
}
assert(cur_nr_sectors <=
@@ -1163,8 +1304,13 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
if (s->crypt_method) {
if (!cluster_data) {
cluster_data = qemu_blockalign(bs, QCOW_MAX_CRYPT_CLUSTERS *
s->cluster_size);
cluster_data = qemu_try_blockalign(bs->file,
QCOW_MAX_CRYPT_CLUSTERS
* s->cluster_size);
if (cluster_data == NULL) {
ret = -ENOMEM;
goto fail;
}
}
assert(hd_qiov.size <=
@@ -1251,7 +1397,7 @@ fail:
static void qcow2_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
qemu_vfree(s->l1_table);
/* else pre-write overlap checks in cache_destroy may crash */
s->l1_table = NULL;
@@ -1538,7 +1684,7 @@ static int preallocate(BlockDriverState *bs)
int ret;
QCowL2Meta *meta;
nb_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
nb_sectors = bdrv_nb_sectors(bs);
offset = 0;
while (nb_sectors) {
@@ -1593,8 +1739,8 @@ static int preallocate(BlockDriverState *bs)
static int qcow2_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, int prealloc,
QEMUOptionParameter *options, int version,
int flags, size_t cluster_size, PreallocMode prealloc,
QemuOpts *opts, int version,
Error **errp)
{
/* Calculate cluster_bits */
@@ -1626,7 +1772,57 @@ static int qcow2_create2(const char *filename, int64_t total_size,
Error *local_err = NULL;
int ret;
ret = bdrv_create_file(filename, options, &local_err);
if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
int64_t meta_size = 0;
uint64_t nreftablee, nrefblocke, nl1e, nl2e;
int64_t aligned_total_size = align_offset(total_size, cluster_size);
/* header: 1 cluster */
meta_size += cluster_size;
/* total size of L2 tables */
nl2e = aligned_total_size / cluster_size;
nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
meta_size += nl2e * sizeof(uint64_t);
/* total size of L1 tables */
nl1e = nl2e * sizeof(uint64_t) / cluster_size;
nl1e = align_offset(nl1e, cluster_size / sizeof(uint64_t));
meta_size += nl1e * sizeof(uint64_t);
/* total size of refcount blocks
*
* note: every host cluster is reference-counted, including metadata
* (even refcount blocks are recursively included).
* Let:
* a = total_size (this is the guest disk size)
* m = meta size not including refcount blocks and refcount tables
* c = cluster size
* y1 = number of refcount blocks entries
* y2 = meta size including everything
* then,
* y1 = (y2 + a)/c
* y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m
* we can get y1:
* y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c)
*/
nrefblocke = (aligned_total_size + meta_size + cluster_size) /
(cluster_size - sizeof(uint16_t) -
1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size);
nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t));
meta_size += nrefblocke * sizeof(uint16_t);
/* total size of refcount tables */
nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size;
nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t));
meta_size += nreftablee * sizeof(uint64_t);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
aligned_total_size + meta_size);
qemu_opt_set(opts, BLOCK_OPT_PREALLOC, PreallocMode_lookup[prealloc]);
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
@@ -1714,7 +1910,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* Okay, now that we have a valid image, let's give it the right size */
ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
ret = bdrv_truncate(bs, total_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not resize image");
goto out;
@@ -1731,7 +1927,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* And if we're supposed to preallocate metadata, do that now */
if (prealloc) {
if (prealloc != PREALLOC_MODE_OFF) {
BDRVQcowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = preallocate(bs);
@@ -1762,78 +1958,80 @@ out:
return ret;
}
static int qcow2_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
{
const char *backing_file = NULL;
const char *backing_fmt = NULL;
uint64_t sectors = 0;
char *backing_file = NULL;
char *backing_fmt = NULL;
char *buf = NULL;
uint64_t size = 0;
int flags = 0;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
int prealloc = 0;
PreallocMode prealloc;
int version = 3;
Error *local_err = NULL;
int ret;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
sectors = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) {
backing_fmt = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_ENCRYPT)) {
flags |= options->value.n ? BLOCK_FLAG_ENCRYPT : 0;
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
cluster_size = options->value.n;
}
} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
if (!options->value.s || !strcmp(options->value.s, "off")) {
prealloc = 0;
} else if (!strcmp(options->value.s, "metadata")) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'",
options->value.s);
return -EINVAL;
}
} else if (!strcmp(options->name, BLOCK_OPT_COMPAT_LEVEL)) {
if (!options->value.s) {
/* keep the default */
} else if (!strcmp(options->value.s, "0.10")) {
version = 2;
} else if (!strcmp(options->value.s, "1.1")) {
version = 3;
} else {
error_setg(errp, "Invalid compatibility level: '%s'",
options->value.s);
return -EINVAL;
}
} else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) {
flags |= options->value.n ? BLOCK_FLAG_LAZY_REFCOUNTS : 0;
}
options++;
size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
flags |= BLOCK_FLAG_ENCRYPT;
}
cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto finish;
}
g_free(buf);
buf = qemu_opt_get_del(opts, BLOCK_OPT_COMPAT_LEVEL);
if (!buf) {
/* keep the default */
} else if (!strcmp(buf, "0.10")) {
version = 2;
} else if (!strcmp(buf, "1.1")) {
version = 3;
} else {
error_setg(errp, "Invalid compatibility level: '%s'", buf);
ret = -EINVAL;
goto finish;
}
if (backing_file && prealloc) {
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_LAZY_REFCOUNTS, false)) {
flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
}
if (backing_file && prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Backing file and preallocation cannot be used at "
"the same time");
return -EINVAL;
ret = -EINVAL;
goto finish;
}
if (version < 3 && (flags & BLOCK_FLAG_LAZY_REFCOUNTS)) {
error_setg(errp, "Lazy refcounts only supported with compatibility "
"level 1.1 and above (use compat=1.1 or greater)");
return -EINVAL;
ret = -EINVAL;
goto finish;
}
ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
cluster_size, prealloc, options, version, &local_err);
ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
cluster_size, prealloc, opts, version, &local_err);
if (local_err) {
error_propagate(errp, local_err);
}
finish:
g_free(backing_file);
g_free(backing_fmt);
g_free(buf);
return ret;
}
@@ -1926,7 +2124,6 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
/* align end of file to a sector boundary to ease reading with
sector based I/Os */
cluster_offset = bdrv_getlength(bs->file);
cluster_offset = (cluster_offset + 511) & ~511;
bdrv_truncate(bs->file, cluster_offset);
return 0;
}
@@ -2198,64 +2395,72 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version)
return 0;
}
static int qcow2_amend_options(BlockDriverState *bs,
QEMUOptionParameter *options)
static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
{
BDRVQcowState *s = bs->opaque;
int old_version = s->qcow_version, new_version = old_version;
uint64_t new_size = 0;
const char *backing_file = NULL, *backing_format = NULL;
bool lazy_refcounts = s->use_lazy_refcounts;
const char *compat = NULL;
uint64_t cluster_size = s->cluster_size;
bool encrypt;
int ret;
int i;
QemuOptDesc *desc = opts->list->desc;
for (i = 0; options[i].name; i++)
{
if (!options[i].assigned) {
while (desc && desc->name) {
if (!qemu_opt_find(opts, desc->name)) {
/* only change explicitly defined options */
desc++;
continue;
}
if (!strcmp(options[i].name, "compat")) {
if (!options[i].value.s) {
if (!strcmp(desc->name, "compat")) {
compat = qemu_opt_get(opts, "compat");
if (!compat) {
/* preserve default */
} else if (!strcmp(options[i].value.s, "0.10")) {
} else if (!strcmp(compat, "0.10")) {
new_version = 2;
} else if (!strcmp(options[i].value.s, "1.1")) {
} else if (!strcmp(compat, "1.1")) {
new_version = 3;
} else {
fprintf(stderr, "Unknown compatibility level %s.\n",
options[i].value.s);
fprintf(stderr, "Unknown compatibility level %s.\n", compat);
return -EINVAL;
}
} else if (!strcmp(options[i].name, "preallocation")) {
} else if (!strcmp(desc->name, "preallocation")) {
fprintf(stderr, "Cannot change preallocation mode.\n");
return -ENOTSUP;
} else if (!strcmp(options[i].name, "size")) {
new_size = options[i].value.n;
} else if (!strcmp(options[i].name, "backing_file")) {
backing_file = options[i].value.s;
} else if (!strcmp(options[i].name, "backing_fmt")) {
backing_format = options[i].value.s;
} else if (!strcmp(options[i].name, "encryption")) {
if ((options[i].value.n != !!s->crypt_method)) {
} else if (!strcmp(desc->name, "size")) {
new_size = qemu_opt_get_size(opts, "size", 0);
} else if (!strcmp(desc->name, "backing_file")) {
backing_file = qemu_opt_get(opts, "backing_file");
} else if (!strcmp(desc->name, "backing_fmt")) {
backing_format = qemu_opt_get(opts, "backing_fmt");
} else if (!strcmp(desc->name, "encryption")) {
encrypt = qemu_opt_get_bool(opts, "encryption", s->crypt_method);
if (encrypt != !!s->crypt_method) {
fprintf(stderr, "Changing the encryption flag is not "
"supported.\n");
return -ENOTSUP;
}
} else if (!strcmp(options[i].name, "cluster_size")) {
if (options[i].value.n != s->cluster_size) {
} else if (!strcmp(desc->name, "cluster_size")) {
cluster_size = qemu_opt_get_size(opts, "cluster_size",
cluster_size);
if (cluster_size != s->cluster_size) {
fprintf(stderr, "Changing the cluster size is not "
"supported.\n");
return -ENOTSUP;
}
} else if (!strcmp(options[i].name, "lazy_refcounts")) {
lazy_refcounts = options[i].value.n;
} else if (!strcmp(desc->name, "lazy_refcounts")) {
lazy_refcounts = qemu_opt_get_bool(opts, "lazy_refcounts",
lazy_refcounts);
} else {
/* if this assertion fails, this probably means a new option was
* added without having it covered here */
assert(false);
}
desc++;
}
if (new_version != old_version) {
@@ -2324,49 +2529,56 @@ static int qcow2_amend_options(BlockDriverState *bs,
return 0;
}
static QEMUOptionParameter qcow2_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_COMPAT_LEVEL,
.type = OPT_STRING,
.help = "Compatibility level (0.10 or 1.1)"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_BACKING_FMT,
.type = OPT_STRING,
.help = "Image format of the base image"
},
{
.name = BLOCK_OPT_ENCRYPT,
.type = OPT_FLAG,
.help = "Encrypt the image"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "qcow2 cluster size",
.value = { .n = DEFAULT_CLUSTER_SIZE },
},
{
.name = BLOCK_OPT_PREALLOC,
.type = OPT_STRING,
.help = "Preallocation mode (allowed values: off, metadata)"
},
{
.name = BLOCK_OPT_LAZY_REFCOUNTS,
.type = OPT_FLAG,
.help = "Postpone refcount updates",
},
{ NULL }
static QemuOptsList qcow2_create_opts = {
.name = "qcow2-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_COMPAT_LEVEL,
.type = QEMU_OPT_STRING,
.help = "Compatibility level (0.10 or 1.1)"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_BACKING_FMT,
.type = QEMU_OPT_STRING,
.help = "Image format of the base image"
},
{
.name = BLOCK_OPT_ENCRYPT,
.type = QEMU_OPT_BOOL,
.help = "Encrypt the image",
.def_value_str = "off"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "qcow2 cluster size",
.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, metadata, "
"falloc, full)"
},
{
.name = BLOCK_OPT_LAZY_REFCOUNTS,
.type = QEMU_OPT_BOOL,
.help = "Postpone refcount updates",
.def_value_str = "off"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_qcow2 = {
@@ -2394,21 +2606,22 @@ static BlockDriver bdrv_qcow2 = {
.bdrv_snapshot_goto = qcow2_snapshot_goto,
.bdrv_snapshot_delete = qcow2_snapshot_delete,
.bdrv_snapshot_list = qcow2_snapshot_list,
.bdrv_snapshot_load_tmp = qcow2_snapshot_load_tmp,
.bdrv_get_info = qcow2_get_info,
.bdrv_snapshot_load_tmp = qcow2_snapshot_load_tmp,
.bdrv_get_info = qcow2_get_info,
.bdrv_get_specific_info = qcow2_get_specific_info,
.bdrv_save_vmstate = qcow2_save_vmstate,
.bdrv_load_vmstate = qcow2_load_vmstate,
.supports_backing = true,
.bdrv_change_backing_file = qcow2_change_backing_file,
.bdrv_refresh_limits = qcow2_refresh_limits,
.bdrv_invalidate_cache = qcow2_invalidate_cache,
.create_options = qcow2_create_options,
.bdrv_check = qcow2_check,
.bdrv_amend_options = qcow2_amend_options,
.create_opts = &qcow2_create_opts,
.bdrv_check = qcow2_check,
.bdrv_amend_options = qcow2_amend_options,
};
static void bdrv_qcow2_init(void)

View File

@@ -64,10 +64,16 @@
#define MIN_CLUSTER_BITS 9
#define MAX_CLUSTER_BITS 21
#define L2_CACHE_SIZE 16
#define MIN_L2_CACHE_SIZE 1 /* cluster */
/* Must be at least 4 to cover all cases of refcount table growth */
#define REFCOUNT_CACHE_SIZE 4
#define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
/* The refblock cache needs only a fourth of the L2 cache size to cover as many
* clusters */
#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
#define DEFAULT_CLUSTER_SIZE 65536
@@ -85,6 +91,9 @@
#define QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE "overlap-check.snapshot-table"
#define QCOW2_OPT_OVERLAP_INACTIVE_L1 "overlap-check.inactive-l1"
#define QCOW2_OPT_OVERLAP_INACTIVE_L2 "overlap-check.inactive-l2"
#define QCOW2_OPT_CACHE_SIZE "cache-size"
#define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size"
#define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size"
typedef struct QCowHeader {
uint32_t magic;

View File

@@ -227,8 +227,10 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
};
int ret;
check.used_clusters = g_malloc0(((check.nclusters + 31) / 32) *
sizeof(check.used_clusters[0]));
check.used_clusters = g_try_new0(uint32_t, (check.nclusters + 31) / 32);
if (check.nclusters && check.used_clusters == NULL) {
return -ENOMEM;
}
check.result->bfi.total_clusters =
(s->header.image_size + s->header.cluster_size - 1) /

View File

@@ -173,7 +173,7 @@ int qed_read_l1_table_sync(BDRVQEDState *s)
qed_read_table(s, s->header.l1_table_offset,
s->l1_table, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(s->bs), true);
}
return ret;
@@ -194,7 +194,7 @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
qed_write_l1_table(s, index, n, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(s->bs), true);
}
return ret;
@@ -267,7 +267,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
qed_read_l2_table(s, request, offset, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(s->bs), true);
}
return ret;
@@ -289,7 +289,7 @@ int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
qed_write_l2_table(s, request, index, n, flush, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(s->bs), true);
}
return ret;

View File

@@ -21,12 +21,13 @@
static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEDAIOCB *acb = (QEDAIOCB *)blockacb;
AioContext *aio_context = bdrv_get_aio_context(blockacb->bs);
bool finished = false;
/* Wait for the request to finish */
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
aio_poll(aio_context, true);
}
}
@@ -373,6 +374,27 @@ static void bdrv_qed_rebind(BlockDriverState *bs)
s->bs = bs;
}
static void bdrv_qed_detach_aio_context(BlockDriverState *bs)
{
BDRVQEDState *s = bs->opaque;
qed_cancel_need_check_timer(s);
timer_free(s->need_check_timer);
}
static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVQEDState *s = bs->opaque;
s->need_check_timer = aio_timer_new(new_context,
QEMU_CLOCK_VIRTUAL, SCALE_NS,
qed_need_check_timer_cb, s);
if (s->header.features & QED_F_NEED_CHECK) {
qed_start_need_check_timer(s);
}
}
static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -496,8 +518,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
}
}
s->need_check_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
qed_need_check_timer_cb, s);
bdrv_qed_attach_aio_context(bs, bdrv_get_aio_context(bs));
out:
if (ret) {
@@ -507,13 +528,11 @@ out:
return ret;
}
static int bdrv_qed_refresh_limits(BlockDriverState *bs)
static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQEDState *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
return 0;
}
/* We have nothing to do for QED reopen, stubs just return
@@ -528,8 +547,7 @@ static void bdrv_qed_close(BlockDriverState *bs)
{
BDRVQEDState *s = bs->opaque;
qed_cancel_need_check_timer(s);
timer_free(s->need_check_timer);
bdrv_qed_detach_aio_context(bs);
/* Ensure writes reach stable storage */
bdrv_flush(bs->file);
@@ -547,7 +565,7 @@ static void bdrv_qed_close(BlockDriverState *bs)
static int qed_create(const char *filename, uint32_t cluster_size,
uint64_t image_size, uint32_t table_size,
const char *backing_file, const char *backing_fmt,
Error **errp)
QemuOpts *opts, Error **errp)
{
QEDHeader header = {
.magic = QED_MAGIC,
@@ -566,7 +584,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
int ret = 0;
BlockDriverState *bs;
ret = bdrv_create_file(filename, NULL, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
@@ -621,55 +639,54 @@ out:
return ret;
}
static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int bdrv_qed_create(const char *filename, QemuOpts *opts, Error **errp)
{
uint64_t image_size = 0;
uint32_t cluster_size = QED_DEFAULT_CLUSTER_SIZE;
uint32_t table_size = QED_DEFAULT_TABLE_SIZE;
const char *backing_file = NULL;
const char *backing_fmt = NULL;
char *backing_file = NULL;
char *backing_fmt = NULL;
int ret;
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_size = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) {
backing_fmt = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
cluster_size = options->value.n;
}
} else if (!strcmp(options->name, BLOCK_OPT_TABLE_SIZE)) {
if (options->value.n) {
table_size = options->value.n;
}
}
options++;
}
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
cluster_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
QED_DEFAULT_CLUSTER_SIZE);
table_size = qemu_opt_get_size_del(opts, BLOCK_OPT_TABLE_SIZE,
QED_DEFAULT_TABLE_SIZE);
if (!qed_is_cluster_size_valid(cluster_size)) {
error_setg(errp, "QED cluster size must be within range [%u, %u] "
"and power of 2",
QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
return -EINVAL;
ret = -EINVAL;
goto finish;
}
if (!qed_is_table_size_valid(table_size)) {
error_setg(errp, "QED table size must be within range [%u, %u] "
"and power of 2",
QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
return -EINVAL;
ret = -EINVAL;
goto finish;
}
if (!qed_is_image_size_valid(image_size, cluster_size, table_size)) {
error_setg(errp, "QED image size must be a non-zero multiple of "
"cluster size and less than %" PRIu64 " bytes",
qed_max_image_size(cluster_size, table_size));
return -EINVAL;
ret = -EINVAL;
goto finish;
}
return qed_create(filename, cluster_size, image_size, table_size,
backing_file, backing_fmt, errp);
ret = qed_create(filename, cluster_size, image_size, table_size,
backing_file, backing_fmt, opts, errp);
finish:
g_free(backing_file);
g_free(backing_fmt);
return ret;
}
typedef struct {
@@ -743,17 +760,19 @@ static BDRVQEDState *acb_to_s(QEDAIOCB *acb)
/**
* Read from the backing file or zero-fill if no backing file
*
* @s: QED state
* @pos: Byte position in device
* @qiov: Destination I/O vector
* @cb: Completion function
* @opaque: User data for completion function
* @s: QED state
* @pos: Byte position in device
* @qiov: Destination I/O vector
* @backing_qiov: Possibly shortened copy of qiov, to be allocated here
* @cb: Completion function
* @opaque: User data for completion function
*
* This function reads qiov->size bytes starting at pos from the backing file.
* If there is no backing file then zeroes are read.
*/
static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
QEMUIOVector *qiov,
QEMUIOVector **backing_qiov,
BlockDriverCompletionFunc *cb, void *opaque)
{
uint64_t backing_length = 0;
@@ -786,15 +805,21 @@ static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
/* If the read straddles the end of the backing file, shorten it */
size = MIN((uint64_t)backing_length - pos, qiov->size);
assert(*backing_qiov == NULL);
*backing_qiov = g_new(QEMUIOVector, 1);
qemu_iovec_init(*backing_qiov, qiov->niov);
qemu_iovec_concat(*backing_qiov, qiov, 0, size);
BLKDBG_EVENT(s->bs->file, BLKDBG_READ_BACKING_AIO);
bdrv_aio_readv(s->bs->backing_hd, pos / BDRV_SECTOR_SIZE,
qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
*backing_qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
}
typedef struct {
GenericCB gencb;
BDRVQEDState *s;
QEMUIOVector qiov;
QEMUIOVector *backing_qiov;
struct iovec iov;
uint64_t offset;
} CopyFromBackingFileCB;
@@ -811,6 +836,12 @@ static void qed_copy_from_backing_file_write(void *opaque, int ret)
CopyFromBackingFileCB *copy_cb = opaque;
BDRVQEDState *s = copy_cb->s;
if (copy_cb->backing_qiov) {
qemu_iovec_destroy(copy_cb->backing_qiov);
g_free(copy_cb->backing_qiov);
copy_cb->backing_qiov = NULL;
}
if (ret) {
qed_copy_from_backing_file_cb(copy_cb, ret);
return;
@@ -848,11 +879,12 @@ static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
copy_cb = gencb_alloc(sizeof(*copy_cb), cb, opaque);
copy_cb->s = s;
copy_cb->offset = offset;
copy_cb->backing_qiov = NULL;
copy_cb->iov.iov_base = qemu_blockalign(s->bs, len);
copy_cb->iov.iov_len = len;
qemu_iovec_init_external(&copy_cb->qiov, &copy_cb->iov, 1);
qed_read_backing_file(s, pos, &copy_cb->qiov,
qed_read_backing_file(s, pos, &copy_cb->qiov, &copy_cb->backing_qiov,
qed_copy_from_backing_file_write, copy_cb);
}
@@ -919,7 +951,8 @@ static void qed_aio_complete(QEDAIOCB *acb, int ret)
/* Arrange for a bh to invoke the completion function */
acb->bh_ret = ret;
acb->bh = qemu_bh_new(qed_aio_complete_bh, acb);
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
qed_aio_complete_bh, acb);
qemu_bh_schedule(acb->bh);
/* Start next allocating write request waiting behind this one. Note that
@@ -1208,7 +1241,11 @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
struct iovec *iov = acb->qiov->iov;
if (!iov->iov_base) {
iov->iov_base = qemu_blockalign(acb->common.bs, iov->iov_len);
iov->iov_base = qemu_try_blockalign(acb->common.bs, iov->iov_len);
if (iov->iov_base == NULL) {
qed_aio_complete(acb, -ENOMEM);
return;
}
memset(iov->iov_base, 0, iov->iov_len);
}
}
@@ -1294,7 +1331,7 @@ static void qed_aio_read_data(void *opaque, int ret,
return;
} else if (ret != QED_CLUSTER_FOUND) {
qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
qed_aio_next_io, acb);
&acb->backing_qiov, qed_aio_next_io, acb);
return;
}
@@ -1320,6 +1357,12 @@ static void qed_aio_next_io(void *opaque, int ret)
trace_qed_aio_next_io(s, acb, ret, acb->cur_pos + acb->cur_qiov.size);
if (acb->backing_qiov) {
qemu_iovec_destroy(acb->backing_qiov);
g_free(acb->backing_qiov);
acb->backing_qiov = NULL;
}
/* Handle I/O error */
if (ret) {
qed_aio_complete(acb, ret);
@@ -1359,6 +1402,7 @@ static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
acb->qiov_offset = 0;
acb->cur_pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
acb->end_pos = acb->cur_pos + nb_sectors * BDRV_SECTOR_SIZE;
acb->backing_qiov = NULL;
acb->request.l2_table = NULL;
qemu_iovec_init(&acb->cur_qiov, qiov->niov);
@@ -1595,36 +1639,45 @@ static int bdrv_qed_check(BlockDriverState *bs, BdrvCheckResult *result,
return qed_check(s, result, !!fix);
}
static QEMUOptionParameter qed_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size (in bytes)"
}, {
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
}, {
.name = BLOCK_OPT_BACKING_FMT,
.type = OPT_STRING,
.help = "Image format of the base image"
}, {
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "Cluster size (in bytes)",
.value = { .n = QED_DEFAULT_CLUSTER_SIZE },
}, {
.name = BLOCK_OPT_TABLE_SIZE,
.type = OPT_SIZE,
.help = "L1/L2 table size (in clusters)"
},
{ /* end of list */ }
static QemuOptsList qed_create_opts = {
.name = "qed-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qed_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_BACKING_FMT,
.type = QEMU_OPT_STRING,
.help = "Image format of the base image"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Cluster size (in bytes)",
.def_value_str = stringify(QED_DEFAULT_CLUSTER_SIZE)
},
{
.name = BLOCK_OPT_TABLE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "L1/L2 table size (in clusters)"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_qed = {
.format_name = "qed",
.instance_size = sizeof(BDRVQEDState),
.create_options = qed_create_options,
.create_opts = &qed_create_opts,
.supports_backing = true,
.bdrv_probe = bdrv_qed_probe,
.bdrv_rebind = bdrv_qed_rebind,
@@ -1644,6 +1697,8 @@ static BlockDriver bdrv_qed = {
.bdrv_change_backing_file = bdrv_qed_change_backing_file,
.bdrv_invalidate_cache = bdrv_qed_invalidate_cache,
.bdrv_check = bdrv_qed_check,
.bdrv_detach_aio_context = bdrv_qed_detach_aio_context,
.bdrv_attach_aio_context = bdrv_qed_attach_aio_context,
};
static void bdrv_qed_init(void)

View File

@@ -43,7 +43,7 @@
*
* All fields are little-endian on disk.
*/
#define QED_DEFAULT_CLUSTER_SIZE 65536
enum {
QED_MAGIC = 'Q' | 'E' << 8 | 'D' << 16 | '\0' << 24,
@@ -69,7 +69,6 @@ enum {
*/
QED_MIN_CLUSTER_SIZE = 4 * 1024, /* in bytes */
QED_MAX_CLUSTER_SIZE = 64 * 1024 * 1024,
QED_DEFAULT_CLUSTER_SIZE = 64 * 1024,
/* Allocated clusters are tracked using a 2-level pagetable. Table size is
* a multiple of clusters so large maximum image sizes can be supported
@@ -143,6 +142,7 @@ typedef struct QEDAIOCB {
/* Current cluster scatter-gather list */
QEMUIOVector cur_qiov;
QEMUIOVector *backing_qiov;
uint64_t cur_pos; /* position on block device, in bytes */
uint64_t cur_cluster; /* cluster offset in image file */
unsigned int cur_nclusters; /* number of clusters being accessed */

View File

@@ -16,12 +16,20 @@
#include <gnutls/gnutls.h>
#include <gnutls/crypto.h>
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qlist.h"
#include "qapi/qmp/qstring.h"
#include "qapi-event.h"
#define HASH_LENGTH 32
#define QUORUM_OPT_VOTE_THRESHOLD "vote-threshold"
#define QUORUM_OPT_BLKVERIFY "blkverify"
#define QUORUM_OPT_REWRITE "rewrite-corrupted"
#define QUORUM_OPT_READ_PATTERN "read-pattern"
/* This union holds a vote hash value */
typedef union QuorumVoteValue {
@@ -69,6 +77,11 @@ typedef struct BDRVQuorumState {
* It is useful to debug other block drivers by
* comparing them with a reference one.
*/
bool rewrite_corrupted;/* true if the driver must rewrite-on-read corrupted
* block if Quorum is reached.
*/
QuorumReadPattern read_pattern;
} BDRVQuorumState;
typedef struct QuorumAIOCB QuorumAIOCB;
@@ -104,13 +117,18 @@ struct QuorumAIOCB {
int count; /* number of completed AIOCB */
int success_count; /* number of successfully completed AIOCB */
int rewrite_count; /* number of replica to rewrite: count down to
* zero once writes are fired
*/
QuorumVotes votes;
bool is_read;
int vote_ret;
int child_iter; /* which child to read in fifo pattern */
};
static void quorum_vote(QuorumAIOCB *acb);
static bool quorum_vote(QuorumAIOCB *acb);
static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
{
@@ -134,7 +152,6 @@ static AIOCBInfo quorum_aiocb_info = {
static void quorum_aio_finalize(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
int i, ret = 0;
if (acb->vote_ret) {
@@ -144,7 +161,8 @@ static void quorum_aio_finalize(QuorumAIOCB *acb)
acb->common.cb(acb->common.opaque, ret);
if (acb->is_read) {
for (i = 0; i < s->num_children; i++) {
/* on the quorum case acb->child_iter == s->num_children - 1 */
for (i = 0; i <= acb->child_iter; i++) {
qemu_vfree(acb->qcrs[i].buf);
qemu_iovec_destroy(&acb->qcrs[i].qiov);
}
@@ -182,6 +200,7 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
acb->qcrs = g_new0(QuorumChildRequest, s->num_children);
acb->count = 0;
acb->success_count = 0;
acb->rewrite_count = 0;
acb->votes.compare = quorum_sha256_compare;
QLIST_INIT(&acb->votes.vote_list);
acb->is_read = false;
@@ -198,32 +217,22 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
static void quorum_report_bad(QuorumAIOCB *acb, char *node_name, int ret)
{
QObject *data;
assert(node_name);
data = qobject_from_jsonf("{ 'node-name': %s"
", 'sector-num': %" PRId64
", 'sectors-count': %d }",
node_name, acb->sector_num, acb->nb_sectors);
const char *msg = NULL;
if (ret < 0) {
QDict *dict = qobject_to_qdict(data);
qdict_put(dict, "error", qstring_from_str(strerror(-ret)));
msg = strerror(-ret);
}
monitor_protocol_event(QEVENT_QUORUM_REPORT_BAD, data);
qobject_decref(data);
qapi_event_send_quorum_report_bad(!!msg, msg, node_name,
acb->sector_num, acb->nb_sectors, &error_abort);
}
static void quorum_report_failure(QuorumAIOCB *acb)
{
QObject *data;
const char *reference = acb->common.bs->device_name[0] ?
acb->common.bs->device_name :
acb->common.bs->node_name;
data = qobject_from_jsonf("{ 'reference': %s"
", 'sector-num': %" PRId64
", 'sectors-count': %d }",
reference, acb->sector_num, acb->nb_sectors);
monitor_protocol_event(QEVENT_QUORUM_FAILURE, data);
qobject_decref(data);
qapi_event_send_quorum_failure(reference, acb->sector_num,
acb->nb_sectors, &error_abort);
}
static int quorum_vote_error(QuorumAIOCB *acb);
@@ -241,11 +250,57 @@ static bool quorum_has_too_much_io_failed(QuorumAIOCB *acb)
return false;
}
static void quorum_rewrite_aio_cb(void *opaque, int ret)
{
QuorumAIOCB *acb = opaque;
/* one less rewrite to do */
acb->rewrite_count--;
/* wait until all rewrite callbacks have completed */
if (acb->rewrite_count) {
return;
}
quorum_aio_finalize(acb);
}
static BlockDriverAIOCB *read_fifo_child(QuorumAIOCB *acb);
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
{
int i;
assert(dest->niov == source->niov);
assert(dest->size == source->size);
for (i = 0; i < source->niov; i++) {
assert(dest->iov[i].iov_len == source->iov[i].iov_len);
memcpy(dest->iov[i].iov_base,
source->iov[i].iov_base,
source->iov[i].iov_len);
}
}
static void quorum_aio_cb(void *opaque, int ret)
{
QuorumChildRequest *sacb = opaque;
QuorumAIOCB *acb = sacb->parent;
BDRVQuorumState *s = acb->common.bs->opaque;
bool rewrite = false;
if (acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO) {
/* We try to read next child in FIFO order if we fail to read */
if (ret < 0 && ++acb->child_iter < s->num_children) {
read_fifo_child(acb);
return;
}
if (ret == 0) {
quorum_copy_qiov(acb->qiov, &acb->qcrs[acb->child_iter].qiov);
}
acb->vote_ret = ret;
quorum_aio_finalize(acb);
return;
}
sacb->ret = ret;
acb->count++;
@@ -262,12 +317,15 @@ static void quorum_aio_cb(void *opaque, int ret)
/* Do the vote on read */
if (acb->is_read) {
quorum_vote(acb);
rewrite = quorum_vote(acb);
} else {
quorum_has_too_much_io_failed(acb);
}
quorum_aio_finalize(acb);
/* if no rewrite is done the code will finish right away */
if (!rewrite) {
quorum_aio_finalize(acb);
}
}
static void quorum_report_bad_versions(BDRVQuorumState *s,
@@ -287,17 +345,41 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
}
}
static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
QuorumVoteValue *value)
{
int i;
assert(dest->niov == source->niov);
assert(dest->size == source->size);
for (i = 0; i < source->niov; i++) {
assert(dest->iov[i].iov_len == source->iov[i].iov_len);
memcpy(dest->iov[i].iov_base,
source->iov[i].iov_base,
source->iov[i].iov_len);
QuorumVoteVersion *version;
QuorumVoteItem *item;
int count = 0;
/* first count the number of bad versions: done first to avoid concurrency
* issues.
*/
QLIST_FOREACH(version, &acb->votes.vote_list, next) {
if (acb->votes.compare(&version->value, value)) {
continue;
}
QLIST_FOREACH(item, &version->items, next) {
count++;
}
}
/* quorum_rewrite_aio_cb will count down this to zero */
acb->rewrite_count = count;
/* now fire the correcting rewrites */
QLIST_FOREACH(version, &acb->votes.vote_list, next) {
if (acb->votes.compare(&version->value, value)) {
continue;
}
QLIST_FOREACH(item, &version->items, next) {
bdrv_aio_writev(s->bs[item->index], acb->sector_num, acb->qiov,
acb->nb_sectors, quorum_rewrite_aio_cb, acb);
}
}
/* return true if any rewrite is done else false */
return count;
}
static void quorum_count_vote(QuorumVotes *votes,
@@ -477,16 +559,17 @@ static int quorum_vote_error(QuorumAIOCB *acb)
return ret;
}
static void quorum_vote(QuorumAIOCB *acb)
static bool quorum_vote(QuorumAIOCB *acb)
{
bool quorum = true;
bool rewrite = false;
int i, j, ret;
QuorumVoteValue hash;
BDRVQuorumState *s = acb->common.bs->opaque;
QuorumVoteVersion *winner;
if (quorum_has_too_much_io_failed(acb)) {
return;
return false;
}
/* get the index of the first successful read */
@@ -514,7 +597,7 @@ static void quorum_vote(QuorumAIOCB *acb)
/* Every successful read agrees */
if (quorum) {
quorum_copy_qiov(acb->qiov, &acb->qcrs[i].qiov);
return;
return false;
}
/* compute hashes for each successful read, also store indexes */
@@ -547,37 +630,71 @@ static void quorum_vote(QuorumAIOCB *acb)
/* some versions are bad print them */
quorum_report_bad_versions(s, acb, &winner->value);
/* corruption correction is enabled */
if (s->rewrite_corrupted) {
rewrite = quorum_rewrite_bad_versions(s, acb, &winner->value);
}
free_exit:
/* free lists */
quorum_free_vote_list(&acb->votes);
return rewrite;
}
static BlockDriverAIOCB *read_quorum_children(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
int i;
for (i = 0; i < s->num_children; i++) {
acb->qcrs[i].buf = qemu_blockalign(s->bs[i], acb->qiov->size);
qemu_iovec_init(&acb->qcrs[i].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[i].qiov, acb->qiov, acb->qcrs[i].buf);
}
for (i = 0; i < s->num_children; i++) {
bdrv_aio_readv(s->bs[i], acb->sector_num, &acb->qcrs[i].qiov,
acb->nb_sectors, quorum_aio_cb, &acb->qcrs[i]);
}
return &acb->common;
}
static BlockDriverAIOCB *read_fifo_child(QuorumAIOCB *acb)
{
BDRVQuorumState *s = acb->common.bs->opaque;
acb->qcrs[acb->child_iter].buf = qemu_blockalign(s->bs[acb->child_iter],
acb->qiov->size);
qemu_iovec_init(&acb->qcrs[acb->child_iter].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[acb->child_iter].qiov, acb->qiov,
acb->qcrs[acb->child_iter].buf);
bdrv_aio_readv(s->bs[acb->child_iter], acb->sector_num,
&acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
quorum_aio_cb, &acb->qcrs[acb->child_iter]);
return &acb->common;
}
static BlockDriverAIOCB *quorum_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BDRVQuorumState *s = bs->opaque;
QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num,
nb_sectors, cb, opaque);
int i;
acb->is_read = true;
for (i = 0; i < s->num_children; i++) {
acb->qcrs[i].buf = qemu_blockalign(s->bs[i], qiov->size);
qemu_iovec_init(&acb->qcrs[i].qiov, qiov->niov);
qemu_iovec_clone(&acb->qcrs[i].qiov, qiov, acb->qcrs[i].buf);
if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
acb->child_iter = s->num_children - 1;
return read_quorum_children(acb);
}
for (i = 0; i < s->num_children; i++) {
bdrv_aio_readv(s->bs[i], sector_num, &acb->qcrs[i].qiov, nb_sectors,
quorum_aio_cb, &acb->qcrs[i]);
}
return &acb->common;
acb->child_iter = 0;
return read_fifo_child(acb);
}
static BlockDriverAIOCB *quorum_aio_writev(BlockDriverState *bs,
@@ -714,16 +831,44 @@ static QemuOptsList quorum_runtime_opts = {
.type = QEMU_OPT_BOOL,
.help = "Trigger block verify mode if set",
},
{
.name = QUORUM_OPT_REWRITE,
.type = QEMU_OPT_BOOL,
.help = "Rewrite corrupted block on read quorum",
},
{
.name = QUORUM_OPT_READ_PATTERN,
.type = QEMU_OPT_STRING,
.help = "Allowed pattern: quorum, fifo. Quorum is default",
},
{ /* end of list */ }
},
};
static int parse_read_pattern(const char *opt)
{
int i;
if (!opt) {
/* Set quorum as default */
return QUORUM_READ_PATTERN_QUORUM;
}
for (i = 0; i < QUORUM_READ_PATTERN_MAX; i++) {
if (!strcmp(opt, QuorumReadPattern_lookup[i])) {
return i;
}
}
return -EINVAL;
}
static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVQuorumState *s = bs->opaque;
Error *local_err = NULL;
QemuOpts *opts;
QemuOpts *opts = NULL;
bool *opened;
QDict *sub = NULL;
QList *list = NULL;
@@ -759,20 +904,37 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
}
s->threshold = qemu_opt_get_number(opts, QUORUM_OPT_VOTE_THRESHOLD, 0);
/* and validate it against s->num_children */
ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
ret = parse_read_pattern(qemu_opt_get(opts, QUORUM_OPT_READ_PATTERN));
if (ret < 0) {
error_setg(&local_err, "Please set read-pattern as fifo or quorum");
goto exit;
}
s->read_pattern = ret;
/* is the driver in blkverify mode */
if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
s->num_children == 2 && s->threshold == 2) {
s->is_blkverify = true;
} else if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false)) {
fprintf(stderr, "blkverify mode is set by setting blkverify=on "
"and using two files with vote_threshold=2\n");
if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
/* and validate it against s->num_children */
ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
if (ret < 0) {
goto exit;
}
/* is the driver in blkverify mode */
if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
s->num_children == 2 && s->threshold == 2) {
s->is_blkverify = true;
} else if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false)) {
fprintf(stderr, "blkverify mode is set by setting blkverify=on "
"and using two files with vote_threshold=2\n");
}
s->rewrite_corrupted = qemu_opt_get_bool(opts, QUORUM_OPT_REWRITE,
false);
if (s->rewrite_corrupted && s->is_blkverify) {
error_setg(&local_err,
"rewrite-corrupted=on cannot be used with blkverify=on");
ret = -EINVAL;
goto exit;
}
}
/* allocate the children BlockDriverState array */
@@ -827,6 +989,7 @@ close_exit:
g_free(s->bs);
g_free(opened);
exit:
qemu_opts_del(opts);
/* propagate error */
if (local_err) {
error_propagate(errp, local_err);
@@ -848,25 +1011,83 @@ static void quorum_close(BlockDriverState *bs)
g_free(s->bs);
}
static void quorum_detach_aio_context(BlockDriverState *bs)
{
BDRVQuorumState *s = bs->opaque;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_detach_aio_context(s->bs[i]);
}
}
static void quorum_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVQuorumState *s = bs->opaque;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_attach_aio_context(s->bs[i], new_context);
}
}
static void quorum_refresh_filename(BlockDriverState *bs)
{
BDRVQuorumState *s = bs->opaque;
QDict *opts;
QList *children;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_refresh_filename(s->bs[i]);
if (!s->bs[i]->full_open_options) {
return;
}
}
children = qlist_new();
for (i = 0; i < s->num_children; i++) {
QINCREF(s->bs[i]->full_open_options);
qlist_append_obj(children, QOBJECT(s->bs[i]->full_open_options));
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("quorum")));
qdict_put_obj(opts, QUORUM_OPT_VOTE_THRESHOLD,
QOBJECT(qint_from_int(s->threshold)));
qdict_put_obj(opts, QUORUM_OPT_BLKVERIFY,
QOBJECT(qbool_from_int(s->is_blkverify)));
qdict_put_obj(opts, QUORUM_OPT_REWRITE,
QOBJECT(qbool_from_int(s->rewrite_corrupted)));
qdict_put_obj(opts, "children", QOBJECT(children));
bs->full_open_options = opts;
}
static BlockDriver bdrv_quorum = {
.format_name = "quorum",
.protocol_name = "quorum",
.format_name = "quorum",
.protocol_name = "quorum",
.instance_size = sizeof(BDRVQuorumState),
.instance_size = sizeof(BDRVQuorumState),
.bdrv_file_open = quorum_open,
.bdrv_close = quorum_close,
.bdrv_file_open = quorum_open,
.bdrv_close = quorum_close,
.bdrv_refresh_filename = quorum_refresh_filename,
.bdrv_co_flush_to_disk = quorum_co_flush,
.bdrv_co_flush_to_disk = quorum_co_flush,
.bdrv_getlength = quorum_getlength,
.bdrv_getlength = quorum_getlength,
.bdrv_aio_readv = quorum_aio_readv,
.bdrv_aio_writev = quorum_aio_writev,
.bdrv_invalidate_cache = quorum_invalidate_cache,
.bdrv_aio_readv = quorum_aio_readv,
.bdrv_aio_writev = quorum_aio_writev,
.bdrv_invalidate_cache = quorum_invalidate_cache,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = quorum_recurse_is_first_non_filter,
.bdrv_detach_aio_context = quorum_detach_aio_context,
.bdrv_attach_aio_context = quorum_attach_aio_context,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = quorum_recurse_is_first_non_filter,
};
static void bdrv_quorum_init(void)

View File

@@ -34,19 +34,29 @@
/* linux-aio.c - Linux native implementation */
#ifdef CONFIG_LINUX_AIO
void *laio_init(void);
void laio_cleanup(void *s);
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type);
void laio_detach_aio_context(void *s, AioContext *old_context);
void laio_attach_aio_context(void *s, AioContext *new_context);
void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
#endif
#ifdef _WIN32
typedef struct QEMUWin32AIOState QEMUWin32AIOState;
QEMUWin32AIOState *win32_aio_init(void);
void win32_aio_cleanup(QEMUWin32AIOState *aio);
int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile);
BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
QEMUWin32AIOState *aio, HANDLE hfile,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int type);
void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
AioContext *old_context);
void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,
AioContext *new_context);
#endif
#endif /* QEMU_RAW_AIO_H */

View File

@@ -30,6 +30,7 @@
#include "block/thread-pool.h"
#include "qemu/iov.h"
#include "raw-aio.h"
#include "qapi/util.h"
#if defined(__APPLE__) && (__MACH__)
#include <paths.h>
@@ -55,6 +56,9 @@
#include <linux/cdrom.h>
#include <linux/fd.h>
#include <linux/fs.h>
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#ifdef CONFIG_FIEMAP
#include <linux/fiemap.h>
@@ -218,7 +222,7 @@ static int raw_normalize_devicepath(const char **filename)
}
#endif
static void raw_probe_alignment(BlockDriverState *bs)
static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
{
BDRVRawState *s = bs->opaque;
char *buf;
@@ -237,24 +241,24 @@ static void raw_probe_alignment(BlockDriverState *bs)
s->buf_align = 0;
#ifdef BLKSSZGET
if (ioctl(s->fd, BLKSSZGET, &sector_size) >= 0) {
if (ioctl(fd, BLKSSZGET, &sector_size) >= 0) {
bs->request_alignment = sector_size;
}
#endif
#ifdef DKIOCGETBLOCKSIZE
if (ioctl(s->fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
if (ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
bs->request_alignment = sector_size;
}
#endif
#ifdef DIOCGSECTORSIZE
if (ioctl(s->fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
bs->request_alignment = sector_size;
}
#endif
#ifdef CONFIG_XFS
if (s->is_xfs) {
struct dioattr da;
if (xfsctl(NULL, s->fd, XFS_IOC_DIOINFO, &da) >= 0) {
if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, &da) >= 0) {
bs->request_alignment = da.d_miniosz;
/* The kernel returns wrong information for d_mem */
/* s->buf_align = da.d_mem; */
@@ -267,7 +271,7 @@ static void raw_probe_alignment(BlockDriverState *bs)
size_t align;
buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
if (pread(s->fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
if (pread(fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
s->buf_align = align;
break;
}
@@ -279,13 +283,18 @@ static void raw_probe_alignment(BlockDriverState *bs)
size_t align;
buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
if (pread(s->fd, buf, align, 0) >= 0) {
if (pread(fd, buf, align, 0) >= 0) {
bs->request_alignment = align;
break;
}
}
qemu_vfree(buf);
}
if (!s->buf_align || !bs->request_alignment) {
error_setg(errp, "Could not find working O_DIRECT alignment. "
"Try cache.direct=off.");
}
}
static void raw_parse_flags(int bdrv_flags, int *open_flags)
@@ -307,6 +316,29 @@ static void raw_parse_flags(int bdrv_flags, int *open_flags)
}
}
static void raw_detach_aio_context(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_detach_aio_context(s->aio_ctx, bdrv_get_aio_context(bs));
}
#endif
}
static void raw_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_attach_aio_context(s->aio_ctx, new_context);
}
#endif
}
#ifdef CONFIG_LINUX_AIO
static int raw_set_aio(void **aio_ctx, int *use_aio, int bdrv_flags)
{
@@ -447,6 +479,8 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
}
#endif
raw_attach_aio_context(bs, bdrv_get_aio_context(bs));
ret = 0;
fail:
if (filename && (bdrv_flags & BDRV_O_TEMPORARY)) {
@@ -477,13 +511,14 @@ static int raw_reopen_prepare(BDRVReopenState *state,
BDRVRawState *s;
BDRVRawReopenState *raw_s;
int ret = 0;
Error *local_err = NULL;
assert(state != NULL);
assert(state->bs != NULL);
s = state->bs->opaque;
state->opaque = g_malloc0(sizeof(BDRVRawReopenState));
state->opaque = g_new0(BDRVRawReopenState, 1);
raw_s = state->opaque;
#ifdef CONFIG_LINUX_AIO
@@ -549,6 +584,19 @@ static int raw_reopen_prepare(BDRVReopenState *state,
ret = -1;
}
}
/* Fail already reopen_prepare() if we can't get a working O_DIRECT
* alignment with the new fd. */
if (raw_s->fd != -1) {
raw_probe_alignment(state->bs, raw_s->fd, &local_err);
if (local_err) {
qemu_close(raw_s->fd);
raw_s->fd = -1;
error_propagate(errp, local_err);
ret = -EINVAL;
}
}
return ret;
}
@@ -587,14 +635,12 @@ static void raw_reopen_abort(BDRVReopenState *state)
state->opaque = NULL;
}
static int raw_refresh_limits(BlockDriverState *bs)
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVRawState *s = bs->opaque;
raw_probe_alignment(bs);
raw_probe_alignment(bs, s->fd, errp);
bs->bl.opt_mem_alignment = s->buf_align;
return 0;
}
static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
@@ -702,6 +748,15 @@ static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf)
}
if (len == -1 && errno == EINTR) {
continue;
} else if (len == -1 && errno == EINVAL &&
(aiocb->bs->open_flags & BDRV_O_NOCACHE) &&
!(aiocb->aio_type & QEMU_AIO_WRITE) &&
offset > 0) {
/* O_DIRECT pread() may fail with EINVAL when offset is unaligned
* after a short read. Assume that O_DIRECT short reads only occur
* at EOF. Therefore this is a short read, not an I/O error.
*/
break;
} else if (len == -1) {
offset = -errno;
break;
@@ -753,7 +808,11 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
* Ok, we have to do it the hard way, copy all segments into
* a single aligned buffer.
*/
buf = qemu_blockalign(aiocb->bs, aiocb->aio_nbytes);
buf = qemu_try_blockalign(aiocb->bs, aiocb->aio_nbytes);
if (buf == NULL) {
return -ENOMEM;
}
if (aiocb->aio_type & QEMU_AIO_WRITE) {
char *p = buf;
int i;
@@ -762,6 +821,7 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
memcpy(p, aiocb->aio_iov[i].iov_base, aiocb->aio_iov[i].iov_len);
p += aiocb->aio_iov[i].iov_len;
}
assert(p - buf == aiocb->aio_nbytes);
}
nbytes = handle_aiocb_rw_linear(aiocb, buf);
@@ -776,9 +836,11 @@ static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
copy = aiocb->aio_iov[i].iov_len;
}
memcpy(aiocb->aio_iov[i].iov_base, p, copy);
assert(count >= copy);
p += copy;
count -= copy;
}
assert(count == 0);
}
qemu_vfree(buf);
@@ -965,12 +1027,14 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
acb->aio_type = type;
acb->aio_fildes = fd;
acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
if (qiov) {
acb->aio_iov = qiov->iov;
acb->aio_niov = qiov->niov;
assert(qiov->size == acb->aio_nbytes);
}
acb->aio_nbytes = nb_sectors * 512;
acb->aio_offset = sector_num * 512;
trace_paio_submit_co(sector_num, nb_sectors, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
@@ -988,12 +1052,14 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
acb->aio_type = type;
acb->aio_fildes = fd;
acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
if (qiov) {
acb->aio_iov = qiov->iov;
acb->aio_niov = qiov->niov;
assert(qiov->size == acb->aio_nbytes);
}
acb->aio_nbytes = nb_sectors * 512;
acb->aio_offset = sector_num * 512;
trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
@@ -1029,6 +1095,36 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
cb, opaque, type);
}
static void raw_aio_plug(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_plug(bs, s->aio_ctx);
}
#endif
}
static void raw_aio_unplug(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_unplug(bs, s->aio_ctx, true);
}
#endif
}
static void raw_aio_flush_io_queue(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_unplug(bs, s->aio_ctx, false);
}
#endif
}
static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
@@ -1059,6 +1155,14 @@ static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
static void raw_close(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
raw_detach_aio_context(bs);
#ifdef CONFIG_LINUX_AIO
if (s->use_aio) {
laio_cleanup(s->aio_ctx);
}
#endif
if (s->fd >= 0) {
qemu_close(s->fd);
s->fd = -1;
@@ -1097,12 +1201,12 @@ static int64_t raw_getlength(BlockDriverState *bs)
struct stat st;
if (fstat(fd, &st))
return -1;
return -errno;
if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
struct disklabel dl;
if (ioctl(fd, DIOCGDINFO, &dl))
return -1;
return -errno;
return (uint64_t)dl.d_secsize *
dl.d_partitions[DISKPART(st.st_rdev)].p_size;
} else
@@ -1116,7 +1220,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
struct stat st;
if (fstat(fd, &st))
return -1;
return -errno;
if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
struct dkwedge_info dkw;
@@ -1126,7 +1230,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
struct disklabel dl;
if (ioctl(fd, DIOCGDINFO, &dl))
return -1;
return -errno;
return (uint64_t)dl.d_secsize *
dl.d_partitions[DISKPART(st.st_rdev)].p_size;
}
@@ -1139,6 +1243,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
BDRVRawState *s = bs->opaque;
struct dk_minfo minfo;
int ret;
int64_t size;
ret = fd_open(bs);
if (ret < 0) {
@@ -1157,7 +1262,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
* There are reports that lseek on some devices fails, but
* irc discussion said that contingency on contingency was overkill.
*/
return lseek(s->fd, 0, SEEK_END);
size = lseek(s->fd, 0, SEEK_END);
if (size < 0) {
return -errno;
}
return size;
}
#elif defined(CONFIG_BSD)
static int64_t raw_getlength(BlockDriverState *bs)
@@ -1195,6 +1304,9 @@ again:
size = LLONG_MAX;
#else
size = lseek(fd, 0LL, SEEK_END);
if (size < 0) {
return -errno;
}
#endif
#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
switch(s->type) {
@@ -1211,6 +1323,9 @@ again:
#endif
} else {
size = lseek(fd, 0, SEEK_END);
if (size < 0) {
return -errno;
}
}
return size;
}
@@ -1219,13 +1334,18 @@ static int64_t raw_getlength(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
int ret;
int64_t size;
ret = fd_open(bs);
if (ret < 0) {
return ret;
}
return lseek(s->fd, 0, SEEK_END);
size = lseek(s->fd, 0, SEEK_END);
if (size < 0) {
return -errno;
}
return size;
}
#endif
@@ -1240,21 +1360,31 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return (int64_t)st.st_blocks * 512;
}
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
{
int fd;
int result = 0;
int64_t total_size = 0;
bool nocow = false;
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
strstart(filename, "file:", &filename);
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / BDRV_SECTOR_SIZE;
}
options++;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
&local_err);
g_free(buf);
if (local_err) {
error_propagate(errp, local_err);
result = -EINVAL;
goto out;
}
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
@@ -1262,16 +1392,66 @@ static int raw_create(const char *filename, QEMUOptionParameter *options,
if (fd < 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not create file");
} else {
if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
}
if (qemu_close(fd) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
}
goto out;
}
if (nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
* will be ignored since any failure of this operation should not
* block the left work.
*/
int attr;
if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
attr |= FS_NOCOW_FL;
ioctl(fd, FS_IOC_SETFLAGS, &attr);
}
#endif
}
if (ftruncate(fd, total_size) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
goto out_close;
}
if (prealloc == PREALLOC_MODE_FALLOC) {
/* posix_fallocate() doesn't set errno. */
result = -posix_fallocate(fd, 0, total_size);
if (result != 0) {
error_setg_errno(errp, -result,
"Could not preallocate data for the new file");
}
} else if (prealloc == PREALLOC_MODE_FULL) {
buf = g_malloc0(65536);
int64_t num = 0, left = total_size;
while (left > 0) {
num = MIN(left, 65536);
result = write(fd, buf, num);
if (result < 0) {
result = -errno;
error_setg_errno(errp, -result,
"Could not write to the new file");
break;
}
left -= num;
}
fsync(fd);
g_free(buf);
} else if (prealloc != PREALLOC_MODE_OFF) {
result = -EINVAL;
error_setg(errp, "Unsupported preallocation mode: %s",
PreallocMode_lookup[prealloc]);
}
out_close:
if (qemu_close(fd) != 0 && result == 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
}
out:
return result;
}
@@ -1440,13 +1620,27 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_NOCOW,
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, falloc, full)"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_file = {
@@ -1471,6 +1665,9 @@ static BlockDriver bdrv_file = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_aio_discard = raw_aio_discard,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -1478,7 +1675,10 @@ static BlockDriver bdrv_file = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.create_options = raw_create_options,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
.create_opts = &raw_create_opts,
};
/***********************************************/
@@ -1799,7 +1999,7 @@ static coroutine_fn int hdev_co_write_zeroes(BlockDriverState *bs,
return -ENOTSUP;
}
static int hdev_create(const char *filename, QEMUOptionParameter *options,
static int hdev_create(const char *filename, QemuOpts *opts,
Error **errp)
{
int fd;
@@ -1820,12 +2020,8 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options,
(void)has_prefix;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, "size")) {
total_size = options->value.n / BDRV_SECTOR_SIZE;
}
options++;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
fd = qemu_open(filename, O_WRONLY | O_BINARY);
if (fd < 0) {
@@ -1841,7 +2037,7 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options,
error_setg(errp,
"The given file is neither a block nor a character device");
ret = -ENODEV;
} else if (lseek(fd, 0, SEEK_END) < total_size * BDRV_SECTOR_SIZE) {
} else if (lseek(fd, 0, SEEK_END) < total_size) {
error_setg(errp, "Device is too small");
ret = -ENOSPC;
}
@@ -1862,8 +2058,8 @@ static BlockDriver bdrv_host_device = {
.bdrv_reopen_prepare = raw_reopen_prepare,
.bdrv_reopen_commit = raw_reopen_commit,
.bdrv_reopen_abort = raw_reopen_abort,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.bdrv_create = hdev_create,
.create_opts = &raw_create_opts,
.bdrv_co_write_zeroes = hdev_co_write_zeroes,
.bdrv_aio_readv = raw_aio_readv,
@@ -1871,6 +2067,9 @@ static BlockDriver bdrv_host_device = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_aio_discard = hdev_aio_discard,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -1878,6 +2077,9 @@ static BlockDriver bdrv_host_device = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
/* generic scsi device */
#ifdef __linux__
.bdrv_ioctl = hdev_ioctl,
@@ -2006,13 +2208,16 @@ static BlockDriver bdrv_host_floppy = {
.bdrv_reopen_prepare = raw_reopen_prepare,
.bdrv_reopen_commit = raw_reopen_commit,
.bdrv_reopen_abort = raw_reopen_abort,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.bdrv_create = hdev_create,
.create_opts = &raw_create_opts,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2020,6 +2225,9 @@ static BlockDriver bdrv_host_floppy = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
/* removable device support */
.bdrv_is_inserted = floppy_is_inserted,
.bdrv_media_changed = floppy_media_changed,
@@ -2131,13 +2339,16 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_reopen_prepare = raw_reopen_prepare,
.bdrv_reopen_commit = raw_reopen_commit,
.bdrv_reopen_abort = raw_reopen_abort,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.bdrv_create = hdev_create,
.create_opts = &raw_create_opts,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2145,6 +2356,9 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
/* removable device support */
.bdrv_is_inserted = cdrom_is_inserted,
.bdrv_eject = cdrom_eject,
@@ -2263,12 +2477,15 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_reopen_commit = raw_reopen_commit,
.bdrv_reopen_abort = raw_reopen_abort,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.create_opts = &raw_create_opts,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2276,6 +2493,9 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
/* removable device support */
.bdrv_is_inserted = cdrom_is_inserted,
.bdrv_eject = cdrom_eject,
@@ -2283,40 +2503,6 @@ static BlockDriver bdrv_host_cdrom = {
};
#endif /* __FreeBSD__ */
#ifdef CONFIG_LINUX_AIO
/**
* Return the file descriptor for Linux AIO
*
* This function is a layering violation and should be removed when it becomes
* possible to call the block layer outside the global mutex. It allows the
* caller to hijack the file descriptor so I/O can be performed outside the
* block layer.
*/
int raw_get_aio_fd(BlockDriverState *bs)
{
BDRVRawState *s;
if (!bs->drv) {
return -ENOMEDIUM;
}
if (bs->drv == bdrv_find_format("raw")) {
bs = bs->file;
}
/* raw-posix has several protocols so just check for raw_aio_readv */
if (bs->drv->bdrv_aio_readv != raw_aio_readv) {
return -ENOTSUP;
}
s = bs->opaque;
if (!s->use_aio) {
return -ENOTSUP;
}
return s->fd;
}
#endif /* CONFIG_LINUX_AIO */
static void bdrv_file_init(void)
{
/*

View File

@@ -36,8 +36,6 @@
#define FTYPE_CD 1
#define FTYPE_HARDDISK 2
static QEMUWin32AIOState *aio;
typedef struct RawWin32AIOData {
BlockDriverState *bs;
HANDLE hfile;
@@ -202,6 +200,25 @@ static int set_sparse(int fd)
NULL, 0, NULL, 0, &returned, NULL);
}
static void raw_detach_aio_context(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
}
}
static void raw_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
win32_aio_attach_aio_context(s->aio, new_context);
}
}
static void raw_probe_alignment(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
@@ -300,15 +317,6 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
raw_parse_flags(flags, &access_flags, &overlapped);
if ((flags & BDRV_O_NATIVE_AIO) && aio == NULL) {
aio = win32_aio_init();
if (aio == NULL) {
error_setg(errp, "Could not initialize AIO");
ret = -EINVAL;
goto fail;
}
}
if (filename[0] && filename[1] == ':') {
snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", filename[0]);
} else if (filename[0] == '\\' && filename[1] == '\\') {
@@ -335,13 +343,23 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
}
if (flags & BDRV_O_NATIVE_AIO) {
ret = win32_aio_attach(aio, s->hfile);
s->aio = win32_aio_init();
if (s->aio == NULL) {
CloseHandle(s->hfile);
error_setg(errp, "Could not initialize AIO");
ret = -EINVAL;
goto fail;
}
ret = win32_aio_attach(s->aio, s->hfile);
if (ret < 0) {
win32_aio_cleanup(s->aio);
CloseHandle(s->hfile);
error_setg_errno(errp, -ret, "Could not enable AIO");
goto fail;
}
s->aio = aio;
win32_aio_attach_aio_context(s->aio, bdrv_get_aio_context(bs));
}
raw_probe_alignment(bs);
@@ -389,6 +407,13 @@ static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
static void raw_close(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
win32_aio_cleanup(s->aio);
s->aio = NULL;
}
CloseHandle(s->hfile);
if (bs->open_flags & BDRV_O_TEMPORARY) {
unlink(bs->filename);
@@ -478,8 +503,7 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return st.st_size;
}
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
{
int fd;
int64_t total_size = 0;
@@ -487,12 +511,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options,
strstart(filename, "file:", &filename);
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / 512;
}
options++;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
@@ -501,18 +521,23 @@ static int raw_create(const char *filename, QEMUOptionParameter *options,
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size * 512);
ftruncate(fd, total_size);
qemu_close(fd);
return 0;
}
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_file = {
@@ -521,9 +546,9 @@ static BlockDriver bdrv_file = {
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_parse_filename = raw_parse_filename,
.bdrv_file_open = raw_open,
.bdrv_close = raw_close,
.bdrv_create = raw_create,
.bdrv_file_open = raw_open,
.bdrv_close = raw_close,
.bdrv_create = raw_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_aio_readv = raw_aio_readv,
@@ -535,7 +560,7 @@ static BlockDriver bdrv_file = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.create_options = raw_create_options,
.create_opts = &raw_create_opts,
};
/***********************************************/
@@ -684,6 +709,9 @@ static BlockDriver bdrv_host_device = {
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
.bdrv_getlength = raw_getlength,
.has_variable_length = true,

View File

@@ -29,13 +29,17 @@
#include "block/block_int.h"
#include "qemu/option.h"
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ 0 }
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
};
static int raw_reopen_prepare(BDRVReopenState *reopen_state,
@@ -90,10 +94,9 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return bdrv_get_info(bs->file, bdi);
}
static int raw_refresh_limits(BlockDriverState *bs)
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl = bs->file->bl;
return 0;
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
@@ -139,13 +142,12 @@ static int raw_has_zero_init(BlockDriverState *bs)
return bdrv_has_zero_init(bs->file);
}
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
{
Error *local_err = NULL;
int ret;
ret = bdrv_create_file(filename, options, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
}
@@ -194,7 +196,7 @@ static BlockDriver bdrv_raw = {
.bdrv_lock_medium = &raw_lock_medium,
.bdrv_ioctl = &raw_ioctl,
.bdrv_aio_ioctl = &raw_aio_ioctl,
.create_options = &raw_create_options[0],
.create_opts = &raw_create_opts,
.bdrv_has_zero_init = &raw_has_zero_init
};

View File

@@ -289,8 +289,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf, Error **errp)
return ret;
}
static int qemu_rbd_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
{
Error *local_err = NULL;
int64_t bytes = 0;
@@ -315,24 +314,19 @@ static int qemu_rbd_create(const char *filename, QEMUOptionParameter *options,
}
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
bytes = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
objsize = options->value.n;
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_setg(errp, "obj size needs to be power of 2");
return -EINVAL;
}
if (objsize < 4096) {
error_setg(errp, "obj size too small");
return -EINVAL;
}
obj_order = ffs(objsize) - 1;
}
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
objsize = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE, 0);
if (objsize) {
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_setg(errp, "obj size needs to be power of 2");
return -EINVAL;
}
options++;
if (objsize < 4096) {
error_setg(errp, "obj size too small");
return -EINVAL;
}
obj_order = ffs(objsize) - 1;
}
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
@@ -555,7 +549,7 @@ static void qemu_rbd_aio_cancel(BlockDriverAIOCB *blockacb)
acb->cancelled = 1;
while (acb->status == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(acb->common.bs), true);
}
qemu_aio_release(acb);
@@ -588,7 +582,8 @@ static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb)
rcb->ret = rbd_aio_get_return_value(c);
rbd_aio_release(c);
acb->bh = qemu_bh_new(rbd_finish_bh, rcb);
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
rbd_finish_bh, rcb);
qemu_bh_schedule(acb->bh);
}
@@ -623,7 +618,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
RBDAIOCmd cmd)
{
RBDAIOCB *acb;
RADOSCB *rcb;
RADOSCB *rcb = NULL;
rbd_completion_t c;
int64_t off, size;
char *buf;
@@ -637,7 +632,10 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
acb->bounce = NULL;
} else {
acb->bounce = qemu_blockalign(bs, qiov->size);
acb->bounce = qemu_try_blockalign(bs, qiov->size);
if (acb->bounce == NULL) {
goto failed;
}
}
acb->ret = 0;
acb->error = 0;
@@ -655,7 +653,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
off = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
rcb = g_malloc(sizeof(RADOSCB));
rcb = g_new(RADOSCB, 1);
rcb->done = 0;
rcb->acb = acb;
rcb->buf = buf;
@@ -684,13 +682,16 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
}
if (r < 0) {
goto failed;
goto failed_completion;
}
return &acb->common;
failed_completion:
rbd_aio_release(c);
failed:
g_free(rcb);
qemu_vfree(acb->bounce);
qemu_aio_release(acb);
return NULL;
}
@@ -862,7 +863,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
int max_snaps = RBD_MAX_SNAPS;
do {
snaps = g_malloc(sizeof(*snaps) * max_snaps);
snaps = g_new(rbd_snap_info_t, max_snaps);
snap_count = rbd_snap_list(s->image, snaps, &max_snaps);
if (snap_count <= 0) {
g_free(snaps);
@@ -873,7 +874,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
goto done;
}
sn_tab = g_malloc0(snap_count * sizeof(QEMUSnapshotInfo));
sn_tab = g_new0(QEMUSnapshotInfo, snap_count);
for (i = 0; i < snap_count; i++) {
const char *snap_name = snaps[i].name;
@@ -907,18 +908,22 @@ static BlockDriverAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
}
#endif
static QEMUOptionParameter qemu_rbd_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "RBD object size"
},
{NULL}
static QemuOptsList qemu_rbd_create_opts = {
.name = "rbd-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_rbd_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "RBD object size"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_rbd = {
@@ -930,7 +935,7 @@ static BlockDriver bdrv_rbd = {
.bdrv_create = qemu_rbd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_get_info = qemu_rbd_getinfo,
.create_options = qemu_rbd_create_options,
.create_opts = &qemu_rbd_create_opts,
.bdrv_getlength = qemu_rbd_getlength,
.bdrv_truncate = qemu_rbd_truncate,
.protocol_name = "rbd",

View File

@@ -103,6 +103,9 @@
#define SD_INODE_SIZE (sizeof(SheepdogInode))
#define CURRENT_VDI_ID 0
#define LOCK_TYPE_NORMAL 0
#define LOCK_TYPE_SHARED 1 /* for iSCSI multipath */
typedef struct SheepdogReq {
uint8_t proto_ver;
uint8_t opcode;
@@ -166,7 +169,8 @@ typedef struct SheepdogVdiReq {
uint8_t copy_policy;
uint8_t reserved[2];
uint32_t snapid;
uint32_t pad[3];
uint32_t type;
uint32_t pad[2];
} SheepdogVdiReq;
typedef struct SheepdogVdiRsp {
@@ -200,6 +204,8 @@ typedef struct SheepdogInode {
uint32_t data_vdi_id[MAX_DATA_OBJS];
} SheepdogInode;
#define SD_INODE_HEADER_SIZE offsetof(SheepdogInode, data_vdi_id)
/*
* 64 bit FNV-1a non-zero initial basis
*/
@@ -282,6 +288,7 @@ typedef struct AIOReq {
unsigned int data_len;
uint8_t flags;
uint32_t id;
bool create;
QLIST_ENTRY(AIOReq) aio_siblings;
} AIOReq;
@@ -314,6 +321,7 @@ struct SheepdogAIOCB {
typedef struct BDRVSheepdogState {
BlockDriverState *bs;
AioContext *aio_context;
SheepdogInode inode;
@@ -404,7 +412,7 @@ static const char * sd_strerror(int err)
static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, SheepdogAIOCB *acb,
uint64_t oid, unsigned int data_len,
uint64_t offset, uint8_t flags,
uint64_t offset, uint8_t flags, bool create,
uint64_t base_oid, unsigned int iov_offset)
{
AIOReq *aio_req;
@@ -418,6 +426,7 @@ static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, SheepdogAIOCB *acb,
aio_req->data_len = data_len;
aio_req->flags = flags;
aio_req->id = s->aioreq_seq_num++;
aio_req->create = create;
acb->nr_pending++;
return aio_req;
@@ -496,7 +505,7 @@ static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
sd_finish_aiocb(acb);
return;
}
qemu_aio_wait();
aio_poll(s->aio_context, true);
}
}
@@ -578,6 +587,7 @@ static void restart_co_req(void *opaque)
typedef struct SheepdogReqCo {
int sockfd;
AioContext *aio_context;
SheepdogReq *hdr;
void *data;
unsigned int *wlen;
@@ -598,14 +608,14 @@ static coroutine_fn void do_co_req(void *opaque)
unsigned int *rlen = srco->rlen;
co = qemu_coroutine_self();
qemu_aio_set_fd_handler(sockfd, NULL, restart_co_req, co);
aio_set_fd_handler(srco->aio_context, sockfd, NULL, restart_co_req, co);
ret = send_co_req(sockfd, hdr, data, wlen);
if (ret < 0) {
goto out;
}
qemu_aio_set_fd_handler(sockfd, restart_co_req, NULL, co);
aio_set_fd_handler(srco->aio_context, sockfd, restart_co_req, NULL, co);
ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
if (ret != sizeof(*hdr)) {
@@ -630,18 +640,19 @@ static coroutine_fn void do_co_req(void *opaque)
out:
/* there is at most one request for this sockfd, so it is safe to
* set each handler to NULL. */
qemu_aio_set_fd_handler(sockfd, NULL, NULL, NULL);
aio_set_fd_handler(srco->aio_context, sockfd, NULL, NULL, NULL);
srco->ret = ret;
srco->finished = true;
}
static int do_req(int sockfd, SheepdogReq *hdr, void *data,
unsigned int *wlen, unsigned int *rlen)
static int do_req(int sockfd, AioContext *aio_context, SheepdogReq *hdr,
void *data, unsigned int *wlen, unsigned int *rlen)
{
Coroutine *co;
SheepdogReqCo srco = {
.sockfd = sockfd,
.aio_context = aio_context,
.hdr = hdr,
.data = data,
.wlen = wlen,
@@ -656,7 +667,7 @@ static int do_req(int sockfd, SheepdogReq *hdr, void *data,
co = qemu_coroutine_create(do_co_req);
qemu_coroutine_enter(co, &srco);
while (!srco.finished) {
qemu_aio_wait();
aio_poll(aio_context, true);
}
}
@@ -664,8 +675,8 @@ static int do_req(int sockfd, SheepdogReq *hdr, void *data,
}
static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
struct iovec *iov, int niov, bool create,
enum AIOCBState aiocb_type);
struct iovec *iov, int niov,
enum AIOCBState aiocb_type);
static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag);
static int get_sheep_fd(BDRVSheepdogState *s, Error **errp);
@@ -698,18 +709,17 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid)
/* move aio_req from pending list to inflight one */
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, false,
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
acb->aiocb_type);
}
}
static coroutine_fn void reconnect_to_sdog(void *opaque)
{
Error *local_err = NULL;
BDRVSheepdogState *s = opaque;
AIOReq *aio_req, *next;
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL);
aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
close(s->fd);
s->fd = -1;
@@ -720,6 +730,7 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
/* Try to reconnect the sheepdog server every one second. */
while (s->fd < 0) {
Error *local_err = NULL;
s->fd = get_sheep_fd(s, &local_err);
if (s->fd < 0) {
DPRINTF("Wait for connection to be established\n");
@@ -797,7 +808,7 @@ static void coroutine_fn aio_read_response(void *opaque)
}
idx = data_oid_to_idx(aio_req->oid);
if (s->inode.data_vdi_id[idx] != s->inode.vdi_id) {
if (aio_req->create) {
/*
* If the object is newly created one, we need to update
* the vdi object (metadata object). min_dirty_data_idx
@@ -922,7 +933,7 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
return fd;
}
qemu_aio_set_fd_handler(fd, co_read_response, NULL, s);
aio_set_fd_handler(s->aio_context, fd, co_read_response, NULL, s);
return fd;
}
@@ -1083,6 +1094,7 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
memset(&hdr, 0, sizeof(hdr));
if (lock) {
hdr.opcode = SD_OP_LOCK_VDI;
hdr.type = LOCK_TYPE_NORMAL;
} else {
hdr.opcode = SD_OP_GET_VDI_INFO;
}
@@ -1092,7 +1104,7 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
hdr.snapid = snapid;
hdr.flags = SD_FLAG_CMD_WRITE;
ret = do_req(fd, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
if (ret) {
error_setg_errno(errp, -ret, "cannot get vdi info");
goto out;
@@ -1103,6 +1115,8 @@ static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
sd_strerror(rsp->result), filename, snapid, tag);
if (rsp->result == SD_RES_NO_VDI) {
ret = -ENOENT;
} else if (rsp->result == SD_RES_VDI_LOCKED) {
ret = -EBUSY;
} else {
ret = -EIO;
}
@@ -1117,8 +1131,8 @@ out:
}
static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
struct iovec *iov, int niov, bool create,
enum AIOCBState aiocb_type)
struct iovec *iov, int niov,
enum AIOCBState aiocb_type)
{
int nr_copies = s->inode.nr_copies;
SheepdogObjReq hdr;
@@ -1129,6 +1143,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
uint64_t offset = aio_req->offset;
uint8_t flags = aio_req->flags;
uint64_t old_oid = aio_req->base_oid;
bool create = aio_req->create;
if (!nr_copies) {
error_report("bug");
@@ -1173,7 +1188,8 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
qemu_co_mutex_lock(&s->lock);
s->co_send = qemu_coroutine_self();
qemu_aio_set_fd_handler(s->fd, co_read_response, co_write_request, s);
aio_set_fd_handler(s->aio_context, s->fd,
co_read_response, co_write_request, s);
socket_set_cork(s->fd, 1);
/* send a header */
@@ -1191,12 +1207,13 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
}
out:
socket_set_cork(s->fd, 0);
qemu_aio_set_fd_handler(s->fd, co_read_response, NULL, s);
aio_set_fd_handler(s->aio_context, s->fd, co_read_response, NULL, s);
s->co_send = NULL;
qemu_co_mutex_unlock(&s->lock);
}
static int read_write_object(int fd, char *buf, uint64_t oid, uint8_t copies,
static int read_write_object(int fd, AioContext *aio_context, char *buf,
uint64_t oid, uint8_t copies,
unsigned int datalen, uint64_t offset,
bool write, bool create, uint32_t cache_flags)
{
@@ -1229,7 +1246,7 @@ static int read_write_object(int fd, char *buf, uint64_t oid, uint8_t copies,
hdr.offset = offset;
hdr.copies = copies;
ret = do_req(fd, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
ret = do_req(fd, aio_context, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
if (ret) {
error_report("failed to send a request to the sheep");
return ret;
@@ -1244,19 +1261,23 @@ static int read_write_object(int fd, char *buf, uint64_t oid, uint8_t copies,
}
}
static int read_object(int fd, char *buf, uint64_t oid, uint8_t copies,
static int read_object(int fd, AioContext *aio_context, char *buf,
uint64_t oid, uint8_t copies,
unsigned int datalen, uint64_t offset,
uint32_t cache_flags)
{
return read_write_object(fd, buf, oid, copies, datalen, offset, false,
return read_write_object(fd, aio_context, buf, oid, copies,
datalen, offset, false,
false, cache_flags);
}
static int write_object(int fd, char *buf, uint64_t oid, uint8_t copies,
static int write_object(int fd, AioContext *aio_context, char *buf,
uint64_t oid, uint8_t copies,
unsigned int datalen, uint64_t offset, bool create,
uint32_t cache_flags)
{
return read_write_object(fd, buf, oid, copies, datalen, offset, true,
return read_write_object(fd, aio_context, buf, oid, copies,
datalen, offset, true,
create, cache_flags);
}
@@ -1275,7 +1296,7 @@ static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag)
return -EIO;
}
inode = g_malloc(sizeof(s->inode));
inode = g_malloc(SD_INODE_HEADER_SIZE);
ret = find_vdi_name(s, s->name, snapid, tag, &vid, false, &local_err);
if (ret) {
@@ -1284,14 +1305,15 @@ static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag)
goto out;
}
ret = read_object(fd, (char *)inode, vid_to_vdi_oid(vid),
s->inode.nr_copies, sizeof(*inode), 0, s->cache_flags);
ret = read_object(fd, s->aio_context, (char *)inode, vid_to_vdi_oid(vid),
s->inode.nr_copies, SD_INODE_HEADER_SIZE, 0,
s->cache_flags);
if (ret < 0) {
goto out;
}
if (inode->vdi_id != s->inode.vdi_id) {
memcpy(&s->inode, inode, sizeof(s->inode));
memcpy(&s->inode, inode, SD_INODE_HEADER_SIZE);
}
out:
@@ -1315,6 +1337,7 @@ static bool check_simultaneous_create(BDRVSheepdogState *s, AIOReq *aio_req)
DPRINTF("simultaneous create to %" PRIx64 "\n", aio_req->oid);
aio_req->flags = 0;
aio_req->base_oid = 0;
aio_req->create = false;
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req, aio_siblings);
return true;
@@ -1327,7 +1350,8 @@ static bool check_simultaneous_create(BDRVSheepdogState *s, AIOReq *aio_req)
static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
{
SheepdogAIOCB *acb = aio_req->aiocb;
bool create = false;
aio_req->create = false;
/* check whether this request becomes a CoW one */
if (acb->aiocb_type == AIOCB_WRITE_UDATA && is_data_obj(aio_req->oid)) {
@@ -1345,20 +1369,36 @@ static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
aio_req->base_oid = vid_to_data_oid(s->inode.data_vdi_id[idx], idx);
aio_req->flags |= SD_FLAG_CMD_COW;
}
create = true;
aio_req->create = true;
}
out:
if (is_data_obj(aio_req->oid)) {
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
acb->aiocb_type);
} else {
struct iovec iov;
iov.iov_base = &s->inode;
iov.iov_len = sizeof(s->inode);
add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
add_aio_request(s, aio_req, &iov, 1, AIOCB_WRITE_UDATA);
}
}
static void sd_detach_aio_context(BlockDriverState *bs)
{
BDRVSheepdogState *s = bs->opaque;
aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
}
static void sd_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVSheepdogState *s = bs->opaque;
s->aio_context = new_context;
aio_set_fd_handler(new_context, s->fd, co_read_response, NULL, s);
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "sheepdog",
@@ -1387,6 +1427,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
const char *filename;
s->bs = bs;
s->aio_context = bdrv_get_aio_context(bs);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -1448,8 +1489,8 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
}
buf = g_malloc(SD_INODE_SIZE);
ret = read_object(fd, buf, vid_to_vdi_oid(vid), 0, SD_INODE_SIZE, 0,
s->cache_flags);
ret = read_object(fd, s->aio_context, buf, vid_to_vdi_oid(vid),
0, SD_INODE_SIZE, 0, s->cache_flags);
closesocket(fd);
@@ -1469,7 +1510,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
g_free(buf);
return 0;
out:
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL);
aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
if (s->fd >= 0) {
closesocket(s->fd);
}
@@ -1512,7 +1553,7 @@ static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
hdr.copy_policy = s->inode.copy_policy;
hdr.copies = s->inode.nr_copies;
ret = do_req(fd, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
closesocket(fd);
@@ -1636,18 +1677,19 @@ static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
return 0;
}
static int sd_create(const char *filename, QEMUOptionParameter *options,
static int sd_create(const char *filename, QemuOpts *opts,
Error **errp)
{
int ret = 0;
uint32_t vid = 0;
char *backing_file = NULL;
char *buf = NULL;
BDRVSheepdogState *s;
char tag[SD_MAX_VDI_TAG_LEN];
uint32_t snapid;
bool prealloc = false;
s = g_malloc0(sizeof(BDRVSheepdogState));
s = g_new0(BDRVSheepdogState, 1);
memset(tag, 0, sizeof(tag));
if (strstr(filename, "://")) {
@@ -1660,33 +1702,28 @@ static int sd_create(const char *filename, QEMUOptionParameter *options,
goto out;
}
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
s->inode.vdi_size = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
if (!options->value.s || !strcmp(options->value.s, "off")) {
prealloc = false;
} else if (!strcmp(options->value.s, "full")) {
prealloc = true;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'",
options->value.s);
ret = -EINVAL;
goto out;
}
} else if (!strcmp(options->name, BLOCK_OPT_REDUNDANCY)) {
if (options->value.s) {
ret = parse_redundancy(s, options->value.s);
if (ret < 0) {
error_setg(errp, "Invalid redundancy mode: '%s'",
options->value.s);
goto out;
}
}
s->inode.vdi_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!buf || !strcmp(buf, "off")) {
prealloc = false;
} else if (!strcmp(buf, "full")) {
prealloc = true;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'", buf);
ret = -EINVAL;
goto out;
}
g_free(buf);
buf = qemu_opt_get_del(opts, BLOCK_OPT_REDUNDANCY);
if (buf) {
ret = parse_redundancy(s, buf);
if (ret < 0) {
error_setg(errp, "Invalid redundancy mode: '%s'", buf);
goto out;
}
options++;
}
if (s->inode.vdi_size > SD_MAX_VDI_SIZE) {
@@ -1727,6 +1764,7 @@ static int sd_create(const char *filename, QEMUOptionParameter *options,
bdrv_unref(bs);
}
s->aio_context = qemu_get_aio_context();
ret = do_sd_create(s, &vid, 0, errp);
if (ret) {
goto out;
@@ -1736,6 +1774,8 @@ static int sd_create(const char *filename, QEMUOptionParameter *options,
ret = sd_prealloc(filename, errp);
}
out:
g_free(backing_file);
g_free(buf);
g_free(s);
return ret;
}
@@ -1761,12 +1801,14 @@ static void sd_close(BlockDriverState *bs)
memset(&hdr, 0, sizeof(hdr));
hdr.opcode = SD_OP_RELEASE_VDI;
hdr.type = LOCK_TYPE_NORMAL;
hdr.base_vdi_id = s->inode.vdi_id;
wlen = strlen(s->name) + 1;
hdr.data_length = wlen;
hdr.flags = SD_FLAG_CMD_WRITE;
ret = do_req(fd, (SheepdogReq *)&hdr, s->name, &wlen, &rlen);
ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr,
s->name, &wlen, &rlen);
closesocket(fd);
@@ -1775,7 +1817,7 @@ static void sd_close(BlockDriverState *bs)
error_report("%s, %s", sd_strerror(rsp->result), s->name);
}
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL);
aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
closesocket(s->fd);
g_free(s->host_spec);
}
@@ -1812,8 +1854,9 @@ static int sd_truncate(BlockDriverState *bs, int64_t offset)
/* we don't need to update entire object */
datalen = SD_INODE_SIZE - sizeof(s->inode.data_vdi_id);
s->inode.vdi_size = offset;
ret = write_object(fd, (char *)&s->inode, vid_to_vdi_oid(s->inode.vdi_id),
s->inode.nr_copies, datalen, 0, false, s->cache_flags);
ret = write_object(fd, s->aio_context, (char *)&s->inode,
vid_to_vdi_oid(s->inode.vdi_id), s->inode.nr_copies,
datalen, 0, false, s->cache_flags);
close(fd);
if (ret < 0) {
@@ -1849,9 +1892,9 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
iov.iov_base = &s->inode;
iov.iov_len = sizeof(s->inode);
aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
data_len, offset, 0, 0, offset);
data_len, offset, 0, false, 0, offset);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
add_aio_request(s, aio_req, &iov, 1, AIOCB_WRITE_UDATA);
acb->aio_done_func = sd_finish_aiocb;
acb->aiocb_type = AIOCB_WRITE_UDATA;
@@ -1882,7 +1925,8 @@ static bool sd_delete(BDRVSheepdogState *s)
return false;
}
ret = do_req(fd, (SheepdogReq *)&hdr, s->name, &wlen, &rlen);
ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr,
s->name, &wlen, &rlen);
closesocket(fd);
if (ret) {
return false;
@@ -1939,8 +1983,8 @@ static int sd_create_branch(BDRVSheepdogState *s)
goto out;
}
ret = read_object(fd, buf, vid_to_vdi_oid(vid), s->inode.nr_copies,
SD_INODE_SIZE, 0, s->cache_flags);
ret = read_object(fd, s->aio_context, buf, vid_to_vdi_oid(vid),
s->inode.nr_copies, SD_INODE_SIZE, 0, s->cache_flags);
closesocket(fd);
@@ -2049,7 +2093,8 @@ static int coroutine_fn sd_co_rw_vector(void *p)
DPRINTF("new oid %" PRIx64 "\n", oid);
}
aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, old_oid, done);
aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, create,
old_oid, done);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
if (create) {
@@ -2058,7 +2103,7 @@ static int coroutine_fn sd_co_rw_vector(void *p)
}
}
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
acb->aiocb_type);
done:
offset = 0;
@@ -2138,9 +2183,9 @@ static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
acb->aio_done_func = sd_finish_aiocb;
aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
0, 0, 0, 0, 0);
0, 0, 0, false, 0, 0);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
add_aio_request(s, aio_req, NULL, 0, false, acb->aiocb_type);
add_aio_request(s, aio_req, NULL, 0, acb->aiocb_type);
qemu_coroutine_yield();
return acb->ret;
@@ -2187,8 +2232,9 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
goto cleanup;
}
ret = write_object(fd, (char *)&s->inode, vid_to_vdi_oid(s->inode.vdi_id),
s->inode.nr_copies, datalen, 0, false, s->cache_flags);
ret = write_object(fd, s->aio_context, (char *)&s->inode,
vid_to_vdi_oid(s->inode.vdi_id), s->inode.nr_copies,
datalen, 0, false, s->cache_flags);
if (ret < 0) {
error_report("failed to write snapshot's inode.");
goto cleanup;
@@ -2203,8 +2249,9 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
goto cleanup;
}
ret = read_object(fd, (char *)inode, vid_to_vdi_oid(new_vid),
s->inode.nr_copies, datalen, 0, s->cache_flags);
ret = read_object(fd, s->aio_context, (char *)inode,
vid_to_vdi_oid(new_vid), s->inode.nr_copies, datalen, 0,
s->cache_flags);
if (ret < 0) {
error_report("failed to read new inode info. %s", strerror(errno));
@@ -2235,7 +2282,7 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
uint32_t snapid = 0;
int ret = 0;
old_s = g_malloc(sizeof(BDRVSheepdogState));
old_s = g_new(BDRVSheepdogState, 1);
memcpy(old_s, s, sizeof(BDRVSheepdogState));
@@ -2311,14 +2358,15 @@ static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
req.opcode = SD_OP_READ_VDIS;
req.data_length = max;
ret = do_req(fd, (SheepdogReq *)&req, vdi_inuse, &wlen, &rlen);
ret = do_req(fd, s->aio_context, (SheepdogReq *)&req,
vdi_inuse, &wlen, &rlen);
closesocket(fd);
if (ret) {
goto out;
}
sn_tab = g_malloc0(nr * sizeof(*sn_tab));
sn_tab = g_new0(QEMUSnapshotInfo, nr);
/* calculate a vdi id with hash function */
hval = fnv_64a_buf(s->name, strlen(s->name), FNV1A_64_INIT);
@@ -2338,7 +2386,8 @@ static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
}
/* we don't need to read entire object */
ret = read_object(fd, (char *)&inode, vid_to_vdi_oid(vid),
ret = read_object(fd, s->aio_context, (char *)&inode,
vid_to_vdi_oid(vid),
0, SD_INODE_SIZE - sizeof(inode.data_vdi_id), 0,
s->cache_flags);
@@ -2403,11 +2452,11 @@ static int do_load_save_vmstate(BDRVSheepdogState *s, uint8_t *data,
create = (offset == 0);
if (load) {
ret = read_object(fd, (char *)data, vmstate_oid,
ret = read_object(fd, s->aio_context, (char *)data, vmstate_oid,
s->inode.nr_copies, data_len, offset,
s->cache_flags);
} else {
ret = write_object(fd, (char *)data, vmstate_oid,
ret = write_object(fd, s->aio_context, (char *)data, vmstate_oid,
s->inode.nr_copies, data_len, offset, create,
s->cache_flags);
}
@@ -2529,28 +2578,32 @@ static int64_t sd_get_allocated_file_size(BlockDriverState *bs)
return size;
}
static QEMUOptionParameter sd_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{
.name = BLOCK_OPT_REDUNDANCY,
.type = OPT_STRING,
.help = "Redundancy of the image"
},
{ NULL }
static QemuOptsList sd_create_opts = {
.name = "sheepdog-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(sd_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{
.name = BLOCK_OPT_REDUNDANCY,
.type = QEMU_OPT_STRING,
.help = "Redundancy of the image"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_sheepdog = {
@@ -2580,7 +2633,10 @@ static BlockDriver bdrv_sheepdog = {
.bdrv_save_vmstate = sd_save_vmstate,
.bdrv_load_vmstate = sd_load_vmstate,
.create_options = sd_create_options,
.bdrv_detach_aio_context = sd_detach_aio_context,
.bdrv_attach_aio_context = sd_attach_aio_context,
.create_opts = &sd_create_opts,
};
static BlockDriver bdrv_sheepdog_tcp = {
@@ -2610,7 +2666,10 @@ static BlockDriver bdrv_sheepdog_tcp = {
.bdrv_save_vmstate = sd_save_vmstate,
.bdrv_load_vmstate = sd_load_vmstate,
.create_options = sd_create_options,
.bdrv_detach_aio_context = sd_detach_aio_context,
.bdrv_attach_aio_context = sd_attach_aio_context,
.create_opts = &sd_create_opts,
};
static BlockDriver bdrv_sheepdog_unix = {
@@ -2640,7 +2699,10 @@ static BlockDriver bdrv_sheepdog_unix = {
.bdrv_save_vmstate = sd_save_vmstate,
.bdrv_load_vmstate = sd_load_vmstate,
.create_options = sd_create_options,
.bdrv_detach_aio_context = sd_detach_aio_context,
.bdrv_attach_aio_context = sd_attach_aio_context,
.create_opts = &sd_create_opts,
};
static void bdrv_sheepdog_init(void)

View File

@@ -675,17 +675,20 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
return ret;
}
static QEMUOptionParameter ssh_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
static QemuOptsList ssh_create_opts = {
.name = "ssh-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(ssh_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
};
static int ssh_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
{
int r, ret;
int64_t total_size = 0;
@@ -697,12 +700,8 @@ static int ssh_create(const char *filename, QEMUOptionParameter *options,
ssh_state_init(&s);
/* Get desired file size. */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n;
}
options++;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
DPRINTF("total_size=%" PRIi64, total_size);
uri_options = qdict_new();
@@ -773,7 +772,7 @@ static void restart_coroutine(void *opaque)
qemu_coroutine_enter(co, NULL);
}
static coroutine_fn void set_fd_handler(BDRVSSHState *s)
static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
{
int r;
IOHandler *rd_handler = NULL, *wr_handler = NULL;
@@ -791,24 +790,26 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s)
DPRINTF("s->sock=%d rd_handler=%p wr_handler=%p", s->sock,
rd_handler, wr_handler);
qemu_aio_set_fd_handler(s->sock, rd_handler, wr_handler, co);
aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
rd_handler, wr_handler, co);
}
static coroutine_fn void clear_fd_handler(BDRVSSHState *s)
static coroutine_fn void clear_fd_handler(BDRVSSHState *s,
BlockDriverState *bs)
{
DPRINTF("s->sock=%d", s->sock);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL);
aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL);
}
/* A non-blocking call returned EAGAIN, so yield, ensuring the
* handlers are set up so that we'll be rescheduled when there is an
* interesting event on the socket.
*/
static coroutine_fn void co_yield(BDRVSSHState *s)
static coroutine_fn void co_yield(BDRVSSHState *s, BlockDriverState *bs)
{
set_fd_handler(s);
set_fd_handler(s, bs);
qemu_coroutine_yield();
clear_fd_handler(s);
clear_fd_handler(s, bs);
}
/* SFTP has a function `libssh2_sftp_seek64' which seeks to a position
@@ -838,7 +839,7 @@ static void ssh_seek(BDRVSSHState *s, int64_t offset, int flags)
}
}
static coroutine_fn int ssh_read(BDRVSSHState *s,
static coroutine_fn int ssh_read(BDRVSSHState *s, BlockDriverState *bs,
int64_t offset, size_t size,
QEMUIOVector *qiov)
{
@@ -871,7 +872,7 @@ static coroutine_fn int ssh_read(BDRVSSHState *s,
DPRINTF("sftp_read returned %zd", r);
if (r == LIBSSH2_ERROR_EAGAIN || r == LIBSSH2_ERROR_TIMEOUT) {
co_yield(s);
co_yield(s, bs);
goto again;
}
if (r < 0) {
@@ -906,14 +907,14 @@ static coroutine_fn int ssh_co_readv(BlockDriverState *bs,
int ret;
qemu_co_mutex_lock(&s->lock);
ret = ssh_read(s, sector_num * BDRV_SECTOR_SIZE,
ret = ssh_read(s, bs, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE, qiov);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int ssh_write(BDRVSSHState *s,
static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
int64_t offset, size_t size,
QEMUIOVector *qiov)
{
@@ -941,7 +942,7 @@ static int ssh_write(BDRVSSHState *s,
DPRINTF("sftp_write returned %zd", r);
if (r == LIBSSH2_ERROR_EAGAIN || r == LIBSSH2_ERROR_TIMEOUT) {
co_yield(s);
co_yield(s, bs);
goto again;
}
if (r < 0) {
@@ -960,7 +961,7 @@ static int ssh_write(BDRVSSHState *s,
*/
if (r == 0) {
ssh_seek(s, offset + written, SSH_SEEK_WRITE|SSH_SEEK_FORCE);
co_yield(s);
co_yield(s, bs);
goto again;
}
@@ -988,7 +989,7 @@ static coroutine_fn int ssh_co_writev(BlockDriverState *bs,
int ret;
qemu_co_mutex_lock(&s->lock);
ret = ssh_write(s, sector_num * BDRV_SECTOR_SIZE,
ret = ssh_write(s, bs, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE, qiov);
qemu_co_mutex_unlock(&s->lock);
@@ -1009,7 +1010,7 @@ static void unsafe_flush_warning(BDRVSSHState *s, const char *what)
#ifdef HAS_LIBSSH2_SFTP_FSYNC
static coroutine_fn int ssh_flush(BDRVSSHState *s)
static coroutine_fn int ssh_flush(BDRVSSHState *s, BlockDriverState *bs)
{
int r;
@@ -1017,7 +1018,7 @@ static coroutine_fn int ssh_flush(BDRVSSHState *s)
again:
r = libssh2_sftp_fsync(s->sftp_handle);
if (r == LIBSSH2_ERROR_EAGAIN || r == LIBSSH2_ERROR_TIMEOUT) {
co_yield(s);
co_yield(s, bs);
goto again;
}
if (r == LIBSSH2_ERROR_SFTP_PROTOCOL &&
@@ -1039,7 +1040,7 @@ static coroutine_fn int ssh_co_flush(BlockDriverState *bs)
int ret;
qemu_co_mutex_lock(&s->lock);
ret = ssh_flush(s);
ret = ssh_flush(s, bs);
qemu_co_mutex_unlock(&s->lock);
return ret;
@@ -1082,7 +1083,7 @@ static BlockDriver bdrv_ssh = {
.bdrv_co_writev = ssh_co_writev,
.bdrv_getlength = ssh_getlength,
.bdrv_co_flush_to_disk = ssh_co_flush,
.create_options = ssh_create_options,
.create_opts = &ssh_create_opts,
};
static void bdrv_ssh_init(void)

View File

@@ -32,7 +32,7 @@ typedef struct StreamBlockJob {
RateLimit limit;
BlockDriverState *base;
BlockdevOnError on_error;
char backing_file_id[1024];
char *backing_file_str;
} StreamBlockJob;
static int coroutine_fn stream_populate(BlockDriverState *bs,
@@ -76,7 +76,7 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
bdrv_unref(unused);
}
bdrv_refresh_limits(top);
bdrv_refresh_limits(top, NULL);
}
static void coroutine_fn stream_run(void *opaque)
@@ -159,14 +159,14 @@ wait:
BlockErrorAction action =
block_job_error_action(&s->common, s->common.bs, s->on_error,
true, -ret);
if (action == BDRV_ACTION_STOP) {
if (action == BLOCK_ERROR_ACTION_STOP) {
n = 0;
continue;
}
if (error == 0) {
error = ret;
}
if (action == BDRV_ACTION_REPORT) {
if (action == BLOCK_ERROR_ACTION_REPORT) {
break;
}
}
@@ -186,7 +186,7 @@ wait:
if (!block_job_is_cancelled(&s->common) && sector_num == end && ret == 0) {
const char *base_id = NULL, *base_fmt = NULL;
if (base) {
base_id = s->backing_file_id;
base_id = s->backing_file_str;
if (base->drv) {
base_fmt = base->drv->format_name;
}
@@ -196,6 +196,7 @@ wait:
}
qemu_vfree(buf);
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
}
@@ -217,7 +218,7 @@ static const BlockJobDriver stream_job_driver = {
};
void stream_start(BlockDriverState *bs, BlockDriverState *base,
const char *base_id, int64_t speed,
const char *backing_file_str, int64_t speed,
BlockdevOnError on_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
@@ -237,9 +238,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
}
s->base = base;
if (base_id) {
pstrcpy(s->backing_file_id, sizeof(s->backing_file_id), base_id);
}
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(stream_run);

View File

@@ -239,7 +239,6 @@ static void vdi_header_to_le(VdiHeader *header)
cpu_to_le32s(&header->block_extra);
cpu_to_le32s(&header->blocks_in_image);
cpu_to_le32s(&header->blocks_allocated);
cpu_to_le32s(&header->blocks_allocated);
uuid_convert(header->uuid_image);
uuid_convert(header->uuid_last_snap);
uuid_convert(header->uuid_link);
@@ -293,7 +292,12 @@ static int vdi_check(BlockDriverState *bs, BdrvCheckResult *res,
return -ENOTSUP;
}
bmap = g_malloc(s->header.blocks_in_image * sizeof(uint32_t));
bmap = g_try_new(uint32_t, s->header.blocks_in_image);
if (s->header.blocks_in_image && bmap == NULL) {
res->check_errors++;
return -ENOMEM;
}
memset(bmap, 0xff, s->header.blocks_in_image * sizeof(uint32_t));
/* Check block map and value of blocks_allocated. */
@@ -351,23 +355,23 @@ static int vdi_make_empty(BlockDriverState *bs)
static int vdi_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const VdiHeader *header = (const VdiHeader *)buf;
int result = 0;
int ret = 0;
logout("\n");
if (buf_size < sizeof(*header)) {
/* Header too small, no VDI. */
} else if (le32_to_cpu(header->signature) == VDI_SIGNATURE) {
result = 100;
ret = 100;
}
if (result == 0) {
if (ret == 0) {
logout("no vdi image\n");
} else {
logout("%s", header->text);
}
return result;
return ret;
}
static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
@@ -472,7 +476,12 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
bmap_size = header.blocks_in_image * sizeof(uint32_t);
bmap_size = (bmap_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
s->bmap = g_malloc(bmap_size * SECTOR_SIZE);
s->bmap = qemu_try_blockalign(bs->file, bmap_size * SECTOR_SIZE);
if (s->bmap == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_read(bs->file, s->bmap_sector, (uint8_t *)s->bmap, bmap_size);
if (ret < 0) {
goto fail_free_bmap;
@@ -487,7 +496,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
return 0;
fail_free_bmap:
g_free(s->bmap);
qemu_vfree(s->bmap);
fail:
return ret;
@@ -673,11 +682,9 @@ static int vdi_co_write(BlockDriverState *bs,
return ret;
}
static int vdi_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
{
int fd;
int result = 0;
int ret = 0;
uint64_t bytes = 0;
uint32_t blocks;
size_t block_size = DEFAULT_CLUSTER_SIZE;
@@ -685,43 +692,45 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options,
VdiHeader header;
size_t i;
size_t bmap_size;
int64_t offset = 0;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
uint32_t *bmap = NULL;
logout("\n");
/* Read out options. */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
bytes = options->value.n;
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
#if defined(CONFIG_VDI_BLOCK_SIZE)
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = options->value.n;
}
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
#endif
#if defined(CONFIG_VDI_STATIC_IMAGE)
} else if (!strcmp(options->name, BLOCK_OPT_STATIC)) {
if (options->value.n) {
image_type = VDI_TYPE_STATIC;
}
#endif
}
options++;
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_STATIC, false)) {
image_type = VDI_TYPE_STATIC;
}
#endif
if (bytes > VDI_DISK_SIZE_MAX) {
result = -ENOTSUP;
ret = -ENOTSUP;
error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")",
bytes, VDI_DISK_SIZE_MAX);
goto exit;
}
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
0644);
if (fd < 0) {
result = -errno;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
@@ -754,13 +763,20 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options,
vdi_header_print(&header);
#endif
vdi_header_to_le(&header);
if (write(fd, &header, sizeof(header)) < 0) {
result = -errno;
goto close_and_exit;
ret = bdrv_pwrite_sync(bs, offset, &header, sizeof(header));
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
goto exit;
}
offset += sizeof(header);
if (bmap_size > 0) {
uint32_t *bmap = g_malloc0(bmap_size);
bmap = g_try_malloc0(bmap_size);
if (bmap == NULL) {
ret = -ENOMEM;
error_setg(errp, "Could not allocate bmap");
goto exit;
}
for (i = 0; i < blocks; i++) {
if (image_type == VDI_TYPE_STATIC) {
bmap[i] = i;
@@ -768,63 +784,71 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options,
bmap[i] = VDI_UNALLOCATED;
}
}
if (write(fd, bmap, bmap_size) < 0) {
result = -errno;
g_free(bmap);
goto close_and_exit;
ret = bdrv_pwrite_sync(bs, offset, bmap, bmap_size);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
goto exit;
}
g_free(bmap);
offset += bmap_size;
}
if (image_type == VDI_TYPE_STATIC) {
if (ftruncate(fd, sizeof(header) + bmap_size + blocks * block_size)) {
result = -errno;
goto close_and_exit;
ret = bdrv_truncate(bs, offset + blocks * block_size);
if (ret < 0) {
error_setg(errp, "Failed to statically allocate %s", filename);
goto exit;
}
}
close_and_exit:
if ((close(fd) < 0) && !result) {
result = -errno;
}
exit:
return result;
bdrv_unref(bs);
g_free(bmap);
return ret;
}
static void vdi_close(BlockDriverState *bs)
{
BDRVVdiState *s = bs->opaque;
g_free(s->bmap);
qemu_vfree(s->bmap);
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);
}
static QEMUOptionParameter vdi_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
static QemuOptsList vdi_create_opts = {
.name = "vdi-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vdi_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
#if defined(CONFIG_VDI_BLOCK_SIZE)
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "VDI cluster (block) size",
.value = { .n = DEFAULT_CLUSTER_SIZE },
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "VDI cluster (block) size",
.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)
},
#endif
#if defined(CONFIG_VDI_STATIC_IMAGE)
{
.name = BLOCK_OPT_STATIC,
.type = OPT_FLAG,
.help = "VDI static (pre-allocated) image"
},
{
.name = BLOCK_OPT_STATIC,
.type = QEMU_OPT_BOOL,
.help = "VDI static (pre-allocated) image",
.def_value_str = "off"
},
#endif
/* TODO: An additional option to set UUID values might be useful. */
{ NULL }
{
.name = BLOCK_OPT_NOCOW,
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
/* TODO: An additional option to set UUID values might be useful. */
{ /* end of list */ }
}
};
static BlockDriver bdrv_vdi = {
@@ -846,7 +870,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_get_info = vdi_get_info,
.create_options = vdi_create_options,
.create_opts = &vdi_create_opts,
.bdrv_check = vdi_check,
};

View File

@@ -82,8 +82,6 @@ void vhdx_log_desc_le_import(VHDXLogDescriptor *d)
assert(d != NULL);
le32_to_cpus(&d->signature);
le32_to_cpus(&d->trailing_bytes);
le64_to_cpus(&d->leading_bytes);
le64_to_cpus(&d->file_offset);
le64_to_cpus(&d->sequence_number);
}
@@ -99,6 +97,15 @@ void vhdx_log_desc_le_export(VHDXLogDescriptor *d)
cpu_to_le64s(&d->sequence_number);
}
void vhdx_log_data_le_import(VHDXLogDataSector *d)
{
assert(d != NULL);
le32_to_cpus(&d->data_signature);
le32_to_cpus(&d->sequence_high);
le32_to_cpus(&d->sequence_low);
}
void vhdx_log_data_le_export(VHDXLogDataSector *d)
{
assert(d != NULL);

View File

@@ -84,6 +84,7 @@ static int vhdx_log_peek_hdr(BlockDriverState *bs, VHDXLogEntries *log,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(hdr);
exit:
return ret;
@@ -211,7 +212,7 @@ static bool vhdx_log_hdr_is_valid(VHDXLogEntries *log, VHDXLogEntryHeader *hdr,
{
int valid = false;
if (memcmp(&hdr->signature, "loge", 4)) {
if (hdr->signature != VHDX_LOG_SIGNATURE) {
goto exit;
}
@@ -275,12 +276,12 @@ static bool vhdx_log_desc_is_valid(VHDXLogDescriptor *desc,
goto exit;
}
if (!memcmp(&desc->signature, "zero", 4)) {
if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
if (desc->zero_length % VHDX_LOG_SECTOR_SIZE == 0) {
/* valid */
ret = true;
}
} else if (!memcmp(&desc->signature, "desc", 4)) {
} else if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
/* valid */
ret = true;
}
@@ -327,13 +328,15 @@ static int vhdx_compute_desc_sectors(uint32_t desc_cnt)
* passed into this function. Each descriptor will also be validated,
* and error returned if any are invalid. */
static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
VHDXLogEntries *log, VHDXLogDescEntries **buffer)
VHDXLogEntries *log, VHDXLogDescEntries **buffer,
bool convert_endian)
{
int ret = 0;
uint32_t desc_sectors;
uint32_t sectors_read;
VHDXLogEntryHeader hdr;
VHDXLogDescEntries *desc_entries = NULL;
VHDXLogDescriptor desc;
int i;
assert(*buffer == NULL);
@@ -342,14 +345,19 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
ret = -EINVAL;
goto exit;
}
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
desc_entries = qemu_blockalign(bs, desc_sectors * VHDX_LOG_SECTOR_SIZE);
desc_entries = qemu_try_blockalign(bs->file,
desc_sectors * VHDX_LOG_SECTOR_SIZE);
if (desc_entries == NULL) {
ret = -ENOMEM;
goto exit;
}
ret = vhdx_log_read_sectors(bs, log, &sectors_read, desc_entries,
desc_sectors, false);
@@ -363,12 +371,19 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
/* put in proper endianness, and validate each desc */
for (i = 0; i < hdr.descriptor_count; i++) {
vhdx_log_desc_le_import(&desc_entries->desc[i]);
if (vhdx_log_desc_is_valid(&desc_entries->desc[i], &hdr) == false) {
desc = desc_entries->desc[i];
vhdx_log_desc_le_import(&desc);
if (convert_endian) {
desc_entries->desc[i] = desc;
}
if (vhdx_log_desc_is_valid(&desc, &hdr) == false) {
ret = -EINVAL;
goto free_and_exit;
}
}
if (convert_endian) {
desc_entries->hdr = hdr;
}
*buffer = desc_entries;
goto exit;
@@ -403,7 +418,7 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
buffer = qemu_blockalign(bs, VHDX_LOG_SECTOR_SIZE);
if (!memcmp(&desc->signature, "desc", 4)) {
if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
/* data sector */
if (data == NULL) {
ret = -EFAULT;
@@ -431,10 +446,15 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
memcpy(buffer+offset, &desc->trailing_bytes, 4);
} else if (!memcmp(&desc->signature, "zero", 4)) {
} else if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
/* write 'count' sectors of sector */
memset(buffer, 0, VHDX_LOG_SECTOR_SIZE);
count = desc->zero_length / VHDX_LOG_SECTOR_SIZE;
} else {
error_report("Invalid VHDX log descriptor entry signature 0x%" PRIx32,
desc->signature);
ret = -EINVAL;
goto exit;
}
file_offset = desc->file_offset;
@@ -493,13 +513,13 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
goto exit;
}
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries);
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries, true);
if (ret < 0) {
goto exit;
}
for (i = 0; i < desc_entries->hdr.descriptor_count; i++) {
if (!memcmp(&desc_entries->desc[i].signature, "desc", 4)) {
if (desc_entries->desc[i].signature == VHDX_LOG_DESC_SIGNATURE) {
/* data sector, so read a sector to flush */
ret = vhdx_log_read_sectors(bs, &logs->log, &sectors_read,
data, 1, false);
@@ -510,6 +530,7 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
ret = -EINVAL;
goto exit;
}
vhdx_log_data_le_import(data);
}
ret = vhdx_log_flush_desc(bs, &desc_entries->desc[i], data);
@@ -558,9 +579,6 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
goto inc_and_exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
goto inc_and_exit;
}
@@ -573,13 +591,13 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
/* Read desc sectors, and calculate log checksum */
/* Read all log sectors, and calculate log checksum */
total_sectors = hdr.entry_length / VHDX_LOG_SECTOR_SIZE;
/* read_desc() will increment the read idx */
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer);
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer, false);
if (ret < 0) {
goto free_and_exit;
}
@@ -602,7 +620,7 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
}
}
crc ^= 0xffffffff;
if (crc != desc_buffer->hdr.checksum) {
if (crc != hdr.checksum) {
goto free_and_exit;
}
@@ -905,7 +923,7 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
buffer = qemu_blockalign(bs, total_length);
memcpy(buffer, &new_hdr, sizeof(new_hdr));
new_desc = (VHDXLogDescriptor *) (buffer + sizeof(new_hdr));
new_desc = buffer + sizeof(new_hdr);
data_sector = buffer + (desc_sectors * VHDX_LOG_SECTOR_SIZE);
data_tmp = data;
@@ -962,7 +980,6 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
* last data sector */
vhdx_update_checksum(buffer, total_length,
offsetof(VHDXLogEntryHeader, checksum));
cpu_to_le32s((uint32_t *)(buffer + 4));
/* now write to the log */
ret = vhdx_log_write_sectors(bs, &s->log, &sectors_written, buffer,

View File

@@ -135,10 +135,8 @@ typedef struct VHDXSectorInfo {
* buf: buffer pointer
* size: size of buffer (must be > crc_offset+4)
*
* Note: The resulting checksum is in the CPU endianness, not necessarily
* in the file format endianness (LE). Any header export to disk should
* make sure that vhdx_header_le_export() is used to convert to the
* correct endianness
* Note: The buffer should have all multi-byte data in little-endian format,
* and the resulting checksum is in little endian format.
*/
uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
{
@@ -149,6 +147,7 @@ uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
memset(buf + crc_offset, 0, sizeof(crc));
crc = crc32c(0xffffffff, buf, size);
cpu_to_le32s(&crc);
memcpy(buf + crc_offset, &crc, sizeof(crc));
return crc;
@@ -300,7 +299,7 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
{
uint8_t *buffer = NULL;
int ret;
VHDXHeader header_le;
VHDXHeader *header_le;
assert(bs_file != NULL);
assert(hdr != NULL);
@@ -321,11 +320,12 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
}
/* overwrite the actual VHDXHeader portion */
memcpy(buffer, hdr, sizeof(VHDXHeader));
hdr->checksum = vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
vhdx_header_le_export(hdr, &header_le);
ret = bdrv_pwrite_sync(bs_file, offset, &header_le, sizeof(VHDXHeader));
header_le = (VHDXHeader *)buffer;
memcpy(header_le, hdr, sizeof(VHDXHeader));
vhdx_header_le_export(hdr, header_le);
vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
ret = bdrv_pwrite_sync(bs_file, offset, header_le, sizeof(VHDXHeader));
exit:
qemu_vfree(buffer);
@@ -432,13 +432,14 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header1, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header1);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header1->signature, "head", 4) &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header1);
if (header1->signature == VHDX_HEADER_SIGNATURE &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
}
}
ret = bdrv_pread(bs->file, VHDX_HEADER2_OFFSET, buffer, VHDX_HEADER_SIZE);
@@ -447,13 +448,14 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header2, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header2);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header2->signature, "head", 4) &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header2);
if (header2->signature == VHDX_HEADER_SIGNATURE &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
}
}
/* If there is only 1 valid header (or no valid headers), we
@@ -519,15 +521,21 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
goto fail;
}
memcpy(&s->rt, buffer, sizeof(s->rt));
vhdx_region_header_le_import(&s->rt);
offset += sizeof(s->rt);
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4) ||
memcmp(&s->rt.signature, "regi", 4)) {
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4)) {
ret = -EINVAL;
goto fail;
}
vhdx_region_header_le_import(&s->rt);
if (s->rt.signature != VHDX_REGION_SIGNATURE) {
ret = -EINVAL;
goto fail;
}
/* Per spec, maximum region table entry count is 2047 */
if (s->rt.entry_count > 2047) {
ret = -EINVAL;
@@ -630,7 +638,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
vhdx_metadata_header_le_import(&s->metadata_hdr);
if (memcmp(&s->metadata_hdr.signature, "metadata", 8)) {
if (s->metadata_hdr.signature != VHDX_METADATA_SIGNATURE) {
ret = -EINVAL;
goto exit;
}
@@ -950,7 +958,11 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
}
/* s->bat is freed in vhdx_close() */
s->bat = qemu_blockalign(bs, s->bat_rt.length);
s->bat = qemu_try_blockalign(bs->file, s->bat_rt.length);
if (s->bat == NULL) {
ret = -ENOMEM;
goto fail;
}
ret = bdrv_pread(bs->file, s->bat_offset, s->bat, s->bat_rt.length);
if (ret < 0) {
@@ -1369,7 +1381,7 @@ static int vhdx_create_new_headers(BlockDriverState *bs, uint64_t image_size,
int ret = 0;
VHDXHeader *hdr = NULL;
hdr = g_malloc0(sizeof(VHDXHeader));
hdr = g_new0(VHDXHeader, 1);
hdr->signature = VHDX_HEADER_SIGNATURE;
hdr->sequence_number = g_random_int();
@@ -1540,7 +1552,8 @@ exit:
*/
static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
uint64_t image_size, VHDXImageType type,
bool use_zero_blocks, VHDXRegionTableEntry *rt_bat)
bool use_zero_blocks, uint64_t file_offset,
uint32_t length)
{
int ret = 0;
uint64_t data_file_offset;
@@ -1555,7 +1568,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
/* this gives a data start after BAT/bitmap entries, and well
* past any metadata entries (with a 4 MB buffer for future
* expansion */
data_file_offset = rt_bat->file_offset + rt_bat->length + 5 * MiB;
data_file_offset = file_offset + length + 5 * MiB;
total_sectors = image_size >> s->logical_sector_size_bits;
if (type == VHDX_TYPE_DYNAMIC) {
@@ -1579,7 +1592,11 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
use_zero_blocks ||
bdrv_has_zero_init(bs) == 0) {
/* for a fixed file, the default BAT entry is not zero */
s->bat = g_malloc0(rt_bat->length);
s->bat = g_try_malloc0(length);
if (length && s->bat != NULL) {
ret = -ENOMEM;
goto exit;
}
block_state = type == VHDX_TYPE_FIXED ? PAYLOAD_BLOCK_FULLY_PRESENT :
PAYLOAD_BLOCK_NOT_PRESENT;
block_state = use_zero_blocks ? PAYLOAD_BLOCK_ZERO : block_state;
@@ -1594,7 +1611,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
cpu_to_le64s(&s->bat[sinfo.bat_idx]);
sector_num += s->sectors_per_block;
}
ret = bdrv_pwrite(bs, rt_bat->file_offset, s->bat, rt_bat->length);
ret = bdrv_pwrite(bs, file_offset, s->bat, length);
if (ret < 0) {
goto exit;
}
@@ -1626,6 +1643,8 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
int ret = 0;
uint32_t offset = 0;
void *buffer = NULL;
uint64_t bat_file_offset;
uint32_t bat_length;
BDRVVHDXState *s = NULL;
VHDXRegionTableHeader *region_table;
VHDXRegionTableEntry *rt_bat;
@@ -1635,7 +1654,7 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
/* Populate enough of the BDRVVHDXState to be able to use the
* pre-existing BAT calculation, translation, and update functions */
s = g_malloc0(sizeof(BDRVVHDXState));
s = g_new0(BDRVVHDXState, 1);
s->chunk_ratio = (VHDX_MAX_SECTORS_PER_BLOCK) *
(uint64_t) sector_size / (uint64_t) block_size;
@@ -1674,19 +1693,26 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
rt_metadata->length = 1 * MiB; /* min size, and more than enough */
*metadata_offset = rt_metadata->file_offset;
bat_file_offset = rt_bat->file_offset;
bat_length = rt_bat->length;
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
vhdx_update_checksum(buffer, VHDX_HEADER_BLOCK_SIZE,
offsetof(VHDXRegionTableHeader, checksum));
/* The region table gives us the data we need to create the BAT,
* so do that now */
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks, rt_bat);
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks,
bat_file_offset, bat_length);
if (ret < 0) {
goto exit;
}
/* Now write out the region headers to disk */
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
ret = bdrv_pwrite(bs, VHDX_REGION_TABLE_OFFSET, buffer,
VHDX_HEADER_BLOCK_SIZE);
if (ret < 0) {
@@ -1723,8 +1749,7 @@ exit:
* .---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
* 1MB
*/
static int vhdx_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
{
int ret = 0;
uint64_t image_size = (uint64_t) 2 * GiB;
@@ -1737,24 +1762,16 @@ static int vhdx_create(const char *filename, QEMUOptionParameter *options,
gunichar2 *creator = NULL;
glong creator_items;
BlockDriverState *bs;
const char *type = NULL;
char *type = NULL;
VHDXImageType image_type;
Error *local_err = NULL;
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_size = options->value.n;
} else if (!strcmp(options->name, VHDX_BLOCK_OPT_LOG_SIZE)) {
log_size = options->value.n;
} else if (!strcmp(options->name, VHDX_BLOCK_OPT_BLOCK_SIZE)) {
block_size = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_SUBFMT)) {
type = options->value.s;
} else if (!strcmp(options->name, VHDX_BLOCK_OPT_ZERO)) {
use_zero_blocks = options->value.n != 0;
}
options++;
}
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
use_zero_blocks = qemu_opt_get_bool_del(opts, VHDX_BLOCK_OPT_ZERO, false);
if (image_size > VHDX_MAX_IMAGE_SIZE) {
error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
@@ -1763,7 +1780,7 @@ static int vhdx_create(const char *filename, QEMUOptionParameter *options,
}
if (type == NULL) {
type = "dynamic";
type = g_strdup("dynamic");
}
if (!strcmp(type, "dynamic")) {
@@ -1803,7 +1820,7 @@ static int vhdx_create(const char *filename, QEMUOptionParameter *options,
block_size = block_size > VHDX_BLOCK_SIZE_MAX ? VHDX_BLOCK_SIZE_MAX :
block_size;
ret = bdrv_create_file(filename, options, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
@@ -1863,6 +1880,7 @@ static int vhdx_create(const char *filename, QEMUOptionParameter *options,
delete_and_exit:
bdrv_unref(bs);
exit:
g_free(type);
g_free(creator);
return ret;
}
@@ -1885,37 +1903,41 @@ static int vhdx_check(BlockDriverState *bs, BdrvCheckResult *result,
return 0;
}
static QEMUOptionParameter vhdx_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size; max of 64TB."
},
{
.name = VHDX_BLOCK_OPT_LOG_SIZE,
.type = OPT_SIZE,
.value.n = 1 * MiB,
.help = "Log size; min 1MB."
},
{
.name = VHDX_BLOCK_OPT_BLOCK_SIZE,
.type = OPT_SIZE,
.value.n = 0,
.help = "Block Size; min 1MB, max 256MB. " \
"0 means auto-calculate based on image size."
},
{
.name = BLOCK_OPT_SUBFMT,
.type = OPT_STRING,
.help = "VHDX format type, can be either 'dynamic' or 'fixed'. "\
"Default is 'dynamic'."
},
{
.name = VHDX_BLOCK_OPT_ZERO,
.type = OPT_FLAG,
.help = "Force use of payload blocks of type 'ZERO'. Non-standard."
},
{ NULL }
static QemuOptsList vhdx_create_opts = {
.name = "vhdx-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vhdx_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size; max of 64TB."
},
{
.name = VHDX_BLOCK_OPT_LOG_SIZE,
.type = QEMU_OPT_SIZE,
.def_value_str = stringify(DEFAULT_LOG_SIZE),
.help = "Log size; min 1MB."
},
{
.name = VHDX_BLOCK_OPT_BLOCK_SIZE,
.type = QEMU_OPT_SIZE,
.def_value_str = stringify(0),
.help = "Block Size; min 1MB, max 256MB. " \
"0 means auto-calculate based on image size."
},
{
.name = BLOCK_OPT_SUBFMT,
.type = QEMU_OPT_STRING,
.help = "VHDX format type, can be either 'dynamic' or 'fixed'. "\
"Default is 'dynamic'."
},
{
.name = VHDX_BLOCK_OPT_ZERO,
.type = QEMU_OPT_BOOL,
.help = "Force use of payload blocks of type 'ZERO'. Non-standard."
},
{ NULL }
}
};
static BlockDriver bdrv_vhdx = {
@@ -1931,7 +1953,7 @@ static BlockDriver bdrv_vhdx = {
.bdrv_get_info = vhdx_get_info,
.bdrv_check = vhdx_check,
.create_options = vhdx_create_options,
.create_opts = &vhdx_create_opts,
};
static void bdrv_vhdx_init(void)

View File

@@ -23,6 +23,7 @@
#define GiB (MiB * 1024)
#define TiB ((uint64_t) GiB * 1024)
#define DEFAULT_LOG_SIZE 1048576 /* 1MiB */
/* Structures and fields present in the VHDX file */
/* The header section has the following blocks,
@@ -434,6 +435,7 @@ void vhdx_header_le_import(VHDXHeader *h);
void vhdx_header_le_export(VHDXHeader *orig_h, VHDXHeader *new_h);
void vhdx_log_desc_le_import(VHDXLogDescriptor *d);
void vhdx_log_desc_le_export(VHDXLogDescriptor *d);
void vhdx_log_data_le_import(VHDXLogDataSector *d);
void vhdx_log_data_le_export(VHDXLogDataSector *d);
void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr);
void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr);

View File

@@ -106,6 +106,7 @@ typedef struct VmdkExtent {
uint32_t l2_cache_counts[L2_CACHE_SIZE];
int64_t cluster_sectors;
int64_t next_cluster_sector;
char *type;
} VmdkExtent;
@@ -124,7 +125,6 @@ typedef struct BDRVVmdkState {
} BDRVVmdkState;
typedef struct VmdkMetaData {
uint32_t offset;
unsigned int l1_index;
unsigned int l2_index;
unsigned int l2_offset;
@@ -233,7 +233,7 @@ static void vmdk_free_last_extent(BlockDriverState *bs)
return;
}
s->num_extents--;
s->extents = g_realloc(s->extents, s->num_extents * sizeof(VmdkExtent));
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents);
}
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
@@ -397,6 +397,7 @@ static int vmdk_add_extent(BlockDriverState *bs,
{
VmdkExtent *extent;
BDRVVmdkState *s = bs->opaque;
int64_t nb_sectors;
if (cluster_sectors > 0x200000) {
/* 0x200000 * 512Bytes = 1GB for one cluster is unrealistic */
@@ -412,8 +413,12 @@ static int vmdk_add_extent(BlockDriverState *bs,
return -EFBIG;
}
s->extents = g_realloc(s->extents,
(s->num_extents + 1) * sizeof(VmdkExtent));
nb_sectors = bdrv_nb_sectors(file);
if (nb_sectors < 0) {
return nb_sectors;
}
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents + 1);
extent = &s->extents[s->num_extents];
s->num_extents++;
@@ -427,6 +432,7 @@ static int vmdk_add_extent(BlockDriverState *bs,
extent->l1_entry_sectors = l2_size * cluster_sectors;
extent->l2_size = l2_size;
extent->cluster_sectors = flat ? sectors : cluster_sectors;
extent->next_cluster_sector = ROUND_UP(nb_sectors, cluster_sectors);
if (s->num_extents > 1) {
extent->end_sector = (*(extent - 1)).end_sector + extent->sectors;
@@ -448,7 +454,11 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
/* read the L1 table */
l1_size = extent->l1_size * sizeof(uint32_t);
extent->l1_table = g_malloc(l1_size);
extent->l1_table = g_try_malloc(l1_size);
if (l1_size && extent->l1_table == NULL) {
return -ENOMEM;
}
ret = bdrv_pread(extent->file,
extent->l1_table_offset,
extent->l1_table,
@@ -464,7 +474,11 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
}
if (extent->l1_backup_table_offset) {
extent->l1_backup_table = g_malloc(l1_size);
extent->l1_backup_table = g_try_malloc(l1_size);
if (l1_size && extent->l1_backup_table == NULL) {
ret = -ENOMEM;
goto fail_l1;
}
ret = bdrv_pread(extent->file,
extent->l1_backup_table_offset,
extent->l1_backup_table,
@@ -481,7 +495,7 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
}
extent->l2_cache =
g_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
g_new(uint32_t, extent->l2_size * L2_CACHE_SIZE);
return 0;
fail_l1b:
g_free(extent->l1_backup_table);
@@ -669,8 +683,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
if (le32_to_cpu(header.flags) & VMDK4_FLAG_RGD) {
l1_backup_offset = le64_to_cpu(header.rgd_offset) << 9;
}
if (bdrv_getlength(file) <
le64_to_cpu(header.grain_offset) * BDRV_SECTOR_SIZE) {
if (bdrv_nb_sectors(file) < le64_to_cpu(header.grain_offset)) {
error_setg(errp, "File truncated, expecting at least %" PRId64 " bytes",
(int64_t)(le64_to_cpu(header.grain_offset)
* BDRV_SECTOR_SIZE));
@@ -821,6 +834,7 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
ret = vmdk_add_extent(bs, extent_file, true, sectors,
0, 0, 0, 0, 0, &extent, errp);
if (ret < 0) {
bdrv_unref(extent_file);
return ret;
}
extent->flat_start_offset = flat_offset << 9;
@@ -832,14 +846,15 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
} else {
ret = vmdk_open_sparse(bs, extent_file, bs->open_flags, buf, errp);
}
g_free(buf);
if (ret) {
g_free(buf);
bdrv_unref(extent_file);
return ret;
}
extent = &s->extents[s->num_extents - 1];
} else {
error_setg(errp, "Unsupported extent type '%s'", type);
bdrv_unref(extent_file);
return -ENOTSUP;
}
extent->type = g_strdup(type);
@@ -938,7 +953,7 @@ fail:
}
static int vmdk_refresh_limits(BlockDriverState *bs)
static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVVmdkState *s = bs->opaque;
int i;
@@ -950,61 +965,99 @@ static int vmdk_refresh_limits(BlockDriverState *bs)
s->extents[i].cluster_sectors);
}
}
return 0;
}
/**
* get_whole_cluster
*
* Copy backing file's cluster that covers @sector_num, otherwise write zero,
* to the cluster at @cluster_sector_num.
*
* If @skip_start_sector < @skip_end_sector, the relative range
* [@skip_start_sector, @skip_end_sector) is not copied or written, and leave
* it for call to write user data in the request.
*/
static int get_whole_cluster(BlockDriverState *bs,
VmdkExtent *extent,
uint64_t cluster_offset,
uint64_t offset,
bool allocate)
VmdkExtent *extent,
uint64_t cluster_sector_num,
uint64_t sector_num,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
{
int ret = VMDK_OK;
uint8_t *whole_grain = NULL;
int64_t cluster_bytes;
uint8_t *whole_grain;
/* For COW, align request sector_num to cluster start */
sector_num = QEMU_ALIGN_DOWN(sector_num, extent->cluster_sectors);
cluster_bytes = extent->cluster_sectors << BDRV_SECTOR_BITS;
whole_grain = qemu_blockalign(bs, cluster_bytes);
if (!bs->backing_hd) {
memset(whole_grain, 0, skip_start_sector << BDRV_SECTOR_BITS);
memset(whole_grain + (skip_end_sector << BDRV_SECTOR_BITS), 0,
cluster_bytes - (skip_end_sector << BDRV_SECTOR_BITS));
}
assert(skip_end_sector <= extent->cluster_sectors);
/* we will be here if it's first write on non-exist grain(cluster).
* try to read from parent image, if exist */
if (bs->backing_hd) {
whole_grain =
qemu_blockalign(bs, extent->cluster_sectors << BDRV_SECTOR_BITS);
if (!vmdk_is_cid_valid(bs)) {
ret = VMDK_ERROR;
goto exit;
}
if (bs->backing_hd && !vmdk_is_cid_valid(bs)) {
ret = VMDK_ERROR;
goto exit;
}
/* floor offset to cluster */
offset -= offset % (extent->cluster_sectors * 512);
ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
extent->cluster_sectors);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
/* Read backing data before skip range */
if (skip_start_sector > 0) {
if (bs->backing_hd) {
ret = bdrv_read(bs->backing_hd, sector_num,
whole_grain, skip_start_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
/* Write grain only into the active image */
ret = bdrv_write(extent->file, cluster_offset, whole_grain,
extent->cluster_sectors);
ret = bdrv_write(extent->file, cluster_sector_num, whole_grain,
skip_start_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
/* Read backing data after skip range */
if (skip_end_sector < extent->cluster_sectors) {
if (bs->backing_hd) {
ret = bdrv_read(bs->backing_hd, sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
ret = bdrv_write(extent->file, cluster_sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
exit:
qemu_vfree(whole_grain);
return ret;
}
static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
uint32_t offset)
{
uint32_t offset;
QEMU_BUILD_BUG_ON(sizeof(offset) != sizeof(m_data->offset));
offset = cpu_to_le32(m_data->offset);
offset = cpu_to_le32(offset);
/* update L2 table */
if (bdrv_pwrite_sync(
extent->file,
((int64_t)m_data->l2_offset * 512)
+ (m_data->l2_index * sizeof(m_data->offset)),
+ (m_data->l2_index * sizeof(offset)),
&offset, sizeof(offset)) < 0) {
return VMDK_ERROR;
}
@@ -1014,7 +1067,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
if (bdrv_pwrite_sync(
extent->file,
((int64_t)m_data->l2_offset * 512)
+ (m_data->l2_index * sizeof(m_data->offset)),
+ (m_data->l2_index * sizeof(offset)),
&offset, sizeof(offset)) < 0) {
return VMDK_ERROR;
}
@@ -1026,17 +1079,41 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
return VMDK_OK;
}
/**
* get_cluster_offset
*
* Look up cluster offset in extent file by sector number, and store in
* @cluster_offset.
*
* For flat extents, the start offset as parsed from the description file is
* returned.
*
* For sparse extents, look up in L1, L2 table. If allocate is true, return an
* offset for a new cluster and update L2 cache. If there is a backing file,
* COW is done before returning; otherwise, zeroes are written to the allocated
* cluster. Both COW and zero writing skips the sector range
* [@skip_start_sector, @skip_end_sector) passed in by caller, because caller
* has new data to write there.
*
* Returns: VMDK_OK if cluster exists and mapped in the image.
* VMDK_UNALLOC if cluster is not mapped and @allocate is false.
* VMDK_ERROR if failed.
*/
static int get_cluster_offset(BlockDriverState *bs,
VmdkExtent *extent,
VmdkMetaData *m_data,
uint64_t offset,
int allocate,
uint64_t *cluster_offset)
VmdkExtent *extent,
VmdkMetaData *m_data,
uint64_t offset,
bool allocate,
uint64_t *cluster_offset,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
{
unsigned int l1_index, l2_offset, l2_index;
int min_index, i, j;
uint32_t min_count, *l2_table;
bool zeroed = false;
int64_t ret;
int32_t cluster_sector;
if (m_data) {
m_data->valid = 0;
@@ -1090,52 +1167,41 @@ static int get_cluster_offset(BlockDriverState *bs,
extent->l2_cache_counts[min_index] = 1;
found:
l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
*cluster_offset = le32_to_cpu(l2_table[l2_index]);
cluster_sector = le32_to_cpu(l2_table[l2_index]);
if (m_data) {
m_data->valid = 1;
m_data->l1_index = l1_index;
m_data->l2_index = l2_index;
m_data->offset = *cluster_offset;
m_data->l2_offset = l2_offset;
m_data->l2_cache_entry = &l2_table[l2_index];
}
if (extent->has_zero_grain && *cluster_offset == VMDK_GTE_ZEROED) {
if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
zeroed = true;
}
if (!*cluster_offset || zeroed) {
if (!cluster_sector || zeroed) {
if (!allocate) {
return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
}
/* Avoid the L2 tables update for the images that have snapshots. */
*cluster_offset = bdrv_getlength(extent->file);
if (!extent->compressed) {
bdrv_truncate(
extent->file,
*cluster_offset + (extent->cluster_sectors << 9)
);
}
*cluster_offset >>= 9;
l2_table[l2_index] = cpu_to_le32(*cluster_offset);
cluster_sector = extent->next_cluster_sector;
extent->next_cluster_sector += extent->cluster_sectors;
/* First of all we write grain itself, to avoid race condition
* that may to corrupt the image.
* This problem may occur because of insufficient space on host disk
* or inappropriate VM shutdown.
*/
if (get_whole_cluster(
bs, extent, *cluster_offset, offset, allocate) == -1) {
return VMDK_ERROR;
}
if (m_data) {
m_data->offset = *cluster_offset;
ret = get_whole_cluster(bs, extent,
cluster_sector,
offset >> BDRV_SECTOR_BITS,
skip_start_sector, skip_end_sector);
if (ret) {
return ret;
}
}
*cluster_offset <<= 9;
*cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
return VMDK_OK;
}
@@ -1170,7 +1236,8 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
}
qemu_co_mutex_lock(&s->lock);
ret = get_cluster_offset(bs, extent, NULL,
sector_num * 512, 0, &offset);
sector_num * 512, false, &offset,
0, 0);
qemu_co_mutex_unlock(&s->lock);
switch (ret) {
@@ -1323,9 +1390,9 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
if (!extent) {
return -EIO;
}
ret = get_cluster_offset(
bs, extent, NULL,
sector_num << 9, 0, &cluster_offset);
ret = get_cluster_offset(bs, extent, NULL,
sector_num << 9, false, &cluster_offset,
0, 0);
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
@@ -1406,12 +1473,17 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
if (!extent) {
return -EIO;
}
ret = get_cluster_offset(
bs,
extent,
&m_data,
sector_num << 9, !extent->compressed,
&cluster_offset);
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
!(extent->compressed || zeroed),
&cluster_offset,
index_in_cluster, index_in_cluster + n);
if (extent->compressed) {
if (ret == VMDK_OK) {
/* Refuse write to allocated cluster for streamOptimized */
@@ -1420,24 +1492,13 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
return -EIO;
} else {
/* allocate */
ret = get_cluster_offset(
bs,
extent,
&m_data,
sector_num << 9, 1,
&cluster_offset);
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
true, &cluster_offset, 0, 0);
}
}
if (ret == VMDK_ERROR) {
return -EINVAL;
}
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
if (zeroed) {
/* Do zeroed write, buf is ignored */
if (extent->has_zero_grain &&
@@ -1445,9 +1506,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
n >= extent->cluster_sectors) {
n = extent->cluster_sectors;
if (!zero_dry_run) {
m_data.offset = VMDK_GTE_ZEROED;
/* update L2 tables */
if (vmdk_L2update(extent, &m_data) != VMDK_OK) {
if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
!= VMDK_OK) {
return -EIO;
}
}
@@ -1463,7 +1524,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
}
if (m_data.valid) {
/* update L2 tables */
if (vmdk_L2update(extent, &m_data) != VMDK_OK) {
if (vmdk_L2update(extent, &m_data,
cluster_offset >> BDRV_SECTOR_BITS)
!= VMDK_OK) {
return -EIO;
}
}
@@ -1529,7 +1592,7 @@ static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
static int vmdk_create_extent(const char *filename, int64_t filesize,
bool flat, bool compress, bool zeroed_grain,
Error **errp)
QemuOpts *opts, Error **errp)
{
int ret, i;
BlockDriverState *bs = NULL;
@@ -1539,7 +1602,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
uint32_t *gd_buf = NULL;
int gd_buf_size;
ret = bdrv_create_file(filename, NULL, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
@@ -1695,17 +1758,16 @@ static int filename_decompose(const char *filename, char *path, char *prefix,
return VMDK_OK;
}
static int vmdk_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
{
int idx = 0;
BlockDriverState *new_bs = NULL;
Error *local_err = NULL;
char *desc = NULL;
int64_t total_size = 0, filesize;
const char *adapter_type = NULL;
const char *backing_file = NULL;
const char *fmt = NULL;
char *adapter_type = NULL;
char *backing_file = NULL;
char *fmt = NULL;
int flags = 0;
int ret = 0;
bool flat, split, compress;
@@ -1745,24 +1807,20 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options,
goto exit;
}
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_ADAPTER_TYPE)) {
adapter_type = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_COMPAT6)) {
flags |= options->value.n ? BLOCK_FLAG_COMPAT6 : 0;
} else if (!strcmp(options->name, BLOCK_OPT_SUBFMT)) {
fmt = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_ZEROED_GRAIN)) {
zeroed_grain |= options->value.n;
}
options++;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
adapter_type = qemu_opt_get_del(opts, BLOCK_OPT_ADAPTER_TYPE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_COMPAT6, false)) {
flags |= BLOCK_FLAG_COMPAT6;
}
fmt = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ZEROED_GRAIN, false)) {
zeroed_grain = true;
}
if (!adapter_type) {
adapter_type = "ide";
adapter_type = g_strdup("ide");
} else if (strcmp(adapter_type, "ide") &&
strcmp(adapter_type, "buslogic") &&
strcmp(adapter_type, "lsilogic") &&
@@ -1778,7 +1836,7 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options,
}
if (!fmt) {
/* Default format to monolithicSparse */
fmt = "monolithicSparse";
fmt = g_strdup("monolithicSparse");
} else if (strcmp(fmt, "monolithicFlat") &&
strcmp(fmt, "monolithicSparse") &&
strcmp(fmt, "twoGbMaxExtentSparse") &&
@@ -1851,7 +1909,7 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options,
path, desc_filename);
if (vmdk_create_extent(ext_filename, size,
flat, compress, zeroed_grain, errp)) {
flat, compress, zeroed_grain, opts, errp)) {
ret = -EINVAL;
goto exit;
}
@@ -1879,7 +1937,7 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options,
if (!split && !flat) {
desc_offset = 0x200;
} else {
ret = bdrv_create_file(filename, options, &local_err);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
@@ -1909,6 +1967,9 @@ exit:
if (new_bs) {
bdrv_unref(new_bs);
}
g_free(adapter_type);
g_free(backing_file);
g_free(fmt);
g_free(desc);
g_string_free(ext_desc_lines, true);
return ret;
@@ -2004,7 +2065,7 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
BDRVVmdkState *s = bs->opaque;
VmdkExtent *extent = NULL;
int64_t sector_num = 0;
int64_t total_sectors = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
int64_t total_sectors = bdrv_nb_sectors(bs);
int ret;
uint64_t cluster_offset;
@@ -2025,7 +2086,7 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
}
ret = get_cluster_offset(bs, extent, NULL,
sector_num << BDRV_SECTOR_BITS,
0, &cluster_offset);
false, &cluster_offset, 0, 0);
if (ret == VMDK_ERROR) {
fprintf(stderr,
"ERROR: could not get cluster_offset for sector %"
@@ -2096,41 +2157,68 @@ static int vmdk_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static QEMUOptionParameter vmdk_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_ADAPTER_TYPE,
.type = OPT_STRING,
.help = "Virtual adapter type, can be one of "
"ide (default), lsilogic, buslogic or legacyESX"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_COMPAT6,
.type = OPT_FLAG,
.help = "VMDK version 6 image"
},
{
.name = BLOCK_OPT_SUBFMT,
.type = OPT_STRING,
.help =
"VMDK flat extent format, can be one of "
"{monolithicSparse (default) | monolithicFlat | twoGbMaxExtentSparse | twoGbMaxExtentFlat | streamOptimized} "
},
{
.name = BLOCK_OPT_ZEROED_GRAIN,
.type = OPT_FLAG,
.help = "Enable efficient zero writes using the zeroed-grain GTE feature"
},
{ NULL }
static void vmdk_detach_aio_context(BlockDriverState *bs)
{
BDRVVmdkState *s = bs->opaque;
int i;
for (i = 0; i < s->num_extents; i++) {
bdrv_detach_aio_context(s->extents[i].file);
}
}
static void vmdk_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVVmdkState *s = bs->opaque;
int i;
for (i = 0; i < s->num_extents; i++) {
bdrv_attach_aio_context(s->extents[i].file, new_context);
}
}
static QemuOptsList vmdk_create_opts = {
.name = "vmdk-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vmdk_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_ADAPTER_TYPE,
.type = QEMU_OPT_STRING,
.help = "Virtual adapter type, can be one of "
"ide (default), lsilogic, buslogic or legacyESX"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_COMPAT6,
.type = QEMU_OPT_BOOL,
.help = "VMDK version 6 image",
.def_value_str = "off"
},
{
.name = BLOCK_OPT_SUBFMT,
.type = QEMU_OPT_STRING,
.help =
"VMDK flat extent format, can be one of "
"{monolithicSparse (default) | monolithicFlat | twoGbMaxExtentSparse | twoGbMaxExtentFlat | streamOptimized} "
},
{
.name = BLOCK_OPT_ZEROED_GRAIN,
.type = QEMU_OPT_BOOL,
.help = "Enable efficient zero writes "
"using the zeroed-grain GTE feature"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_vmdk = {
@@ -2153,8 +2241,11 @@ static BlockDriver bdrv_vmdk = {
.bdrv_get_specific_info = vmdk_get_specific_info,
.bdrv_refresh_limits = vmdk_refresh_limits,
.bdrv_get_info = vmdk_get_info,
.bdrv_detach_aio_context = vmdk_detach_aio_context,
.bdrv_attach_aio_context = vmdk_attach_aio_context,
.create_options = vmdk_create_options,
.supports_backing = true,
.create_opts = &vmdk_create_opts,
};
static void bdrv_vmdk_init(void)

View File

@@ -269,7 +269,11 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
s->pagetable = qemu_blockalign(bs, s->max_table_entries * 4);
s->pagetable = qemu_try_blockalign(bs->file, s->max_table_entries * 4);
if (s->pagetable == NULL) {
ret = -ENOMEM;
goto fail;
}
s->bat_offset = be64_to_cpu(dyndisk_header->table_offset);
@@ -485,7 +489,7 @@ static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
BDRVVPCState *s = (BDRVVPCState *)bs->opaque;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) != VHD_FIXED) {
if (be32_to_cpu(footer->type) != VHD_FIXED) {
bdi->cluster_size = s->block_size;
}
@@ -502,7 +506,7 @@ static int vpc_read(BlockDriverState *bs, int64_t sector_num,
int64_t sectors, sectors_per_block;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_read(bs->file, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
@@ -551,7 +555,7 @@ static int vpc_write(BlockDriverState *bs, int64_t sector_num,
int ret;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_write(bs->file, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
@@ -649,39 +653,41 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
return 0;
}
static int create_dynamic_disk(int fd, uint8_t *buf, int64_t total_sectors)
static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
int64_t total_sectors)
{
VHDDynDiskHeader *dyndisk_header =
(VHDDynDiskHeader *) buf;
size_t block_size, num_bat_entries;
int i;
int ret = -EIO;
int ret;
int64_t offset = 0;
// Write the footer (twice: at the beginning and at the end)
block_size = 0x200000;
num_bat_entries = (total_sectors + block_size / 512) / (block_size / 512);
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
if (ret) {
goto fail;
}
if (lseek(fd, 1536 + ((num_bat_entries * 4 + 511) & ~511), SEEK_SET) < 0) {
goto fail;
}
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
offset = 1536 + ((num_bat_entries * 4 + 511) & ~511);
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
if (ret < 0) {
goto fail;
}
// Write the initial BAT
if (lseek(fd, 3 * 512, SEEK_SET) < 0) {
goto fail;
}
offset = 3 * 512;
memset(buf, 0xFF, 512);
for (i = 0; i < (num_bat_entries * 4 + 511) / 512; i++) {
if (write(fd, buf, 512) != 512) {
ret = bdrv_pwrite_sync(bs, offset, buf, 512);
if (ret < 0) {
goto fail;
}
offset += 512;
}
// Prepare the Dynamic Disk Header
@@ -702,49 +708,44 @@ static int create_dynamic_disk(int fd, uint8_t *buf, int64_t total_sectors)
dyndisk_header->checksum = be32_to_cpu(vpc_checksum(buf, 1024));
// Write the header
if (lseek(fd, 512, SEEK_SET) < 0) {
goto fail;
}
offset = 512;
if (write(fd, buf, 1024) != 1024) {
ret = bdrv_pwrite_sync(bs, offset, buf, 1024);
if (ret < 0) {
goto fail;
}
ret = 0;
fail:
return ret;
}
static int create_fixed_disk(int fd, uint8_t *buf, int64_t total_size)
static int create_fixed_disk(BlockDriverState *bs, uint8_t *buf,
int64_t total_size)
{
int ret = -EIO;
int ret;
/* Add footer to total size */
total_size += 512;
if (ftruncate(fd, total_size) != 0) {
ret = -errno;
goto fail;
}
if (lseek(fd, -512, SEEK_END) < 0) {
goto fail;
}
if (write(fd, buf, HEADER_SIZE) != HEADER_SIZE) {
goto fail;
total_size += HEADER_SIZE;
ret = bdrv_truncate(bs, total_size);
if (ret < 0) {
return ret;
}
ret = 0;
ret = bdrv_pwrite_sync(bs, total_size - HEADER_SIZE, buf, HEADER_SIZE);
if (ret < 0) {
return ret;
}
fail:
return ret;
}
static int vpc_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
{
uint8_t buf[1024];
VHDFooter *footer = (VHDFooter *) buf;
QEMUOptionParameter *disk_type_param;
int fd, i;
char *disk_type_param;
int i;
uint16_t cyls = 0;
uint8_t heads = 0;
uint8_t secs_per_cyl = 0;
@@ -752,27 +753,36 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options,
int64_t total_size;
int disk_type;
int ret = -EIO;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
/* Read out options */
total_size = get_option_parameter(options, BLOCK_OPT_SIZE)->value.n;
disk_type_param = get_option_parameter(options, BLOCK_OPT_SUBFMT);
if (disk_type_param && disk_type_param->value.s) {
if (!strcmp(disk_type_param->value.s, "dynamic")) {
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
if (disk_type_param) {
if (!strcmp(disk_type_param, "dynamic")) {
disk_type = VHD_DYNAMIC;
} else if (!strcmp(disk_type_param->value.s, "fixed")) {
} else if (!strcmp(disk_type_param, "fixed")) {
disk_type = VHD_FIXED;
} else {
return -EINVAL;
ret = -EINVAL;
goto out;
}
} else {
disk_type = VHD_DYNAMIC;
}
/* Create the file */
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
if (fd < 0) {
return -EIO;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
}
/*
@@ -786,7 +796,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options,
&secs_per_cyl))
{
ret = -EFBIG;
goto fail;
goto out;
}
}
@@ -832,13 +842,14 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options,
footer->checksum = be32_to_cpu(vpc_checksum(buf, HEADER_SIZE));
if (disk_type == VHD_DYNAMIC) {
ret = create_dynamic_disk(fd, buf, total_sectors);
ret = create_dynamic_disk(bs, buf, total_sectors);
} else {
ret = create_fixed_disk(fd, buf, total_size);
ret = create_fixed_disk(bs, buf, total_size);
}
fail:
qemu_close(fd);
out:
bdrv_unref(bs);
g_free(disk_type_param);
return ret;
}
@@ -847,7 +858,7 @@ static int vpc_has_zero_init(BlockDriverState *bs)
BDRVVPCState *s = bs->opaque;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_has_zero_init(bs->file);
} else {
return 1;
@@ -866,20 +877,29 @@ static void vpc_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
static QEMUOptionParameter vpc_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_SUBFMT,
.type = OPT_STRING,
.help =
"Type of virtual hard disk format. Supported formats are "
"{dynamic (default) | fixed} "
},
{ NULL }
static QemuOptsList vpc_create_opts = {
.name = "vpc-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vpc_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_SUBFMT,
.type = QEMU_OPT_STRING,
.help =
"Type of virtual hard disk format. Supported formats are "
"{dynamic (default) | fixed} "
},
{
.name = BLOCK_OPT_NOCOW,
.type = QEMU_OPT_BOOL,
.help = "Turn off copy-on-write (valid only on btrfs)"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_vpc = {
@@ -897,7 +917,7 @@ static BlockDriver bdrv_vpc = {
.bdrv_get_info = vpc_get_info,
.create_options = vpc_create_options,
.create_opts = &vpc_create_opts,
.bdrv_has_zero_init = vpc_has_zero_init,
};

View File

@@ -52,10 +52,6 @@
#define DLOG(a) a
#undef stderr
#define stderr STDERR
FILE* stderr = NULL;
static void checkpoint(void);
#ifdef __MINGW32__
@@ -732,7 +728,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
if(first_cluster == 0 && (is_dotdot || is_dot))
continue;
buffer=(char*)g_malloc(length);
buffer = g_malloc(length);
snprintf(buffer,length,"%s/%s",dirname,entry->d_name);
if(stat(buffer,&st)<0) {
@@ -767,7 +763,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
/* create mapping for this file */
if(!is_dot && !is_dotdot && (S_ISDIR(st.st_mode) || st.st_size)) {
s->current_mapping=(mapping_t*)array_get_next(&(s->mapping));
s->current_mapping = array_get_next(&(s->mapping));
s->current_mapping->begin=0;
s->current_mapping->end=st.st_size;
/*
@@ -811,12 +807,12 @@ static int read_directory(BDRVVVFATState* s, int mapping_index)
}
/* reget the mapping, since s->mapping was possibly realloc()ed */
mapping = (mapping_t*)array_get(&(s->mapping), mapping_index);
mapping = array_get(&(s->mapping), mapping_index);
first_cluster += (s->directory.next - mapping->info.dir.first_dir_index)
* 0x20 / s->cluster_size;
mapping->end = first_cluster;
direntry = (direntry_t*)array_get(&(s->directory), mapping->dir_index);
direntry = array_get(&(s->directory), mapping->dir_index);
set_begin_of_direntry(direntry, mapping->begin);
return 0;
@@ -1082,11 +1078,6 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
vvv = s;
#endif
DLOG(if (stderr == NULL) {
stderr = fopen("vvfat.log", "a");
setbuf(stderr, NULL);
})
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -2910,8 +2901,8 @@ static BlockDriver vvfat_write_target = {
static int enable_write_target(BDRVVVFATState *s, Error **errp)
{
BlockDriver *bdrv_qcow;
QEMUOptionParameter *options;
BlockDriver *bdrv_qcow = NULL;
QemuOpts *opts = NULL;
int ret;
int size = sector2cluster(s, s->sector_count);
s->used_clusters = calloc(size, 1);
@@ -2926,12 +2917,12 @@ static int enable_write_target(BDRVVVFATState *s, Error **errp)
}
bdrv_qcow = bdrv_find_format("qcow");
options = parse_option_parameters("", bdrv_qcow->create_options, NULL);
set_option_parameter_int(options, BLOCK_OPT_SIZE, s->sector_count * 512);
set_option_parameter(options, BLOCK_OPT_BACKING_FILE, "fat:");
opts = qemu_opts_create(bdrv_qcow->create_opts, NULL, 0, &error_abort);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE, s->sector_count * 512);
qemu_opt_set(opts, BLOCK_OPT_BACKING_FILE, "fat:");
ret = bdrv_create(bdrv_qcow, s->qcow_filename, options, errp);
free_option_parameters(options);
ret = bdrv_create(bdrv_qcow, s->qcow_filename, opts, errp);
qemu_opts_del(opts);
if (ret < 0) {
goto err;
}
@@ -2950,7 +2941,7 @@ static int enable_write_target(BDRVVVFATState *s, Error **errp)
bdrv_set_backing_hd(s->bs, bdrv_new("", &error_abort));
s->bs->backing_hd->drv = &vvfat_write_target;
s->bs->backing_hd->opaque = g_malloc(sizeof(void*));
s->bs->backing_hd->opaque = g_new(void *, 1);
*(void**)s->bs->backing_hd->opaque = s;
return 0;

View File

@@ -40,6 +40,7 @@ struct QEMUWin32AIOState {
HANDLE hIOCP;
EventNotifier e;
int count;
bool is_aio_context_attached;
};
typedef struct QEMUWin32AIOCB {
@@ -114,7 +115,7 @@ static void win32_aio_cancel(BlockDriverAIOCB *blockacb)
* wait for completion.
*/
while (!HasOverlappedIoCompleted(&waiocb->ov)) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(blockacb->bs), true);
}
}
@@ -138,7 +139,10 @@ BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
waiocb->is_read = (type == QEMU_AIO_READ);
if (qiov->niov > 1) {
waiocb->buf = qemu_blockalign(bs, qiov->size);
waiocb->buf = qemu_try_blockalign(bs, qiov->size);
if (waiocb->buf == NULL) {
goto out;
}
if (type & QEMU_AIO_WRITE) {
iov_to_buf(qiov->iov, qiov->niov, 0, waiocb->buf, qiov->size);
}
@@ -167,6 +171,7 @@ BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
out_dec_count:
aio->count--;
out:
qemu_aio_release(waiocb);
return NULL;
}
@@ -180,6 +185,20 @@ int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile)
}
}
void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
AioContext *old_context)
{
aio_set_event_notifier(old_context, &aio->e, NULL);
aio->is_aio_context_attached = false;
}
void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,
AioContext *new_context)
{
aio->is_aio_context_attached = true;
aio_set_event_notifier(new_context, &aio->e, win32_aio_completion_cb);
}
QEMUWin32AIOState *win32_aio_init(void)
{
QEMUWin32AIOState *s;
@@ -194,8 +213,6 @@ QEMUWin32AIOState *win32_aio_init(void)
goto out_close_efd;
}
qemu_aio_set_event_notifier(&s->e, win32_aio_completion_cb);
return s;
out_close_efd:
@@ -204,3 +221,11 @@ out_free_state:
g_free(s);
return NULL;
}
void win32_aio_cleanup(QEMUWin32AIOState *aio)
{
assert(!aio->is_aio_context_attached);
CloseHandle(aio->hIOCP);
event_notifier_cleanup(&aio->e);
g_free(aio);
}

View File

@@ -28,6 +28,7 @@ static void nbd_accept(void *opaque)
int fd = accept(server_fd, (struct sockaddr *)&addr, &addr_len);
if (fd >= 0 && !nbd_client_new(NULL, fd, nbd_client_put)) {
shutdown(fd, 2);
close(fd);
}
}
@@ -91,6 +92,10 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
error_set(errp, QERR_DEVICE_NOT_FOUND, device);
return;
}
if (!bdrv_is_inserted(bs)) {
error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
return;
}
if (!has_writable) {
writable = false;
@@ -103,7 +108,7 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
nbd_export_set_name(exp, device);
n = g_malloc0(sizeof(NBDCloseNotifier));
n = g_new0(NBDCloseNotifier, 1);
n->n.notify = nbd_close_notifier;
n->exp = exp;
bdrv_add_close_notifier(bs, &n->n);

View File

@@ -39,6 +39,7 @@
#include "qapi/qmp/types.h"
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/util.h"
#include "sysemu/sysemu.h"
#include "block/block_int.h"
#include "qmp-commands.h"
@@ -106,7 +107,7 @@ void blockdev_auto_del(BlockDriverState *bs)
DriveInfo *dinfo = drive_get_by_blockdev(bs);
if (dinfo && dinfo->auto_del) {
drive_put_ref(dinfo);
drive_del(dinfo);
}
}
@@ -213,7 +214,7 @@ static void bdrv_format_print(void *opaque, const char *name)
error_printf(" %s", name);
}
static void drive_uninit(DriveInfo *dinfo)
void drive_del(DriveInfo *dinfo)
{
if (dinfo->opts) {
qemu_opts_del(dinfo->opts);
@@ -226,19 +227,6 @@ static void drive_uninit(DriveInfo *dinfo)
g_free(dinfo);
}
void drive_put_ref(DriveInfo *dinfo)
{
assert(dinfo->refcount);
if (--dinfo->refcount == 0) {
drive_uninit(dinfo);
}
}
void drive_get_ref(DriveInfo *dinfo)
{
dinfo->refcount++;
}
typedef struct {
QEMUBH *bh;
BlockDriverState *bs;
@@ -287,25 +275,6 @@ static int parse_block_error_action(const char *buf, bool is_read, Error **errp)
}
}
static inline int parse_enum_option(const char *lookup[], const char *buf,
int max, int def, Error **errp)
{
int i;
if (!buf) {
return def;
}
for (i = 0; i < max; i++) {
if (!strcmp(buf, lookup[i])) {
return i;
}
}
error_setg(errp, "invalid parameter value: %s", buf);
return def;
}
static bool check_throttle_config(ThrottleConfig *cfg, Error **errp)
{
if (throttle_conflicting(cfg)) {
@@ -329,7 +298,6 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
Error **errp)
{
const char *buf;
const char *serial;
int ro = 0;
int bdrv_flags = 0;
int on_read_error, on_write_error;
@@ -371,8 +339,6 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
ro = qemu_opt_get_bool(opts, "read-only", 0);
copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
serial = qemu_opt_get(opts, "serial");
if ((buf = qemu_opt_get(opts, "discard")) != NULL) {
if (bdrv_parse_discard_flags(buf, &bdrv_flags) != 0) {
error_setg(errp, "invalid discard option");
@@ -472,11 +438,11 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
}
detect_zeroes =
parse_enum_option(BlockdevDetectZeroesOptions_lookup,
qemu_opt_get(opts, "detect-zeroes"),
BLOCKDEV_DETECT_ZEROES_OPTIONS_MAX,
BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
&error);
qapi_enum_parse(BlockdevDetectZeroesOptions_lookup,
qemu_opt_get(opts, "detect-zeroes"),
BLOCKDEV_DETECT_ZEROES_OPTIONS_MAX,
BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
&error);
if (error) {
error_propagate(errp, error);
goto early_err;
@@ -500,10 +466,6 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
dinfo->bdrv->open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
dinfo->bdrv->read_only = ro;
dinfo->bdrv->detect_zeroes = detect_zeroes;
dinfo->refcount = 1;
if (serial != NULL) {
dinfo->serial = g_strdup(serial);
}
QTAILQ_INSERT_TAIL(&drives, dinfo, next);
bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error);
@@ -629,6 +591,10 @@ QemuOptsList qemu_legacy_drive_opts = {
.name = "addr",
.type = QEMU_OPT_STRING,
.help = "pci address (virtio only)",
},{
.name = "serial",
.type = QEMU_OPT_STRING,
.help = "disk serial number",
},{
.name = "file",
.type = QEMU_OPT_STRING,
@@ -658,7 +624,7 @@ QemuOptsList qemu_legacy_drive_opts = {
},
};
DriveInfo *drive_init(QemuOpts *all_opts, BlockInterfaceType block_default_type)
DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
{
const char *value;
DriveInfo *dinfo = NULL;
@@ -672,6 +638,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, BlockInterfaceType block_default_type)
const char *werror, *rerror;
bool read_only = false;
bool copy_on_read;
const char *serial;
const char *filename;
Error *local_err = NULL;
@@ -875,6 +842,9 @@ DriveInfo *drive_init(QemuOpts *all_opts, BlockInterfaceType block_default_type)
goto fail;
}
/* Serial number */
serial = qemu_opt_get(legacy_opts, "serial");
/* no id supplied -> create one */
if (qemu_opts_id(all_opts) == NULL) {
char *new_id;
@@ -965,6 +935,8 @@ DriveInfo *drive_init(QemuOpts *all_opts, BlockInterfaceType block_default_type)
dinfo->unit = unit_id;
dinfo->devaddr = devaddr;
dinfo->serial = g_strdup(serial);
switch(type) {
case IF_IDE:
case IF_SCSI:
@@ -1104,7 +1076,7 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device,
return NULL;
}
info = g_malloc0(sizeof(SnapshotInfo));
info = g_new0(SnapshotInfo, 1);
info->id = g_strdup(sn.id_str);
info->name = g_strdup(sn.name);
info->date_nsec = sn.date_nsec;
@@ -1703,6 +1675,7 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
{
ThrottleConfig cfg;
BlockDriverState *bs;
AioContext *aio_context;
bs = bdrv_find(device);
if (!bs) {
@@ -1746,6 +1719,9 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
return;
}
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (!bs->io_limits_enabled && throttle_enabled(&cfg)) {
bdrv_io_limits_enable(bs);
} else if (bs->io_limits_enabled && !throttle_enabled(&cfg)) {
@@ -1755,12 +1731,16 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
if (bs->io_limits_enabled) {
bdrv_set_io_limits(bs, &cfg);
}
aio_context_release(aio_context);
}
int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
{
const char *id = qdict_get_str(qdict, "id");
BlockDriverState *bs;
DriveInfo *dinfo;
AioContext *aio_context;
Error *local_err = NULL;
bs = bdrv_find(id);
@@ -1768,9 +1748,21 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
error_report("Device '%s' not found", id);
return -1;
}
dinfo = drive_get_by_blockdev(bs);
if (dinfo && !dinfo->enable_auto_del) {
error_report("Deleting device added with blockdev-add"
" is not supported");
return -1;
}
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, &local_err)) {
error_report("%s", error_get_pretty(local_err));
error_free(local_err);
aio_context_release(aio_context);
return -1;
}
@@ -1791,9 +1783,10 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
bdrv_set_on_error(bs, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT);
} else {
drive_uninit(drive_get_by_blockdev(bs));
drive_del(dinfo);
}
aio_context_release(aio_context);
return 0;
}
@@ -1803,6 +1796,7 @@ void qmp_block_resize(bool has_device, const char *device,
{
Error *local_err = NULL;
BlockDriverState *bs;
AioContext *aio_context;
int ret;
bs = bdrv_lookup_bs(has_device ? device : NULL,
@@ -1813,14 +1807,22 @@ void qmp_block_resize(bool has_device, const char *device,
return;
}
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (!bdrv_is_first_non_filter(bs)) {
error_set(errp, QERR_FEATURE_DISABLED, "resize");
return;
goto out;
}
if (size < 0) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "size", "a >0 size");
return;
goto out;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {
error_set(errp, QERR_DEVICE_IN_USE, device);
goto out;
}
/* complete all in-flight operations before resizing the device */
@@ -1846,40 +1848,44 @@ void qmp_block_resize(bool has_device, const char *device,
error_setg_errno(errp, -ret, "Could not resize");
break;
}
out:
aio_context_release(aio_context);
}
static void block_job_cb(void *opaque, int ret)
{
BlockDriverState *bs = opaque;
QObject *obj;
const char *msg = NULL;
trace_block_job_cb(bs, bs->job, ret);
assert(bs->job);
obj = qobject_from_block_job(bs->job);
if (ret < 0) {
QDict *dict = qobject_to_qdict(obj);
qdict_put(dict, "error", qstring_from_str(strerror(-ret)));
msg = strerror(-ret);
}
if (block_job_is_cancelled(bs->job)) {
monitor_protocol_event(QEVENT_BLOCK_JOB_CANCELLED, obj);
block_job_event_cancelled(bs->job);
} else {
monitor_protocol_event(QEVENT_BLOCK_JOB_COMPLETED, obj);
block_job_event_completed(bs->job, msg);
}
qobject_decref(obj);
bdrv_put_ref_bh_schedule(bs);
}
void qmp_block_stream(const char *device, bool has_base,
const char *base, bool has_speed, int64_t speed,
void qmp_block_stream(const char *device,
bool has_base, const char *base,
bool has_backing_file, const char *backing_file,
bool has_speed, int64_t speed,
bool has_on_error, BlockdevOnError on_error,
Error **errp)
{
BlockDriverState *bs;
BlockDriverState *base_bs = NULL;
Error *local_err = NULL;
const char *base_name = NULL;
if (!has_on_error) {
on_error = BLOCKDEV_ON_ERROR_REPORT;
@@ -1895,15 +1901,27 @@ void qmp_block_stream(const char *device, bool has_base,
return;
}
if (base) {
if (has_base) {
base_bs = bdrv_find_backing_image(bs, base);
if (base_bs == NULL) {
error_set(errp, QERR_BASE_NOT_FOUND, base);
return;
}
base_name = base;
}
stream_start(bs, base_bs, base, has_speed ? speed : 0,
/* if we are streaming the entire chain, the result will have no backing
* file, and specifying one is therefore an error */
if (base_bs == NULL && has_backing_file) {
error_setg(errp, "backing file specified, but streaming the "
"entire chain");
return;
}
/* backing_file string overrides base bs filename */
base_name = has_backing_file ? backing_file : base_name;
stream_start(bs, base_bs, base_name, has_speed ? speed : 0,
on_error, block_job_cb, bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
@@ -1914,7 +1932,9 @@ void qmp_block_stream(const char *device, bool has_base,
}
void qmp_block_commit(const char *device,
bool has_base, const char *base, const char *top,
bool has_base, const char *base,
bool has_top, const char *top,
bool has_backing_file, const char *backing_file,
bool has_speed, int64_t speed,
Error **errp)
{
@@ -1933,6 +1953,11 @@ void qmp_block_commit(const char *device,
/* drain all i/o before commits */
bdrv_drain_all();
/* Important Note:
* libvirt relies on the DeviceNotFound error class in order to probe for
* live commit feature versions; for this to work, we must make sure to
* perform the device lookup before any generic errors that may occur in a
* scenario in which all optional arguments are omitted. */
bs = bdrv_find(device);
if (!bs) {
error_set(errp, QERR_DEVICE_NOT_FOUND, device);
@@ -1946,7 +1971,7 @@ void qmp_block_commit(const char *device,
/* default top_bs is the active layer */
top_bs = bs;
if (top) {
if (has_top && top) {
if (strcmp(bs->filename, top) != 0) {
top_bs = bdrv_find_backing_image(bs, top);
}
@@ -1968,12 +1993,23 @@ void qmp_block_commit(const char *device,
return;
}
/* Do not allow attempts to commit an image into itself */
if (top_bs == base_bs) {
error_setg(errp, "cannot commit an image into itself");
return;
}
if (top_bs == bs) {
if (has_backing_file) {
error_setg(errp, "'backing-file' specified,"
" but 'top' is the active layer");
return;
}
commit_active_start(bs, base_bs, speed, on_error, block_job_cb,
bs, &local_err);
} else {
commit_start(bs, base_bs, top_bs, speed, on_error, block_job_cb, bs,
&local_err);
has_backing_file ? backing_file : NULL, &local_err);
}
if (local_err != NULL) {
error_propagate(errp, local_err);
@@ -2100,6 +2136,8 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
void qmp_drive_mirror(const char *device, const char *target,
bool has_format, const char *format,
bool has_node_name, const char *node_name,
bool has_replaces, const char *replaces,
enum MirrorSyncMode sync,
bool has_mode, enum NewImageMode mode,
bool has_speed, int64_t speed,
@@ -2113,6 +2151,7 @@ void qmp_drive_mirror(const char *device, const char *target,
BlockDriverState *source, *target_bs;
BlockDriver *drv = NULL;
Error *local_err = NULL;
QDict *options = NULL;
int flags;
int64_t size;
int ret;
@@ -2137,11 +2176,12 @@ void qmp_drive_mirror(const char *device, const char *target,
}
if (granularity != 0 && (granularity < 512 || granularity > 1048576 * 64)) {
error_set(errp, QERR_INVALID_PARAMETER, device);
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "granularity",
"a value in range [512B, 64MB]");
return;
}
if (granularity & (granularity - 1)) {
error_set(errp, QERR_INVALID_PARAMETER, device);
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "granularity", "power of 2");
return;
}
@@ -2186,6 +2226,29 @@ void qmp_drive_mirror(const char *device, const char *target,
return;
}
if (has_replaces) {
BlockDriverState *to_replace_bs;
if (!has_node_name) {
error_setg(errp, "a node-name must be provided when replacing a"
" named node of the graph");
return;
}
to_replace_bs = check_to_replace_node(replaces, &local_err);
if (!to_replace_bs) {
error_propagate(errp, local_err);
return;
}
if (size != bdrv_getlength(to_replace_bs)) {
error_setg(errp, "cannot replace image with a mirror image of "
"different size");
return;
}
}
if ((sync == MIRROR_SYNC_MODE_FULL || !source)
&& mode != NEW_IMAGE_MODE_EXISTING)
{
@@ -2214,18 +2277,28 @@ void qmp_drive_mirror(const char *device, const char *target,
return;
}
if (has_node_name) {
options = qdict_new();
qdict_put(options, "node-name", qstring_from_str(node_name));
}
/* Mirroring takes care of copy-on-write using the source's backing
* file.
*/
target_bs = NULL;
ret = bdrv_open(&target_bs, target, NULL, NULL, flags | BDRV_O_NO_BACKING,
drv, &local_err);
ret = bdrv_open(&target_bs, target, NULL, options,
flags | BDRV_O_NO_BACKING, drv, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return;
}
mirror_start(bs, target_bs, speed, granularity, buf_size, sync,
/* pass the node name to replace to mirror start since it's loose coupling
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(bs, target_bs,
has_replaces ? replaces : NULL,
speed, granularity, buf_size, sync,
on_source_error, on_target_error,
block_job_cb, bs, &local_err);
if (local_err != NULL) {
@@ -2320,6 +2393,85 @@ void qmp_block_job_complete(const char *device, Error **errp)
block_job_complete(job, errp);
}
void qmp_change_backing_file(const char *device,
const char *image_node_name,
const char *backing_file,
Error **errp)
{
BlockDriverState *bs = NULL;
BlockDriverState *image_bs = NULL;
Error *local_err = NULL;
bool ro;
int open_flags;
int ret;
/* find the top layer BDS of the chain */
bs = bdrv_find(device);
if (!bs) {
error_set(errp, QERR_DEVICE_NOT_FOUND, device);
return;
}
image_bs = bdrv_lookup_bs(NULL, image_node_name, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
if (!image_bs) {
error_setg(errp, "image file not found");
return;
}
if (bdrv_find_base(image_bs) == image_bs) {
error_setg(errp, "not allowing backing file change on an image "
"without a backing file");
return;
}
/* even though we are not necessarily operating on bs, we need it to
* determine if block ops are currently prohibited on the chain */
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_CHANGE, errp)) {
return;
}
/* final sanity check */
if (!bdrv_chain_contains(bs, image_bs)) {
error_setg(errp, "'%s' and image file are not in the same chain",
device);
return;
}
/* if not r/w, reopen to make r/w */
open_flags = image_bs->open_flags;
ro = bdrv_is_read_only(image_bs);
if (ro) {
bdrv_reopen(image_bs, open_flags | BDRV_O_RDWR, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
ret = bdrv_change_backing_file(image_bs, backing_file,
image_bs->drv ? image_bs->drv->format_name : "");
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not change backing file to '%s'",
backing_file);
/* don't exit here, so we can try to restore open flags if
* appropriate */
}
if (ro) {
bdrv_reopen(image_bs, open_flags, &local_err);
if (local_err) {
error_propagate(errp, local_err); /* will preserve prior errp */
}
}
}
void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
{
QmpOutputVisitor *ov = qmp_output_visitor_new();
@@ -2334,9 +2486,9 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
goto fail;
}
/* TODO Sort it out in raw-posix and drive_init: Reject aio=native with
/* TODO Sort it out in raw-posix and drive_new(): Reject aio=native with
* cache.direct=false instead of silently switching to aio=threads, except
* if called from drive_init.
* when called from drive_new().
*
* For now, simply forbidding the combination for all drivers will do. */
if (options->has_aio && options->aio == BLOCKDEV_AIO_OPTIONS_NATIVE) {
@@ -2368,7 +2520,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
}
if (bdrv_key_required(dinfo->bdrv)) {
drive_uninit(dinfo);
drive_del(dinfo);
error_setg(errp, "blockdev-add doesn't support encrypted devices");
goto fail;
}
@@ -2431,10 +2583,6 @@ QemuOptsList qemu_common_drive_opts = {
.name = "format",
.type = QEMU_OPT_STRING,
.help = "disk format (raw, qcow2, ...)",
},{
.name = "serial",
.type = QEMU_OPT_STRING,
.help = "disk serial number",
},{
.name = "rerror",
.type = QEMU_OPT_STRING,

View File

@@ -26,7 +26,6 @@
#include "config-host.h"
#include "qemu-common.h"
#include "trace.h"
#include "monitor/monitor.h"
#include "block/block.h"
#include "block/blockjob.h"
#include "block/block_int.h"
@@ -34,6 +33,7 @@
#include "block/coroutine.h"
#include "qmp-commands.h"
#include "qemu/timer.h"
#include "qapi-event.h"
void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockDriverCompletionFunc *cb,
@@ -187,7 +187,7 @@ int block_job_cancel_sync(BlockJob *job)
job->opaque = &data;
block_job_cancel(job);
while (data.ret == -EINPROGRESS) {
qemu_aio_wait();
aio_poll(bdrv_get_aio_context(bs), true);
}
return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
}
@@ -205,11 +205,25 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
if (block_job_is_paused(job)) {
qemu_coroutine_yield();
} else {
co_sleep_ns(type, ns);
co_aio_sleep_ns(bdrv_get_aio_context(job->bs), type, ns);
}
job->busy = true;
}
void block_job_yield(BlockJob *job)
{
assert(job->busy);
/* Check cancellation *before* setting busy = false, too! */
if (block_job_is_cancelled(job)) {
return;
}
job->busy = false;
qemu_coroutine_yield();
job->busy = true;
}
BlockJobInfo *block_job_query(BlockJob *job)
{
BlockJobInfo *info = g_new0(BlockJobInfo, 1);
@@ -232,26 +246,35 @@ static void block_job_iostatus_set_err(BlockJob *job, int error)
}
}
QObject *qobject_from_block_job(BlockJob *job)
void block_job_event_cancelled(BlockJob *job)
{
return qobject_from_jsonf("{ 'type': %s,"
"'device': %s,"
"'len': %" PRId64 ","
"'offset': %" PRId64 ","
"'speed': %" PRId64 " }",
BlockJobType_lookup[job->driver->job_type],
bdrv_get_device_name(job->bs),
job->len,
job->offset,
job->speed);
qapi_event_send_block_job_cancelled(job->driver->job_type,
bdrv_get_device_name(job->bs),
job->len,
job->offset,
job->speed,
&error_abort);
}
void block_job_ready(BlockJob *job)
void block_job_event_completed(BlockJob *job, const char *msg)
{
QObject *data = qobject_from_block_job(job);
monitor_protocol_event(QEVENT_BLOCK_JOB_READY, data);
qobject_decref(data);
qapi_event_send_block_job_completed(job->driver->job_type,
bdrv_get_device_name(job->bs),
job->len,
job->offset,
job->speed,
!!msg,
msg,
&error_abort);
}
void block_job_event_ready(BlockJob *job)
{
qapi_event_send_block_job_ready(job->driver->job_type,
bdrv_get_device_name(job->bs),
job->len,
job->offset,
job->speed, &error_abort);
}
BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
@@ -262,22 +285,26 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
switch (on_err) {
case BLOCKDEV_ON_ERROR_ENOSPC:
action = (error == ENOSPC) ? BDRV_ACTION_STOP : BDRV_ACTION_REPORT;
action = (error == ENOSPC) ?
BLOCK_ERROR_ACTION_STOP : BLOCK_ERROR_ACTION_REPORT;
break;
case BLOCKDEV_ON_ERROR_STOP:
action = BDRV_ACTION_STOP;
action = BLOCK_ERROR_ACTION_STOP;
break;
case BLOCKDEV_ON_ERROR_REPORT:
action = BDRV_ACTION_REPORT;
action = BLOCK_ERROR_ACTION_REPORT;
break;
case BLOCKDEV_ON_ERROR_IGNORE:
action = BDRV_ACTION_IGNORE;
action = BLOCK_ERROR_ACTION_IGNORE;
break;
default:
abort();
}
bdrv_emit_qmp_error_event(job->bs, QEVENT_BLOCK_JOB_ERROR, action, is_read);
if (action == BDRV_ACTION_STOP) {
qapi_event_send_block_job_error(bdrv_get_device_name(job->bs),
is_read ? IO_OPERATION_TYPE_READ :
IO_OPERATION_TYPE_WRITE,
action, &error_abort);
if (action == BLOCK_ERROR_ACTION_STOP) {
block_job_pause(job);
block_job_iostatus_set_err(job, error);
if (bs != job->bs) {

View File

@@ -1,7 +1,37 @@
{ TARGET_FREEBSD_NR___getcwd, "__getcwd", NULL, NULL, NULL },
/*
* FreeBSD strace list
*
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
{ TARGET_FREEBSD_NR___acl_aclcheck_fd, "__acl_aclcheck_fd", "%s(%d, %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_aclcheck_file, "__acl_aclcheck_file", "%s(\"%s\", %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_aclcheck_link, "__acl_aclcheck_link", "%s(\"%s\", %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_delete_fd, "__acl_delete_fd", "%s(%d, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_delete_file, "__acl_delete_file", "%s(\"%s\", %d)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_delete_link, "__acl_delete_link", "%s(\"%s\", %d)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_get_fd, "__acl_get_fd", "%s(%d, %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_get_file, "__acl_get_file", "%s(\"%s\", %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_get_link, "__acl_get_link", "%s(\"%s\", %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_set_fd, "__acl_set_fd", "%s(%d, %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_set_file, "__acl_set_file", "%s(\"%s\", %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___acl_set_link, "__acl_set_link", "%s(\"%s\", %d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR___semctl, "__semctl", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR___syscall, "__syscall", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR___sysctl, "__sysctl", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR___sysctl, "__sysctl", NULL, print_sysctl, NULL },
{ TARGET_FREEBSD_NR__umtx_op, "_umtx_op", "%s(%#x, %d, %d, %#x, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_accept, "accept", "%s(%d,%#x,%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_access, "access", "%s(\"%s\",%#o)", NULL, NULL },
{ TARGET_FREEBSD_NR_acct, "acct", NULL, NULL, NULL },
@@ -20,24 +50,41 @@
{ TARGET_FREEBSD_NR_connect, "connect", "%s(%d,%#x,%d)", NULL, NULL },
{ TARGET_FREEBSD_NR_dup, "dup", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_dup2, "dup2", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_eaccess, "eaccess", "%s(\"%s\",%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_execve, "execve", NULL, print_execve, NULL },
{ TARGET_FREEBSD_NR_exit, "exit", "%s(%d)\n", NULL, NULL },
{ TARGET_FREEBSD_NR_extattrctl, "extattrctl", "%s(\"%s\", %d, \"%s\", %d, \"%s\"", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_delete_fd, "extattr_delete_fd", "%s(%d, %d, \"%s\")", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_delete_file, "extattr_delete_file", "%s(\"%s\", %d, \"%s\")", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_delete_link, "extattr_delete_link", "%s(\"%s\", %d, \"%s\")", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_get_fd, "extattr_get_fd", "%s(%d, %d, \"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_get_file, "extattr_get_file", "%s(\"%s\", %d, \"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_get_file, "extattr_get_link", "%s(\"%s\", %d, \"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_list_fd, "extattr_list_fd", "%s(%d, %d, %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_list_file, "extattr_list_file", "%s(\"%s\", %d, %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_list_link, "extattr_list_link", "%s(\"%s\", %d, %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_set_fd, "extattr_set_fd", "%s(%d, %d, \"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_set_file, "extattr_set_file", "%s(\"%s\", %d, \"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_extattr_set_link, "extattr_set_link", "%s(\"%s\", %d, \"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_fchdir, "fchdir", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fchflags, "fchflags", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fchmod, "fchmod", "%s(%d,%#o)", NULL, NULL },
{ TARGET_FREEBSD_NR_fchown, "fchown", "%s(\"%s\",%d,%d)", NULL, NULL },
{ TARGET_FREEBSD_NR_fchown, "fchown", "%s(%d,%d,%d)", NULL, NULL },
{ TARGET_FREEBSD_NR_fcntl, "fcntl", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fexecve, "fexecve", NULL, print_execve, NULL },
{ TARGET_FREEBSD_NR_fhopen, "fhopen", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fhstat, "fhstat", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fhstatfs, "fhstatfs", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_flock, "flock", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fork, "fork", "%s()", NULL, NULL },
{ TARGET_FREEBSD_NR_fpathconf, "fpathconf", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_fstat, "fstat", "%s(%d,%p)", NULL, NULL },
{ TARGET_FREEBSD_NR_fstatfs, "fstatfs", "%s(%d,%p)", NULL, NULL },
{ TARGET_FREEBSD_NR_fstat, "fstat", "%s(%d,%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_fstatat, "fstatat", "%s(%d,\"%s\", %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_fstatfs, "fstatfs", "%s(%d,%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_fsync, "fsync", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_ftruncate, "ftruncate", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_futimes, "futimes", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_getcontext, "getcontext", "%s(%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_getdirentries, "getdirentries", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_freebsd6_mmap, "freebsd6_mmap", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_getegid, "getegid", "%s()", NULL, NULL },
@@ -63,7 +110,7 @@
{ TARGET_FREEBSD_NR_getsockopt, "getsockopt", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_gettimeofday, "gettimeofday", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_getuid, "getuid", "%s()", NULL, NULL },
{ TARGET_FREEBSD_NR_ioctl, "ioctl", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_ioctl, "ioctl", NULL, print_ioctl, NULL },
{ TARGET_FREEBSD_NR_issetugid, "issetugid", "%s()", NULL, NULL },
{ TARGET_FREEBSD_NR_kevent, "kevent", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_kill, "kill", NULL, NULL, NULL },
@@ -72,6 +119,7 @@
{ TARGET_FREEBSD_NR_lchown, "lchown", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_link, "link", "%s(\"%s\",\"%s\")", NULL, NULL },
{ TARGET_FREEBSD_NR_listen, "listen", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_lpathconf, "lpathconf", "%s(\"%s\", %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_lseek, "lseek", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_lstat, "lstat", "%s(\"%s\",%p)", NULL, NULL },
{ TARGET_FREEBSD_NR_madvise, "madvise", NULL, NULL, NULL },
@@ -96,7 +144,8 @@
{ TARGET_FREEBSD_NR_nanosleep, "nanosleep", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_nfssvc, "nfssvc", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_open, "open", "%s(\"%s\",%#x,%#o)", NULL, NULL },
{ TARGET_FREEBSD_NR_pathconf, "pathconf", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_openat, "openat", "%s(%d, \"%s\",%#x,%#o)", NULL, NULL },
{ TARGET_FREEBSD_NR_pathconf, "pathconf", "%s(\"%s\", %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_pipe, "pipe", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_poll, "poll", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_pread, "pread", NULL, NULL, NULL },
@@ -116,6 +165,7 @@
{ TARGET_FREEBSD_NR_revoke, "revoke", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_rfork, "rfork", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_rmdir, "rmdir", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_rtprio_thread, "rtprio_thread", "%s(%d, %d, %p)", NULL, NULL },
{ TARGET_FREEBSD_NR_sbrk, "sbrk", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sched_yield, "sched_yield", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_select, "select", NULL, NULL, NULL },
@@ -123,6 +173,7 @@
{ TARGET_FREEBSD_NR_semop, "semop", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sendmsg, "sendmsg", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sendto, "sendto", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_setcontext, "setcontext", "%s(%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_setegid, "setegid", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_seteuid, "seteuid", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_setgid, "setgid", NULL, NULL, NULL },
@@ -151,7 +202,7 @@
{ TARGET_FREEBSD_NR_sigprocmask, "sigprocmask", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sigreturn, "sigreturn", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sigsuspend, "sigsuspend", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_socket, "socket", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_socket, "socket", "%s(%d,%d,%d)", NULL, NULL },
{ TARGET_FREEBSD_NR_socketpair, "socketpair", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sstk, "sstk", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_stat, "stat", "%s(\"%s\",%p)", NULL, NULL },
@@ -160,6 +211,15 @@
{ TARGET_FREEBSD_NR_sync, "sync", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_sysarch, "sysarch", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_syscall, "syscall", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_thr_create, "thr_create", "%s(%#x, %#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_exit, "thr_exit", "%s(%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_kill, "thr_kill", "%s(%d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_kill2, "thr_kill2", "%s(%d, %d, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_new, "thr_new", "%s(%#x, %d)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_self, "thr_self", "%s(%#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_set_name, "thr_set_name", "%s(%d, \"%s\")", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_suspend, "thr_suspend", "%s(%d, %#x)", NULL, NULL },
{ TARGET_FREEBSD_NR_thr_wake, "thr_wake", "%s(%d)", NULL, NULL },
{ TARGET_FREEBSD_NR_truncate, "truncate", NULL, NULL, NULL },
{ TARGET_FREEBSD_NR_umask, "umask", "%s(%#o)", NULL, NULL },
{ TARGET_FREEBSD_NR_unlink, "unlink", "%s(\"%s\")", NULL, NULL },

View File

@@ -1,373 +1,450 @@
/*
* System call numbers.
*
* $FreeBSD: src/sys/sys/syscall.h,v 1.224 2008/08/24 21:23:08 rwatson Exp $
* created from FreeBSD: head/sys/kern/syscalls.master 182123 2008-08-24 21:20:35Z rwatson
* created from FreeBSD: releng/9.1/sys/kern/syscalls.master 229723
* 2012-01-06 19:29:16Z jhb
*/
#define TARGET_FREEBSD_NR_syscall 0
#define TARGET_FREEBSD_NR_exit 1
#define TARGET_FREEBSD_NR_fork 2
#define TARGET_FREEBSD_NR_read 3
#define TARGET_FREEBSD_NR_write 4
#define TARGET_FREEBSD_NR_open 5
#define TARGET_FREEBSD_NR_close 6
#define TARGET_FREEBSD_NR_wait4 7
#define TARGET_FREEBSD_NR_link 9
#define TARGET_FREEBSD_NR_unlink 10
#define TARGET_FREEBSD_NR_chdir 12
#define TARGET_FREEBSD_NR_fchdir 13
#define TARGET_FREEBSD_NR_mknod 14
#define TARGET_FREEBSD_NR_chmod 15
#define TARGET_FREEBSD_NR_chown 16
#define TARGET_FREEBSD_NR_break 17
#define TARGET_FREEBSD_NR_freebsd4_getfsstat 18
#define TARGET_FREEBSD_NR_getpid 20
#define TARGET_FREEBSD_NR_mount 21
#define TARGET_FREEBSD_NR_unmount 22
#define TARGET_FREEBSD_NR_setuid 23
#define TARGET_FREEBSD_NR_getuid 24
#define TARGET_FREEBSD_NR_geteuid 25
#define TARGET_FREEBSD_NR_ptrace 26
#define TARGET_FREEBSD_NR_recvmsg 27
#define TARGET_FREEBSD_NR_sendmsg 28
#define TARGET_FREEBSD_NR_recvfrom 29
#define TARGET_FREEBSD_NR_accept 30
#define TARGET_FREEBSD_NR_getpeername 31
#define TARGET_FREEBSD_NR_getsockname 32
#define TARGET_FREEBSD_NR_access 33
#define TARGET_FREEBSD_NR_chflags 34
#define TARGET_FREEBSD_NR_fchflags 35
#define TARGET_FREEBSD_NR_sync 36
#define TARGET_FREEBSD_NR_kill 37
#define TARGET_FREEBSD_NR_getppid 39
#define TARGET_FREEBSD_NR_dup 41
#define TARGET_FREEBSD_NR_pipe 42
#define TARGET_FREEBSD_NR_getegid 43
#define TARGET_FREEBSD_NR_profil 44
#define TARGET_FREEBSD_NR_ktrace 45
#define TARGET_FREEBSD_NR_getgid 47
#define TARGET_FREEBSD_NR_getlogin 49
#define TARGET_FREEBSD_NR_setlogin 50
#define TARGET_FREEBSD_NR_acct 51
#define TARGET_FREEBSD_NR_sigaltstack 53
#define TARGET_FREEBSD_NR_ioctl 54
#define TARGET_FREEBSD_NR_reboot 55
#define TARGET_FREEBSD_NR_revoke 56
#define TARGET_FREEBSD_NR_symlink 57
#define TARGET_FREEBSD_NR_readlink 58
#define TARGET_FREEBSD_NR_execve 59
#define TARGET_FREEBSD_NR_umask 60
#define TARGET_FREEBSD_NR_chroot 61
#define TARGET_FREEBSD_NR_msync 65
#define TARGET_FREEBSD_NR_vfork 66
#define TARGET_FREEBSD_NR_sbrk 69
#define TARGET_FREEBSD_NR_sstk 70
#define TARGET_FREEBSD_NR_vadvise 72
#define TARGET_FREEBSD_NR_munmap 73
#define TARGET_FREEBSD_NR_mprotect 74
#define TARGET_FREEBSD_NR_madvise 75
#define TARGET_FREEBSD_NR_mincore 78
#define TARGET_FREEBSD_NR_getgroups 79
#define TARGET_FREEBSD_NR_setgroups 80
#define TARGET_FREEBSD_NR_getpgrp 81
#define TARGET_FREEBSD_NR_setpgid 82
#define TARGET_FREEBSD_NR_setitimer 83
#define TARGET_FREEBSD_NR_swapon 85
#define TARGET_FREEBSD_NR_getitimer 86
#define TARGET_FREEBSD_NR_getdtablesize 89
#define TARGET_FREEBSD_NR_dup2 90
#define TARGET_FREEBSD_NR_fcntl 92
#define TARGET_FREEBSD_NR_select 93
#define TARGET_FREEBSD_NR_fsync 95
#define TARGET_FREEBSD_NR_setpriority 96
#define TARGET_FREEBSD_NR_socket 97
#define TARGET_FREEBSD_NR_connect 98
#define TARGET_FREEBSD_NR_getpriority 100
#define TARGET_FREEBSD_NR_bind 104
#define TARGET_FREEBSD_NR_setsockopt 105
#define TARGET_FREEBSD_NR_listen 106
#define TARGET_FREEBSD_NR_gettimeofday 116
#define TARGET_FREEBSD_NR_getrusage 117
#define TARGET_FREEBSD_NR_getsockopt 118
#define TARGET_FREEBSD_NR_readv 120
#define TARGET_FREEBSD_NR_writev 121
#define TARGET_FREEBSD_NR_settimeofday 122
#define TARGET_FREEBSD_NR_fchown 123
#define TARGET_FREEBSD_NR_fchmod 124
#define TARGET_FREEBSD_NR_setreuid 126
#define TARGET_FREEBSD_NR_setregid 127
#define TARGET_FREEBSD_NR_rename 128
#define TARGET_FREEBSD_NR_flock 131
#define TARGET_FREEBSD_NR_mkfifo 132
#define TARGET_FREEBSD_NR_sendto 133
#define TARGET_FREEBSD_NR_shutdown 134
#define TARGET_FREEBSD_NR_socketpair 135
#define TARGET_FREEBSD_NR_mkdir 136
#define TARGET_FREEBSD_NR_rmdir 137
#define TARGET_FREEBSD_NR_utimes 138
#define TARGET_FREEBSD_NR_adjtime 140
#define TARGET_FREEBSD_NR_setsid 147
#define TARGET_FREEBSD_NR_quotactl 148
#define TARGET_FREEBSD_NR_nlm_syscall 154
#define TARGET_FREEBSD_NR_nfssvc 155
#define TARGET_FREEBSD_NR_freebsd4_statfs 157
#define TARGET_FREEBSD_NR_freebsd4_fstatfs 158
#define TARGET_FREEBSD_NR_lgetfh 160
#define TARGET_FREEBSD_NR_getfh 161
#define TARGET_FREEBSD_NR_getdomainname 162
#define TARGET_FREEBSD_NR_setdomainname 163
#define TARGET_FREEBSD_NR_uname 164
#define TARGET_FREEBSD_NR_sysarch 165
#define TARGET_FREEBSD_NR_rtprio 166
#define TARGET_FREEBSD_NR_semsys 169
#define TARGET_FREEBSD_NR_msgsys 170
#define TARGET_FREEBSD_NR_shmsys 171
#define TARGET_FREEBSD_NR_freebsd6_pread 173
#define TARGET_FREEBSD_NR_freebsd6_pwrite 174
#define TARGET_FREEBSD_NR_setfib 175
#define TARGET_FREEBSD_NR_ntp_adjtime 176
#define TARGET_FREEBSD_NR_setgid 181
#define TARGET_FREEBSD_NR_setegid 182
#define TARGET_FREEBSD_NR_seteuid 183
#define TARGET_FREEBSD_NR_stat 188
#define TARGET_FREEBSD_NR_fstat 189
#define TARGET_FREEBSD_NR_lstat 190
#define TARGET_FREEBSD_NR_pathconf 191
#define TARGET_FREEBSD_NR_fpathconf 192
#define TARGET_FREEBSD_NR_getrlimit 194
#define TARGET_FREEBSD_NR_setrlimit 195
#define TARGET_FREEBSD_NR_getdirentries 196
#define TARGET_FREEBSD_NR_freebsd6_mmap 197
#define TARGET_FREEBSD_NR___syscall 198
#define TARGET_FREEBSD_NR_freebsd6_lseek 199
#define TARGET_FREEBSD_NR_freebsd6_truncate 200
#define TARGET_FREEBSD_NR_freebsd6_ftruncate 201
#define TARGET_FREEBSD_NR___sysctl 202
#define TARGET_FREEBSD_NR_mlock 203
#define TARGET_FREEBSD_NR_munlock 204
#define TARGET_FREEBSD_NR_undelete 205
#define TARGET_FREEBSD_NR_futimes 206
#define TARGET_FREEBSD_NR_getpgid 207
#define TARGET_FREEBSD_NR_poll 209
#define TARGET_FREEBSD_NR___semctl 220
#define TARGET_FREEBSD_NR_semget 221
#define TARGET_FREEBSD_NR_semop 222
#define TARGET_FREEBSD_NR_msgctl 224
#define TARGET_FREEBSD_NR_msgget 225
#define TARGET_FREEBSD_NR_msgsnd 226
#define TARGET_FREEBSD_NR_msgrcv 227
#define TARGET_FREEBSD_NR_shmat 228
#define TARGET_FREEBSD_NR_shmctl 229
#define TARGET_FREEBSD_NR_shmdt 230
#define TARGET_FREEBSD_NR_shmget 231
#define TARGET_FREEBSD_NR_clock_gettime 232
#define TARGET_FREEBSD_NR_clock_settime 233
#define TARGET_FREEBSD_NR_clock_getres 234
#define TARGET_FREEBSD_NR_ktimer_create 235
#define TARGET_FREEBSD_NR_ktimer_delete 236
#define TARGET_FREEBSD_NR_ktimer_settime 237
#define TARGET_FREEBSD_NR_ktimer_gettime 238
#define TARGET_FREEBSD_NR_ktimer_getoverrun 239
#define TARGET_FREEBSD_NR_nanosleep 240
#define TARGET_FREEBSD_NR_ntp_gettime 248
#define TARGET_FREEBSD_NR_minherit 250
#define TARGET_FREEBSD_NR_rfork 251
#define TARGET_FREEBSD_NR_openbsd_poll 252
#define TARGET_FREEBSD_NR_issetugid 253
#define TARGET_FREEBSD_NR_lchown 254
#define TARGET_FREEBSD_NR_aio_read 255
#define TARGET_FREEBSD_NR_aio_write 256
#define TARGET_FREEBSD_NR_lio_listio 257
#define TARGET_FREEBSD_NR_getdents 272
#define TARGET_FREEBSD_NR_lchmod 274
#define TARGET_FREEBSD_NR_netbsd_lchown 275
#define TARGET_FREEBSD_NR_lutimes 276
#define TARGET_FREEBSD_NR_netbsd_msync 277
#define TARGET_FREEBSD_NR_nstat 278
#define TARGET_FREEBSD_NR_nfstat 279
#define TARGET_FREEBSD_NR_nlstat 280
#define TARGET_FREEBSD_NR_preadv 289
#define TARGET_FREEBSD_NR_pwritev 290
#define TARGET_FREEBSD_NR_freebsd4_fhstatfs 297
#define TARGET_FREEBSD_NR_fhopen 298
#define TARGET_FREEBSD_NR_fhstat 299
#define TARGET_FREEBSD_NR_modnext 300
#define TARGET_FREEBSD_NR_modstat 301
#define TARGET_FREEBSD_NR_modfnext 302
#define TARGET_FREEBSD_NR_modfind 303
#define TARGET_FREEBSD_NR_kldload 304
#define TARGET_FREEBSD_NR_kldunload 305
#define TARGET_FREEBSD_NR_kldfind 306
#define TARGET_FREEBSD_NR_kldnext 307
#define TARGET_FREEBSD_NR_kldstat 308
#define TARGET_FREEBSD_NR_kldfirstmod 309
#define TARGET_FREEBSD_NR_getsid 310
#define TARGET_FREEBSD_NR_setresuid 311
#define TARGET_FREEBSD_NR_setresgid 312
#define TARGET_FREEBSD_NR_aio_return 314
#define TARGET_FREEBSD_NR_aio_suspend 315
#define TARGET_FREEBSD_NR_aio_cancel 316
#define TARGET_FREEBSD_NR_aio_error 317
#define TARGET_FREEBSD_NR_oaio_read 318
#define TARGET_FREEBSD_NR_oaio_write 319
#define TARGET_FREEBSD_NR_olio_listio 320
#define TARGET_FREEBSD_NR_yield 321
#define TARGET_FREEBSD_NR_mlockall 324
#define TARGET_FREEBSD_NR_munlockall 325
#define TARGET_FREEBSD_NR___getcwd 326
#define TARGET_FREEBSD_NR_sched_setparam 327
#define TARGET_FREEBSD_NR_sched_getparam 328
#define TARGET_FREEBSD_NR_sched_setscheduler 329
#define TARGET_FREEBSD_NR_sched_getscheduler 330
#define TARGET_FREEBSD_NR_sched_yield 331
#define TARGET_FREEBSD_NR_sched_get_priority_max 332
#define TARGET_FREEBSD_NR_sched_get_priority_min 333
#define TARGET_FREEBSD_NR_sched_rr_get_interval 334
#define TARGET_FREEBSD_NR_utrace 335
#define TARGET_FREEBSD_NR_freebsd4_sendfile 336
#define TARGET_FREEBSD_NR_kldsym 337
#define TARGET_FREEBSD_NR_jail 338
#define TARGET_FREEBSD_NR_sigprocmask 340
#define TARGET_FREEBSD_NR_sigsuspend 341
#define TARGET_FREEBSD_NR_freebsd4_sigaction 342
#define TARGET_FREEBSD_NR_sigpending 343
#define TARGET_FREEBSD_NR_freebsd4_sigreturn 344
#define TARGET_FREEBSD_NR_sigtimedwait 345
#define TARGET_FREEBSD_NR_sigwaitinfo 346
#define TARGET_FREEBSD_NR___acl_get_file 347
#define TARGET_FREEBSD_NR___acl_set_file 348
#define TARGET_FREEBSD_NR___acl_get_fd 349
#define TARGET_FREEBSD_NR___acl_set_fd 350
#define TARGET_FREEBSD_NR___acl_delete_file 351
#define TARGET_FREEBSD_NR___acl_delete_fd 352
#define TARGET_FREEBSD_NR___acl_aclcheck_file 353
#define TARGET_FREEBSD_NR___acl_aclcheck_fd 354
#define TARGET_FREEBSD_NR_extattrctl 355
#define TARGET_FREEBSD_NR_extattr_set_file 356
#define TARGET_FREEBSD_NR_extattr_get_file 357
#define TARGET_FREEBSD_NR_extattr_delete_file 358
#define TARGET_FREEBSD_NR_aio_waitcomplete 359
#define TARGET_FREEBSD_NR_getresuid 360
#define TARGET_FREEBSD_NR_getresgid 361
#define TARGET_FREEBSD_NR_kqueue 362
#define TARGET_FREEBSD_NR_kevent 363
#define TARGET_FREEBSD_NR_extattr_set_fd 371
#define TARGET_FREEBSD_NR_extattr_get_fd 372
#define TARGET_FREEBSD_NR_extattr_delete_fd 373
#define TARGET_FREEBSD_NR___setugid 374
#define TARGET_FREEBSD_NR_nfsclnt 375
#define TARGET_FREEBSD_NR_eaccess 376
#define TARGET_FREEBSD_NR_nmount 378
#define TARGET_FREEBSD_NR___mac_get_proc 384
#define TARGET_FREEBSD_NR___mac_set_proc 385
#define TARGET_FREEBSD_NR___mac_get_fd 386
#define TARGET_FREEBSD_NR___mac_get_file 387
#define TARGET_FREEBSD_NR___mac_set_fd 388
#define TARGET_FREEBSD_NR___mac_set_file 389
#define TARGET_FREEBSD_NR_kenv 390
#define TARGET_FREEBSD_NR_lchflags 391
#define TARGET_FREEBSD_NR_uuidgen 392
#define TARGET_FREEBSD_NR_sendfile 393
#define TARGET_FREEBSD_NR_mac_syscall 394
#define TARGET_FREEBSD_NR_getfsstat 395
#define TARGET_FREEBSD_NR_statfs 396
#define TARGET_FREEBSD_NR_fstatfs 397
#define TARGET_FREEBSD_NR_fhstatfs 398
#define TARGET_FREEBSD_NR_ksem_close 400
#define TARGET_FREEBSD_NR_ksem_post 401
#define TARGET_FREEBSD_NR_ksem_wait 402
#define TARGET_FREEBSD_NR_ksem_trywait 403
#define TARGET_FREEBSD_NR_ksem_init 404
#define TARGET_FREEBSD_NR_ksem_open 405
#define TARGET_FREEBSD_NR_ksem_unlink 406
#define TARGET_FREEBSD_NR_ksem_getvalue 407
#define TARGET_FREEBSD_NR_ksem_destroy 408
#define TARGET_FREEBSD_NR___mac_get_pid 409
#define TARGET_FREEBSD_NR___mac_get_link 410
#define TARGET_FREEBSD_NR___mac_set_link 411
#define TARGET_FREEBSD_NR_extattr_set_link 412
#define TARGET_FREEBSD_NR_extattr_get_link 413
#define TARGET_FREEBSD_NR_extattr_delete_link 414
#define TARGET_FREEBSD_NR___mac_execve 415
#define TARGET_FREEBSD_NR_sigaction 416
#define TARGET_FREEBSD_NR_sigreturn 417
#define TARGET_FREEBSD_NR_getcontext 421
#define TARGET_FREEBSD_NR_setcontext 422
#define TARGET_FREEBSD_NR_swapcontext 423
#define TARGET_FREEBSD_NR_swapoff 424
#define TARGET_FREEBSD_NR___acl_get_link 425
#define TARGET_FREEBSD_NR___acl_set_link 426
#define TARGET_FREEBSD_NR___acl_delete_link 427
#define TARGET_FREEBSD_NR___acl_aclcheck_link 428
#define TARGET_FREEBSD_NR_sigwait 429
#define TARGET_FREEBSD_NR_thr_create 430
#define TARGET_FREEBSD_NR_thr_exit 431
#define TARGET_FREEBSD_NR_thr_self 432
#define TARGET_FREEBSD_NR_thr_kill 433
#define TARGET_FREEBSD_NR__umtx_lock 434
#define TARGET_FREEBSD_NR__umtx_unlock 435
#define TARGET_FREEBSD_NR_jail_attach 436
#define TARGET_FREEBSD_NR_extattr_list_fd 437
#define TARGET_FREEBSD_NR_extattr_list_file 438
#define TARGET_FREEBSD_NR_extattr_list_link 439
#define TARGET_FREEBSD_NR_ksem_timedwait 441
#define TARGET_FREEBSD_NR_thr_suspend 442
#define TARGET_FREEBSD_NR_thr_wake 443
#define TARGET_FREEBSD_NR_kldunloadf 444
#define TARGET_FREEBSD_NR_audit 445
#define TARGET_FREEBSD_NR_auditon 446
#define TARGET_FREEBSD_NR_getauid 447
#define TARGET_FREEBSD_NR_setauid 448
#define TARGET_FREEBSD_NR_getaudit 449
#define TARGET_FREEBSD_NR_setaudit 450
#define TARGET_FREEBSD_NR_getaudit_addr 451
#define TARGET_FREEBSD_NR_setaudit_addr 452
#define TARGET_FREEBSD_NR_auditctl 453
#define TARGET_FREEBSD_NR__umtx_op 454
#define TARGET_FREEBSD_NR_thr_new 455
#define TARGET_FREEBSD_NR_sigqueue 456
#define TARGET_FREEBSD_NR_kmq_open 457
#define TARGET_FREEBSD_NR_kmq_setattr 458
#define TARGET_FREEBSD_NR_kmq_timedreceive 459
#define TARGET_FREEBSD_NR_kmq_timedsend 460
#define TARGET_FREEBSD_NR_kmq_notify 461
#define TARGET_FREEBSD_NR_kmq_unlink 462
#define TARGET_FREEBSD_NR_abort2 463
#define TARGET_FREEBSD_NR_thr_set_name 464
#define TARGET_FREEBSD_NR_aio_fsync 465
#define TARGET_FREEBSD_NR_rtprio_thread 466
#define TARGET_FREEBSD_NR_sctp_peeloff 471
#define TARGET_FREEBSD_NR_sctp_generic_sendmsg 472
#define TARGET_FREEBSD_NR_sctp_generic_sendmsg_iov 473
#define TARGET_FREEBSD_NR_sctp_generic_recvmsg 474
#define TARGET_FREEBSD_NR_pread 475
#define TARGET_FREEBSD_NR_pwrite 476
#define TARGET_FREEBSD_NR_mmap 477
#define TARGET_FREEBSD_NR_lseek 478
#define TARGET_FREEBSD_NR_truncate 479
#define TARGET_FREEBSD_NR_ftruncate 480
#define TARGET_FREEBSD_NR_thr_kill2 481
#define TARGET_FREEBSD_NR_shm_open 482
#define TARGET_FREEBSD_NR_shm_unlink 483
#define TARGET_FREEBSD_NR_cpuset 484
#define TARGET_FREEBSD_NR_cpuset_setid 485
#define TARGET_FREEBSD_NR_cpuset_getid 486
#define TARGET_FREEBSD_NR_cpuset_getaffinity 487
#define TARGET_FREEBSD_NR_cpuset_setaffinity 488
#define TARGET_FREEBSD_NR_faccessat 489
#define TARGET_FREEBSD_NR_fchmodat 490
#define TARGET_FREEBSD_NR_fchownat 491
#define TARGET_FREEBSD_NR_fexecve 492
#define TARGET_FREEBSD_NR_fstatat 493
#define TARGET_FREEBSD_NR_futimesat 494
#define TARGET_FREEBSD_NR_linkat 495
#define TARGET_FREEBSD_NR_mkdirat 496
#define TARGET_FREEBSD_NR_mkfifoat 497
#define TARGET_FREEBSD_NR_mknodat 498
#define TARGET_FREEBSD_NR_openat 499
#define TARGET_FREEBSD_NR_readlinkat 500
#define TARGET_FREEBSD_NR_renameat 501
#define TARGET_FREEBSD_NR_symlinkat 502
#define TARGET_FREEBSD_NR_unlinkat 503
#define TARGET_FREEBSD_NR_posix_openpt 504
#define TARGET_FREEBSD_NR_syscall 0
#define TARGET_FREEBSD_NR_exit 1
#define TARGET_FREEBSD_NR_fork 2
#define TARGET_FREEBSD_NR_read 3
#define TARGET_FREEBSD_NR_write 4
#define TARGET_FREEBSD_NR_open 5
#define TARGET_FREEBSD_NR_close 6
#define TARGET_FREEBSD_NR_wait4 7
/* 8 is old creat */
#define TARGET_FREEBSD_NR_link 9
#define TARGET_FREEBSD_NR_unlink 10
/* 11 is obsolete execv */
#define TARGET_FREEBSD_NR_chdir 12
#define TARGET_FREEBSD_NR_fchdir 13
#define TARGET_FREEBSD_NR_mknod 14
#define TARGET_FREEBSD_NR_chmod 15
#define TARGET_FREEBSD_NR_chown 16
#define TARGET_FREEBSD_NR_break 17
#define TARGET_FREEBSD_NR_freebsd4_getfsstat 18
/* 19 is old lseek */
#define TARGET_FREEBSD_NR_getpid 20
#define TARGET_FREEBSD_NR_mount 21
#define TARGET_FREEBSD_NR_unmount 22
#define TARGET_FREEBSD_NR_setuid 23
#define TARGET_FREEBSD_NR_getuid 24
#define TARGET_FREEBSD_NR_geteuid 25
#define TARGET_FREEBSD_NR_ptrace 26
#define TARGET_FREEBSD_NR_recvmsg 27
#define TARGET_FREEBSD_NR_sendmsg 28
#define TARGET_FREEBSD_NR_recvfrom 29
#define TARGET_FREEBSD_NR_accept 30
#define TARGET_FREEBSD_NR_getpeername 31
#define TARGET_FREEBSD_NR_getsockname 32
#define TARGET_FREEBSD_NR_access 33
#define TARGET_FREEBSD_NR_chflags 34
#define TARGET_FREEBSD_NR_fchflags 35
#define TARGET_FREEBSD_NR_sync 36
#define TARGET_FREEBSD_NR_kill 37
/* 38 is old stat */
#define TARGET_FREEBSD_NR_getppid 39
/* 40 is old lstat */
#define TARGET_FREEBSD_NR_dup 41
#define TARGET_FREEBSD_NR_pipe 42
#define TARGET_FREEBSD_NR_getegid 43
#define TARGET_FREEBSD_NR_profil 44
#define TARGET_FREEBSD_NR_ktrace 45
/* 46 is old sigaction */
#define TARGET_FREEBSD_NR_getgid 47
/* 48 is old sigprocmask */
#define TARGET_FREEBSD_NR_getlogin 49
#define TARGET_FREEBSD_NR_setlogin 50
#define TARGET_FREEBSD_NR_acct 51
/* 52 is old sigpending */
#define TARGET_FREEBSD_NR_sigaltstack 53
#define TARGET_FREEBSD_NR_ioctl 54
#define TARGET_FREEBSD_NR_reboot 55
#define TARGET_FREEBSD_NR_revoke 56
#define TARGET_FREEBSD_NR_symlink 57
#define TARGET_FREEBSD_NR_readlink 58
#define TARGET_FREEBSD_NR_execve 59
#define TARGET_FREEBSD_NR_umask 60
#define TARGET_FREEBSD_NR_chroot 61
/* 62 is old fstat */
/* 63 is old getkerninfo */
/* 64 is old getpagesize */
#define TARGET_FREEBSD_NR_msync 65
#define TARGET_FREEBSD_NR_vfork 66
/* 67 is obsolete vread */
/* 68 is obsolete vwrite */
#define TARGET_FREEBSD_NR_sbrk 69
#define TARGET_FREEBSD_NR_sstk 70
/* 71 is old mmap */
#define TARGET_FREEBSD_NR_vadvise 72
#define TARGET_FREEBSD_NR_munmap 73
#define TARGET_FREEBSD_NR_mprotect 74
#define TARGET_FREEBSD_NR_madvise 75
/* 76 is obsolete vhangup */
/* 77 is obsolete vlimit */
#define TARGET_FREEBSD_NR_mincore 78
#define TARGET_FREEBSD_NR_getgroups 79
#define TARGET_FREEBSD_NR_setgroups 80
#define TARGET_FREEBSD_NR_getpgrp 81
#define TARGET_FREEBSD_NR_setpgid 82
#define TARGET_FREEBSD_NR_setitimer 83
/* 84 is old wait */
#define TARGET_FREEBSD_NR_swapon 85
#define TARGET_FREEBSD_NR_getitimer 86
/* 87 is old gethostname */
/* 88 is old sethostname */
#define TARGET_FREEBSD_NR_getdtablesize 89
#define TARGET_FREEBSD_NR_dup2 90
#define TARGET_FREEBSD_NR_fcntl 92
#define TARGET_FREEBSD_NR_select 93
#define TARGET_FREEBSD_NR_fsync 95
#define TARGET_FREEBSD_NR_setpriority 96
#define TARGET_FREEBSD_NR_socket 97
#define TARGET_FREEBSD_NR_connect 98
/* 99 is old accept */
#define TARGET_FREEBSD_NR_getpriority 100
/* 101 is old send */
/* 102 is old recv */
/* 103 is old sigreturn */
#define TARGET_FREEBSD_NR_bind 104
#define TARGET_FREEBSD_NR_setsockopt 105
#define TARGET_FREEBSD_NR_listen 106
/* 107 is obsolete vtimes */
/* 108 is old sigvec */
/* 109 is old sigblock */
/* 110 is old sigsetmask */
/* 111 is old sigsuspend */
/* 112 is old sigstack */
/* 113 is old recvmsg */
/* 114 is old sendmsg */
/* 115 is obsolete vtrace */
#define TARGET_FREEBSD_NR_gettimeofday 116
#define TARGET_FREEBSD_NR_getrusage 117
#define TARGET_FREEBSD_NR_getsockopt 118
#define TARGET_FREEBSD_NR_readv 120
#define TARGET_FREEBSD_NR_writev 121
#define TARGET_FREEBSD_NR_settimeofday 122
#define TARGET_FREEBSD_NR_fchown 123
#define TARGET_FREEBSD_NR_fchmod 124
/* 125 is old recvfrom */
#define TARGET_FREEBSD_NR_setreuid 126
#define TARGET_FREEBSD_NR_setregid 127
#define TARGET_FREEBSD_NR_rename 128
/* 129 is old truncate */
/* 130 is old ftruncate */
#define TARGET_FREEBSD_NR_flock 131
#define TARGET_FREEBSD_NR_mkfifo 132
#define TARGET_FREEBSD_NR_sendto 133
#define TARGET_FREEBSD_NR_shutdown 134
#define TARGET_FREEBSD_NR_socketpair 135
#define TARGET_FREEBSD_NR_mkdir 136
#define TARGET_FREEBSD_NR_rmdir 137
#define TARGET_FREEBSD_NR_utimes 138
/* 139 is obsolete 4.2 sigreturn */
#define TARGET_FREEBSD_NR_adjtime 140
/* 141 is old getpeername */
/* 142 is old gethostid */
/* 143 is old sethostid */
/* 144 is old getrlimit */
/* 145 is old setrlimit */
/* 146 is old killpg */
#define TARGET_FREEBSD_NR_killpg 146 /* COMPAT */
#define TARGET_FREEBSD_NR_setsid 147
#define TARGET_FREEBSD_NR_quotactl 148
/* 149 is old quota */
/* 150 is old getsockname */
#define TARGET_FREEBSD_NR_nlm_syscall 154
#define TARGET_FREEBSD_NR_nfssvc 155
/* 156 is old getdirentries */
#define TARGET_FREEBSD_NR_freebsd4_statfs 157
#define TARGET_FREEBSD_NR_freebsd4_fstatfs 158
#define TARGET_FREEBSD_NR_lgetfh 160
#define TARGET_FREEBSD_NR_getfh 161
#define TARGET_FREEBSD_NR_freebsd4_getdomainname 162
#define TARGET_FREEBSD_NR_freebsd4_setdomainname 163
#define TARGET_FREEBSD_NR_freebsd4_uname 164
#define TARGET_FREEBSD_NR_sysarch 165
#define TARGET_FREEBSD_NR_rtprio 166
#define TARGET_FREEBSD_NR_semsys 169
#define TARGET_FREEBSD_NR_msgsys 170
#define TARGET_FREEBSD_NR_shmsys 171
#define TARGET_FREEBSD_NR_freebsd6_pread 173
#define TARGET_FREEBSD_NR_freebsd6_pwrite 174
#define TARGET_FREEBSD_NR_setfib 175
#define TARGET_FREEBSD_NR_ntp_adjtime 176
#define TARGET_FREEBSD_NR_setgid 181
#define TARGET_FREEBSD_NR_setegid 182
#define TARGET_FREEBSD_NR_seteuid 183
#define TARGET_FREEBSD_NR_stat 188
#define TARGET_FREEBSD_NR_fstat 189
#define TARGET_FREEBSD_NR_lstat 190
#define TARGET_FREEBSD_NR_pathconf 191
#define TARGET_FREEBSD_NR_fpathconf 192
#define TARGET_FREEBSD_NR_getrlimit 194
#define TARGET_FREEBSD_NR_setrlimit 195
#define TARGET_FREEBSD_NR_getdirentries 196
#define TARGET_FREEBSD_NR_freebsd6_mmap 197
#define TARGET_FREEBSD_NR___syscall 198
#define TARGET_FREEBSD_NR_freebsd6_lseek 199
#define TARGET_FREEBSD_NR_freebsd6_truncate 200
#define TARGET_FREEBSD_NR_freebsd6_ftruncate 201
#define TARGET_FREEBSD_NR___sysctl 202
#define TARGET_FREEBSD_NR_mlock 203
#define TARGET_FREEBSD_NR_munlock 204
#define TARGET_FREEBSD_NR_undelete 205
#define TARGET_FREEBSD_NR_futimes 206
#define TARGET_FREEBSD_NR_getpgid 207
#define TARGET_FREEBSD_NR_poll 209
#define TARGET_FREEBSD_NR_freebsd7___semctl 220
#define TARGET_FREEBSD_NR_semget 221
#define TARGET_FREEBSD_NR_semop 222
#define TARGET_FREEBSD_NR_freebsd7_msgctl 224
#define TARGET_FREEBSD_NR_msgget 225
#define TARGET_FREEBSD_NR_msgsnd 226
#define TARGET_FREEBSD_NR_msgrcv 227
#define TARGET_FREEBSD_NR_shmat 228
#define TARGET_FREEBSD_NR_freebsd7_shmctl 229
#define TARGET_FREEBSD_NR_shmdt 230
#define TARGET_FREEBSD_NR_shmget 231
#define TARGET_FREEBSD_NR_clock_gettime 232
#define TARGET_FREEBSD_NR_clock_settime 233
#define TARGET_FREEBSD_NR_clock_getres 234
#define TARGET_FREEBSD_NR_ktimer_create 235
#define TARGET_FREEBSD_NR_ktimer_delete 236
#define TARGET_FREEBSD_NR_ktimer_settime 237
#define TARGET_FREEBSD_NR_ktimer_gettime 238
#define TARGET_FREEBSD_NR_ktimer_getoverrun 239
#define TARGET_FREEBSD_NR_nanosleep 240
#define TARGET_FREEBSD_NR_ntp_gettime 248
#define TARGET_FREEBSD_NR_minherit 250
#define TARGET_FREEBSD_NR_rfork 251
#define TARGET_FREEBSD_NR_openbsd_poll 252
#define TARGET_FREEBSD_NR_issetugid 253
#define TARGET_FREEBSD_NR_lchown 254
#define TARGET_FREEBSD_NR_aio_read 255
#define TARGET_FREEBSD_NR_aio_write 256
#define TARGET_FREEBSD_NR_lio_listio 257
#define TARGET_FREEBSD_NR_getdents 272
#define TARGET_FREEBSD_NR_lchmod 274
#define TARGET_FREEBSD_NR_netbsd_lchown 275
#define TARGET_FREEBSD_NR_lutimes 276
#define TARGET_FREEBSD_NR_netbsd_msync 277
#define TARGET_FREEBSD_NR_nstat 278
#define TARGET_FREEBSD_NR_nfstat 279
#define TARGET_FREEBSD_NR_nlstat 280
#define TARGET_FREEBSD_NR_preadv 289
#define TARGET_FREEBSD_NR_pwritev 290
#define TARGET_FREEBSD_NR_freebsd4_fhstatfs 297
#define TARGET_FREEBSD_NR_fhopen 298
#define TARGET_FREEBSD_NR_fhstat 299
#define TARGET_FREEBSD_NR_modnext 300
#define TARGET_FREEBSD_NR_modstat 301
#define TARGET_FREEBSD_NR_modfnext 302
#define TARGET_FREEBSD_NR_modfind 303
#define TARGET_FREEBSD_NR_kldload 304
#define TARGET_FREEBSD_NR_kldunload 305
#define TARGET_FREEBSD_NR_kldfind 306
#define TARGET_FREEBSD_NR_kldnext 307
#define TARGET_FREEBSD_NR_kldstat 308
#define TARGET_FREEBSD_NR_kldfirstmod 309
#define TARGET_FREEBSD_NR_getsid 310
#define TARGET_FREEBSD_NR_setresuid 311
#define TARGET_FREEBSD_NR_setresgid 312
/* 313 is obsolete signanosleep */
#define TARGET_FREEBSD_NR_aio_return 314
#define TARGET_FREEBSD_NR_aio_suspend 315
#define TARGET_FREEBSD_NR_aio_cancel 316
#define TARGET_FREEBSD_NR_aio_error 317
#define TARGET_FREEBSD_NR_oaio_read 318
#define TARGET_FREEBSD_NR_oaio_write 319
#define TARGET_FREEBSD_NR_olio_listio 320
#define TARGET_FREEBSD_NR_yield 321
/* 322 is obsolete thr_sleep */
/* 323 is obsolete thr_wakeup */
#define TARGET_FREEBSD_NR_mlockall 324
#define TARGET_FREEBSD_NR_munlockall 325
#define TARGET_FREEBSD_NR___getcwd 326
#define TARGET_FREEBSD_NR_sched_setparam 327
#define TARGET_FREEBSD_NR_sched_getparam 328
#define TARGET_FREEBSD_NR_sched_setscheduler 329
#define TARGET_FREEBSD_NR_sched_getscheduler 330
#define TARGET_FREEBSD_NR_sched_yield 331
#define TARGET_FREEBSD_NR_sched_get_priority_max 332
#define TARGET_FREEBSD_NR_sched_get_priority_min 333
#define TARGET_FREEBSD_NR_sched_rr_get_interval 334
#define TARGET_FREEBSD_NR_utrace 335
#define TARGET_FREEBSD_NR_freebsd4_sendfile 336
#define TARGET_FREEBSD_NR_kldsym 337
#define TARGET_FREEBSD_NR_jail 338
#define TARGET_FREEBSD_NR_nnpfs_syscall 339
#define TARGET_FREEBSD_NR_sigprocmask 340
#define TARGET_FREEBSD_NR_sigsuspend 341
#define TARGET_FREEBSD_NR_freebsd4_sigaction 342
#define TARGET_FREEBSD_NR_sigpending 343
#define TARGET_FREEBSD_NR_freebsd4_sigreturn 344
#define TARGET_FREEBSD_NR_sigtimedwait 345
#define TARGET_FREEBSD_NR_sigwaitinfo 346
#define TARGET_FREEBSD_NR___acl_get_file 347
#define TARGET_FREEBSD_NR___acl_set_file 348
#define TARGET_FREEBSD_NR___acl_get_fd 349
#define TARGET_FREEBSD_NR___acl_set_fd 350
#define TARGET_FREEBSD_NR___acl_delete_file 351
#define TARGET_FREEBSD_NR___acl_delete_fd 352
#define TARGET_FREEBSD_NR___acl_aclcheck_file 353
#define TARGET_FREEBSD_NR___acl_aclcheck_fd 354
#define TARGET_FREEBSD_NR_extattrctl 355
#define TARGET_FREEBSD_NR_extattr_set_file 356
#define TARGET_FREEBSD_NR_extattr_get_file 357
#define TARGET_FREEBSD_NR_extattr_delete_file 358
#define TARGET_FREEBSD_NR_aio_waitcomplete 359
#define TARGET_FREEBSD_NR_getresuid 360
#define TARGET_FREEBSD_NR_getresgid 361
#define TARGET_FREEBSD_NR_kqueue 362
#define TARGET_FREEBSD_NR_kevent 363
#define TARGET_FREEBSD_NR_extattr_set_fd 371
#define TARGET_FREEBSD_NR_extattr_get_fd 372
#define TARGET_FREEBSD_NR_extattr_delete_fd 373
#define TARGET_FREEBSD_NR___setugid 374
#define TARGET_FREEBSD_NR_eaccess 376
#define TARGET_FREEBSD_NR_afs3_syscall 377
#define TARGET_FREEBSD_NR_nmount 378
#define TARGET_FREEBSD_NR___mac_get_proc 384
#define TARGET_FREEBSD_NR___mac_set_proc 385
#define TARGET_FREEBSD_NR___mac_get_fd 386
#define TARGET_FREEBSD_NR___mac_get_file 387
#define TARGET_FREEBSD_NR___mac_set_fd 388
#define TARGET_FREEBSD_NR___mac_set_file 389
#define TARGET_FREEBSD_NR_kenv 390
#define TARGET_FREEBSD_NR_lchflags 391
#define TARGET_FREEBSD_NR_uuidgen 392
#define TARGET_FREEBSD_NR_sendfile 393
#define TARGET_FREEBSD_NR_mac_syscall 394
#define TARGET_FREEBSD_NR_getfsstat 395
#define TARGET_FREEBSD_NR_statfs 396
#define TARGET_FREEBSD_NR_fstatfs 397
#define TARGET_FREEBSD_NR_fhstatfs 398
#define TARGET_FREEBSD_NR_ksem_close 400
#define TARGET_FREEBSD_NR_ksem_post 401
#define TARGET_FREEBSD_NR_ksem_wait 402
#define TARGET_FREEBSD_NR_ksem_trywait 403
#define TARGET_FREEBSD_NR_ksem_init 404
#define TARGET_FREEBSD_NR_ksem_open 405
#define TARGET_FREEBSD_NR_ksem_unlink 406
#define TARGET_FREEBSD_NR_ksem_getvalue 407
#define TARGET_FREEBSD_NR_ksem_destroy 408
#define TARGET_FREEBSD_NR___mac_get_pid 409
#define TARGET_FREEBSD_NR___mac_get_link 410
#define TARGET_FREEBSD_NR___mac_set_link 411
#define TARGET_FREEBSD_NR_extattr_set_link 412
#define TARGET_FREEBSD_NR_extattr_get_link 413
#define TARGET_FREEBSD_NR_extattr_delete_link 414
#define TARGET_FREEBSD_NR___mac_execve 415
#define TARGET_FREEBSD_NR_sigaction 416
#define TARGET_FREEBSD_NR_sigreturn 417
#define TARGET_FREEBSD_NR_getcontext 421
#define TARGET_FREEBSD_NR_setcontext 422
#define TARGET_FREEBSD_NR_swapcontext 423
#define TARGET_FREEBSD_NR_swapoff 424
#define TARGET_FREEBSD_NR___acl_get_link 425
#define TARGET_FREEBSD_NR___acl_set_link 426
#define TARGET_FREEBSD_NR___acl_delete_link 427
#define TARGET_FREEBSD_NR___acl_aclcheck_link 428
#define TARGET_FREEBSD_NR_sigwait 429
#define TARGET_FREEBSD_NR_thr_create 430
#define TARGET_FREEBSD_NR_thr_exit 431
#define TARGET_FREEBSD_NR_thr_self 432
#define TARGET_FREEBSD_NR_thr_kill 433
#define TARGET_FREEBSD_NR__umtx_lock 434
#define TARGET_FREEBSD_NR__umtx_unlock 435
#define TARGET_FREEBSD_NR_jail_attach 436
#define TARGET_FREEBSD_NR_extattr_list_fd 437
#define TARGET_FREEBSD_NR_extattr_list_file 438
#define TARGET_FREEBSD_NR_extattr_list_link 439
#define TARGET_FREEBSD_NR_ksem_timedwait 441
#define TARGET_FREEBSD_NR_thr_suspend 442
#define TARGET_FREEBSD_NR_thr_wake 443
#define TARGET_FREEBSD_NR_kldunloadf 444
#define TARGET_FREEBSD_NR_audit 445
#define TARGET_FREEBSD_NR_auditon 446
#define TARGET_FREEBSD_NR_getauid 447
#define TARGET_FREEBSD_NR_setauid 448
#define TARGET_FREEBSD_NR_getaudit 449
#define TARGET_FREEBSD_NR_setaudit 450
#define TARGET_FREEBSD_NR_getaudit_addr 451
#define TARGET_FREEBSD_NR_setaudit_addr 452
#define TARGET_FREEBSD_NR_auditctl 453
#define TARGET_FREEBSD_NR__umtx_op 454
#define TARGET_FREEBSD_NR_thr_new 455
#define TARGET_FREEBSD_NR_sigqueue 456
#define TARGET_FREEBSD_NR_kmq_open 457
#define TARGET_FREEBSD_NR_kmq_setattr 458
#define TARGET_FREEBSD_NR_kmq_timedreceive 459
#define TARGET_FREEBSD_NR_kmq_timedsend 460
#define TARGET_FREEBSD_NR_kmq_notify 461
#define TARGET_FREEBSD_NR_kmq_unlink 462
#define TARGET_FREEBSD_NR_abort2 463
#define TARGET_FREEBSD_NR_thr_set_name 464
#define TARGET_FREEBSD_NR_aio_fsync 465
#define TARGET_FREEBSD_NR_rtprio_thread 466
#define TARGET_FREEBSD_NR_sctp_peeloff 471
#define TARGET_FREEBSD_NR_sctp_generic_sendmsg 472
#define TARGET_FREEBSD_NR_sctp_generic_sendmsg_iov 473
#define TARGET_FREEBSD_NR_sctp_generic_recvmsg 474
#define TARGET_FREEBSD_NR_pread 475
#define TARGET_FREEBSD_NR_pwrite 476
#define TARGET_FREEBSD_NR_mmap 477
#define TARGET_FREEBSD_NR_lseek 478
#define TARGET_FREEBSD_NR_truncate 479
#define TARGET_FREEBSD_NR_ftruncate 480
#define TARGET_FREEBSD_NR_thr_kill2 481
#define TARGET_FREEBSD_NR_shm_open 482
#define TARGET_FREEBSD_NR_shm_unlink 483
#define TARGET_FREEBSD_NR_cpuset 484
#define TARGET_FREEBSD_NR_cpuset_setid 485
#define TARGET_FREEBSD_NR_cpuset_getid 486
#define TARGET_FREEBSD_NR_cpuset_getaffinity 487
#define TARGET_FREEBSD_NR_cpuset_setaffinity 488
#define TARGET_FREEBSD_NR_faccessat 489
#define TARGET_FREEBSD_NR_fchmodat 490
#define TARGET_FREEBSD_NR_fchownat 491
#define TARGET_FREEBSD_NR_fexecve 492
#define TARGET_FREEBSD_NR_fstatat 493
#define TARGET_FREEBSD_NR_futimesat 494
#define TARGET_FREEBSD_NR_linkat 495
#define TARGET_FREEBSD_NR_mkdirat 496
#define TARGET_FREEBSD_NR_mkfifoat 497
#define TARGET_FREEBSD_NR_mknodat 498
#define TARGET_FREEBSD_NR_openat 499
#define TARGET_FREEBSD_NR_readlinkat 500
#define TARGET_FREEBSD_NR_renameat 501
#define TARGET_FREEBSD_NR_symlinkat 502
#define TARGET_FREEBSD_NR_unlinkat 503
#define TARGET_FREEBSD_NR_posix_openpt 504
#define TARGET_FREEBSD_NR_gssd_syscall 505
#define TARGET_FREEBSD_NR_jail_get 506
#define TARGET_FREEBSD_NR_jail_set 507
#define TARGET_FREEBSD_NR_jail_remove 508
#define TARGET_FREEBSD_NR_closefrom 509
#define TARGET_FREEBSD_NR___semctl 510
#define TARGET_FREEBSD_NR_msgctl 511
#define TARGET_FREEBSD_NR_shmctl 512
#define TARGET_FREEBSD_NR_lpathconf 513
#define TARGET_FREEBSD_NR_cap_new 514
#define TARGET_FREEBSD_NR_cap_getrights 515
#define TARGET_FREEBSD_NR_cap_enter 516
#define TARGET_FREEBSD_NR_cap_getmode 517
#define TARGET_FREEBSD_NR_pdfork 518
#define TARGET_FREEBSD_NR_pdkill 519
#define TARGET_FREEBSD_NR_pdgetpid 520
#define TARGET_FREEBSD_NR_pselect 522
#define TARGET_FREEBSD_NR_getloginclass 523
#define TARGET_FREEBSD_NR_setloginclass 524
#define TARGET_FREEBSD_NR_rctl_get_racct 525
#define TARGET_FREEBSD_NR_rctl_get_rules 526
#define TARGET_FREEBSD_NR_rctl_get_limits 527
#define TARGET_FREEBSD_NR_rctl_add_rule 528
#define TARGET_FREEBSD_NR_rctl_remove_rule 529
#define TARGET_FREEBSD_NR_posix_fallocate 530
#define TARGET_FREEBSD_NR_posix_fadvise 531
#define TARGET_FREEBSD_NR_MAXSYSCALL 532

View File

@@ -1004,7 +1004,7 @@ int main(int argc, char **argv)
#if defined(TARGET_I386)
env->cr[0] = CR0_PG_MASK | CR0_WP_MASK | CR0_PE_MASK;
env->hflags |= HF_PE_MASK;
env->hflags |= HF_PE_MASK | HF_CPL_MASK;
if (env->features[FEAT_1_EDX] & CPUID_SSE) {
env->cr[4] |= CR4_OSFXSR_MASK;
env->hflags |= HF_OSFXSR_MASK;

View File

@@ -74,66 +74,6 @@ void mmap_unlock(void)
}
#endif
static void *bsd_vmalloc(size_t size)
{
void *p;
mmap_lock();
/* Use map and mark the pages as used. */
p = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANON, -1, 0);
if (h2g_valid(p)) {
/* Allocated region overlaps guest address space.
This may recurse. */
abi_ulong addr = h2g(p);
page_set_flags(addr & TARGET_PAGE_MASK, TARGET_PAGE_ALIGN(addr + size),
PAGE_RESERVED);
}
mmap_unlock();
return p;
}
void *g_malloc(size_t size)
{
char * p;
size += 16;
p = bsd_vmalloc(size);
*(size_t *)p = size;
return p + 16;
}
/* We use map, which is always zero initialized. */
void * g_malloc0(size_t size)
{
return g_malloc(size);
}
void g_free(void *ptr)
{
/* FIXME: We should unmark the reserved pages here. However this gets
complicated when one target page spans multiple host pages, so we
don't bother. */
size_t *p;
p = (size_t *)((char *)ptr - 16);
munmap(p, *p);
}
void *g_realloc(void *ptr, size_t size)
{
size_t old_size, copy;
void *new_ptr;
if (!ptr)
return g_malloc(size);
old_size = *(size_t *)((char *)ptr - 16);
copy = old_size < size ? old_size : size;
new_ptr = g_malloc(size);
memcpy(new_ptr, ptr, copy);
g_free(ptr);
return new_ptr;
}
/* NOTE: all the constants are the HOST ones, but addresses are target. */
int target_mprotect(abi_ulong start, abi_ulong len, int prot)
{

View File

@@ -1,3 +1,19 @@
/*
* qemu bsd user mode definition
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef QEMU_H
#define QEMU_H
@@ -5,6 +21,7 @@
#include <string.h>
#include "cpu.h"
#include "exec/cpu_ldst.h"
#undef DEBUG_REMAP
#ifdef DEBUG_REMAP
@@ -149,6 +166,16 @@ void fork_end(int child);
#include "qemu/log.h"
/* strace.c */
struct syscallname {
int nr;
const char *name;
const char *format;
void (*call)(const struct syscallname *,
abi_long, abi_long, abi_long,
abi_long, abi_long, abi_long);
void (*result)(const struct syscallname *, abi_long);
};
void
print_freebsd_syscall(int num,
abi_long arg1, abi_long arg2, abi_long arg3,

View File

@@ -1,37 +1,71 @@
/*
* System call tracing and debugging
*
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdio.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/ioccom.h>
#include <ctype.h>
#include "qemu.h"
int do_strace=0;
struct syscallname {
int nr;
const char *name;
const char *format;
void (*call)(const struct syscallname *,
abi_long, abi_long, abi_long,
abi_long, abi_long, abi_long);
void (*result)(const struct syscallname *, abi_long);
};
int do_strace;
/*
* Utility functions
*/
static void
print_execve(const struct syscallname *name,
abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
static void print_sysctl(const struct syscallname *name, abi_long arg1,
abi_long arg2, abi_long arg3, abi_long arg4, abi_long arg5,
abi_long arg6)
{
uint32_t i;
int32_t *namep;
gemu_log("%s({ ", name->name);
namep = lock_user(VERIFY_READ, arg1, sizeof(int32_t) * arg2, 1);
if (namep) {
int32_t *p = namep;
for (i = 0; i < (uint32_t)arg2; i++) {
gemu_log("%d ", tswap32(*p++));
}
unlock_user(namep, arg1, 0);
}
gemu_log("}, %u, 0x" TARGET_ABI_FMT_lx ", 0x" TARGET_ABI_FMT_lx ", 0x"
TARGET_ABI_FMT_lx ", 0x" TARGET_ABI_FMT_lx ")",
(uint32_t)arg2, arg3, arg4, arg5, arg6);
}
static void print_execve(const struct syscallname *name, abi_long arg1,
abi_long arg2, abi_long arg3, abi_long arg4, abi_long arg5,
abi_long arg6)
{
abi_ulong arg_ptr_addr;
char *s;
if (!(s = lock_user_string(arg1)))
s = lock_user_string(arg1);
if (s == NULL) {
return;
}
gemu_log("%s(\"%s\",{", name->name, s);
unlock_user(s, arg1, 0);
@@ -39,29 +73,48 @@ print_execve(const struct syscallname *name,
abi_ulong *arg_ptr, arg_addr;
arg_ptr = lock_user(VERIFY_READ, arg_ptr_addr, sizeof(abi_ulong), 1);
if (!arg_ptr)
if (!arg_ptr) {
return;
}
arg_addr = tswapl(*arg_ptr);
unlock_user(arg_ptr, arg_ptr_addr, 0);
if (!arg_addr)
if (!arg_addr) {
break;
}
if ((s = lock_user_string(arg_addr))) {
gemu_log("\"%s\",", s);
unlock_user(s, arg_addr, 0);
}
}
gemu_log("NULL})");
}
static void print_ioctl(const struct syscallname *name,
abi_long arg1, abi_long arg2, abi_long arg3, abi_long arg4,
abi_long arg5, abi_long arg6)
{
/* Decode the ioctl request */
gemu_log("%s(%d, 0x%0lx { IO%s%s GRP:0x%x('%c') CMD:%d LEN:%d }, 0x"
TARGET_ABI_FMT_lx ", ...)",
name->name,
(int)arg1,
(unsigned long)arg2,
arg2 & IOC_OUT ? "R" : "",
arg2 & IOC_IN ? "W" : "",
(unsigned)IOCGROUP(arg2),
isprint(IOCGROUP(arg2)) ? (char)IOCGROUP(arg2) : '?',
(int)arg2 & 0xFF,
(int)IOCPARM_LEN(arg2),
arg3);
}
/*
* Variants for the return value output function
*/
static void
print_syscall_ret_addr(const struct syscallname *name, abi_long ret)
static void print_syscall_ret_addr(const struct syscallname *name, abi_long ret)
{
if( ret == -1 ) {
if (ret == -1) {
gemu_log(" = -1 errno=%d (%s)\n", errno, strerror(errno));
} else {
gemu_log(" = 0x" TARGET_ABI_FMT_lx "\n", ret);
@@ -90,10 +143,9 @@ static const struct syscallname openbsd_scnames[] = {
#include "openbsd/strace.list"
};
static void
print_syscall(int num, const struct syscallname *scnames, unsigned int nscnames,
abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
static void print_syscall(int num, const struct syscallname *scnames,
unsigned int nscnames, abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
{
unsigned int i;
const char *format="%s(" TARGET_ABI_FMT_ld "," TARGET_ABI_FMT_ld ","
@@ -102,36 +154,37 @@ print_syscall(int num, const struct syscallname *scnames, unsigned int nscnames,
gemu_log("%d ", getpid() );
for (i = 0; i < nscnames; i++)
for (i = 0; i < nscnames; i++) {
if (scnames[i].nr == num) {
if (scnames[i].call != NULL) {
scnames[i].call(&scnames[i], arg1, arg2, arg3, arg4, arg5,
arg6);
arg6);
} else {
/* XXX: this format system is broken because it uses
host types and host pointers for strings */
if (scnames[i].format != NULL)
if (scnames[i].format != NULL) {
format = scnames[i].format;
gemu_log(format, scnames[i].name, arg1, arg2, arg3, arg4,
arg5, arg6);
}
gemu_log(format, scnames[i].name, arg1, arg2, arg3, arg4, arg5,
arg6);
}
return;
}
}
gemu_log("Unknown syscall %d\n", num);
}
static void
print_syscall_ret(int num, abi_long ret, const struct syscallname *scnames,
unsigned int nscnames)
static void print_syscall_ret(int num, abi_long ret,
const struct syscallname *scnames, unsigned int nscnames)
{
unsigned int i;
for (i = 0; i < nscnames; i++)
for (i = 0; i < nscnames; i++) {
if (scnames[i].nr == num) {
if (scnames[i].result != NULL) {
scnames[i].result(&scnames[i], ret);
} else {
if( ret < 0 ) {
if (ret < 0) {
gemu_log(" = -1 errno=" TARGET_ABI_FMT_ld " (%s)\n", -ret,
strerror(-ret));
} else {
@@ -140,52 +193,50 @@ print_syscall_ret(int num, abi_long ret, const struct syscallname *scnames,
}
break;
}
}
}
/*
* The public interface to this module.
*/
void
print_freebsd_syscall(int num,
abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
void print_freebsd_syscall(int num, abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
{
print_syscall(num, freebsd_scnames, ARRAY_SIZE(freebsd_scnames),
arg1, arg2, arg3, arg4, arg5, arg6);
print_syscall(num, freebsd_scnames, ARRAY_SIZE(freebsd_scnames), arg1, arg2,
arg3, arg4, arg5, arg6);
}
void
print_freebsd_syscall_ret(int num, abi_long ret)
void print_freebsd_syscall_ret(int num, abi_long ret)
{
print_syscall_ret(num, ret, freebsd_scnames, ARRAY_SIZE(freebsd_scnames));
}
void
print_netbsd_syscall(int num,
abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
void print_netbsd_syscall(int num, abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
{
print_syscall(num, netbsd_scnames, ARRAY_SIZE(netbsd_scnames),
arg1, arg2, arg3, arg4, arg5, arg6);
}
void
print_netbsd_syscall_ret(int num, abi_long ret)
void print_netbsd_syscall_ret(int num, abi_long ret)
{
print_syscall_ret(num, ret, netbsd_scnames, ARRAY_SIZE(netbsd_scnames));
}
void
print_openbsd_syscall(int num,
abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
void print_openbsd_syscall(int num, abi_long arg1, abi_long arg2, abi_long arg3,
abi_long arg4, abi_long arg5, abi_long arg6)
{
print_syscall(num, openbsd_scnames, ARRAY_SIZE(openbsd_scnames),
arg1, arg2, arg3, arg4, arg5, arg6);
print_syscall(num, openbsd_scnames, ARRAY_SIZE(openbsd_scnames), arg1, arg2,
arg3, arg4, arg5, arg6);
}
void
print_openbsd_syscall_ret(int num, abi_long ret)
void print_openbsd_syscall_ret(int num, abi_long ret)
{
print_syscall_ret(num, ret, openbsd_scnames, ARRAY_SIZE(openbsd_scnames));
}

354
configure vendored
View File

@@ -3,6 +3,11 @@
# qemu configure script (c) 2003 Fabrice Bellard
#
# Unset some variables known to interfere with behavior of common tools,
# just as autoconf does.
CLICOLOR_FORCE= GREP_OPTIONS=
unset CLICOLOR_FORCE GREP_OPTIONS
# Temporary directory used for files created while
# configure runs. Since it is in the build directory
# we can safely blow away any previous version of it
@@ -182,6 +187,10 @@ path_of() {
return 1
}
have_backend () {
echo "$trace_backends" | grep "$1" >/dev/null
}
# default parameters
source_path=`dirname "$0"`
cpu=""
@@ -293,7 +302,7 @@ pkgversion=""
pie=""
zero_malloc=""
qom_cast_debug="yes"
trace_backend="nop"
trace_backends="nop"
trace_file="trace"
spice=""
rbd=""
@@ -302,8 +311,8 @@ libusb=""
usb_redir=""
glx=""
zlib="yes"
lzo="no"
snappy="no"
lzo=""
snappy=""
guest_agent=""
guest_agent_with_vss="no"
vss_win32_sdk=""
@@ -317,14 +326,16 @@ seccomp=""
glusterfs=""
glusterfs_discard="no"
glusterfs_zerofill="no"
archipelago=""
virtio_blk_data_plane=""
gtk=""
gtkabi=""
vte=""
tpm="no"
tpm="yes"
libssh2=""
vhdx=""
quorum="no"
quorum=""
numa=""
# parse CC options first
for opt do
@@ -378,6 +389,7 @@ cpp="${CPP-$cc -E}"
objcopy="${OBJCOPY-${cross_prefix}objcopy}"
ld="${LD-${cross_prefix}ld}"
libtool="${LIBTOOL-${cross_prefix}libtool}"
nm="${NM-${cross_prefix}nm}"
strip="${STRIP-${cross_prefix}strip}"
windres="${WINDRES-${cross_prefix}windres}"
pkg_config_exe="${PKG_CONFIG-${cross_prefix}pkg-config}"
@@ -537,6 +549,9 @@ fi
# OS specific
# host *BSD for user mode
HOST_VARIANT_DIR=""
case $targetos in
CYGWIN*)
mingw32="yes"
@@ -562,12 +577,14 @@ FreeBSD)
# needed for kinfo_getvmmap(3) in libutil.h
LIBS="-lutil $LIBS"
netmap="" # enable netmap autodetect
HOST_VARIANT_DIR="freebsd"
;;
DragonFly)
bsd="yes"
make="${MAKE-gmake}"
audio_drv_list="oss"
audio_possible_drivers="oss sdl esd pa"
HOST_VARIANT_DIR="dragonfly"
;;
NetBSD)
bsd="yes"
@@ -575,12 +592,14 @@ NetBSD)
audio_drv_list="oss"
audio_possible_drivers="oss sdl esd"
oss_lib="-lossaudio"
HOST_VARIANT_DIR="netbsd"
;;
OpenBSD)
bsd="yes"
make="${MAKE-gmake}"
audio_drv_list="sdl"
audio_possible_drivers="sdl esd"
HOST_VARIANT_DIR="openbsd"
;;
Darwin)
bsd="yes"
@@ -598,6 +617,7 @@ Darwin)
# Disable attempts to use ObjectiveC features in os/object.h since they
# won't work when we're compiling with gcc as a C compiler.
QEMU_CFLAGS="-DOS_OBJECT_USE_OBJC=0 $QEMU_CFLAGS"
HOST_VARIANT_DIR="darwin"
;;
SunOS)
solaris="yes"
@@ -753,7 +773,10 @@ for opt do
;;
--target-list=*) target_list="$optarg"
;;
--enable-trace-backend=*) trace_backend="$optarg"
--enable-trace-backends=*) trace_backends="$optarg"
;;
# XXX: backwards compatibility
--enable-trace-backend=*) trace_backends="$optarg"
;;
--with-trace-file=*) trace_file="$optarg"
;;
@@ -1030,8 +1053,12 @@ for opt do
;;
--disable-zlib-test) zlib="no"
;;
--disable-lzo) lzo="no"
;;
--enable-lzo) lzo="yes"
;;
--disable-snappy) snappy="no"
;;
--enable-snappy) snappy="yes"
;;
--enable-guest-agent) guest_agent="yes"
@@ -1062,6 +1089,10 @@ for opt do
;;
--enable-glusterfs) glusterfs="yes"
;;
--disable-archipelago) archipelago="no"
;;
--enable-archipelago) archipelago="yes"
;;
--disable-virtio-blk-data-plane) virtio_blk_data_plane="no"
;;
--enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
@@ -1080,6 +1111,8 @@ for opt do
;;
--enable-vte) vte="yes"
;;
--disable-tpm) tpm="no"
;;
--enable-tpm) tpm="yes"
;;
--disable-libssh2) libssh2="no"
@@ -1094,6 +1127,10 @@ for opt do
;;
--enable-quorum) quorum="yes"
;;
--disable-numa) numa="no"
;;
--enable-numa) numa="yes"
;;
*)
echo "ERROR: unknown option $opt"
echo "Try '$0 --help' for more information"
@@ -1320,7 +1357,7 @@ Advanced options (experts only):
--disable-docs disable documentation build
--disable-vhost-net disable vhost-net acceleration support
--enable-vhost-net enable vhost-net acceleration support
--enable-trace-backend=B Set trace backend
--enable-trace-backends=B Set trace backend
Available backends: $($python $source_path/scripts/tracetool.py --list-backends)
--with-trace-file=NAME Full PATH,NAME of file to store traces
Default:trace-<pid>
@@ -1351,8 +1388,11 @@ Advanced options (experts only):
--enable-coroutine-pool enable coroutine freelist (better performance)
--enable-glusterfs enable GlusterFS backend
--disable-glusterfs disable GlusterFS backend
--enable-archipelago enable Archipelago backend
--disable-archipelago disable Archipelago backend
--enable-gcov enable test coverage analysis with gcov
--gcov=GCOV use specified gcov [$gcov_tool]
--disable-tpm disable TPM support
--enable-tpm enable TPM support
--disable-libssh2 disable ssh block device support
--enable-libssh2 enable ssh block device support
@@ -1360,6 +1400,8 @@ Advanced options (experts only):
--enable-vhdx enable support for the Microsoft VHDX image format
--disable-quorum disable quorum block filter support
--enable-quorum enable quorum block filter support
--disable-numa disable libnuma support
--enable-numa enable libnuma support
NOTE: The object files are built at the place where configure is launched
EOF
@@ -1455,8 +1497,9 @@ for flag in $gcc_flags; do
fi
done
if test "$stack_protector" != "no" ; then
if test "$stack_protector" != "no"; then
gcc_flags="-fstack-protector-strong -fstack-protector-all"
sp_on=0
for flag in $gcc_flags; do
# We need to check both a compile and a link, since some compiler
# setups fail only on a .c->.o compile and some only at link time
@@ -1464,9 +1507,15 @@ if test "$stack_protector" != "no" ; then
compile_prog "-Werror $flag" ""; then
QEMU_CFLAGS="$QEMU_CFLAGS $flag"
LIBTOOLFLAGS="$LIBTOOLFLAGS -Wc,$flag"
sp_on=1
break
fi
done
if test "$stack_protector" = yes; then
if test $sp_on = 0; then
error_exit "Stack protector not supported"
fi
fi
fi
# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC and
@@ -1677,6 +1726,20 @@ else
echo big/little test failed
fi
##########################################
# L2TPV3 probe
cat > $TMPC <<EOF
#include <sys/socket.h>
#include <linux/ip.h>
int main(void) { return sizeof(struct mmsghdr); }
EOF
if compile_prog "" "" ; then
l2tpv3=yes
else
l2tpv3=no
fi
##########################################
# pkg-config probe
@@ -1729,13 +1792,14 @@ if test "$lzo" != "no" ; then
int main(void) { lzo_version(); return 0; }
EOF
if compile_prog "" "-llzo2" ; then
:
libs_softmmu="$libs_softmmu -llzo2"
lzo="yes"
else
error_exit "lzo check failed" \
"Make sure to have the lzo libs and headers installed."
if test "$lzo" = "yes"; then
feature_not_found "liblzo2" "Install liblzo2 devel"
fi
lzo="no"
fi
libs_softmmu="$libs_softmmu -llzo2"
fi
##########################################
@@ -1747,13 +1811,14 @@ if test "$snappy" != "no" ; then
int main(void) { snappy_max_compressed_length(4096); return 0; }
EOF
if compile_prog "" "-lsnappy" ; then
:
libs_softmmu="$libs_softmmu -lsnappy"
snappy="yes"
else
error_exit "snappy check failed" \
"Make sure to have the snappy libs and headers installed."
if test "$snappy" = "yes"; then
feature_not_found "libsnappy" "Install libsnappy devel"
fi
snappy="no"
fi
libs_softmmu="$libs_softmmu -lsnappy"
fi
##########################################
@@ -1986,6 +2051,7 @@ fi
if test "$gtk" != "no"; then
gtkpackage="gtk+-$gtkabi"
gtkx11package="gtk+-x11-$gtkabi"
if test "$gtkabi" = "3.0" ; then
gtkversion="3.0.0"
else
@@ -1994,6 +2060,9 @@ if test "$gtk" != "no"; then
if $pkg_config --exists "$gtkpackage >= $gtkversion"; then
gtk_cflags=`$pkg_config --cflags $gtkpackage`
gtk_libs=`$pkg_config --libs $gtkpackage`
if $pkg_config --exists "$gtkx11package >= $gtkversion"; then
gtk_libs="$gtk_libs -lX11"
fi
libs_softmmu="$gtk_libs $libs_softmmu"
gtk="yes"
elif test "$gtk" = "yes"; then
@@ -2195,9 +2264,12 @@ if compile_prog "$quorum_tls_cflags" "$quorum_tls_libs" ; then
libs_softmmu="$quorum_tls_libs $libs_softmmu"
libs_tools="$quorum_tls_libs $libs_softmmu"
QEMU_CFLAGS="$QEMU_CFLAGS $quorum_tls_cflags"
quorum="yes"
else
echo "gnutls > 2.10.0 required to compile Quorum"
exit 1
if test "$quorum" = "yes"; then
feature_not_found "gnutls" "gnutls > 2.10.0 required to compile Quorum"
fi
quorum="no"
fi
fi
@@ -2645,6 +2717,12 @@ for i in $glib_modules; do
fi
done
# g_test_trap_subprocess added in 2.38. Used by some tests.
glib_subprocess=yes
if ! $pkg_config --atleast-version=2.38 glib-2.0; then
glib_subprocess=no
fi
##########################################
# SHA command probe for modules
if test "$modules" = yes; then
@@ -2666,7 +2744,7 @@ fi
if test "$pixman" = ""; then
if test "$want_tools" = "no" -a "$softmmu" = "no"; then
pixman="none"
elif $pkg_config pixman-1 > /dev/null 2>&1; then
elif $pkg_config --atleast-version=0.21.8 pixman-1 > /dev/null 2>&1; then
pixman="system"
else
pixman="internal"
@@ -2682,11 +2760,12 @@ if test "$pixman" = "none"; then
pixman_cflags=
pixman_libs=
elif test "$pixman" = "system"; then
# pixman version has been checked above
pixman_cflags=`$pkg_config --cflags pixman-1`
pixman_libs=`$pkg_config --libs pixman-1`
else
if test ! -d ${source_path}/pixman/pixman; then
error_exit "pixman not present. Your options:" \
error_exit "pixman >= 0.21.8 not present. Your options:" \
" (1) Preferred: Install the pixman devel package (any recent" \
" distro should have packages as Xorg needs pixman too)." \
" (2) Fetch the pixman submodule, using:" \
@@ -3008,6 +3087,33 @@ EOF
fi
fi
##########################################
# archipelago probe
if test "$archipelago" != "no" ; then
cat > $TMPC <<EOF
#include <stdio.h>
#include <xseg/xseg.h>
#include <xseg/protocol.h>
int main(void) {
xseg_initialize();
return 0;
}
EOF
archipelago_libs=-lxseg
if compile_prog "" "$archipelago_libs"; then
archipelago="yes"
libs_tools="$archipelago_libs $libs_tools"
libs_softmmu="$archipelago_libs $libs_softmmu"
else
if test "$archipelago" = "yes" ; then
feature_not_found "Archipelago backend support" "Install libxseg devel"
fi
archipelago="no"
fi
fi
##########################################
# glusterfs probe
if test "$glusterfs" != "no" ; then
@@ -3023,7 +3129,8 @@ if test "$glusterfs" != "no" ; then
fi
else
if test "$glusterfs" = "yes" ; then
feature_not_found "GlusterFS backend support" "Install glusterfs-api devel"
feature_not_found "GlusterFS backend support" \
"Install glusterfs-api devel >= 3"
fi
glusterfs="no"
fi
@@ -3134,6 +3241,26 @@ if compile_prog "" "" ; then
splice=yes
fi
##########################################
# libnuma probe
if test "$numa" != "no" ; then
cat > $TMPC << EOF
#include <numa.h>
int main(void) { return numa_available(); }
EOF
if compile_prog "" "-lnuma" ; then
numa=yes
libs_softmmu="-lnuma $libs_softmmu"
else
if test "$numa" = "yes" ; then
feature_not_found "numa" "install numactl devel"
fi
numa=no
fi
fi
##########################################
# signalfd probe
signalfd="no"
@@ -3337,6 +3464,37 @@ if compile_prog "" "" ; then
sendfile=yes
fi
# check for timerfd support (glibc 2.8 and newer)
timerfd=no
cat > $TMPC << EOF
#include <sys/timerfd.h>
int main(void)
{
return(timerfd_create(CLOCK_REALTIME, 0));
}
EOF
if compile_prog "" "" ; then
timerfd=yes
fi
# check for setns and unshare support
setns=no
cat > $TMPC << EOF
#include <sched.h>
int main(void)
{
int ret;
ret = setns(0, 0);
ret = unshare(0);
return ret;
}
EOF
if compile_prog "" "" ; then
setns=yes
fi
# Check if tools are available to build documentation.
if test "$docs" != "no" ; then
if has makeinfo && has pod2man; then
@@ -3372,51 +3530,25 @@ if compile_prog "" "" ; then
fi
##########################################
# Do we have libiscsi
# We check for iscsi_write16_sync() to make sure we have a
# at least version 1.4.0 of libiscsi.
# Do we have libiscsi >= 1.9.0
if test "$libiscsi" != "no" ; then
cat > $TMPC << EOF
#include <stdio.h>
#include <iscsi/iscsi.h>
int main(void) { iscsi_write16_sync(NULL,0,0,NULL,0,0,0,0,0,0,0); return 0; }
EOF
if $pkg_config --atleast-version=1.7.0 libiscsi; then
if $pkg_config --atleast-version=1.9.0 libiscsi; then
libiscsi="yes"
libiscsi_cflags=$($pkg_config --cflags libiscsi)
libiscsi_libs=$($pkg_config --libs libiscsi)
elif compile_prog "" "-liscsi" ; then
libiscsi="yes"
libiscsi_libs="-liscsi"
else
if test "$libiscsi" = "yes" ; then
feature_not_found "libiscsi" "Install libiscsi devel"
feature_not_found "libiscsi" "Install libiscsi >= 1.9.0"
fi
libiscsi="no"
fi
fi
# We also need to know the API version because there was an
# API change from 1.4.0 to 1.5.0.
if test "$libiscsi" = "yes"; then
cat >$TMPC <<EOF
#include <iscsi/iscsi.h>
int main(void)
{
iscsi_read10_task(0, 0, 0, 0, 0, 0, 0);
return 0;
}
EOF
if compile_prog "" "-liscsi"; then
libiscsi_version="1.4.0"
fi
fi
##########################################
# Do we need libm
cat > $TMPC << EOF
#include <math.h>
int main(void) { return isnan(sin(0.0)); }
int main(int argc, char **argv) { return isnan(sin((double)argc)); }
EOF
if compile_prog "" "" ; then
:
@@ -3445,9 +3577,9 @@ EOF
if compile_prog "" "" ; then
:
# we need pthread for static linking. use previous pthread test result
elif compile_prog "" "-lrt $pthread_lib" ; then
LIBS="-lrt $LIBS"
libs_qga="-lrt $libs_qga"
elif compile_prog "" "$pthread_lib -lrt" ; then
LIBS="$LIBS -lrt"
libs_qga="$libs_qga -lrt"
fi
if test "$darwin" != "yes" -a "$mingw32" != "yes" -a "$solaris" != yes -a \
@@ -3474,7 +3606,8 @@ EOF
spice_server_version=$($pkg_config --modversion spice-server)
else
if test "$spice" = "yes" ; then
feature_not_found "spice" "Install spice-server and spice-protocol devel"
feature_not_found "spice" \
"Install spice-server(>=0.12.0) and spice-protocol(>=0.12.3) devel"
fi
spice="no"
fi
@@ -3505,7 +3638,7 @@ EOF
smartcard_nss="yes"
else
if test "$smartcard_nss" = "yes"; then
feature_not_found "nss"
feature_not_found "nss" "Install nss devel >= 3.12.8"
fi
smartcard_nss="no"
fi
@@ -3521,7 +3654,7 @@ if test "$libusb" != "no" ; then
libs_softmmu="$libs_softmmu $libusb_libs"
else
if test "$libusb" = "yes"; then
feature_not_found "libusb" "Install libusb devel"
feature_not_found "libusb" "Install libusb devel >= 1.0.13"
fi
libusb="no"
fi
@@ -3666,15 +3799,15 @@ fi
##########################################
# check if trace backend exists
$python "$source_path/scripts/tracetool.py" "--backend=$trace_backend" --check-backend > /dev/null 2> /dev/null
$python "$source_path/scripts/tracetool.py" "--backends=$trace_backends" --check-backends > /dev/null 2> /dev/null
if test "$?" -ne 0 ; then
error_exit "invalid trace backend" \
"Please choose a supported trace backend."
error_exit "invalid trace backends" \
"Please choose supported trace backends."
fi
##########################################
# For 'ust' backend, test if ust headers are present
if test "$trace_backend" = "ust"; then
if have_backend "ust"; then
cat > $TMPC << EOF
#include <lttng/tracepoint.h>
int main(void) { return 0; }
@@ -3700,7 +3833,7 @@ fi
##########################################
# For 'dtrace' backend, test if 'dtrace' command is present
if test "$trace_backend" = "dtrace"; then
if have_backend "dtrace"; then
if ! has 'dtrace' ; then
error_exit "dtrace command is not found in PATH $PATH"
fi
@@ -3946,7 +4079,7 @@ if test "$libnfs" != "no" ; then
LIBS="$LIBS $libnfs_libs"
else
if test "$libnfs" = "yes" ; then
feature_not_found "libnfs"
feature_not_found "libnfs" "Install libnfs devel >= 1.9.3"
fi
libnfs="no"
fi
@@ -4170,7 +4303,7 @@ echo "uuid support $uuid"
echo "libcap-ng support $cap_ng"
echo "vhost-net support $vhost_net"
echo "vhost-scsi support $vhost_scsi"
echo "Trace backend $trace_backend"
echo "Trace backends $trace_backends"
if test "$trace_backend" = "simple"; then
echo "Trace output file $trace_file-<pid>"
fi
@@ -4185,11 +4318,7 @@ echo "nss used $smartcard_nss"
echo "libusb $libusb"
echo "usb net redir $usb_redir"
echo "GLX support $glx"
if test "$libiscsi_version" = "1.4.0"; then
echo "libiscsi support $libiscsi (1.4.0)"
else
echo "libiscsi support $libiscsi"
fi
echo "libnfs support $libnfs"
echo "build guest agent $guest_agent"
echo "QGA VSS support $guest_agent_with_vss"
@@ -4197,6 +4326,7 @@ echo "seccomp support $seccomp"
echo "coroutine backend $coroutine"
echo "coroutine pool $coroutine_pool"
echo "GlusterFS support $glusterfs"
echo "Archipelago support $archipelago"
echo "virtio-blk-data-plane $virtio_blk_data_plane"
echo "gcov $gcov_tool"
echo "gcov enabled $gcov"
@@ -4208,6 +4338,7 @@ echo "vhdx $vhdx"
echo "Quorum $quorum"
echo "lzo support $lzo"
echo "snappy support $snappy"
echo "NUMA host support $numa"
if test "$sdl_too_old" = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -4309,6 +4440,9 @@ fi
if test "$netmap" = "yes" ; then
echo "CONFIG_NETMAP=y" >> $config_host_mak
fi
if test "$l2tpv3" = "yes" ; then
echo "CONFIG_L2TPV3=y" >> $config_host_mak
fi
if test "$cap_ng" = "yes" ; then
echo "CONFIG_LIBCAP=y" >> $config_host_mak
fi
@@ -4429,6 +4563,12 @@ fi
if test "$sendfile" = "yes" ; then
echo "CONFIG_SENDFILE=y" >> $config_host_mak
fi
if test "$timerfd" = "yes" ; then
echo "CONFIG_TIMERFD=y" >> $config_host_mak
fi
if test "$setns" = "yes" ; then
echo "CONFIG_SETNS=y" >> $config_host_mak
fi
if test "$inotify" = "yes" ; then
echo "CONFIG_INOTIFY=y" >> $config_host_mak
fi
@@ -4453,6 +4593,9 @@ if test "$bluez" = "yes" ; then
echo "CONFIG_BLUEZ=y" >> $config_host_mak
echo "BLUEZ_CFLAGS=$bluez_cflags" >> $config_host_mak
fi
if test "glib_subprocess" = "yes" ; then
echo "CONFIG_HAS_GLIB_SUBPROCESS_TESTS=y" >> $config_host_mak
fi
echo "GLIB_CFLAGS=$glib_cflags" >> $config_host_mak
if test "$gtk" = "yes" ; then
echo "CONFIG_GTK=y" >> $config_host_mak
@@ -4482,6 +4625,9 @@ fi
if test "$vhost_scsi" = "yes" ; then
echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
fi
if test "$vhost_net" = "yes" ; then
echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
fi
if test "$blobs" = "yes" ; then
echo "INSTALL_BLOBS=yes" >> $config_host_mak
fi
@@ -4546,9 +4692,6 @@ fi
if test "$libiscsi" = "yes" ; then
echo "CONFIG_LIBISCSI=m" >> $config_host_mak
if test "$libiscsi_version" = "1.4.0"; then
echo "CONFIG_LIBISCSI_1_4=y" >> $config_host_mak
fi
echo "LIBISCSI_CFLAGS=$libiscsi_cflags" >> $config_host_mak
echo "LIBISCSI_LIBS=$libiscsi_libs" >> $config_host_mak
fi
@@ -4631,6 +4774,11 @@ if test "$glusterfs_zerofill" = "yes" ; then
echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
fi
if test "$archipelago" = "yes" ; then
echo "CONFIG_ARCHIPELAGO=m" >> $config_host_mak
echo "ARCHIPELAGO_LIBS=$archipelago_libs" >> $config_host_mak
fi
if test "$libssh2" = "yes" ; then
echo "CONFIG_LIBSSH2=m" >> $config_host_mak
echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
@@ -4664,43 +4812,35 @@ if test "$tpm" = "yes"; then
fi
fi
# use default implementation for tracing backend-specific routines
trace_default=yes
echo "TRACE_BACKEND=$trace_backend" >> $config_host_mak
if test "$trace_backend" = "nop"; then
echo "TRACE_BACKENDS=$trace_backends" >> $config_host_mak
if have_backend "nop"; then
echo "CONFIG_TRACE_NOP=y" >> $config_host_mak
fi
if test "$trace_backend" = "simple"; then
if have_backend "simple"; then
echo "CONFIG_TRACE_SIMPLE=y" >> $config_host_mak
trace_default=no
# Set the appropriate trace file.
trace_file="\"$trace_file-\" FMT_pid"
fi
if test "$trace_backend" = "stderr"; then
if have_backend "stderr"; then
echo "CONFIG_TRACE_STDERR=y" >> $config_host_mak
trace_default=no
fi
if test "$trace_backend" = "ust"; then
if have_backend "ust"; then
echo "CONFIG_TRACE_UST=y" >> $config_host_mak
fi
if test "$trace_backend" = "dtrace"; then
if have_backend "dtrace"; then
echo "CONFIG_TRACE_DTRACE=y" >> $config_host_mak
if test "$trace_backend_stap" = "yes" ; then
echo "CONFIG_TRACE_SYSTEMTAP=y" >> $config_host_mak
fi
fi
if test "$trace_backend" = "ftrace"; then
if have_backend "ftrace"; then
if test "$linux" = "yes" ; then
echo "CONFIG_TRACE_FTRACE=y" >> $config_host_mak
trace_default=no
else
feature_not_found "ftrace(trace backend)" "ftrace requires Linux"
fi
fi
echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
if test "$trace_default" = "yes"; then
echo "CONFIG_TRACE_DEFAULT=y" >> $config_host_mak
fi
if test "$rdma" = "yes" ; then
echo "CONFIG_RDMA=y" >> $config_host_mak
@@ -4724,6 +4864,8 @@ elif test "$ARCH" = "s390x" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/s390 $QEMU_INCLUDES"
elif test "$ARCH" = "x86_64" -o "$ARCH" = "x32" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/i386 $QEMU_INCLUDES"
elif test "$ARCH" = "ppc64" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/ppc $QEMU_INCLUDES"
else
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/\$(ARCH) $QEMU_INCLUDES"
fi
@@ -4757,6 +4899,7 @@ echo "AS=$as" >> $config_host_mak
echo "CPP=$cpp" >> $config_host_mak
echo "OBJCOPY=$objcopy" >> $config_host_mak
echo "LD=$ld" >> $config_host_mak
echo "NM=$nm" >> $config_host_mak
echo "WINDRES=$windres" >> $config_host_mak
echo "LIBTOOL=$libtool" >> $config_host_mak
echo "CFLAGS=$CFLAGS" >> $config_host_mak
@@ -4805,6 +4948,9 @@ if test "$linux" = "yes" ; then
aarch64)
linux_arch=arm64
;;
mips64)
linux_arch=mips
;;
*)
# For most CPUs the kernel architecture name and QEMU CPU name match.
linux_arch="$cpu"
@@ -4911,6 +5057,8 @@ case "$target_name" in
TARGET_BASE_ARCH=mips
echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
;;
tricore)
;;
moxie)
;;
or32)
@@ -4930,6 +5078,12 @@ case "$target_name" in
TARGET_ABI_DIR=ppc
gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml"
;;
ppc64le)
TARGET_ARCH=ppc64
TARGET_BASE_ARCH=ppc
TARGET_ABI_DIR=ppc
gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml"
;;
ppc64abi32)
TARGET_ARCH=ppc64
TARGET_BASE_ARCH=ppc
@@ -4953,6 +5107,7 @@ case "$target_name" in
echo "TARGET_ABI32=y" >> $config_target_mak
;;
s390x)
gdb_xml_files="s390x-core64.xml s390-acr.xml s390-fpr.xml"
;;
unicore32)
;;
@@ -4982,6 +5137,9 @@ if [ "$TARGET_ABI_DIR" = "" ]; then
TARGET_ABI_DIR=$TARGET_ARCH
fi
echo "TARGET_ABI_DIR=$TARGET_ABI_DIR" >> $config_target_mak
if [ "$HOST_VARIANT_DIR" != "" ]; then
echo "HOST_VARIANT_DIR=$HOST_VARIANT_DIR" >> $config_target_mak
fi
case "$target_name" in
i386|x86_64)
if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then
@@ -4994,7 +5152,7 @@ case "$target_name" in
*)
esac
case "$target_name" in
aarch64|arm|i386|x86_64|ppcemb|ppc|ppc64|s390x)
aarch64|arm|i386|x86_64|ppcemb|ppc|ppc64|s390x|mipsel|mips)
# Make sure the target and host cpus are compatible
if test "$kvm" = "yes" -a "$target_softmmu" = "yes" -a \
\( "$target_name" = "$cpu" -o \
@@ -5002,6 +5160,7 @@ case "$target_name" in
\( "$target_name" = "ppc64" -a "$cpu" = "ppc" \) -o \
\( "$target_name" = "ppc" -a "$cpu" = "ppc64" \) -o \
\( "$target_name" = "ppcemb" -a "$cpu" = "ppc64" \) -o \
\( "$target_name" = "mipsel" -a "$cpu" = "mips" \) -o \
\( "$target_name" = "x86_64" -a "$cpu" = "i386" \) -o \
\( "$target_name" = "i386" -a "$cpu" = "x86_64" \) \) ; then
echo "CONFIG_KVM=y" >> $config_target_mak
@@ -5173,6 +5332,10 @@ if [ "$dtc_internal" = "yes" ]; then
echo "config-host.h: subdir-dtc" >> $config_host_mak
fi
if test "$numa" = "yes"; then
echo "CONFIG_NUMA=y" >> $config_host_mak
fi
# build tree in object directory in case the source is not in the current directory
DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests"
DIRS="$DIRS fsdev"
@@ -5194,6 +5357,7 @@ for bios_file in \
$source_path/pc-bios/*.dtb \
$source_path/pc-bios/*.img \
$source_path/pc-bios/openbios-* \
$source_path/pc-bios/u-boot.* \
$source_path/pc-bios/palcode-*
do
FILES="$FILES pc-bios/`basename $bios_file`"
@@ -5223,8 +5387,16 @@ for rom in seabios vgabios ; do
echo "LD=$ld" >> $config_mak
done
if test "$docs" = "yes" ; then
mkdir -p QMP
# set up qemu-iotests in this build directory
iotests_common_env="tests/qemu-iotests/common.env"
iotests_check="tests/qemu-iotests/check"
echo "# Automatically generated by configure - do not modify" > "$iotests_common_env"
echo >> "$iotests_common_env"
echo "export PYTHON='$python'" >> "$iotests_common_env"
if [ ! -e "$iotests_check" ]; then
symlink "$source_path/$iotests_check" "$iotests_check"
fi
# Save the configure command line for later reuse.

View File

@@ -30,20 +30,14 @@ typedef struct {
CoroutineAction action;
} CoroutineGThread;
static GStaticMutex coroutine_lock = G_STATIC_MUTEX_INIT;
static CompatGMutex coroutine_lock;
static CompatGCond coroutine_cond;
/* GLib 2.31 and beyond deprecated various parts of the thread API,
* but the new interfaces are not available in older GLib versions
* so we have to cope with both.
*/
#if GLIB_CHECK_VERSION(2, 31, 0)
/* Default zero-initialisation is sufficient for 2.31+ GCond */
static GCond the_coroutine_cond;
static GCond *coroutine_cond = &the_coroutine_cond;
static inline void init_coroutine_cond(void)
{
}
/* Awkwardly, the GPrivate API doesn't provide a way to update the
* GDestroyNotify handler for the coroutine key dynamically. So instead
* we track whether or not the CoroutineGThread should be freed on
@@ -84,11 +78,6 @@ static inline GThread *create_thread(GThreadFunc func, gpointer data)
#else
/* Handle older GLib versions */
static GCond *coroutine_cond;
static inline void init_coroutine_cond(void)
{
coroutine_cond = g_cond_new();
}
static GStaticPrivate coroutine_key = G_STATIC_PRIVATE_INIT;
@@ -120,22 +109,20 @@ static void __attribute__((constructor)) coroutine_init(void)
g_thread_init(NULL);
}
#endif
init_coroutine_cond();
}
static void coroutine_wait_runnable_locked(CoroutineGThread *co)
{
while (!co->runnable) {
g_cond_wait(coroutine_cond, g_static_mutex_get_mutex(&coroutine_lock));
g_cond_wait(&coroutine_cond, &coroutine_lock);
}
}
static void coroutine_wait_runnable(CoroutineGThread *co)
{
g_static_mutex_lock(&coroutine_lock);
g_mutex_lock(&coroutine_lock);
coroutine_wait_runnable_locked(co);
g_static_mutex_unlock(&coroutine_lock);
g_mutex_unlock(&coroutine_lock);
}
static gpointer coroutine_thread(gpointer opaque)
@@ -177,17 +164,17 @@ CoroutineAction qemu_coroutine_switch(Coroutine *from_,
CoroutineGThread *from = DO_UPCAST(CoroutineGThread, base, from_);
CoroutineGThread *to = DO_UPCAST(CoroutineGThread, base, to_);
g_static_mutex_lock(&coroutine_lock);
g_mutex_lock(&coroutine_lock);
from->runnable = false;
from->action = action;
to->runnable = true;
to->action = action;
g_cond_broadcast(coroutine_cond);
g_cond_broadcast(&coroutine_cond);
if (action != COROUTINE_TERMINATE) {
coroutine_wait_runnable_locked(from);
}
g_static_mutex_unlock(&coroutine_lock);
g_mutex_unlock(&coroutine_lock);
return from->action;
}

View File

@@ -36,8 +36,17 @@ typedef struct
static __thread CoroutineWin32 leader;
static __thread Coroutine *current;
CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
CoroutineAction action)
/* This function is marked noinline to prevent GCC from inlining it
* into coroutine_trampoline(). If we allow it to do that then it
* hoists the code to get the address of the TLS variable "current"
* out of the while() loop. This is an invalid transformation because
* the SwitchToFiber() call may be called when running thread A but
* return in thread B, and so we might be in a different thread
* context each time round the loop.
*/
CoroutineAction __attribute__((noinline))
qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
CoroutineAction action)
{
CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, to_);

View File

@@ -18,10 +18,114 @@
*/
#include "config.h"
#include "cpu.h"
#include "trace.h"
#include "disas/disas.h"
#include "tcg.h"
#include "qemu/atomic.h"
#include "sysemu/qtest.h"
#include "qemu/timer.h"
/* -icount align implementation. */
typedef struct SyncClocks {
int64_t diff_clk;
int64_t last_cpu_icount;
int64_t realtime_clock;
} SyncClocks;
#if !defined(CONFIG_USER_ONLY)
/* Allow the guest to have a max 3ms advance.
* The difference between the 2 clocks could therefore
* oscillate around 0.
*/
#define VM_CLOCK_ADVANCE 3000000
#define THRESHOLD_REDUCE 1.5
#define MAX_DELAY_PRINT_RATE 2000000000LL
#define MAX_NB_PRINTS 100
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
int64_t cpu_icount;
if (!icount_align_option) {
return;
}
cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
sc->diff_clk += cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
sc->last_cpu_icount = cpu_icount;
if (sc->diff_clk > VM_CLOCK_ADVANCE) {
#ifndef _WIN32
struct timespec sleep_delay, rem_delay;
sleep_delay.tv_sec = sc->diff_clk / 1000000000LL;
sleep_delay.tv_nsec = sc->diff_clk % 1000000000LL;
if (nanosleep(&sleep_delay, &rem_delay) < 0) {
sc->diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
sc->diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
} else {
sc->diff_clk = 0;
}
#else
Sleep(sc->diff_clk / SCALE_MS);
sc->diff_clk = 0;
#endif
}
}
static void print_delay(const SyncClocks *sc)
{
static float threshold_delay;
static int64_t last_realtime_clock;
static int nb_prints;
if (icount_align_option &&
sc->realtime_clock - last_realtime_clock >= MAX_DELAY_PRINT_RATE &&
nb_prints < MAX_NB_PRINTS) {
if ((-sc->diff_clk / (float)1000000000LL > threshold_delay) ||
(-sc->diff_clk / (float)1000000000LL <
(threshold_delay - THRESHOLD_REDUCE))) {
threshold_delay = (-sc->diff_clk / 1000000000LL) + 1;
printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
threshold_delay - 1,
threshold_delay);
nb_prints++;
last_realtime_clock = sc->realtime_clock;
}
}
}
static void init_delay_params(SyncClocks *sc,
const CPUState *cpu)
{
if (!icount_align_option) {
return;
}
sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
sc->realtime_clock +
cpu_get_clock_offset();
sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
if (sc->diff_clk < max_delay) {
max_delay = sc->diff_clk;
}
if (sc->diff_clk > max_advance) {
max_advance = sc->diff_clk;
}
/* Print every 2s max if the guest is late. We limit the number
of printed messages to NB_PRINT_MAX(currently 100) */
print_delay(sc);
}
#else
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
}
static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
{
}
#endif /* CONFIG USER ONLY */
void cpu_loop_exit(CPUState *cpu)
{
@@ -65,6 +169,9 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr)
#endif /* DEBUG_DISAS */
next_tb = tcg_qemu_tb_exec(env, tb_ptr);
trace_exec_tb_exit((void *) (next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
@@ -105,6 +212,7 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
cpu->current_tb = tb;
/* execute the generated code */
trace_exec_tb_nocache(tb, tb->pc);
cpu_tb_exec(cpu, tb->tc_ptr);
cpu->current_tb = NULL;
tb_phys_invalidate(tb, -1);
@@ -187,16 +295,10 @@ static inline TranslationBlock *tb_find_fast(CPUArchState *env)
return tb;
}
static CPUDebugExcpHandler *debug_excp_handler;
void cpu_set_debug_excp_handler(CPUDebugExcpHandler *handler)
{
debug_excp_handler = handler;
}
static void cpu_handle_debug_exception(CPUArchState *env)
{
CPUState *cpu = ENV_GET_CPU(env);
CPUClass *cc = CPU_GET_CLASS(cpu);
CPUWatchpoint *wp;
if (!cpu->watchpoint_hit) {
@@ -204,9 +306,8 @@ static void cpu_handle_debug_exception(CPUArchState *env)
wp->flags &= ~BP_WATCHPOINT_HIT;
}
}
if (debug_excp_handler) {
debug_excp_handler(env);
}
cc->debug_excp_handler(cpu);
}
/* main execution loop */
@@ -227,6 +328,8 @@ int cpu_exec(CPUArchState *env)
TranslationBlock *tb;
uint8_t *tc_ptr;
uintptr_t next_tb;
SyncClocks sc;
/* This must be volatile so it is not trashed by longjmp() */
volatile bool have_tb_lock = false;
@@ -277,12 +380,20 @@ int cpu_exec(CPUArchState *env)
#elif defined(TARGET_CRIS)
#elif defined(TARGET_S390X)
#elif defined(TARGET_XTENSA)
#elif defined(TARGET_TRICORE)
/* XXXXX */
#else
#error unsupported target CPU
#endif
cpu->exception_index = -1;
/* Calculate difference between guest clock and host clock.
* This delay includes the delay of the last cycle, so
* what we have to do is sleep until it is 0. As for the
* advance/delay we gain here, we try to fix it next time.
*/
init_delay_params(&sc, cpu);
/* prepare setjmp context for exception handling */
for(;;) {
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -327,7 +438,8 @@ int cpu_exec(CPUArchState *env)
}
#if defined(TARGET_ARM) || defined(TARGET_SPARC) || defined(TARGET_MIPS) || \
defined(TARGET_PPC) || defined(TARGET_ALPHA) || defined(TARGET_CRIS) || \
defined(TARGET_MICROBLAZE) || defined(TARGET_LM32) || defined(TARGET_UNICORE32)
defined(TARGET_MICROBLAZE) || defined(TARGET_LM32) || \
defined(TARGET_UNICORE32) || defined(TARGET_TRICORE)
if (interrupt_request & CPU_INTERRUPT_HALT) {
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
@@ -443,6 +555,12 @@ int cpu_exec(CPUArchState *env)
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_TRICORE)
if ((interrupt_request & CPU_INTERRUPT_HARD)) {
cc->do_interrupt(cpu);
next_tb = 0;
}
#elif defined(TARGET_OPENRISC)
{
int idx = -1;
@@ -493,8 +611,8 @@ int cpu_exec(CPUArchState *env)
We avoid this by disabling interrupts when
pc contains a magic address. */
if (interrupt_request & CPU_INTERRUPT_HARD
&& ((IS_M(env) && env->regs[15] < 0xfffffff0)
|| !(env->daif & PSTATE_I))) {
&& !(env->daif & PSTATE_I)
&& (!IS_M(env) || env->regs[15] < 0xfffffff0)) {
cpu->exception_index = EXCP_IRQ;
cc->do_interrupt(cpu);
next_tb = 0;
@@ -637,6 +755,7 @@ int cpu_exec(CPUArchState *env)
cpu->current_tb = tb;
barrier();
if (likely(!cpu->exit_request)) {
trace_exec_tb(tb, tb->pc);
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = cpu_tb_exec(cpu, tc_ptr);
@@ -672,6 +791,7 @@ int cpu_exec(CPUArchState *env)
if (insns_left > 0) {
/* Execute remaining instructions. */
cpu_exec_nocache(env, insns_left, tb);
align_clocks(&sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
next_tb = 0;
@@ -684,6 +804,9 @@ int cpu_exec(CPUArchState *env)
}
}
cpu->current_tb = NULL;
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
/* reset soft MMU for next block (it can currently
only be set by a memory fault) */
} /* for(;;) */
@@ -724,6 +847,7 @@ int cpu_exec(CPUArchState *env)
| env->cc_dest | (env->cc_x << 4);
#elif defined(TARGET_MICROBLAZE)
#elif defined(TARGET_MIPS)
#elif defined(TARGET_TRICORE)
#elif defined(TARGET_MOXIE)
#elif defined(TARGET_OPENRISC)
#elif defined(TARGET_SH4)

166
cpus.c
View File

@@ -26,6 +26,7 @@
#include "config-host.h"
#include "monitor/monitor.h"
#include "qapi/qmp/qerror.h"
#include "sysemu/sysemu.h"
#include "exec/gdbstub.h"
#include "sysemu/dma.h"
@@ -38,6 +39,8 @@
#include "qemu/main-loop.h"
#include "qemu/bitmap.h"
#include "qemu/seqlock.h"
#include "qapi-event.h"
#include "hw/nmi.h"
#ifndef _WIN32
#include "qemu/compatfd.h"
@@ -62,6 +65,8 @@
#endif /* CONFIG_LINUX */
static CPUState *next_cpu;
int64_t max_delay;
int64_t max_advance;
bool cpu_is_stopped(CPUState *cpu)
{
@@ -100,17 +105,12 @@ static bool all_cpu_threads_idle(void)
/* Protected by TimersState seqlock */
/* Compensate for varying guest execution speed. */
static int64_t qemu_icount_bias;
static int64_t vm_clock_warp_start;
static int64_t vm_clock_warp_start = -1;
/* Conversion factor from emulated instructions to virtual clock ticks. */
static int icount_time_shift;
/* Arbitrarily pick 1MIPS as the minimum allowable speed. */
#define MAX_ICOUNT_SHIFT 10
/* Only written by TCG thread */
static int64_t qemu_icount;
static QEMUTimer *icount_rt_timer;
static QEMUTimer *icount_vm_timer;
static QEMUTimer *icount_warp_timer;
@@ -127,6 +127,11 @@ typedef struct TimersState {
int64_t cpu_clock_offset;
int32_t cpu_ticks_enabled;
int64_t dummy;
/* Compensate for varying guest execution speed. */
int64_t qemu_icount_bias;
/* Only written by TCG thread */
int64_t qemu_icount;
} TimersState;
static TimersState timers_state;
@@ -137,14 +142,14 @@ static int64_t cpu_get_icount_locked(void)
int64_t icount;
CPUState *cpu = current_cpu;
icount = qemu_icount;
icount = timers_state.qemu_icount;
if (cpu) {
if (!cpu_can_do_io(cpu)) {
fprintf(stderr, "Bad clock read\n");
}
icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
}
return qemu_icount_bias + (icount << icount_time_shift);
return timers_state.qemu_icount_bias + cpu_icount_to_ns(icount);
}
int64_t cpu_get_icount(void)
@@ -160,6 +165,11 @@ int64_t cpu_get_icount(void)
return icount;
}
int64_t cpu_icount_to_ns(int64_t icount)
{
return icount << icount_time_shift;
}
/* return the host CPU cycle counter and handle stop/restart */
/* Caller must hold the BQL */
int64_t cpu_get_ticks(void)
@@ -212,6 +222,23 @@ int64_t cpu_get_clock(void)
return ti;
}
/* return the offset between the host clock and virtual CPU clock */
int64_t cpu_get_clock_offset(void)
{
int64_t ti;
unsigned start;
do {
start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
ti = timers_state.cpu_clock_offset;
if (!timers_state.cpu_ticks_enabled) {
ti -= get_clock();
}
} while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
return -ti;
}
/* enable cpu_get_ticks()
* Caller must hold BQL which server as mutex for vm_clock_seqlock.
*/
@@ -282,7 +309,8 @@ static void icount_adjust(void)
icount_time_shift++;
}
last_delta = delta;
qemu_icount_bias = cur_icount - (qemu_icount << icount_time_shift);
timers_state.qemu_icount_bias = cur_icount
- (timers_state.qemu_icount << icount_time_shift);
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
}
@@ -331,7 +359,7 @@ static void icount_warp_rt(void *opaque)
int64_t delta = cur_time - cur_icount;
warp_delta = MIN(warp_delta, delta);
}
qemu_icount_bias += warp_delta;
timers_state.qemu_icount_bias += warp_delta;
}
vm_clock_warp_start = -1;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
@@ -347,9 +375,9 @@ void qtest_clock_warp(int64_t dest)
assert(qtest_enabled());
while (clock < dest) {
int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
int64_t warp = MIN(dest - clock, deadline);
int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
seqlock_write_lock(&timers_state.vm_clock_seqlock);
qemu_icount_bias += warp;
timers_state.qemu_icount_bias += warp;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
@@ -426,6 +454,25 @@ void qemu_clock_warp(QEMUClockType type)
}
}
static bool icount_state_needed(void *opaque)
{
return use_icount;
}
/*
* This is a subsection for icount migration.
*/
static const VMStateDescription icount_vmstate_timers = {
.name = "timer/icount",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_INT64(qemu_icount_bias, TimersState),
VMSTATE_INT64(qemu_icount, TimersState),
VMSTATE_END_OF_LIST()
}
};
static const VMStateDescription vmstate_timers = {
.name = "timer",
.version_id = 2,
@@ -435,23 +482,48 @@ static const VMStateDescription vmstate_timers = {
VMSTATE_INT64(dummy, TimersState),
VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
VMSTATE_END_OF_LIST()
},
.subsections = (VMStateSubsection[]) {
{
.vmsd = &icount_vmstate_timers,
.needed = icount_state_needed,
}, {
/* empty */
}
}
};
void configure_icount(const char *option)
void cpu_ticks_init(void)
{
seqlock_init(&timers_state.vm_clock_seqlock, NULL);
vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
}
void configure_icount(QemuOpts *opts, Error **errp)
{
const char *option;
char *rem_str = NULL;
option = qemu_opt_get(opts, "shift");
if (!option) {
if (qemu_opt_get(opts, "align") != NULL) {
error_setg(errp, "Please specify shift option when using align");
}
return;
}
icount_align_option = qemu_opt_get_bool(opts, "align", false);
icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
icount_warp_rt, NULL);
if (strcmp(option, "auto") != 0) {
icount_time_shift = strtol(option, NULL, 0);
errno = 0;
icount_time_shift = strtol(option, &rem_str, 0);
if (errno != 0 || *rem_str != '\0' || !strlen(option)) {
error_setg(errp, "icount: Invalid shift value");
}
use_icount = 1;
return;
} else if (icount_align_option) {
error_setg(errp, "shift=auto and align=on are incompatible");
}
use_icount = 2;
@@ -530,7 +602,7 @@ static int do_vm_stop(RunState state)
pause_all_vcpus();
runstate_set(state);
vm_state_notify(0, state);
monitor_protocol_event(QEVENT_STOP, NULL);
qapi_event_send_stop(&error_abort);
}
bdrv_drain_all();
@@ -1206,6 +1278,7 @@ void cpu_stop_current(void)
int vm_stop(RunState state)
{
if (qemu_in_vcpu_thread()) {
qemu_system_vmstop_request_prepare();
qemu_system_vmstop_request(state);
/*
* FIXME: should not return to device code in case
@@ -1247,7 +1320,8 @@ static int tcg_cpu_exec(CPUArchState *env)
int64_t count;
int64_t deadline;
int decr;
qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
cpu->icount_decr.u16.low = 0;
cpu->icount_extra = 0;
deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
@@ -1262,7 +1336,7 @@ static int tcg_cpu_exec(CPUArchState *env)
}
count = qemu_icount_round(deadline);
qemu_icount += count;
timers_state.qemu_icount += count;
decr = (count > 0xffff) ? 0xffff : count;
count -= decr;
cpu->icount_decr.u16.low = decr;
@@ -1275,7 +1349,8 @@ static int tcg_cpu_exec(CPUArchState *env)
if (use_icount) {
/* Fold pending instructions back into the
instruction counter, and clear the interrupt flag. */
qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
cpu->icount_decr.u32 = 0;
cpu->icount_extra = 0;
}
@@ -1312,20 +1387,6 @@ static void tcg_exec_all(void)
exit_request = 0;
}
void set_numa_modes(void)
{
CPUState *cpu;
int i;
CPU_FOREACH(cpu) {
for (i = 0; i < nb_numa_nodes; i++) {
if (test_bit(cpu->cpu_index, node_cpumask[i])) {
cpu->numa_node = i;
}
}
}
}
void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
{
/* XXX: implement xxx_cpu_list for targets that still miss it */
@@ -1353,6 +1414,9 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_MIPS)
MIPSCPU *mips_cpu = MIPS_CPU(cpu);
CPUMIPSState *env = &mips_cpu->env;
#elif defined(TARGET_TRICORE)
TriCoreCPU *tricore_cpu = TRICORE_CPU(cpu);
CPUTriCoreState *env = &tricore_cpu->env;
#endif
cpu_synchronize_state(cpu);
@@ -1377,6 +1441,9 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_MIPS)
info->value->has_PC = true;
info->value->PC = env->active_tc.PC;
#elif defined(TARGET_TRICORE)
info->value->has_PC = true;
info->value->PC = env->PC;
#endif
/* XXX: waiting for the qapi to support GSList */
@@ -1480,21 +1547,24 @@ void qmp_inject_nmi(Error **errp)
apic_deliver_nmi(cpu->apic_state);
}
}
#elif defined(TARGET_S390X)
CPUState *cs;
S390CPU *cpu;
CPU_FOREACH(cs) {
cpu = S390_CPU(cs);
if (cpu->env.cpu_num == monitor_get_cpu_index()) {
if (s390_cpu_restart(S390_CPU(cs)) == -1) {
error_set(errp, QERR_UNSUPPORTED);
return;
}
break;
}
}
#else
error_set(errp, QERR_UNSUPPORTED);
nmi_monitor_handle(monitor_get_cpu_index(), errp);
#endif
}
void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)
{
if (!use_icount) {
return;
}
cpu_fprintf(f, "Host - Guest clock %"PRIi64" ms\n",
(cpu_get_clock() - cpu_get_icount())/SCALE_MS);
if (icount_align_option) {
cpu_fprintf(f, "Max guest delay %"PRIi64" ms\n", -max_delay/SCALE_MS);
cpu_fprintf(f, "Max guest advance %"PRIi64" ms\n", max_advance/SCALE_MS);
} else {
cpu_fprintf(f, "Max guest delay NA\n");
cpu_fprintf(f, "Max guest advance NA\n");
}
}

View File

@@ -22,11 +22,13 @@
#include "exec/exec-all.h"
#include "exec/memory.h"
#include "exec/address-spaces.h"
#include "exec/cpu_ldst.h"
#include "exec/cputlb.h"
#include "exec/memory-internal.h"
#include "exec/ram_addr.h"
#include "tcg/tcg.h"
//#define DEBUG_TLB
//#define DEBUG_TLB_CHECK
@@ -58,8 +60,10 @@ void tlb_flush(CPUState *cpu, int flush_global)
cpu->current_tb = NULL;
memset(env->tlb_table, -1, sizeof(env->tlb_table));
memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
env->vtlb_index = 0;
env->tlb_flush_addr = -1;
env->tlb_flush_mask = 0;
tlb_flush_count++;
@@ -106,6 +110,14 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
}
/* check whether there are entries that need to be flushed in the vtlb */
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
int k;
for (k = 0; k < CPU_VTLB_SIZE; k++) {
tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
}
}
tb_flush_jmp_cache(cpu, addr);
}
@@ -170,6 +182,11 @@ void cpu_tlb_reset_dirty_all(ram_addr_t start1, ram_addr_t length)
tlb_reset_dirty_range(&env->tlb_table[mmu_idx][i],
start1, length);
}
for (i = 0; i < CPU_VTLB_SIZE; i++) {
tlb_reset_dirty_range(&env->tlb_v_table[mmu_idx][i],
start1, length);
}
}
}
}
@@ -193,6 +210,13 @@ void tlb_set_dirty(CPUArchState *env, target_ulong vaddr)
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
tlb_set_dirty1(&env->tlb_table[mmu_idx][i], vaddr);
}
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
int k;
for (k = 0; k < CPU_VTLB_SIZE; k++) {
tlb_set_dirty1(&env->tlb_v_table[mmu_idx][k], vaddr);
}
}
}
/* Our TLB does not support large pages, so remember the area covered by
@@ -233,6 +257,7 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
uintptr_t addend;
CPUTLBEntry *te;
hwaddr iotlb, xlat, sz;
unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
assert(size >= TARGET_PAGE_SIZE);
if (size != TARGET_PAGE_SIZE) {
@@ -265,8 +290,14 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
prot, &address);
index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
env->iotlb[mmu_idx][index] = iotlb - vaddr;
te = &env->tlb_table[mmu_idx][index];
/* do not discard the translation in te, evict it into a victim tlb */
env->tlb_v_table[mmu_idx][vidx] = *te;
env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
/* refill the tlb */
env->iotlb[mmu_idx][index] = iotlb - vaddr;
te->addend = addend - vaddr;
if (prot & PAGE_READ) {
te->addr_read = address;
@@ -330,21 +361,36 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
return qemu_ram_addr_from_host_nofail(p);
}
#define MMUSUFFIX _mmu
#define SHIFT 0
#include "softmmu_template.h"
#define SHIFT 1
#include "softmmu_template.h"
#define SHIFT 2
#include "softmmu_template.h"
#define SHIFT 3
#include "softmmu_template.h"
#undef MMUSUFFIX
#define MMUSUFFIX _cmmu
#undef GETPC
#define GETPC() ((uintptr_t)0)
#undef GETPC_ADJ
#define GETPC_ADJ 0
#undef GETRA
#define GETRA() ((uintptr_t)0)
#define SOFTMMU_CODE_ACCESS
#define SHIFT 0
#include "exec/softmmu_template.h"
#include "softmmu_template.h"
#define SHIFT 1
#include "exec/softmmu_template.h"
#include "softmmu_template.h"
#define SHIFT 2
#include "exec/softmmu_template.h"
#include "softmmu_template.h"
#define SHIFT 3
#include "exec/softmmu_template.h"
#undef env
#include "softmmu_template.h"

View File

@@ -44,3 +44,4 @@ CONFIG_APIC=y
CONFIG_IOAPIC=y
CONFIG_ICC_BUS=y
CONFIG_PVPANIC=y
CONFIG_MEM_HOTPLUG=y

View File

@@ -1 +1,2 @@
# Default configuration for ppc-linux-user
CONFIG_LIBDECNUMBER=y

View File

@@ -45,7 +45,8 @@ CONFIG_PREP=y
CONFIG_MAC=y
CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For PReP
CONFIG_MC146818RTC=y
CONFIG_ETSEC=y
CONFIG_ISA_TESTDEV=y

View File

@@ -1 +1,2 @@
# Default configuration for ppc64-linux-user
CONFIG_LIBDECNUMBER=y

View File

@@ -46,6 +46,8 @@ CONFIG_PREP=y
CONFIG_MAC=y
CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For pSeries
CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))

View File

@@ -1 +1,2 @@
# Default configuration for ppc64abi32-linux-user
CONFIG_LIBDECNUMBER=y

Some files were not shown because too many files have changed in this diff Show More